@bukzor bukzor commented Oct 22, 2025

Summary

  • Escapes &, <, and > in HTML text nodes, which were previously written out unescaped
  • Prevents data corruption when round-tripping HTML through xq | xq -j

Changes

  • Added escapeTextContent() function for minimal entity escaping (&amp;, &lt;, &gt;)
  • Modified FormatHtml() to escape text nodes properly
  • Added comprehensive tests verifying output is valid XML
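
For readers without the diff at hand, minimal text-node escaping can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name matches the one listed above, but the signature and the `strings.NewReplacer` approach are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// textEscaper replaces only the three characters that are unsafe in XML
// text content. Quotes are left alone because they only need escaping
// inside attribute values.
var textEscaper = strings.NewReplacer(
	"&", "&amp;",
	"<", "&lt;",
	">", "&gt;",
)

// escapeTextContent is a hypothetical stand-in for the function named in
// this PR. It expects already-decoded text (as an HTML parser yields it),
// so a literal "&" in the input always becomes "&amp;".
func escapeTextContent(s string) string {
	return textEscaper.Replace(s)
}

func main() {
	fmt.Println(escapeTextContent("1 & 2 < 3 > 0"))
	// prints: 1 &amp; 2 &lt; 3 &gt; 0
}
```

The replacer scans the input once and does not re-examine its own output, so the `&` in a freshly written `&lt;` is never escaped a second time.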

Test plan

  • Unit tests pass: go test ./...
  • New tests verify proper escaping of &, <, > in HTML text nodes
  • Tests confirm xq output can be parsed as XML (required for -j flag)
  • Verified tests fail without the fix (showing they detect the bug)
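
The "output can be parsed as XML" check could look something like the sketch below, which uses only Go's standard `encoding/xml` decoder. The helper name `checkWellFormed` is hypothetical and the PR's actual tests may be structured differently.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// checkWellFormed (hypothetical helper) reads every token in doc and
// returns the first syntax error, or nil if the decoder reaches the end
// of the input cleanly.
func checkWellFormed(doc string) error {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		_, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	// Escaped ampersand: accepted by the strict decoder.
	fmt.Println(checkWellFormed("<html>1 &amp; 2</html>") == nil)
	// Bare ampersand: rejected, which is the failure this PR addresses.
	fmt.Println(checkWellFormed("<html>1 & 2</html>") != nil)
}
```

Running the same check against the formatter's output with and without the fix is one way to confirm that a test actually detects the bug.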

Example

Before this fix:

echo '<html>1 &amp; 2</html>' | xq
# Output: <html>1 & 2</html>  (bare & causes parse error)
echo '<html>1 &amp; 2</html>' | xq | xq -j
# Error: invalid character entity & (no semicolon)

After this fix:

echo '<html>1 &amp; 2</html>' | xq
# Output: <html>1 &amp; 2</html>  (properly escaped)
echo '<html>1 &amp; 2</html>' | xq | xq -j
# Success: {"html": "1 & 2"}

🤖 Generated with Claude Code

HTML text nodes containing &, <, > were output without escaping,
causing xq's output to be unparseable when piped back through xq -j.

This commit adds:
- New escapeTextContent() function for minimal entity escaping
- Modified FormatHtml to escape text nodes with &amp;, &lt;, &gt;
- Tests verifying the output is valid XML

Example issue:
  echo '<html>1 &amp; 2</html>' | xq | xq -j
  # Before: Error - bare & in output
  # After: Success - properly escaped as &amp;

This is a critical fix preventing data corruption when round-tripping
HTML through xq.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@bukzor force-pushed the fix-160-formathtml-escaping branch from b32b900 to e35e4d8 on November 14, 2025 17:31
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.77%. Comparing base (ec5a59a) to head (e35e4d8).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #161      +/-   ##
==========================================
+ Coverage   80.57%   80.77%   +0.19%     
==========================================
  Files           5        5              
  Lines         690      697       +7     
==========================================
+ Hits          556      563       +7     
  Misses         92       92              
  Partials       42       42              

