Commit 10e5c2a
committed
fix: handle mixed-case HTML tags correctly
HTML is case-insensitive by spec, but the library was failing to process
tags with mixed case (e.g., <Br>, <DIV>, <Strong>). This caused translation
to stop prematurely, resulting in data loss.
Root cause: The HTML parser with lowerCaseTagName: false would preserve the
original case, but wouldn't recognize mixed-case void elements like <Br> as
self-closing tags. This caused content after the tag to be incorrectly parsed
as children of that tag.
Solution:
1. Set lowerCaseTagName: true in nodeHtmlParserConfig to normalize all tags
2. Updated visitor.ts to handle tags case-insensitively using toUpperCase()
3. Added comprehensive tests for various mixed-case tag scenarios
All translator lookups and element matching now work regardless of the
original HTML tag casing, preventing data loss when processing HTML with
inconsistent capitalization.
Resolves #631 parent eae5d0d commit 10e5c2a
3 files changed
+74
-7
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
128 | 128 | | |
129 | 129 | | |
130 | 130 | | |
131 | | - | |
| 131 | + | |
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
111 | 111 | | |
112 | 112 | | |
113 | 113 | | |
114 | | - | |
| 114 | + | |
115 | 115 | | |
116 | 116 | | |
117 | 117 | | |
118 | 118 | | |
119 | 119 | | |
120 | | - | |
121 | | - | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
122 | 123 | | |
123 | 124 | | |
124 | 125 | | |
| |||
171 | 172 | | |
172 | 173 | | |
173 | 174 | | |
174 | | - | |
175 | | - | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
176 | 180 | | |
177 | 181 | | |
178 | | - | |
| 182 | + | |
179 | 183 | | |
180 | 184 | | |
181 | 185 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
101 | 101 | | |
102 | 102 | | |
103 | 103 | | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
104 | 167 | | |
0 commit comments