Commit c3f5385
authored
Php/iconv Should Not Treat FFFE/FFFF as Valid (#2910)
Fix #2897. We have been relying on iconv/mb_convert_encoding to detect invalid UTF-8, but all techniques designed to validate UTF-8 seem to accept FFFE and FFFF. This PR explicitly converts those characters to FFFD (Unicode substitution character) before validating the rest of the string. It also substitutes one or more FFFD when it detects invalid UTF-8 character sequences.
A comment in the code being change stated that it doesn't handle surrogates. It is right not to do so. The only case where we should see surrogates is reading UTF-16. Additional tests are added to an existing test reading a UTF-16 Csv to demonstrate that surrogates are handled correctly, and that FFFE/FFFF are handled reasonably.1 parent f90adcf commit c3f5385
File tree
6 files changed
+121
-45
lines changed- tests
- PhpSpreadsheetTests
- Reader/Csv
- data/Reader/CSV
6 files changed
+121
-45
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2785 | 2785 | | |
2786 | 2786 | | |
2787 | 2787 | | |
2788 | | - | |
2789 | | - | |
2790 | | - | |
2791 | | - | |
2792 | | - | |
2793 | | - | |
2794 | | - | |
2795 | | - | |
2796 | | - | |
2797 | | - | |
2798 | | - | |
2799 | | - | |
2800 | | - | |
2801 | | - | |
2802 | | - | |
2803 | | - | |
2804 | | - | |
2805 | | - | |
2806 | | - | |
2807 | | - | |
2808 | | - | |
2809 | | - | |
2810 | | - | |
2811 | | - | |
2812 | | - | |
2813 | | - | |
2814 | | - | |
2815 | | - | |
2816 | | - | |
2817 | | - | |
2818 | 2788 | | |
2819 | 2789 | | |
2820 | 2790 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
| 6 | + | |
6 | 7 | | |
7 | 8 | | |
8 | 9 | | |
9 | 10 | | |
10 | 11 | | |
11 | 12 | | |
12 | | - | |
| 13 | + | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
| |||
28 | 29 | | |
29 | 30 | | |
30 | 31 | | |
31 | | - | |
| 32 | + | |
32 | 33 | | |
33 | 34 | | |
34 | 35 | | |
35 | 36 | | |
36 | 37 | | |
37 | 38 | | |
38 | | - | |
| 39 | + | |
39 | 40 | | |
40 | 41 | | |
41 | 42 | | |
| |||
328 | 329 | | |
329 | 330 | | |
330 | 331 | | |
331 | | - | |
| 332 | + | |
332 | 333 | | |
333 | 334 | | |
334 | 335 | | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
| 339 | + | |
| 340 | + | |
| 341 | + | |
| 342 | + | |
| 343 | + | |
| 344 | + | |
335 | 345 | | |
336 | | - | |
337 | | - | |
338 | | - | |
| 346 | + | |
| 347 | + | |
| 348 | + | |
| 349 | + | |
339 | 350 | | |
340 | 351 | | |
341 | | - | |
| 352 | + | |
| 353 | + | |
342 | 354 | | |
343 | | - | |
| 355 | + | |
| 356 | + | |
344 | 357 | | |
345 | 358 | | |
346 | 359 | | |
347 | 360 | | |
348 | 361 | | |
349 | 362 | | |
350 | 363 | | |
351 | | - | |
| 364 | + | |
352 | 365 | | |
353 | 366 | | |
354 | 367 | | |
355 | 368 | | |
356 | 369 | | |
357 | 370 | | |
358 | | - | |
| 371 | + | |
359 | 372 | | |
360 | 373 | | |
361 | 374 | | |
362 | 375 | | |
363 | | - | |
| 376 | + | |
364 | 377 | | |
365 | 378 | | |
366 | 379 | | |
| |||
537 | 550 | | |
538 | 551 | | |
539 | 552 | | |
540 | | - | |
| 553 | + | |
541 | 554 | | |
542 | | - | |
| 555 | + | |
| 556 | + | |
543 | 557 | | |
544 | 558 | | |
545 | 559 | | |
| |||
686 | 700 | | |
687 | 701 | | |
688 | 702 | | |
689 | | - | |
| 703 | + | |
690 | 704 | | |
691 | 705 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
66 | 66 | | |
67 | 67 | | |
68 | 68 | | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
69 | 88 | | |
70 | 89 | | |
71 | 90 | | |
| |||
Lines changed: 44 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
119 | 119 | | |
120 | 120 | | |
121 | 121 | | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
122 | 151 | | |
Binary file not shown.
0 commit comments