Commit a3e517e
committed
[stdlib] String: Fix forward implementation of grapheme breaking rule 11
Rule GB11 in Unicode Annex 29 is:
GB11: Extended_Pictographic Extend* ZWJ × Extended_Pictographic
However, our forward grapheme breaking state machine implements it as:
GB11: Extended_Pictographic Extend* ZWJ+ × Extended_Pictographic
We implement the correct rules when going backward, which can cause String values to have different counts whether we’re going forward or back.
The rule as implemented would be fine (Unicode doesn’t care much about the placement of grapheme breaks in invalid sequences), but the directional inconsistency messes with String’s Collection conformance.
rdar://1042796711 parent d70e16b commit a3e517e
File tree
2 files changed
+47
-20
lines changed- stdlib/public/core
- validation-test/stdlib
2 files changed
+47
-20
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
723 | 723 | | |
724 | 724 | | |
725 | 725 | | |
726 | | - | |
727 | | - | |
728 | | - | |
729 | | - | |
730 | | - | |
| 726 | + | |
| 727 | + | |
| 728 | + | |
| 729 | + | |
| 730 | + | |
| 731 | + | |
| 732 | + | |
| 733 | + | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
731 | 737 | | |
732 | 738 | | |
733 | 739 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
40 | 55 | | |
41 | 56 | | |
42 | 57 | | |
43 | | - | |
44 | | - | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | | - | |
| 58 | + | |
49 | 59 | | |
50 | 60 | | |
51 | 61 | | |
| |||
65 | 75 | | |
66 | 76 | | |
67 | 77 | | |
68 | | - | |
69 | | - | |
| 78 | + | |
| 79 | + | |
70 | 80 | | |
71 | 81 | | |
72 | 82 | | |
| |||
95 | 105 | | |
96 | 106 | | |
97 | 107 | | |
98 | | - | |
| 108 | + | |
99 | 109 | | |
100 | 110 | | |
101 | 111 | | |
102 | | - | |
103 | | - | |
104 | | - | |
105 | | - | |
106 | | - | |
107 | | - | |
| 112 | + | |
108 | 113 | | |
109 | 114 | | |
110 | 115 | | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
0 commit comments