 **Warning**: This library is at a very early stage of development, and it
 contains a substantial amount of `unsafe` code. Use at your own risk!

-[![Build Status](https://github.com/servo/html5ever/workflows/Tendril%20CI/badge.svg)](https://github.com/servo/tendril/actions)
+[![Build Status](https://github.com/servo/html5ever/workflows/Tendril%20CI/badge.svg)](https://github.com/servo/html5ever/actions)

-[API Documentation](https://doc.servo.org/tendril/index.html)
+[API Documentation](https://docs.rs/tendril)

 ## Introduction

@@ -16,9 +16,9 @@ Further mutations occur in-place until the string becomes shared, e.g. with
 `clone()` or `subtendril()`.

 Buffer sharing is accomplished through thread-local (non-atomic) reference
-counting, which has very low overhead. The Rust type system will prevent
-you at compile time from sending a tendril between threads. (See below
-for thoughts on relaxing this restriction.)
+counting, which has very low overhead. The Rust type system will prevent you at
+compile time from sending a tendril between threads. (See below for thoughts on
+relaxing this restriction.)

 Whereas `String` allocates in the heap for any non-empty string, `Tendril` can
 store small strings (up to 8 bytes) in-line, without a heap allocation.
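
The sharing behaviour described above (mutate in place while unique, share through a non-atomic refcount after `clone()`) can be approximated with the standard library alone. The sketch below is not Tendril's implementation: `SharedBuf`, `push_slice`, and `is_shared` are hypothetical names, and `Rc<Vec<u8>>` stands in for the real buffer layout, so the inline small-string case is not modelled.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a shared buffer (assumption: not Tendril's layout).
#[derive(Clone)]
struct SharedBuf {
    // `Rc` uses non-atomic reference counting and is not `Send`, so the
    // compiler rejects any attempt to move a `SharedBuf` to another thread.
    data: Rc<Vec<u8>>,
}

impl SharedBuf {
    fn new(bytes: &[u8]) -> Self {
        SharedBuf { data: Rc::new(bytes.to_vec()) }
    }

    /// Appends in place while the buffer is uniquely owned.
    /// If it is shared, `Rc::make_mut` copies the bytes first.
    fn push_slice(&mut self, bytes: &[u8]) {
        Rc::make_mut(&mut self.data).extend_from_slice(bytes);
    }

    /// True when more than one handle points at the same buffer.
    fn is_shared(&self) -> bool {
        Rc::strong_count(&self.data) > 1
    }
}

fn main() {
    let mut a = SharedBuf::new(b"hello");
    a.push_slice(b" world"); // unique owner: mutated in place
    let b = a.clone(); // bumps the refcount, copies no bytes
    assert!(a.is_shared());
    a.push_slice(b"!"); // shared: `a` gets its own copy before appending
    assert_eq!(b.data.as_slice(), &b"hello world"[..]);
    assert_eq!(a.data.as_slice(), &b"hello world!"[..]);
}
```

Copying on the first write after sharing is a property of this sketch; whether the real type behaves identically in every case should be checked against the API documentation.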
@@ -33,9 +33,9 @@ to go over the limit.

 `Tendril` uses
 [phantom types](https://doc.rust-lang.org/stable/rust-by-example/generics/phantom.html)
-to track a buffer's format. This determines at compile time which
-operations are available on a given tendril. For example, `Tendril<UTF8>` and
-`Tendril<Bytes>` can be borrowed as `&str` and `&[u8]` respectively.
+to track a buffer's format. This determines at compile time which operations are
+available on a given tendril. For example, `Tendril<UTF8>` and `Tendril<Bytes>`
+can be borrowed as `&str` and `&[u8]` respectively.

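
As a rough illustration of the phantom-type idea, the sketch below tags a byte buffer with a zero-sized format marker and exposes different borrowing methods per marker. The names `Buf`, `Utf8Format`, `BytesFormat`, `from_text`, and `from_bytes` are hypothetical and chosen only for this example; they are not the crate's API.

```rust
use std::marker::PhantomData;
use std::str;

// Hypothetical format markers (assumption: illustrative names only).
struct Utf8Format;
struct BytesFormat;

/// Hypothetical buffer tagged at compile time with a format marker `F`.
struct Buf<F> {
    data: Vec<u8>,
    format: PhantomData<F>, // zero-sized: exists only in the type
}

impl Buf<BytesFormat> {
    fn from_bytes(data: &[u8]) -> Self {
        Buf { data: data.to_vec(), format: PhantomData }
    }

    /// Only byte buffers can be borrowed as `&[u8]` in this sketch.
    fn as_bytes(&self) -> &[u8] {
        &self.data
    }
}

impl Buf<Utf8Format> {
    fn from_text(text: &str) -> Self {
        Buf { data: text.as_bytes().to_vec(), format: PhantomData }
    }

    /// Only UTF-8 buffers can be borrowed as `&str`.
    fn as_str(&self) -> &str {
        // The only constructor takes `&str`, so the bytes are valid UTF-8.
        str::from_utf8(&self.data).expect("buffer was built from &str")
    }
}

fn main() {
    let text = Buf::<Utf8Format>::from_text("caf\u{e9}");
    let raw = Buf::<BytesFormat>::from_bytes(&[0xff, 0x00]);
    println!("{} / {:?}", text.as_str(), raw.as_bytes());
    // `raw.as_str()` would not compile: the method does not exist for `Buf<BytesFormat>`.
}
```

Because the marker is part of the type, calling an operation that does not belong to a format fails at compile time and has no runtime cost.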
 `Tendril` also integrates with
 [rust-encoding](https://github.com/lifthrasiir/rust-encoding) and has
@@ -45,33 +45,33 @@ preliminary support for [WTF-8][] buffers.

 ### Ropes

-[html5ever][] will use `Tendril` as a zero-copy text representation. It would
-be good to preserve this all the way through to Servo's DOM. This would reduce
+[html5ever][] will use `Tendril` as a zero-copy text representation. It would be
+good to preserve this all the way through to Servo's DOM. This would reduce
 memory consumption, and possibly speed up text shaping and painting. However,
 DOM text may conceivably be larger than 4 GB, and will anyway not be contiguous
 in memory around e.g. a character entity reference.

-*Solution:* Build a **[rope][] on top of these strings** and use that as
-Servo's representation of DOM text. We can perhaps do text shaping and/or
-painting in parallel for different chunks of a rope. html5ever can additionally
-use this rope type as a replacement for `BufferQueue`.
+*Solution:* Build a **[rope][] on top of these strings** and use that as Servo's
+representation of DOM text. We can perhaps do text shaping and/or painting in
+parallel for different chunks of a rope. html5ever can additionally use this
+rope type as a replacement for `BufferQueue`.

-Because the underlying buffers are reference-counted, the bulk of this rope
-is already a [persistent data structure][]. Consider what happens when
-appending two ropes to get a "new" rope. A vector-backed rope would copy a
-vector of small structs, one for each chunk, and would bump the corresponding
-refcounts. But it would not copy any of the string data.
+Because the underlying buffers are reference-counted, the bulk of this rope is
+already a [persistent data structure][]. Consider what happens when appending
+two ropes to get a "new" rope. A vector-backed rope would copy a vector of small
+structs, one for each chunk, and would bump the corresponding refcounts. But it
+would not copy any of the string data.

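
The appending scenario above can be sketched with a `VecDeque` of reference-counted chunks. This is a minimal illustration under stated assumptions: `Rope`, `from_chunks`, `concat`, and `flatten` are hypothetical names, and `Rc<str>` stands in for a tendril as the chunk type.

```rust
use std::collections::VecDeque;
use std::rc::Rc;

/// Hypothetical vector-backed rope (assumption: `Rc<str>` replaces a tendril chunk).
#[derive(Clone)]
struct Rope {
    chunks: VecDeque<Rc<str>>,
}

impl Rope {
    fn from_chunks(parts: &[&str]) -> Self {
        Rope { chunks: parts.iter().map(|p| Rc::<str>::from(*p)).collect() }
    }

    /// Builds a "new" rope from two ropes. Only the small `Rc` handles are
    /// copied and their refcounts bumped; no string data is copied.
    fn concat(&self, other: &Rope) -> Rope {
        let chunks = self.chunks.iter().chain(other.chunks.iter()).cloned().collect();
        Rope { chunks }
    }

    /// Copies all chunks into one contiguous `String` (for inspection only).
    fn flatten(&self) -> String {
        self.chunks.iter().map(|c| &**c).collect()
    }
}

fn main() {
    let a = Rope::from_chunks(&["foo", "bar"]);
    let b = Rope::from_chunks(&["baz"]);
    let c = a.concat(&b);
    // `a` and `c` point at the same underlying chunk.
    assert!(Rc::ptr_eq(&a.chunks[0], &c.chunks[0]));
    println!("{}", c.flatten());
}
```

Both input ropes remain usable after `concat`, which is the persistence property the paragraph refers to.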
-If we want more sharing, then a [2-3 finger tree][] could be a good choice.
-We would probably stick with `VecDeque` for ropes under a certain size.
+If we want more sharing, then a [2-3 finger tree][] could be a good choice. We
+would probably stick with `VecDeque` for ropes under a certain size.

 ### UTF-16 compatibility

-SpiderMonkey expects text to be in UCS-2 format for the most part. The
-semantics of JavaScript strings are difficult to implement on UTF-8. This also
-applies to HTML parsing via `document.write`. Also, passing SpiderMonkey a
-string that isn't contiguous in memory will incur additional overhead and
-complexity, if not a full copy.
+SpiderMonkey expects text to be in UCS-2 format for the most part. The semantics
+of JavaScript strings are difficult to implement on UTF-8. This also applies to
+HTML parsing via `document.write`. Also, passing SpiderMonkey a string that
+isn't contiguous in memory will incur additional overhead and complexity, if not
+a full copy.

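
One way to see the mismatch described above: JavaScript indexes strings by 16-bit code unit, while UTF-8 text has no constant-time mapping from such an index to a byte offset. The sketch below uses only standard library calls on `&str`; the helper names `code_unit_at` and `to_utf16` are hypothetical. Note that `&str` cannot hold lone surrogates, which is one reason a WTF-8 buffer would be needed in practice, so this only approximates the situation.

```rust
/// Hypothetical helper: the UTF-16 code unit at `index`, similar in spirit to
/// JavaScript's `charCodeAt`. On UTF-8 input this is a linear scan.
fn code_unit_at(text: &str, index: usize) -> Option<u16> {
    text.encode_utf16().nth(index)
}

/// Hypothetical helper: converts to a contiguous UTF-16 buffer, after which
/// indexing by code unit is a plain slice access.
fn to_utf16(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}

fn main() {
    // U+1F600 takes 4 bytes in UTF-8 and 2 code units in UTF-16.
    let text = "a\u{1F600}b";
    let wide = to_utf16(text);
    // Byte length and code-unit length differ, so indices do not line up.
    println!("utf8 bytes: {}, utf16 units: {}", text.len(), wide.len());
    // Both paths agree on the unit at index 3, but only the second is O(1).
    assert_eq!(code_unit_at(text, 3), Some(wide[3]));
}
```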
 *Solution:* Use **WTF-8 in parsing** and in the DOM. Servo will **convert to
 contiguous UTF-16 when necessary**. The conversion can easily be parallelized,