Commit a86e65d

Text-to-speech and content extraction (#48)
1 parent 884063c commit a86e65d

63 files changed, +3963 -162 lines changed

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
@@ -6,6 +6,16 @@ All notable changes to this project will be documented in this file. Take a look

## [Unreleased]

### Added

#### Shared

* [Extract the raw content (text, images, etc.) of a publication](Documentation/Guides/Content.md).

#### Navigator

* [A brand new text-to-speech implementation](Documentation/Guides/TTS.md).

### Deprecated

#### Shared

Documentation/Guides/Content.md

Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
# Extracting the content of a publication

:warning: The described feature is still experimental and the implementation is incomplete.

Many high-level features require access to the raw content (text, media, etc.) of a publication, for example:

* Text-to-speech
* Accessibility reader
* Basic search
* Full-text search indexing
* Image or audio indexes

The `ContentService` provides a way to iterate through a publication's content, extracted as semantic elements.

First, request the publication's `Content`, starting from a given `Locator`. If the locator is missing, the `Content` will be extracted from the beginning of the publication.

```swift
guard let content = publication.content(from: startLocator) else {
    // Abort as the content cannot be extracted
    return
}
```

## Extracting the raw text content

Getting the whole raw text of a publication is such a common use case that a helper is available on `Content`:

```swift
let wholeText = content.text()
```

This is an expensive operation: proceed with caution and cache the result if you need to reuse it.

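Because the extraction is expensive, one option is to memoize the result per publication. The following is a minimal sketch under stated assumptions: `TextCache` and its string key are hypothetical names that do not exist in the toolkit, and the `extract` closure stands in for a call such as `content.text()`.

```swift
import Foundation

/// Hypothetical helper, not part of the toolkit: memoizes an expensive
/// text extraction under a caller-chosen key (e.g. a publication identifier).
final class TextCache {
    private var storage: [String: String] = [:]
    private let lock = NSLock()

    /// Returns the cached text for `key`, running `extract` only on a miss.
    /// `extract` stands in for a call such as `content.text()`.
    func text(for key: String, extract: () -> String?) -> String? {
        lock.lock()
        defer { lock.unlock() }
        if let cached = storage[key] {
            return cached
        }
        guard let text = extract() else {
            return nil // Nothing is cached when the extraction fails.
        }
        storage[key] = text
        return text
    }
}

// Usage: the closure runs once; the second lookup is served from the cache.
let cache = TextCache()
var extractionCount = 0
let first = cache.text(for: "book-1") {
    extractionCount += 1
    return "Call me Ishmael."
}
let second = cache.text(for: "book-1") {
    extractionCount += 1
    return "Call me Ishmael."
}
```

The lock is held while `extract` runs, which keeps the sketch simple but blocks concurrent lookups. A real cache would likely also need an eviction policy, since the whole text of a publication can be large.
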
## Iterating through the content

You can iterate through the individual `Content` elements with a regular `for` loop by converting the `Content` to a sequence:

```swift
for element in content.sequence() {
    // Process element
}
```

Alternatively, you can get the whole list of elements with `content.elements()`, or use the lower-level APIs to iterate the content manually:

```swift
let iterator = content.iterator()
while let element = try iterator.next() {
    print(element)
}
```

Some `Content` implementations support bidirectional iteration. To iterate backwards, use:

```swift
let iterator = content.iterator()
while let element = try iterator.previous() {
    print(element)
}
```

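To make the `next()` / `previous()` pairing more concrete, here is a self-contained sketch of a bidirectional cursor over an array. `ElementIterator` and `ArrayElementIterator` are stand-ins invented for this illustration. The cursor semantics, where the position sits between two elements, are an assumption and not necessarily how the toolkit's iterators behave.

```swift
/// Stand-in for a bidirectional content iterator; the toolkit's actual
/// protocol may differ.
protocol ElementIterator {
    associatedtype Element
    mutating func next() throws -> Element?
    mutating func previous() throws -> Element?
}

/// Array-backed cursor: `index` sits between elements, so `next()` returns
/// the element after the cursor and `previous()` the one before it.
struct ArrayElementIterator<Element>: ElementIterator {
    private let elements: [Element]
    private var index: Int

    init(_ elements: [Element], startIndex: Int = 0) {
        self.elements = elements
        // Clamp the start position to the valid cursor range.
        self.index = min(max(startIndex, 0), elements.count)
    }

    mutating func next() throws -> Element? {
        guard index < elements.count else { return nil }
        defer { index += 1 }
        return elements[index]
    }

    mutating func previous() throws -> Element? {
        guard index > 0 else { return nil }
        index -= 1
        return elements[index]
    }
}

// Start in the middle, as if the content was requested from a locator.
var cursor = ArrayElementIterator(["heading", "paragraph", "image"], startIndex: 1)
let forward = try cursor.next()      // element after the cursor
let back = try cursor.previous()     // stepping back over the same element
let before = try cursor.previous()   // the element before the start position
let atStart = try cursor.previous()  // no more elements in this direction
```

With this design, calling `previous()` right after `next()` returns the same element again, which is worth keeping in mind when switching direction.
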
## Processing the elements

The `Content` iterator yields `ContentElement` objects, each representing a single semantic portion of the publication, such as a heading, a paragraph or an embedded image.

Every element has a `locator` property targeting it in the publication. You can use the locator, for example, to navigate to the element or to draw a `Decoration` on top of it.

```swift
navigator.go(to: element.locator)
```

### Types of elements

Depending on the actual implementation of `ContentElement`, more properties are available to access the actual data. The toolkit ships with a number of default implementations for common types of elements.

#### Embedded media

The `EmbeddedContentElement` protocol is implemented by any element referencing an external resource. It contains an `embeddedLink` property you can use to get the actual content of the resource.

```swift
if let element = element as? EmbeddedContentElement {
    let bytes = try publication
        .get(element.embeddedLink)
        .read().get()
}
```

The default implementations are:

* `AudioContentElement` - audio clips
* `VideoContentElement` - video clips
* `ImageContentElement` - bitmap images, with the additional property:
    * `caption: String?` - figure caption, when available

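One way to handle these types is to dispatch on the concrete element with a `switch` and type-casting patterns. The sketch below is self-contained: `Link`, the protocols and the three structs are simplified stand-ins that only mirror the names used in this guide, and `describe(_:)` is a hypothetical helper.

```swift
// Stand-in declarations mirroring the names in this guide; the toolkit's
// actual types carry more requirements (locator, attributes, etc.).
struct Link { let href: String }
protocol ContentElement {}
protocol EmbeddedContentElement: ContentElement {
    var embeddedLink: Link { get }
}
struct AudioContentElement: EmbeddedContentElement { let embeddedLink: Link }
struct VideoContentElement: EmbeddedContentElement { let embeddedLink: Link }
struct ImageContentElement: EmbeddedContentElement {
    let embeddedLink: Link
    let caption: String?
}
struct OtherElement: ContentElement {}

/// Hypothetical helper returning a short description of an element.
func describe(_ element: ContentElement) -> String {
    switch element {
    case let image as ImageContentElement:
        let caption = image.caption ?? "no caption"
        return "image \(image.embeddedLink.href) (\(caption))"
    case let audio as AudioContentElement:
        return "audio \(audio.embeddedLink.href)"
    case let video as VideoContentElement:
        return "video \(video.embeddedLink.href)"
    case let embedded as EmbeddedContentElement:
        // Catches custom embedded elements not covered above.
        return "embedded resource \(embedded.embeddedLink.href)"
    default:
        return "non-embedded element"
    }
}

let cover = ImageContentElement(embeddedLink: Link(href: "images/cover.jpg"), caption: nil)
let coverSummary = describe(cover)
let clipSummary = describe(AudioContentElement(embeddedLink: Link(href: "audio/ch1.mp3")))
let otherSummary = describe(OtherElement())
```

Placing the generic `EmbeddedContentElement` case after the specific ones lets custom embedded types fall through to a sensible default.
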
#### Text

##### Textual elements

The `TextualContentElement` protocol is implemented by any element which can be represented as human-readable text. This is useful when you want to extract the text content of a publication regardless of the specific type of each element.

```swift
let wholeText = (publication.content()?.elements() ?? [])
    // Keep only the elements that can be represented as text.
    .compactMap { ($0 as? TextualContentElement)?.text }
    // Skip the elements without any text.
    .filter { !$0.isEmpty }
    .joined(separator: "\n")
```

##### Text elements

Actual text elements are instances of `TextContentElement`, which represents a single block of text such as a heading, a paragraph or a list item. It is composed of a `role` and a list of `segments`.

The `role` is the nature of the text element in the document, for example a heading, body text, a footnote or a quote. It can be used to reconstruct part of the structure of the original document.

A text element is composed of individual segments with their own `locator` and `attributes`. They are useful to associate attributes with a portion of a text element. For example, given the HTML paragraph:

```html
<p>It is pronounced <span lang="fr">croissant</span>.</p>
```

The following `TextContentElement` will be produced:

```swift
TextContentElement(
    role: .body,
    segments: [
        TextContentElement.Segment(text: "It is pronounced "),
        TextContentElement.Segment(
            text: "croissant",
            attributes: [ContentAttribute(key: .language, value: "fr")]
        ),
        TextContentElement.Segment(text: ".")
    ]
)
```

If you are not interested in the segment attributes, you can also use `element.text` to get the concatenated raw text.

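As an illustration of why segments matter, a text-to-speech feature could pair each segment with the language it should be spoken in. The sketch below is self-contained and uses simplified stand-ins (`AttributeKey`, `ContentAttribute`, `Segment`, `TextElement`) rather than the toolkit's real declarations. `utterances(for:defaultLanguage:)` is a hypothetical helper.

```swift
// Simplified stand-ins for the types described in this guide.
enum AttributeKey { case language }
struct ContentAttribute {
    let key: AttributeKey
    let value: String
}
struct Segment {
    let text: String
    var attributes: [ContentAttribute] = []
}
struct TextElement {
    let segments: [Segment]
    /// Concatenated raw text of all the segments.
    var text: String { segments.map { $0.text }.joined() }
}

/// Hypothetical helper: pairs the text of each segment with its language,
/// falling back to `defaultLanguage` when the segment declares none.
func utterances(
    for element: TextElement,
    defaultLanguage: String
) -> [(text: String, language: String)] {
    element.segments.map { segment in
        let language = segment.attributes
            .first { $0.key == .language }?
            .value
        return (text: segment.text, language: language ?? defaultLanguage)
    }
}

// The HTML paragraph from the example above, expressed with the stand-ins.
let paragraph = TextElement(segments: [
    Segment(text: "It is pronounced "),
    Segment(text: "croissant", attributes: [ContentAttribute(key: .language, value: "fr")]),
    Segment(text: "."),
])
let spoken = utterances(for: paragraph, defaultLanguage: "en")
```

Here `spoken` contains three entries, with only the middle one marked as French, while `paragraph.text` still gives the full sentence for callers that do not care about attributes.
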
### Element attributes

All types of `ContentElement` can have associated attributes. Custom `ContentService` implementations can use this as an extensibility point.

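A possible shape for such an extensibility point is an open set of keys that custom code can extend. The sketch below is purely illustrative: `AttributeKey`, `Attribute`, the `pageNumber` key and the `value(for:as:)` lookup are all assumptions made for this example and may not match how the toolkit models attributes.

```swift
/// Stand-in for an extensible attribute key (hypothetical).
struct AttributeKey: Hashable {
    let rawValue: String
    static let language = AttributeKey(rawValue: "language")
}

/// Stand-in for an attribute attached to an element (hypothetical).
struct Attribute {
    let key: AttributeKey
    let value: Any
}

// A custom service could declare its own keys in an extension.
extension AttributeKey {
    static let pageNumber = AttributeKey(rawValue: "pageNumber")
}

extension Array where Element == Attribute {
    /// Returns the first value stored under `key`, if it has the expected type.
    func value<T>(for key: AttributeKey, as type: T.Type = T.self) -> T? {
        guard let attribute = first(where: { $0.key == key }) else {
            return nil
        }
        return attribute.value as? T
    }
}

let attributes = [
    Attribute(key: .language, value: "fr"),
    Attribute(key: .pageNumber, value: 12),
]
let page: Int? = attributes.value(for: .pageNumber)
let language: String? = attributes.value(for: .language)
let mistyped: Int? = attributes.value(for: .language) // wrong type, so nil
```

Using a struct with static constants instead of a closed enum is what allows third-party code to add keys without modifying the original declaration.
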
## Use cases

### An index of all images embedded in the publication

This example extracts all the embedded images in the publication and displays them in a SwiftUI list. Tapping an image jumps to its location in the publication.

```swift
import SwiftUI
// Also import the Readium modules providing `Publication` and `Navigator`.

struct ImageIndex: View {
    struct Item: Hashable {
        let locator: Locator
        let text: String?
        let image: UIImage
    }

    let publication: Publication
    let navigator: Navigator
    @State private var items: [Item] = []

    init(publication: Publication, navigator: Navigator) {
        self.publication = publication
        self.navigator = navigator
    }

    var body: some View {
        ScrollView {
            LazyVStack {
                ForEach(items, id: \.self) { item in
                    VStack {
                        Image(uiImage: item.image)
                        Text(item.text ?? "No caption")
                    }
                    .onTapGesture {
                        navigator.go(to: item.locator)
                    }
                }
            }
        }
        .onAppear {
            // Collect every image element whose resource can be decoded.
            items = publication.content()?
                .elements()
                .compactMap { element in
                    guard
                        let element = element as? ImageContentElement,
                        let image = try? publication.get(element.embeddedLink)
                            .read().map(UIImage.init).get()
                    else {
                        return nil
                    }

                    return Item(
                        locator: element.locator,
                        text: element.caption ?? element.accessibilityLabel,
                        image: image
                    )
                }
                ?? []
        }
    }
}
```

## References

* [Content Iterator proposal](https://github.com/readium/architecture/pull/177)
