Commit 14a4bc5

🐛 Add --plain-encoding option to dinglehopper-extract
1 parent a70260c commit 14a4bc5

1 file changed, +9 -2 lines changed

src/dinglehopper/cli_extract.py

Lines changed: 9 additions & 2 deletions
@@ -12,7 +12,12 @@
     help="PAGE TextEquiv level to extract text from",
     metavar="LEVEL",
 )
-def main(input_file, textequiv_level):
+@click.option(
+    "--plain-encoding",
+    default="autodetect",
+    help='Encoding (e.g. "utf-8") of plain text files',
+)
+def main(input_file, textequiv_level, plain_encoding):
     """
     Extract the text of the given INPUT_FILE.
 
@@ -23,7 +28,9 @@ def main(input_file, textequiv_level):
     use "--textequiv-level line" to extract from the level of TextLine tags.
     """
     initLogging()
-    input_text = extract(input_file, textequiv_level=textequiv_level).text
+    input_text = extract(
+        input_file, textequiv_level=textequiv_level, plain_encoding=plain_encoding
+    ).text
     print(input_text)
 
 
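The diff only forwards `plain_encoding` to `extract()`; it does not show how the value is interpreted. As a rough illustration of what a default such as "autodetect" could mean, here is a minimal sketch. The helper `decode_plain_text` is hypothetical and not part of dinglehopper, and the "try UTF-8, then fall back to Latin-1" rule is an assumed stand-in for whatever detection the library actually performs.

```python
def decode_plain_text(data: bytes, plain_encoding: str = "autodetect") -> str:
    """Decode the raw bytes of a plain text file.

    Hypothetical helper for illustration only; dinglehopper's own
    handling of "autodetect" may differ.
    """
    if plain_encoding != "autodetect":
        # An explicit encoding is applied strictly, so a wrong choice
        # raises UnicodeDecodeError instead of silently producing mojibake.
        return data.decode(plain_encoding)

    # Assumed stand-in for autodetection: prefer UTF-8, and fall back to
    # Latin-1, which can decode any byte sequence.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("latin-1")
```

With the option defined as in the diff, an explicit encoding would presumably be passed as `dinglehopper-extract --plain-encoding utf-8 INPUT_FILE`, while omitting the option keeps the "autodetect" default.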
