* Update `img2txt` docs and add table styles
Expanded the `img2txt` documentation to include new options for provider selection and advanced OCR features.
* Update `img2txt` example image URLs
src/AI/img2txt.md: 21 additions & 5 deletions
@@ -4,24 +4,40 @@ description: Extract text from images using OCR to read printed text, handwritin
 platforms: [websites, apps, nodejs, workers]
 ---

-Given an image will return the text contained in the image. Also known as OCR (Optical Character Recognition), this API can be used to extract text from images of printed text, handwriting, or any other text-based content.
+Given an image will return the text contained in the image. Also known as OCR (Optical Character Recognition), this API can be used to extract text from images of printed text, handwriting, or any other text-based content. You can choose between AWS Textract (default) or Mistral’s OCR service when you need multilingual or richer annotation output.
-A string containing the URL, or path (on Puter) of the image you want to recognize, or a `File` or `Blob` object containing the image.
+A string containing the URL or Puter path of the image you want to recognize, or a `File`/`Blob` object containing the image. When calling with an options object, pass it as `{ source: ... }`.

 #### `testMode` (Boolean) (Optional)

 A boolean indicating whether you want to use the test API. Defaults to `false`. This is useful for testing your code without using up API credits.

+#### `options` (Object) (Optional)
+
+An options object with the following properties:
+
+- `provider` (String) (Optional) - Choose the OCR backend. Can be `aws-textract` or `mistral`. Defaults to `aws-textract`.
+- `model` (String) (Optional) - Mistral OCR model to use. Defaults to `mistral-ocr-latest`.
+- `pages` (Array) (Optional) - Limit processing to specific page numbers (multi-page PDFs).
+- `includeImageBase64` (Boolean) (Optional) - Mistral-only: requests the base64 of cropped regions in the response.
+- `imageLimit` (Number) (Optional) - Control how many images are analyzed per document (Mistral).
+- `imageMinSize` (Number) (Optional) - Set minimum size for images to be analyzed (Mistral).
+- `bboxAnnotationFormat` (String) (Optional) - Mistral: format for bounding-box annotations (e.g., `yolo`, `xyxy`).
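
For reviewers who want to see how the new `options` parameter might be used, here is a minimal sketch. It rests on several assumptions that the diff does not confirm: that the option properties sit in the same object as `source`, and that `puter.ai.img2txt` returns a promise resolving to the recognized text. `buildImg2txtOptions` is a hypothetical helper written for this illustration (it is not part of Puter.js), and the image URL is a placeholder.

```javascript
// Provider names taken from the `provider` option documented in the diff.
const PROVIDERS = ["aws-textract", "mistral"];

// Hypothetical helper (not part of Puter.js): merges the documented default
// provider with caller overrides and rejects unknown provider names.
function buildImg2txtOptions(source, overrides = {}) {
  const options = { source, provider: "aws-textract", ...overrides };
  if (!PROVIDERS.includes(options.provider)) {
    throw new Error(`Unknown OCR provider: ${options.provider}`);
  }
  return options;
}

// Placeholder URL; replace with a real image URL or Puter path.
const mistralOptions = buildImg2txtOptions("https://example.com/scan.png", {
  provider: "mistral",
  model: "mistral-ocr-latest",
  includeImageBase64: false,
});

// Only attempt the call where Puter.js is loaded (for example, in a web page).
// Assumption: the call returns a promise that resolves to the extracted text.
if (typeof puter !== "undefined") {
  puter.ai.img2txt(mistralOptions).then((text) => console.log(text));
}
```

Keeping the option-building step separate from the call makes it possible to check the provider name before any request is sent.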