-        import transformers  # Wait until we use it. Transformers can't be lazy loaded for some reason!
-
-        os.environ["TOKENIZERS_PARALLELISM"] = "false"
-
-        if self.computer.debug:
-            print(
-                "Open Interpreter will use Moondream (tiny vision model) to describe images to the language model. Set `interpreter.llm.vision_renderer = None` to disable this behavior."
+        if self.easyocr == None and load_easyocr:
+            import easyocr
+
+            self.easyocr = easyocr.Reader(
+                ["en"]
+            )  # this needs to run only once to load the model into memory
+
+        if self.model == None and load_moondream:
+            import transformers  # Wait until we use it. Transformers can't be lazy loaded for some reason!
+
+            os.environ["TOKENIZERS_PARALLELISM"] = "false"
+
+            if self.computer.debug:
+                print(
+                    "Open Interpreter will use Moondream (tiny vision model) to describe images to the language model. Set `interpreter.llm.vision_renderer = None` to disable this behavior."
+                )
+                print(
+                    "Alternatively, you can use a vision-supporting LLM and set `interpreter.llm.supports_vision = True`."
interpreter/terminal_interface/profiles/defaults/codestral-os.py (4 additions, 3 deletions)
@@ -113,13 +113,14 @@
 interpreter.offline = True
 interpreter.os = True
 
+# Vision setup
+interpreter.computer.vision.load()
+
 # Final message
 interpreter.display_message(
     "**Warning:** In this mode, Open Interpreter will not require approval before performing actions. Be ready to close your terminal."
 )
 interpreter.display_message(
     "\n**Note:** Codestral is a relatively weak model, so OS mode is highly experimental. Try using a more powerful model for OS mode with `interpreter --os`."
 )
-interpreter.display_message(
-    "> Model set to `codestral`, experimental OS control enabled"
-)
+interpreter.display_message("> Experimental OS control enabled.")
interpreter/terminal_interface/profiles/defaults/codestral-vision.py (22 additions, 2 deletions)
@@ -17,7 +17,7 @@
 User: The code you ran produced no output. Was this expected, or are we finished?
 Assistant: No further action is required; the provided snippet opens Chrome.
 
-You have access to ONE special function called `computer.vision.query(query="Describe this image.", path="image.jpg")`. This will ask a vision AI model the query, regarding the image at path. For example:
+You have access to TWO special functions called `computer.vision.query(query="Describe this image.", path="image.jpg")` (asks a vision AI model the query, regarding the image at path) and `computer.vision.ocr(path="image.jpg")` (returns text in the image at path). For example:
 
 User: Rename the images on my desktop to something more descriptive.
 Assistant: Viewing and renaming images.
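As a short illustration of how the two functions described in the updated prompt are called from the model's generated Python, using a made-up image path (`computer` is the object Open Interpreter exposes inside its code execution environment):

```python
# Ask the vision model a free-form question about an image.
description = computer.vision.query(
    query="Describe this image.", path="screenshot.png"
)

# Extract the raw text from the same image.
text = computer.vision.ocr(path="screenshot.png")

print(description)
print(text)
```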
@@ -53,6 +53,25 @@
 ```
 User: The code you ran produced no output. Was this expected, or are we finished?
 Assistant: We are finished.
+User: What text is in the image 'user.png' on my desktop?
+Assistant: ```python
+import os
+import string
+from pathlib import Path
+
+# Get the user's home directory in a cross-platform way
[…]
+User: The code you ran produced this output: "29294 is the username". What does this mean?
+Assistant: The output means that the `user.png` image on your desktop contains the text "29294 is the username".
 
 NEVER use placeholders. Always specify exact paths, and use cross-platform ways of determining the desktop, documents, etc. folders.
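The body of the added OCR example is elided above. As a purely hypothetical sketch, not the PR's actual text, the kind of cross-platform folder lookup that the "NEVER use placeholders" rule calls for could look like this:

```python
import os
from pathlib import Path

# Resolve well-known folders from the home directory instead of hard-coding
# an OS-specific path such as "C:\\Users\\name\\Desktop".
home = Path(os.path.expanduser("~"))
desktop = home / "Desktop"
documents = home / "Documents"

image_path = desktop / "user.png"
print(image_path)
```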
@@ -65,15 +84,16 @@
 
 # LLM settings
 interpreter.llm.model = "ollama/codestral"
-interpreter.llm.load()  # Loads Ollama models
 interpreter.llm.supports_functions = False
 interpreter.llm.execution_instructions = False
 interpreter.llm.max_tokens = 1000
 interpreter.llm.context_window = 7000
+interpreter.llm.load()  # Loads Ollama models
 
 # Computer settings
 interpreter.computer.import_computer_api = True
 interpreter.computer.system_message = ""  # The default will explain how to use the full Computer API, and append this to the system message. For local models, we want more control, so we set this to "". The system message will ONLY be what's above ^
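After this hunk, the profile's LLM block reads as below. Presumably the reordering ensures `interpreter.llm.load()` runs only once the token and context limits are set; that rationale is an inference, and the import line is assumed from the usual profile layout rather than shown in the diff.

```python
from interpreter import interpreter  # assumed profile entry point

# LLM settings
interpreter.llm.model = "ollama/codestral"
interpreter.llm.supports_functions = False
interpreter.llm.execution_instructions = False
interpreter.llm.max_tokens = 1000
interpreter.llm.context_window = 7000
interpreter.llm.load()  # Loads Ollama models, now after all settings are applied
```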