Start `operate` with the SoM model

```
operate -m gpt-4-with-som
```

### Locally Hosted LLaVA Through Ollama
If you wish to experiment with the Self-Operating Computer Framework using LLaVA on your own machine, you can do so with Ollama!
*Note: Ollama currently only supports macOS and Linux.*

First, install Ollama on your machine from https://ollama.ai/download.
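On Linux, the download page also offers a one-line install script; the command below is a sketch of that route and assumes the script is still served at the URL shown, so defer to https://ollama.ai/download if the instructions there differ:

```
curl https://ollama.ai/install.sh | sh
```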
Once Ollama is installed, pull the LLaVA model:
```
ollama pull llava
```
This will download the model to your machine, which takes approximately 5 GB of storage.
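Before moving on, you can confirm the download with Ollama's built-in model listing (a quick sanity check; `llava` should appear in the output once the pull has completed):

```
ollama list
```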
When Ollama has finished pulling LLaVA, start the server:
```
ollama serve
```
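By default the Ollama server listens on `http://localhost:11434`. If `operate` cannot reach it, one way to confirm the server is running is to query its `/api/tags` endpoint from another terminal (a minimal check; the response should list `llava` among the locally available models):

```
curl http://localhost:11434/api/tags
```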
That's it! Now start `operate` and select the LLaVA model:
```
operate -m llava
```
**Important:** Error rates when using LLaVA are very high. This is intended as a baseline to build on as local multimodal models improve over time.

Learn more about Ollama at its [GitHub Repository](https://www.github.com/ollama/ollama).

### Voice Mode `--voice`
The framework supports voice inputs for the objective. Try voice by following the instructions below.
**Clone the repo** to a directory on your computer: