Commit a9351f2

Author: 何涛
Message: add qwen vl to readme
1 parent c465dab, commit a9351f2

4 files changed: +11 -3 lines changed

README.md

Lines changed: 8 additions & 1 deletion
@@ -20,7 +20,7 @@ ome
 
 ## Key Features
 - **Compatibility**: Designed for various multimodal models.
-- **Integration**: Currently integrated with **GPT-4o, o1, Gemini Pro Vision, Claude 3 and LLaVa.**
+- **Integration**: Currently integrated with **GPT-4o, o1, Gemini Pro Vision, Claude 3, Qwen-VL and LLaVa.**
 - **Future Plans**: Support for additional models.
 
 ## Demo
@@ -76,6 +76,13 @@ Use Claude 3 with Vision to see how it stacks up to GPT-4-Vision at operating a
 operate -m claude-3
 ```
 
+#### Try qwen `-m qwen-vl`
+Use Qwen-vl with Vision to see how it stacks up to GPT-4-Vision at operating a computer. Navigate to the [Qwen dashboard](https://bailian.console.aliyun.com/) to get an API key and run the command below to try it.
+
+```
+operate -m qwen-vl
+```
+
 #### Try LLaVa Hosted Through Ollama `-m llava`
 If you wish to experiment with the Self-Operating Computer Framework using LLaVA on your own machine, you can with Ollama!
 *Note: Ollama currently only supports MacOS and Linux. Windows now in Preview*

operate/main.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 """
 import argparse
 from operate.utils.style import ANSI_BRIGHT_MAGENTA
-from operate.run_operate import main
+from operate.operate import main
 
 
 def main_entry():

operate/models/apis.py

Lines changed: 2 additions & 1 deletion
@@ -168,7 +168,8 @@ async def call_qwen_vl_with_ocr(messages, objective, model):
         vision_message = {
             "role": "user",
             "content": [
-                {"type": "text", "text": user_prompt},
+                {"type": "text",
+                 "text": f"{user_prompt}**REMEMBER** Only output json format, do not append any other text."},
                 {
                     "type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
File renamed without changes.
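
Reviewer sketch for the `operate/models/apis.py` hunk: a minimal, self-contained illustration of the message shape that hunk produces, with the JSON reminder appended to the prompt. The helper names `build_vision_message` and `parse_model_json` are hypothetical and do not appear in the commit. The fence-stripping fallback is an assumption about how a caller might cope with a model that still wraps its reply in a code fence, not something this commit adds. Note that the f-string in the hunk joins `user_prompt` and the reminder with no separator, and the sketch reproduces that.

```python
import base64
import json

# Reminder text copied from the hunk; it is appended directly to the prompt.
JSON_REMINDER = "**REMEMBER** Only output json format, do not append any other text."

# Built programmatically so the sketch does not contain a literal fence marker.
FENCE = "`" * 3


def build_vision_message(user_prompt: str, image_bytes: bytes) -> dict:
    """Hypothetical helper: one user message with a text part and an inline JPEG part."""
    img_base64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            # No separator between prompt and reminder, matching the hunk.
            {"type": "text", "text": f"{user_prompt}{JSON_REMINDER}"},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{img_base64}"},
            },
        ],
    }


def parse_model_json(reply: str):
    """Assumed fallback: drop an optional surrounding code fence, then parse JSON."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Remove the opening fence line (which may carry a language tag).
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Remove the closing fence and anything after it.
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)
```

If the reminder works as intended, `parse_model_json` reduces to a plain `json.loads` call; the fence handling only matters when the model ignores the instruction.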
