Commit 11875bd

Add comprehensive task tables to README
Addresses feedback about unclear task availability. Added detailed tables showing:

- MiniWoB tasks organized by category (click, text entry, navigation, visual, email/social)
- All 100+ tasks listed with descriptions and difficulty ratings
- WebArena tasks grouped by website (~812 total)
- Task examples and usage patterns
- Links to full task lists for each benchmark

Makes it much clearer what tasks are available and how to use them.
1 parent 2bbd017 commit 11875bd

1 file changed: +149 -15 lines changed

src/envs/browsergym_env/README.md

Lines changed: 149 additions & 15 deletions
@@ -48,21 +48,155 @@ for episode in range(1000):
 env.close()
 ```
 
-### Available MiniWoB Tasks (100+ total)
-
-Popular training tasks:
-- **click-test**: Click on a specific button
-- **click-button**: Click buttons with specific text
-- **click-dialog**: Click buttons in dialogs
-- **click-checkboxes**: Select specific checkboxes
-- **click-link**: Click on links
-- **enter-text**: Type text into input fields
-- **navigate-tree**: Navigate through tree structures
-- **search-engine**: Use a search interface
-- **use-autocomplete**: Interact with autocomplete
-- **login-user**: Fill in login forms
-
-And many more! See [MiniWoB++ documentation](https://github.com/Farama-Foundation/miniwob-plusplus) for the full list.
+### Available Tasks by Benchmark
+
+#### MiniWoB++ Tasks (Training - 100+ tasks)
+
+MiniWoB tasks are organized by difficulty and type. Here are the main categories:
+
+**Click Tasks** (Basic interaction)
+| Task Name | Description | Difficulty |
+|-----------|-------------|------------|
+| `click-test` | Click a single button | ⭐ Easy |
+| `click-button` | Click button with specific text | ⭐ Easy |
+| `click-button-sequence` | Click buttons in order | ⭐⭐ Medium |
+| `click-checkboxes` | Select specific checkboxes | ⭐⭐ Medium |
+| `click-checkboxes-soft` | Select checkboxes (multiple valid) | ⭐⭐ Medium |
+| `click-checkboxes-large` | Many checkboxes to select from | ⭐⭐ Medium |
+| `click-checkboxes-transfer` | Transfer learning variation | ⭐⭐ Medium |
+| `click-dialog` | Click correct button in dialog | ⭐ Easy |
+| `click-dialog-2` | More complex dialog | ⭐⭐ Medium |
+| `click-link` | Click on a link | ⭐ Easy |
+| `click-option` | Select from dropdown | ⭐⭐ Medium |
+| `click-pie` | Click on pie chart slice | ⭐⭐ Medium |
+| `click-scroll-list` | Click item in scrollable list | ⭐⭐⭐ Hard |
+| `click-shades` | Click on specific color shade | ⭐⭐ Medium |
+| `click-shape` | Click on specific shape | ⭐⭐ Medium |
+| `click-tab` | Switch between tabs | ⭐⭐ Medium |
+| `click-tab-2` | More complex tab switching | ⭐⭐⭐ Hard |
+| `click-widget` | Click on UI widget | ⭐⭐ Medium |
+
+**Text Entry Tasks** (Typing and forms)
+| Task Name | Description | Difficulty |
+|-----------|-------------|------------|
+| `enter-text` | Type text into input field | ⭐ Easy |
+| `enter-text-dynamic` | Dynamic text entry | ⭐⭐ Medium |
+| `enter-text-2` | Multiple text fields | ⭐⭐ Medium |
+| `enter-password` | Fill password field | ⭐ Easy |
+| `enter-date` | Enter a date | ⭐⭐ Medium |
+| `enter-time` | Enter a time | ⭐⭐ Medium |
+| `login-user` | Complete login form | ⭐⭐ Medium |
+| `login-user-popup` | Login via popup | ⭐⭐⭐ Hard |
+
+**Navigation Tasks** (Multi-step interaction)
+| Task Name | Description | Difficulty |
+|-----------|-------------|------------|
+| `navigate-tree` | Navigate through tree structure | ⭐⭐⭐ Hard |
+| `search-engine` | Use search interface | ⭐⭐ Medium |
+| `use-autocomplete` | Interact with autocomplete | ⭐⭐⭐ Hard |
+| `book-flight` | Book a flight (complex form) | ⭐⭐⭐⭐ Very Hard |
+| `choose-date` | Pick date from calendar | ⭐⭐⭐ Hard |
+| `choose-date-easy` | Simplified date picker | ⭐⭐ Medium |
+| `choose-date-medium` | Medium difficulty date picker | ⭐⭐⭐ Hard |
+| `choose-list` | Select from long list | ⭐⭐ Medium |
+
+**Visual/Spatial Tasks** (Requires visual understanding)
+| Task Name | Description | Difficulty |
+|-----------|-------------|------------|
+| `count-sides` | Count sides of shape | ⭐⭐ Medium |
+| `count-shape` | Count specific shapes | ⭐⭐ Medium |
+| `find-word` | Find word in text | ⭐⭐ Medium |
+| `focus-text` | Focus on text element | ⭐ Easy |
+| `focus-text-2` | More complex focus task | ⭐⭐ Medium |
+| `grid-coordinate` | Click grid coordinate | ⭐⭐ Medium |
+| `guess-number` | Guess a number game | ⭐⭐⭐ Hard |
+| `identify-shape` | Identify shape type | ⭐⭐ Medium |
+| `read-table` | Extract info from table | ⭐⭐⭐ Hard |
+| `read-table-2` | More complex table reading | ⭐⭐⭐ Hard |
+
+**Email/Social Tasks** (Realistic scenarios)
+| Task Name | Description | Difficulty |
+|-----------|-------------|------------|
+| `email-inbox` | Manage email inbox | ⭐⭐⭐⭐ Very Hard |
+| `email-inbox-forward` | Forward emails | ⭐⭐⭐⭐ Very Hard |
+| `email-inbox-nl` | Natural language email task | ⭐⭐⭐⭐ Very Hard |
+| `email-inbox-star-reply` | Star and reply to emails | ⭐⭐⭐⭐ Very Hard |
+| `social-media` | Social media interaction | ⭐⭐⭐⭐ Very Hard |
+| `social-media-some` | Partial social media task | ⭐⭐⭐ Hard |
+
+**Total:** 100+ tasks across all categories
+
+**Usage:**
+```python
+# Easy task for quick testing
+env = BrowserGymEnv(environment={"BROWSERGYM_TASK_NAME": "click-test"})
+
+# Medium difficulty for training
+env = BrowserGymEnv(environment={"BROWSERGYM_TASK_NAME": "click-checkboxes"})
+
+# Hard task for evaluation
+env = BrowserGymEnv(environment={"BROWSERGYM_TASK_NAME": "email-inbox"})
+```
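
Inline note on the MiniWoB usage snippet: the difficulty ratings in the tables lend themselves to a simple curriculum. The following is a minimal sketch, assuming only the task names and ratings listed in this diff. `CURRICULUM`, `env_config_for`, and `sample_task` are hypothetical names introduced for illustration; they are not part of BrowserGym or this repository.

```python
import random

# Task names and ratings copied from the tables in this diff (a subset).
# CURRICULUM, env_config_for and sample_task are hypothetical helpers,
# not BrowserGym APIs.
CURRICULUM = {
    "easy": ["click-test", "click-button", "click-dialog", "click-link", "enter-text"],
    "medium": ["click-checkboxes", "click-option", "enter-date", "login-user", "search-engine"],
    "hard": ["click-scroll-list", "navigate-tree", "use-autocomplete", "read-table"],
    "very_hard": ["book-flight", "email-inbox", "social-media"],
}


def env_config_for(task_name: str) -> dict:
    """Build the `environment` dict in the shape used by the README snippet."""
    known = {name for names in CURRICULUM.values() for name in names}
    if task_name not in known:
        raise ValueError(f"unknown task: {task_name!r}")
    return {"BROWSERGYM_TASK_NAME": task_name}


def sample_task(difficulty: str, rng: random.Random) -> dict:
    """Pick one task of the requested difficulty and return its config."""
    return env_config_for(rng.choice(CURRICULUM[difficulty]))


rng = random.Random(0)  # seeded so repeated runs pick the same tasks
configs = [sample_task(level, rng) for level in ("easy", "medium", "hard")]
```

Each resulting dict has the same shape as the `environment=` argument in the snippet, so it could be passed to `BrowserGymEnv(...)` in the same way.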
+
+#### WebArena Tasks (Evaluation - 812 tasks)
+
+WebArena tasks are organized by website and difficulty. Tasks are numbered 0-811.
+
+**By Website:**
+| Website | Task Count | Description | Example Tasks |
+|---------|------------|-------------|---------------|
+| Shopping | ~200 | E-commerce site | Search products, add to cart, checkout |
+| Shopping Admin | ~150 | Admin panel | Manage products, orders, customers |
+| Reddit | ~150 | Forum/social | Post, comment, search discussions |
+| GitLab | ~200 | Code repository | Create issues, merge requests, review code |
+| Wikipedia | ~100 | Knowledge base | Search, read, extract information |
+| Map | ~12 | Location service | Find places, get directions |
+
+**By Difficulty:**
+| Difficulty | Task Count | Steps Required | Example |
+|------------|------------|----------------|---------|
+| Easy | ~200 | 1-5 steps | "Find the price of product X" |
+| Medium | ~400 | 5-15 steps | "Add cheapest laptop to cart" |
+| Hard | ~212 | 15+ steps | "Create merge request for bug fix" |
+
+**Usage:**
+```python
+# Task 0 (usually easy)
+env = BrowserGymEnv(environment={
+    "BROWSERGYM_BENCHMARK": "webarena",
+    "BROWSERGYM_TASK_NAME": "0",
+    "SHOPPING": "http://your-server:7770",
+    # ... other URLs
+})
+
+# Task 156 (GitLab merge request)
+env = BrowserGymEnv(environment={
+    "BROWSERGYM_BENCHMARK": "webarena",
+    "BROWSERGYM_TASK_NAME": "156",
+    # ... URLs
+})
+```
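
Inline note on the WebArena usage snippet: both examples repeat the same two keys and differ only in the task number and site URLs. The following is a minimal sketch of a helper that builds that dict and rejects task numbers outside the 0-811 range stated in the README text. `webarena_config` and `WEBARENA_TASK_COUNT` are hypothetical names; the only site key used is `SHOPPING`, because it is the only one the snippet shows, and any others are left for the caller to supply.

```python
# Hypothetical helper, not part of BrowserGym or this repository.
# Only the keys that appear in the README snippet are used here.
WEBARENA_TASK_COUNT = 812  # README text: tasks are numbered 0-811


def webarena_config(task_id: int, site_urls: dict) -> dict:
    """Build the `environment` dict for one WebArena task.

    `site_urls` maps site keys (for example "SHOPPING") to backend URLs.
    The benchmark keys are written last so they cannot be overridden.
    """
    if not 0 <= task_id < WEBARENA_TASK_COUNT:
        raise ValueError(
            f"task_id must be in 0..{WEBARENA_TASK_COUNT - 1}, got {task_id}"
        )
    return {
        **site_urls,
        "BROWSERGYM_BENCHMARK": "webarena",
        "BROWSERGYM_TASK_NAME": str(task_id),
    }


config = webarena_config(0, {"SHOPPING": "http://your-server:7770"})
```

Validating the task number up front surfaces a typo such as `812` before any backend connection is attempted.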
+
+**Note:** WebArena tasks require the full backend infrastructure. See [WebArena setup guide](https://github.com/web-arena-x/webarena/tree/main/environment_docker).
+
+#### VisualWebArena Tasks (910 tasks)
+
+Similar to WebArena but requires visual understanding. Tasks involve:
+- Image-based reasoning
+- Visual element identification
+- Multimodal interaction (text + images)
+
+#### WorkArena Tasks
+
+Enterprise software automation tasks:
+- CRM operations
+- Project management
+- Business workflows
+
+**Full task lists:**
+- [MiniWoB++ tasks](https://github.com/Farama-Foundation/miniwob-plusplus/tree/master/miniwob/environment)
+- [WebArena tasks](https://github.com/web-arena-x/webarena/blob/main/config_files/)
+- [BrowserGym documentation](https://github.com/ServiceNow/BrowserGym)
 
 ## Evaluation (WebArena)
 