The default evaluation setup adopts a client-server architecture where the policy (model) and the environment run in separate processes. This improves compatibility and modularity for large-scale benchmarks.
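To make the split concrete, below is a minimal, purely illustrative sketch of that pattern: a policy process that answers observation requests from an environment client. This is **not** InternManip's actual server code; the endpoint, port, and JSON payload format are assumptions for illustration only.

```python
# Illustrative sketch only -- NOT InternManip's real policy server.
# It shows the client-server split: the environment process POSTs an
# observation and receives an action from the policy process.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class PolicyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        obs = json.loads(self.rfile.read(length))  # observation from the env client
        # A real server would run model inference here; we return a dummy action.
        body = json.dumps({"action": [0.0] * 7}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # The environment (client) process would POST observations to localhost:9000.
    HTTPServer(("localhost", 9000), PolicyHandler).serve_forever()
```

A single-process evaluation simply skips this network hop and calls the policy in the same process as the environment.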
You can also evaluate `pi0` on the `genmanip` benchmark in a single process. In both cases, first set up the evaluation config as described below.

**Configuring Evaluation: Key Setup and Model Checkpoint**

The evaluation requires a properly configured config file. Below is an example config instance.
Please pay special attention to modifying the `base_model_path` field, which should point to your finetuned model checkpoint.

```python
from internmanip.configs import *
from pathlib import Path

eval_cfg = EvalCfg(
    eval_type='genmanip',  # benchmark to evaluate on
    agent=AgentCfg(
        agent_type='pi0',  # policy to evaluate
        base_model_path='/PATH/TO/YOUR/CHECKPOINT',  # <--- MODIFY THIS PATH
        agent_settings={...},       # agent-specific settings (elided here)
        model_kwargs={...},         # extra model arguments (elided here)
        server_cfg=ServerCfg(...),  # policy server settings (elided here)
    ),
    env=EnvCfg(...),  # environment settings (elided here)
)
```

```{important}
You must set `base_model_path` to the path of your own finetuned checkpoint; this is not the same as the checkpoint downloaded from HuggingFace, and you should **NOT** use the unfinetuned checkpoint directly for evaluation!
```
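A quick way to guard against this mistake is to check the configured path before launching evaluation. Here is a minimal standard-library sketch; the path is the same placeholder as in the config above:

```python
from pathlib import Path

# Same placeholder as in the config above -- replace with your real path.
ckpt = Path('/PATH/TO/YOUR/CHECKPOINT')
# Fail fast if the finetuned checkpoint is missing or the path was left unedited.
assert ckpt.exists(), f"Finetuned checkpoint not found: {ckpt}"
```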
Also note that the evaluation data for `genmanip` differs from the training data, so be careful to distinguish between them when running evaluations.
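If you keep both datasets locally, a simple hedge is to store them under clearly separate paths and verify that the evaluation config does not point at the training set. The paths below are hypothetical placeholders, not names from the source:

```python
from pathlib import Path

train_data = Path('/PATH/TO/GENMANIP/TRAIN_DATA')  # hypothetical placeholder
eval_data = Path('/PATH/TO/GENMANIP/EVAL_DATA')    # hypothetical placeholder
# The eval split must not silently reuse the training split.
assert eval_data.resolve() != train_data.resolve(), "Eval data must differ from training data"
```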
**🖥 Terminal 1: Launch the Policy Server (Model Side)**