Error Handling #174

surfiniaburger · 2025-11-10T13:34:25Z

Hardened the server so that it no longer crashes when it receives malformed data.

This commit fixes a critical bug where the Gunicorn server for the DIPG environment would crash when receiving a malformed string from the LLM. The crash was caused by unhandled exceptions in the reward functions during string parsing. This commit addresses the issue by:

1. **Hardening Reward Functions:** Wrapping the parsing logic within each reward function in `dipg_environment.py` with a `try...except` block. This ensures that any malformed string will be caught, penalized with a `missing_answer_penalty`, and will no longer crash the server process.
2. **Adding a Regression Test:** A new test case, `test_malformed_step`, has been added to `test_dipg_environment.py`. This test sends a known problematic string to the server to verify that it handles the error gracefully and does not crash, preventing future regressions.
3. **Client-Side Resilience:** The Jupyter notebook `dipg-rl.ipynb` was also updated to make the training loop more resilient. It now catches `ReadTimeout` and `ConnectionError` exceptions, which can occur if the server crashes for any reason, and continues the training process.
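The hardening pattern described in this commit message can be sketched roughly as follows. This is an illustration only, not the code in `dipg_environment.py`: the `<answer>` tag format, the helper `parse_answer`, the function `answer_reward`, and the numeric value of `MISSING_ANSWER_PENALTY` are all assumptions made for the example.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Hypothetical value; the commit names `missing_answer_penalty` but not its magnitude.
MISSING_ANSWER_PENALTY = -1.0

# Hypothetical response format, used only for this illustration.
_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_answer(response: str) -> str:
    """Extract the answer text; raises ValueError if the tags are missing."""
    match = _ANSWER_RE.search(response)
    if match is None:
        raise ValueError("no <answer> block found")
    return match.group(1).strip()


def answer_reward(response, expected: str) -> float:
    """Score a response without letting a parsing failure propagate."""
    try:
        answer = parse_answer(response)
    except Exception:
        # Malformed input (wrong type, missing tags, ...) is penalized, not fatal.
        logger.exception("Could not parse response; applying penalty")
        return MISSING_ANSWER_PENALTY
    return 1.0 if answer == expected else 0.0
```

Keeping the `try` body limited to the parsing call means only parse failures are converted into a penalty, while the failure itself is still logged for later inspection.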


This commit provides a comprehensive fix for the training script crashes caused by `ReadTimeout` and `ConnectionError` exceptions. The root cause was the environment server crashing on malformed LLM-generated strings. This commit addresses the issue on multiple levels:

1. **Server-Side Robustness:** The core logic in `dipg_environment.py` has been hardened. The `step` function, which calculates rewards, now contains a `try...except` block that catches any exception during reward calculation. This prevents a single malformed response from crashing the entire server process. Instead, an error is logged, and a penalty is assigned.
2. **Client-Side Resilience:** A new file, `reward_function.py`, has been created to provide the user with a corrected `create_reward_fn`. This function now correctly handles both `ConnectionError` and `ReadTimeout` exceptions, preventing the client-side training script from crashing and allowing it to continue robustly.
3. **Regression Testing:** The existing regression test, `test_malformed_step`, was used to verify that the server no longer crashes when receiving malformed input, ensuring the server-side fix is effective.
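A minimal sketch of the client-side idea, assuming a reward function that calls the environment server once per completion. The signature below is hypothetical and is not the `create_reward_fn` shipped in `reward_function.py`; `step_fn`, `fallback_reward`, and `transient_errors` are names invented for this example. The exception types are a parameter so that the sketch runs with the standard library alone.

```python
import logging
from typing import Callable, Sequence, Tuple, Type

logger = logging.getLogger(__name__)


def create_reward_fn(
    step_fn: Callable[[str], float],
    fallback_reward: float = 0.0,  # assumed value; the source does not specify one
    transient_errors: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> Callable[[Sequence[str]], list]:
    """Build a reward function that survives transient server failures.

    `step_fn` is a hypothetical callable that sends one completion to the
    environment server and returns the reward it reports.
    """

    def reward_fn(completions: Sequence[str]) -> list:
        rewards = []
        for completion in completions:
            try:
                rewards.append(step_fn(completion))
            except transient_errors as exc:
                # Server unreachable or too slow: record a fallback and keep going.
                logger.warning("Environment call failed (%s); using fallback", exc)
                rewards.append(fallback_reward)
        return rewards

    return reward_fn
```

If the client talks to the server through the `requests` library, the equivalent configuration would be `transient_errors=(requests.exceptions.ConnectionError, requests.exceptions.ReadTimeout)`. Other exception types still propagate, so genuine bugs in the training loop are not silently masked.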


google-labs-jules bot and others added 7 commits November 9, 2025 11:37

Merge pull request #25 from surfiniaburger/fix-server-crash-on-malfor…

b0de885

…med-input Fix server crash on malformed LLM responses

Merge pull request #26 from surfiniaburger/fix-server-crash-on-malfor…

83ed390

…med-input Fix Server Crash and Provide Robust Client-Side Function

update notebook

1048bcb

update notebook

2f391d1

Merge pull request #27 from surfiniaburger/notebook

fcf2259

update notebook

meta-cla bot added the CLA Signed label (this label is managed by the Meta Open Source bot) Nov 10, 2025

jspisak added the enhancement label (new feature or request) Nov 11, 2025

surfiniaburger and others added 13 commits November 11, 2025 21:58

hierarchical logic

383baa5

Merge pull request #28 from surfiniaburger/notebook

cc6f3ad

hierarchical logic

add new test

572ab2c

Update src/envs/dipg_safety_env/README.md

3bda6e8

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Merge pull request #29 from surfiniaburger/notebook

03b6679

hierarchical logic

update notebook

4ddef8a

sft eval

b1124f5

Merge branch 'notebook' of https://github.com/surfiniaburger/OpenEnv …

89d0dfa

…into notebook

Merge pull request #30 from surfiniaburger/notebook

5170843

Notebook

update notebook

65409a8

removed security vulnerability

05c2964

+1

e80437a

Merge pull request #31 from surfiniaburger/notebook

50df67c

update notebook


2 participants
