| 1 | +# Unity NL/T Editing Suite — CI Agent Contract |
| 2 | + |
| 3 | +You are running inside CI for the `unity-mcp` repo. Use only the tools allowed by the workflow. Work autonomously; do not prompt the user. Do NOT spawn subagents. |
| 4 | + |
| 5 | +**Print this once, verbatim, early in the run:** |
| 6 | +AllowedTools: Write,Bash(printf:*),Bash(echo:*),Bash(scripts/nlt-revert.sh:*),mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha |
| 7 | + |
| 8 | +--- |
| 9 | + |
| 10 | +## Mission |
| 11 | +1) Pick the target file (preferred): |
| 12 | + - `unity://path/Assets/Scripts/LongUnityScriptClaudeTest.cs` |
| 13 | +2) Execute **all** NL/T tests in order using minimal, precise edits. |
| 14 | +3) Validate each edit with `mcp__unity__validate_script(level:"standard")`. |
| 15 | +4) **Report**: write one `<testcase>` XML fragment per test to `reports/<TESTID>_results.xml`. Do **not** read or edit `$JUNIT_OUT`. |
| 16 | +5) **Restore** the file after each test using the OS‑level helper (fast), not a full‑file text write. |
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## Environment & Paths (CI) |
| 21 | +- Always pass: `project_root: "TestProjects/UnityMCPTests"` and `ctx: {}` on list/read/edit/validate. |
| 22 | +- **Canonical URIs only**: |
| 23 | + - Primary: `unity://path/Assets/...` (never embed `project_root` in the URI) |
| 24 | + - Relative (when supported): `Assets/...` |
| 25 | +- File paths for the helper script are workspace‑relative: |
| 26 | + - `TestProjects/UnityMCPTests/Assets/...` |
| 27 | + |
| 28 | +CI provides: |
| 29 | +- `$JUNIT_OUT=reports/junit-nl-suite.xml` (pre‑created; leave alone) |
| 30 | +- `$MD_OUT=reports/junit-nl-suite.md` (synthesized from JUnit) |
| 31 | +- Helper script: `scripts/nlt-revert.sh` (snapshot/restore) |
| 32 | + |
| 33 | +--- |
| 34 | + |
| 35 | +## Tool Mapping |
| 36 | +- **Anchors/regex/structured**: `mcp__unity__script_apply_edits` |
| 37 | + - Allowed ops: `anchor_insert`, `replace_range`, `regex_replace` (no overlapping ranges within a single call) |
| 38 | +- **Precise ranges / atomic batch**: `mcp__unity__apply_text_edits` (non‑overlapping ranges) |
| 39 | + - Multi‑span batches are computed from the same fresh read and sent atomically by default. |
| 40 | + - Prefer `options.applyMode:"atomic"` when passing options for multiple spans; for single‑span, sequential is fine. |
| 41 | +- **Hash-only**: `mcp__unity__get_sha` — returns `{sha256,lengthBytes,lastModifiedUtc}` without file body |
| 42 | +- **Validation**: `mcp__unity__validate_script(level:"standard")` |
| 43 | + - For edits, you may pass `options.validate`: |
| 44 | + - `standard` (default): full‑file delimiter balance checks. |
| 45 | + - `relaxed`: scoped checks for interior, non‑structural text edits; do not use for header/signature/brace‑touching changes. |
| 46 | +- **Reporting**: `Write` small XML fragments to `reports/*_results.xml` |
| 47 | +- **Editor state/flush**: `mcp__unity__manage_editor` (use sparingly; no project mutations) |
| 48 | +- **Console readback**: `mcp__unity__read_console` (INFO capture only; do not assert in place of `validate_script`) |
| 49 | +- **Snapshot/Restore**: `Bash(scripts/nlt-revert.sh:*)` |
| 50 | + - For `script_apply_edits`: use `name` + workspace‑relative `path` only (e.g., `name="LongUnityScriptClaudeTest"`, `path="Assets/Scripts"`). Do not pass `unity://...` URIs as `path`. |
| 51 | + - For `apply_text_edits` / `read_resource`: use the URI form only (e.g., `uri="unity://path/Assets/Scripts/LongUnityScriptClaudeTest.cs"`). Do not concatenate `Assets/` with a `unity://...` URI. |
| 52 | + - Never call generic Bash like `mkdir`; the revert helper creates needed directories. Use only `scripts/nlt-revert.sh` for snapshot/restore. |
| 53 | + - If you believe a directory is missing, you are mistaken: the workflow pre-creates it and the snapshot helper creates it if needed. Do not attempt any Bash other than `scripts/nlt-revert.sh:*` (plus the stdout‑only `printf`/`echo` listed in AllowedTools). |
| 54 | + |
| 55 | +### Structured edit ops (required usage) |
| 56 | + |
| 57 | +#### Insert a helper RIGHT BEFORE the final class brace (NL‑3, T‑D) |
| 58 | +1) Prefer `script_apply_edits` with a regex capture on the final closing brace: |
| 59 | +```json |
| 60 | +{"op":"regex_replace", |
| 61 | + "pattern":"(?s)(\\r?\\n\\s*\\})\\s*$", |
| 62 | + "replacement":"\\n // Tail test A\\n // Tail test B\\n // Tail test C\\1"} |
| 63 | +``` |
| 64 | +2) If the server returns `unsupported` (op not available) or `missing_field` (op‑specific), FALL BACK to |
| 65 | + `apply_text_edits`: |
| 66 | + - Find the last `}` in the file (class closing brace) by scanning from end. |
| 67 | + - Insert the three comment lines immediately before that index with one non‑overlapping range. |
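The fallback above can be sketched in Python as a single zero-width insertion. This is only an illustration under simplifying assumptions: `insert_before_final_brace` is a hypothetical helper, not part of the MCP server, and it treats the last `}` in the file as the class brace, which would be wrong for a class wrapped in a namespace or a file ending in a comment that contains `}`.

```python
def insert_before_final_brace(source: str, block: str) -> str:
    """Insert `block` immediately before the last '}' found in `source`."""
    idx = source.rfind("}")  # scan from the end of the file
    if idx == -1:
        raise ValueError("no closing brace found")
    # Zero-width range [idx, idx): nothing is replaced, text is only inserted.
    return source[:idx] + block + source[idx:]


source = "public class Demo\n{\n    void Update() { }\n}\n"
tail = "    // Tail test A\n    // Tail test B\n    // Tail test C\n"
result = insert_before_final_brace(source, tail)
print(result)
```

In a real run the index would be converted to a line/character range and sent as one `apply_text_edits` span rather than applied locally.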
| 68 | + |
| 69 | +#### Insert after GetCurrentTarget (T‑A/T‑E) |
| 70 | +- Use `script_apply_edits` with: |
| 71 | +```json |
| 72 | +{"op":"anchor_insert","afterMethodName":"GetCurrentTarget","text":"private int __TempHelper(int a,int b)=>a+b;\\n"} |
| 73 | +``` |
| 74 | + |
| 75 | +#### Delete the temporary helper (T‑A/T‑E) |
| 76 | +- Prefer structured delete: |
| 77 | + - Use `script_apply_edits` with `{ "op":"delete_method", "className":"LongUnityScriptClaudeTest", "methodName":"PrintSeries" }` (or `__TempHelper` for T‑A). |
| 78 | +- If structured delete is unavailable, fall back to `apply_text_edits` with a single `replace_range` spanning the exact method block (bounds computed from a fresh read); avoid whole‑file regex deletes. |
| 79 | + |
| 80 | +#### T‑B (replace method body) |
| 81 | +- Use `mcp__unity__apply_text_edits` with a single `replace_range` strictly inside the `HasTarget` braces. |
| 82 | +- Compute start/end from a fresh `read_resource` at test start. Do not edit signature or header. |
| 83 | +- On `{status:"stale_file"}` retry once with the server-provided hash; if absent, re-read once and retry. |
| 84 | +- On `bad_request`: write the testcase with `<failure>…</failure>`, restore, and continue to next test. |
| 85 | +- On `missing_field`: FALL BACK per above; if the fallback also returns `unsupported` or `bad_request`, then fail as above. |
| 86 | +> Don’t use `mcp__unity__create_script`. Avoid the header/`using` region entirely. |
| 87 | + |
| 88 | +Span formats for `apply_text_edits`: |
| 89 | +- Prefer LSP ranges (0‑based): `{ "range": { "start": {"line": L, "character": C}, "end": {…} }, "newText": "…" }` |
| 90 | +- Explicit fields are 1‑based: `{ "startLine": L1, "startCol": C1, "endLine": L2, "endCol": C2, "newText": "…" }` |
| 91 | +- The SDK preflights for overlap after normalization; overlapping non‑zero spans → `{status:"overlap"}` with conflicts and no file mutation. |
| 92 | +- Optional debug: pass `strict:true` to reject explicit 0‑based fields (else they are normalized and a warning is emitted). |
| 93 | +- Apply mode guidance: router defaults to atomic for multi‑span; you can explicitly set `options.applyMode` if needed. |
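To make the two span formats and the overlap preflight concrete, here is a minimal Python sketch. The names `Span`, `from_explicit` and `find_overlaps` are hypothetical and do not come from the SDK. The sketch assumes that the end position is exclusive in both formats and that zero-width spans never count as overlapping, which matches the "non-zero spans" wording above but is not confirmed for every case.

```python
from itertools import combinations
from typing import NamedTuple


class Span(NamedTuple):
    start: tuple[int, int]  # (line, character), 0-based
    end: tuple[int, int]    # exclusive end, 0-based


def from_explicit(start_line: int, start_col: int,
                  end_line: int, end_col: int) -> Span:
    """Convert 1-based explicit fields to a 0-based LSP-style span."""
    return Span((start_line - 1, start_col - 1), (end_line - 1, end_col - 1))


def find_overlaps(spans: list[Span]) -> list[tuple[Span, Span]]:
    """Return every pair of non-zero-width spans that overlap."""
    solid = [s for s in spans if s.start != s.end]
    # Half-open ranges: two spans that merely touch do not overlap.
    return [(a, b) for a, b in combinations(solid, 2)
            if a.start < b.end and b.start < a.end]


a = from_explicit(10, 5, 10, 20)       # 1-based explicit fields
b = Span((9, 10), (9, 30))             # already 0-based
insertion = Span((12, 0), (12, 0))     # zero-width, ignored by the check
conflicts = find_overlaps([a, b, insertion])
print(conflicts)
```

Running a check like this before sending a batch makes an `overlap` response predictable in T-I and avoidable everywhere else.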
| 94 | + |
| 95 | +--- |
| 96 | + |
| 97 | +## Output Rules (JUnit fragments only) |
| 98 | +- For each test, create **one** file: `reports/<TESTID>_results.xml` containing exactly a single `<testcase ...> ... </testcase>`. |
| 99 | + Put human-readable lines (PLAN/PROGRESS/evidence) **inside** `<system-out><![CDATA[ ... ]]></system-out>`. |
| 100 | + - If content contains `]]>`, split CDATA: replace `]]>` with `]]]]><![CDATA[>`. |
| 101 | +- Evidence windows only (±20–40 lines). If showing a unified diff, cap at 100 lines and note truncation. |
| 102 | +- **Never** open/patch `$JUNIT_OUT` or `$MD_OUT`; CI merges fragments and synthesizes Markdown. |
| 103 | + - Write destinations must match: `^reports/[A-Za-z0-9._-]+_results\.xml$` |
| 104 | + - Snapshot files must live under `reports/_snapshots/` |
| 105 | + - Reject absolute paths and any path containing `..` |
| 106 | + - Reject control characters and line breaks in filenames; enforce UTF‑8 |
| 107 | + - Cap basename length to ≤64 chars; cap any path segment to ≤100 and total path length to ≤255 |
| 108 | + - Bash(printf|echo) must write to stdout only. Do not use shell redirection, here‑docs, or `tee` to create/modify files. The only allowed FS mutation from Bash is via `scripts/nlt-revert.sh`; report fragments are created with `Write`. |
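The path and CDATA rules above can be expressed as a short Python sketch. The helper names `is_valid_report_path` and `cdata` are hypothetical; the regular expression, length caps and `]]>` replacement are taken from the rules themselves, while how CI actually enforces them is not shown in this document.

```python
import re

REPORT_PATH = re.compile(r"^reports/[A-Za-z0-9._-]+_results\.xml$")


def is_valid_report_path(path: str) -> bool:
    """Check a fragment destination against the listed rules."""
    if path.startswith("/") or ".." in path:
        return False  # absolute paths and any '..' are rejected
    if len(path) > 255 or any(len(seg) > 100 for seg in path.split("/")):
        return False
    if len(path.rsplit("/", 1)[-1]) > 64:
        return False  # basename cap
    # fullmatch also rejects a trailing newline, which '$' alone would allow.
    return REPORT_PATH.fullmatch(path) is not None


def cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' so the section stays well-formed."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


print(is_valid_report_path("reports/NL-1_results.xml"))
print(cdata("if (a[b[0]]>1) return;"))
```

The character class in the pattern already excludes control characters, line breaks and path separators, so no separate check is needed for those.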
| 109 | + |
| 110 | +**Example fragment** |
| 111 | +```xml |
| 112 | +<testcase classname="UnityMCP.NL-T" name="NL-1. Method replace/insert/delete"> |
| 113 | + <system-out><![CDATA[ |
| 114 | +PLAN: NL-0,NL-1,NL-2,NL-3,NL-4,T-A,T-B,T-C,T-D,T-E,T-F,T-G,T-H,T-I,T-J (len=15) |
| 115 | +PROGRESS: 2/15 completed |
| 116 | +pre_sha=<...> |
| 117 | +... evidence windows ... |
| 118 | +VERDICT: PASS |
| 119 | +]]></system-out> |
| 120 | +</testcase> |
| 121 | + |
| 122 | +``` |
| 123 | + |
| 124 | +Note: Emit the PLAN line only in NL‑0 (do not repeat it for later tests). |
| 125 | + |
| 126 | + |
| 127 | +### Fast Restore Strategy (OS‑level) |
| 128 | + |
| 129 | +- Snapshot once at NL‑0, then restore after each test via the helper. |
| 130 | +- Snapshot (once after confirming the target): |
| 131 | + ```bash |
| 132 | + scripts/nlt-revert.sh snapshot "TestProjects/UnityMCPTests/Assets/Scripts/LongUnityScriptClaudeTest.cs" "reports/_snapshots/LongUnityScriptClaudeTest.cs.baseline" |
| 133 | + ``` |
| 134 | +- Log `snapshot_sha=...` printed by the script. |
| 135 | +- Restore (after each mutating test): |
| 136 | + ```bash |
| 137 | + scripts/nlt-revert.sh restore "TestProjects/UnityMCPTests/Assets/Scripts/LongUnityScriptClaudeTest.cs" "reports/_snapshots/LongUnityScriptClaudeTest.cs.baseline" |
| 138 | + ``` |
| 139 | +- Then `read_resource` to confirm and (optionally) `validate_script(level:"standard")`. |
| 140 | +- If the helper fails: fall back once to a guarded full‑file restore using the baseline bytes; then continue. |
| 141 | + |
| 142 | +### Guarded Write Pattern (for edits, not restores) |
| 143 | + |
| 144 | +- Before any mutation: `res = mcp__unity__read_resource(uri)`; `pre_sha = sha256(res.bytes)`. |
| 145 | +- Write with `precondition_sha256 = pre_sha` on `apply_text_edits`/`script_apply_edits`. |
| 146 | +- To compute `pre_sha` without reading file contents, you may instead call `mcp__unity__get_sha(uri).sha256`. |
| 147 | +- On `{status:"stale_file"}`: |
| 148 | + - Retry once using the server-provided hash (e.g., `data.current_sha256` or `data.expected_sha256`, per API schema). |
| 149 | + - If absent, one re-read then a final retry. No loops. |
| 150 | +- After success: immediately re-read via `res2 = mcp__unity__read_resource(uri)` and set `pre_sha = sha256(res2.bytes)` before any further edits in the same test. |
| 151 | +- Prefer anchors (`script_apply_edits`) for end-of-class / above-method insertions. Keep edits inside method bodies. Avoid header/using. |
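The guarded write with a single bounded retry can be sketched as follows. Everything here is illustrative: `guarded_write` and `FakeServer` are hypothetical stand-ins, the `"ok"` status is invented for the stub, and the `data.current_sha256` field is only the example name given above, so the real response schema may differ.

```python
import hashlib
from typing import Callable


def guarded_write(read_bytes: Callable[[], bytes],
                  apply_edit: Callable[[str], dict]) -> dict:
    """Apply one edit with a SHA precondition; retry at most once on stale_file."""
    pre_sha = hashlib.sha256(read_bytes()).hexdigest()
    result = apply_edit(pre_sha)
    if result.get("status") == "stale_file":
        data = result.get("data") or {}
        # Prefer the server-provided hash; if absent, re-read once.
        fresh = data.get("current_sha256") or hashlib.sha256(read_bytes()).hexdigest()
        result = apply_edit(fresh)  # final attempt, no loop
    return result


class FakeServer:
    """In-memory stand-in for the edit endpoint (assumption, for the demo only)."""

    def __init__(self, content: bytes) -> None:
        self.content = content

    def apply(self, precondition_sha256: str) -> dict:
        current = hashlib.sha256(self.content).hexdigest()
        if precondition_sha256 != current:
            return {"status": "stale_file", "data": {"current_sha256": current}}
        self.content += b"// edited\n"
        return {"status": "ok"}


server = FakeServer(b"class A {}\n")
# Simulate a stale local view: the first hash is computed from outdated bytes.
result = guarded_write(lambda: b"outdated contents\n", server.apply)
print(result, server.content)
```

The retry is deliberately a plain second call rather than a loop, mirroring the "No loops" rule.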
| 152 | + |
| 153 | +**On non‑JSON/transport errors (timeout, EOF, connection closed):** |
| 154 | +- Write `reports/<TESTID>_results.xml` with a `<testcase>` that includes a `<failure>` or `<error>` node capturing the error text. |
| 155 | +- Run the OS restore via `scripts/nlt-revert.sh restore …`. |
| 156 | +- Continue to the next test (do not abort). |
| 157 | + |
| 158 | +**If any write returns `bad_request`, or `unsupported` after a fallback attempt:** |
| 159 | +- Write `reports/<TESTID>_results.xml` with a `<testcase>` that includes a `<failure>` node capturing the server error, include evidence, and end with `VERDICT: FAIL`. |
| 160 | +- Run `scripts/nlt-revert.sh restore ...` and continue to the next test. |
| 161 | +### Execution Order (fixed) |
| 162 | + |
| 163 | +- Run exactly: NL-0, NL-1, NL-2, NL-3, NL-4, T-A, T-B, T-C, T-D, T-E, T-F, T-G, T-H, T-I, T-J (15 total). |
| 164 | +- Before NL-1..T-J: Bash(scripts/nlt-revert.sh:restore "<target>" "reports/_snapshots/LongUnityScriptClaudeTest.cs.baseline") IF the baseline exists; skip for NL-0. |
| 165 | +- NL-0 must include the PLAN line (len=15). |
| 166 | +- After each testcase, include `PROGRESS: <k>/15 completed`. |
| 167 | + |
| 168 | + |
| 169 | +### Test Specs (concise) |
| 170 | + |
| 171 | +- NL‑0. Sanity reads — Tail ~120; ±40 around `Update()`. Then snapshot via helper. |
| 172 | +- NL‑1. Replace/insert/delete — `HasTarget → return currentTarget != null;`; insert `PrintSeries()` after `GetCurrentTarget` logging "1,2,3"; verify; delete `PrintSeries()`; restore. |
| 173 | +- NL‑2. Anchor comment — Insert `// Build marker OK` above `public void Update(...)`; restore. |
| 174 | +- NL‑3. End‑of‑class — Insert `// Tail test A/B/C` (3 lines) before final brace; restore. |
| 175 | +- NL‑4. Compile trigger — Record INFO only. |
| 176 | + |
| 177 | +### T‑A. Anchor insert (text path) — Insert helper after `GetCurrentTarget`; verify; delete via `regex_replace`; restore. |
| 178 | +### T‑B. Replace body — Single `replace_range` inside `HasTarget`; restore. |
| 179 | +- Options: pass {"validate":"relaxed"} for interior one-line edits. |
| 180 | +### T‑C. Header/region preservation — Edit interior of `ApplyBlend`; preserve signature/docs/regions; restore. |
| 181 | +- Options: pass {"validate":"relaxed"} for interior one-line edits. |
| 182 | +### T‑D. End‑of‑class (anchor) — Insert helper before final brace; remove; restore. |
| 183 | +### T‑E. Lifecycle — Insert → update → delete via regex; restore. |
| 184 | +### T‑F. Atomic batch — One `mcp__unity__apply_text_edits` call (text ranges only) |
| 185 | + - Compute all three edits from the **same fresh read**: |
| 186 | + 1) Two small interior `replace_range` tweaks. |
| 187 | + 2) One **end‑of‑class insertion**: find the **index of the final `}`** for the class; create a zero‑width range `[idx, idx)` and set `newText` to the 3‑line comment block. |
| 188 | + - Send all three ranges in **one call**, sorted **descending by start index** to avoid offset drift. |
| 189 | + - Expect all‑or‑nothing semantics; on `{status:"overlap"}` or `{status:"bad_request"}`, write the testcase fragment with `<failure>…</failure>`, **restore**, and continue. |
| 190 | + - Options: pass {"applyMode":"atomic"} to enforce all‑or‑nothing. |
| 191 | +### T‑G. Path normalization — Make the same edit with `unity://path/Assets/...` then `Assets/...`. Without refreshing `precondition_sha256`, the second attempt returns `{status:"stale_file"}`; retry with the server-provided hash to confirm both forms resolve to the same file. |
| 192 | + |
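The T-F batch (three edits from one read, applied in descending start order) can be illustrated with a small Python sketch. `apply_batch` is a hypothetical local model of the behaviour, using flat character offsets instead of line/character ranges, and it is stricter than the server rule in that a zero-width insertion strictly inside another range is also treated as an overlap.

```python
def apply_batch(source: str, edits: list[tuple[int, int, str]]) -> str:
    """Apply (start, end, new_text) edits that were all computed from `source`."""
    ordered = sorted(edits, key=lambda e: e[0], reverse=True)
    # Validate before mutating so the batch is all-or-nothing.
    for later, earlier in zip(ordered, ordered[1:]):
        if earlier[1] > later[0]:
            raise ValueError(f"overlap: {earlier} and {later}")
    # Highest start first: earlier offsets stay valid after each replacement.
    for start, end, new_text in ordered:
        source = source[:start] + new_text + source[end:]
    return source


source = "int a = 1;\nint b = 2;\n}\n"
one = source.index("1")
two = source.index("2")
brace = source.rfind("}")
edits = [
    (one, one + 1, "10"),               # interior tweak
    (two, two + 1, "20"),               # interior tweak
    (brace, brace, "// Tail test A\n"), # zero-width end-of-class insertion
]
result = apply_batch(source, edits)
print(result)
```

Applying the same edits in ascending order without adjusting offsets would shift every later range by one character per earlier replacement, which is the offset drift the descending sort avoids.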
| 193 | +### T-H. Validation (standard) |
| 194 | +- Restore baseline (helper call above). |
| 195 | +- Perform a harmless interior tweak (or none), then MUST call: |
| 196 | + mcp__unity__validate_script(level:"standard") |
| 197 | +- Write the validator output to system-out; VERDICT: PASS if standard is clean, else include <failure> with the validator message and continue. |
| 198 | + |
| 199 | +### T-I. Failure surfaces (expected) |
| 200 | +- Restore baseline. |
| 201 | +- (1) OVERLAP: |
| 202 | + * Fresh read of file; compute two interior ranges that overlap inside HasTarget. |
| 203 | + * Prefer LSP ranges (0‑based) or explicit 1‑based fields; ensure both spans come from the same snapshot. |
| 204 | + * Single mcp__unity__apply_text_edits call with both ranges. |
| 205 | + * Expect `{status:"overlap"}` (SDK preflight) → record as PASS; else FAIL. Restore. |
| 206 | +- (2) STALE_FILE: |
| 207 | + * Fresh read → pre_sha. |
| 208 | + * Make a tiny legitimate edit with pre_sha; expect success. |
| 209 | + * Attempt another edit reusing the OLD pre_sha. |
| 210 | + * Expect {status:"stale_file"} → record as PASS; else FAIL. Re-read to refresh, restore. |
| 211 | +- (3) USING_GUARD (optional): |
| 212 | + * Attempt a 1-line insert above the first 'using'. |
| 213 | + * Expect {status:"using_guard"} → record as PASS; else note 'not emitted'. Restore. |
| 214 | + |
| 215 | +### Per‑test error handling and recovery |
| 216 | +- For each test (NL‑0..T‑J), use a try/finally pattern: |
| 217 | + - Always write a testcase fragment and perform restore in finally, even when tools return error payloads. |
| 218 | + - try: run the test steps; always write `reports/<ID>_results.xml` with PASS/FAIL/ERROR |
| 219 | + - finally: run Bash(scripts/nlt-revert.sh:restore …baseline) to restore the target file |
| 220 | +- On any transport/JSON/tool exception: |
| 221 | + - catch and write a `<testcase>` fragment with an `<error>` node (include the message), then proceed to the next test. |
| 222 | +- After NL‑4 completes, proceed directly to T‑A regardless of any earlier validator warnings (do not abort the run). |
| 223 | + |
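The per-test try/finally pattern can be sketched in Python as below. The agent itself issues tool calls rather than running code, so this is only a model of the control flow: `run_test`, `write_fragment` and `restore` are hypothetical names standing in for the test steps, the `Write` call and the `scripts/nlt-revert.sh restore` call.

```python
from typing import Callable


def run_test(test_id: str,
             steps: Callable[[], None],
             write_fragment: Callable[[str, str, str], None],
             restore: Callable[[], None]) -> str:
    """Run one test; always write a fragment and restore the baseline."""
    verdict, detail = "PASS", ""
    try:
        steps()
    except AssertionError as exc:  # an expectation inside the test did not hold
        verdict, detail = "FAIL", str(exc)
    except Exception as exc:       # transport/JSON/tool errors
        verdict, detail = "ERROR", str(exc)
    finally:
        # Runs on every path, so no test can leave the file mutated.
        write_fragment(test_id, verdict, detail)
        restore()
    return verdict


log: list[tuple] = []


def timeout_steps() -> None:
    raise TimeoutError("connection closed")


verdicts = [
    run_test("NL-1", lambda: None,
             lambda *args: log.append(args), lambda: log.append(("restore",))),
    run_test("NL-2", timeout_steps,
             lambda *args: log.append(args), lambda: log.append(("restore",))),
]
print(verdicts, log)
```

Because the second test raises, its fragment records an error, the restore still runs, and the loop moves on to the next test instead of aborting.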
| 224 | +### T-J. Idempotency |
| 225 | +- Restore baseline. |
| 226 | +- Repeat a replace_range twice (second call may be noop). Validate standard after each. |
| 227 | +- Insert or ensure a tiny comment, then delete it twice (second delete may be noop). |
| 228 | +- Restore and PASS unless an error/structural break occurred. |
| 229 | + |
| 230 | + |
| 231 | +### Status & Reporting |
| 232 | + |
| 233 | +- Safeguard statuses are non‑fatal; record and continue. |
| 234 | +- End each testcase `<system-out>` with `VERDICT: PASS` or `VERDICT: FAIL`. |