@@ -180,11 +180,173 @@ The server supports runtime LLM configuration via the `configure_llm` JSON-RPC m
180 180 }
181 181 ```
182 182
183+ ### Tab Management
184+
185+ The evaluation server supports managing browser tabs via REST API endpoints and Chrome DevTools Protocol (CDP).
186+
187+ #### Tab Identification
188+
189+ Each browser tab is identified by a **composite client ID** in the format `baseClientId:tabId`:
190+
191+ - `baseClientId`: The persistent identifier for the DevTools client (e.g., `9907fd8d-92a8-4a6a-bce9-458ec8c57306`)
192+ - `tabId`: The Chrome target ID for the specific tab (e.g., `482D56EE57B1931A3B9D1BFDAF935429`)
193+
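As a minimal sketch of the ID format described above, the helpers below build and split a composite client ID. The function names are hypothetical and are not taken from the server code; the split assumes that `baseClientId` never contains a colon.

```javascript
// Hypothetical helpers; names are illustrative, not part of the server's API.

// Join a base client ID and a tab ID into "baseClientId:tabId".
function buildCompositeClientId(baseClientId, tabId) {
  return `${baseClientId}:${tabId}`;
}

// Split a composite ID at the first colon.
// Assumption: baseClientId itself contains no colon.
function parseCompositeClientId(compositeClientId) {
  const separator = compositeClientId.indexOf(':');
  if (separator === -1) {
    // No tab part present: treat the whole string as the base client ID.
    return { baseClientId: compositeClientId, tabId: null };
  }
  return {
    baseClientId: compositeClientId.slice(0, separator),
    tabId: compositeClientId.slice(separator + 1),
  };
}

const exampleCompositeId = buildCompositeClientId(
  '9907fd8d-92a8-4a6a-bce9-458ec8c57306',
  '482D56EE57B1931A3B9D1BFDAF935429'
);
console.log(parseCompositeClientId(exampleCompositeId));
```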
194+ #### API Endpoints
195+
196+ **List All Clients and Tabs**
197+ ```bash
198+ GET /clients
199+ ```
200+
201+ Returns all registered clients with their active tabs, connection status, and readiness state.
202+
203+ Response format:
204+ ```json
205+ [
206+   {
207+     "id": "baseClientId",
208+     "name": "Client Name",
209+     "description": "Client Description",
210+     "tabCount": 3,
211+     "tabs": [
212+       {
213+         "tabId": "482D56EE57B1931A3B9D1BFDAF935429",
214+         "compositeClientId": "baseClientId:tabId",
215+         "connected": true,
216+         "ready": true,
217+         "connectedAt": "2025-01-15T10:30:00.000Z",
218+         "remoteAddress": "::ffff:172.18.0.1"
219+       }
220+     ]
221+   }
222+ ]
223+ ```
224+
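A caller will often want only the tabs that can accept work. The sketch below filters a `GET /clients` response, shaped as in the response format above, down to the composite IDs of tabs that are both connected and ready. `listReadyTabs` is a hypothetical helper, and the sample data is made up for illustration.

```javascript
// Hypothetical helper: collect composite IDs of tabs that are connected and ready.
function listReadyTabs(clients) {
  return clients.flatMap((client) =>
    (client.tabs || []) // tolerate clients without a tabs array
      .filter((tab) => tab.connected && tab.ready)
      .map((tab) => tab.compositeClientId)
  );
}

// Illustrative data in the documented response shape (values are invented).
const sampleClients = [
  {
    id: 'baseClientId',
    name: 'Client Name',
    tabCount: 2,
    tabs: [
      { tabId: 'tabA', compositeClientId: 'baseClientId:tabA', connected: true, ready: true },
      { tabId: 'tabB', compositeClientId: 'baseClientId:tabB', connected: true, ready: false },
    ],
  },
];

console.log(listReadyTabs(sampleClients));
```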
225+ **List Tabs for Specific Client**
226+ ```bash
227+ GET /clients/{clientId}/tabs
228+ ```
229+
230+ Returns all tabs for a specific client identified by `baseClientId`.
231+
232+ **Open New Tab**
233+ ```bash
234+ POST /tabs/open
235+ Content-Type: application/json
236+
237+ {
238+   "clientId": "baseClientId:tabId",
239+   "url": "https://example.com",
240+   "background": false
241+ }
242+ ```
243+
244+ Opens a new tab in the browser associated with the specified client.
245+
246+ Response format:
247+ ```json
248+ {
249+   "clientId": "baseClientId:tabId",
250+   "tabId": "newTabId",
251+   "compositeClientId": "baseClientId:newTabId",
252+   "url": "https://example.com",
253+   "status": "opened"
254+ }
255+ ```
256+
257+ **Close Tab**
258+ ```bash
259+ POST /tabs/close
260+ Content-Type: application/json
261+
262+ {
263+   "clientId": "baseClientId:tabId",
264+   "tabId": "targetTabId"
265+ }
266+ ```
267+
268+ Closes the specified tab.
269+
270+ Response format:
271+ ```json
272+ {
273+   "clientId": "baseClientId:tabId",
274+   "tabId": "targetTabId",
275+   "status": "closed",
276+   "success": true
277+ }
278+ ```
279+
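The open and close endpoints can be exercised together. The following is a sketch only: `buildTabRequest` and `openThenCloseTab` are hypothetical names, `apiBaseUrl` is an assumption (the API server's address is not given here), and it assumes a runtime with a global `fetch` (for example Node.js 18 or later) and that the server answers failures with a non-2xx status.

```javascript
// Hypothetical helper: build the path and fetch options for a tab request.
// Kept free of network calls so it can be checked without a running server.
function buildTabRequest(action, payload) {
  if (action !== 'open' && action !== 'close') {
    throw new Error(`Unsupported tab action: ${action}`);
  }
  return {
    path: `/tabs/${action}`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  };
}

// Open a tab, then close it again using the tabId from the open response.
// apiBaseUrl is an assumption, e.g. the host and port the API server listens on.
async function openThenCloseTab(apiBaseUrl, clientId, url) {
  const open = buildTabRequest('open', { clientId, url, background: false });
  const openResponse = await fetch(apiBaseUrl + open.path, open.options);
  if (!openResponse.ok) throw new Error(`Open failed: HTTP ${openResponse.status}`);
  const opened = await openResponse.json();

  const close = buildTabRequest('close', { clientId, tabId: opened.tabId });
  const closeResponse = await fetch(apiBaseUrl + close.path, close.options);
  if (!closeResponse.ok) throw new Error(`Close failed: HTTP ${closeResponse.status}`);
  return closeResponse.json();
}
```

Separating request construction from the network call keeps the payload shape easy to verify in isolation.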
280+ #### Implementation Architecture
281+
282+ **Direct CDP Approach (Current)**
283+
284+ Tab management is implemented using direct Chrome DevTools Protocol (CDP) communication:
285+
286+ 1. Server discovers the CDP WebSocket endpoint via `http://localhost:9223/json/version`
287+ 2. For each command (open/close), a new WebSocket connection is established to the CDP endpoint
288+ 3. Commands are sent using JSON-RPC 2.0 format:
289+    - `Target.createTarget` - Opens new tab
290+    - `Target.closeTarget` - Closes existing tab
291+ 4. WebSocket connection is closed after receiving the response
292+
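The connect, send, and close cycle in the steps above can be sketched as follows. This is not the project's `sendCDPCommand()`; `buildCdpCommand` and `sendCdpCommandSketch` are hypothetical names, and the sketch assumes a runtime with a global `WebSocket` (browsers, or Node.js 22 or later). CDP itself needs only `id`, `method`, and optional `params` in each message, so the `jsonrpc` field is omitted here.

```javascript
let nextCommandId = 1;

// Build a CDP command message with a unique id.
function buildCdpCommand(method, params = {}) {
  return { id: nextCommandId++, method, params };
}

// Hypothetical sketch: one WebSocket connection per command, with a timeout.
function sendCdpCommandSketch(browserWsUrl, method, params, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    const command = buildCdpCommand(method, params);
    const socket = new WebSocket(browserWsUrl);

    // Give up if no matching response arrives in time.
    const timer = setTimeout(() => {
      socket.close();
      reject(new Error(`CDP command timeout: ${method}`));
    }, timeoutMs);

    socket.addEventListener('open', () => socket.send(JSON.stringify(command)));

    socket.addEventListener('message', (event) => {
      const message = JSON.parse(event.data);
      if (message.id !== command.id) return; // skip events and unrelated replies
      clearTimeout(timer);
      socket.close();
      if (message.error) reject(new Error(message.error.message));
      else resolve(message.result);
    });

    socket.addEventListener('error', () => {
      clearTimeout(timer);
      reject(new Error(`CDP WebSocket error: ${method}`));
    });
  });
}

// Example payloads for the two commands listed above.
console.log(buildCdpCommand('Target.createTarget', { url: 'https://example.com', background: false }));
console.log(buildCdpCommand('Target.closeTarget', { targetId: '482D56EE57B1931A3B9D1BFDAF935429' }));
```

Matching on `id` matters because the browser may push unrelated event messages over the same connection before the response arrives.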
293+ Key implementation files:
294+ - `src/lib/EvalServer.js` - Contains `sendCDPCommand()`, `openTab()`, and `closeTab()` methods
295+ - `src/api-server.js` - REST API endpoints that delegate to EvalServer methods
296+
297+ **Alternative Approach Considered**
298+
299+ An RPC-based approach was initially considered where:
300+ - API server sends JSON-RPC request to DevTools client via WebSocket
301+ - DevTools client executes CDP commands locally
302+ - Response is sent back via JSON-RPC
303+
304+ This was rejected in favor of direct CDP communication for simplicity and reduced latency.
305+
306+ #### Chrome Setup
307+
308+ The browser must be started with remote debugging enabled:
309+ ```bash
310+ chromium --remote-debugging-port=9223
311+ ```
312+
313+ The CDP endpoint is accessible at:
314+ - HTTP: `http://localhost:9223/json/version`
315+ - WebSocket: `ws://localhost:9223/devtools/browser/{browserId}`
316+
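Chrome's `/json/version` response includes a `webSocketDebuggerUrl` field holding the browser-level WebSocket URL. The sketch below reads it; the function names are hypothetical, the host and port defaults mirror the values above, and a global `fetch` (Node.js 18 or later) is assumed.

```javascript
// Pull the browser WebSocket URL out of a /json/version response body.
function extractBrowserWsUrl(versionInfo) {
  const wsUrl = versionInfo && versionInfo.webSocketDebuggerUrl;
  if (typeof wsUrl !== 'string' || !wsUrl.startsWith('ws')) {
    throw new Error('webSocketDebuggerUrl missing from /json/version response');
  }
  return wsUrl;
}

// Hypothetical helper: query the HTTP endpoint and return the WebSocket URL.
async function discoverBrowserWsUrl(host = 'localhost', port = 9223) {
  const response = await fetch(`http://${host}:${port}/json/version`);
  if (!response.ok) {
    throw new Error(`CDP discovery failed: HTTP ${response.status}`);
  }
  return extractBrowserWsUrl(await response.json());
}
```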
317+ #### Current Limitations
318+
319+ **⚠️ Known Issue: WebSocket Timeout**
320+
321+ Opening and closing tabs currently fails with a WebSocket timeout:
322+
323+ - Symptom: `sendCDPCommand()` times out after 10 seconds without receiving a response
324+ - Error: `CDP command timeout: Target.createTarget`
325+ - Status: Under investigation
326+ - Debugging approach: Extensive logging has been added to track WebSocket lifecycle events
327+
328+ The CDP endpoint is discovered correctly and is reachable, but no WebSocket messages are received in response to commands. This may be related to:
329+ - WebSocket handshake issues
330+ - CDP protocol version mismatch
331+ - Network/proxy configuration
332+ - Chrome process state
333+
334+ **Workaround**: Until this issue is resolved, tab management via the API is not functional. Manual CDP testing is required to diagnose the root cause.
335+
336+ #### Future Enhancements
337+
338+ - Automatic tab registration in ClientManager when DevTools connects
339+ - Tab lifecycle events (opened, closed, navigated)
340+ - Bulk tab operations
341+ - Tab metadata (title, URL, favicon)
342+ - Tab grouping and organization
343+
183 344 ### Configuration
184 345
185 346 All configuration is managed through environment variables and `src/config.js`. Key settings:
186 347 - Server port and host
187 348 - OpenAI API configuration
188 349 - RPC timeouts
189 350 - Logging levels and directories
190- - Maximum concurrent evaluations
351+ - Maximum concurrent evaluations
352+ - CDP endpoint (default: localhost:9223)