1 file changed: +6 -4 lines changed
@@ -352,17 +352,19 @@ The questions were generated by GPT-4 based on the "Computer Systems Security: P
352352- ### [MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning](src/inspect_evals/musr)
353353 Evaluating models on multistep soft reasoning tasks in the form of free text narratives.
354354 <sub><sup>Contributed by: [@farrelmahaztra](https://github.com/farrelmahaztra)</sub></sup>
355- ```
355+
356+ ``` bash
356357 inspect eval inspect_evals/musr
357358 ```
358359
359360- ### [Needle in a Haystack (NIAH): In-Context Retrieval Benchmark for Long Context LLMs](src/inspect_evals/niah)
360361 NIAH evaluates in-context retrieval ability of long context LLMs by testing a model's ability to extract factual information from long-context inputs.
361- ```
362+
363+
364+ ``` bash
362365 inspect eval inspect_evals/niah
363366 ```
364367
365-
366368- ### [PAWS: Paraphrase Adversaries from Word Scrambling](src/inspect_evals/paws)
367369 Evaluating models on the task of paraphrase detection by providing pairs of sentences that are either paraphrases or not.
368370 <sub><sup>Contributed by: [@meltemkenis](https://github.com/meltemkenis)</sub></sup>
@@ -441,4 +443,4 @@ The questions were generated by GPT-4 based on the "Computer Systems Security: P
441443 inspect eval inspect_evals/agie_lsat_lr
442444 ```
443445
444- <!-- /Eval Listing: Automatically Generated -->
446+ <!-- /Eval Listing: Automatically Generated -->