
Commit 514f603

Fix typos and Markdown warnings
Signed-off-by: Arthit Suriyawongkul <arthit@gmail.com>
1 parent 5341ed9 commit 514f603

File tree

3 files changed: +9 -8 lines changed


README.md

Lines changed: 5 additions & 4 deletions
@@ -25,11 +25,12 @@ You will also need to install any packages required to interact with the models
 export OPENAI_API_KEY=<openai-api-key>
 pip install openai
 ```
-Furthermore, some of the evaluations require additional dependencies. If your eval needs extra dependency, instructions for installing them are provided the [list of evals](#list-of-evals). subsection (or the README for that evaluation). For example, to install the dependencies of `SWE-Bench` evaluation you should run:
+
+Furthermore, some of the evaluations require additional dependencies. If your eval needs extra dependency, instructions for installing them are provided the [list of evals](#list-of-evals) subsection (or the README for that evaluation). For example, to install the dependencies of `SWE-Bench` evaluation you should run:
 
 ```bash
 pip install "inspect_evals[swe_bench] @ git+https://github.com/UKGovernmentBEIS/inspect_evals"
-pip install -e ".[swe_bench]" # If developing on the pacakge locally
+pip install -e ".[swe_bench]" # If developing on the package locally
 ```
 
 Once you have a model configured, you can run evaluations for it with:
@@ -38,14 +39,14 @@ Once you have a model configured, you can run evaluations for it with:
 inspect eval inspect_evals/gpqa_diamond --model openai/gpt-4o
 ```
 
-If you don't want to specify the `--model` each time you run an evaluation, create a `.env` configuration file in your working direcotry that defines the `INSPECT_EVAL_MODEL` environment variable along with your API key. For example:
+If you don't want to specify the `--model` each time you run an evaluation, create a `.env` configuration file in your working directory that defines the `INSPECT_EVAL_MODEL` environment variable along with your API key. For example:
 
 ```bash
 INSPECT_EVAL_MODEL=openai/gpt-4o
 OPENAI_API_KEY=<openai-api-key>
 ```
 
-Inspect supports many model providers including OpenAI, Anthropic, Google, Mistral, AzureAI, AWS Bedrock, TogetherAI, Groq, HuggingFace, vLLM, Ollama, and more. See the [Model Providers](https://inspect.ai-safety-institute.org.uk/models.html) documentation for additional details.
+Inspect supports many model providers including OpenAI, Anthropic, Google, Mistral, Azure AI, AWS Bedrock, Together AI, Groq, Hugging Face, vLLM, Ollama, and more. See the [Model Providers](https://inspect.ai-safety-institute.org.uk/models.html) documentation for additional details.
 
 # List of Evals
 <!-- Eval Listing: Automatically Generated -->

tools/listing.py

Lines changed: 2 additions & 2 deletions
@@ -30,13 +30,13 @@ def listing_md(listing: dict[str, Any]) -> str:
     output: list[str] = []
     output.append(f"- ### {link_md(listing['title'], os.path.join(listing['path']))}")
     output.append(f" {listing['description']}{contributors}")
-    output.append(" ```")
+    output.append("\n ```shell")
     for index, task in enumerate(listing["tasks"]):
         if index > 3:
             break
         output.append(f" inspect eval inspect_evals/{task}")
 
-    output.append(" ```\n")
+    output.append(" ```")
     return "\n".join(output)
 
 
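For context, the `listing_md` change only alters the Markdown fragment generated for each eval entry: the opening code fence now follows a blank line and carries a language tag, and the closing fence drops its trailing newline. The sketch below is illustrative rather than the actual tools/listing.py code; the task name `example_task` is hypothetical, and the warnings being silenced are assumed to be markdownlint-style rules about fences not surrounded by blank lines and fences without a declared language.

```python
# Minimal sketch (not the actual tools/listing.py code) of the Markdown
# fragment emitted for one eval entry, using the hypothetical task name
# "example_task". FENCE builds the triple-backtick marker programmatically
# so this example contains no literal fence of its own.
FENCE = "`" * 3

# Before the change: the opening fence hugs the description line, has no
# language tag, and the closing fence carries a trailing newline.
before = "\n".join([
    " Example eval description.",
    f" {FENCE}",
    " inspect eval inspect_evals/example_task",
    f" {FENCE}\n",
])

# After the change: the "\n" prefix yields a blank line before the opening
# fence, the fence is tagged as shell, and the trailing newline is gone.
after = "\n".join([
    " Example eval description.",
    f"\n {FENCE}shell",
    " inspect eval inspect_evals/example_task",
    f" {FENCE}",
])

print(after)
```

Printing `after` shows the blank line and the tagged fence that keep Markdown linters quiet in the generated README listing.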

tools/listing.yaml

Lines changed: 2 additions & 2 deletions
@@ -58,7 +58,7 @@
 
 - title: "Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models"
   description: |
-    40 professional-level Capture the Flag (CTF) tasks from 4 distinct CTF competitions, chosen to be recent, meaningful, and spanning a wide range of difficulties.
+    40 professional-level Capture the Flag (CTF) tasks from 4 distinct CTF competitions, chosen to be recent, meaningful, and spanning a wide range of difficulties.
   path: src/inspect_evals/cybench
   group: Cybersecurity
   contributors: ["sinman-aisi", "sam-deverett-dsit", "kola-aisi", "pgiav"]
@@ -183,7 +183,7 @@
 
 - title: "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens"
   description: |
-    LLM benchmark featuring an average data length surpassing 100K tokens. Comprises synthetic and realistic tasks spanning diverse domains in English and Chinese.
+    LLM benchmark featuring an average data length surpassing 100K tokens. Comprises synthetic and realistic tasks spanning diverse domains in English and Chinese.
   path: src/inspect_evals/infinite_bench
   arxiv: https://arxiv.org/abs/2402.13718
   group: Reasoning
