* Some fixes for vscode jupyter styling. Small fix for when server connects fail on init.
* Fix pre-commit while I'm here
* More pre-commit fixes
* refactor(image): Explicitly cast array returns
* More pre-commit fixes
* Some fixes from co-pilot
* Notebook cleanup
* Unwind a type change
* More type fixes?
---------
Co-authored-by: Brian Greunke <briangreunke@pm.me>
docs/airt/overview.mdx: 1 addition & 1 deletion (+1 -1)
@@ -4,7 +4,7 @@ slug: airt-overview
 description: Evaluate and red-team AI systems.
 ---
 
-Strikes AIRT tooling is a small, composable toolkit for **evaluating and testing AI systems** for security and safety, by generating, refining, and scoring adversarial inputs.
+Strikes AIRT tooling is a small, composable toolkit for **evaluating and testing AI systems** for security and safety, by generating, refining, and scoring adversarial inputs.
 
 It treats red teaming as a **search problem**: propose a candidate prompt/input, observe the target's response, score how well it met a goal, then iterate-guided by search strategies, constraints, with early stopping.
docs/examples/python-agent.mdx: 4 additions & 4 deletions (+4 -4)
@@ -4,8 +4,8 @@ description: Executes Python code in a sandboxed environment
 public: true
 ---
 
-This agent provides a general-purpose, sandboxed environment for executing Python code to accomplish user-defined tasks.
-It leverages a Large Language Model (LLM) to interpret a natural language task, generate Python code, and execute it within a Docker container.
+This agent provides a general-purpose, sandboxed environment for executing Python code to accomplish user-defined tasks.
+It leverages a Large Language Model (LLM) to interpret a natural language task, generate Python code, and execute it within a Docker container.
 The agent operates by creating an interactive session with a [Jupyter kernel](https://docs.jupyter.org/en/latest/projects/kernels.html) running inside the container, allowing it to iteratively write code, execute it, and use the output to inform its next steps until the task is complete.
 
 ## Intended Use
@@ -14,8 +14,8 @@ The agent is designed for a wide range of tasks that can be solved programmatica
 
 ## Environment
 
-To run this agent, a Docker daemon must be available and running on the host machine.
-The agent itself is a Python command-line application.
+To run this agent, a Docker daemon must be available and running on the host machine.
+The agent itself is a Python command-line application.
 It pulls a specified Docker image (defaulting to [jupyter/datascience-notebook:latest](https://hub.docker.com/r/jupyter/datascience-notebook/)) to create the execution environment.