
Commit 55c91f5

Merge branch 'main' of https://github.com/stanfordnlp/dspy
2 parents 5d3d7b3 + f354043

5 files changed (+20 / -16 lines)

README.md

Lines changed: 6 additions & 3 deletions
@@ -136,15 +136,17 @@ You can find other examples tweeted by [@lateinteraction](https://twitter.com/la
 
 **Some other examples (not exhaustive, feel free to add more via PR):**
 
+
+- [DSPy Optimizers Benchmark on a bunch of different tasks, by Michael Ryan](https://github.com/stanfordnlp/dspy/tree/main/testing/tasks)
+- [Sophisticated Extreme Multi-Class Classification, IReRa, by Karel D’Oosterlinck](https://github.com/KarelDO/xmc.dspy)
+- [Haize Lab's Red Teaming with DSPy](https://blog.haizelabs.com/posts/dspy/) and see [their DSPy code](https://github.com/haizelabs/dspy-redteam)
 - Applying DSPy Assertions
 - [Long-form Answer Generation with Citations, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/longformqa/longformqa_assertions.ipynb)
 - [Generating Answer Choices for Quiz Questions, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/quiz/quiz_assertions.ipynb)
 - [Generating Tweets for QA, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/tweets/tweets_assertions.ipynb)
 - [Compiling LCEL runnables from LangChain in DSPy](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/compiling_langchain.ipynb)
 - [AI feedback, or writing LM-based metrics in DSPy](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/tweet_metric.py)
-- [DSPy Optimizers Benchmark on a bunch of different tasks, by Michael Ryan](https://github.com/stanfordnlp/dspy/tree/main/testing/tasks)
 - [Indian Languages NLI with gains due to compiling by Saiful Haq](https://github.com/saifulhaq95/DSPy-Indic/blob/main/indicxlni.ipynb)
-- [Sophisticated Extreme Multi-Class Classification, IReRa, by Karel D’Oosterlinck](https://github.com/KarelDO/xmc.dspy)
 - [DSPy on BIG-Bench Hard Example, by Chris Levy](https://drchrislevy.github.io/posts/dspy/dspy.html)
 - [Using Ollama with DSPy for Mistral (quantized) by @jrknox1977](https://gist.github.com/jrknox1977/78c17e492b5a75ee5bbaf9673aee4641)
 - [Using DSPy, "The Unreasonable Effectiveness of Eccentric Automatic Prompts" (paper) by VMware's Rick Battle & Teja Gollapudi, and interview at TheRegister](https://www.theregister.com/2024/02/22/prompt_engineering_ai_models/)
@@ -153,7 +155,8 @@ You can find other examples tweeted by [@lateinteraction](https://twitter.com/la
 - [Using DSPy to train Gpt 3.5 on HumanEval by Thomas Ahle](https://github.com/stanfordnlp/dspy/blob/main/examples/functional/functional.ipynb)
 - [Building a chess playing agent using DSPy by Franck SN](https://medium.com/thoughts-on-machine-learning/building-a-chess-playing-agent-using-dspy-9b87c868f71e)
 
-TODO: Add links to the state-of-the-art results on Theory of Mind (ToM) by Plastic Labs, the results by Haize Labs for Red Teaming with DSPy, and the DSPy pipeline from Replit.
+
+TODO: Add links to the state-of-the-art results by the University of Toronto on Clinical NLP, on Theory of Mind (ToM) by Plastic Labs, and the DSPy pipeline from Replit.
 
 There are also recent cool examples at [Weaviate's DSPy cookbook](https://github.com/weaviate/recipes/tree/main/integrations/dspy) by Connor Shorten. [See tutorial on YouTube](https://www.youtube.com/watch?v=CEuUG4Umfxs).
 
docs/docs/quick-start/minimal-example.mdx

Lines changed: 6 additions & 4 deletions
@@ -12,7 +12,7 @@ We make use of the [GSM8K dataset](https://huggingface.co/datasets/gsm8k) and th
 
 ## Setup
 
-Before we delve into the example, let's ensure our environment is properly configured. We'll start by importing the necessary modules and configuring our language model:
+Before we jump into the example, let's ensure our environment is properly configured. We'll start by importing the necessary modules and configuring our language model:
 
 ```python
 import dspy
@@ -33,7 +33,7 @@ Let's take a look at what `gsm8k_trainset` and `gsm8k_devset` are:
 print(gsm8k_trainset)
 ```
 
-The `gsm8k_trainset` and `gsm8k_devset` datasets contain a list of Examples with each example having `question` and `answer` field. We'll use these datasets to train and evaluate our model.
+The `gsm8k_trainset` and `gsm8k_devset` datasets contain a list of Examples, with each example having `question` and `answer` fields.
 
 ## Define the Module
 
@@ -51,7 +51,7 @@ class CoT(dspy.Module):
 
 ## Compile and Evaluate the Model
 
-With our simple program in place, let's move on to optimizing it using the [`BootstrapFewShot`](/api/optimizers/BootstrapFewShot) teleprompter:
+With our simple program in place, let's move on to compiling it with the [`BootstrapFewShot`](/api/optimizers/BootstrapFewShot) teleprompter:
 
 ```python
 from dspy.teleprompt import BootstrapFewShot
@@ -61,9 +61,11 @@ config = dict(max_bootstrapped_demos=4, max_labeled_demos=4)
 
 # Optimize! Use the `gsm8k_metric` here. In general, the metric is going to tell the optimizer how well it's doing.
 teleprompter = BootstrapFewShot(metric=gsm8k_metric, **config)
-optimized_cot = teleprompter.compile(CoT(), trainset=gsm8k_trainset, valset=gsm8k_devset)
+optimized_cot = teleprompter.compile(CoT(), trainset=gsm8k_trainset)
 ```
 
+Note that `BootstrapFewShot` is not an optimizing teleprompter, i.e. it simply creates and validates examples for steps of the pipeline (in this case, the chain-of-thought reasoning) but does not optimize the metric. Other teleprompters like `BootstrapFewShotWithRandomSearch` and `MIPRO` will apply direct optimization.
+
 ## Evaluate
 
 Now that we have a compiled (optimized) DSPy program, let's move to evaluating its performance on the dev dataset.
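
Taken together, a minimal end-to-end sketch of this page's quick-start flow under the new `compile` signature looks like the following. It assumes the imports, dataset helpers, and `CoT` module shown in the page's earlier cells; the LM configuration is illustrative, not prescribed by this commit:

```python
import dspy
from dspy.datasets.gsm8k import GSM8K, gsm8k_metric
from dspy.teleprompt import BootstrapFewShot

# Configure the language model (model choice here is illustrative).
turbo = dspy.OpenAI(model='gpt-3.5-turbo-instruct', max_tokens=250)
dspy.settings.configure(lm=turbo)

# Load the GSM8K train/dev splits used throughout this page.
gsm8k = GSM8K()
gsm8k_trainset, gsm8k_devset = gsm8k.train[:10], gsm8k.dev[:10]

class CoT(dspy.Module):
    """Chain-of-thought module from the 'Define the Module' section."""

    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)

# Compile with the new signature: trainset only, no valset.
config = dict(max_bootstrapped_demos=4, max_labeled_demos=4)
teleprompter = BootstrapFewShot(metric=gsm8k_metric, **config)
optimized_cot = teleprompter.compile(CoT(), trainset=gsm8k_trainset)
```

Evaluation on `gsm8k_devset` then proceeds exactly as the page's Evaluate section describes.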

dspy/teleprompt/bootstrap.py

Lines changed: 2 additions & 3 deletions
@@ -52,9 +52,8 @@ def __init__(
         self.error_count = 0
         self.error_lock = threading.Lock()
 
-    def compile(self, student, *, teacher=None, trainset, valset=None):
+    def compile(self, student, *, teacher=None, trainset):
         self.trainset = trainset
-        self.valset = valset
 
         self._prepare_student_and_teacher(student, teacher)
         self._prepare_predictor_mappings()
@@ -133,7 +132,7 @@ def _bootstrap(self, *, max_bootstraps=None):
         self.validation = [x for idx, x in enumerate(self.trainset) if idx not in bootstrapped]
         random.Random(0).shuffle(self.validation)
 
-        self.validation = self.valset or self.validation
+        self.validation = self.validation
 
         # NOTE: Can't yet use evaluate because we need to trace *per example*
         # evaluate = Evaluate(program=self.teacher, metric=self.metric, num_threads=12)
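
The practical effect of this change: callers pass only `trainset`, and the validation pool is always the unbootstrapped remainder of it. A minimal usage sketch, assuming an LM is already configured; the `simple_metric` and `QA` names are illustrative, not part of the library:

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Illustrative metric: exact match on the answer field.
def simple_metric(example, pred, trace=None):
    return example.answer == pred.answer

# Toy training examples; real usage would load a dataset.
trainset = [
    dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question"),
    dspy.Example(question="What is 3 * 3?", answer="9").with_inputs("question"),
]

class QA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.Predict("question -> answer")

    def forward(self, question):
        return self.generate(question=question)

# Assumes an LM was configured earlier via dspy.settings.configure(lm=...).
bootstrap = BootstrapFewShot(metric=simple_metric, max_bootstrapped_demos=1, max_labeled_demos=1)

# Before this commit: bootstrap.compile(QA(), trainset=trainset, valset=devset)
# After: valset is gone; validation examples come from trainset itself.
compiled = bootstrap.compile(QA(), trainset=trainset)
```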

skycamp2023.ipynb

Lines changed: 2 additions & 2 deletions
@@ -46,8 +46,8 @@
 "import pkg_resources # Install the package if it's not installed\n",
 "if not \"dspy-ai\" in {pkg.key for pkg in pkg_resources.working_set}:\n",
 " !pip install -U pip\n",
-" # !pip install dspy-ai\n",
-" !pip install -e $repo_path\n",
+" !pip install dspy-ai==2.1\n",
+" # !pip install -e $repo_path\n",
 "\n",
 "!pip install transformers"
 ]

tests/teleprompt/test_bootstrap.py

Lines changed: 4 additions & 4 deletions
@@ -54,7 +54,7 @@ def test_compile_with_predict_instances():
         metric=simple_metric, max_bootstrapped_demos=1, max_labeled_demos=1
     )
     compiled_student = bootstrap.compile(
-        student, teacher=teacher, trainset=trainset, valset=valset
+        student, teacher=teacher, trainset=trainset
     )
 
     assert compiled_student is not None, "Failed to compile student"
@@ -74,7 +74,7 @@ def test_bootstrap_effectiveness():
         metric=simple_metric, max_bootstrapped_demos=1, max_labeled_demos=1
     )
     compiled_student = bootstrap.compile(
-        student, teacher=teacher, trainset=trainset, valset=valset
+        student, teacher=teacher, trainset=trainset
     )
 
     # Check that the compiled student has the correct demos
@@ -149,7 +149,7 @@ def forward(self, **kwargs):
     )
 
     with pytest.raises(RuntimeError, match="Simulated error"):
-        bootstrap.compile(student, teacher=teacher, trainset=trainset, valset=valset)
+        bootstrap.compile(student, teacher=teacher, trainset=trainset)
 
 
 def test_validation_set_usage():
@@ -171,7 +171,7 @@ def test_validation_set_usage():
         metric=simple_metric, max_bootstrapped_demos=1, max_labeled_demos=1
     )
     compiled_student = bootstrap.compile(
-        student, teacher=teacher, trainset=trainset, valset=valset
+        student, teacher=teacher, trainset=trainset
     )
 
     # Check that validation examples are part of student's demos after compilation
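
For reference, a test exercising the new call shape might look like the sketch below. The fixtures (`SimpleModule`, `simple_metric`, `DummyLM`) mirror what this suite appears to use but are reconstructed here as assumptions rather than quoted verbatim:

```python
import dspy
from dspy.teleprompt import BootstrapFewShot
from dspy.utils.dummies import DummyLM  # dummy LM assumed from this suite's fixtures


class SimpleModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predictor = dspy.Predict("input -> output")

    def forward(self, **kwargs):
        return self.predictor(**kwargs)


def simple_metric(example, pred, trace=None):
    return example.output == pred.output


def test_compile_without_valset():
    # The dummy LM replays canned completions instead of calling a real model.
    dspy.settings.configure(lm=DummyLM(["blue", "blue"]))

    trainset = [
        dspy.Example(input="What color is the sky?", output="blue").with_inputs("input"),
    ]

    bootstrap = BootstrapFewShot(
        metric=simple_metric, max_bootstrapped_demos=1, max_labeled_demos=1
    )

    # valset is no longer a parameter of compile().
    compiled = bootstrap.compile(SimpleModule(), teacher=SimpleModule(), trainset=trainset)
    assert compiled is not None
```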
