Commit 0ca6ecb

docs: Add FAQ entry addressing LLM hallucination concerns
1 parent 44fc852 commit 0ca6ecb

File tree

1 file changed: +17 −0 lines changed

README.md

Lines changed: 17 additions & 0 deletions
@@ -351,6 +351,23 @@ The `examples/` folder contains example files to help you get started. Note that
- `reformulator_dataset.txt`: Examples for card reformulation
- `string_formatting.py`: Handles cloze deletions and text formatting

### Aren't you concerned about LLM hallucinations?

While hallucinations are a valid concern when using LLMs as search engines or relying on their compressed internal knowledge, these tools take a different approach that minimizes this risk:

1. **Few-shot Learning**: By providing carefully crafted examples, we guide the LLM to follow specific patterns and formats, reducing the chance of it inventing information.

2. **Structured Output**: The tools enforce strict output formats that make hallucinations easier to detect and correct.

3. **Preservation of Source Material**: Rather than generating new facts, the tools focus on reformulating and enhancing the existing content of your cards.

4. **Model Agnosticism**: As newer, more reliable models emerge, you can switch to them without changing your workflow.

5. **Specialization**: By focusing on narrow tasks (reformulation, mnemonic creation, etc.), we reduce the scope for hallucinations compared to general-purpose chat.

While no system is perfect, this approach has proven reliable through extensive testing during medical school. As LLMs continue to improve, we can expect hallucinations to become increasingly rare.

### What's the format of dataset files?
Dataset files (like `explainer_dataset.txt`, `reformulator_dataset.txt`, etc.) are simple text files where messages are separated by `----`. The first message is assumed to be a system prompt, followed by alternating user and assistant messages. This format mirrors a typical LLM conversation flow while remaining easy to read and edit.
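The separator convention described above can be sketched as a small parser. This is a minimal sketch, not code from the repo; the function name `parse_dataset` is a hypothetical helper chosen for illustration:

```python
def parse_dataset(text: str) -> list[dict]:
    """Split a dataset file into chat messages.

    Messages are separated by `----`; the first message is taken as
    the system prompt, then user and assistant messages alternate.
    (Hypothetical helper -- not part of the repository's API.)
    """
    chunks = [c.strip() for c in text.split("----") if c.strip()]
    messages = []
    for i, chunk in enumerate(chunks):
        if i == 0:
            role = "system"
        elif i % 2 == 1:
            role = "user"
        else:
            role = "assistant"
        messages.append({"role": role, "content": chunk})
    return messages
```

For example, a file containing a system prompt, one user message, and one assistant reply (separated by `----`) parses into three messages with roles `system`, `user`, `assistant`, ready to pass to a chat-style LLM API.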
