* feat(llm): add llm_params option to llm_call
Extend llm_call to accept an optional llm_params dictionary for passing
configuration parameters (e.g., temperature, max_tokens) to the language
model. This enables more flexible control over LLM behavior during calls.
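For illustration, a minimal sketch of what a call site could look like with the new option. The import path and exact signature of `llm_call` are assumptions based on the commit message, and `summarize_text` is a hypothetical helper, not project code:

```python
# Sketch only: assumes llm_call takes the LLM instance, a prompt, and an
# optional llm_params dict; the real signature may differ.
from nemoguardrails.actions.llm.utils import llm_call  # assumed import path


async def summarize_text(llm, text: str) -> str:
    # Per-call configuration is passed directly and affects only this call.
    return await llm_call(
        llm,
        f"Summarize the following text:\n\n{text}",
        llm_params={"temperature": 0.2, "max_tokens": 256},
    )
```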
* refactor(llm): replace llm_params context manager with argument
Update all usages of the llm_params context manager to pass llm_params as
an argument to llm_call instead. This simplifies parameter handling and
improves code clarity for LLM calls.
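A hedged sketch of the migration this commit describes, assuming the previous context-manager usage looked like `with llm_params(llm, ...)`; the surrounding function and the parameter values are illustrative:

```python
from nemoguardrails.actions.llm.utils import llm_call  # assumed import path


async def generate_intent(llm, prompt: str) -> str:
    # Before (context manager), roughly:
    #     with llm_params(llm, temperature=0.1, max_tokens=100):
    #         return await llm_call(llm, prompt)
    #
    # After: the same parameters travel with the call itself.
    return await llm_call(
        llm,
        prompt,
        llm_params={"temperature": 0.1, "max_tokens": 100},
    )
```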
* docs: clarify prompt customization and llm_params usage
* update LLMChain config usage
docs/user-guides/advanced/prompt-customization.md (4 additions, 4 deletions)
@@ -55,6 +55,7 @@ To override the prompt for any other custom purpose, you can specify the `mode`
As an example of this, let's consider the case of compacting. Some applications might need concise prompts, for instance to avoid handling long contexts and to lower latency, at the risk of slightly degraded performance due to the smaller context. For this, you might want to have multiple versions of a prompt for the same task and same model. This can be achieved as follows:

Task configuration:
+
```yaml
models:
  - type: main

@@ -65,6 +66,7 @@ prompting_mode: "compact" # Default value is "standard"
```

Prompts configuration:
+
```yaml
prompts:
  - task: generate_user_intent
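The two hunks above show only fragments of the YAML blocks. As context, a complete compact-mode setup could look roughly like the sketch below; the engine/model values and the prompt body are illustrative, and the prompt-level `mode` field is assumed to pair with `prompting_mode` as the surrounding docs describe:

```yaml
# config.yml (illustrative values)
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

prompting_mode: "compact"  # Default value is "standard"
```

```yaml
# prompts.yml (illustrative): a shorter prompt variant for the same task and model
prompts:
  - task: generate_user_intent
    models:
      - openai/gpt-3.5-turbo-instruct
    mode: compact
    content: |-
      # Deliberately short instructions used when prompting_mode is "compact".
      ...
```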
@@ -117,6 +119,7 @@ prompts:
    content: ...
  # ...
```
+
For each task, you can also specify the maximum length of the prompt to be used for the LLM call in terms of the number of characters. This is useful if you want to limit the number of tokens used by the LLM or when you want to make sure that the prompt length does not exceed the maximum context length. When the maximum length is exceeded, the prompt is truncated by removing older turns from the conversation history until the length of the prompt is less than or equal to the maximum length. The default maximum length is 16000 characters.

For example, for the `generate_user_intent` task, you can specify the following:
@@ -129,7 +132,6 @@ prompts:
    max_length: 3000
```

-
### Content Template

The content for a completion prompt or the body for a message in a chat prompt is a string that can also include variables and potentially other types of constructs. NeMo Guardrails uses [Jinja2](https://jinja.palletsprojects.com/) as the templating engine. Check out the [Jinja Synopsis](https://jinja.palletsprojects.com/en/3.1.x/templates/#synopsis) for a quick introduction.
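To make the templating concrete, here is a hedged example of a prompt `content` body using Jinja2 variables; the variable and filter names (`general_instructions`, `history`, `colang`) mirror the pattern of the toolkit's built-in prompts but should be treated as illustrative here:

```yaml
prompts:
  - task: generate_user_intent
    content: |-
      {{ general_instructions }}

      # This is the current conversation between the user and the bot:
      {{ history | colang }}

      # What is the user's intent?
```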
@@ -200,7 +202,6 @@ Optionally, the output from the LLM can be parsed using an *output parser*. The
- `bot_message`: parse the bot message, i.e., removes the "Bot message:" prefix if present;
- `verbose_v1`: parse the output of the `verbose_v1` filter.

-
## Predefined Prompts

Currently, the NeMo Guardrails toolkit includes prompts for `openai/gpt-3.5-turbo-instruct`, `openai/gpt-3.5-turbo`, `openai/gpt-4`, `databricks/dolly-v2-3b`, `cohere/command`, `cohere/command-light`, `cohere/command-light-nightly`.
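Relating back to the output-parser list in the hunk above, a hedged sketch of how a parser might be attached to a task prompt; the `output_parser` key is assumed to be the relevant prompt field, and the rest of the entry is illustrative:

```yaml
prompts:
  - task: generate_bot_message
    output_parser: bot_message  # strips a leading "Bot message:" prefix if present
    content: |-
      ...
```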