
Commit dda9521

docs: prepare 0.18 doc release (#1493)
* prepare 0.18 doc release
* add the other two notes
* nit

Parent: b5b7579

File tree

4 files changed: 34 additions & 1 deletion


docs/project.json

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-{ "name": "nemo-guardrails-toolkit", "version": "0.17.0" }
+{ "name": "nemo-guardrails-toolkit", "version": "0.18.0" }

docs/release-notes.md

Lines changed: 27 additions & 0 deletions
@@ -12,6 +12,33 @@ The following sections summarize and highlight the changes for each release.
 For a complete record of changes in a release, refer to the
 [CHANGELOG.md](https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/CHANGELOG.md) in the GitHub repository.

+---
+
+(v0-18-0)=
+
+## 0.18.0
+
+(v0-18-0-features)=
+
+### Key Features
+
+- In-memory caching of guardrail model calls for reduced latency and cost savings.
+  NeMo Guardrails now supports per-model caching of guardrail responses using an LFU (Least Frequently Used) cache.
+  This feature is particularly effective for safety models such as NVIDIA NemoGuard [Content Safety](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-content-safety), [Topic Control](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control), and [Jailbreak Detection](https://build.nvidia.com/nvidia/nemoguard-jailbreak-detect), where identical inputs are common.
+  For more information, refer to [](model-memory-cache).
+- NeMo Guardrails extracts the reasoning traces from the LLM response and emits them as `BotThinking` events before the final `BotMessage` event.
+  For more information, refer to [](bot-thinking-guardrails).
+- New community integration with [Cisco AI Defense](https://www.cisco.com/site/ca/en/products/security/ai-defense/index.html).
+- New embedding integrations with Azure OpenAI, Google, and Cohere.
+
+(v0-18-0-fixed-issues)=
+
+### Fixed Issues
+
+- Content safety and topic control guardrail configurations are now validated at creation time, with immediate error reporting if required prompt templates or parameters are missing.
+
+---
+
 (v0-17-0)=

 ## 0.17.0
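
As context for the caching note in the diff above: per-model LFU caching means each guardrail model keeps its own bounded cache of responses, and the least frequently requested entries are evicted first. The following is a minimal sketch of that idea under assumed names (`LFUCache`, `checked_with_cache`); it is not the NeMo Guardrails API, which is documented under [](model-memory-cache).

```python
# Minimal sketch of an LFU (Least Frequently Used) response cache, as described
# in the 0.18.0 caching note. Class and function names here are illustrative
# assumptions, not the NeMo Guardrails API; see the model-memory-cache docs for
# the actual configuration options.
from collections import defaultdict


class LFUCache:
    """Bounded cache that evicts the least frequently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.values = {}                 # key -> cached model response
        self.counts = defaultdict(int)   # key -> number of accesses
        self.inserted_at = {}            # key -> insertion order, breaks ties
        self._tick = 0

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key not in self.values and len(self.values) >= self.capacity:
            # Evict the least frequently used key; on ties, the oldest one.
            victim = min(
                self.values,
                key=lambda k: (self.counts[k], self.inserted_at[k]),
            )
            del self.values[victim]
            del self.counts[victim]
            del self.inserted_at[victim]
        self._tick += 1
        self.values[key] = value
        self.counts[key] += 1
        self.inserted_at.setdefault(key, self._tick)


def checked_with_cache(cache: LFUCache, model_id: str, user_input: str, call_model):
    """Return the cached guardrail verdict for identical input, or call the model."""
    key = (model_id, user_input)
    result = cache.get(key)
    if result is None:
        result = call_model(user_input)   # e.g. a content-safety model call
        cache.put(key, result)
    return result
```

Keeping one such cache per guardrail model means a repeated input to, say, the content safety model is answered from memory instead of triggering another model call, which is where the latency and cost savings come from.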

docs/user-guides/advanced/bot-thinking-guardrails.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+(bot-thinking-guardrails)=
+
 # Guardrailing Bot Reasoning Content

 Reasoning-capable large language models (LLMs) expose their internal thought process as reasoning traces. These traces reveal how the model arrives at its conclusions, providing transparency into the decision-making process. However, they may also contain sensitive information or problematic reasoning patterns that need to be monitored and controlled.
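
As context for the `BotThinking` note in the release diff: the reasoning portion of a model response is separated from the final answer and emitted as its own event before the user-visible message, so guardrails can inspect it first. Below is a rough sketch of that flow; the `<think>...</think>` delimiters and the plain event dictionaries are illustrative assumptions, not the toolkit's internal event classes.

```python
# Illustrative sketch only: split a raw LLM response into reasoning traces and
# a final answer, then emit the traces before the message. The <think> tags and
# the event dictionaries are assumptions made for this example.
import re
from typing import Iterator

THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def emit_events(raw_response: str) -> Iterator[dict]:
    """Yield a BotThinking-style event per reasoning trace, then the BotMessage."""
    traces = THINK_PATTERN.findall(raw_response)
    final_text = THINK_PATTERN.sub("", raw_response).strip()

    for trace in traces:
        # Reasoning content comes first so guardrails can check it before the
        # user-visible message is produced.
        yield {"type": "BotThinking", "content": trace.strip()}

    yield {"type": "BotMessage", "text": final_text}


if __name__ == "__main__":
    raw = "<think>The question is about X; no policy concerns.</think>Here is the answer."
    for event in emit_events(raw):
        print(event)
```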

docs/versions1.json

Lines changed: 4 additions & 0 deletions
@@ -1,6 +1,10 @@
 [
   {
     "preferred": true,
+    "version": "0.18.0",
+    "url": "../0.18.0/"
+  },
+  {
     "version": "0.17.0",
     "url": "../0.17.0/"
   },
