From d684512f29db1da89ed026fdf07cb033da8cd931 Mon Sep 17 00:00:00 2001
From: bystrakowa <84568820+bystrakowa@users.noreply.github.com>
Date: Fri, 27 Jun 2025 03:48:56 +0400
Subject: [PATCH 1/2] Update supported-models.md

- Update the list of models available in Cloud
- Update the BYOK description
---
 docs/src/content/docs/supported-models.md | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/docs/src/content/docs/supported-models.md b/docs/src/content/docs/supported-models.md
index e67c98c5b..14dee5291 100644
--- a/docs/src/content/docs/supported-models.md
+++ b/docs/src/content/docs/supported-models.md
@@ -9,18 +9,24 @@ With Refact.ai, access state-of-the-art models in your VS Code or JetBrains plug
 
 ### AI Agent models
 - GPT 4.1 (default)
-- Claude 3.7 Sonnet
-- Claude 3.5 Sonnet
+- Claude 4 Sonnet
 - GPT-4o
-- o3-mini
+- o4-mini
+- Gemini 2.5 Pro
+- Gemini 2.5 Pro Preview
+
 
 ### Chat models
 - GPT 4.1 (default)
-- Claude 3.7 Sonnet
-- Claude 3.5 Sonnet
+- GPT-4.1 Mini
+- GPT-4.1 Nano
+- Claude 4 Sonnet  
 - GPT-4o
 - GPT-4o-mini
-- o3-mini
+- o4-mini
+- Gemini 2.5 Pro
+- Gemini 2.5 Pro Preview
+
 
 For select models, click the `💡Think` button to enable advanced reasoning, helping AI better solve complex tasks. Available only in [Refact.ai Pro plan](https://refact.ai/pricing/).
 
@@ -31,7 +37,8 @@ For select models, click the `💡Think` button to enable advanced reasoning, he
 
 ## BYOK (Bring your own key)
 
-Refact.ai gives flexibility to connect your API key and use any external LLM like Gemini, Grok, OpenAI, Deepseek, and others. Read the guide in our [BYOK Documentation](https://docs.refact.ai/byok/).
+Refact.ai lets you connect your own API key and use any external LLM, including GPT, Claude, Gemini, Grok, DeepSeek, and others. To get started, read the guide in our [BYOK Documentation](https://docs.refact.ai/byok/).
+
 
 
 ## Self-Hosted Version

From 7b20a06475d2bf6cb6b215125cf6524feef384b1 Mon Sep 17 00:00:00 2001
From: JegernOUTT <sergey.vakhreev@gmail.com>
Date: Thu, 4 Dec 2025 00:41:20 +1030
Subject: [PATCH 2/2] Update supported models documentation with new AI Agent
 and Chat models

Expand the list of supported AI models, including GPT-5, Claude 4.5, and Gemini 3.0, while providing detailed descriptions of each family and their capabilities. Include advanced features like reasoning, multimodal support, tool integration, and autonomous tasks. Update README files to reflect these changes and emphasize the use of state-of-the-art models in Refact Agent's offerings.
---
 README.md                                 |   8 +-
 docs/src/content/docs/supported-models.md | 105 ++++++++++++++++++----
 refact-agent/engine/README.md             |  39 ++++++--
 3 files changed, 126 insertions(+), 26 deletions(-)

diff --git a/README.md b/README.md
index 3c12a6b60..113e904ac 100644
--- a/README.md
+++ b/README.md
@@ -35,10 +35,10 @@ Refact Agent works effortlessly with the tools and databases you already use:
 ### ⚡ Why Choose Refact Agent?  
 
 - ✅ **Deploy On-Premise:** For maximum security, choose our self-hosted AI Agent version and run it on your own infrastructure.
-- 🧠 **Access State-of-the-Art Models:** Use Claude 4, GPT-4o, or GPT-4o mini with AI Agent or for chat queries.
-- 🔑 **Bring Your Own Key (BYOK):** Connect your API key and use any LLM: Gemini, Grok, OpenAI, Deepseek, and others.
+- 🧠 **Access State-of-the-Art Models:** Use GPT-5, Claude 4.5, Gemini 3.0, DeepSeek, and more with AI Agent or for chat queries.
+- 🔑 **Bring Your Own Key (BYOK):** Connect your API key and use any LLM: OpenAI, Anthropic, Google, DeepSeek, Qwen, and others.
 - 💬 **Integrated IDE Chat:** Integrate with GitHub, PostgreSQL, Docker, and more. Refact.ai Agent accesses your resources and handles related operations autonomously, mimicking your workflow.
-- ⚡ **Free, Unlimited, Context-Aware Auto-Completion:** Code faster with smart AI suggestions.  
+- ⚡ **Free, Unlimited, Context-Aware Auto-Completion:** Code faster with smart AI suggestions powered by Qwen2.5-Coder-1.5B with RAG.
 - 🛠️ **Supports 25+ Programming Languages:** Python, JavaScript, Java, Rust, TypeScript, PHP, C++, C#, Go, and many more!  
 
 ### 🎉  Hear from our Community
@@ -87,7 +87,7 @@ Our Ambassadors shared remarkable stories of how they transform weeks of coding
 
 ![integrations](https://lh7-rt.googleusercontent.com/docsz/AD_4nXc4DWYXF73AgPWAaFFGLTqEprWwA0im8R_A1QMo4QW4pTnSi1MCoP9L8udMZb5FPyN-CdgefaxJFGpX2ndn5nkjGBF2b_hZBNHogM7IM6SPvUIvUd9iE1lYIq7q-TB2qKzSGLk00A?key=zllGjEBckkx13bRZ6JIqX6qr)
 
- ✅ **State-of-the-Art Models** – Use Claude 4, GPT-4o, or GPT-4o mini with AI Agent or for chat queries.
+ ✅ **State-of-the-Art Models** – Use GPT-5, Claude 4.5, Gemini 3.0, DeepSeek Reasoner, and more with AI Agent or for chat queries.
 
  ✅ **Bring Your Own Key (BYOK)** – Use your own API keys for external LLMs.  
 
diff --git a/docs/src/content/docs/supported-models.md b/docs/src/content/docs/supported-models.md
index 14dee5291..df5df610a 100644
--- a/docs/src/content/docs/supported-models.md
+++ b/docs/src/content/docs/supported-models.md
@@ -7,29 +7,100 @@ description: Supported Models in Refact.ai
 
 With Refact.ai, access state-of-the-art models in your VS Code or JetBrains plugin and select the optimal LLM for each task.
 
-### AI Agent models
-- GPT 4.1 (default)
-- Claude 4 Sonnet
-- GPT-4o
-- o4-mini
-- Gemini 2.5 Pro
-- Gemini 2.5 Pro Preview
+### AI Agent Models
 
+Refact.ai supports advanced models with agent capabilities that can autonomously use tools, integrate with your development environment, and handle complex multi-step tasks:
 
-### Chat models
-- GPT 4.1 (default)
-- GPT-4.1 Mini
-- GPT-4.1 Nano
-- Claude 4 Sonnet  
-- GPT-4o
-- GPT-4o-mini
-- o4-mini
-- Gemini 2.5 Pro
-- Gemini 2.5 Pro Preview
+- **GPT-5 Family** - Latest OpenAI models with reasoning, web search, and code interpreter
+  - `gpt-5` - Most advanced model with full agent capabilities (400K context)
+  - `gpt-5-mini` - Efficient reasoning model (400K context)
+  - `gpt-5-nano` - Ultra-efficient option (400K context)
+  - `gpt-5.1` - Enhanced version with improved reasoning (400K context)
+  - `gpt-5.1-codex` - Code-specialized variant (400K context)
+
+- **GPT-4.1 Family** - Advanced multimodal models (1M context)
+  - `gpt-4.1` (default) - Full-featured with agent capabilities
+  - `gpt-4.1-mini` - Balanced performance and cost
+  - `gpt-4.1-nano` - Most cost-effective option
+
+- **Claude 4.5 Family** - Anthropic's latest with extended thinking (200K context)
+  - `claude-sonnet-4-5` - Balanced performance
+  - `claude-haiku-4-5` - Fast and efficient
+  - `claude-opus-4-5` - Most capable (PRO+ plans only)
+
+- **O-Series** - Reasoning-focused models
+  - `o4-mini` - Multimodal reasoning (200K context)
+  - `o3-mini` - Compact reasoning model (200K context)
+  - `o4-mini-deep-research` - Autonomous web research agent with code execution support (400K context)
+
+- **Google Gemini Models** - Large context multimodal models
+  - `gemini-2.5-pro` - Production-ready (1M context)
+  - `gemini-2.5-pro-preview` - Preview access (200K context)
+  - `gemini-3-pro-preview` - Next-generation preview (200K context)
+
+### Chat Models
+
+All agent models above can be used for chat, plus additional efficient options:
+
+- **GPT-4.1 Family**
+  - `gpt-4.1` (default) - Full-featured multimodal model (1M context)
+  - `gpt-4.1-mini` - Balanced option (1M context)
+  - `gpt-4.1-nano` - Most efficient (1M context)
+
+- **GPT-5 Family**
+  - `gpt-5`, `gpt-5-mini`, `gpt-5-nano` - All support chat with reasoning (400K context)
+  - `gpt-5.1`, `gpt-5.1-codex` - Enhanced versions (400K context)
+
+- **Claude 4.5 Family**
+  - `claude-sonnet-4-5` - Balanced performance (200K context)
+  - `claude-haiku-4-5` - Fast responses (200K context)
+  - `claude-opus-4-5` - Maximum capability (200K context, PRO+ only)
 
+- **O-Series**
+  - `o4-mini`, `o3-mini` - Reasoning models
+  - `o4-mini-deep-research` - Autonomous multi-step web research
+
+- **Google Gemini**
+  - `gemini-2.5-pro`, `gemini-2.5-pro-preview`, `gemini-3-pro-preview` - Large context models
+
+- **DeepSeek Models** (Refact team only)
+  - `deepseek-chat` - High-performance chat with tools (64K context)
+  - `deepseek-reasoner` - Reasoning-focused model (64K context)
+
+- **Qwen Models** (Refact team only)
+  - `Qwen3-235B-A22B` - Large-scale reasoning model (41K context)
+
+### Advanced Reasoning
 
 For select models, click the `💡Think` button to enable advanced reasoning, helping AI better solve complex tasks. Available only in [Refact.ai Pro plan](https://refact.ai/pricing/).
 
+**Models with Extended Thinking/Reasoning:**
+- All GPT-5 family models (OpenAI reasoning)
+- All O-series models (OpenAI reasoning)
+- All Claude 4.5 family models (Anthropic extended thinking)
+- DeepSeek Reasoner (DeepSeek reasoning)
+- Qwen3-235B-A22B (Qwen reasoning)
+
+### Model Capabilities Overview
+
+| Capability | Description | Supported Models |
+|------------|-------------|------------------|
+| **Tools/Function Calling** | Models can use external tools and APIs | Most models |
+| **Multimodal** | Support for image inputs | GPT-4.1, GPT-5, o4-mini, Claude 4.5, Gemini |
+| **Agent Mode** | Autonomous multi-step task handling | GPT-5, GPT-4.1, Claude 4.5, Gemini, DeepSeek |
+| **Reasoning** | Advanced problem-solving with chain-of-thought | GPT-5, O-series, Claude 4.5, DeepSeek, Qwen |
+| **Web Search** | Integrated web search capabilities | GPT-5 models, o4-mini-deep-research |
+| **Code Interpreter** | Execute code in a sandboxed environment | o4-mini-deep-research (supporting tool) |
+| **Prompt Caching** | Reduced costs for repeated contexts | OpenAI and Anthropic models |
+
+### Pricing Information
+
+All models are available with transparent token-based pricing:
+- **Prompt tokens**: Text you send to the model
+- **Generated tokens**: Text the model produces
+- **Cached tokens**: Previously processed context (discounted)
+
+Models with prompt caching (OpenAI, Anthropic) cost less when the same context is sent repeatedly: cache-read tokens can be up to 90% cheaper than regular prompt tokens, depending on the provider and model.
 
 ### Code completion models 
 - Qwen2.5-Coder-1.5B
diff --git a/refact-agent/engine/README.md b/refact-agent/engine/README.md
index 2c7aa247f..93c784f78 100644
--- a/refact-agent/engine/README.md
+++ b/refact-agent/engine/README.md
@@ -36,6 +36,32 @@ check out the [Text UI](#cli) below, you can talk about your project in the comm
 * Ask it anything! It will use the tools available to make changes to your project
 
 
+## Supported Models
+
+Refact Agent supports state-of-the-art models from multiple providers:
+
+### Model Families
+
+- **OpenAI**: GPT-5, GPT-4.1, O-series (o3-mini, o4-mini) - Advanced reasoning and agent capabilities
+- **Anthropic**: Claude 4.5 (Haiku, Sonnet, Opus) - Extended thinking and multimodal support
+- **Google**: Gemini 2.5 & 3.0 Pro - Large context windows up to 1M tokens
+- **DeepSeek**: Chat and Reasoner - High-performance inference
+- **Qwen**: Qwen3-235B - Large-scale reasoning
+
+### Key Capabilities
+
+- ✅ Streaming responses
+- ✅ Prompt caching (OpenAI, Anthropic)
+- ✅ Tool/function calling
+- ✅ Multimodal inputs (images)
+- ✅ Agent mode (autonomous task execution)
+- ✅ Extended reasoning/thinking modes
+- ✅ Web search integration (GPT-5, o4-mini-deep-research)
+- ✅ Autonomous multi-step research (o4-mini-deep-research)
+
+📜 **[View Complete Model List & Pricing](https://docs.refact.ai/supported-models/)**
+
+
 ## Installation
 
 Installable by the end user:
@@ -68,11 +94,14 @@ Installable by the end user:
 - [x] search_pattern() with scope (pattern matching)
 - [x] @file @tree @web @definition @references @search mentions in chat
 - [x] subagent() delegates focused tasks to independent sub-agents
-- [x] Latest gpt-4o gpt-4o-mini
-- [x] Claude-3-5-sonnet
-- [x] Llama-3.1 (passthrough)
-- [ ] Llama-3.2 (passthrough)
-- [ ] Llama-3.2 (scratchpad)
+- [x] OpenAI GPT-4.1, GPT-5, o3-mini, o4-mini models with reasoning support
+- [x] Anthropic Claude 4.5 family (Haiku, Sonnet, Opus) with extended thinking
+- [x] Google Gemini 2.5 & 3.0 Pro models
+- [x] DeepSeek Chat and Reasoner models
+- [x] Qwen3-235B reasoning model
+- [x] Prompt caching support (OpenAI, Anthropic)
+- [x] Web search integration (GPT-5 models)
+- [x] Code interpreter (o4-mini-deep-research)
 - [x] [Bring-your-own-key](https://docs.refact.ai/byok/)
 - [ ] Memory (--experimental)
 - [ ] Docker integration (--experimental)