smallcloudai · JegernOUTT · Dec 3, 2025 · Jun 26, 2025 · Dec 3, 2025
diff --git a/README.md b/README.md
@@ -35,10 +35,10 @@ Refact Agent works effortlessly with the tools and databases you already use:
 ### ⚡ Why Choose Refact Agent?  
 
 - ✅ **Deploy On-Premise:** For maximum security, choose our self-hosted AI Agent version and run it on your own infrastructure.
-- 🧠 **Access State-of-the-Art Models:** Use Claude 4, GPT-4o, or GPT-4o mini with AI Agent or for chat queries.
-- 🔑 **Bring Your Own Key (BYOK):** Connect your API key and use any LLM: Gemini, Grok, OpenAI, Deepseek, and others.
+- 🧠 **Access State-of-the-Art Models:** Use GPT-5, Claude 4.5, Gemini 3.0, DeepSeek, and more with AI Agent or for chat queries.
+- 🔑 **Bring Your Own Key (BYOK):** Connect your API key and use any LLM: OpenAI, Anthropic, Google, DeepSeek, Qwen, and others.
 - 💬 **Integrated IDE Chat:** Integrate with GitHub, PostgreSQL, Docker, and more. Refact.ai Agent accesses your resources and handles related operations autonomously, mimicking your workflow.
-- ⚡ **Free, Unlimited, Context-Aware Auto-Completion:** Code faster with smart AI suggestions.  
+- ⚡ **Free, Unlimited, Context-Aware Auto-Completion:** Code faster with smart AI suggestions powered by Qwen2.5-Coder-1.5B with RAG.
 - 🛠️ **Supports 25+ Programming Languages:** Python, JavaScript, Java, Rust, TypeScript, PHP, C++, C#, Go, and many more!  
 
 ### 🎉  Hear from our Community
@@ -87,7 +87,7 @@ Our Ambassadors shared remarkable stories of how they transform weeks of coding
 
 ![integrations](https://lh7-rt.googleusercontent.com/docsz/AD_4nXc4DWYXF73AgPWAaFFGLTqEprWwA0im8R_A1QMo4QW4pTnSi1MCoP9L8udMZb5FPyN-CdgefaxJFGpX2ndn5nkjGBF2b_hZBNHogM7IM6SPvUIvUd9iE1lYIq7q-TB2qKzSGLk00A?key=zllGjEBckkx13bRZ6JIqX6qr)
 
- ✅ **State-of-the-Art Models** – Use Claude 4, GPT-4o, or GPT-4o mini with AI Agent or for chat queries.
+ ✅ **State-of-the-Art Models** – Use GPT-5, Claude 4.5, Gemini 3.0, DeepSeek Reasoner, and more with AI Agent or for chat queries.
 
  ✅ **Bring Your Own Key (BYOK)** – Use your own API keys for external LLMs.  
 

diff --git a/docs/src/content/docs/supported-models.md b/docs/src/content/docs/supported-models.md
@@ -7,31 +7,109 @@ description: Supported Models in Refact.ai
 
 With Refact.ai, access state-of-the-art models in your VS Code or JetBrains plugin and select the optimal LLM for each task.
 
-### AI Agent models
-- GPT 4.1 (default)
-- Claude 3.7 Sonnet
-- Claude 3.5 Sonnet
-- GPT-4o
-- o3-mini
+### AI Agent Models
 
-### Chat models
-- GPT 4.1 (default)
-- Claude 3.7 Sonnet
-- Claude 3.5 Sonnet
-- GPT-4o
-- GPT-4o-mini
-- o3-mini
+Refact.ai supports advanced agent-capable models that can autonomously use tools, integrate with your development environment, and handle complex multi-step tasks:
+
+- **GPT-5 Family** - Latest OpenAI models with reasoning, web search, and code interpreter
+  - `gpt-5` - Most advanced model with full agent capabilities (400K context)
+  - `gpt-5-mini` - Efficient reasoning model (400K context)
+  - `gpt-5-nano` - Ultra-efficient option (400K context)
+  - `gpt-5.1` - Enhanced version with improved reasoning (400K context)
+  - `gpt-5.1-codex` - Code-specialized variant (400K context)
+
+- **GPT-4.1 Family** - Advanced multimodal models (1M context)
+  - `gpt-4.1` (default) - Full-featured with agent capabilities
+  - `gpt-4.1-mini` - Balanced performance and cost
+  - `gpt-4.1-nano` - Most cost-effective option
+
+- **Claude 4.5 Family** - Anthropic's latest with extended thinking (200K context)
+  - `claude-sonnet-4-5` - Balanced performance
+  - `claude-haiku-4-5` - Fast and efficient
+  - `claude-opus-4.5` - Most capable (PRO+ plans only)
+
+- **O-Series** - Reasoning-focused models
+  - `o4-mini` - Multimodal reasoning (200K context)
+  - `o3-mini` - Compact reasoning model (200K context)
+  - `o4-mini-deep-research` - Autonomous web research agent with code execution support (400K context)
+
+- **Google Gemini Models** - Large context multimodal models
+  - `gemini-2.5-pro` - Production-ready (1M context)
+  - `gemini-2.5-pro-preview` - Preview access (200K context)
+  - `gemini-3-pro-preview` - Next-generation preview (200K context)
+
+### Chat Models
+
+All agent models above can also be used for chat, along with a few additional options:
+
+- **GPT-4.1 Family**
+  - `gpt-4.1` (default) - Full-featured multimodal model (1M context)
+  - `gpt-4.1-mini` - Balanced option (1M context)
+  - `gpt-4.1-nano` - Most efficient (1M context)
+
+- **GPT-5 Family**
+  - `gpt-5`, `gpt-5-mini`, `gpt-5-nano` - All support chat with reasoning (400K context)
+  - `gpt-5.1`, `gpt-5.1-codex` - Enhanced versions (400K context)
+
+- **Claude 4.5 Family**
+  - `claude-sonnet-4-5` - Balanced performance (200K context)
+  - `claude-haiku-4-5` - Fast responses (200K context)
+  - `claude-opus-4.5` - Maximum capability (200K context, PRO+ only)
+
+- **O-Series**
+  - `o4-mini`, `o3-mini` - Reasoning models
+  - `o4-mini-deep-research` - Autonomous multi-step web research
+
+- **Google Gemini**
+  - `gemini-2.5-pro`, `gemini-2.5-pro-preview`, `gemini-3-pro-preview` - Large context models
+
+- **DeepSeek Models** (Refact team only)
+  - `deepseek-chat` - High-performance chat with tools (64K context)
+  - `deepseek-reasoner` - Reasoning-focused model (64K context)
+
+- **Qwen Models** (Refact team only)
+  - `Qwen3-235B-A22B` - Large-scale reasoning model (41K context)
+
+### Advanced Reasoning
 
 For select models, click the `💡Think` button to enable advanced reasoning, helping AI better solve complex tasks. Available only in [Refact.ai Pro plan](https://refact.ai/pricing/).
 
+**Models with Extended Thinking/Reasoning:**
+- All GPT-5 family models (OpenAI reasoning)
+- All O-series models (OpenAI reasoning)
+- All Claude 4.5 family models (Anthropic extended thinking)
+- DeepSeek Reasoner (DeepSeek reasoning)
+- Qwen3-235B-A22B (Qwen reasoning)
+
+### Model Capabilities Overview
+
+| Capability | Description | Supported Models |
+|------------|-------------|------------------|
+| **Tools/Function Calling** | Models can use external tools and APIs | Most models |
+| **Multimodal** | Support for image inputs | GPT-4.1, GPT-5, o4-mini, Claude 4.5, Gemini |
+| **Agent Mode** | Autonomous multi-step task handling | GPT-5, GPT-4.1, Claude 4.5, Gemini, DeepSeek |
+| **Reasoning** | Advanced problem-solving with chain-of-thought | GPT-5, O-series, Claude 4.5, DeepSeek, Qwen |
+| **Web Search** | Integrated web search capabilities | GPT-5 models, o4-mini-deep-research |
+| **Code Interpreter** | Execute code in sandboxed environment | o4-mini-deep-research (supporting tool) |
+| **Prompt Caching** | Reduced costs for repeated contexts | OpenAI and Anthropic models |
+
+### Pricing Information
+
+All models are available with transparent token-based pricing:
+- **Prompt tokens**: Text you send to the model
+- **Generated tokens**: Text the model produces
+- **Cached tokens**: Previously processed context (discounted)
+
+Models with prompt caching (OpenAI, Anthropic) offer significant cost savings for repeated contexts. Cache read tokens can be up to 90% cheaper than regular prompt tokens, depending on the provider and model.
 
 ### Code completion models 
 - Qwen2.5-Coder-1.5B
 
 
 ## BYOK (Bring your own key)
 
-Refact.ai gives flexibility to connect your API key and use any external LLM like Gemini, Grok, OpenAI, Deepseek, and others. Read the guide in our [BYOK Documentation](https://docs.refact.ai/byok/).
+Refact.ai lets you connect your own API key and use any external LLM, including GPT, Claude, Gemini, Grok, DeepSeek, and others. It's easy: read the guide in our [BYOK Documentation](https://docs.refact.ai/byok/).
+
 
 
 ## Self-Hosted Version

diff --git a/refact-agent/engine/README.md b/refact-agent/engine/README.md
@@ -36,6 +36,32 @@ check out the [Text UI](#cli) below, you can talk about your project in the comm
 * Ask it anything! It will use the tools available to make changes to your project
 
 
+## Supported Models
+
+Refact Agent supports state-of-the-art models from multiple providers:
+
+### Model Families
+
+- **OpenAI**: GPT-5, GPT-4.1, O-series (o3-mini, o4-mini) - Advanced reasoning and agent capabilities
+- **Anthropic**: Claude 4.5 (Haiku, Sonnet, Opus) - Extended thinking and multimodal support
+- **Google**: Gemini 2.5 & 3.0 Pro - Large context windows up to 1M tokens
+- **DeepSeek**: Chat and Reasoner - High-performance inference
+- **Qwen**: Qwen3-235B - Large-scale reasoning
+
+### Key Capabilities
+
+- ✅ Streaming responses
+- ✅ Prompt caching (OpenAI, Anthropic)
+- ✅ Tool/function calling
+- ✅ Multimodal inputs (images)
+- ✅ Agent mode (autonomous task execution)
+- ✅ Extended reasoning/thinking modes
+- ✅ Web search integration (GPT-5, o4-mini-deep-research)
+- ✅ Autonomous multi-step research (o4-mini-deep-research)
+
+📜 **[View Complete Model List & Pricing](https://docs.refact.ai/supported-models/)**
+
+
 ## Installation
 
 Installable by the end user:
@@ -68,11 +94,14 @@ Installable by the end user:
 - [x] search_pattern() with scope (pattern matching)
 - [x] @file @tree @web @definition @references @search mentions in chat
 - [x] subagent() delegates focused tasks to independent sub-agents
-- [x] Latest gpt-4o gpt-4o-mini
-- [x] Claude-3-5-sonnet
-- [x] Llama-3.1 (passthrough)
-- [ ] Llama-3.2 (passthrough)
-- [ ] Llama-3.2 (scratchpad)
+- [x] OpenAI GPT-4.1, GPT-5, o3-mini, o4-mini models with reasoning support
+- [x] Anthropic Claude 4.5 family (Haiku, Sonnet, Opus) with extended thinking
+- [x] Google Gemini 2.5 & 3.0 Pro models
+- [x] DeepSeek Chat and Reasoner models
+- [x] Qwen3-235B reasoning model
+- [x] Prompt caching support (OpenAI, Anthropic)
+- [x] Web search integration (GPT-5 models)
+- [x] Code interpreter (o4-mini-deep-research)
 - [x] [Bring-your-own-key](https://docs.refact.ai/byok/)
 - [ ] Memory (--experimental)
 - [ ] Docker integration (--experimental)