Merged
8 changes: 4 additions & 4 deletions README.md
@@ -35,10 +35,10 @@ Refact Agent works effortlessly with the tools and databases you already use:
### ⚡ Why Choose Refact Agent?

- ✅ **Deploy On-Premise:** For maximum security, choose our self-hosted AI Agent version and run it on your own infrastructure.
- 🧠 **Access State-of-the-Art Models:** Use Claude 4, GPT-4o, or GPT-4o mini with AI Agent or for chat queries.
- 🔑 **Bring Your Own Key (BYOK):** Connect your API key and use any LLM: Gemini, Grok, OpenAI, Deepseek, and others.
- 🧠 **Access State-of-the-Art Models:** Use GPT-5, Claude 4.5, Gemini 3.0, DeepSeek, and more with AI Agent or for chat queries.
- 🔑 **Bring Your Own Key (BYOK):** Connect your API key and use any LLM: OpenAI, Anthropic, Google, DeepSeek, Qwen, and others.
- 💬 **Integrated IDE Chat:** Integrate with GitHub, PostgreSQL, Docker, and more. Refact.ai Agent accesses your resources and handles related operations autonomously, mimicking your workflow.
- ⚡ **Free, Unlimited, Context-Aware Auto-Completion:** Code faster with smart AI suggestions.
- ⚡ **Free, Unlimited, Context-Aware Auto-Completion:** Code faster with smart AI suggestions powered by Qwen2.5-Coder-1.5B with RAG.
- 🛠️ **Supports 25+ Programming Languages:** Python, JavaScript, Java, Rust, TypeScript, PHP, C++, C#, Go, and many more!

### 🎉 Hear from our Community
@@ -87,7 +87,7 @@ Our Ambassadors shared remarkable stories of how they transform weeks of coding

![integrations](https://lh7-rt.googleusercontent.com/docsz/AD_4nXc4DWYXF73AgPWAaFFGLTqEprWwA0im8R_A1QMo4QW4pTnSi1MCoP9L8udMZb5FPyN-CdgefaxJFGpX2ndn5nkjGBF2b_hZBNHogM7IM6SPvUIvUd9iE1lYIq7q-TB2qKzSGLk00A?key=zllGjEBckkx13bRZ6JIqX6qr)

✅ **State-of-the-Art Models** – Use Claude 4, GPT-4o, or GPT-4o mini with AI Agent or for chat queries.
✅ **State-of-the-Art Models** – Use GPT-5, Claude 4.5, Gemini 3.0, DeepSeek Reasoner, and more with AI Agent or for chat queries.

✅ **Bring Your Own Key (BYOK)** – Use your own API keys for external LLMs.

106 changes: 92 additions & 14 deletions docs/src/content/docs/supported-models.md
@@ -7,31 +7,109 @@ description: Supported Models in Refact.ai

With Refact.ai, access state-of-the-art models in your VS Code or JetBrains plugin and select the optimal LLM for each task.

### AI Agent models
- GPT 4.1 (default)
- Claude 3.7 Sonnet
- Claude 3.5 Sonnet
- GPT-4o
- o3-mini
### AI Agent Models

### Chat models
- GPT 4.1 (default)
- Claude 3.7 Sonnet
- Claude 3.5 Sonnet
- GPT-4o
- GPT-4o-mini
- o3-mini
Refact.ai supports advanced models with agent capabilities that can autonomously use tools, integrate with your development environment, and handle complex multi-step tasks:

- **GPT-5 Family** - Latest OpenAI models with reasoning, web search, and code interpreter
  - `gpt-5` - Most advanced model with full agent capabilities (400K context)
  - `gpt-5-mini` - Efficient reasoning model (400K context)
  - `gpt-5-nano` - Ultra-efficient option (400K context)
  - `gpt-5.1` - Enhanced version with improved reasoning (400K context)
  - `gpt-5.1-codex` - Code-specialized variant (400K context)

- **GPT-4.1 Family** - Advanced multimodal models (1M context)
  - `gpt-4.1` (default) - Full-featured with agent capabilities
  - `gpt-4.1-mini` - Balanced performance and cost
  - `gpt-4.1-nano` - Most cost-effective option

- **Claude 4.5 Family** - Anthropic's latest with extended thinking (200K context)
  - `claude-sonnet-4-5` - Balanced performance
  - `claude-haiku-4-5` - Fast and efficient
  - `claude-opus-4.5` - Most capable (PRO+ plans only)

- **O-Series** - Reasoning-focused models
  - `o4-mini` - Multimodal reasoning (200K context)
  - `o3-mini` - Compact reasoning model (200K context)
  - `o4-mini-deep-research` - Autonomous web research agent with code execution support (400K context)

- **Google Gemini Models** - Large context multimodal models
  - `gemini-2.5-pro` - Production-ready (1M context)
  - `gemini-2.5-pro-preview` - Preview access (200K context)
  - `gemini-3-pro-preview` - Next-generation preview (200K context)
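
As a rough illustration of how the context sizes above might guide model choice, the sketch below filters a subset of the listed models by how many tokens a request needs. The table `CONTEXT_WINDOWS` and the helper `models_that_fit` are hypothetical names introduced here, not part of Refact.ai, and reading "K" as 1,000 and "M" as 1,000,000 tokens is an assumption.

```python
# Context windows (in tokens) copied from a subset of the agent model list above.
# Assumption: "K" means 1,000 tokens and "M" means 1,000,000 tokens.
CONTEXT_WINDOWS = {
    "gpt-5": 400_000,
    "gpt-5-mini": 400_000,
    "gpt-5-nano": 400_000,
    "gpt-4.1": 1_000_000,
    "gpt-4.1-mini": 1_000_000,
    "gpt-4.1-nano": 1_000_000,
    "claude-sonnet-4-5": 200_000,
    "claude-haiku-4-5": 200_000,
    "o4-mini": 200_000,
    "o3-mini": 200_000,
    "gemini-2.5-pro": 1_000_000,
}


def models_that_fit(prompt_tokens, reserve_for_output=0):
    """Return the names of models whose context window can hold the prompt
    plus the number of tokens reserved for the reply (hypothetical helper)."""
    needed = prompt_tokens + reserve_for_output
    return sorted(
        name for name, window in CONTEXT_WINDOWS.items() if window >= needed
    )


if __name__ == "__main__":
    # A 500K-token prompt only fits the models listed with a 1M context.
    print(models_that_fit(500_000))
```

The numbers are taken from this page only; actual limits may depend on your plan or provider.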

### Chat Models

All agent models above can be used for chat, plus additional efficient options:

- **GPT-4.1 Family**
  - `gpt-4.1` (default) - Full-featured multimodal model (1M context)
  - `gpt-4.1-mini` - Balanced option (1M context)
  - `gpt-4.1-nano` - Most efficient (1M context)

- **GPT-5 Family**
  - `gpt-5`, `gpt-5-mini`, `gpt-5-nano` - All support chat with reasoning (400K context)
  - `gpt-5.1`, `gpt-5.1-codex` - Enhanced versions (400K context)

- **Claude 4.5 Family**
  - `claude-sonnet-4-5` - Balanced performance (200K context)
  - `claude-haiku-4-5` - Fast responses (200K context)
  - `claude-opus-4.5` - Maximum capability (200K context, PRO+ only)

- **O-Series**
  - `o4-mini`, `o3-mini` - Reasoning models
  - `o4-mini-deep-research` - Autonomous web research (multi-step internet research)

- **Google Gemini**
  - `gemini-2.5-pro`, `gemini-2.5-pro-preview`, `gemini-3-pro-preview` - Large context models

- **DeepSeek Models** (Refact team only)
  - `deepseek-chat` - High-performance chat with tools (64K context)
  - `deepseek-reasoner` - Reasoning-focused model (64K context)

- **Qwen Models** (Refact team only)
  - `Qwen3-235B-A22B` - Large-scale reasoning model (41K context)

### Advanced Reasoning

For select models, click the `💡Think` button to enable advanced reasoning, which helps the model solve complex tasks. Available only on the [Refact.ai Pro plan](https://refact.ai/pricing/).

**Models with Extended Thinking/Reasoning:**
- All GPT-5 family models (OpenAI reasoning)
- All O-series models (OpenAI reasoning)
- All Claude 4.5 family models (Anthropic extended thinking)
- DeepSeek Reasoner (DeepSeek reasoning)
- Qwen3-235B-A22B (Qwen reasoning)

### Model Capabilities Overview

| Capability | Description | Supported Models |
|------------|-------------|------------------|
| **Tools/Function Calling** | Models can use external tools and APIs | Most models |
| **Multimodal** | Support for image inputs | GPT-4.1, GPT-5, O4-mini, Claude 4.5, Gemini |
| **Agent Mode** | Autonomous multi-step task handling | GPT-5, GPT-4.1, Claude 4.5, Gemini, DeepSeek |
| **Reasoning** | Advanced problem-solving with chain-of-thought | GPT-5, O-series, Claude 4.5, DeepSeek, Qwen |
| **Web Search** | Integrated web search capabilities | GPT-5 models, o4-mini-deep-research |
| **Code Interpreter** | Execute code in sandboxed environment | o4-mini-deep-research (supporting tool) |
| **Prompt Caching** | Reduced costs for repeated contexts | OpenAI and Anthropic models |

### Pricing Information

All models are available with transparent token-based pricing:
- **Prompt tokens**: Text you send to the model
- **Generated tokens**: Text the model produces
- **Cached tokens**: Previously processed context (discounted)

Models with prompt caching (OpenAI, Anthropic) offer significant cost savings for repeated contexts. Cache read tokens are typically 90% cheaper than regular prompt tokens.
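
To make the arithmetic concrete, here is a minimal sketch of how the three token types could combine into a request cost. The function `estimate_cost`, the per-million-token prices in the example, and the choice to count cached tokens separately from prompt tokens are all assumptions for illustration; they are not Refact.ai's actual billing logic or price list. The 90% figure is the "typical" discount mentioned above.

```python
def estimate_cost(prompt_tokens, cached_tokens, generated_tokens,
                  prompt_price, generated_price, cache_discount=0.90):
    """Estimate the cost of one request (hypothetical helper).

    Prices are per 1,000,000 tokens, in any currency.
    Assumption: cached_tokens are billed separately from prompt_tokens,
    at the prompt price reduced by cache_discount.
    """
    cached_price = prompt_price * (1 - cache_discount)
    total = (
        prompt_tokens * prompt_price
        + cached_tokens * cached_price
        + generated_tokens * generated_price
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Made-up prices: 2.00 per 1M prompt tokens, 8.00 per 1M generated tokens.
    with_cache = estimate_cost(10_000, 90_000, 1_000, 2.00, 8.00)
    without_cache = estimate_cost(100_000, 0, 1_000, 2.00, 8.00)
    print(f"with cache: {with_cache:.3f}, without cache: {without_cache:.3f}")
```

With these made-up prices, a request that reuses 90,000 cached tokens costs roughly 0.046 instead of 0.208, which shows why caching matters most for long, repeated contexts.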

### Code completion models
- Qwen2.5-Coder-1.5B


## BYOK (Bring your own key)

Refact.ai gives flexibility to connect your API key and use any external LLM like Gemini, Grok, OpenAI, Deepseek, and others. Read the guide in our [BYOK Documentation](https://docs.refact.ai/byok/).
Refact.ai lets you connect your own API key and use any external LLM, including GPT, Claude, Gemini, Grok, DeepSeek, and others. It's easy: read the guide in our [BYOK Documentation](https://docs.refact.ai/byok/).



## Self-Hosted Version
39 changes: 34 additions & 5 deletions refact-agent/engine/README.md
@@ -36,6 +36,32 @@ check out the [Text UI](#cli) below, you can talk about your project in the comm
* Ask it anything! It will use the tools available to make changes to your project


## Supported Models

Refact Agent supports state-of-the-art models from multiple providers:

### Model Families

- **OpenAI**: GPT-5, GPT-4.1, O-series (o3-mini, o4-mini) - Advanced reasoning and agent capabilities
- **Anthropic**: Claude 4.5 (Haiku, Sonnet, Opus) - Extended thinking and multimodal support
- **Google**: Gemini 2.5 & 3.0 Pro - Large context windows up to 1M tokens
- **DeepSeek**: Chat and Reasoner - High-performance inference
- **Qwen**: Qwen3-235B - Large-scale reasoning

### Key Capabilities

- ✅ Streaming responses
- ✅ Prompt caching (OpenAI, Anthropic)
- ✅ Tool/function calling
- ✅ Multimodal inputs (images)
- ✅ Agent mode (autonomous task execution)
- ✅ Extended reasoning/thinking modes
- ✅ Web search integration (GPT-5, o4-mini-deep-research)
- ✅ Autonomous multi-step research (o4-mini-deep-research)

📜 **[View Complete Model List & Pricing](https://docs.refact.ai/supported-models/)**


## Installation

Installable by the end user:
@@ -68,11 +94,14 @@ Installable by the end user:
- [x] search_pattern() with scope (pattern matching)
- [x] @file @tree @web @definition @references @search mentions in chat
- [x] subagent() delegates focused tasks to independent sub-agents
- [x] Latest gpt-4o gpt-4o-mini
- [x] Claude-3-5-sonnet
- [x] Llama-3.1 (passthrough)
- [ ] Llama-3.2 (passthrough)
- [ ] Llama-3.2 (scratchpad)
- [x] OpenAI GPT-4.1, GPT-5, o3-mini, o4-mini models with reasoning support
- [x] Anthropic Claude 4.5 family (Haiku, Sonnet, Opus) with extended thinking
- [x] Google Gemini 2.5 & 3.0 Pro models
- [x] DeepSeek Chat and Reasoner models
- [x] Qwen3-235B reasoning model
- [x] Prompt caching support (OpenAI, Anthropic)
- [x] Web search integration (GPT-5 models)
- [x] Code interpreter (o4-mini-deep-research)
- [x] [Bring-your-own-key](https://docs.refact.ai/byok/)
- [ ] Memory (--experimental)
- [ ] Docker integration (--experimental)