# 🚀 Local ChatGPT — Powered by Ollama + Python (Streaming + PDF RAG + Embeddings + FAISS)

A **fully local ChatGPT alternative** with:

- ✔ ChatGPT-style UI
- ✔ Local LLMs via Ollama (Phi-3, LLaMA-3, Mistral, etc.)
- ✔ PDF upload + RAG
- ✔ Real-time streaming responses
- ✔ Embeddings using `nomic-embed-text`
- ✔ FAISS vector search
- ✔ No cloud, no API keys — **100% offline**

Built using **Flask + TailwindCSS + Ollama + FAISS + PyMuPDF**.
---
# ✨ Features
| Feature | Description |
| ----------------- | -------------------------------------------------------- |
| 🧠 Local LLM Chat | Uses any Ollama model (Phi-3, LLaMA-3, Mistral, Gemma…) |
| 📄 PDF Upload | Load documents and ask questions from them |
| 🔍 RAG | FAISS-powered retrieval from PDF chunks |
| 🧬 Embeddings | Uses `nomic-embed-text` for dense vector embeddings |
| 🚀 Live Streaming | Real-time token-by-token streaming like ChatGPT |
| 🎨 Modern UI | TailwindCSS dark-mode chat interface |
| 🔁 Model Selector | Switch Ollama models dynamically |
| 💾 Chat Memory | Auto-saved in browser localStorage |
| 🔐 100% Offline | No external API calls — everything runs on your laptop |
---
# 🧠 Architecture Overview
```
Frontend (HTML + TailwindCSS)
        ↓
Flask API (Python)
        ↓
Ollama Chat Models (phi3, llama3, mistral…)
        ↓
Ollama Embedding Model (nomic-embed-text)
        ↓
FAISS Vector Search
        ↓
PDF Question Answering (RAG)
```
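To make the chat path concrete, here is a minimal sketch of a Flask route that streams tokens from Ollama's `/api/chat` endpoint. The `/chat` route name and payload shape are illustrative, not necessarily what `app.py` uses:

```python
# Minimal sketch of the streaming chat path (illustrative; the real app.py
# may differ). Assumes an Ollama server on its default port.
import json

import requests
from flask import Flask, Response, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434"

@app.route("/chat", methods=["POST"])
def chat():
    body = request.get_json()
    payload = {
        "model": body.get("model", "phi3"),
        "messages": body["messages"],  # [{"role": "user", "content": "..."}]
        "stream": True,
    }

    def generate():
        # Ollama streams one JSON object per line (NDJSON); each line
        # carries the next token(s) in message.content.
        with requests.post(f"{OLLAMA_URL}/api/chat", json=payload, stream=True) as r:
            for line in r.iter_lines():
                if line:
                    yield json.loads(line).get("message", {}).get("content", "")

    return Response(generate(), mimetype="text/plain")
```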
---
# 📁 Project Structure
```
ollama_web_chat/
│
├── app.py
├── requirements.txt
├── README.md
├── .gitignore
│
├── templates/
│   └── index.html
│
└── static/          (optional for CSS/JS assets)
```
---
# ⚙️ Installation
## 1️⃣ Install Ollama
Linux/macOS:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Windows: download the installer from https://ollama.com/download

## 2️⃣ Pull the Models

```bash
ollama pull phi3
ollama pull llama3
ollama pull mistral
ollama pull nomic-embed-text
```

## 3️⃣ Install Python Dependencies

```bash
pip install -r requirements.txt
```

## 4️⃣ Run the App

```bash
python app.py
```

Then open:

http://localhost:5000
---
# 🧬 Embeddings + RAG

This project uses **`nomic-embed-text`** for embeddings.

Each chunk of your PDF is converted into a vector (embedding), and these vectors are stored in a FAISS index.
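A rough sketch of that step, assuming Ollama's `/api/embeddings` endpoint and a flat L2 FAISS index (the `embed` and `build_index` helpers are illustrative):

```python
# Sketch: embed text chunks with Ollama and index them in FAISS.
import faiss
import numpy as np
import requests

OLLAMA_URL = "http://localhost:11434"

def embed(text: str) -> np.ndarray:
    """One chunk → one dense vector via the local embedding model."""
    r = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
    )
    return np.array(r.json()["embedding"], dtype="float32")

def build_index(chunks: list[str]) -> faiss.IndexFlatL2:
    vectors = np.stack([embed(c) for c in chunks])
    index = faiss.IndexFlatL2(vectors.shape[1])  # exact L2 search
    index.add(vectors)
    return index
```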
When you ask a question:
- User question → embedded
- Vector compared against FAISS index
- Top-k relevant chunks returned
- Passed to LLM as RAG prompt
- LLM answers using context
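Put together, the retrieval step might look like this (continuing the sketch above; the prompt template and `answer()` helper are illustrative, not the exact code in `app.py`):

```python
# Sketch: retrieve top-k chunks and ask the LLM to answer from them.
def answer(question: str, index: faiss.IndexFlatL2, chunks: list[str],
           model: str = "phi3", k: int = 3) -> str:
    query = embed(question).reshape(1, -1)
    _, ids = index.search(query, k)              # top-k nearest chunks
    context = "\n\n".join(chunks[i] for i in ids[0])
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    r = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
    )
    return r.json()["response"]
```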
Because answers are grounded in retrieved context, this works well for:
- Annual reports
- Research papers
- Legal documents
- Technical PDFs
- Policies
```
PDF → Text Extraction → Chunking → Embedding → FAISS Index
                                                   ↑
                                             User Question
                                                   ↓
                                             Vector Search
                                                   ↓
                                            Relevant Chunks
                                                   ↓
                                         LLM generates answer
```
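The ingestion half of that pipeline can be done with PyMuPDF plus a simple fixed-size chunker; a minimal sketch (the chunk and overlap sizes are illustrative):

```python
# Sketch: PDF → text → overlapping fixed-size chunks
# (PyMuPDF's import name is fitz).
import fitz  # PyMuPDF

def pdf_to_chunks(path: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    doc = fitz.open(path)
    text = "".join(page.get_text() for page in doc)
    doc.close()
    # Overlap keeps sentences that straddle a chunk boundary retrievable.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```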
---
# 🔁 Model Selector

The UI lists every locally installed model via the app's `/models` route, which wraps Ollama's tags API.
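Listing installed models takes a single call to the tags endpoint, for example:

```python
# Sketch: ask the local Ollama server which models are installed.
import requests

tags = requests.get("http://localhost:11434/api/tags").json()
models = [m["name"] for m in tags["models"]]
print(models)  # e.g. ['phi3:latest', 'llama3:latest', 'mistral:latest']
```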
You can switch between:
- phi3
- mistral
- llama3
- codellama
- gemma
- your custom models
Switching models clears the chat history automatically.
---
# 🛣 Roadmap

**Done:**

- Local chat
- PDF embedding + FAISS
- Model selection
- Streaming
- Modern UI

**Planned:**

- Sidebar multi-threaded chat
- Voice input (Whisper)
- Export conversation
- Multi-PDF knowledge base
- Dark/light toggle
- Custom embedding model fine-tuning
- Workspace mode (knowledge graphs)
- Browser extension version
- Desktop app (Electron or PyInstaller)
---
# 🤝 Contributing

Pull requests are welcome! If you want to add a feature, please open an issue first.

---
# ⭐ Support

If you find this project useful, please give the repo a ⭐ on GitHub. It helps more people discover offline LLM tools.

---
# 📜 License

This project is licensed under the MIT License.