Commit 1596dff

Merge pull request #131 from ChawlaAvi/main
Add video RAG Demo
2 parents eb6330f + a17661d commit 1596dff

7 files changed: 779 additions, 0 deletions

video-rag-gemini/.env.example

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
# Gemini API Configuration
GEMINI_API_KEY=your_gemini_api_key_here

# Get your API key from: https://aistudio.google.com/app/apikey

video-rag-gemini/README.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
# 🎬 Video RAG with Gemini

A Streamlit demo that lets you upload a video and chat with it using Google's multimodal Gemini models.

## Features

- 📹 **Video Upload**: Support for multiple video formats (MP4, AVI, MOV, MKV, WEBM)
- 🤖 **AI-Powered Chat**: Ask questions about your video content using Gemini's video understanding
- 💬 **Interactive Interface**: Clean chat interface with streaming responses
- 🔄 **Session Management**: Maintain chat history and video context
- **Real-time Processing**: Upload and process videos with progress feedback

## Setup

1. **Install Dependencies**
   ```bash
   pip install -r requirements.txt
   ```

2. **Get Gemini API Key**
   - Visit [Google AI Studio](https://aistudio.google.com/app/apikey)
   - Create a new API key
   - Keep it secure - you'll enter it in the app

3. **Run the Application**
   ```bash
   streamlit run app.py
   ```

## Usage

1. **Enter API Key**: Input your Gemini API key in the sidebar
2. **Upload Video**: Choose a video file (supported formats listed above)
3. **Wait for Processing**: The video will be uploaded and processed by Gemini
4. **Start Chatting**: Ask questions about your video content!

## Example Questions

- "What is happening in this video?"
- "Summarize the main events"
- "Who are the people in this video?"
- "What objects can you see?"
- "Describe the setting and environment"
- "What actions are taking place?"

## Technical Details

- **Video Processing**: Uses Gemini's File API for video upload and processing
- **Multimodal AI**: Combines video understanding with natural language processing
- **File Size Limits**: Files larger than about 100MB may take longer to process or may fail
- **Supported Formats**: MP4, AVI, MOV, MKV, WEBM

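The upload-then-wait flow described under Video Processing can be sketched as a small polling helper. This is a minimal sketch, not the code in `app.py`: the function name `wait_until_active`, the poll interval, and the timeout are all assumptions, and the state names (`PROCESSING`, `ACTIVE`) follow what the `google-generativeai` SDK reports for uploaded files.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[], str],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it stops returning "PROCESSING".

    Returns the final state string. Raises TimeoutError if the file is
    still processing after roughly `timeout` seconds of waiting.
    """
    waited = 0.0
    state = get_state()
    while state == "PROCESSING":
        if waited >= timeout:
            raise TimeoutError(f"still processing after {waited:.0f}s")
        sleep(poll_interval)
        waited += poll_interval
        state = get_state()
    return state


# Hypothetical wiring to the google-generativeai SDK (an assumption,
# not taken from app.py):
#   video = genai.upload_file(path="clip.mp4")
#   state = wait_until_active(lambda: genai.get_file(video.name).state.name)
#   if state != "ACTIVE": report the failure to the user
```

Passing the state lookup in as a callable keeps the helper independent of the SDK, so the wait logic can be exercised without an API key.
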
## Limitations

- Video processing time depends on file size and complexity
- Large files may fail to upload or process
- API rate limits may apply based on your Gemini API plan
- Some video formats may not be supported

## Troubleshooting

- **Upload Fails**: Check video format and file size
- **Processing Stuck**: Wait a few minutes; large files take time
- **API Errors**: Verify your API key is correct and has sufficient quota
- **No Response**: Try refreshing the page and re-uploading the video

## Built With

- [Streamlit](https://streamlit.io/) - Web app framework
- [Google Gemini API](https://ai.google.dev/gemini-api) - Multimodal AI capabilities
- [Python](https://python.org/) - Backend processing

---

*Part of the AI Engineering Hub - Building practical AI applications*

video-rag-gemini/USAGE.md

Lines changed: 184 additions & 0 deletions
@@ -0,0 +1,184 @@
# 🎬 Video RAG Usage Guide

This guide will help you get started with the Video RAG demo using Google's Gemini API.

## Quick Start

### 1. Setup Environment

```bash
# Clone or navigate to the video-rag-gemini directory
cd video-rag-gemini

# Install dependencies
pip install -r requirements.txt

# Test your setup
python test_setup.py
```

### 2. Get Gemini API Key

1. Visit [Google AI Studio](https://aistudio.google.com/app/apikey)
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy your API key

### 3. Configure API Key

**Option A: Environment Variable (Recommended)**
```bash
# Create .env file
cp .env.example .env

# Open .env in an editor and set this line to your real key
GEMINI_API_KEY=your_actual_api_key_here
```

**Option B: Enter in App**
- You can also enter the API key directly in the Streamlit sidebar

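With two ways to supply the key, the app needs a rule for which one wins. The sketch below is one plausible rule, not the logic in `app.py`: the helper name `resolve_api_key` and the precedence (sidebar value first, then `GEMINI_API_KEY`) are assumptions.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    sidebar_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the key typed in the sidebar, else GEMINI_API_KEY, else None.

    Assumed precedence: an explicit sidebar entry overrides the environment.
    Blank or whitespace-only values are treated as missing.
    """
    for candidate in (sidebar_value, env.get("GEMINI_API_KEY")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


# Values from a .env file only appear in os.environ if something loads
# them first, for example python-dotenv's load_dotenv() (an assumption
# about the project's dependencies).
```

Returning `None` rather than raising lets the caller decide how to prompt for a key.
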
### 4. Run the Application

```bash
streamlit run app.py
```

The app will open in your browser at `http://localhost:8501`.

## Using the App

### Step 1: Enter API Key
- If you haven't set up the environment variable, enter your Gemini API key in the sidebar
- The key is masked for security

### Step 2: Upload Video
- Click "Choose a video file" in the sidebar
- Supported formats: MP4, AVI, MOV, MKV, WEBM
- File size limit: ~100MB (larger files may fail)
- Wait for the video to be processed (this can take several minutes)

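The format and size rules above can be checked before anything is sent to Gemini. The following is a minimal sketch under stated assumptions: `check_video` and the two constants are hypothetical names, and the 100MB figure is the guide's approximate guideline rather than a hard API limit.

```python
from pathlib import Path
from typing import Optional

# Formats listed in this guide.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}
# The guide's approximate guideline, not a limit enforced by the API.
MAX_SIZE_BYTES = 100 * 1024 * 1024


def check_video(filename: str, size_bytes: int) -> Optional[str]:
    """Return None if the file looks acceptable, else a short reason."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported format: {ext or 'no extension'}"
    if size_bytes <= 0:
        return "file is empty"
    if size_bytes > MAX_SIZE_BYTES:
        return f"file is {size_bytes / 1024**2:.0f} MB, over the 100 MB guideline"
    return None
```

A check like this only looks at the name and size; a file with a supported extension can still fail during processing if its contents are corrupted.
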
### Step 3: Start Chatting
- Once processing is complete, you'll see example questions
- Click on example questions or type your own
- Ask anything about the video content!

## Example Questions

### General Analysis
- "What is happening in this video?"
- "Summarize the main events"
- "Describe the overall scene"

### People & Objects
- "Who are the people in this video?"
- "What objects can you see?"
- "Describe the clothing or appearance of people"

### Actions & Events
- "What actions are taking place?"
- "What is the sequence of events?"
- "What happens at the beginning/middle/end?"

### Environment & Setting
- "What is the setting or location?"
- "Describe the environment"
- "What time of day is it?"

### Specific Details
- "What colors are prominent in the video?"
- "What sounds might be present?" (Note: this demo focuses on visual content; audio handling depends on the Gemini model used)
- "What emotions are expressed?"

## Tips for Best Results

### Video Quality
- Use clear, well-lit videos
- Avoid very shaky or blurry footage
- Higher resolution generally works better

### Question Types
- Be specific in your questions
- Ask about visual elements (this guide assumes visual analysis; audio understanding depends on the Gemini model used)
- Break complex questions into simpler parts

### File Management
- Keep video files under 100MB when possible
- Use common formats (MP4 is most reliable)
- Compress large files if needed

## Troubleshooting

### Common Issues

**"Error uploading video"**
- Check file format and size
- Ensure stable internet connection
- Try a different video file

**"Video processing failed"**
- File may be too large or corrupted
- Try compressing the video
- Check if format is supported

**"Error generating response"**
- API key may be invalid or expired
- Check your API quota/billing
- Try a simpler question first

**App is slow or unresponsive**
- Large videos take time to process
- Wait a few minutes before trying again
- Refresh the page if needed

### Getting Help

1. **Check Setup**: Run `python test_setup.py`
2. **Verify API Key**: Make sure it's correct and has quota
3. **Test with Small Video**: Try a short, small video first
4. **Check Logs**: Look at the Streamlit terminal for error messages

## Advanced Usage

### Command Line Demo
```bash
# Run the command-line demo
python demo.py
```

### Environment Variables
```bash
# Set API key for session
export GEMINI_API_KEY=your_key_here

# Run app
streamlit run app.py
```

### Custom Configuration
You can modify `app.py` to:
- Change the Gemini model (e.g., gemini-1.5-flash for faster responses)
- Adjust file size limits
- Customize the UI theme
- Add additional video formats

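One way to make the first, second, and fourth of these changes without editing code each time is to gather the settings into a single object. This is a sketch only: `AppConfig`, `load_config`, and the `VIDEO_RAG_*` environment variables are hypothetical and do not exist in `app.py`, and the default model shown is a placeholder taken from the example above.

```python
import os
from dataclasses import dataclass
from typing import FrozenSet, Mapping


@dataclass(frozen=True)
class AppConfig:
    model_name: str = "gemini-1.5-flash"  # placeholder default
    max_upload_mb: int = 100
    video_extensions: FrozenSet[str] = frozenset({"mp4", "avi", "mov", "mkv", "webm"})


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build a config, letting hypothetical VIDEO_RAG_* variables override defaults."""
    defaults = AppConfig()
    raw_extensions = env.get("VIDEO_RAG_EXTENSIONS")
    if raw_extensions:
        # Accept "mp4, .MOV" style lists: trim, lowercase, drop leading dots.
        extensions = frozenset(
            e.strip().lower().lstrip(".") for e in raw_extensions.split(",") if e.strip()
        )
    else:
        extensions = defaults.video_extensions
    return AppConfig(
        model_name=env.get("VIDEO_RAG_MODEL", defaults.model_name),
        max_upload_mb=int(env.get("VIDEO_RAG_MAX_MB", defaults.max_upload_mb)),
        video_extensions=extensions,
    )
```

The extension set could then be passed to Streamlit's `st.file_uploader` through its `type` argument, which accepts a list of allowed extensions.
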
## API Limits & Costs

- **Free Tier**: Limited requests per minute/day
- **File Size**: ~100MB per file
- **Processing Time**: Varies by video length and complexity
- **Rate Limits**: May need to wait between requests

Check [Gemini API pricing](https://ai.google.dev/pricing) for current limits and costs.

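"Wait between requests" is commonly implemented as retry with exponential backoff. The sketch below shows that general technique and is not taken from `app.py`: `call_with_backoff` is a hypothetical helper, and the caller must supply `is_rate_limited`, because how a rate-limit error is reported depends on the SDK version in use.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    request: Callable[[], T],
    is_rate_limited: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying with exponential backoff on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return request()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Re-raise anything that is not a rate limit, or the final failure.
            if last_attempt or not is_rate_limited(exc):
                raise
            # Delays grow as base, 2x base, 4x base, ... plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The jitter spreads out retries so that several sessions hitting the limit at once do not all retry at the same instant.
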
## Security Notes

- Never share your API key publicly
- Use environment variables for production
- The app doesn't store videos permanently
- Videos are uploaded to Google's servers for processing

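Not storing videos permanently usually means the upload lives in a temporary file only for as long as it is needed. How `app.py` does this is not shown here; the following is a minimal sketch of one approach, with `temporary_video` as a hypothetical helper name.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temporary_video(data: bytes, suffix: str = ".mp4") -> Iterator[str]:
    """Write data to a temp file, yield its path, and delete it afterwards."""
    handle = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        handle.write(data)
        handle.close()  # close first so other code can reopen the path
        yield handle.name
    finally:
        handle.close()  # safe to call again if an error occurred before close
        if os.path.exists(handle.name):
            os.remove(handle.name)
```

The local copy is removed even if the upload raises. This covers only the local file; the copy uploaded to Google's servers is governed by the File API's own retention rules.
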
---

*Happy video chatting! 🎬✨*
