| 1 | +# InternSync API |
| 2 | + |
| 3 | +InternSync API provides the backend services for the internship management platform. The API handles user authentication, job postings, applications, and file processing with intelligent resume analysis capabilities. |
| 4 | + |
| 5 | +## What It Does |
| 6 | + |
| 7 | +The API serves as the core backend that powers the InternSync platform: |
| 8 | + |
| 9 | +**User Management:** |
| 10 | +- Secure user registration and authentication for students and companies |
| 11 | +- JWT token-based session management |
| 12 | +- Role-based access control with distinct permissions |
| 13 | + |
| 14 | +**Job Management:** |
| 15 | +- Companies can post, edit, and manage internship opportunities |
| 16 | +- Students can browse active job listings with filtering |
| 17 | +- Automatic pagination for efficient data loading |
| 18 | +- Status tracking (draft, active, closed) for job postings |
| 19 | + |
| 20 | +**Application Processing:** |
| 21 | +- Students submit applications with resume uploads |
| 22 | +- Secure file storage and validation for PDF documents |
| 23 | +- Application status management throughout the hiring process |
| 24 | +- Real-time tracking of application progress |
| 25 | + |
| 26 | +**File Processing:** |
| 27 | +- PDF resume upload and storage |
| 28 | +- Text extraction from resumes for AI analysis |
| 29 | +- AWS S3 bucket support |
| 30 | +- File size validation and format verification |
| 31 | + |
| 32 | +## Technology Stack |
| 33 | + |
| 34 | +### Core Framework |
| 35 | +- **Django 4.2.21** - Python web framework for rapid development |
| 36 | +- **Django REST Framework** - RESTful API implementation |
| 37 | +- **PostgreSQL** - Production-grade relational database |
| 38 | +- **JWT Authentication** - Secure token-based authentication |
| 39 | + |
| 40 | +### File Processing |
| 41 | +- **PyPDF2** - PDF text extraction and manipulation |
| 42 | +- **pdfplumber** - Advanced PDF parsing capabilities |
| 43 | +- **PyMuPDF** - High-performance PDF processing |
| 44 | + |
| 45 | +### Security & Validation |
| 46 | +- **Django CORS Headers** - Cross-origin resource sharing configuration |
| 47 | +- **Input sanitization** - XSS protection with HTML escaping |
| 48 | +- **Password hashing** - Secure password storage with Django's built-in hashers |
| 49 | + |
| 50 | +## Database Schema |
| 51 | + |
| 52 | +The API uses a PostgreSQL database with the following core models: |
| 53 | + |
| 54 | +### Users Table |
| 55 | +- **userName** - Unique username for authentication |
| 56 | +- **role** - Single character ('s' for student, 'c' for company) |
| 57 | +- **password** - Hashed password for secure authentication |
| 58 | + |
| 59 | +### Students Table |
| 60 | +- **Fullname** - Student's complete name |
| 61 | +- **uid** - Foreign key reference to Users table |
| 62 | + |
| 63 | +### Companies Table (Compony) |
| 64 | +- **name** - Company name (unique) |
| 65 | +- **hr_mail** - HR contact email address |
| 66 | +- **website** - Company website URL |
| 67 | +- **uid** - Foreign key reference to Users table |
| 68 | + |
| 69 | +### Jobs Table |
| 70 | +- **title** - Job position title |
| 71 | +- **description** - Detailed job description |
| 72 | +- **short_description** - Brief summary for listings |
| 73 | +- **location** - Job location or work arrangement |
| 74 | +- **end** - Application deadline |
| 75 | +- **status** - Current status (draft, active, closed) |
| 76 | +- **work_mode** - Work arrangement (On-Site, Remote, Hybrid) |
| 77 | +- **work_type** - Employment type (Full-Time, Part-Time) |
| 78 | +- **cid** - Foreign key reference to Companies table |
| 79 | + |
| 80 | +### Applications Table |
| 81 | +- **application_date** - Timestamp of application submission |
| 82 | +- **status** - Application status (pending, reviewing, shortlisted, etc.) |
| 83 | +- **path** - File system path to uploaded resume |
| 84 | +- **sid** - Foreign key reference to Students table |
| 85 | +- **jid** - Foreign key reference to Jobs table |
| 86 | +- **cid** - Foreign key reference to Companies table |
| 87 | + |
| 88 | +## API Endpoints |
| 89 | + |
| 90 | +### Authentication |
| 91 | +- `POST /api/user/add` - Register new user (student or company) |
| 92 | +- `POST /api/user/login` - Authenticate user and receive JWT token |
| 93 | +- `GET /api/user/info` - Verify token validity and get user information |
| 94 | + |
| 95 | +### Job Management |
| 96 | +- `POST /api/jobs/add` - Create new job posting (companies only) |
| 97 | +- `GET /api/jobs/get` - Retrieve job listings with pagination |
| 98 | +- `POST /api/jobs/edit` - Update existing job posting |
| 99 | + |
| 100 | +### Applications |
| 101 | +- `POST /api/jobs/apply` - Submit job application with resume |
| 102 | +- `GET /api/jobs/get/applications/<job_id>` - Get applicants for specific job |
| 103 | +- `GET /api/jobs/get/applications/student` - Get student's application history |
| 104 | +- `POST /api/jobs/update/application/status/<application_id>` - Update application status |
| 105 | + |
| 106 | +### File Processing |
| 107 | +- `GET /api/jobs/get/applicant/cv/<application_id>` - Download applicant resume |
| 108 | +- `GET /api/jobs/extract/pdf/text/<application_id>` - Extract text from resume |
| 109 | +- `POST /api/jobs/extract/pdf/text` - Extract text from uploaded PDF |
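
As a rough illustration of how a client might call a protected endpoint, the sketch below builds (but does not send) a request to `GET /api/user/info` with a Bearer token. The base URL is the development server address from this README; the helper name `build_info_request` and the placeholder token are hypothetical.

```python
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # development server address from this README


def build_info_request(token: str) -> Request:
    """Build a GET request for /api/user/info carrying a Bearer token."""
    return Request(
        f"{BASE_URL}/api/user/info",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_info_request("example.jwt.token")  # placeholder, not a real token
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

With the server running, the request object can be passed to `urllib.request.urlopen` to perform the call.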
| 110 | + |
| 111 | +## Getting Started |
| 112 | + |
| 113 | +### Prerequisites |
| 114 | +- Python 3.8+ and pip |
| 115 | +- PostgreSQL 12+ database server |
| 116 | +- Virtual environment (recommended) |
| 117 | + |
| 118 | +### Database Setup |
| 119 | + |
| 120 | +1. Install PostgreSQL and create the database: |
| 121 | +```bash |
| 122 | +python create_database.py |
| 123 | +``` |
| 124 | +2. Rename `mysite/mysite/example.settings.py` to `mysite/mysite/settings.py` |
| 125 | + |
| 126 | +3. Rename `mysite/.env.example` to `mysite/.env` |
| 127 | + |
| 128 | +4. Configure the database connection in `mysite/.env`. |
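
One way `settings.py` could pick up these values is sketched below. This is an assumption-laden sketch, not the project's actual code: the `DB_*` variable names and the helper `database_settings` are hypothetical, and how the `.env` file is loaded into the process environment is not specified in this README.

```python
import os


def database_settings(env=os.environ):
    """Build a Django DATABASES dict from environment variables.

    The DB_* variable names are assumptions; match them to your .env file.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "internsync"),
            "USER": env.get("DB_USER", "postgres"),
            "PASSWORD": env.get("DB_PASSWORD", ""),
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }


# Example with an explicit mapping instead of the real environment.
DATABASES = database_settings({"DB_NAME": "internsync", "DB_PASSWORD": "change-me"})
print(DATABASES["default"]["NAME"], DATABASES["default"]["HOST"])
```

Reading credentials from the environment keeps them out of version control, which is why the example file is renamed rather than edited in place.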
| 129 | + |
| 130 | +### S3 Setup |
| 131 | +Edit S3 config in `mysite/mysite/settings.py`: |
| 132 | +``` |
| 133 | +MINIO_PORT=443 |
| 134 | +MINIO_USE_SSL=true |
| 135 | +MINIO_BUCKET=mybucket |
| 136 | +MINIO_REGION=is-sa-eastern-1 |
| 137 | +MINIO_ENDPOINT=api.s3.dev.is.sa |
| 138 | +MINIO_ACCESS_KEY=CHANGE_TO_YOUR_ACCESS_KEY |
| 139 | +MINIO_SECRET_KEY=CHANGE_TO_YOUR_SECRET_KEY |
| 140 | +``` |
| 141 | + |
| 142 | +You can also refer to: |
| 143 | +``https://docs.is.sa/doc/how-to-create-s3-bucket-LktU013MBN`` |
| 144 | + |
| 145 | +### Installation |
| 146 | + |
| 147 | +1. Navigate to the API directory: |
| 148 | +```bash |
| 149 | +cd api/internsynk |
| 150 | +``` |
| 151 | + |
| 152 | +2. Create and activate virtual environment: |
| 153 | +```bash |
| 154 | +python -m venv venv |
| 155 | +source venv/bin/activate # On Windows: venv\Scripts\activate |
| 156 | +``` |
| 157 | + |
| 158 | +3. Install dependencies: |
| 159 | +```bash |
| 160 | +pip install -r requirements.txt |
| 161 | +``` |
| 162 | + |
| 163 | +4. Run database migrations: |
| 164 | +```bash |
| 165 | +cd mysite |
| 166 | +python manage.py makemigrations |
| 167 | +python manage.py migrate |
| 168 | +``` |
| 169 | + |
| 170 | +5. Start the development server: |
| 171 | +```bash |
| 172 | +python manage.py runserver |
| 173 | +``` |
| 174 | + |
| 175 | +The API runs at `http://localhost:8000` and accepts requests from the Angular frontend at `http://localhost:4200`. |
| 176 | + |
| 177 | +## Project Structure |
| 178 | + |
| 179 | +``` |
| 180 | +api/ |
| 181 | +├── internsynk/ |
| 182 | +│ ├── requirements.txt # Python dependencies |
| 183 | +│ └── mysite/ # Django project |
| 184 | +│ ├── manage.py # Django management script |
| 185 | +│ ├── mysite/ # Project settings |
| 186 | +│ │ ├── settings.py # Database and app configuration |
| 187 | +│ │ ├── urls.py # URL routing |
| 188 | +│ │ └── wsgi.py # WSGI application |
| 189 | +│ ├── api/ # Main application |
| 190 | +│ │ ├── models.py # Database models |
| 191 | +│ │ ├── serializers.py # API serializers |
| 192 | +│ │ ├── urls.py # API URL patterns |
| 193 | +│ │ ├── admin.py # Django admin configuration |
| 194 | +│ │ ├── views/ # API view controllers |
| 195 | +│ │ │ ├── views.py # User registration and login |
| 196 | +│ │ │ ├── post_jobs.py # Job posting and retrieval |
| 197 | +│ │ │ ├── applay.py # Application submission |
| 198 | +│ │ │ ├── get_applications.py # Application management |
| 199 | +│ │ │ ├── edit_jobs.py # Job editing |
| 200 | +│ │ │ ├── pdf_extract.py # Resume text extraction |
| 201 | +│ │ │ └── update_application_status.py |
| 202 | +│ │ └── migrations/ # Database schema changes |
| 203 | +│ ├── files/ |
| 204 | +│ │ └── cvs/ # Resume file storage |
| 205 | +│ └── static/ # Static files and assets |
| 206 | +├── create_database.py # Database setup script |
| 207 | +└── README.md |
| 208 | +``` |
| 209 | + |
| 210 | +## Authentication Flow |
| 211 | + |
| 212 | +The API uses JWT tokens for secure authentication: |
| 213 | + |
| 214 | +1. **Registration**: Users register with role-specific information |
| 215 | +2. **Login**: Credentials are verified and a JWT token is issued |
| 216 | +3. **Authorization**: Each API request includes the Bearer token in its headers |
| 217 | +4. **Token Validation**: Server verifies token signature and expiration |
| 218 | +5. **Role-Based Access**: Endpoints check user role for permissions |
| 219 | + |
| 220 | +Token payload includes: |
| 221 | +- User ID and username |
| 222 | +- Role-specific information (student or company details) |
| 223 | +- Token expiration time (1 hour default) |
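
The issue-and-verify cycle can be sketched with the standard library alone. This is a minimal HS256 illustration, not the project's implementation (which may well use a JWT library): the function names `issue_token` and `verify_token` and the payload keys `id`, `userName`, and `role` are assumptions, while the secret placeholder and the 3600-second lifetime mirror the Configuration section.

```python
import base64
import hashlib
import hmac
import json
import time

JWT_SECRET = "your_secret_key"  # placeholder, as in the Configuration section
JWT_EXP_DELTA_SECONDS = 3600    # 1 hour default


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(user_id, username, role, now=None):
    """Return a signed token whose payload carries the user and an expiry."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"id": user_id, "userName": username, "role": role,
               "exp": now + JWT_EXP_DELTA_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_sign(signing_input, JWT_SECRET)}"


def verify_token(token, now=None):
    """Check signature and expiration; return the payload or raise ValueError."""
    now = int(time.time()) if now is None else now
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature = parts
    expected = _sign(f"{header_b64}.{payload_b64}", JWT_SECRET)
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= now:
        raise ValueError("token expired")
    return payload


token = issue_token(1, "alice", "s", now=1_000)
print(verify_token(token, now=1_500)["userName"])
```

The signature is compared with `hmac.compare_digest` so that verification time does not leak how many characters matched.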
| 224 | + |
| 225 | +## File Handling |
| 226 | + |
| 227 | +The API processes resume uploads with security measures: |
| 228 | + |
| 229 | +**Upload Process:** |
| 230 | +1. Receive base64-encoded PDF from frontend |
| 231 | +2. Validate file format using PDF magic number |
| 232 | +3. Check file size limits (5MB maximum) |
| 233 | +4. Generate unique filename using UUID |
| 234 | +5. Store file in S3 bucket |
| 235 | +6. Save file path reference in database |
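
Steps 1 through 4 above can be sketched as follows. This is an illustrative outline rather than the code in `applay.py`: the function name `prepare_cv_upload` and the `cvs/` key prefix are hypothetical, and the S3 upload and database write (steps 5 and 6) are left out.

```python
import base64
import binascii
import uuid

MAX_CV_BYTES = 5 * 1024 * 1024  # 5MB limit from the steps above
PDF_MAGIC = b"%PDF-"            # PDF files begin with these bytes


def prepare_cv_upload(encoded_pdf: str):
    """Decode, validate, and name a resume; returns (object_key, raw_bytes)."""
    try:
        raw = base64.b64decode(encoded_pdf, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("invalid base64 data") from exc
    if not raw.startswith(PDF_MAGIC):
        raise ValueError("not a PDF file")
    if len(raw) > MAX_CV_BYTES:
        raise ValueError("file exceeds 5MB limit")
    object_key = f"cvs/{uuid.uuid4()}.pdf"  # hypothetical key layout
    return object_key, raw


sample = base64.b64encode(b"%PDF-1.4\n% minimal example\n").decode("ascii")
key, data = prepare_cv_upload(sample)
print(key.endswith(".pdf"), len(data))
```

Checking the magic number guards against files that merely carry a `.pdf` extension; it does not prove the rest of the document is well-formed.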
| 236 | + |
| 237 | +**Text Extraction:** |
| 238 | +- Multiple PDF parsing libraries for reliability |
| 239 | +- Fallback methods improve the chance of successful text extraction when one parser fails |
| 240 | +- Extracted text feeds AI analysis pipeline |
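
The fallback idea can be sketched as a loop over extractors that stops at the first one returning text. The code below is a simplified assumption of how `pdf_extract.py` might be organized; `extract_with_fallback` and the three stand-in parsers are hypothetical, and in practice each would wrap one of PyPDF2, pdfplumber, or PyMuPDF in whatever order the project prefers.

```python
from typing import Callable, Iterable, Tuple

Extractor = Callable[[bytes], str]


def extract_with_fallback(pdf_bytes: bytes,
                          extractors: Iterable[Tuple[str, Extractor]]) -> Tuple[str, str]:
    """Return (extractor_name, text) from the first extractor that yields text."""
    errors = []
    for name, extract in extractors:
        try:
            text = extract(pdf_bytes).strip()
        except Exception as exc:  # a parser failure moves on to the next one
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: no text found")
    raise ValueError("all extractors failed: " + "; ".join(errors))


# Stand-in extractors used only to demonstrate the control flow.
def broken_parser(data: bytes) -> str:
    raise RuntimeError("cannot parse")


def empty_parser(data: bytes) -> str:
    return "   "


def working_parser(data: bytes) -> str:
    return "Jane Doe\nPython, Django"


name, text = extract_with_fallback(
    b"%PDF-1.4",
    [("first", broken_parser), ("second", empty_parser), ("third", working_parser)],
)
print(name)
```

Treating whitespace-only output as a failure matters for scanned resumes, where a parser may succeed technically but return no usable text.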
| 241 | + |
| 242 | +## Security Features |
| 243 | + |
| 244 | +**Data Protection:** |
| 245 | +- Input sanitization prevents XSS attacks |
| 246 | +- Password hashing with Django's secure hashers |
| 247 | +- JWT tokens with configurable expiration |
| 248 | +- Database connection uses environment variables |
| 249 | + |
| 250 | +**File Security:** |
| 251 | +- File size limits prevent storage abuse |
| 252 | +- Unique file naming prevents conflicts |
| 253 | +- Secure file storage |
| 254 | + |
| 255 | +**Access Control:** |
| 256 | +- Role-based endpoint restrictions |
| 257 | +- Token validation on protected routes |
| 258 | +- Company-specific data isolation |
| 259 | +- Student privacy protection |
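
Role-based endpoint restrictions can be expressed as a decorator that inspects the decoded token payload before the handler runs. This is a framework-free sketch under stated assumptions: `require_role`, `PermissionDenied`, and the `add_job` handler are hypothetical names, and the payload is assumed to carry the single-character `role` from the Users table.

```python
from functools import wraps


class PermissionDenied(Exception):
    """Raised when the caller's role may not use an endpoint."""


def require_role(*allowed_roles):
    """Allow a handler to run only when payload['role'] is in allowed_roles."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(payload, *args, **kwargs):
            if payload.get("role") not in allowed_roles:
                raise PermissionDenied(f"role {payload.get('role')!r} not allowed")
            return handler(payload, *args, **kwargs)
        return wrapper
    return decorator


@require_role("c")  # 'c' = company, per the Users table
def add_job(payload, title):
    return f"{payload['userName']} posted {title}"


print(add_job({"userName": "acme", "role": "c"}, "Backend Intern"))
```

A role check alone does not give company-specific data isolation; handlers would still need to filter records by the company tied to the token.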
| 260 | + |
| 261 | +## Configuration |
| 262 | + |
| 263 | +Key settings in `mysite/mysite/settings.py`: |
| 264 | + |
| 265 | +```python |
| 266 | +# Database Configuration |
| 267 | +DATABASES = { |
| 268 | + 'default': { |
| 269 | + 'ENGINE': 'django.db.backends.postgresql', |
| 270 | + 'NAME': 'internsync', |
| 271 | + 'USER': 'postgres', |
| 272 | + 'PASSWORD': 'your_password', |
| 273 | + 'HOST': 'localhost', |
| 274 | + 'PORT': '5432', |
| 275 | + } |
| 276 | +} |
| 277 | + |
| 278 | +# JWT Settings |
| 279 | +JWT_SECRET = 'your_secret_key' |
| 280 | +JWT_ALGORITHM = 'HS256' |
| 281 | +JWT_EXP_DELTA_SECONDS = 3600 |
| 282 | + |
| 283 | +# CORS Configuration |
| 284 | +CORS_ALLOWED_ORIGINS = [ |
| 285 | + "http://localhost:4200", # Angular frontend (that may not work) |
| 286 | +] |
| 287 | + |
| 288 | +# File Storage |
| 289 | +CV_STORAGE_PATH = "/path/to/resume/storage" |
| 290 | + |
| 291 | +# AWS S3 |
| 292 | +MINIO_PORT = 443 |
| 293 | +MINIO_USE_SSL = True |
| 294 | +MINIO_BUCKET = "mybucket" |
| 295 | +MINIO_REGION = "is-sa-eastern-1" |
| 296 | +MINIO_ENDPOINT = "api.s3.dev.is.sa" |
| 297 | +MINIO_ACCESS_KEY = "CHANGE_TO_YOUR_ACCESS_KEY" |
| 298 | +MINIO_SECRET_KEY = "CHANGE_TO_YOUR_SECRET_KEY" |
| 299 | +``` |
| 300 | + |
| 301 | +## Development Notes |
| 302 | + |
| 303 | +The API follows Django best practices with clear separation of concerns: |
| 304 | + |
| 305 | +- **Models** define database structure and relationships |
| 306 | +- **Serializers** handle data validation and JSON conversion |
| 307 | +- **Views** implement business logic and HTTP response handling |
| 308 | +- **URLs** provide clean RESTful endpoint structure |
| 309 | + |
| 310 | +Database migrations track all schema changes, ensuring consistent development and production environments. The PostgreSQL database provides reliable transaction support and efficient querying for the application's needs. |
| 311 | + |
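| 312 | +File processing includes multiple PDF parsing libraries to handle various resume formats. Base64 encoding lets the frontend send the PDF inside a text-based request body; it is an encoding rather than encryption, so confidentiality in transit depends on serving the API over HTTPS. |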
| 313 | + |
| 314 | +## Authors |
| 315 | + |
| 316 | +**Mustafa Al-Jishi** |
| 317 | +Cybersecurity and Digital Forensics Student, IAU |
| 318 | + |
| 319 | +**Mohammed Al-Mutawah** |
| 320 | +Cybersecurity and Digital Forensics Student, IAU |
| 321 | + |
| 322 | +Licensed under Innosoft Limited |