Threat Detection URL Checker

Threat Detection URL Checker is a Python-based tool that analyzes URLs using the Google Web Risk API, logs categorized results into CSV files, and visualizes threat data. It is designed with modular components, supports automation through CI/CD, and includes full documentation.

Features

Asynchronous URL scanning using the Google Web Risk API
Categorizes each URL as a specific threat type or as safe
Logs and saves results to CSV
Generates a threat distribution chart using matplotlib
Structured logging for debugging and traceability
Unit tested using unittest
GitHub Actions CI for automatic test runs
Public documentation hosted on Confluence
Video walkthrough on YouTube (in progress)

🏗️ Architecture Overview

Below is the architecture design of the project, which outlines the core components and their interactions.

📦 Core Modules

Component	Responsibility
`APIHandler`	Interacts with Google Web Risk API to analyze URLs for threats. Manages concurrency and async processing.
`CSVHandler`	Loads input URLs from CSV and saves threat analysis results. Also calculates threat-type percentages.
`Logger`	Handles centralized logging of info, warnings, and errors to a log file.
`ThreatAnalyzer`	Reads results and generates visual charts (pie chart of threat types) using `matplotlib`.
`main.py`	Entry point. Orchestrates the entire analysis flow: loading data, calling the API, saving results, and generating charts.
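
The concurrency handling attributed to `APIHandler` can be illustrated with a minimal sketch. It assumes a semaphore-bounded `asyncio.gather` pattern, which may differ from the project's actual code. `check_url`, `scan_all`, and `max_concurrency` are hypothetical names, and `check_url` is a stand-in that makes no network call.

```python
import asyncio


async def check_url(url: str) -> str:
    """Hypothetical stand-in for a Web Risk lookup; no network call is made."""
    await asyncio.sleep(0)  # yield control, as a real request would
    return "MALWARE" if "malware" in url else "SAFE"


async def scan_all(urls: list[str], max_concurrency: int = 5) -> dict[str, str]:
    """Scan URLs concurrently, with at most max_concurrency checks in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_check(url: str) -> tuple[str, str]:
        # The semaphore caps how many lookups run at the same time.
        async with semaphore:
            return url, await check_url(url)

    pairs = await asyncio.gather(*(bounded_check(u) for u in urls))
    return dict(pairs)


if __name__ == "__main__":
    demo_urls = ["https://example.com", "https://malware.example.test"]
    for url, status in asyncio.run(scan_all(demo_urls)).items():
        print(f"{url} -> {status}")
```

Bounding concurrency this way keeps a large input CSV from opening hundreds of simultaneous requests, which helps stay within API quota limits.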

🧪 Testing Modules

Test File	Description
`test_api_handler.py`	Tests the initialization and threat-type config of `APIHandler`.
`test_csv_writer.py`	Tests CSV read/write and percentage logic in `CSVHandler`.
`test_threat_analyzer.py`	Tests chart generation and empty-data handling in `ThreatAnalyzer`.
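
For readers unfamiliar with `unittest`, a test of the percentage logic might look roughly like the sketch below. `threat_percentages` is a hypothetical helper written for this example; the real `CSVHandler` method name, signature, and rounding behavior are not shown in this README and may differ.

```python
import unittest
from collections import Counter


def threat_percentages(statuses):
    """Hypothetical helper: percentage share of each status, rounded to 2 places."""
    if not statuses:
        return {}
    counts = Counter(statuses)
    total = len(statuses)
    return {status: round(100 * n / total, 2) for status, n in counts.items()}


class TestThreatPercentages(unittest.TestCase):
    def test_mixed_statuses(self):
        result = threat_percentages(["SAFE", "SAFE", "MALWARE", "SAFE"])
        self.assertEqual(result, {"SAFE": 75.0, "MALWARE": 25.0})

    def test_empty_input_returns_empty_dict(self):
        # Empty input must not raise a ZeroDivisionError.
        self.assertEqual(threat_percentages([]), {})
```

Saved as a `test_*.py` file, a case like this would be picked up by the `unittest` discovery command.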

🔁 Component Interactions

main.py initializes:
- Logger
- CSVHandler
- APIHandler
- ThreatAnalyzer
APIHandler:
- Uses CSVHandler to load URLs
- Uses Logger to log the scanning process
- Sends requests to Google Web Risk API
After scanning:
- Results are saved via CSVHandler
- ThreatAnalyzer reads results and generates a pie chart
Logs, results, and chart are stored in the resources/ folder.
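
The request step can be sketched as follows, assuming the public Web Risk REST lookup endpoint (`uris:search`) with an API key. This README does not say whether the project uses the REST endpoint or the Google client library, so treat this as an illustration only and verify the details against the current Web Risk documentation. `build_lookup_url`, `categorize`, and the `SAFE` label are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint and threat type names as understood from the public Web Risk REST reference.
WEB_RISK_ENDPOINT = "https://webrisk.googleapis.com/v1/uris:search"
THREAT_TYPES = ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"]


def build_lookup_url(uri: str, api_key: str) -> str:
    """Build the GET URL for one lookup; threatTypes is a repeated query parameter."""
    params = [("uri", uri), ("key", api_key)]
    params += [("threatTypes", threat_type) for threat_type in THREAT_TYPES]
    return f"{WEB_RISK_ENDPOINT}?{urlencode(params)}"


def categorize(response_body: dict) -> str:
    """Map a parsed JSON response to a label; a body without 'threat' means no match."""
    threat = response_body.get("threat")
    if not threat:
        return "SAFE"  # hypothetical label for URLs with no listed threat
    return ",".join(threat.get("threatTypes", [])) or "UNKNOWN"


if __name__ == "__main__":
    print(build_lookup_url("https://example.com", "your_api_key"))
    print(categorize({"threat": {"threatTypes": ["MALWARE"]}}))
```

Keeping URL construction and response parsing in pure functions like these makes both easy to unit test without network access.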

💡 All environment variables and paths are defined in .env for centralized configuration.
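
One way to consume that configuration is sketched below. The variable names `CSV_FILE`, `RESULTS_FILE`, and `CHART_FILE` come from the Installation section; the `Settings` class and `load_settings` function are hypothetical. Note that `os.environ` does not read a `.env` file by itself: the variables must already be exported, or loaded by a helper such as python-dotenv (whether the project uses it is an assumption).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the file paths the tool needs."""
    csv_file: str
    results_file: str
    chart_file: str


def load_settings(env=None) -> Settings:
    """Read paths from a mapping (os.environ by default), with fallback defaults."""
    env = os.environ if env is None else env
    return Settings(
        csv_file=env.get("CSV_FILE", "data/PublicAPIslist.csv"),
        results_file=env.get("RESULTS_FILE", "resources/results.csv"),
        chart_file=env.get("CHART_FILE", "resources/threat_analysis_chart.png"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Accepting the mapping as a parameter lets tests pass a plain dictionary instead of modifying the real process environment.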

Core Components

APIHandler → Sends URLs to the Google Web Risk API and determines their threat status
ThreatAnalyzer → Reads the saved results and generates the threat distribution pie chart
CSVHandler → Loads input URLs from a CSV file, saves results into results.csv, and calculates threat percentages
Logger → Logs errors, API failures, and other issues for debugging
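
As an illustration of centralized file logging, the sketch below uses Python's standard `logging` module. `build_logger`, the logger name, and the format string are hypothetical; the project's `Logger` class may be structured differently.

```python
import logging


def build_logger(log_path: str, name: str = "threat_analyzer") -> logging.Logger:
    """Return a named logger that writes INFO and above to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A caller could then write `build_logger("app.log").warning("API request failed for %s", url)`, and every module that requests the same logger name shares one log file.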

Project Structure

Threat-Detection-URL-Checker/
├── threat_analyzer/
│   ├── main.py
│   ├── api_handler.py
│   ├── csv_handler.py
│   ├── logger.py
│   ├── threat_analyzer.py
│   └── test/
│       ├── test_api_handler.py
│       ├── test_csv_writer.py
│       └── test_threat_analyzer.py
├── resources/
│   ├── key.json
│   ├── results.csv
│   └── threat_analysis_chart.png
├── data/
│   └── PublicAPIslist.csv
├── .github/workflows/
│   └── python-tests.yml
├── .env
├── Dockerfile
├── requirements.txt
└── README.md

Automated Testing & CI

Note: GitHub Actions is configured but currently inactive due to free-tier CI usage limits.
Tests are written using Python's built-in unittest module
When the workflow is active, GitHub Actions runs the tests on every push and pull request to main

Run tests locally:

python -m unittest discover -s threat_analyzer/test

Technologies Used

Python 3.10+
Google Web Risk API
pandas
asyncio
matplotlib
unittest
GitHub Actions
Docker (optional)

Video Explanation

Watch the full walkthrough of this project on YouTube: (in progress)

Installation

Clone the repository:

git clone https://github.com/yourusername/Threat-Detection-URL-Checker.git
cd Threat-Detection-URL-Checker

Create a .env file:

CSV_FILE=data/PublicAPIslist.csv
RESULTS_FILE=resources/results.csv
CHART_FILE=resources/threat_analysis_chart.png
GOOGLE_API_KEY=your_api_key
GOOGLE_APPLICATION_CREDENTIALS=resources/key.json

Install dependencies:

pip install -r requirements.txt

Run the program:

python threat_analyzer/main.py

Output

resources/results.csv → Results of all URL scans
resources/threat_analysis_chart.png → Threat type distribution chart

HusainCode/Threat-Detection-URL-Checker