KartikThakkar1/content-summarization-with-Llama3

Content Summarizer with Llama3 | LangChain & GroqAPI

Summarize content from any website or YouTube video with a single click

Features

1. Multi-source content ingestion with smart loaders

Automatically detects whether the input URL is a YouTube video or a regular web page, then pulls data with youtube-transcript-api or LangChain’s UnstructuredURLLoader to build a clean Document object pipeline.
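As a rough illustration of the detection step, the sketch below classifies a URL by its hostname and extracts a YouTube video ID. The function names `classify_url` and `youtube_video_id` and the `YOUTUBE_HOSTS` set are hypothetical and are not taken from the repository; the actual script may detect sources differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical list of hostnames treated as YouTube (an assumption, not from the repo).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_url(url: str) -> str:
    """Return "youtube" or "web" for an http(s) URL; raise ValueError otherwise."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid http(s) URL: {url!r}")
    host = (parsed.hostname or "").lower()
    return "youtube" if host in YOUTUBE_HOSTS else "web"


def youtube_video_id(url: str) -> Optional[str]:
    """Extract the video ID from a watch URL or a youtu.be short link."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        # Watch URLs carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

The video ID is what a transcript fetcher such as youtube-transcript-api needs, while a URL classified as "web" would go to the page loader instead.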

2. Dynamic summarization strategy based on real-time token counts

Uses a custom count_tokens helper (backed by a Hugging Face tokenizer) to measure prompt size and switch between stuff, map-reduce, and refine chains on the fly. This keeps every request under Groq’s 12k tokens-per-minute (TPM) ceiling while fitting as much context as possible into Llama-3.3-70B.
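One way such a switch could look is sketched below. The selection rule, the `budget` and `refine_max_chunks` defaults, and the `approx_count_tokens` stand-in (roughly four characters per token) are all assumptions made for illustration; the repository's count_tokens helper uses a real tokenizer and its thresholds are not documented here. The returned strings match the `chain_type` values accepted by LangChain's `load_summarize_chain`.

```python
def approx_count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: assume about 4 characters per token."""
    return (len(text) + 3) // 4


def choose_chain_type(total_tokens: int,
                      budget: int = 10_000,
                      refine_max_chunks: int = 4) -> str:
    """Pick a summarization strategy from the input size.

    Hypothetical rule: if everything fits in one request, use "stuff";
    if only a few budget-sized chunks are needed, refine them in order;
    otherwise fall back to map-reduce.
    """
    if total_tokens <= budget:
        return "stuff"
    chunks_needed = -(-total_tokens // budget)  # ceiling division
    return "refine" if chunks_needed <= refine_max_chunks else "map_reduce"
```

Here the default budget of 10,000 tokens leaves some headroom under the 12k TPM limit for the prompt template and the model's output, which is again an illustrative choice.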

3. Chunk-aware splitting and selective context pruning

Implements RecursiveCharacterTextSplitter with adjustable chunk and overlap sizes, then iteratively trims excess segments to stay within a configurable token budget. This gives granular control over memory, latency, and cost in large-scale summarization workflows.
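The idea of overlapping chunks plus budget trimming can be sketched in plain Python. This is a simplified fixed-width splitter, not LangChain's RecursiveCharacterTextSplitter (which also prefers to break on separators such as paragraphs and sentences). The names `split_with_overlap` and `trim_to_budget`, the default token counter, and the choice to drop chunks from the end are assumptions for illustration only.

```python
from typing import Callable, List


def split_with_overlap(text: str, chunk_size: int = 2000,
                       chunk_overlap: int = 200) -> List[str]:
    """Cut text into fixed-width chunks whose edges overlap by chunk_overlap characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def trim_to_budget(chunks: List[str], token_budget: int,
                   count_tokens: Callable[[str], int] = lambda s: (len(s) + 3) // 4
                   ) -> List[str]:
    """Drop trailing chunks one at a time until the total fits the token budget."""
    kept = list(chunks)
    total = sum(count_tokens(c) for c in kept)
    while kept and total > token_budget:
        total -= count_tokens(kept.pop())
    return kept
```

For example, `split_with_overlap("abcdefghij", 4, 2)` yields four chunks of four characters each, and `trim_to_budget` with a budget of 2 tokens keeps only the first two.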

How does it work?

  • The Python script uses LangChain and accesses Meta's llama-3.3-70b-versatile model through the Groq API
  • Validates URLs and handles them according to their source (YouTube, generic websites, etc.)
  • Uses LangChain's summarization chains to combine prompts and inputs
  • Provides a Streamlit page for web-based interaction
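To make the three chain styles concrete, here is a library-free sketch of how stuff, map-reduce, and refine summarization differ in the calls they make. The `llm` argument is a hypothetical callable that takes a prompt string and returns text; the prompts and function names are illustrative and are not the ones used by LangChain or by this repository.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # hypothetical: prompt in, completion out


def summarize_stuff(chunks: List[str], llm: LLM) -> str:
    """Stuff: put every chunk into a single prompt (one call)."""
    return llm("Summarize the following text:\n\n" + "\n\n".join(chunks))


def summarize_map_reduce(chunks: List[str], llm: LLM) -> str:
    """Map-reduce: summarize each chunk independently, then combine the partial summaries."""
    if not chunks:
        raise ValueError("no chunks to summarize")
    partials = [llm(f"Summarize this passage:\n\n{c}") for c in chunks]
    return llm("Combine these partial summaries into one summary:\n\n"
               + "\n\n".join(partials))


def summarize_refine(chunks: List[str], llm: LLM) -> str:
    """Refine: summarize the first chunk, then update that summary with each later chunk."""
    if not chunks:
        raise ValueError("no chunks to summarize")
    summary = llm(f"Summarize this passage:\n\n{chunks[0]}")
    for c in chunks[1:]:
        summary = llm(f"Existing summary:\n{summary}\n\n"
                      f"Refine it using this additional passage:\n\n{c}")
    return summary
```

With n chunks, stuff makes one call, map-reduce makes n + 1 calls whose map step can run independently, and refine makes n sequential calls, which is why the choice matters under a per-minute token limit.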

How to run?

  • Install the requirements in your Python environment with pip install -r requirements.txt
  • Run app.py with streamlit run app.py
  • Provide a Groq API key (for more information, see Groq API)
  • Provide a URL and click the summarize button

Utility Examples

  1. A thoughtful Medium article by Debbie Levitt on thinking critically about perspectives and information presented in a satirical fashion.

    Image

  2. One of the most popular sets of API documentation: Stripe API Docs

    Image

  3. An insightful YouTube video lecture: Stanford CS229 I Machine Learning I Building Large Language Models (LLMs) by Yann Dubois (PhD student at Stanford)

    Image

About

LLM-powered summarization tool for YouTube videos and other websites.
