VidInsight AI: AI-Powered YouTube Content Analyzer
Overview
VidInsight AI is an AI-powered application that analyzes YouTube videos on a given subject. It transcribes the videos and extracts the topic, a summary, key points, and a new content idea. The application is built to assist:
- content creators,
- educators & researchers, and
- everyday users in understanding video content quickly and effectively.
This README documents the current phase of the project and will be updated as new features are implemented.
Current Features (Asif's Code):
1. YouTube Video Retrieval:
• Fetches up to 10 YouTube videos based on a user-provided topic.
• Filters videos based on criteria such as keywords, view counts, and trusted channels.
• Selects the top 3 videos based on relevance and view counts.
2. Transcription:
• Transcribes audio from the top 3 selected videos using OpenAI's Whisper model.
• Saves the complete transcripts in an `output` folder for further processing.
3. User Interface:
• Input: a user-friendly interface built with Gradio.
• Output:
  • Video details (title, channel, views) and a preview of the transcription.
  • Analysis (topic, summary, and key points).
  • A content idea with comprehensive details.
Project Structure
VidInsight-AI/
├── app.py # Gradio web interface for user interaction
├── config.py # Configuration file for API keys and filters
├── fetch_youtube_videos.py # Fetches and filters YouTube videos
├── transcribe_videos.py # Transcribes videos and saves transcripts
├── summary.py # Generates summaries from transcriptions
├── YouTubeAgent.py # Creates content ideas using Gemini AI
├── main.py # CLI-based alternative to run the app
├── requirements.txt # Project dependencies
├── keys1.env # Environment variables (API keys)
└── output/ # Folder for saved transcripts
    └── <video_id>.txt # Transcripts saved as text files
Key Components:
1. Interface Files:
• `app.py`: Web interface using Gradio
• `main.py`: Command-line interface
2. Core Processing Files:
• `fetch_youtube_videos.py`: Video retrieval
• `transcribe_videos.py`: Audio transcription
• `summary.py`: Content summarization
• `YouTubeAgent.py`: Content idea generation
3. Configuration Files:
• `config.py`: Settings and filters
• `keys1.env`: API keys
• `requirements.txt`: Dependencies
4. Output Directory:
• `output/`: Stores generated transcripts
Setup Instructions (need to be completed)
Prerequisites
• Python 3.8 or higher
• FFmpeg installed on the system (for audio processing)
• A YouTube Data API key (create one via Google Cloud Console)
• A Gemini API key
• A Tavily API key
Installation
- Clone the repository:
  git clone <repository_url>
- Install the required dependencies:
  pip install -r requirements.txt
- Set up your API keys:
  • Create a `.env` file or update `keys1.env` with your API keys:
  YOUTUBE_API_KEY="your_api_key_here"
  GEMINI_API_KEY="your_api_key_here"
  TAVILY_API_KEY="your_api_key_here"
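To illustrate how the three keys might be checked at startup, here is a minimal standard-library sketch. It assumes the file contains simple `NAME="value"` lines like the ones listed above. `load_env_file` and `missing_keys` are hypothetical helpers, not functions from this repository, and the project itself may load its keys differently (for example with a package such as python-dotenv).

```python
from pathlib import Path

# The three key names come from the setup instructions in this README.
REQUIRED_KEYS = ("YOUTUBE_API_KEY", "GEMINI_API_KEY", "TAVILY_API_KEY")


def load_env_file(path):
    """Parse simple NAME="value" lines into a dict (hypothetical helper)."""
    env = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        env[name.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required key names that are absent or empty."""
    return [name for name in required if not env.get(name)]
```

Calling `missing_keys(load_env_file("keys1.env"))` before starting the app would give an early, readable list of any keys that still need to be filled in.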
Running the Application
• Using the Gradio interface: `python app.py`
• Using the CLI: `python main.py`
Usage
Gradio App
1. Enter a topic in the "Enter learning topic" field (e.g., "Machine Learning").
2. Click "Submit" to fetch and analyze videos.
3. View results, including:
• Video title, channel name, and view count.
• A preview of the transcription.
• The path to the saved transcript file.
• Topic, summary, and key points.
• A new content idea with comprehensive details.
Output Folder
• Complete transcripts are saved in the `output/` folder as `.txt` files.
• File names are based on unique YouTube video IDs (e.g., `ukzFI9rgwfU.txt`).
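The naming scheme above can be sketched in a few lines of Python. This is only an illustration: `save_transcript` and `preview` are hypothetical helpers, not the functions in `transcribe_videos.py`, and the 200-character preview length is an assumed value.

```python
from pathlib import Path


def save_transcript(video_id, text, output_dir="output"):
    """Write one transcript to <output_dir>/<video_id>.txt and return the path."""
    folder = Path(output_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    path = folder / f"{video_id}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def preview(text, max_chars=200):
    """Return a short preview of a transcript (length is an assumption)."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars].rstrip() + "..."
```

Using the video ID as the file name means that transcribing the same video again overwrites the earlier file instead of creating a duplicate.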
Configuration
The `config.py` file allows customization of the filtering criteria:
FILTER_CONFIG = {
    "videoDuration": "medium",  # Focus on videos between 4 and 20 minutes
    "order": "relevance",  # Sort by relevance
    "trusted_channels": {
        "Khan Academy": "UC4a-Gbdw7vOaccHmFo40b9g",
        "edX": "UCEBb1b_L6zDS3xTUrIALZOw",
        "Coursera": "UC58aowNEXHHnflR_5YTtP4g",
    },
    "teaching_keywords": {"tutorial", "lesson", "course", "how-to", "introduction", "basics"},
    "non_teaching_keywords": {"fun", "experiment", "joke", "prank", "vlog"},
    "max_results": 10,  # Maximum number of videos fetched from YouTube API
    "min_view_count": 10000,  # Minimum view count for relevance
}
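As a rough illustration of how these settings could drive the selection of the top 3 videos, the sketch below applies them to a list of candidate videos. The filtering rules, the `select_top_videos` name, and the video dictionary shape (`title`, `channel_id`, `view_count`) are assumptions made for this example. The actual logic in `fetch_youtube_videos.py` may differ.

```python
import re

# A trimmed copy of the settings above, so the example runs on its own.
FILTER_CONFIG = {
    "trusted_channels": {
        "Khan Academy": "UC4a-Gbdw7vOaccHmFo40b9g",
        "edX": "UCEBb1b_L6zDS3xTUrIALZOw",
        "Coursera": "UC58aowNEXHHnflR_5YTtP4g",
    },
    "teaching_keywords": {"tutorial", "lesson", "course", "how-to", "introduction", "basics"},
    "non_teaching_keywords": {"fun", "experiment", "joke", "prank", "vlog"},
    "min_view_count": 10000,
}


def select_top_videos(videos, config=FILTER_CONFIG, top_n=3):
    """Filter candidates and return the top_n (assumed rules, not the repo's)."""
    trusted_ids = set(config["trusted_channels"].values())
    kept = []
    for video in videos:
        if video["view_count"] < config["min_view_count"]:
            continue  # too few views
        # Match whole words so that "fundamentals" does not count as "fun".
        words = set(re.findall(r"[a-z0-9-]+", video["title"].lower()))
        if words & config["non_teaching_keywords"]:
            continue  # title suggests entertainment rather than teaching
        is_trusted = video["channel_id"] in trusted_ids
        if is_trusted or words & config["teaching_keywords"]:
            kept.append((is_trusted, video["view_count"], video))
    # Trusted channels first, then higher view counts.
    kept.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [video for _, _, video in kept[:top_n]]
```

Ranking trusted channels ahead of raw view counts is one possible reading of "relevance and view counts". Swapping the sort key would change that priority.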
Known Issues
1. If no results are found or an error occurs during video fetching, the app displays an error message in JSON format.
2. Ensure that valid topics are entered; overly broad or unrelated topics may not yield meaningful results.
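The first issue can be pictured with a small sketch of a wrapper that turns fetch failures and empty results into a JSON error message. The `fetch_or_error` name, the injected `fetch` callable, and the payload shape are assumptions made for illustration. The app's real error format may look different.

```python
import json


def fetch_or_error(fetch, topic):
    """Call fetch(topic); on failure or an empty result, return a JSON error string."""
    try:
        videos = fetch(topic)
    except Exception as exc:  # broad on purpose: any fetch failure is reported
        return json.dumps({"error": f"Video fetching failed: {exc}"})
    if not videos:
        return json.dumps({"error": f"No videos found for topic: {topic}"})
    return json.dumps({"videos": videos})
```

Passing the fetch function in as an argument keeps the sketch independent of the YouTube API and makes it easy to exercise with a stub.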
Future Features
1. Multilingual Support:
• Add support for transcription in other languages (e.g., Spanish, French).
2. Interactive Q&A:
• Allow users to ask questions about the analyzed video content.
Technology Stack
| Task | Technology |
|---|---|
| Video Retrieval | YouTube Data API, google-api-python-client |
| Transcription | yt-dlp, OpenAI Whisper |
| Summarization | Gemini AI, LangChain |
| Content Generation | Gemini AI, LangChain |
| Vectorization | ____ |
| Vector Database | ____ |
Contributors
• Asif Khan - Developer and Project Lead
• Kade Thomas - Summarization Specialist
• Amit Gaikwad - Vector Database Specialist
• Simranpreet Saini - AI Agent Specialist
• Jason Brooks - Documentation Specialist
Acknowledgements
- Special thanks to Firas Obeid for being an advisor on the project.
- Special thanks to OpenAI, Hugging Face, and the YouTube, Gemini, and Tavily APIs for providing the tools that made this project possible.