VidInsight AI: AI-Powered YouTube Content Analyzer

Overview

VidInsight AI is an AI-powered application that analyzes YouTube videos on a given subject. For each selected video it provides a transcription, the topic, a summary, and key points, and it proposes a new content idea. The application is built to assist:

  • content creators,
  • educators & researchers, and
  • everyday users in understanding video content quickly and effectively.

This ReadMe file documents the current phase of the project and will be updated as new features are implemented.

Current Features (Asif's Code):

1. YouTube Video Retrieval:
    • Fetches up to 10 YouTube videos based on a user-provided topic.
    • Filters videos based on criteria such as keywords, view counts, and trusted channels.
    • Selects the top 3 videos based on relevance and view counts.

2. Transcription:
    • Transcribes audio from the top 3 selected videos using OpenAI's Whisper model.
    • Saves the complete transcripts in an `output` folder for further processing.

3. User Interface:
    • Input
        • Provides a user-friendly interface built with Gradio.
    • Output
        • Displays video details (title, channel, views) and a preview of the transcription.
        • Analysis (Topic, Summary & Key Points)
        • Content Idea with comprehensive details
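
The retrieval step described in item 1 (filter the fetched candidates, then keep the top 3) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual `fetch_youtube_videos.py`: the function names, the dictionary fields (`title`, `channel_id`, `view_count`), and the ranking rule (trusted channels first, then view count) are all hypothetical, and keyword matching is a naive whole-word check on the title.

```python
from typing import Dict, List, Set

# Illustrative values only; the project keeps its real settings in config.py.
TEACHING_KEYWORDS: Set[str] = {"tutorial", "lesson", "course", "how-to", "introduction", "basics"}
NON_TEACHING_KEYWORDS: Set[str] = {"fun", "experiment", "joke", "prank", "vlog"}
TRUSTED_CHANNEL_IDS: Set[str] = {"UC4a-Gbdw7vOaccHmFo40b9g"}  # Khan Academy
MIN_VIEW_COUNT = 10000


def is_teaching_video(title: str) -> bool:
    """True if the title has a teaching keyword and no non-teaching keyword."""
    words = set(title.lower().split())  # naive: splits on whitespace only
    return bool(words & TEACHING_KEYWORDS) and not (words & NON_TEACHING_KEYWORDS)


def select_top_videos(videos: List[Dict], top_n: int = 3) -> List[Dict]:
    """Filter candidates, then rank trusted channels first and view count second."""
    kept = [
        v for v in videos
        if v["channel_id"] in TRUSTED_CHANNEL_IDS
        or (v["view_count"] >= MIN_VIEW_COUNT and is_teaching_video(v["title"]))
    ]
    # Assumed ranking rule: (is_trusted, view_count), highest first.
    kept.sort(
        key=lambda v: (v["channel_id"] in TRUSTED_CHANNEL_IDS, v["view_count"]),
        reverse=True,
    )
    return kept[:top_n]
```

In this sketch a video from a trusted channel bypasses the keyword and view-count checks; whether the real filter behaves that way is a design choice the source does not specify.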

Project Structure

VidInsight-AI/
├── app.py                    # Gradio web interface for user interaction
├── config.py                 # Configuration file for API keys and filters
├── fetch_youtube_videos.py   # Fetches and filters YouTube videos
├── transcribe_videos.py      # Transcribes videos and saves transcripts
├── summary.py                # Generates summaries from transcriptions
├── YouTubeAgent.py           # Creates content ideas using Gemini AI
├── main.py                   # CLI-based alternative to run the app
├── requirements.txt          # Project dependencies
├── keys1.env                 # Environment variables (API keys)
└── output/                   # Folder for saved transcripts
    └── <video_id>.txt        # Transcripts saved as text files

Key Components:

1. Interface Files:
    • `app.py`: Web interface using Gradio
    • `main.py`: Command-line interface
2. Core Processing Files:
    • `fetch_youtube_videos.py`: Video retrieval
    • `transcribe_videos.py`: Audio transcription
    • `summary.py`: Content summarization
    • `YouTubeAgent.py`: Content idea generation
3. Configuration Files:
    • `config.py`: Settings and filters
    • `keys1.env`: API keys
    • `requirements.txt`: Dependencies
4. Output Directory:
    • `output/`: Stores generated transcripts

Setup Instructions (need to be completed)

  1. Prerequisites
    • Python 3.8 or higher
    • FFmpeg installed on the system (for audio processing)
    • A YouTube Data API key (create one via Google Cloud Console)
    • A Gemini API key
    • A Tavily API key

  2. Installation

    1. Clone the repository:
      git clone <repository_url>

    2. Install the required dependencies:
      pip install -r requirements.txt

    3. Set up your API keys:

    • Create a .env file or update keys1.env with your YouTube, Gemini, and Tavily API keys:

    YOUTUBE_API_KEY="your_api_key_here"
    GEMINI_API_KEY="your_api_key_here"
    TAVILY_API_KEY="your_api_key_here"
    


  3. Running the Application
    • Using the Gradio Interface:

    python app.py

    • Using the CLI:

    python main.py
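
Before launching either entry point, it can help to confirm that the three keys from the installation step are actually set. The sketch below is one possible way to do that. It assumes the keys are loaded with the `python-dotenv` package, which the source does not confirm; `load_env_file` and `missing_keys` are hypothetical helper names, not functions from the project.

```python
import os
from typing import List, Mapping, Optional

# Key names as listed in the installation step.
REQUIRED_KEYS = ("YOUTUBE_API_KEY", "GEMINI_API_KEY", "TAVILY_API_KEY")


def load_env_file(env_path: str = "keys1.env") -> None:
    """Load variables from the env file into os.environ (assumes python-dotenv)."""
    from dotenv import load_dotenv  # imported lazily; needs python-dotenv installed

    load_dotenv(env_path)


def missing_keys(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

A typical use would be to call `load_env_file()` once at startup and stop with a readable message if `missing_keys()` returns a non-empty list.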
    

Usage

Gradio App

1. Enter a topic in the “Enter learning topic” field (e.g., “Machine Learning”).
2. Click “Submit” to fetch and analyze videos.
3. View results, including:
    • Video title, channel name, and view count.
    • A preview of the transcription.
    • The path to the saved transcript file.
    • Topic, Summary, and Key Points
    • A New Content Idea with Comprehensive Details
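
The flow above (one topic field, a Submit button, a result list per video) could be wired up in Gradio roughly like this. It is a sketch under assumptions rather than the project's `app.py`: `pipeline` is a hypothetical callable that returns one dictionary per analyzed video, the result field names are invented for illustration, and the 200-character preview length is arbitrary. `gr.Interface` supplies the Submit button by default.

```python
from typing import Callable, Dict, List


def preview(text: str, max_chars: int = 200) -> str:
    """Shorten a transcript for display, marking truncation with '...'."""
    text = text.strip()
    return text if len(text) <= max_chars else text[:max_chars].rstrip() + "..."


def make_handler(pipeline: Callable[[str], List[Dict]]) -> Callable[[str], List[Dict]]:
    """Wrap a (hypothetical) analysis pipeline into a Gradio-friendly handler."""

    def handle(topic: str) -> List[Dict]:
        results = pipeline(topic.strip())
        return [
            {
                "title": r["title"],
                "channel": r["channel"],
                "views": r["views"],
                "transcript_preview": preview(r["transcript"]),
                "transcript_path": r["transcript_path"],
            }
            for r in results
        ]

    return handle


def build_interface(pipeline: Callable[[str], List[Dict]]):
    """Build the web UI; gradio is imported lazily so the helpers work without it."""
    import gradio as gr

    return gr.Interface(
        fn=make_handler(pipeline),
        inputs=gr.Textbox(label="Enter learning topic"),
        outputs=gr.JSON(label="Results"),
        title="VidInsight AI",
    )
```

With a real pipeline in hand, `build_interface(pipeline).launch()` would start the local web server.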

Output Folder

• Complete transcripts are saved in the `output/` folder as `.txt` files.
• File names are based on unique YouTube video IDs (e.g., `ukzFI9rgwfU.txt`).
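
As a rough illustration of this naming scheme, the sketch below transcribes an audio file with the `openai-whisper` package and writes the text to `output/<video_id>.txt`. The helper names and the `"base"` model size are assumptions; the project's `transcribe_videos.py` may differ, and the audio file is assumed to have been downloaded already.

```python
from pathlib import Path


def transcribe_audio(audio_path: str, model_name: str = "base") -> str:
    """Transcribe one audio file with openai-whisper (requires FFmpeg)."""
    import whisper  # imported lazily so save_transcript works without it

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


def save_transcript(video_id: str, text: str, output_dir: str = "output") -> Path:
    """Write the transcript to <output_dir>/<video_id>.txt and return the path."""
    folder = Path(output_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    path = folder / f"{video_id}.txt"
    path.write_text(text.strip() + "\n", encoding="utf-8")
    return path
```

Because the file name is derived only from the video ID, transcribing the same video twice overwrites the earlier file instead of creating a duplicate.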

Configuration

The config.py file allows customization of filtering criteria:

FILTER_CONFIG = {
    "videoDuration": "medium",  # Focus on videos between 4 and 20 minutes
    "order": "relevance",       # Sort by relevance
    "trusted_channels": {
        "Khan Academy": "UC4a-Gbdw7vOaccHmFo40b9g",
        "edX": "UCEBb1b_L6zDS3xTUrIALZOw",
        "Coursera": "UC58aowNEXHHnflR_5YTtP4g",
    },
    "teaching_keywords": {"tutorial", "lesson", "course", "how-to", "introduction", "basics"},
    "non_teaching_keywords": {"fun", "experiment", "joke", "prank", "vlog"},
    "max_results": 10,          # Maximum number of videos fetched from YouTube API
    "min_view_count": 10000     # Minimum view count for relevance
}
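
To show how these settings might reach the YouTube Data API, here is a hedged sketch that maps the config onto `search.list` request parameters using `google-api-python-client`. The function names are hypothetical and the real `fetch_youtube_videos.py` may build its request differently. Note that `search.list` does not return view counts, so applying `min_view_count` would need a follow-up `videos.list` call with `part="statistics"`.

```python
from typing import Any, Dict


def build_search_params(topic: str, config: Dict[str, Any]) -> Dict[str, Any]:
    """Translate FILTER_CONFIG-style settings into search.list parameters."""
    return {
        "part": "snippet",
        "q": topic,
        "type": "video",  # videoDuration is only honored when type is "video"
        "videoDuration": config["videoDuration"],
        "order": config["order"],
        "maxResults": config["max_results"],
    }


def search_videos(api_key: str, topic: str, config: Dict[str, Any]) -> Dict[str, Any]:
    """Run the search; the client library is imported lazily."""
    from googleapiclient.discovery import build

    youtube = build("youtube", "v3", developerKey=api_key)
    return youtube.search().list(**build_search_params(topic, config)).execute()
```

Keeping the parameter mapping in its own function makes it possible to check the request without spending API quota.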

Known Issues

1.	If no results are found or an error occurs during video fetching, the app displays an error message in JSON format.
2.	Ensure that valid topics are entered; overly broad or unrelated topics may not yield meaningful results.
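
The first issue describes errors surfacing as JSON. One way such behavior could be produced is sketched below; `run_safely` and the message wording are hypothetical, and `fetch` stands in for whatever function performs the video search.

```python
import json
from typing import Callable, Dict, List


def run_safely(fetch: Callable[[str], List[Dict]], topic: str) -> str:
    """Run the fetch step and return a JSON string with results or an error."""
    try:
        videos = fetch(topic)
    except Exception as exc:  # broad on purpose: any failure becomes a message
        return json.dumps({"error": f"Video fetching failed: {exc}"})
    if not videos:
        return json.dumps({"error": f"No results found for topic: {topic}"})
    return json.dumps({"videos": videos})
```

Returning the same JSON shape for "no results" and "fetch failed" lets the interface render both cases with a single code path.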

Future Features

1. Multilingual Support:
    • Add support for transcription in other languages (e.g., Spanish, French).

2. Interactive Q&A:
    • Allow users to ask questions about analyzed video content.

🛠️ Technology Stack

| Task | Technology |
| --- | --- |
| Video Retrieval | YouTube Data API, google-api-python-client |
| Transcription | yt-dlp, OpenAI Whisper |
| Summarization | Gemini AI, LangChain |
| Content Generation | Gemini AI, LangChain |
| Vectorization | ____ |
| Vector Database | ____ |

📌 Contributors

• Asif Khan – Developer and Project Lead
• Kade Thomas – Summarization Specialist
• Amit Gaikwad – Vector Database Specialist
• Simranpreet Saini – AI Agent Specialist
• Jason Brooks – Documentation Specialist

🙏 Acknowledgements

  • Special thanks to Firas Obeid for being an advisor on the project.
  • Special thanks to OpenAI, Hugging Face, the YouTube Data API, the Gemini API, and the Tavily API for providing the tools that made this project possible. 🚀