# VidInsight AI: AI-Powered YouTube Content Analyzer

## Overview
VidInsight AI is an AI-powered application that analyzes YouTube videos on a given subject and produces transcriptions, a topic, a summary, key points, and a new content idea.
The application is built to help the following groups understand video content quickly and effectively:
- content creators,
- educators and researchers, and
- everyday users.

---
This README documents the current phase of the project and will be updated as new features are implemented.

**Current Features (Asif's Code):**

	1.	YouTube Video Retrieval:
    	•	Fetches up to 10 YouTube videos based on a user-provided topic.
    	•	Filters videos based on criteria such as keywords, view counts, and trusted channels.
    	•	Selects the top 3 videos based on relevance and view counts.
    
	2.	Transcription:
    	•	Transcribes audio from the top 3 selected videos using OpenAI's Whisper model.
    	•	Saves the complete transcripts in an `output` folder for further processing.
    
	3.	User Interface:
    	•	Input
        	•	Provides a user-friendly interface built with Gradio.
    	•	Output
        	•	Displays video details (title, channel, views) and a preview of the transcription.
        	•	Analysis (Topic, Summary & Key Points)
        	•	Content Idea with comprehensive details

---

## Project Structure

VidInsight-AI/\
├── app.py                                 # Gradio web interface for user interaction\
├── config.py                              # Configuration file for API keys and filters\
├── fetch_youtube_videos.py                # Fetches and filters YouTube videos\
├── transcribe_videos.py                   # Transcribes videos and saves transcripts\
├── summary.py                             # Generates summaries from transcriptions\
├── YouTubeAgent.py                        # Creates content ideas using Gemini AI\
├── main.py                                # CLI-based alternative to run the app\
├── requirements.txt                       # Project dependencies\
├── keys1.env                              # Environment variables (API keys)\
└── output/                                # Folder for saved transcripts\
&nbsp;&nbsp;&nbsp; └── `<video_id>.txt`     &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # Transcripts saved as text files

### Key Components:
	1.	Interface Files:
    	•	`app.py`: Web interface using Gradio
    	•	`main.py`: Command-line interface
	2.	Core Processing Files:
    	•	`fetch_youtube_videos.py`: Video retrieval
    	•	`transcribe_videos.py`: Audio transcription
    	•	`summary.py`: Content summarization
    	•	`YouTubeAgent.py`: Content idea generation
	3.	Configuration Files:
    	•	`config.py`: Settings and filters
    	•	`keys1.env`: API keys
    	•	`requirements.txt`: Dependencies
	4.	Output Directory:
    	•	`output/`: Stores generated transcripts

---

## Setup Instructions (need to be completed)

1. Prerequisites\
	•	Python 3.8 or higher\
	•	FFmpeg installed on the system (for audio processing)\
	•	A YouTube Data API key (create one via Google Cloud Console)\
	•	A Gemini API key\
	•	A Tavily API key
2. Installation
	1.	Clone the repository:
      ```bash
      git clone <repository_url>
      ```
   	2.	Install the required dependencies listed in `requirements.txt`, for example with `pip install -r requirements.txt`.

   	3.	Set up your API keys:
	•	Create a `.env` file or update `keys1.env` with your YouTube, Gemini, and Tavily API keys:
    ```bash
    YOUTUBE_API_KEY="your_api_key_here"
    GEMINI_API_KEY="your_api_key_here"
    TAVILY_API_KEY="your_api_key_here"
    ```

3. Running the Application\
	•	Using the Gradio Interface:
    ```bash
    python app.py
    ```

   •	Using the CLI:
    ```bash
    python main.py
    ```
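
Before launching either entry point, it can help to confirm that the three API keys are actually visible to the process. The following is a minimal sketch, not the project's own code: the helper name `missing_api_keys` is hypothetical, and it assumes the keys from `keys1.env` have already been loaded into the environment by whatever mechanism the app uses.

```python
import os

# Key names taken from the keys1.env example above.
REQUIRED_KEYS = ("YOUTUBE_API_KEY", "GEMINI_API_KEY", "TAVILY_API_KEY")


def missing_api_keys(env=None):
    """Return the names of required keys that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the
    check easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Example: report which keys still need to be configured.
missing = missing_api_keys()
if missing:
    print("Missing API keys: " + ", ".join(missing))
else:
    print("All API keys are set.")
```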
---

## Usage

#### Gradio App
	1.	Enter a topic in the “Enter learning topic” field (e.g., “Machine Learning”).
	2.	Click “Submit” to fetch and analyze videos.
	3.	View results, including:
    	•	Video title, channel name, and view count.
    	•	A preview of the transcription.
    	•	The path to the saved transcript file.
    	•	Topic, summary, and key points.
    	•	A new content idea with comprehensive details.

#### Output Folder
	•	Complete transcripts are saved in the `output/` folder as `.txt` files.
	•	File names are based on unique YouTube video IDs (e.g., `ukzFI9rgwfU.txt`).
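
The naming scheme above can be sketched as follows. This is an illustrative example only: `save_transcript` and `transcript_preview` are hypothetical helper names, and the 200-character preview length is an assumption rather than the value the app uses.

```python
from pathlib import Path


def save_transcript(video_id: str, text: str, output_dir: str = "output") -> Path:
    """Write one transcript to <output_dir>/<video_id>.txt and return its path."""
    folder = Path(output_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    path = folder / f"{video_id}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def transcript_preview(path: Path, max_chars: int = 200) -> str:
    """Return the first max_chars characters of a saved transcript."""
    return path.read_text(encoding="utf-8")[:max_chars]
```

Calling `save_transcript("ukzFI9rgwfU", text)` would then produce `output/ukzFI9rgwfU.txt`, matching the file name shown above.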

---

## Configuration

The `config.py` file allows customization of filtering criteria:
```python
FILTER_CONFIG = {
    "videoDuration": "medium",  # Focus on videos between 4 and 20 minutes
    "order": "relevance",       # Sort by relevance
    "trusted_channels": {
        "Khan Academy": "UC4a-Gbdw7vOaccHmFo40b9g",
        "edX": "UCEBb1b_L6zDS3xTUrIALZOw",
        "Coursera": "UC58aowNEXHHnflR_5YTtP4g",
    },
    "teaching_keywords": {"tutorial", "lesson", "course", "how-to", "introduction", "basics"},
    "non_teaching_keywords": {"fun", "experiment", "joke", "prank", "vlog"},
    "max_results": 10,          # Maximum number of videos fetched from YouTube API
    "min_view_count": 10000     # Minimum view count for relevance
}
```
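
To illustrate how these settings could drive the "filter, then pick the top 3" step, here is a minimal sketch. It is not the code in `fetch_youtube_videos.py`: the function name `select_top_videos`, the video dictionary fields (`title`, `channel_id`, `views`), and the ranking rule (trusted channel first, then teaching-keyword matches, then views) are all assumptions made for the example.

```python
import re

# Trimmed copy of the config above, limited to the keys this sketch uses.
FILTER_CONFIG = {
    "trusted_channels": {"Khan Academy": "UC4a-Gbdw7vOaccHmFo40b9g"},
    "teaching_keywords": {"tutorial", "lesson", "course", "how-to", "introduction", "basics"},
    "non_teaching_keywords": {"fun", "experiment", "joke", "prank", "vlog"},
    "min_view_count": 10000,
}


def select_top_videos(videos, config, top_n=3):
    """Drop videos that fail the filters, then rank the remainder."""
    trusted_ids = set(config["trusted_channels"].values())
    kept = []
    for video in videos:
        # Whole-word matching, so "fun" does not match "fundamentals".
        words = set(re.findall(r"[a-z0-9-]+", video["title"].lower()))
        if words & config["non_teaching_keywords"]:
            continue
        if video["views"] < config["min_view_count"]:
            continue
        is_trusted = video["channel_id"] in trusted_ids
        teaching_hits = len(words & config["teaching_keywords"])
        kept.append((is_trusted, teaching_hits, video["views"], video))
    # Assumed ranking: trusted channels, then keyword matches, then views.
    kept.sort(key=lambda item: item[:3], reverse=True)
    return [item[3] for item in kept[:top_n]]


# Made-up sample data for illustration.
sample_videos = [
    {"title": "Python Tutorial for Beginners", "channel_id": "UCxyz", "views": 50000},
    {"title": "Funny Python prank", "channel_id": "UCxyz", "views": 900000},
    {"title": "Introduction to Python basics", "channel_id": "UC4a-Gbdw7vOaccHmFo40b9g", "views": 20000},
    {"title": "Python course", "channel_id": "UCabc", "views": 2000},
]
for video in select_top_videos(sample_videos, FILTER_CONFIG):
    print(video["title"])
```

With the sample data, the prank video is removed by the keyword filter and the low-view course by `min_view_count`, leaving the trusted-channel video ranked first.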

---

## Known Issues
	1.	If no results are found or an error occurs during video fetching, the app displays an error message in JSON format.
	2.	Ensure that valid topics are entered; overly broad or unrelated topics may not yield meaningful results.

---

## Future Features        
	1.	Multilingual Support (Future):
    	•	Add support for transcription in other languages (e.g., Spanish, French).
        
	2.	Interactive Q&A (Future):
    	•	Allow users to ask questions about analyzed video content.

---

## 🛠️ Technology Stack

| Task  | Technology |
| -------- | ------- |
| Video Retrieval | YouTube Data API, google-api-python-client   |
| Transcription | yt-dlp, OpenAI Whisper     |
| Summarization  | Gemini AI, LangChain  |
| Content Generation | Gemini AI, LangChain   |
| Vectorization | ____  |
| Vector Database | ____  |


---
## 📌 Contributors
	•	Asif Khan – Developer and Project Lead
    •	Kade Thomas – Summarization Specialist
    •	Amit Gaikwad – Vector Database Specialist
    •	Simranpreet Saini – AI Agent Specialist
    •	Jason Brooks – Documentation Specialist

---
## 🙏 Acknowledgements
- Special thanks to Firas Obeid for being an advisor on the project.
- Special thanks to OpenAI, Hugging Face, the YouTube API, the Gemini API, and the Tavily API for providing the tools that made this project possible. 🚀