Johannes committed
Commit cca4a24 · 1 parent: cfd5264
Files changed (11)
  1. KI-Bulli-Manual.md +617 -0
  2. README copy.md +195 -0
  3. debug-transformers.html +75 -0
  4. index-new.js +954 -0
  5. index.html +214 -19
  6. index.js +933 -55
  7. rag-backup.html +683 -0
  8. rag-complete.html +1466 -0
  9. start-simple.sh +30 -0
  10. style.css +277 -46
  11. test-smollm.html +75 -0
KI-Bulli-Manual.md ADDED
@@ -0,0 +1,617 @@
# KI-Bulli Time Machine - Operating Manual
## Model ZR-1955: The German Temporal Omnibus

**Version 3.14159 | Federal Office for Time Travel | Classification: Strictly Confidential**

---

## Table of Contents

1. [Introduction](#introduction)
2. [Safety Regulations](#safety-regulations)
3. [Technical Specifications](#technical-specifications)
4. [The Schmidhuber Besserwisser Matrix](#the-schmidhuber-besserwisser-matrix)
5. [The Einsteinian Relativity Inverter](#the-einsteinian-relativity-inverter)
6. [Operating Instructions](#operating-instructions)
7. [Maintenance and Care](#maintenance-and-care)
8. [Troubleshooting](#troubleshooting)
9. [Avoiding Time Paradoxes](#avoiding-time-paradoxes)
10. [Warranty and Liability](#warranty-and-liability)

---

## Introduction

Congratulations on the purchase of your KI-Bulli time-travel vehicle! The KI-Bulli (Künstliche Intelligenz Volkswagen Bus) is the result of decades of German engineering and represents the pinnacle of temporal transport technology.

Unlike its American counterpart, the DeLorean DMC-12, the KI-Bulli is based on the proven Volkswagen Type 2 chassis and combines German reliability with revolutionary time-travel technology. The vehicle was developed specifically for the European market and accounts for both Autobahn speeds and the complex temporal conditions of the European continent.

### Important Notes

- This vehicle is intended exclusively for licensed time travelers
- A valid Temporal Driving License (Class ZR-A) is required
- The vehicle must not be used in periods before 1885 or after 2155
- For journeys into the past, the provisions of the Temporal Protection Act (TSchG) apply

---

## Safety Regulations

### Basic Safety Rules

**WARNING:** Time travel poses considerable risks to the pilot, the passengers, and the space-time continuum. Always follow these safety regulations:

1. **Never choose your own date of birth as the target date** - leads to critical paradox loops
2. **Avoid major historical events** - minimum distance: 48 hours before/after significant events
3. **Observe the maximum jump distance** - no more than 500 years per single jump
4. **Calibrate the Schmidhuber Matrix regularly** - at least every 88 jumps, or whenever peak output reaches 1.21 gigawatts

### Personal Protective Equipment

- Temporal protective suit (included, sizes S-XXXL available)
- Chronometer wristwatch with quantum synchronization
- Emergency time anchor (for return in case of system failure)
- First-aid kit for temporal injuries

### Prohibited Items

The following items must NEVER be taken into the past:
- Sports almanacs or stock prices
- Future technology (smartphones, computers, etc.)
- Medications unknown before 1950
- Documents containing knowledge of the future
- Gray sneakers (see Doc Brown Directive 1885)

---

## Technical Specifications

### Base Vehicle: Volkswagen Type 2 (T1)
- **Model year:** 1955 (temporal modifications: 2072)
- **Engine:** 1.2L boxer, modified with plutonium auxiliary tank
- **Transmission:** 4-speed manual + temporal overdrive
- **Top speed:** 95 km/h (conventional), 88 mph (temporally critical)
- **Passengers:** 7 (including time-travel safety margin)

### Temporal Drive Unit
- **Core reactor:** Mr. Fusion Home Energy Reactor (German licensed production)
- **Flux density:** 1.21 gigawatts (2.1 GW emergency reserve)
- **Time coil:** double-wound chromoly alloy
- **Stabilizers:** gyro-stabilized space-time compensators

### Navigation System
- **Temporal GPS:** quantum-entangled satellite navigation
- **Target accuracy:** ±3 hours, ±50 meters
- **Map material:** complete Earth history 1885-2155
- **Languages:** German, English, Latin, Ancient Greek, Future German

---

## The Schmidhuber Besserwisser Matrix

The Schmidhuber Besserwisser Matrix (SBM) is the heart of the KI-Bulli time-travel system. Developed by Prof. Dr. Dr. h.c. Günther Schmidhuber at the Institute for Applied Temporal Physics at TU München, it ensures that all time travel complies with the laws of nature and the provisions of the Time Travel Ordinance.

### How the SBM Works

The matrix consists of 42 interconnected quantum computers that continuously compute the following parameters:

1. **Temporal probability fields**
2. **Paradox risk factors**
3. **Causality-chain integrity**
4. **Butterfly-effect minimization**
5. **Reality consistency index**

### Calibrating the SBM

**IMPORTANT:** The SBM must be correctly calibrated before every time journey. Faulty calibration can cause irreversible damage to the space-time continuum.

#### Step-by-Step Instructions:

1. **Initiate base calibration**
   - Set the main switch to "READY"
   - Wait until the green "MATRIX ACTIVE" LED appears
   - Enter your personal time-travel code (8 digits)

2. **Set the reference date**
   - Confirm the current date and time
   - Verify the GPS position
   - Enter the altitude above sea level

3. **Enter target parameters**
   - Target date (format: DD.MM.YYYY)
   - Target time (format: HH:MM:SS)
   - Target coordinates (degrees, minutes, seconds)
   - Duration of stay (maximum 168 hours)

4. **Activate the Besserwisser protocol**
   - Set the "BESSERWISSER" slider to 73%
   - Read and confirm the warnings
   - Set the emergency return time

5. **Matrix synchronization**
   - Press "START SYNC"
   - Wait for confirmation from all 42 quantum computers
   - If red warning lights appear: abort and contact service

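For the programmatically inclined time traveler, the target-parameter limits stated above can be collected into a small pre-flight check. This is a purely hypothetical sketch for illustration; the function and parameter names are invented and no such interface ships with the vehicle:

```javascript
// Hypothetical pre-flight validation of SBM target parameters,
// based only on the limits stated in this manual:
// target date 1885-2155, max 500 years per jump, stay <= 168 h,
// BESSERWISSER slider fixed at 73%.
function validateJumpParameters({ targetYear, currentYear, stayHours, besserwisserLevel }) {
  const errors = [];
  if (targetYear < 1885 || targetYear > 2155) {
    errors.push('ERROR 404 - time period not found (allowed: 1885-2155)');
  }
  if (Math.abs(targetYear - currentYear) > 500) {
    errors.push('Jump distance exceeds 500 years per single jump');
  }
  if (stayHours > 168) {
    errors.push('Duration of stay exceeds the 168-hour maximum');
  }
  if (besserwisserLevel !== 73) {
    errors.push('BESSERWISSER slider must be set to 73%');
  }
  return { ok: errors.length === 0, errors };
}
```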

### Advanced SBM Settings

For experienced time travelers, advanced options are available:

- **Precision mode:** increases target accuracy to ±30 minutes
- **Stealth mode:** minimizes the temporal signature
- **Historian mode:** permits observation-only journeys
- **Emergency mode:** permits an uncalibrated return jump (extreme situations only!)

---

## The Einsteinian Relativity Inverter

The Einsteinian Relativity Inverter (ERU) is the physical foundation of all time travel in the KI-Bulli. This ingenious system takes the principles of general relativity and inverts them in a controlled fashion, making backward motion through time possible.

### Scientific Foundations

Einstein showed that time is relative and is influenced by speed and gravity. The ERU exploits this insight and generates a controlled gravitational field that reverses time within a limited region.

#### The ERU Formula:
```
Δt = -(v²/c²) × √(1 - (2GM/rc²)) × ℏ × ψ(SBM)
```

Where:
- Δt = time shift
- v = vehicle speed
- c = speed of light
- G = gravitational constant
- M = mass of the temporal field
- r = radius of the time bubble
- ℏ = reduced Planck constant
- ψ(SBM) = Schmidhuber-Besserwisser-Matrix function

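Purely to illustrate the structure of the formula, it can be transcribed directly into code. The physical constants are real; everything else (the argument values, the ψ(SBM) factor) is as fictional as the machine itself:

```javascript
// Playful, literal transcription of the ERU formula above.
// psiSBM stands in for the fictional matrix function ψ(SBM).
function eruTimeShift({ v, M, r, psiSBM }) {
  const c = 299792458;     // speed of light, m/s
  const G = 6.674e-11;     // gravitational constant, m^3 kg^-1 s^-2
  const hbar = 1.055e-34;  // reduced Planck constant, J·s
  return -(v * v / (c * c)) * Math.sqrt(1 - (2 * G * M) / (r * c * c)) * hbar * psiSBM;
}
```

For positive ψ(SBM) the sign of Δt is negative, i.e. a shift into the past, consistent with the leading minus sign in the formula.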

### ERU Operating Modes

The relativity inverter can be operated in several modes:

#### 1. Standard Mode (Recommended)
- **Power consumption:** 1.21 GW
- **Maximum time jump:** 100 years
- **Accuracy:** ±2 hours
- **Safety level:** Maximum

#### 2. Eco Mode (Energy-Saving)
- **Power consumption:** 0.88 GW
- **Maximum time jump:** 50 years
- **Accuracy:** ±6 hours
- **Safety level:** High

#### 3. Sport Mode (For Professionals)
- **Power consumption:** 2.1 GW
- **Maximum time jump:** 500 years
- **Accuracy:** ±30 minutes
- **Safety level:** Medium

#### 4. Ludicrous Mode (Emergencies Only!)
- **Power consumption:** 5.55 GW
- **Maximum time jump:** 1000 years
- **Accuracy:** ±5 minutes
- **Safety level:** DANGEROUS!

### Coordinating the SBM and the ERU

Correct coordination between the Schmidhuber Besserwisser Matrix and the Einsteinian Relativity Inverter is critical for successful time travel:

1. **Initialization phase**
   - The SBM computes optimal ERU parameters
   - The ERU confirms physical feasibility
   - Mutual validation of the calculations

2. **Synchronization phase**
   - The SBM sends the start sequence to the ERU
   - The ERU charges the temporal field
   - At 88 mph: automatic time-jump trigger

3. **Monitoring phase**
   - Continuous parameter monitoring
   - Automatic corrections on deviations
   - Emergency shutdown at critical values

### ERU Maintenance

**Monthly:**
- Check the plutonium level
- Check the coolant temperature
- Calibrate the gravity sensors

**Annually:**
- Complete ERU inspection by a certified technician
- Replacement of the temporal coils
- Update of the Einstein algorithms

**As needed:**
- Emergency reset after paradox exposure
- Recalibration after a lightning strike
- Full service after 100,000 time-travel kilometers

---

## Operating Instructions

### Before Your First Time Journey

1. **Study the documentation** (at least 40 hours)
2. **Complete practical training** (time-travel driving school)
3. **Take out insurance** (temporal liability insurance is mandatory)
4. **Perform a test jump** (max. 24 hours into the past)

### Preparing a Time Journey

#### Vehicle Check
- [ ] Fuel: at least 3/4 tank
- [ ] Plutonium: 100% (red LED = refill!)
- [ ] Tires: minimum tread depth 1.6 mm (in the past, too!)
- [ ] Windshield wipers: functional (dinosaur droppings are stubborn)
- [ ] Time display: synchronized with an atomic clock
- [ ] Emergency equipment: complete

#### Personal Preparation
- [ ] Carry your temporal driving license
- [ ] Passport valid for all time epochs
- [ ] First-aid training refreshed
- [ ] Will updated (just in case)
- [ ] Family informed of travel plans

### Performing the Time Journey

#### Step 1: Start the vehicle
1. Switch on the ignition
2. Warm up the engine (3 minutes)
3. Activate the main "TIME TRAVEL" switch
4. Wait for system readiness (yellow light)

#### Step 2: Configure the SBM
1. Open the main menu: press "MENU"
2. Select "SCHMIDHUBER MATRIX"
3. Enter the target date
4. Set the Besserwisser level to 73%
5. Press "START CALIBRATION"
6. Wait for the green "READY" signal

#### Step 3: Synchronize the ERU
1. Open the "EINSTEIN INVERTER" menu
2. Confirm the current location
3. Select a relativity mode (Standard recommended)
4. Select "SYNC WITH SBM"
5. Wait for confirmation (blue LED)

#### Step 4: Prepare the time jump
1. Fasten your seatbelt (VERY important!)
2. Secure all loose objects
3. Confirm "TIME JUMP READY"
4. Accelerate to at least 88 mph
5. On reaching the critical speed:
   **AUTOMATIC TIME JUMP!**

#### Step 5: Arrival in the target time
1. Let the engine wind down (do not switch it off immediately!)
2. Verify position and time
3. Measure the temporal signature
4. Correct any deviations
5. Press "CONFIRM ARRIVAL"

### Returning to the Present

The return journey follows the same principle:

1. **Begin preparations at least 30 minutes before the planned return**
2. **Set the SBM to the origin time and location**
3. **Activate the ERU return mode**
4. **Accelerate to 88 mph for the automatic return**

**IMPORTANT:** Never stay in the past longer than planned! Every additional minute increases the paradox risk exponentially.

---

## Maintenance and Care

### Daily Checks

- **Plutonium level:** minimum 75%
- **Coolant:** temperature between 80-90 °C
- **Temporal coils:** no visible cracks or discoloration
- **SBM display:** all 42 quantum computers show "GREEN"
- **ERU status:** no error codes in the system

### Weekly Maintenance

#### Exterior Cleaning
- Wash the vehicle (use distilled water only!)
- Remove temporal residue (special solvents in the accessory kit)
- Check the antenna for damage
- Check the license plates for legibility

#### Technical Inspection
- Test all displays and warning lights
- Check the fuses (spares in the glove compartment)
- Function-test the emergency time anchor
- Activate and test the backup systems

### Monthly Inspection

1. **Engine check**
   - Check the oil level (special oil for temporal engines!)
   - Inspect the spark plugs
   - Check the cooling system

2. **Temporal system diagnostics**
   - Complete SBM system test
   - Verify the ERU calibration
   - Compare time accuracy against an atomic clock

3. **Safety systems**
   - Test the emergency braking
   - Calibrate the paradox shields
   - Adjust the space-time stabilizers

### Annual Main Inspection

**WARNING:** Must be performed by an authorized time-travel service partner!

- Complete system diagnostics of all temporal components
- Replacement of wear parts
- Software updates for the SBM and the ERU
- New TÜV sticker for time-travel vehicles
- Update of the proof of insurance

---

## Troubleshooting

### Common Problems and Solutions

#### "SBM shows ERROR 404 - time period not found"
**Cause:** Target date lies outside the permitted range
**Solution:** Choose a date between 1885 and 2155

#### "ERU overheats at 87 mph"
**Cause:** Defective cooling system or low coolant level
**Solution:** Top up coolant; if the problem recurs, contact service

#### "Time jump too inaccurate (±24 hours deviation)"
**Cause:** SBM calibration outdated
**Solution:** Perform a complete recalibration

#### "Paradox warning during time-jump preparation"
**Cause:** The planned journey would affect known events
**Solution:** Change the target date/location or use Historian mode

#### "Engine won't start after a time jump"
**Cause:** Temporal interference or EMP damage
**Solution:**
1. Wait 10 minutes (temporal discharge)
2. Activate the emergency start sequence
3. If it fails again: use the emergency time anchor

#### "Displays show cryptic time readings"
**Cause:** Quantum entanglement with a parallel universe
**Solution:**
1. Switch off the vehicle
2. Wait 30 seconds
3. Restart
4. If the problem recurs: immediate service required!

### Emergency Procedures

#### If stranded in the past:
1. **Stay calm!**
2. Activate the emergency time anchor
3. Radio the time-travel control center (frequency 88.8 MHz)
4. Wait for the rescue team
5. **Never attempt to travel back without functioning systems!**

#### On a paradox alarm:
1. **Immediately cease all time-travel activities!**
2. Park the vehicle in a safe place
3. Keep a minimum distance of 500 m from all persons
4. Notify the emergency service (phone: 0800-PARADOX)
5. Follow the quarantine protocol

#### On encountering yourself:
1. **Retreat immediately!**
2. No eye contact!
3. Keep a minimum distance of 1 km
4. Switch the ERU to "Stealth mode"
5. On accidental contact: apply the memory-erasure protocol

---

## Avoiding Time Paradoxes

Avoiding time paradoxes is the first duty of every time traveler. The Schmidhuber Besserwisser Matrix built into the KI-Bulli was designed specifically to prevent the most common types of paradox.

### Known Paradox Types

#### 1. Grandfather Paradox
**Description:** Altering your own family history
**Risk:** Extremely high
**Prevention:** Automatic blocking of all direct relatives in the SBM database

#### 2. Bootstrap Paradox
**Description:** Information without an origin, trapped in a time loop
**Risk:** Medium
**Prevention:** Documentation of all information carried along

#### 3. Butterfly Effect
**Description:** Small changes with large consequences
**Risk:** High for longer stays
**Prevention:** Minimal interaction, sterilized environment

#### 4. Predestination Paradox
**Description:** Circular causality
**Risk:** Low with correct SBM calibration
**Prevention:** Automatic event-chain analysis

### Rules of Conduct in the Past

#### DO's:
- ✅ Observe without interacting
- ✅ Wear temporal protective clothing
- ✅ Document all activities
- ✅ Report your position regularly
- ✅ Maintain historical accuracy

#### DON'Ts:
- ❌ Do not leave objects behind
- ❌ Do not reveal future-relevant information
- ❌ No romantic relationships
- ❌ No business transactions
- ❌ No medical interventions

### SBM Paradox Protection Protocol

The Schmidhuber Besserwisser Matrix continuously monitors the following parameters:

1. **Causality index:** probability of cause-and-effect disturbances
2. **Temporal integrity:** checking the timeline for inconsistencies
3. **Reality coherence:** comparison with known history
4. **Quantum fluctuations:** monitoring of space-time stability

At critical values, the system automatically:
- Warns the pilot
- Initiates protective measures
- If necessary: forces a return to the origin time

---

## Warranty and Liability

### Manufacturer's Warranty

**Zeitreise-Industries GmbH** grants the following warranties on the KI-Bulli:

#### Basic Warranty (2 years or 100,000 time-travel kilometers)
- Engine and drivetrain
- Body and chassis
- Standard electronics

#### Temporal Special Warranty (5 years or 1,000 time jumps)
- Schmidhuber Besserwisser Matrix
- Einsteinian Relativity Inverter
- Plutonium reactor
- Temporal navigation systems

#### Paradox Protection Warranty (10 years)
- Protection against time paradoxes caused by manufacturing defects
- Free reality restoration in case of system failure
- 24/7 emergency hotline for temporal crises

### Warranty Exclusions

The warranty is void in case of:
- Improper use (e.g. travel before 1885)
- Self-performed repairs
- Use of non-original plutonium
- Disregard of the paradox protection regulations
- Damage caused by dinosaurs, meteors, or volcanic eruptions
- Tampering with the SBM or ERU by time-travel amateurs

### Disclaimer of Liability

**IMPORTANT NOTICE:** Time travel involves unforeseeable risks!

Zeitreise-Industries GmbH accepts no liability for:
- Paradoxical effects on reality
- Lost or altered memories
- Encounters with parallel versions of yourself
- Unexpected historical events
- Damage caused by future time-travel police
- Risk of addiction from excessive time travel

### Insurance Requirements

**Mandatory:**
- Temporal liability insurance (minimum coverage: €50 million)
- Comprehensive paradox insurance
- Time-traveler accident insurance

**Recommended:**
- Supplementary reality-correction insurance
- Legal protection for temporal traffic violations
- Return-transport insurance in case of system failure

### Service and Support

**Nationwide service hotline:** 0800-ZEITBULI (0800-9348284)
**Emergency number:** 112-TEMPORAL
**E-mail:** service@zeitreise-industries.de
**Address:** Zeitreise-Industries GmbH, Temporalstraße 88, 85577 Hill Valley

**Service hours:**
- Mon-Fri: 8:00-18:00 (all time zones)
- Sat: 9:00-15:00
- Sun: emergency service only
- Holidays: by arrangement (historical holidays included)

**Authorized service partners** are available in over 150 German cities.
Full list at: www.zeitreise-industries.de/service

---

## Appendix

### Appendix A: Technical Data (Details)

**Engine:**
- Type: VW boxer, 4-cylinder, air-cooled
- Displacement: 1192 cm³
- Output: 30 hp (22 kW) @ 3400 rpm + temporal boost
- Fuel: regular gasoline + plutonium additive
- Consumption: 12 l/100 km (conventional), 0.1 g plutonium per time jump

**Temporal drive:**
- Mr. Fusion reactor: 1.21 GW standard output
- Time coils: 2x counter-rotating, chromoly alloy
- Cooling system: liquid-nitrogen circuit
- Safety systems: triple redundancy

### Appendix B: Spare Parts List

**Wear parts (every 50,000 km):**
- Temporal coils (pair): part no. TP-4711
- Plutonium filter: part no. PF-1337
- Relativity sensor: part no. RS-2001
- Paradox detector: part no. PD-404

**Critical components:**
- SBM quantum computer (complete): part no. QC-42
- ERU main unit: part no. ERU-1955
- Time anchor (emergency): part no. ZA-HELP

### Appendix C: Legal Provisions

**Relevant to KI-Bulli operation:**
- Temporal Protection Act (TSchG) §§ 1-156
- Time Travel Traffic Ordinance (ZrVO)
- Paradox Prevention Ordinance (PVV)
- EU Directive 88/1955/EC "Temporal Traffic"

### Appendix D: First Aid for Temporal Injuries

**Temporal nausea:**
- Symptoms: dizziness, disturbed time perception
- Treatment: rest, fresh air, chronometer synchronization

**Paradox shock:**
- Symptoms: memory gaps, déjà-vu attacks
- Treatment: immediate medical care, reality therapy

**Time-loop syndrome:**
- Symptoms: repetition of the same actions
- Treatment: consciously break the loop; ERU reset if necessary

---

**© 2025 Zeitreise-Industries GmbH - All rights reserved**
**"Where we're going, we don't need roads... but we do need a TÜV inspection!"**

*This manual was produced in cooperation with the Institute for Applied Temporal Physics at TU München and approved by the Federal Office for Time Travel. All technical data reflect the current state of time-travel science.*

**Version 3.14159 - Last updated: 31.02.2025**
README copy.md ADDED
@@ -0,0 +1,195 @@
# 🤖 AI-Powered Document Search & RAG Chat with Transformers.js

A complete **Retrieval-Augmented Generation (RAG)** system powered by **real transformer models** running directly in your browser via Transformers.js!

## ✨ Real AI Features

- 🧠 **Real Embeddings** - Xenova/all-MiniLM-L6-v2 (384-dimensional sentence transformers)
- 🤖 **Q&A Model** - Xenova/distilbert-base-cased-distilled-squad for question answering
- 🚀 **Language Model** - Xenova/distilgpt2 for creative text generation
- 🔮 **Semantic Search** - True vector similarity using transformer embeddings
- 💬 **Intelligent Chat** - Multiple AI modes: Q&A, Pure LLM, and LLM+RAG
- 📚 **Document Management** - Automatic embedding generation for new documents
- 🎨 **Professional UI** - Beautiful interface with real-time progress indicators
- ⚡ **Browser-Native** - No server required, models run entirely in your browser
- 💾 **Model Caching** - Downloads once, cached for future use

## 🚀 Quick Start

1. **Start the server:**
   ```bash
   ./start-simple.sh
   ```

2. **Open your browser:**
   ```
   http://localhost:8000/rag-complete.html
   ```

3. **Initialize Real AI Models:**
   - Click "🚀 Initialize Real AI Models"
   - First load: ~1-2 minutes (downloads ~50MB of models)
   - Subsequent loads: instant (models are cached)

4. **Experience Real AI:**
   - **Ask complex questions:** Get AI-generated answers with confidence scores
   - **LLM Chat:** Generate creative text, stories, poems, and explanations
   - **LLM+RAG:** Combine document context with language-model generation
   - **Semantic search:** Find documents by meaning, not just keywords
   - **Add documents:** Auto-generate embeddings with real transformers
   - **Test system:** Verify that all AI components are working

## 🧠 AI Models Used

### Embedding Model: Xenova/all-MiniLM-L6-v2
- **Purpose:** Generate 384-dimensional sentence embeddings
- **Size:** ~23MB
- **Performance:** ~2-3 seconds per document
- **Quality:** State-of-the-art semantic understanding

### Q&A Model: Xenova/distilbert-base-cased-distilled-squad
- **Purpose:** Question answering with document context
- **Size:** ~28MB
- **Performance:** ~3-5 seconds per question
- **Quality:** Accurate answers with confidence scores

### Language Model: Xenova/distilgpt2
- **Purpose:** Creative text generation and completion
- **Size:** ~40MB
- **Performance:** ~3-8 seconds per generation
- **Quality:** Coherent text with adjustable creativity

## 📁 Project Structure

```
document-embedding-search/
├── rag-complete.html    # Complete RAG system with real AI
├── rag-backup.html      # Backup (simulated AI version)
├── start-simple.sh      # Simple HTTP server startup script
└── README.md            # This file
```

## 🔬 How Real AI Works

### 1. **Real Embeddings Generation**
```javascript
// Uses an actual transformer model
embeddingModel = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const embedding = await embeddingModel(text, { pooling: 'mean', normalize: true });
```

### 2. **True Semantic Search**
- Documents encoded into 384-dimensional vectors
- Query embedded using the same transformer
- Cosine similarity calculated between real embeddings
- Results ranked by actual semantic similarity

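The similarity-and-ranking step can be sketched in plain JavaScript. This is an illustrative helper, not code taken from the project files; in the real system the vectors come from the Transformers.js embedding model above:

```javascript
// Cosine similarity between two embedding vectors of equal length.
// With { normalize: true }, the MiniLM embeddings are already unit-length,
// so the dot product alone would suffice; the full formula is shown for clarity.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents against a query embedding, highest similarity first.
function rankBySimilarity(queryEmbedding, docs) {
  return docs
    .map(doc => ({ ...doc, score: cosineSimilarity(queryEmbedding, doc.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```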
### 3. **Real AI Q&A Pipeline**
```javascript
// Actual question-answering model
qaModel = await pipeline('question-answering', 'Xenova/distilbert-base-cased-distilled-squad');
const result = await qaModel(question, context);
// Returns: { answer: "...", score: 0.95 }
```

### 4. **Intelligent RAG Flow**
1. **Question Analysis:** Real NLP processing of the user query
2. **Semantic Retrieval:** Vector similarity using transformer embeddings
3. **Context Assembly:** Intelligent document selection and ranking
4. **AI Generation:** Actual transformer-generated responses with confidence scores

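The four stages above can be sketched as one small orchestration function. Everything here is an illustrative skeleton, not the project's actual implementation: the `embed`, `similarity`, and `answer` callbacks stand in for the Transformers.js pipelines described earlier.

```javascript
// Minimal RAG skeleton: retrieve the top-k documents, assemble a
// context string, and hand question + context to an answer function.
async function ragAnswer(question, docs, { embed, similarity, answer, topK = 3 }) {
  // 1. Question analysis: embed the query.
  const queryVec = await embed(question);

  // 2. Semantic retrieval: score every document against the query.
  const ranked = docs
    .map(d => ({ ...d, score: similarity(queryVec, d.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);

  // 3. Context assembly: concatenate the best-matching texts.
  const context = ranked.map(d => d.text).join('\n\n');

  // 4. Generation: delegate to the Q&A or LLM pipeline.
  const result = await answer(question, context);
  return { ...result, sources: ranked.map(d => d.id) };
}
```

Injecting the model calls as parameters keeps the retrieval logic testable without loading any models.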
101
+ ## 🎯 Technical Implementation
102
+
103
+ - **Frontend:** Pure HTML5, CSS3, vanilla JavaScript
104
+ - **AI Framework:** Transformers.js (Hugging Face models in browser)
105
+ - **Models:** Real pre-trained transformers from Hugging Face Hub
106
+ - **Inference:** CPU-based, runs entirely client-side
107
+ - **Memory:** ~100MB RAM during inference
108
+ - **Storage:** ~50MB cached models (persistent browser cache)
109
+
110
+ ## 🌟 Advanced Real AI Features
111
+
112
+ - **Progress Tracking** - Real-time model loading progress
113
+ - **Confidence Scores** - AI provides confidence levels for answers
114
+ - **Error Handling** - Robust error management for model operations
115
+ - **Performance Monitoring** - Track inference times and model status
116
+ - **Batch Processing** - Efficient embedding generation for multiple documents
117
+ - **Memory Management** - Optimized for browser resource constraints
118
+
119
+ ## 📊 Performance Characteristics
120
+
121
+ | Operation | Time | Memory | Quality |
122
+ |-----------|------|--------|---------|
123
+ | Model Loading | 60-180s | 90MB | One-time |
124
+ | Document Embedding | 2-3s | 25MB | High |
125
+ | Semantic Search | 1-2s | 15MB | Excellent |
126
+ | Q&A Generation | 3-5s | 30MB | Very High |
127
+ | LLM Generation | 3-8s | 40MB | High |
128
+ | LLM+RAG | 5-10s | 50MB | Very High |
129
+
130
+ ## 🎮 Demo Capabilities
131
+
132
+ ### Real Semantic Search
133
+ - Try: "machine learning applications" vs "ML uses"
134
+ - Experience true semantic understanding beyond keywords
135
+
136
+ ### Intelligent Q&A
137
+ - Ask: "How does renewable energy help the environment?"
138
+ - Get AI-generated answers with confidence scores
139
+
140
+ ### Pure LLM Generation
141
+ - Prompt: "Tell me a story about space exploration"
142
+ - Generate creative content with adjustable temperature
143
+
144
+ ### LLM+RAG Hybrid
145
+ - Combines document retrieval with language generation
146
+ - Context-aware creative responses
147
+ - Best of both worlds: accuracy + creativity
148
+
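The hybrid mode boils down to prepending retrieved context to the user's prompt before generation. The template below mirrors the one used in this project's `chatWithLLMRAG` function; factoring it into a helper like this is an illustrative choice:

```javascript
// Build the enhanced prompt for LLM+RAG: retrieved document snippets
// are prepended as context ahead of the user's question.
function buildRagPrompt(question, relevantDocs, snippetLength = 300) {
  if (relevantDocs.length === 0) return question;
  const context = relevantDocs
    .map(item => item.doc.content.substring(0, snippetLength))
    .join(' ');
  return `Context: ${context}\n\nQuestion: ${question}\n\nAnswer:`;
}
```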
149
+ ### Context-Aware Responses
150
+ - Multi-document context assembly
151
+ - Relevant source citation
152
+ - Confidence-based answer validation
153
+
154
+ ## 🔧 Customization
155
+
156
+ Easily swap models by changing the pipeline configuration:
157
+
158
+ ```javascript
159
+ // Different embedding models
160
+ embeddingModel = await pipeline('feature-extraction', 'Xenova/e5-small-v2');
161
+
162
+ // Different QA models
163
+ qaModel = await pipeline('question-answering', 'Xenova/roberta-base-squad2');
164
+
165
+ // Text generation models
166
+ genModel = await pipeline('text-generation', 'Xenova/gpt2');
167
+ ```
168
+
169
+ ## 🚀 Deployment
170
+
171
+ Since models run entirely in the browser:
172
+
173
+ 1. **Static Hosting:** Upload single HTML file to any web server
174
+ 2. **CDN Distribution:** Serve globally with edge caching
175
+ 3. **Offline Capable:** Works without internet after initial model download
176
+ 4. **Mobile Compatible:** Runs on tablets and modern mobile browsers
177
+
178
+ ## 🎉 Transformers.js Showcase
179
+
180
+ This project demonstrates the incredible capabilities of Transformers.js:
181
+
182
+ - ✅ **Real AI in Browser** - No GPU servers required
183
+ - ✅ **Production Quality** - State-of-the-art model performance
184
+ - ✅ **Developer Friendly** - Simple API, complex AI made easy
185
+ - ✅ **Privacy Focused** - All processing happens locally
186
+ - ✅ **Cost Effective** - No API calls or inference costs
187
+ - ✅ **Scalable** - Each user's browser runs its own inference, so no backend capacity is needed
188
+
189
+ ## 📄 License
190
+
191
+ Open source and available under the MIT License.
192
+
193
+ ---
194
+
195
+ **🎯 Result:** A production-ready RAG system showcasing real transformer models running natively in web browsers - the future of AI-powered web applications!
debug-transformers.html ADDED
@@ -0,0 +1,75 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Transformers.js Debug</title>
7
+ <script type="module">
8
+ // Import transformers.js from CDN
9
+ import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.2';
10
+
11
+ // Make available globally
12
+ window.transformers = { pipeline, env };
13
+ window.transformersLoaded = true;
14
+
15
+ console.log('✅ Transformers.js loaded via ES modules');
16
+ </script>
17
+ <script src="https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.2/dist/transformers.min.js"></script>
18
+ </head>
19
+ <body>
20
+ <h1>Transformers.js Debug Test</h1>
21
+ <div id="status">Loading...</div>
22
+ <button onclick="testTransformers()">Test Transformers.js</button>
23
+ <div id="result"></div>
24
+
25
+ <script>
26
+ async function testTransformers() {
27
+ const statusDiv = document.getElementById('status');
28
+ const resultDiv = document.getElementById('result');
29
+
30
+ try {
31
+ statusDiv.textContent = 'Testing Transformers.js loading...';
32
+
33
+ // Check what's available
34
+ console.log('window.transformers:', window.transformers);
35
+ console.log('window.Transformers:', window.Transformers);
36
+ console.log('window.transformersLoaded:', window.transformersLoaded);
37
+
38
+ let pipeline, env;
39
+
40
+ if (window.transformers && window.transformersLoaded) {
41
+ console.log('Using ES modules version');
42
+ ({ pipeline, env } = window.transformers);
43
+ } else if (window.Transformers) {
44
+ console.log('Using UMD version');
45
+ ({ pipeline, env } = window.Transformers);
46
+ } else {
47
+ throw new Error('No transformers library found');
48
+ }
49
+
50
+ statusDiv.textContent = 'Creating pipeline...';
51
+
52
+ // Test basic functionality
53
+ const classifier = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english');
54
+
55
+ statusDiv.textContent = 'Running inference...';
56
+
57
+ const result = await classifier('I love transformers.js!');
58
+
59
+ statusDiv.textContent = '✅ Success!';
60
+ resultDiv.innerHTML = `<pre>${JSON.stringify(result, null, 2)}</pre>`;
61
+
62
+ } catch (error) {
63
+ console.error('Error:', error);
64
+ statusDiv.textContent = '❌ Error: ' + error.message;
65
+ resultDiv.textContent = error.stack;
66
+ }
67
+ }
68
+
69
+ // Auto-test when page loads
70
+ document.addEventListener('DOMContentLoaded', () => {
71
+ setTimeout(testTransformers, 2000);
72
+ });
73
+ </script>
74
+ </body>
75
+ </html>
index-new.js ADDED
@@ -0,0 +1,954 @@
1
+ // Import transformers.js 3.0.0 from CDN (new Hugging Face ownership)
2
+ import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
3
+
4
+ // Make available globally
5
+ window.transformers = { pipeline, env };
6
+ window.transformersLoaded = true;
7
+
8
+ console.log('✅ Transformers.js 3.0.0 loaded via ES modules (Hugging Face)');
9
+
10
+ // Global variables for transformers.js
11
+ let transformersPipeline = null;
12
+ let transformersEnv = null;
13
+ let transformersReady = false;
14
+
15
+ // Document storage and AI state
16
+ let documents = [
17
+ {
18
+ id: 0,
19
+ title: "Artificial Intelligence Overview",
20
+ content: "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines that work and react like humans. Some activities computers with AI are designed for include speech recognition, learning, planning, and problem-solving. AI is used in healthcare, finance, transportation, and entertainment. Machine learning enables computers to learn from experience without explicit programming. Deep learning uses neural networks to understand complex patterns in data.",
21
+ embedding: null
22
+ },
23
+ {
24
+ id: 1,
25
+ title: "Space Exploration",
26
+ content: "Space exploration is the ongoing discovery and exploration of celestial structures in outer space through evolving space technology. Physical exploration is conducted by unmanned robotic probes and human spaceflight. Space exploration has been used for geopolitical rivalries like the Cold War. The early era was driven by a Space Race between the Soviet Union and United States. Modern exploration includes Mars missions, the International Space Station, and satellite programs.",
27
+ embedding: null
28
+ },
29
+ {
30
+ id: 2,
31
+ title: "Renewable Energy",
32
+ content: "Renewable energy comes from naturally replenished resources on a human timescale. It includes sunlight, wind, rain, tides, waves, and geothermal heat. Renewable energy contrasts with fossil fuels that are used faster than replenished. Most renewable sources are sustainable. Solar energy is abundant and promising. Wind energy and hydroelectric power are major contributors to renewable generation worldwide.",
33
+ embedding: null
34
+ }
35
+ ];
36
+
37
+ let embeddingModel = null;
38
+ let qaModel = null;
39
+ let llmModel = null;
40
+ let loadedModelName = '';
41
+ let modelsInitialized = false;
42
+
43
+ // Calculate cosine similarity between two vectors
44
+ function cosineSimilarity(a, b) {
45
+ const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
46
+ const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
47
+ const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
48
+
49
+ if (magnitudeA === 0 || magnitudeB === 0) return 0;
50
+ return dotProduct / (magnitudeA * magnitudeB);
51
+ }
52
+
53
+ // Initialize transformers.js when the script loads
54
+ async function initTransformers() {
55
+ try {
56
+ console.log('🔄 Initializing Transformers.js...');
57
+
58
+ // Try ES modules first (preferred method)
59
+ if (window.transformers && window.transformersLoaded) {
60
+ console.log('✅ Using ES modules version (Transformers.js 3.0.0)');
61
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.transformers);
62
+ }
63
+ // Fallback to UMD version
64
+ else if (window.Transformers) {
65
+ console.log('✅ Using UMD version (Transformers.js 3.0.0)');
66
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.Transformers);
67
+ }
68
+ // Wait for library to load
69
+ else {
70
+ console.log('⏳ Waiting for library to load...');
71
+ let attempts = 0;
72
+ while (!window.Transformers && !window.transformersLoaded && attempts < 50) {
73
+ await new Promise(resolve => setTimeout(resolve, 200));
74
+ attempts++;
75
+ }
76
+
77
+ if (window.transformers && window.transformersLoaded) {
78
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.transformers);
79
+ } else if (window.Transformers) {
80
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.Transformers);
81
+ } else {
82
+ throw new Error('Failed to load Transformers.js library');
83
+ }
84
+ }
85
+
86
+ // Configure transformers.js with minimal settings
87
+ if (transformersEnv) {
88
+ transformersEnv.allowLocalModels = false;
89
+ transformersEnv.allowRemoteModels = true;
90
+ // Let Transformers.js use default WASM paths for better compatibility
91
+ }
92
+
93
+ transformersReady = true;
94
+ console.log('✅ Transformers.js initialized successfully');
95
+
96
+ // Update UI to show ready state
97
+ updateStatus();
98
+
99
+ // Update status indicator
100
+ const statusSpan = document.getElementById('transformersStatus');
101
+ if (statusSpan) {
102
+ statusSpan.textContent = '✅ Ready!';
103
+ statusSpan.style.color = 'green';
104
+ }
105
+
106
+ } catch (error) {
107
+ console.error('❌ Error initializing Transformers.js:', error);
108
+
109
+ // Show error in UI
110
+ const statusDiv = document.getElementById('status');
111
+ if (statusDiv) {
112
+ statusDiv.textContent = `❌ Failed to load Transformers.js: ${error.message}`;
113
+ statusDiv.style.color = 'red';
114
+ }
115
+
116
+ // Update status indicator
117
+ const statusSpan = document.getElementById('transformersStatus');
118
+ if (statusSpan) {
119
+ statusSpan.textContent = `❌ Failed: ${error.message}`;
120
+ statusSpan.style.color = 'red';
121
+ }
122
+ }
123
+ }
124
+
125
+ // Initialize when page loads
126
+ document.addEventListener('DOMContentLoaded', function() {
127
+ initTransformers();
128
+ initFileUpload();
129
+ });
130
+
131
+ // UI Functions
132
+ function showTab(tabName) {
133
+ // Hide all tabs
134
+ document.querySelectorAll('.tab-content').forEach(tab => {
135
+ tab.classList.remove('active');
136
+ });
137
+ document.querySelectorAll('.tab').forEach(button => {
138
+ button.classList.remove('active');
139
+ });
140
+
141
+ // Show selected tab
142
+ document.getElementById(tabName).classList.add('active');
143
+ event.target.classList.add('active');
144
+ }
145
+
146
+ function updateSliderValue(sliderId) {
147
+ const slider = document.getElementById(sliderId);
148
+ const valueSpan = document.getElementById(sliderId + 'Value');
149
+ valueSpan.textContent = slider.value;
150
+ }
151
+
152
+ function updateStatus() {
153
+ const status = document.getElementById('status');
154
+ const transformersStatus = transformersReady ? 'Ready' : 'Not ready';
155
+ const embeddingStatus = embeddingModel ? 'Loaded' : 'Not loaded';
156
+ const qaStatus = qaModel ? 'Loaded' : 'Not loaded';
157
+ const llmStatus = llmModel ? 'Loaded' : 'Not loaded';
158
+ status.textContent = `📊 Documents: ${documents.length} | 🔧 Transformers.js: ${transformersStatus} | 🤖 QA: ${qaStatus} | 🧠 Embedding: ${embeddingStatus} | 🚀 LLM: ${llmStatus}`;
159
+ }
160
+
161
+ function updateProgress(percent, text) {
162
+ const progressBar = document.getElementById('progressBar');
163
+ const progressText = document.getElementById('progressText');
164
+ progressBar.style.width = percent + '%';
165
+ progressText.textContent = text;
166
+ }
167
+
168
+ // AI Functions
169
+ async function initializeModels() {
170
+ const statusDiv = document.getElementById('initStatus');
171
+ const progressDiv = document.getElementById('initProgress');
172
+ const initBtn = document.getElementById('initBtn');
173
+
174
+ statusDiv.style.display = 'block';
175
+ progressDiv.style.display = 'block';
176
+ initBtn.disabled = true;
177
+
178
+ try {
179
+ // Check if transformers.js is ready
180
+ if (!transformersReady || !transformersPipeline) {
181
+ updateProgress(5, "Waiting for Transformers.js to initialize...");
182
+ statusDiv.innerHTML = '🔄 Initializing Transformers.js library...';
183
+
184
+ // Wait for transformers.js to be ready
185
+ let attempts = 0;
186
+ while (!transformersReady && attempts < 30) {
187
+ await new Promise(resolve => setTimeout(resolve, 1000));
188
+ attempts++;
189
+ }
190
+
191
+ if (!transformersReady) {
192
+ throw new Error('Transformers.js failed to initialize. Please refresh the page.');
193
+ }
194
+ }
195
+
196
+ updateProgress(10, "Loading embedding model...");
197
+ statusDiv.innerHTML = '🔄 Loading embedding model (Xenova/all-MiniLM-L6-v2)...';
198
+
199
+ // Load embedding model with progress tracking
200
+ embeddingModel = await transformersPipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
201
+ progress_callback: (progress) => {
202
+ if (progress.status === 'downloading') {
203
+ const percent = progress.loaded && progress.total ?
204
+ Math.round((progress.loaded / progress.total) * 100) : 0;
205
+ statusDiv.innerHTML = `🔄 Downloading embedding model: ${percent}%`;
206
+ }
207
+ }
208
+ });
209
+
210
+ updateProgress(40, "Loading question-answering model...");
211
+ statusDiv.innerHTML = '🔄 Loading QA model (Xenova/distilbert-base-cased-distilled-squad)...';
212
+
213
+ // Load QA model with progress tracking
214
+ qaModel = await transformersPipeline('question-answering', 'Xenova/distilbert-base-cased-distilled-squad', {
215
+ progress_callback: (progress) => {
216
+ if (progress.status === 'downloading') {
217
+ const percent = progress.loaded && progress.total ?
218
+ Math.round((progress.loaded / progress.total) * 100) : 0;
219
+ statusDiv.innerHTML = `🔄 Downloading QA model: ${percent}%`;
220
+ }
221
+ }
222
+ });
223
+
224
+ updateProgress(70, "Loading language model...");
225
+ statusDiv.innerHTML = '🔄 Loading LLM (trying SmolLM models)...';
226
+
227
+ // Load LLM model - Stable Transformers.js 3.0.0 configuration
228
+ const modelsToTry = [
229
+ {
230
+ name: 'Xenova/gpt2',
231
+ options: {}
232
+ },
233
+ {
234
+ name: 'Xenova/distilgpt2',
235
+ options: {}
236
+ }
237
+ ];
238
+
239
+ let modelLoaded = false;
240
+ for (const model of modelsToTry) {
241
+ try {
242
+ console.log(`Trying to load ${model.name}...`);
243
+ statusDiv.innerHTML = `🔄 Loading LLM (${model.name})...`;
244
+
245
+ // Load LLM with progress tracking
246
+ llmModel = await transformersPipeline('text-generation', model.name, {
247
+ progress_callback: (progress) => {
248
+ if (progress.status === 'downloading') {
249
+ const percent = progress.loaded && progress.total ?
250
+ Math.round((progress.loaded / progress.total) * 100) : 0;
251
+ statusDiv.innerHTML = `🔄 Downloading ${model.name}: ${percent}%`;
252
+ }
253
+ }
254
+ });
255
+
256
+ console.log(`✅ Successfully loaded ${model.name}`);
257
+ loadedModelName = model.name;
258
+ modelLoaded = true;
259
+ break;
260
+ } catch (error) {
261
+ console.warn(`${model.name} failed:`, error);
262
+ }
263
+ }
264
+
265
+ if (!modelLoaded) {
266
+ throw new Error('Failed to load any LLM model');
267
+ }
268
+
269
+ updateProgress(85, "Generating embeddings for documents...");
270
+ statusDiv.innerHTML = '🔄 Generating embeddings for existing documents...';
271
+
272
+ // Generate embeddings for all existing documents
273
+ for (let i = 0; i < documents.length; i++) {
274
+ const doc = documents[i];
275
+ updateProgress(85 + (i / documents.length) * 10, `Processing document ${i + 1}/${documents.length}...`);
276
+ doc.embedding = await generateEmbedding(doc.content);
277
+ }
278
+
279
+ updateProgress(100, "Initialization complete!");
280
+ modelsInitialized = true;
281
+
282
+ statusDiv.innerHTML = `✅ AI Models initialized successfully!
283
+ 🧠 Embedding Model: Xenova/all-MiniLM-L6-v2 (384 dimensions)
284
+ 🤖 QA Model: Xenova/distilbert-base-cased-distilled-squad
285
+ 🚀 LLM Model: ${loadedModelName} (Language model for text generation)
286
+ 📚 Documents processed: ${documents.length}
287
+ 🔮 Ready for semantic search, Q&A, and LLM chat!
288
+
289
+ 📊 Model Info:
290
+ • Embedding model size: ~23MB
291
+ • QA model size: ~28MB
292
+ • LLM model size: ~15-50MB (depending on model loaded)
293
+ • Total memory usage: ~70-100MB
294
+ • Inference speed: ~2-8 seconds per operation`;
295
+
296
+ updateStatus();
297
+
298
+ } catch (error) {
299
+ console.error('Error initializing models:', error);
300
+ statusDiv.innerHTML = `❌ Error initializing models: ${error.message}
301
+
302
+ Please check your internet connection and try again.`;
303
+ updateProgress(0, "Initialization failed");
304
+ } finally {
305
+ initBtn.disabled = false;
306
+ setTimeout(() => {
307
+ progressDiv.style.display = 'none';
308
+ }, 2000);
309
+ }
310
+ }
311
+
312
+ async function generateEmbedding(text) {
313
+ if (!transformersReady || !transformersPipeline) {
314
+ throw new Error('Transformers.js not initialized');
315
+ }
316
+
317
+ if (!embeddingModel) {
318
+ throw new Error('Embedding model not loaded');
319
+ }
320
+
321
+ try {
322
+ const output = await embeddingModel(text, { pooling: 'mean', normalize: true });
323
+ return Array.from(output.data);
324
+ } catch (error) {
325
+ console.error('Error generating embedding:', error);
326
+ throw error;
327
+ }
328
+ }
329
+
330
+ async function searchDocumentsSemantic() {
331
+ const query = document.getElementById('searchQuery').value;
332
+ const maxResults = parseInt(document.getElementById('maxResults').value);
333
+ const resultsDiv = document.getElementById('searchResults');
334
+ const searchBtn = document.getElementById('searchBtn');
335
+
336
+ if (!query.trim()) {
337
+ resultsDiv.style.display = 'block';
338
+ resultsDiv.textContent = '❌ Please enter a search query';
339
+ return;
340
+ }
341
+
342
+ if (!transformersReady || !modelsInitialized || !embeddingModel) {
343
+ resultsDiv.style.display = 'block';
344
+ resultsDiv.textContent = '❌ Please initialize AI models first!';
345
+ return;
346
+ }
347
+
348
+ resultsDiv.style.display = 'block';
349
+ resultsDiv.innerHTML = '<div class="loading"></div> Generating query embedding and searching...';
350
+ searchBtn.disabled = true;
351
+
352
+ try {
353
+ // Generate embedding for query
354
+ const queryEmbedding = await generateEmbedding(query);
355
+
356
+ // Calculate similarities
357
+ const results = [];
358
+ documents.forEach(doc => {
359
+ if (doc.embedding) {
360
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
361
+ results.push({ doc, similarity });
362
+ }
363
+ });
364
+
365
+ // Sort by similarity
366
+ results.sort((a, b) => b.similarity - a.similarity);
367
+
368
+ if (results.length === 0) {
369
+ resultsDiv.textContent = `❌ No documents with embeddings found for '${query}'`;
370
+ return;
371
+ }
372
+
373
+ let output = `🔍 Semantic search results for '${query}':\n\n`;
374
+ results.slice(0, maxResults).forEach((result, i) => {
375
+ const doc = result.doc;
376
+ const similarity = result.similarity;
377
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
378
+ output += `**Result ${i + 1}** (similarity: ${similarity.toFixed(3)})\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
379
+ });
380
+
381
+ resultsDiv.textContent = output;
382
+
383
+ } catch (error) {
384
+ console.error('Search error:', error);
385
+ resultsDiv.textContent = `❌ Error during search: ${error.message}`;
386
+ } finally {
387
+ searchBtn.disabled = false;
388
+ }
389
+ }
390
+
391
+ function searchDocumentsKeyword() {
392
+ const query = document.getElementById('searchQuery').value;
393
+ const maxResults = parseInt(document.getElementById('maxResults').value);
394
+ const resultsDiv = document.getElementById('searchResults');
395
+
396
+ if (!query.trim()) {
397
+ resultsDiv.style.display = 'block';
398
+ resultsDiv.textContent = '❌ Please enter a search query';
399
+ return;
400
+ }
401
+
402
+ resultsDiv.style.display = 'block';
403
+ resultsDiv.innerHTML = '<div class="loading"></div> Searching keywords...';
404
+
405
+ setTimeout(() => {
406
+ const results = [];
407
+ const queryWords = query.toLowerCase().split(/\s+/);
408
+
409
+ documents.forEach(doc => {
410
+ const contentLower = doc.content.toLowerCase();
411
+ const titleLower = doc.title.toLowerCase();
412
+
413
+ let matches = 0;
414
+ queryWords.forEach(word => {
415
+ matches += (contentLower.match(new RegExp(word, 'g')) || []).length;
416
+ matches += (titleLower.match(new RegExp(word, 'g')) || []).length * 2;
417
+ });
418
+
419
+ if (matches > 0) {
420
+ results.push({ doc, score: matches });
421
+ }
422
+ });
423
+
424
+ results.sort((a, b) => b.score - a.score);
425
+
426
+ if (results.length === 0) {
427
+ resultsDiv.textContent = `❌ No documents found containing '${query}'`;
428
+ return;
429
+ }
430
+
431
+ let output = `🔍 Keyword search results for '${query}':\n\n`;
432
+ results.slice(0, maxResults).forEach((result, i) => {
433
+ const doc = result.doc;
434
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
435
+ output += `**Result ${i + 1}**\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
436
+ });
437
+
438
+ resultsDiv.textContent = output;
439
+ }, 500);
440
+ }
441
+
442
+ async function chatWithRAG() {
443
+ const question = document.getElementById('chatQuestion').value;
444
+ const maxContext = parseInt(document.getElementById('maxContext').value);
445
+ const responseDiv = document.getElementById('chatResponse');
446
+ const chatBtn = document.getElementById('chatBtn');
447
+
448
+ if (!question.trim()) {
449
+ responseDiv.style.display = 'block';
450
+ responseDiv.textContent = '❌ Please enter a question';
451
+ return;
452
+ }
453
+
454
+ if (!transformersReady || !modelsInitialized || !embeddingModel || !qaModel) {
455
+ responseDiv.style.display = 'block';
456
+ responseDiv.textContent = '❌ AI models not loaded yet. Please initialize them first!';
457
+ return;
458
+ }
459
+
460
+ responseDiv.style.display = 'block';
461
+ responseDiv.innerHTML = '<div class="loading"></div> Generating answer with real AI...';
462
+ chatBtn.disabled = true;
463
+
464
+ try {
465
+ // Generate embedding for the question
466
+ const questionEmbedding = await generateEmbedding(question);
467
+
468
+ // Find relevant documents using semantic similarity
469
+ const relevantDocs = [];
470
+ documents.forEach(doc => {
471
+ if (doc.embedding) {
472
+ const similarity = cosineSimilarity(questionEmbedding, doc.embedding);
473
+ if (similarity > 0.1) {
474
+ relevantDocs.push({ doc, similarity });
475
+ }
476
+ }
477
+ });
478
+
479
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
480
+ relevantDocs.splice(maxContext);
481
+
482
+ if (relevantDocs.length === 0) {
483
+ responseDiv.textContent = '❌ No relevant context found in the documents for your question.';
484
+ return;
485
+ }
486
+
487
+ // Combine context from top documents
488
+ const context = relevantDocs.map(item => item.doc.content).join(' ').substring(0, 2000);
489
+
490
+ // Use the QA model to generate an answer
491
+ const qaResult = await qaModel(question, context);
492
+
493
+ let response = `🤖 AI Answer:\n${qaResult.answer}\n\n`;
494
+ response += `📊 Confidence: ${(qaResult.score * 100).toFixed(1)}%\n\n`;
495
+ response += `📚 Sources: ${relevantDocs.length} documents\n`;
496
+ response += `🔍 Best match: "${relevantDocs[0].doc.title}" (similarity: ${relevantDocs[0].similarity.toFixed(3)})\n\n`;
497
+ response += `📝 Context used:\n${context.substring(0, 300)}...`;
498
+
499
+ responseDiv.textContent = response;
500
+
501
+ } catch (error) {
502
+ console.error('Chat error:', error);
503
+ responseDiv.textContent = `❌ Error generating response: ${error.message}`;
504
+ } finally {
505
+ chatBtn.disabled = false;
506
+ }
507
+ }
508
+
509
+ async function chatWithLLM() {
510
+ const prompt = document.getElementById('llmPrompt').value;
511
+ const maxTokens = parseInt(document.getElementById('maxTokens').value);
512
+ const temperature = parseFloat(document.getElementById('temperature').value);
513
+ const responseDiv = document.getElementById('llmResponse');
514
+ const llmBtn = document.getElementById('llmBtn');
515
+
516
+ if (!prompt.trim()) {
517
+ responseDiv.style.display = 'block';
518
+ responseDiv.textContent = '❌ Please enter a prompt';
519
+ return;
520
+ }
521
+
522
+ if (!transformersReady || !modelsInitialized || !llmModel) {
523
+ responseDiv.style.display = 'block';
524
+ responseDiv.textContent = '❌ LLM model not loaded yet. Please initialize models first!';
525
+ return;
526
+ }
527
+
528
+ responseDiv.style.display = 'block';
529
+ responseDiv.innerHTML = '<div class="loading"></div> Generating text with LLM...';
530
+ llmBtn.disabled = true;
531
+
532
+ try {
533
+ // Generate text with the LLM
534
+ const result = await llmModel(prompt, {
535
+ max_new_tokens: maxTokens,
536
+ temperature: temperature,
537
+ do_sample: true,
538
+ return_full_text: false
539
+ });
540
+
541
+ let generatedText = result[0].generated_text;
542
+
543
+ let response = `🚀 LLM Generated Text:\n\n"${generatedText}"\n\n`;
544
+ response += `📊 Settings: ${maxTokens} tokens, temperature ${temperature}\n`;
545
+ response += `🤖 Model: ${loadedModelName ? loadedModelName.split('/')[1] : 'Language Model'}\n`;
546
+ response += `⏱️ Generated in real-time by your browser!`;
547
+
548
+ responseDiv.textContent = response;
549
+
550
+ } catch (error) {
551
+ console.error('LLM error:', error);
552
+ responseDiv.textContent = `❌ Error generating text: ${error.message}`;
553
+ } finally {
554
+ llmBtn.disabled = false;
555
+ }
556
+ }
557
+
558
+ async function chatWithLLMRAG() {
559
+ const prompt = document.getElementById('llmPrompt').value;
560
+ const maxTokens = parseInt(document.getElementById('maxTokens').value);
561
+ const temperature = parseFloat(document.getElementById('temperature').value);
562
+ const responseDiv = document.getElementById('llmResponse');
563
+ const llmRagBtn = document.getElementById('llmRagBtn');
564
+
565
+ if (!prompt.trim()) {
566
+ responseDiv.style.display = 'block';
567
+ responseDiv.textContent = '❌ Please enter a prompt';
568
+ return;
569
+ }
570
+
571
+ if (!transformersReady || !modelsInitialized || !llmModel || !embeddingModel) {
572
+ responseDiv.style.display = 'block';
573
+ responseDiv.textContent = '❌ Models not loaded yet. Please initialize all models first!';
574
+ return;
575
+ }
576
+
577
+ responseDiv.style.display = 'block';
578
+ responseDiv.innerHTML = '<div class="loading"></div> Finding relevant context and generating with LLM...';
579
+ llmRagBtn.disabled = true;
580
+
581
+ try {
582
+ // Find relevant documents using semantic search
583
+ const queryEmbedding = await generateEmbedding(prompt);
584
+ const relevantDocs = [];
585
+
586
+ documents.forEach(doc => {
587
+ if (doc.embedding) {
588
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
+ if (similarity > 0.1) {
+ relevantDocs.push({ doc, similarity });
+ }
+ }
+ });
+
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
+ relevantDocs.splice(3); // Limit to top 3 documents
+
+ // Create enhanced prompt with context
+ let enhancedPrompt = prompt;
+ if (relevantDocs.length > 0) {
+ const context = relevantDocs.map(item => item.doc.content.substring(0, 300)).join(' ');
+ enhancedPrompt = `Context: ${context}\n\nQuestion: ${prompt}\n\nAnswer:`;
+ }
+
+ // Generate text with the LLM using enhanced prompt
+ const result = await llmModel(enhancedPrompt, {
+ max_new_tokens: maxTokens,
+ temperature: temperature,
+ do_sample: true,
+ return_full_text: false
+ });
+
+ let generatedText = result[0].generated_text;
+
+ let response = `🤖 LLM + RAG Generated Response:\n\n"${generatedText}"\n\n`;
+ response += `📚 Context: ${relevantDocs.length} relevant documents used\n`;
+ if (relevantDocs.length > 0) {
+ response += `🔍 Best match: "${relevantDocs[0].doc.title}" (similarity: ${relevantDocs[0].similarity.toFixed(3)})\n`;
+ }
+ response += `📊 Settings: ${maxTokens} tokens, temperature ${temperature}\n`;
+ response += `🚀 Model: ${loadedModelName ? loadedModelName.split('/')[1] : 'LLM'} enhanced with document retrieval`;
+
+ responseDiv.textContent = response;
+
+ } catch (error) {
+ console.error('LLM+RAG error:', error);
+ responseDiv.textContent = `❌ Error generating response: ${error.message}`;
+ } finally {
+ llmRagBtn.disabled = false;
+ }
+ }
+
+ async function addDocumentManual() {
+ const title = document.getElementById('docTitle').value || `User Document ${documents.length - 2}`;
+ const content = document.getElementById('docContent').value;
+ const statusDiv = document.getElementById('addStatus');
+ const previewDiv = document.getElementById('docPreview');
+ const addBtn = document.getElementById('addBtn');
+
+ if (!content.trim()) {
+ statusDiv.style.display = 'block';
+ statusDiv.textContent = '❌ Please enter document content';
+ previewDiv.style.display = 'none';
+ return;
+ }
+
+ statusDiv.style.display = 'block';
+ statusDiv.innerHTML = '<div class="loading"></div> Adding document...';
+ addBtn.disabled = true;
+
+ try {
+ const docId = documents.length;
+ const newDocument = {
+ id: docId,
+ title: title,
+ content: content.trim(),
+ embedding: null
+ };
+
+ // Generate embedding if models are initialized
+ if (transformersReady && modelsInitialized && embeddingModel) {
+ statusDiv.innerHTML = '<div class="loading"></div> Generating AI embedding...';
+ newDocument.embedding = await generateEmbedding(content);
+ }
+
+ documents.push(newDocument);
+
+ const preview = content.length > 300 ? content.substring(0, 300) + '...' : content;
+ const status = `✅ Document added successfully!
+ 📄 Title: ${title}
+ 📊 Size: ${content.length.toLocaleString()} characters
+ 📚 Total documents: ${documents.length}${(transformersReady && modelsInitialized) ? '\n🧠 AI embedding generated automatically' : '\n⚠️ AI embedding will be generated when models are loaded'}`;
+
+ statusDiv.textContent = status;
+ previewDiv.style.display = 'block';
+ previewDiv.textContent = `📖 Preview:\n${preview}`;
+
+ // Clear form
+ document.getElementById('docTitle').value = '';
+ document.getElementById('docContent').value = '';
+
+ updateStatus();
+
+ } catch (error) {
+ console.error('Error adding document:', error);
+ statusDiv.textContent = `❌ Error adding document: ${error.message}`;
+ } finally {
+ addBtn.disabled = false;
+ }
+ }
+
+ // File upload functionality
+ function initFileUpload() {
+ const uploadArea = document.getElementById('uploadArea');
+ const fileInput = document.getElementById('fileInput');
+
+ if (!uploadArea || !fileInput) return;
+
+ // Click to select files
+ uploadArea.addEventListener('click', () => {
+ fileInput.click();
+ });
+
+ // Drag and drop functionality
+ uploadArea.addEventListener('dragover', (e) => {
+ e.preventDefault();
+ uploadArea.classList.add('dragover');
+ });
+
+ uploadArea.addEventListener('dragleave', (e) => {
+ e.preventDefault();
+ uploadArea.classList.remove('dragover');
+ });
+
+ uploadArea.addEventListener('drop', (e) => {
+ e.preventDefault();
+ uploadArea.classList.remove('dragover');
+ const files = e.dataTransfer.files;
+ handleFiles(files);
+ });
+
+ // File input change
+ fileInput.addEventListener('change', (e) => {
+ handleFiles(e.target.files);
+ });
+ }
+
+ async function handleFiles(files) {
+ const uploadStatus = document.getElementById('uploadStatus');
+ const uploadProgress = document.getElementById('uploadProgress');
+ const uploadProgressBar = document.getElementById('uploadProgressBar');
+ const uploadProgressText = document.getElementById('uploadProgressText');
+
+ if (files.length === 0) return;
+
+ uploadStatus.style.display = 'block';
+ uploadProgress.style.display = 'block';
+ uploadStatus.textContent = '';
+
+ let successCount = 0;
+ let errorCount = 0;
+
+ for (let i = 0; i < files.length; i++) {
+ const file = files[i];
+ const progress = ((i + 1) / files.length) * 100;
+
+ uploadProgressBar.style.width = progress + '%';
+ if (file.size > 10000) {
+ uploadProgressText.textContent = `Processing large file: ${file.name} (${i + 1}/${files.length}) - chunking for better search...`;
+ } else {
+ uploadProgressText.textContent = `Processing ${file.name} (${i + 1}/${files.length})...`;
+ }
+
+ try {
+ await processFile(file);
+ successCount++;
+ } catch (error) {
+ console.error(`Error processing ${file.name}:`, error);
+ errorCount++;
+ }
+ }
+
+ uploadProgress.style.display = 'none';
+
+ let statusText = `✅ Upload complete!\n📁 ${successCount} files processed successfully`;
+ if (errorCount > 0) {
+ statusText += `\n❌ ${errorCount} files failed to process`;
+ }
+ statusText += `\n📊 Total documents: ${documents.length}`;
+ statusText += `\n🧩 Large files automatically chunked for better search`;
+
+ uploadStatus.textContent = statusText;
+ updateStatus();
+
+ // Clear file input
+ document.getElementById('fileInput').value = '';
+ }
+
+ // Document chunking function for large files
+ function chunkDocument(content, maxChunkSize = 1000) {
+ const sentences = content.split(/[.!?]+/).filter(s => s.trim().length > 0);
+ const chunks = [];
+ let currentChunk = '';
+
+ for (let sentence of sentences) {
+ sentence = sentence.trim();
+ if (currentChunk.length + sentence.length > maxChunkSize && currentChunk.length > 0) {
+ chunks.push(currentChunk.trim());
+ currentChunk = sentence;
+ } else {
+ currentChunk += (currentChunk ? '. ' : '') + sentence;
+ }
+ }
+
+ if (currentChunk.trim()) {
+ chunks.push(currentChunk.trim());
+ }
+
+ return chunks.length > 0 ? chunks : [content];
+ }
+
+ async function processFile(file) {
+ return new Promise((resolve, reject) => {
+ const reader = new FileReader();
+
+ reader.onload = async function(e) {
+ try {
+ const content = e.target.result.trim();
+ const baseTitle = file.name.replace(/\.[^/.]+$/, ""); // Remove file extension
+
+ // Check if document is large and needs chunking
+ if (content.length > 2000) {
+ // Chunk large documents
+ const chunks = chunkDocument(content, 1500);
+ console.log(`📄 Chunking large file: ${chunks.length} chunks created from ${content.length} characters`);
+
+ for (let i = 0; i < chunks.length; i++) {
+ const chunkTitle = chunks.length > 1 ? `${baseTitle} (Part ${i + 1}/${chunks.length})` : baseTitle;
+ const newDocument = {
+ id: documents.length,
+ title: chunkTitle,
+ content: chunks[i],
+ embedding: null
+ };
+
+ // Generate embedding if models are loaded
+ if (transformersReady && modelsInitialized && embeddingModel) {
+ newDocument.embedding = await generateEmbedding(chunks[i]);
+ }
+
+ documents.push(newDocument);
+ }
+ } else {
+ // Small document - process as single document
+ const newDocument = {
+ id: documents.length,
+ title: baseTitle,
+ content: content,
+ embedding: null
+ };
+
+ // Generate embedding if models are loaded
+ if (transformersReady && modelsInitialized && embeddingModel) {
+ newDocument.embedding = await generateEmbedding(content);
+ }
+
+ documents.push(newDocument);
+ }
+
+ resolve();
+
+ } catch (error) {
+ reject(error);
+ }
+ };
+
+ reader.onerror = function() {
+ reject(new Error(`Failed to read file: ${file.name}`));
+ };
+
+ // Read file as text
+ reader.readAsText(file);
+ });
+ }
+
+ async function testSystem() {
+ const outputDiv = document.getElementById('testOutput');
+ const testBtn = document.getElementById('testBtn');
+
+ outputDiv.style.display = 'block';
+ outputDiv.innerHTML = '<div class="loading"></div> Running system tests...';
+ testBtn.disabled = true;
+
+ try {
+ let output = `🧪 System Test Results:\n\n`;
+ output += `📊 Documents: ${documents.length} loaded\n`;
+ output += `🔧 Transformers.js: ${transformersReady ? '✅ Ready' : '❌ Not ready'}\n`;
+ output += `🧠 Embedding Model: ${embeddingModel ? '✅ Loaded' : '❌ Not loaded'}\n`;
+ output += `🤖 QA Model: ${qaModel ? '✅ Loaded' : '❌ Not loaded'}\n`;
+ output += `🚀 LLM Model: ${llmModel ? '✅ Loaded' : '❌ Not loaded'}\n\n`;
+
+ if (transformersReady && modelsInitialized && embeddingModel) {
+ output += `🔍 Testing embedding generation...\n`;
+ const testEmbedding = await generateEmbedding("test sentence");
+ output += `✅ Embedding test: Generated ${testEmbedding.length}D vector\n\n`;
+
+ output += `🔍 Testing semantic search...\n`;
+ const testQuery = "artificial intelligence";
+ const queryEmbedding = await generateEmbedding(testQuery);
+
+ let testResults = [];
+ documents.forEach(doc => {
+ if (doc.embedding) {
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
+ testResults.push({ doc, similarity });
+ }
+ });
+ testResults.sort((a, b) => b.similarity - a.similarity);
+
+ if (testResults.length > 0) {
+ output += `✅ Search test: Found ${testResults.length} results\n`;
+ output += `📄 Top result: "${testResults[0].doc.title}" (similarity: ${testResults[0].similarity.toFixed(3)})\n\n`;
+ }
+
+ if (qaModel) {
+ output += `🤖 Testing QA model...\n`;
+ const context = documents[0].content.substring(0, 500);
+ const testQuestion = "What is artificial intelligence?";
+ const qaResult = await qaModel(testQuestion, context);
+ output += `✅ QA test: Generated answer with ${(qaResult.score * 100).toFixed(1)}% confidence\n`;
+ output += `💬 Answer: ${qaResult.answer.substring(0, 100)}...\n\n`;
+ }
+
+ if (llmModel) {
+ output += `🚀 Testing LLM model...\n`;
+ const testPrompt = "Explain artificial intelligence:";
+ const llmResult = await llmModel(testPrompt, { max_new_tokens: 30, temperature: 0.7, do_sample: true, return_full_text: false });
+ output += `✅ LLM test: Generated text completion\n`;
+ output += `💬 Generated: "${llmResult[0].generated_text.substring(0, 100)}..."\n\n`;
+ }
+
+ output += `🎉 All tests passed! System is fully operational.`;
+ } else {
+ output += `⚠️ Models not initialized. Click "Initialize AI Models" first.`;
+ }
+
+ outputDiv.textContent = output;
+
+ } catch (error) {
+ console.error('Test error:', error);
+ outputDiv.textContent = `❌ Test failed: ${error.message}`;
+ } finally {
+ testBtn.disabled = false;
+ }
+ }
+
+ // Initialize UI
+ updateStatus();
+
+ // Show version info in console
+ console.log('🤖 AI-Powered RAG System with Transformers.js');
+ console.log('Models: Xenova/all-MiniLM-L6-v2, Xenova/distilbert-base-cased-distilled-squad');
+
+ // Export functions for global access
+ window.showTab = showTab;
+ window.updateSliderValue = updateSliderValue;
+ window.initializeModels = initializeModels;
+ window.searchDocumentsSemantic = searchDocumentsSemantic;
+ window.searchDocumentsKeyword = searchDocumentsKeyword;
+ window.chatWithRAG = chatWithRAG;
+ window.chatWithLLM = chatWithLLM;
+ window.chatWithLLMRAG = chatWithLLMRAG;
+ window.addDocumentManual = addDocumentManual;
+ window.testSystem = testSystem;
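The `chunkDocument` helper added above packs whole sentences into fixed-size chunks before embedding. It can be exercised on its own; a minimal standalone sketch (plain Node, no DOM or model dependencies — the tiny `maxChunkSize` is only to make the split visible):

```javascript
// Sentence-based chunker, as added in this commit: split on sentence
// boundaries, then greedily pack sentences up to maxChunkSize characters.
function chunkDocument(content, maxChunkSize = 1000) {
    const sentences = content.split(/[.!?]+/).filter(s => s.trim().length > 0);
    const chunks = [];
    let currentChunk = '';

    for (let sentence of sentences) {
        sentence = sentence.trim();
        // Start a new chunk when adding this sentence would overflow the limit.
        if (currentChunk.length + sentence.length > maxChunkSize && currentChunk.length > 0) {
            chunks.push(currentChunk.trim());
            currentChunk = sentence;
        } else {
            currentChunk += (currentChunk ? '. ' : '') + sentence;
        }
    }

    if (currentChunk.trim()) {
        chunks.push(currentChunk.trim());
    }

    // Fall back to the whole document if no sentence boundary was found.
    return chunks.length > 0 ? chunks : [content];
}

console.log(chunkDocument('One. Two. Three. Four.', 10)); // → ['One. Two', 'Three. Four']
```

Because chunks never cut a sentence in half, each chunk stays a coherent unit for the embedding model, at the cost of slightly uneven chunk sizes.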
index.html CHANGED
@@ -1,29 +1,224 @@
 <!DOCTYPE html>
 <html lang="en">
-
 <head>
- <meta charset="UTF-8" />
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>🤖 AI-Powered Document Search & RAG Chat</title>
+ <script type="module">
+ // Import transformers.js 3.0.0 from CDN (new Hugging Face ownership)
+ import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
+
+ // Make available globally
+ window.transformers = { pipeline, env };
+ window.transformersLoaded = true;
+
+ console.log('✅ Transformers.js 3.0.0 loaded via ES modules (Hugging Face)');
+ </script>
+ <script src="https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0/dist/transformers.min.js"></script>
 <link rel="stylesheet" href="style.css" />
-
- <meta name="viewport" content="width=device-width, initial-scale=1.0" />
- <title>Transformers.js - Object Detection</title>
 </head>
-
 <body>
- <h1>Object Detection w/ 🤗 Transformers.js</h1>
- <label id="container" for="upload">
- <svg width="25" height="25" viewBox="0 0 25 25" fill="none" xmlns="http://www.w3.org/2000/svg">
- <path fill="#000"
- d="M3.5 24.3a3 3 0 0 1-1.9-.8c-.5-.5-.8-1.2-.8-1.9V2.9c0-.7.3-1.3.8-1.9.6-.5 1.2-.7 2-.7h18.6c.7 0 1.3.2 1.9.7.5.6.7 1.2.7 2v18.6c0 .7-.2 1.4-.7 1.9a3 3 0 0 1-2 .8H3.6Zm0-2.7h18.7V2.9H3.5v18.7Zm2.7-2.7h13.3c.3 0 .5 0 .6-.3v-.7l-3.7-5a.6.6 0 0 0-.6-.2c-.2 0-.4 0-.5.3l-3.5 4.6-2.4-3.3a.6.6 0 0 0-.6-.3c-.2 0-.4.1-.5.3l-2.7 3.6c-.1.2-.2.4 0 .7.1.2.3.3.6.3Z">
- </path>
- </svg>
- Click to upload image
- <label id="example">(or try example)</label>
- </label>
- <label id="status">Loading model...</label>
- <input id="upload" type="file" accept="image/*" />
+ <div class="container">
+ <div class="header">
+ <h1>🤖 AI-Powered Document Search & RAG Chat</h1>
+ <p>Real transformer models running in your browser with Transformers.js</p>
+ </div>
+
+ <div class="status" id="status">
+ 📊 Documents: 3 | 🤖 AI Models: Not loaded | 🧠 Embedding Model: Not loaded
+ </div>
+
+ <div class="tabs">
+ <button class="tab active" onclick="showTab('init')">🚀 Initialize AI</button>
+ <button class="tab" onclick="showTab('chat')">🤖 AI Chat (RAG)</button>
+ <button class="tab" onclick="showTab('llm')">🚀 LLM Chat</button>
+ <button class="tab" onclick="showTab('search')">🔍 Semantic Search</button>
+ <button class="tab" onclick="showTab('add')">📝 Add Documents</button>
+ <button class="tab" onclick="showTab('test')">🧪 System Test</button>
+ </div>
+
+ <!-- Initialize AI Tab -->
+ <div id="init" class="tab-content active">
+ <div class="alert alert-info">
+ <strong>🚀 Real AI Models!</strong> This system uses actual transformer models via Transformers.js.
+ </div>
+
+ <div class="model-info">
+ <h4>🧠 Models Being Loaded:</h4>
+ <p><strong>Embedding Model:</strong> Xenova/all-MiniLM-L6-v2 (384-dimensional sentence embeddings)</p>
+ <p><strong>Q&A Model:</strong> Xenova/distilbert-base-cased-distilled-squad (Question Answering)</p>
+ <p><strong>LLM Model:</strong> Auto-selected GPT-2 or DistilGPT-2 (Transformers.js 3.0.0)</p>
+ <p><strong>Size:</strong> ~100MB total (cached after first load)</p>
+ <p><strong>Performance:</strong> CPU inference, ~2-8 seconds per operation</p>
+ <p><strong>Status:</strong> <span id="transformersStatus">⏳ Loading library...</span></p>
+ </div>
+
+ <div class="alert alert-warning">
+ <strong>⚠️ First Load:</strong> Model downloading may take 1-2 minutes depending on your internet connection. Models are cached for subsequent uses.
+ </div>
+
+ <button onclick="initializeModels()" id="initBtn" style="font-size: 18px; padding: 15px 30px;">
+ 🚀 Initialize Real AI Models
+ </button>
+
+ <div id="initProgress" style="display: none;">
+ <div class="progress">
+ <div class="progress-bar" id="progressBar" style="width: 0%"></div>
+ </div>
+ <p id="progressText">Preparing to load models...</p>
+ </div>
+
+ <div id="initStatus" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- AI Chat Tab -->
+ <div id="chat" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🤖 Real AI Chat!</strong> Ask questions and get answers from actual transformer models.
+ </div>
+ <div class="alert alert-success">
+ <strong>💡 Try asking:</strong><br>
+ • "What is artificial intelligence?"<br>
+ • "How does space exploration work?"<br>
+ • "What are renewable energy sources?"<br>
+ • "Explain machine learning in simple terms"
+ </div>
+ <div class="grid">
+ <div>
+ <label for="chatQuestion">Your Question</label>
+ <textarea id="chatQuestion" rows="3" placeholder="Ask anything about the documents..."></textarea>
+ </div>
+ <div>
+ <label for="maxContext">Context Documents</label>
+ <div class="slider-container">
+ <input type="range" id="maxContext" class="slider" min="1" max="5" value="3" oninput="updateSliderValue('maxContext')">
+ <span id="maxContextValue" class="slider-value">3</span>
+ </div>
+ </div>
+ </div>
+ <button onclick="chatWithRAG()" id="chatBtn">🤖 Ask AI</button>
+ <div id="chatResponse" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- LLM Chat Tab -->
+ <div id="llm" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🚀 Pure LLM Chat!</strong> Chat with a language model (GPT-2 or Llama2.c) running in your browser.
+ </div>
+ <div class="alert alert-success">
+ <strong>💡 Try these prompts:</strong><br>
+ • "Tell me a story about space exploration"<br>
+ • "Explain machine learning in simple terms"<br>
+ • "Write a poem about artificial intelligence"<br>
+ • "What are the benefits of renewable energy?"
+ </div>
+ <div class="grid">
+ <div>
+ <label for="llmPrompt">Your Prompt</label>
+ <textarea id="llmPrompt" rows="3" placeholder="Enter your prompt for the language model..."></textarea>
+ </div>
+ <div>
+ <label for="maxTokens">Max Tokens</label>
+ <div class="slider-container">
+ <input type="range" id="maxTokens" class="slider" min="20" max="200" value="100" oninput="updateSliderValue('maxTokens')">
+ <span id="maxTokensValue" class="slider-value">100</span>
+ </div>
+ <label for="temperature">Temperature</label>
+ <div class="slider-container">
+ <input type="range" id="temperature" class="slider" min="0.1" max="1.5" step="0.1" value="0.7" oninput="updateSliderValue('temperature')">
+ <span id="temperatureValue" class="slider-value">0.7</span>
+ </div>
+ </div>
+ </div>
+ <div style="display: flex; gap: 10px;">
+ <button onclick="chatWithLLM()" id="llmBtn">🚀 Generate Text</button>
+ <button class="btn-secondary" onclick="chatWithLLMRAG()" id="llmRagBtn">🤖 LLM + RAG</button>
+ </div>
+ <div id="llmResponse" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- Semantic Search Tab -->
+ <div id="search" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🔮 Real semantic search!</strong> Using transformer embeddings to find documents by meaning.
+ </div>
+ <div class="grid">
+ <div>
+ <label for="searchQuery">Search Query</label>
+ <input type="text" id="searchQuery" placeholder="Try: 'machine learning', 'Mars missions', 'solar power'">
+ </div>
+ <div>
+ <label for="maxResults">Max Results</label>
+ <div class="slider-container">
+ <input type="range" id="maxResults" class="slider" min="1" max="10" value="5" oninput="updateSliderValue('maxResults')">
+ <span id="maxResultsValue" class="slider-value">5</span>
+ </div>
+ </div>
+ </div>
+ <div style="display: flex; gap: 10px;">
+ <button onclick="searchDocumentsSemantic()" id="searchBtn">🔮 Semantic Search</button>
+ <button class="btn-secondary" onclick="searchDocumentsKeyword()">🔤 Keyword Search</button>
+ </div>
+ <div id="searchResults" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- Add Documents Tab -->
+ <div id="add" class="tab-content">
+ <div class="alert alert-info">
+ <strong>📚 Expand your knowledge base!</strong> Upload files or paste text with real AI embeddings.
+ </div>
+
+ <!-- File Upload Section -->
+ <div class="upload-section">
+ <h4>📁 Upload Files</h4>
+ <div class="upload-area" id="uploadArea">
+ <div class="upload-content">
+ <div class="upload-icon">📄</div>
+ <div class="upload-text">
+ <strong>Drop files here or click to select</strong>
+ <br>Supports: .md, .txt, .json, .csv, .html, .js, .py, .xml
+ </div>
+ </div>
+ <input type="file" id="fileInput" accept=".md,.txt,.json,.csv,.html,.js,.py,.xml,.rst,.yaml,.yml" multiple style="display: none;">
+ </div>
+ <div id="uploadProgress" class="progress-container" style="display: none;">
+ <div class="progress-bar" id="uploadProgressBar"></div>
+ <div class="progress-text" id="uploadProgressText">Processing files...</div>
+ </div>
+ <div id="uploadStatus" class="result" style="display: none;"></div>
+ </div>
+
+ <div class="divider">OR</div>
+
+ <!-- Manual Entry Section -->
+ <div class="manual-entry">
+ <h4>✏️ Manual Entry</h4>
+ <div class="form-group">
+ <label for="docTitle">Document Title (optional)</label>
+ <input type="text" id="docTitle" placeholder="Enter document title...">
+ </div>
+ <div class="form-group">
+ <label for="docContent">Document Content</label>
+ <textarea id="docContent" rows="8" placeholder="Paste your document text here..."></textarea>
+ </div>
+ <button onclick="addDocumentManual()" id="addBtn">📝 Add Document</button>
+ <div class="grid">
+ <div id="addStatus" class="result" style="display: none;"></div>
+ <div id="docPreview" class="result" style="display: none;"></div>
+ </div>
+ </div>
+ </div>
+
+ <!-- System Test Tab -->
+ <div id="test" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🧪 Test the system</strong> to verify AI models are working correctly.
+ </div>
+ <button onclick="testSystem()" id="testBtn">🧪 Run System Test</button>
+ <div id="testOutput" class="result" style="display: none;"></div>
+ </div>
+ </div>

 <script src="index.js" type="module"></script>
 </body>
-
 </html>
index.js CHANGED
@@ -1,76 +1,954 @@
1
- import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.4.1';
 
2
 
3
- // Reference the elements that we will need
4
- const status = document.getElementById('status');
5
- const fileUpload = document.getElementById('upload');
6
- const imageContainer = document.getElementById('container');
7
- const example = document.getElementById('example');
8
 
9
- const EXAMPLE_URL = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/city-streets.jpg';
10
 
11
- // Create a new object detection pipeline
12
- status.textContent = 'Loading model...';
13
- const detector = await pipeline('object-detection', 'Xenova/detr-resnet-50');
14
- status.textContent = 'Ready';
15
 
16
- example.addEventListener('click', (e) => {
17
- e.preventDefault();
18
- detect(EXAMPLE_URL);
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
19
  });
20
 
21
- fileUpload.addEventListener('change', function (e) {
22
- const file = e.target.files[0];
23
- if (!file) {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
24
  return;
25
  }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
26
 
27
- const reader = new FileReader();
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
28
 
29
- // Set up a callback when the file is loaded
30
- reader.onload = e2 => detect(e2.target.result);
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
31
 
32
- reader.readAsDataURL(file);
33
- });
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
34
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
35
 
36
- // Detect objects in the image
37
- async function detect(img) {
38
- imageContainer.innerHTML = '';
39
- imageContainer.style.backgroundImage = `url(${img})`;
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
40
 
41
- status.textContent = 'Analysing...';
42
- const output = await detector(img, {
43
- threshold: 0.5,
44
- percentage: true,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
45
  });
46
- status.textContent = '';
47
- output.forEach(renderBox);
48
  }
49
 
50
- // Render a bounding box and label on the image
51
- function renderBox({ box, label }) {
52
- const { xmax, xmin, ymax, ymin } = box;
53
-
54
- // Generate a random color for the box
55
- const color = '#' + Math.floor(Math.random() * 0xFFFFFF).toString(16).padStart(6, 0);
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
56
 
57
- // Draw the box
58
- const boxElement = document.createElement('div');
59
- boxElement.className = 'bounding-box';
60
- Object.assign(boxElement.style, {
61
- borderColor: color,
62
- left: 100 * xmin + '%',
63
- top: 100 * ymin + '%',
64
- width: 100 * (xmax - xmin) + '%',
65
- height: 100 * (ymax - ymin) + '%',
66
- })
 
 
 
 
 
 
 
 
 
 
 
 
67
 
68
- // Draw label
69
- const labelElement = document.createElement('span');
70
- labelElement.textContent = label;
71
- labelElement.className = 'bounding-box-label';
72
- labelElement.style.backgroundColor = color;
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
73
 
74
- boxElement.appendChild(labelElement);
75
- imageContainer.appendChild(boxElement);
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
76
  }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ // Import transformers.js 3.0.0 from CDN (new Hugging Face ownership)
2
+ import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
3
 
4
+ // Make available globally
5
+ window.transformers = { pipeline, env };
6
+ window.transformersLoaded = true;
 
 
7
 
8
+ console.log('✅ Transformers.js 3.0.0 loaded via ES modules (Hugging Face)');
9
 
10
+ // Global variables for transformers.js
11
+ let transformersPipeline = null;
12
+ let transformersEnv = null;
13
+ let transformersReady = false;
14
 
15
+ // Document storage and AI state
16
+ let documents = [
17
+ {
18
+ id: 0,
19
+ title: "Artificial Intelligence Overview",
20
+ content: "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines that work and react like humans. Some activities computers with AI are designed for include speech recognition, learning, planning, and problem-solving. AI is used in healthcare, finance, transportation, and entertainment. Machine learning enables computers to learn from experience without explicit programming. Deep learning uses neural networks to understand complex patterns in data.",
21
+ embedding: null
22
+ },
23
+ {
24
+ id: 1,
25
+ title: "Space Exploration",
26
+ content: "Space exploration is the ongoing discovery and exploration of celestial structures in outer space through evolving space technology. Physical exploration is conducted by unmanned robotic probes and human spaceflight. Space exploration has been used for geopolitical rivalries like the Cold War. The early era was driven by a Space Race between the Soviet Union and United States. Modern exploration includes Mars missions, the International Space Station, and satellite programs.",
27
+ embedding: null
28
+ },
29
+ {
30
+ id: 2,
31
+ title: "Renewable Energy",
32
+ content: "Renewable energy comes from naturally replenished resources on a human timescale. It includes sunlight, wind, rain, tides, waves, and geothermal heat. Renewable energy contrasts with fossil fuels that are used faster than replenished. Most renewable sources are sustainable. Solar energy is abundant and promising. Wind energy and hydroelectric power are major contributors to renewable generation worldwide.",
33
+ embedding: null
34
+ }
35
+ ];
36
+
37
+ let embeddingModel = null;
38
+ let qaModel = null;
39
+ let llmModel = null;
40
+ let loadedModelName = '';
41
+ let modelsInitialized = false;
42
+
43
+ // Calculate cosine similarity between two vectors
44
+ function cosineSimilarity(a, b) {
45
+ const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
46
+ const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
47
+ const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
48
+
49
+ if (magnitudeA === 0 || magnitudeB === 0) return 0;
50
+ return dotProduct / (magnitudeA * magnitudeB);
51
+ }
52
+
53
+ // Initialize transformers.js when the script loads
54
+ async function initTransformers() {
55
+ try {
56
+ console.log('🔄 Initializing Transformers.js...');
57
+
58
+ // Try ES modules first (preferred method)
59
+ if (window.transformers && window.transformersLoaded) {
60
+ console.log('✅ Using ES modules version (Transformers.js 3.0.0)');
61
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.transformers);
62
+ }
63
+ // Fallback to UMD version
64
+ else if (window.Transformers) {
65
+ console.log('✅ Using UMD version (Transformers.js 3.0.0)');
66
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.Transformers);
67
+ }
68
+ // Wait for library to load
69
+ else {
70
+ console.log('⏳ Waiting for library to load...');
71
+ let attempts = 0;
72
+ while (!window.Transformers && !window.transformersLoaded && attempts < 50) {
73
+ await new Promise(resolve => setTimeout(resolve, 200));
74
+ attempts++;
75
+ }
76
+
77
+ if (window.transformers && window.transformersLoaded) {
78
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.transformers);
79
+ } else if (window.Transformers) {
80
+ ({ pipeline: transformersPipeline, env: transformersEnv } = window.Transformers);
81
+ } else {
82
+ throw new Error('Failed to load Transformers.js library');
83
+ }
84
+ }
85
+
86
+ // Configure transformers.js with minimal settings
87
+ if (transformersEnv) {
88
+ transformersEnv.allowLocalModels = false;
89
+ transformersEnv.allowRemoteModels = true;
90
+ // Let Transformers.js use default WASM paths for better compatibility
91
+ }
92
+
93
+ transformersReady = true;
94
+ console.log('✅ Transformers.js initialized successfully');
95
+
96
+ // Update UI to show ready state
97
+ updateStatus();
98
+
99
+ // Update status indicator
100
+ const statusSpan = document.getElementById('transformersStatus');
101
+ if (statusSpan) {
102
+ statusSpan.textContent = '✅ Ready!';
103
+ statusSpan.style.color = 'green';
104
+ }
105
+
106
+ } catch (error) {
107
+ console.error('❌ Error initializing Transformers.js:', error);
108
+
109
+ // Show error in UI
110
+ const statusDiv = document.getElementById('status');
111
+ if (statusDiv) {
112
+ statusDiv.textContent = `❌ Failed to load Transformers.js: ${error.message}`;
113
+ statusDiv.style.color = 'red';
114
+ }
115
+
116
+ // Update status indicator
117
+ const statusSpan = document.getElementById('transformersStatus');
118
+ if (statusSpan) {
119
+ statusSpan.textContent = `❌ Failed: ${error.message}`;
120
+ statusSpan.style.color = 'red';
121
+ }
122
+ }
123
+ }
124
+
125
+ // Initialize when page loads
126
+ document.addEventListener('DOMContentLoaded', function() {
127
+ initTransformers();
128
+ initFileUpload();
129
  });
130
 
131
+ // UI Functions
132
+ function showTab(tabName) {
133
+ // Hide all tabs
134
+ document.querySelectorAll('.tab-content').forEach(tab => {
135
+ tab.classList.remove('active');
136
+ });
137
+ document.querySelectorAll('.tab').forEach(button => {
138
+ button.classList.remove('active');
139
+ });
140
+
141
+ // Show selected tab
142
+ document.getElementById(tabName).classList.add('active');
143
+ document.querySelector(`.tab[onclick*="${tabName}"]`)?.classList.add('active'); // avoid the deprecated global `event`; assumes the tab buttons use inline onclick handlers
144
+ }
145
+
146
+ function updateSliderValue(sliderId) {
147
+ const slider = document.getElementById(sliderId);
148
+ const valueSpan = document.getElementById(sliderId + 'Value');
149
+ valueSpan.textContent = slider.value;
150
+ }
151
+
152
+ function updateStatus() {
153
+ const status = document.getElementById('status');
154
+ const transformersStatus = transformersReady ? 'Ready' : 'Not ready';
155
+ const embeddingStatus = embeddingModel ? 'Loaded' : 'Not loaded';
156
+ const qaStatus = qaModel ? 'Loaded' : 'Not loaded';
157
+ const llmStatus = llmModel ? 'Loaded' : 'Not loaded';
158
+ status.textContent = `📊 Documents: ${documents.length} | 🔧 Transformers.js: ${transformersStatus} | 🤖 QA: ${qaStatus} | 🧠 Embedding: ${embeddingStatus} | 🚀 LLM: ${llmStatus}`;
159
+ }
160
+
161
+ function updateProgress(percent, text) {
162
+ const progressBar = document.getElementById('progressBar');
163
+ const progressText = document.getElementById('progressText');
164
+ progressBar.style.width = percent + '%';
165
+ progressText.textContent = text;
166
+ }
167
+
168
+ // AI Functions
169
+ async function initializeModels() {
170
+ const statusDiv = document.getElementById('initStatus');
171
+ const progressDiv = document.getElementById('initProgress');
172
+ const initBtn = document.getElementById('initBtn');
173
+
174
+ statusDiv.style.display = 'block';
175
+ progressDiv.style.display = 'block';
176
+ initBtn.disabled = true;
177
+
178
+ try {
179
+ // Check if transformers.js is ready
180
+ if (!transformersReady || !transformersPipeline) {
181
+ updateProgress(5, "Waiting for Transformers.js to initialize...");
182
+ statusDiv.innerHTML = '🔄 Initializing Transformers.js library...';
183
+
184
+ // Wait for transformers.js to be ready
185
+ let attempts = 0;
186
+ while (!transformersReady && attempts < 30) {
187
+ await new Promise(resolve => setTimeout(resolve, 1000));
188
+ attempts++;
189
+ }
190
+
191
+ if (!transformersReady) {
192
+ throw new Error('Transformers.js failed to initialize. Please refresh the page.');
193
+ }
194
+ }
195
+
196
+ updateProgress(10, "Loading embedding model...");
197
+ statusDiv.innerHTML = '🔄 Loading embedding model (Xenova/all-MiniLM-L6-v2)...';
198
+
199
+ // Load embedding model with progress tracking
200
+ embeddingModel = await transformersPipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
201
+ progress_callback: (progress) => {
202
+ if (progress.status === 'downloading') {
203
+ const percent = progress.loaded && progress.total ?
204
+ Math.round((progress.loaded / progress.total) * 100) : 0;
205
+ statusDiv.innerHTML = `🔄 Downloading embedding model: ${percent}%`;
206
+ }
207
+ }
208
+ });
209
+
210
+ updateProgress(40, "Loading question-answering model...");
211
+ statusDiv.innerHTML = '🔄 Loading QA model (Xenova/distilbert-base-cased-distilled-squad)...';
212
+
213
+ // Load QA model with progress tracking
214
+ qaModel = await transformersPipeline('question-answering', 'Xenova/distilbert-base-cased-distilled-squad', {
215
+ progress_callback: (progress) => {
216
+ if (progress.status === 'downloading') {
217
+ const percent = progress.loaded && progress.total ?
218
+ Math.round((progress.loaded / progress.total) * 100) : 0;
219
+ statusDiv.innerHTML = `🔄 Downloading QA model: ${percent}%`;
220
+ }
221
+ }
222
+ });
223
+
224
+ updateProgress(70, "Loading language model...");
225
+ statusDiv.innerHTML = '🔄 Loading LLM (trying SmolLM models)...';
226
+
227
+ // Load LLM model - Stable Transformers.js 3.0.0 configuration
228
+ const modelsToTry = [
229
+ {
230
+ name: 'Xenova/gpt2',
231
+ options: {}
232
+ },
233
+ {
234
+ name: 'Xenova/distilgpt2',
235
+ options: {}
236
+ }
237
+ ];
238
+
239
+ let modelLoaded = false;
240
+ for (const model of modelsToTry) {
241
+ try {
242
+ console.log(`Trying to load ${model.name}...`);
243
+ statusDiv.innerHTML = `🔄 Loading LLM (${model.name})...`;
244
+
245
+ // Load LLM with progress tracking
246
+ llmModel = await transformersPipeline('text-generation', model.name, {
247
+ progress_callback: (progress) => {
248
+ if (progress.status === 'downloading') {
249
+ const percent = progress.loaded && progress.total ?
250
+ Math.round((progress.loaded / progress.total) * 100) : 0;
251
+ statusDiv.innerHTML = `🔄 Downloading ${model.name}: ${percent}%`;
252
+ }
253
+ }
254
+ });
255
+
256
+ console.log(`✅ Successfully loaded ${model.name}`);
257
+ loadedModelName = model.name;
258
+ modelLoaded = true;
259
+ break;
260
+ } catch (error) {
261
+ console.warn(`${model.name} failed:`, error);
262
+ }
263
+ }
264
+
265
+ if (!modelLoaded) {
266
+ throw new Error('Failed to load any LLM model');
267
+ }
268
+
269
+ updateProgress(85, "Generating embeddings for documents...");
270
+ statusDiv.innerHTML = '🔄 Generating embeddings for existing documents...';
271
+
272
+ // Generate embeddings for all existing documents
273
+ for (let i = 0; i < documents.length; i++) {
274
+ const doc = documents[i];
275
+ updateProgress(85 + (i / documents.length) * 10, `Processing document ${i + 1}/${documents.length}...`);
276
+ doc.embedding = await generateEmbedding(doc.content);
277
+ }
278
+
279
+ updateProgress(100, "Initialization complete!");
280
+ modelsInitialized = true;
281
+
282
+ statusDiv.innerHTML = `✅ AI Models initialized successfully!
283
+ 🧠 Embedding Model: Xenova/all-MiniLM-L6-v2 (384 dimensions)
284
+ 🤖 QA Model: Xenova/distilbert-base-cased-distilled-squad
285
+ 🚀 LLM Model: ${loadedModelName} (Language model for text generation)
286
+ 📚 Documents processed: ${documents.length}
287
+ 🔮 Ready for semantic search, Q&A, and LLM chat!
288
+
289
+ 📊 Model Info:
290
+ • Embedding model size: ~23MB
291
+ • QA model size: ~28MB
292
+ • LLM model size: ~15-50MB (depending on model loaded)
293
+ • Total memory usage: ~70-100MB
294
+ • Inference speed: ~2-8 seconds per operation`;
295
+
296
+ updateStatus();
297
+
298
+ } catch (error) {
299
+ console.error('Error initializing models:', error);
300
+ statusDiv.innerHTML = `❌ Error initializing models: ${error.message}
301
+
302
+ Please check your internet connection and try again.`;
303
+ updateProgress(0, "Initialization failed");
304
+ } finally {
305
+ initBtn.disabled = false;
306
+ setTimeout(() => {
307
+ progressDiv.style.display = 'none';
308
+ }, 2000);
309
+ }
310
+ }
311
+
312
+ async function generateEmbedding(text) {
313
+ if (!transformersReady || !transformersPipeline) {
314
+ throw new Error('Transformers.js not initialized');
315
+ }
316
+
317
+ if (!embeddingModel) {
318
+ throw new Error('Embedding model not loaded');
319
+ }
320
+
321
+ try {
322
+ const output = await embeddingModel(text, { pooling: 'mean', normalize: true });
323
+ return Array.from(output.data);
324
+ } catch (error) {
325
+ console.error('Error generating embedding:', error);
326
+ throw error;
327
+ }
328
+ }
329
+
330
+ async function searchDocumentsSemantic() {
331
+ const query = document.getElementById('searchQuery').value;
332
+ const maxResults = parseInt(document.getElementById('maxResults').value);
333
+ const resultsDiv = document.getElementById('searchResults');
334
+ const searchBtn = document.getElementById('searchBtn');
335
+
336
+ if (!query.trim()) {
337
+ resultsDiv.style.display = 'block';
338
+ resultsDiv.textContent = '❌ Please enter a search query';
339
+ return;
340
+ }
341
+
342
+ if (!transformersReady || !modelsInitialized || !embeddingModel) {
343
+ resultsDiv.style.display = 'block';
344
+ resultsDiv.textContent = '❌ Please initialize AI models first!';
345
  return;
346
  }
347
+
348
+ resultsDiv.style.display = 'block';
349
+ resultsDiv.innerHTML = '<div class="loading"></div> Generating query embedding and searching...';
350
+ searchBtn.disabled = true;
351
+
352
+ try {
353
+ // Generate embedding for query
354
+ const queryEmbedding = await generateEmbedding(query);
355
+
356
+ // Calculate similarities
357
+ const results = [];
358
+ documents.forEach(doc => {
359
+ if (doc.embedding) {
360
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
361
+ results.push({ doc, similarity });
362
+ }
363
+ });
364
+
365
+ // Sort by similarity
366
+ results.sort((a, b) => b.similarity - a.similarity);
367
+
368
+ if (results.length === 0) {
369
+ resultsDiv.textContent = `❌ No documents with embeddings found for '${query}'`;
370
+ return;
371
+ }
372
+
373
+ let output = `🔍 Semantic search results for '${query}':\n\n`;
374
+ results.slice(0, maxResults).forEach((result, i) => {
375
+ const doc = result.doc;
376
+ const similarity = result.similarity;
377
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
378
+ output += `**Result ${i + 1}** (similarity: ${similarity.toFixed(3)})\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
379
+ });
380
+
381
+ resultsDiv.textContent = output;
382
+
383
+ } catch (error) {
384
+ console.error('Search error:', error);
385
+ resultsDiv.textContent = `❌ Error during search: ${error.message}`;
386
+ } finally {
387
+ searchBtn.disabled = false;
388
+ }
389
+ }
390
 
391
+ function searchDocumentsKeyword() {
392
+ const query = document.getElementById('searchQuery').value;
393
+ const maxResults = parseInt(document.getElementById('maxResults').value);
394
+ const resultsDiv = document.getElementById('searchResults');
395
+
396
+ if (!query.trim()) {
397
+ resultsDiv.style.display = 'block';
398
+ resultsDiv.textContent = '❌ Please enter a search query';
399
+ return;
400
+ }
401
+
402
+ resultsDiv.style.display = 'block';
403
+ resultsDiv.innerHTML = '<div class="loading"></div> Searching keywords...';
404
+
405
+ setTimeout(() => {
406
+ const results = [];
407
+ const queryWords = query.toLowerCase().split(/\s+/);
408
+
409
+ documents.forEach(doc => {
410
+ const contentLower = doc.content.toLowerCase();
411
+ const titleLower = doc.title.toLowerCase();
412
+
413
+ let matches = 0;
414
+ queryWords.forEach(word => {
415
+ const safeWord = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters so queries like "c++" don't throw
+ matches += (contentLower.match(new RegExp(safeWord, 'g')) || []).length;
416
+ matches += (titleLower.match(new RegExp(safeWord, 'g')) || []).length * 2;
417
+ });
418
+
419
+ if (matches > 0) {
420
+ results.push({ doc, score: matches });
421
+ }
422
+ });
423
+
424
+ results.sort((a, b) => b.score - a.score);
425
+
426
+ if (results.length === 0) {
427
+ resultsDiv.textContent = `❌ No documents found containing '${query}'`;
428
+ return;
429
+ }
430
+
431
+ let output = `🔍 Keyword search results for '${query}':\n\n`;
432
+ results.slice(0, maxResults).forEach((result, i) => {
433
+ const doc = result.doc;
434
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
435
+ output += `**Result ${i + 1}**\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
436
+ });
437
+
438
+ resultsDiv.textContent = output;
439
+ }, 500);
440
+ }
441
 
442
+ async function chatWithRAG() {
443
+ const question = document.getElementById('chatQuestion').value;
444
+ const maxContext = parseInt(document.getElementById('maxContext').value);
445
+ const responseDiv = document.getElementById('chatResponse');
446
+ const chatBtn = document.getElementById('chatBtn');
447
+
448
+ if (!question.trim()) {
449
+ responseDiv.style.display = 'block';
450
+ responseDiv.textContent = '❌ Please enter a question';
451
+ return;
452
+ }
453
+
454
+ if (!transformersReady || !modelsInitialized || !embeddingModel || !qaModel) {
455
+ responseDiv.style.display = 'block';
456
+ responseDiv.textContent = '❌ AI models not loaded yet. Please initialize them first!';
457
+ return;
458
+ }
459
+
460
+ responseDiv.style.display = 'block';
461
+ responseDiv.innerHTML = '<div class="loading"></div> Generating answer with real AI...';
462
+ chatBtn.disabled = true;
463
+
464
+ try {
465
+ // Generate embedding for the question
466
+ const questionEmbedding = await generateEmbedding(question);
467
+
468
+ // Find relevant documents using semantic similarity
469
+ const relevantDocs = [];
470
+ documents.forEach(doc => {
471
+ if (doc.embedding) {
472
+ const similarity = cosineSimilarity(questionEmbedding, doc.embedding);
473
+ if (similarity > 0.1) {
474
+ relevantDocs.push({ doc, similarity });
475
+ }
476
+ }
477
+ });
478
+
479
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
480
+ relevantDocs.splice(maxContext); // truncate to the top-N most similar documents
481
+
482
+ if (relevantDocs.length === 0) {
483
+ responseDiv.textContent = '❌ No relevant context found in the documents for your question.';
484
+ return;
485
+ }
486
+
487
+ // Combine context from top documents
488
+ const context = relevantDocs.map(item => item.doc.content).join(' ').substring(0, 2000);
489
+
490
+ // Use the QA model to generate an answer
491
+ const qaResult = await qaModel(question, context);
492
+
493
+ let response = `🤖 AI Answer:\n${qaResult.answer}\n\n`;
494
+ response += `📊 Confidence: ${(qaResult.score * 100).toFixed(1)}%\n\n`;
495
+ response += `📚 Sources: ${relevantDocs.length} documents\n`;
496
+ response += `🔍 Best match: "${relevantDocs[0].doc.title}" (similarity: ${relevantDocs[0].similarity.toFixed(3)})\n\n`;
497
+ response += `📝 Context used:\n${context.substring(0, 300)}...`;
498
+
499
+ responseDiv.textContent = response;
500
+
501
+ } catch (error) {
502
+ console.error('Chat error:', error);
503
+ responseDiv.textContent = `❌ Error generating response: ${error.message}`;
504
+ } finally {
505
+ chatBtn.disabled = false;
506
+ }
507
+ }
508
 
509
+ async function chatWithLLM() {
510
+ const prompt = document.getElementById('llmPrompt').value;
511
+ const maxTokens = parseInt(document.getElementById('maxTokens').value);
512
+ const temperature = parseFloat(document.getElementById('temperature').value);
513
+ const responseDiv = document.getElementById('llmResponse');
514
+ const llmBtn = document.getElementById('llmBtn');
515
+
516
+ if (!prompt.trim()) {
517
+ responseDiv.style.display = 'block';
518
+ responseDiv.textContent = '❌ Please enter a prompt';
519
+ return;
520
+ }
521
+
522
+ if (!transformersReady || !modelsInitialized || !llmModel) {
523
+ responseDiv.style.display = 'block';
524
+ responseDiv.textContent = '❌ LLM model not loaded yet. Please initialize models first!';
525
+ return;
526
+ }
527
+
528
+ responseDiv.style.display = 'block';
529
+ responseDiv.innerHTML = '<div class="loading"></div> Generating text with LLM...';
530
+ llmBtn.disabled = true;
531
+
532
+ try {
533
+ // Generate text with the LLM
534
+ const result = await llmModel(prompt, {
535
+ max_new_tokens: maxTokens,
536
+ temperature: temperature,
537
+ do_sample: true,
538
+ return_full_text: false
539
+ });
540
+
541
+ let generatedText = result[0].generated_text;
542
+
543
+ let response = `🚀 LLM Generated Text:\n\n"${generatedText}"\n\n`;
544
+ response += `📊 Settings: ${maxTokens} tokens, temperature ${temperature}\n`;
545
+ response += `🤖 Model: ${loadedModelName ? loadedModelName.split('/')[1] : 'Language Model'}\n`;
546
+ response += `⏱️ Generated in real-time by your browser!`;
547
+
548
+ responseDiv.textContent = response;
549
+
550
+ } catch (error) {
551
+ console.error('LLM error:', error);
552
+ responseDiv.textContent = `❌ Error generating text: ${error.message}`;
553
+ } finally {
554
+ llmBtn.disabled = false;
555
+ }
556
+ }
557
 
558
+ async function chatWithLLMRAG() {
559
+ const prompt = document.getElementById('llmPrompt').value;
560
+ const maxTokens = parseInt(document.getElementById('maxTokens').value);
561
+ const temperature = parseFloat(document.getElementById('temperature').value);
562
+ const responseDiv = document.getElementById('llmResponse');
563
+ const llmRagBtn = document.getElementById('llmRagBtn');
564
+
565
+ if (!prompt.trim()) {
566
+ responseDiv.style.display = 'block';
567
+ responseDiv.textContent = '❌ Please enter a prompt';
568
+ return;
569
+ }
570
+
571
+ if (!transformersReady || !modelsInitialized || !llmModel || !embeddingModel) {
572
+ responseDiv.style.display = 'block';
573
+ responseDiv.textContent = '❌ Models not loaded yet. Please initialize all models first!';
574
+ return;
575
+ }
576
+
577
+ responseDiv.style.display = 'block';
578
+ responseDiv.innerHTML = '<div class="loading"></div> Finding relevant context and generating with LLM...';
579
+ llmRagBtn.disabled = true;
580
+
581
+ try {
582
+ // Find relevant documents using semantic search
583
+ const queryEmbedding = await generateEmbedding(prompt);
584
+ const relevantDocs = [];
585
+
586
+ documents.forEach(doc => {
587
+ if (doc.embedding) {
588
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
589
+ if (similarity > 0.1) {
590
+ relevantDocs.push({ doc, similarity });
591
+ }
592
+ }
593
+ });
594
+
595
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
596
+ relevantDocs.splice(3); // Limit to top 3 documents
597
+
598
+ // Create enhanced prompt with context
599
+ let enhancedPrompt = prompt;
600
+ if (relevantDocs.length > 0) {
601
+ const context = relevantDocs.map(item => item.doc.content.substring(0, 300)).join(' ');
602
+ enhancedPrompt = `Context: ${context}\n\nQuestion: ${prompt}\n\nAnswer:`;
603
+ }
604
+
605
+ // Generate text with the LLM using enhanced prompt
606
+ const result = await llmModel(enhancedPrompt, {
607
+ max_new_tokens: maxTokens,
608
+ temperature: temperature,
609
+ do_sample: true,
610
+ return_full_text: false
611
+ });
612
+
613
+ let generatedText = result[0].generated_text;
614
+
615
+ let response = `🤖 LLM + RAG Generated Response:\n\n"${generatedText}"\n\n`;
616
+ response += `📚 Context: ${relevantDocs.length} relevant documents used\n`;
617
+ if (relevantDocs.length > 0) {
618
+ response += `🔍 Best match: "${relevantDocs[0].doc.title}" (similarity: ${relevantDocs[0].similarity.toFixed(3)})\n`;
619
+ }
620
+ response += `📊 Settings: ${maxTokens} tokens, temperature ${temperature}\n`;
621
+ response += `🚀 Model: ${loadedModelName ? loadedModelName.split('/')[1] : 'LLM'} enhanced with document retrieval`;
622
+
623
+ responseDiv.textContent = response;
624
+
625
+ } catch (error) {
626
+ console.error('LLM+RAG error:', error);
627
+ responseDiv.textContent = `❌ Error generating response: ${error.message}`;
628
+ } finally {
629
+ llmRagBtn.disabled = false;
630
+ }
631
+ }
632
 
633
+ async function addDocumentManual() {
634
+ const title = document.getElementById('docTitle').value || `User Document ${documents.length - 2}`;
635
+ const content = document.getElementById('docContent').value;
636
+ const statusDiv = document.getElementById('addStatus');
637
+ const previewDiv = document.getElementById('docPreview');
638
+ const addBtn = document.getElementById('addBtn');
639
+
640
+ if (!content.trim()) {
641
+ statusDiv.style.display = 'block';
642
+ statusDiv.textContent = '❌ Please enter document content';
643
+ previewDiv.style.display = 'none';
644
+ return;
645
+ }
646
+
647
+ statusDiv.style.display = 'block';
648
+ statusDiv.innerHTML = '<div class="loading"></div> Adding document...';
649
+ addBtn.disabled = true;
650
+
651
+ try {
652
+ const docId = documents.length;
653
+ const newDocument = {
654
+ id: docId,
655
+ title: title,
656
+ content: content.trim(),
657
+ embedding: null
658
+ };
659
+
660
+ // Generate embedding if models are initialized
661
+ if (transformersReady && modelsInitialized && embeddingModel) {
662
+ statusDiv.innerHTML = '<div class="loading"></div> Generating AI embedding...';
663
+ newDocument.embedding = await generateEmbedding(content.trim()); // embed the same trimmed text that is stored
664
+ }
665
+
666
+ documents.push(newDocument);
667
+
668
+ const preview = content.length > 300 ? content.substring(0, 300) + '...' : content;
669
+ const status = `✅ Document added successfully!
670
+ 📄 Title: ${title}
671
+ 📊 Size: ${content.length.toLocaleString()} characters
672
+ 📚 Total documents: ${documents.length}${(transformersReady && modelsInitialized) ? '\n🧠 AI embedding generated automatically' : '\n⚠️ AI embedding will be generated when models are loaded'}`;
673
+
674
+ statusDiv.textContent = status;
675
+ previewDiv.style.display = 'block';
676
+ previewDiv.textContent = `📖 Preview:\n${preview}`;
677
+
678
+ // Clear form
679
+ document.getElementById('docTitle').value = '';
680
+ document.getElementById('docContent').value = '';
681
+
682
+ updateStatus();
683
+
684
+ } catch (error) {
685
+ console.error('Error adding document:', error);
686
+ statusDiv.textContent = `❌ Error adding document: ${error.message}`;
687
+ } finally {
688
+ addBtn.disabled = false;
689
+ }
690
+ }
691
 
692
+ // File upload functionality
693
+ function initFileUpload() {
694
+ const uploadArea = document.getElementById('uploadArea');
695
+ const fileInput = document.getElementById('fileInput');
696
+
697
+ if (!uploadArea || !fileInput) return;
698
+
699
+ // Click to select files
700
+ uploadArea.addEventListener('click', () => {
701
+ fileInput.click();
702
+ });
703
+
704
+ // Drag and drop functionality
705
+ uploadArea.addEventListener('dragover', (e) => {
706
+ e.preventDefault();
707
+ uploadArea.classList.add('dragover');
708
+ });
709
+
710
+ uploadArea.addEventListener('dragleave', (e) => {
711
+ e.preventDefault();
712
+ uploadArea.classList.remove('dragover');
713
+ });
714
+
715
+ uploadArea.addEventListener('drop', (e) => {
716
+ e.preventDefault();
717
+ uploadArea.classList.remove('dragover');
718
+ const files = e.dataTransfer.files;
719
+ handleFiles(files);
720
+ });
721
+
722
+ // File input change
723
+ fileInput.addEventListener('change', (e) => {
724
+ handleFiles(e.target.files);
725
  });
 
 
726
  }
727
 
728
+ async function handleFiles(files) {
729
+ const uploadStatus = document.getElementById('uploadStatus');
730
+ const uploadProgress = document.getElementById('uploadProgress');
731
+ const uploadProgressBar = document.getElementById('uploadProgressBar');
732
+ const uploadProgressText = document.getElementById('uploadProgressText');
733
+
734
+ if (files.length === 0) return;
735
+
736
+ uploadStatus.style.display = 'block';
737
+ uploadProgress.style.display = 'block';
738
+ uploadStatus.textContent = '';
739
+
740
+ let successCount = 0;
741
+ let errorCount = 0;
742
+
743
+ for (let i = 0; i < files.length; i++) {
744
+ const file = files[i];
745
+ const progress = ((i + 1) / files.length) * 100;
746
+
747
+ uploadProgressBar.style.width = progress + '%';
748
+ if (file.size > 2000) { // matches the chunking threshold in processFile
749
+ uploadProgressText.textContent = `Processing large file: ${file.name} (${i + 1}/${files.length}) - chunking for better search...`;
750
+ } else {
751
+ uploadProgressText.textContent = `Processing ${file.name} (${i + 1}/${files.length})...`;
752
+ }
753
+
754
+ try {
755
+ await processFile(file);
756
+ successCount++;
757
+ } catch (error) {
758
+ console.error(`Error processing ${file.name}:`, error);
759
+ errorCount++;
760
+ }
761
+ }
762
+
763
+ uploadProgress.style.display = 'none';
764
+
765
+ let statusText = `✅ Upload complete!\n📁 ${successCount} files processed successfully`;
766
+ if (errorCount > 0) {
767
+ statusText += `\n❌ ${errorCount} files failed to process`;
768
+ }
769
+ statusText += `\n📊 Total documents: ${documents.length}`;
770
+ statusText += `\n🧩 Large files automatically chunked for better search`;
771
+
772
+ uploadStatus.textContent = statusText;
773
+ updateStatus();
774
+
775
+ // Clear file input
776
+ document.getElementById('fileInput').value = '';
777
+ }
778
 
779
+ // Document chunking function for large files
780
+ function chunkDocument(content, maxChunkSize = 1000) {
781
+ const sentences = content.split(/[.!?]+/).filter(s => s.trim().length > 0);
782
+ const chunks = [];
783
+ let currentChunk = '';
784
+
785
+ for (let sentence of sentences) {
786
+ sentence = sentence.trim();
787
+ if (currentChunk.length + sentence.length > maxChunkSize && currentChunk.length > 0) {
788
+ chunks.push(currentChunk.trim());
789
+ currentChunk = sentence;
790
+ } else {
791
+ currentChunk += (currentChunk ? '. ' : '') + sentence;
792
+ }
793
+ }
794
+
795
+ if (currentChunk.trim()) {
796
+ chunks.push(currentChunk.trim());
797
+ }
798
+
799
+ return chunks.length > 0 ? chunks : [content];
800
+ }
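For reference, the sentence-grouping logic of `chunkDocument` above, restated as a standalone sketch: split on sentence-ending punctuation, then greedily regroup sentences until the next one would push a chunk past `maxChunkSize` characters. The name `chunkBySentences` is illustrative, not part of the app.

```javascript
// Standalone restatement of the chunking strategy used above: sentences are
// split on ., !, ? and regrouped greedily up to maxChunkSize characters.
function chunkBySentences(content, maxChunkSize = 1000) {
  const sentences = content.split(/[.!?]+/).filter(s => s.trim().length > 0);
  const chunks = [];
  let current = '';
  for (let sentence of sentences) {
    sentence = sentence.trim();
    if (current.length + sentence.length > maxChunkSize && current.length > 0) {
      chunks.push(current.trim()); // current chunk is full; start a new one
      current = sentence;
    } else {
      current += (current ? '. ' : '') + sentence;
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks.length > 0 ? chunks : [content];
}
```

Note that rejoining with `'. '` normalizes the original punctuation, so chunks of a text ending in `!` or `?` come back with periods; that trade-off is inherited from the function above.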
801
 
802
+ async function processFile(file) {
803
+ return new Promise((resolve, reject) => {
804
+ const reader = new FileReader();
805
+
806
+ reader.onload = async function(e) {
807
+ try {
808
+ const content = e.target.result.trim();
809
+ const baseTitle = file.name.replace(/\.[^/.]+$/, ""); // Remove file extension
810
+
811
+ // Check if document is large and needs chunking
812
+ if (content.length > 2000) {
813
+ // Chunk large documents
814
+ const chunks = chunkDocument(content, 1500);
815
+ console.log(`📄 Chunking large file: ${chunks.length} chunks created from ${content.length} characters`);
816
+
817
+ for (let i = 0; i < chunks.length; i++) {
818
+ const chunkTitle = chunks.length > 1 ? `${baseTitle} (Part ${i + 1}/${chunks.length})` : baseTitle;
819
+ const newDocument = {
820
+ id: documents.length,
821
+ title: chunkTitle,
822
+ content: chunks[i],
823
+ embedding: null
824
+ };
825
+
826
+ // Generate embedding if models are loaded
827
+ if (transformersReady && modelsInitialized && embeddingModel) {
828
+ newDocument.embedding = await generateEmbedding(chunks[i]);
829
+ }
830
+
831
+ documents.push(newDocument);
832
+ }
833
+ } else {
834
+ // Small document - process as single document
835
+ const newDocument = {
836
+ id: documents.length,
837
+ title: baseTitle,
838
+ content: content,
839
+ embedding: null
840
+ };
841
+
842
+ // Generate embedding if models are loaded
843
+ if (transformersReady && modelsInitialized && embeddingModel) {
844
+ newDocument.embedding = await generateEmbedding(content);
845
+ }
846
+
847
+ documents.push(newDocument);
848
+ }
849
+
850
+ resolve();
851
+
852
+ } catch (error) {
853
+ reject(error);
854
+ }
855
+ };
856
+
857
+ reader.onerror = function() {
858
+ reject(new Error(`Failed to read file: ${file.name}`));
859
+ };
860
+
861
+ // Read file as text
862
+ reader.readAsText(file);
863
+ });
864
+ }
865
 
866
+ async function testSystem() {
867
+ const outputDiv = document.getElementById('testOutput');
868
+ const testBtn = document.getElementById('testBtn');
869
+
870
+ outputDiv.style.display = 'block';
871
+ outputDiv.innerHTML = '<div class="loading"></div> Running system tests...';
872
+ testBtn.disabled = true;
873
+
874
+ try {
875
+ let output = `🧪 System Test Results:\n\n`;
876
+ output += `📊 Documents: ${documents.length} loaded\n`;
877
+ output += `🔧 Transformers.js: ${transformersReady ? '✅ Ready' : '❌ Not ready'}\n`;
878
+ output += `🧠 Embedding Model: ${embeddingModel ? '✅ Loaded' : '❌ Not loaded'}\n`;
879
+ output += `🤖 QA Model: ${qaModel ? '✅ Loaded' : '❌ Not loaded'}\n`;
880
+ output += `🚀 LLM Model: ${llmModel ? '✅ Loaded' : '❌ Not loaded'}\n\n`;
881
+
882
+ if (transformersReady && modelsInitialized && embeddingModel) {
883
+ output += `🔍 Testing embedding generation...\n`;
884
+ const testEmbedding = await generateEmbedding("test sentence");
885
+ output += `✅ Embedding test: Generated ${testEmbedding.length}D vector\n\n`;
886
+
887
+ output += `🔍 Testing semantic search...\n`;
888
+ const testQuery = "artificial intelligence";
889
+ const queryEmbedding = await generateEmbedding(testQuery);
890
+
891
+ let testResults = [];
892
+ documents.forEach(doc => {
893
+ if (doc.embedding) {
894
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
895
+ testResults.push({ doc, similarity });
896
+ }
897
+ });
898
+ testResults.sort((a, b) => b.similarity - a.similarity);
899
+
900
+ if (testResults.length > 0) {
901
+ output += `✅ Search test: Found ${testResults.length} results\n`;
902
+ output += `📄 Top result: "${testResults[0].doc.title}" (similarity: ${testResults[0].similarity.toFixed(3)})\n\n`;
903
+ }
904
+
905
+ if (qaModel) {
906
+ output += `🤖 Testing QA model...\n`;
907
+ const context = documents[0].content.substring(0, 500);
908
+ const testQuestion = "What is artificial intelligence?";
909
+ const qaResult = await qaModel(testQuestion, context);
910
+ output += `✅ QA test: Generated answer with ${(qaResult.score * 100).toFixed(1)}% confidence\n`;
911
+ output += `💬 Answer: ${qaResult.answer.substring(0, 100)}...\n\n`;
912
+ }
913
+
914
+ if (llmModel) {
915
+ output += `🚀 Testing LLM model...\n`;
916
+ const testPrompt = "Explain artificial intelligence:";
917
+ const llmResult = await llmModel(testPrompt, { max_new_tokens: 30, temperature: 0.7, do_sample: true, return_full_text: false });
918
+ output += `✅ LLM test: Generated text completion\n`;
919
+ output += `💬 Generated: "${llmResult[0].generated_text.substring(0, 100)}..."\n\n`;
920
+ }
921
+
922
+ output += `🎉 All tests passed! System is fully operational.`;
923
+ } else {
924
+ output += `⚠️ Models not initialized. Click "Initialize AI Models" first.`;
925
+ }
926
+
927
+ outputDiv.textContent = output;
928
+
929
+ } catch (error) {
930
+ console.error('Test error:', error);
931
+ outputDiv.textContent = `❌ Test failed: ${error.message}`;
932
+ } finally {
933
+ testBtn.disabled = false;
934
+ }
935
  }
936
+
937
+ // Initialize UI
938
+ updateStatus();
939
+
940
+ // Show version info in console
941
+ console.log('🤖 AI-Powered RAG System with Transformers.js');
942
+ console.log('Models: Xenova/all-MiniLM-L6-v2, Xenova/distilbert-base-cased-distilled-squad');
943
+
944
+ // Export functions for global access
945
+ window.showTab = showTab;
946
+ window.updateSliderValue = updateSliderValue;
947
+ window.initializeModels = initializeModels;
948
+ window.searchDocumentsSemantic = searchDocumentsSemantic;
949
+ window.searchDocumentsKeyword = searchDocumentsKeyword;
950
+ window.chatWithRAG = chatWithRAG;
951
+ window.chatWithLLM = chatWithLLM;
952
+ window.chatWithLLMRAG = chatWithLLMRAG;
953
+ window.addDocumentManual = addDocumentManual;
954
+ window.testSystem = testSystem;
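The search and chat paths above all call a `cosineSimilarity` helper whose definition lies outside this hunk. Under the usual definition — dot product of two equal-length vectors divided by the product of their norms — a minimal sketch looks like this (an assumption about the helper, not a copy of it):

```javascript
// Cosine similarity between two equal-length numeric vectors:
// dot(a, b) / (|a| * |b|), in [-1, 1]; 1 means identical direction.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because `generateEmbedding` requests `normalize: true`, the embeddings are unit vectors and the similarity reduces to a plain dot product, but the full formula is safe either way.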
rag-backup.html ADDED
@@ -0,0 +1,683 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+ <meta charset="UTF-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <title>🤖 AI-Powered Document Search & RAG Chat</title>
+ <style>
+ * { margin: 0; padding: 0; box-sizing: border-box; }
+
+ body {
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+ min-height: 100vh;
+ padding: 20px;
+ }
+
+ .container {
+ max-width: 1200px;
+ margin: 0 auto;
+ background: white;
+ border-radius: 20px;
+ box-shadow: 0 20px 60px rgba(0,0,0,0.1);
+ overflow: hidden;
+ }
+
+ .header {
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+ color: white;
+ padding: 30px;
+ text-align: center;
+ }
+
+ .header h1 { font-size: 2.5em; margin-bottom: 10px; }
+ .header p { font-size: 1.2em; opacity: 0.9; }
+
+ .status {
+ background: #f8f9fa;
+ padding: 15px 30px;
+ border-bottom: 1px solid #e9ecef;
+ font-weight: 600;
+ color: #495057;
+ }
+
+ .tabs {
+ display: flex;
+ background: #f8f9fa;
+ border-bottom: 1px solid #e9ecef;
+ }
+
+ .tab {
+ flex: 1;
+ padding: 15px 20px;
+ background: none;
+ border: none;
+ cursor: pointer;
+ font-weight: 600;
+ font-size: 14px;
+ transition: all 0.3s;
+ border-bottom: 3px solid transparent;
+ }
+
+ .tab:hover { background: #e9ecef; }
+ .tab.active { background: white; border-bottom-color: #667eea; color: #667eea; }
+
+ .tab-content {
+ display: none;
+ padding: 30px;
+ }
+
+ .tab-content.active { display: block; }
+
+ .form-group {
+ margin-bottom: 20px;
+ }
+
+ label {
+ display: block;
+ margin-bottom: 5px;
+ font-weight: 600;
+ color: #495057;
+ }
+
+ input, textarea, select {
+ width: 100%;
+ padding: 12px;
+ border: 2px solid #e9ecef;
+ border-radius: 8px;
+ font-size: 16px;
+ transition: border-color 0.3s;
+ }
+
+ input:focus, textarea:focus, select:focus {
+ outline: none;
+ border-color: #667eea;
+ }
+
+ button {
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+ color: white;
+ border: none;
+ padding: 12px 24px;
+ border-radius: 8px;
+ font-size: 16px;
+ font-weight: 600;
+ cursor: pointer;
+ transition: transform 0.2s;
+ }
+
+ button:hover { transform: translateY(-2px); }
+
+ .btn-secondary {
+ background: linear-gradient(135deg, #6c757d 0%, #495057 100%);
+ }
+
+ .result {
+ background: #f8f9fa;
+ border: 1px solid #e9ecef;
+ border-radius: 8px;
+ padding: 20px;
+ margin-top: 15px;
+ white-space: pre-wrap;
+ max-height: 400px;
+ overflow-y: auto;
+ }
+
+ .grid {
+ display: grid;
+ grid-template-columns: 1fr 1fr;
+ gap: 20px;
+ }
+
+ .alert {
+ padding: 15px;
+ border-radius: 8px;
+ margin-bottom: 20px;
+ }
+
+ .alert-info {
+ background: #d1ecf1;
+ border: 1px solid #b8daff;
+ color: #0c5460;
+ }
+
+ .alert-success {
+ background: #d4edda;
+ border: 1px solid #c3e6cb;
+ color: #155724;
+ }
+
+ .slider-container {
+ display: flex;
+ align-items: center;
+ gap: 15px;
+ }
+
+ .slider {
+ flex: 1;
+ }
+
+ .slider-value {
+ min-width: 40px;
+ text-align: center;
+ font-weight: 600;
+ color: #667eea;
+ }
+
+ .loading {
+ display: inline-block;
+ width: 20px;
+ height: 20px;
+ border: 2px solid #f3f3f3;
+ border-top: 2px solid #667eea;
+ border-radius: 50%;
+ animation: spin 1s linear infinite;
+ }
+
+ @keyframes spin {
+ 0% { transform: rotate(0deg); }
+ 100% { transform: rotate(360deg); }
+ }
+ </style>
+ </head>
+ <body>
+ <div class="container">
+ <div class="header">
+ <h1>🤖 AI-Powered Document Search & RAG Chat</h1>
+ <p>Complete RAG system with semantic search and intelligent responses</p>
+ </div>
+
+ <div class="status" id="status">
+ 📊 Documents: 3 | 🤖 AI Models: Not loaded
+ </div>
+
+ <div class="tabs">
+ <button class="tab active" onclick="showTab('init')">🚀 Initialize AI</button>
+ <button class="tab" onclick="showTab('chat')">🤖 AI Chat (RAG)</button>
+ <button class="tab" onclick="showTab('search')">🔍 Semantic Search</button>
+ <button class="tab" onclick="showTab('add')">📝 Add Documents</button>
+ <button class="tab" onclick="showTab('test')">🧪 System Test</button>
+ </div>
+
+ <!-- Initialize AI Tab -->
+ <div id="init" class="tab-content active">
+ <div class="alert alert-info">
+ <strong>🚀 Start here!</strong> Initialize the AI system for semantic search and intelligent chat.
+ </div>
+ <div class="alert alert-success">
+ <strong>⚡ Features:</strong><br>
+ • 🔮 <strong>Semantic Search:</strong> AI-powered similarity matching<br>
+ • 🧠 <strong>Smart Chat:</strong> Context-aware responses with document retrieval<br>
+ • 📚 <strong>RAG Pipeline:</strong> Retrieval-Augmented Generation for accurate answers
+ </div>
+ <button onclick="initializeModels()" style="font-size: 18px; padding: 15px 30px;">
+ 🚀 Initialize AI Models
+ </button>
+ <div id="initStatus" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- AI Chat Tab -->
+ <div id="chat" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🤖 Ask questions!</strong> The AI searches through documents and provides intelligent answers.
+ </div>
+ <div class="alert alert-success">
+ <strong>💡 Try asking:</strong><br>
+ • "What is artificial intelligence?"<br>
+ • "How does space exploration work?"<br>
+ • "What are renewable energy sources?"
+ </div>
+ <div class="grid">
+ <div>
+ <label for="chatQuestion">Your Question</label>
+ <textarea id="chatQuestion" rows="3" placeholder="Ask anything about the documents..."></textarea>
+ </div>
+ <div>
+ <label for="maxContext">Context Documents</label>
+ <div class="slider-container">
+ <input type="range" id="maxContext" class="slider" min="1" max="5" value="3" oninput="updateSliderValue('maxContext')">
+ <span id="maxContextValue" class="slider-value">3</span>
+ </div>
+ </div>
+ </div>
+ <button onclick="chatWithRAG()">🤖 Ask AI</button>
+ <div id="chatResponse" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- Semantic Search Tab -->
+ <div id="search" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🔮 Semantic search!</strong> Find documents by meaning, not just keywords.
+ </div>
+ <div class="grid">
+ <div>
+ <label for="searchQuery">Search Query</label>
+ <input type="text" id="searchQuery" placeholder="Try: 'machine learning', 'Mars missions', 'solar power'">
+ </div>
+ <div>
+ <label for="maxResults">Max Results</label>
+ <div class="slider-container">
+ <input type="range" id="maxResults" class="slider" min="1" max="10" value="5" oninput="updateSliderValue('maxResults')">
+ <span id="maxResultsValue" class="slider-value">5</span>
+ </div>
+ </div>
+ </div>
+ <div style="display: flex; gap: 10px;">
+ <button onclick="searchDocumentsSemantic()">🔮 Semantic Search</button>
+ <button class="btn-secondary" onclick="searchDocumentsKeyword()">🔤 Keyword Search</button>
+ </div>
+ <div id="searchResults" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- Add Documents Tab -->
+ <div id="add" class="tab-content">
+ <div class="alert alert-info">
+ <strong>📚 Expand your knowledge base!</strong> Add your own documents to search and chat with.
+ </div>
+ <div class="form-group">
+ <label for="docTitle">Document Title (optional)</label>
+ <input type="text" id="docTitle" placeholder="Enter document title...">
+ </div>
+ <div class="form-group">
+ <label for="docContent">Document Content</label>
+ <textarea id="docContent" rows="8" placeholder="Paste your document text here..."></textarea>
+ </div>
+ <button onclick="addDocument()">📝 Add Document</button>
+ <div class="grid">
+ <div id="addStatus" class="result" style="display: none;"></div>
+ <div id="docPreview" class="result" style="display: none;"></div>
+ </div>
+ </div>
+
+ <!-- System Test Tab -->
+ <div id="test" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🧪 Test the system</strong> to verify everything is working correctly.
+ </div>
+ <button onclick="testSystem()">🧪 Run System Test</button>
+ <div id="testOutput" class="result" style="display: none;"></div>
+ </div>
+ </div>
+
+ <script>
+ // Document storage and AI state
+ let documents = [
+ {
+ id: 0,
+ title: "Artificial Intelligence Overview",
+ content: "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines that work and react like humans. Some activities computers with AI are designed for include speech recognition, learning, planning, and problem-solving. AI is used in healthcare, finance, transportation, and entertainment. Machine learning enables computers to learn from experience without explicit programming. Deep learning uses neural networks to understand complex patterns.",
+ embedding: null
+ },
+ {
+ id: 1,
+ title: "Space Exploration",
+ content: "Space exploration is the ongoing discovery and exploration of celestial structures in outer space through evolving space technology. Physical exploration is conducted by unmanned robotic probes and human spaceflight. Space exploration has been used for geopolitical rivalries like the Cold War. The early era was driven by a Space Race between the Soviet Union and United States. Modern exploration includes Mars missions, the International Space Station, and satellite programs.",
+ embedding: null
+ },
+ {
+ id: 2,
+ title: "Renewable Energy",
+ content: "Renewable energy comes from naturally replenished resources on a human timescale. It includes sunlight, wind, rain, tides, waves, and geothermal heat. Renewable energy contrasts with fossil fuels that are used faster than replenished. Most renewable sources are sustainable. Solar energy is abundant and promising. Wind energy and hydroelectric power are major contributors to renewable generation worldwide.",
+ embedding: null
+ }
+ ];
+
+ let modelsInitialized = false;
+
+ // Simple embedding generation based on word frequency
+ function generateEmbedding(text) {
+ const words = text.toLowerCase().split(/\s+/);
+ const embedding = new Array(10).fill(0);
+
+ // Word to dimension mapping
+ const wordDims = {
+ 'ai': 0, 'artificial': 0, 'intelligence': 0, 'machine': 0, 'learning': 0,
+ 'space': 1, 'exploration': 1, 'mars': 1, 'satellite': 1, 'rocket': 1,
+ 'energy': 2, 'renewable': 2, 'solar': 2, 'wind': 2, 'power': 2,
+ 'computer': 3, 'technology': 3, 'science': 3, 'data': 3,
+ 'human': 4, 'people': 4, 'society': 4, 'world': 4,
+ 'system': 5, 'network': 5, 'process': 5, 'method': 5,
+ 'research': 6, 'study': 6, 'analysis': 6, 'development': 6,
+ 'future': 7, 'modern': 7, 'new': 7, 'advanced': 7,
+ 'problem': 8, 'solution': 8, 'application': 8, 'use': 8,
+ 'important': 9, 'major': 9, 'significant': 9, 'key': 9
+ };
+
+ // Count word frequencies
+ words.forEach(word => {
+ if (wordDims.hasOwnProperty(word)) {
+ embedding[wordDims[word]]++;
+ }
+ });
+
+ // Normalize vector
+ const magnitude = Math.sqrt(embedding.reduce((sum, val) => sum + val * val, 0));
+ if (magnitude > 0) {
+ return embedding.map(val => val / magnitude);
+ }
+ return embedding;
+ }
+
+ // Calculate cosine similarity
+ function cosineSimilarity(a, b) {
+ const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
+ const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
+ const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
+
+ if (magnitudeA === 0 || magnitudeB === 0) return 0;
+ return dotProduct / (magnitudeA * magnitudeB);
+ }
+
+ // UI Functions
+ function showTab(tabName) {
+ // Hide all tabs
+ document.querySelectorAll('.tab-content').forEach(tab => {
+ tab.classList.remove('active');
+ });
+ document.querySelectorAll('.tab').forEach(button => {
+ button.classList.remove('active');
+ });
+
+ // Show selected tab
+ document.getElementById(tabName).classList.add('active');
+ event.target.classList.add('active');
+ }
+
+ function updateSliderValue(sliderId) {
+ const slider = document.getElementById(sliderId);
+ const valueSpan = document.getElementById(sliderId + 'Value');
+ valueSpan.textContent = slider.value;
+ }
+
+ function updateStatus() {
+ const status = document.getElementById('status');
+ status.textContent = `📊 Documents: ${documents.length} | 🤖 AI Models: ${modelsInitialized ? 'Loaded' : 'Not loaded'}`;
+ }
+
+ // AI Functions
+ function initializeModels() {
+ const statusDiv = document.getElementById('initStatus');
+ statusDiv.style.display = 'block';
+ statusDiv.innerHTML = '<div class="loading"></div> Initializing AI models...';
+
+ setTimeout(() => {
+ modelsInitialized = true;
+
+ // Generate embeddings for all documents
+ documents.forEach(doc => {
+ doc.embedding = generateEmbedding(doc.content);
+ });
+
+ statusDiv.innerHTML = `✅ AI Models initialized successfully!
+ 🔮 Embeddings generated for all documents
+ 🧠 Ready for semantic search and RAG chat!
+
+ 📊 System Status:
+ • Documents processed: ${documents.length}
+ • Embedding dimensions: 10
+ • Similarity algorithm: Cosine similarity
+ • Ready for advanced queries!`;
+
+ updateStatus();
+ }, 2000);
+ }
+
+ function searchDocumentsSemantic() {
+ const query = document.getElementById('searchQuery').value;
+ const maxResults = parseInt(document.getElementById('maxResults').value);
+ const resultsDiv = document.getElementById('searchResults');
+
+ if (!query.trim()) {
+ resultsDiv.style.display = 'block';
+ resultsDiv.textContent = '❌ Please enter a search query';
+ return;
+ }
+
+ if (!modelsInitialized) {
+ resultsDiv.style.display = 'block';
+ resultsDiv.textContent = '❌ Please initialize AI models first!';
+ return;
+ }
+
+ resultsDiv.style.display = 'block';
+ resultsDiv.innerHTML = '<div class="loading"></div> Searching...';
+
+ setTimeout(() => {
+ const queryEmbedding = generateEmbedding(query);
+ const results = [];
+
+ documents.forEach(doc => {
+ if (doc.embedding) {
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
+ results.push({ doc, similarity });
+ }
+ });
+
+ results.sort((a, b) => b.similarity - a.similarity);
+
+ if (results.length === 0) {
+ resultsDiv.textContent = `❌ No documents found for '${query}'`;
+ return;
+ }
+
+ let output = `🔍 Semantic search results for '${query}':\n\n`;
+ results.slice(0, maxResults).forEach((result, i) => {
+ const doc = result.doc;
+ const similarity = result.similarity;
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
+ output += `**Result ${i + 1}** (similarity: ${similarity.toFixed(3)})\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
+ });
+
+ resultsDiv.textContent = output;
+ }, 500);
+ }
+
+ function searchDocumentsKeyword() {
+ const query = document.getElementById('searchQuery').value;
+ const maxResults = parseInt(document.getElementById('maxResults').value);
+ const resultsDiv = document.getElementById('searchResults');
+
+ if (!query.trim()) {
+ resultsDiv.style.display = 'block';
+ resultsDiv.textContent = '❌ Please enter a search query';
+ return;
+ }
+
+ resultsDiv.style.display = 'block';
+ resultsDiv.innerHTML = '<div class="loading"></div> Searching...';
+
+ setTimeout(() => {
+ const results = [];
+ const queryWords = query.toLowerCase().split(/\s+/);
+
+ documents.forEach(doc => {
+ const contentLower = doc.content.toLowerCase();
+ const titleLower = doc.title.toLowerCase();
+
+ let matches = 0;
+ queryWords.forEach(word => {
+ matches += (contentLower.match(new RegExp(word, 'g')) || []).length;
+ matches += (titleLower.match(new RegExp(word, 'g')) || []).length * 2; // Title matches weighted more
+ });
+
+ if (matches > 0) {
+ results.push({ doc, score: matches });
+ }
+ });
+
+ results.sort((a, b) => b.score - a.score);
+
+ if (results.length === 0) {
+ resultsDiv.textContent = `❌ No documents found containing '${query}'`;
+ return;
+ }
+
+ let output = `🔍 Keyword search results for '${query}':\n\n`;
+ results.slice(0, maxResults).forEach((result, i) => {
+ const doc = result.doc;
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
+ output += `**Result ${i + 1}**\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
+ });
+
+ resultsDiv.textContent = output;
+ }, 500);
+ }
+
+ function chatWithRAG() {
+ const question = document.getElementById('chatQuestion').value;
+ const maxContext = parseInt(document.getElementById('maxContext').value);
+ const responseDiv = document.getElementById('chatResponse');
+
+ if (!question.trim()) {
+ responseDiv.style.display = 'block';
+ responseDiv.textContent = '❌ Please enter a question';
+ return;
+ }
+
+ if (!modelsInitialized) {
+ responseDiv.style.display = 'block';
+ responseDiv.textContent = '❌ AI models not loaded yet. Please initialize them first!';
+ return;
+ }
+
+ responseDiv.style.display = 'block';
+ responseDiv.innerHTML = '<div class="loading"></div> Generating response...';
+
+ setTimeout(() => {
+ // Use semantic search to find relevant documents
+ const queryEmbedding = generateEmbedding(question);
+ const relevantDocs = [];
+
+ documents.forEach(doc => {
+ if (doc.embedding) {
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
+ if (similarity > 0.1) { // Threshold for relevance
+ relevantDocs.push({ doc, similarity });
+ }
+ }
+ });
+
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
+ relevantDocs.splice(maxContext); // Limit to maxContext documents
+
+ if (relevantDocs.length === 0) {
+ responseDiv.textContent = '❌ No relevant context found. Try asking about AI, space exploration, or renewable energy.';
+ return;
+ }
+
+ // Generate response based on context
+ const contextTexts = relevantDocs.map(item => item.doc.content.substring(0, 400));
+ const context = contextTexts.join(' ');
+
+ // Simple response generation based on question type
+ const questionLower = question.toLowerCase();
+ let response = '';
+
+ if (questionLower.includes('what') || questionLower.includes('define')) {
+ response = `🤖 Based on the documents, here's what I found:\n\n${context.substring(0, 500)}...`;
+ } else if (questionLower.includes('how') || questionLower.includes('process') || questionLower.includes('work')) {
+ response = `🤖 Here's how it works according to the documents:\n\n${context.substring(0, 500)}...`;
+ } else if (questionLower.includes('why') || questionLower.includes('reason')) {
+ response = `🤖 The reasons include:\n\n${context.substring(0, 500)}...`;
+ } else {
+ response = `🤖 Based on the relevant documents:\n\n${context.substring(0, 500)}...`;
+ }
+
+ response += `\n\n📚 Sources: ${relevantDocs.length} documents | Best similarity: ${relevantDocs[0].similarity.toFixed(3)}`;
+
+ responseDiv.textContent = response;
+ }, 1500);
+ }
+
+ function addDocument() {
+ const title = document.getElementById('docTitle').value || `User Document ${documents.length - 2}`;
+ const content = document.getElementById('docContent').value;
+ const statusDiv = document.getElementById('addStatus');
+ const previewDiv = document.getElementById('docPreview');
+
+ if (!content.trim()) {
+ statusDiv.style.display = 'block';
+ statusDiv.textContent = '❌ Please enter document content';
+ previewDiv.style.display = 'none';
+ return;
+ }
+
+ const docId = documents.length;
+ // Named `newDoc` rather than `document`: a local `const document` would
+ // shadow the global DOM object and break the getElementById calls in this function
+ const newDoc = {
+ id: docId,
+ title: title,
+ content: content.trim(),
+ embedding: modelsInitialized ? generateEmbedding(content) : null
+ };
+
+ documents.push(newDoc);
+
+ const preview = content.length > 300 ? content.substring(0, 300) + '...' : content;
+ const status = `✅ Document added successfully!
+ 📄 Title: ${title}
+ 📊 Size: ${content.length.toLocaleString()} characters
+ 📚 Total documents: ${documents.length}${modelsInitialized ? '\n🔮 Embedding generated automatically' : ''}`;
+
+ statusDiv.style.display = 'block';
+ statusDiv.textContent = status;
+
+ previewDiv.style.display = 'block';
+ previewDiv.textContent = `📖 Preview:\n${preview}`;
+
+ // Clear form
+ document.getElementById('docTitle').value = '';
+ document.getElementById('docContent').value = '';
+
+ updateStatus();
+ }
+
+ function testSystem() {
+ const outputDiv = document.getElementById('testOutput');
+ outputDiv.style.display = 'block';
+ outputDiv.innerHTML = '<div class="loading"></div> Running system tests...';
+
+ setTimeout(() => {
+ if (documents.length === 0) {
+ outputDiv.textContent = '❌ No documents found!';
+ return;
+ }
+
+ // Perform a test search
+ const testQuery = 'AI';
+ const queryEmbedding = generateEmbedding(testQuery);
+ let testResults = [];
+
+ if (modelsInitialized) {
+ documents.forEach(doc => {
+ if (doc.embedding) {
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
+ if (similarity > 0) {
+ testResults.push({ doc, similarity });
+ }
+ }
+ });
+ testResults.sort((a, b) => b.similarity - a.similarity);
+ }
+
+ let output = `✅ System test successful! ${documents.length} documents loaded.\n\n`;
+
+ if (modelsInitialized) {
+ output += `🔮 AI Models: ✅ Loaded\n🧮 Embeddings: ✅ Generated\n🔍 Test search for "AI": ${testResults.length} results\n\n`;
+ if (testResults.length > 0) {
+ const topResult = testResults[0];
+ output += `📄 Top result: "${topResult.doc.title}" (similarity: ${topResult.similarity.toFixed(3)})\n`;
+ output += `📝 Content: ${topResult.doc.content.substring(0, 150)}...`;
+ }
+ } else {
+ output += `⚠️ AI Models: Not initialized\n💡 Click "Initialize AI Models" to enable semantic search and RAG chat`;
+ }
+
+ outputDiv.textContent = output;
+ }, 1000);
+ }
+
+ // Initialize UI
+ updateStatus();
+ </script>
+ </body>
+ </html>
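
The semantic search and RAG chat in rag-backup.html both rank document embeddings against a query embedding with cosine similarity. A standalone sketch of that scoring step, extracted so it can be run outside the page (the vector values here are illustrative, not from the app):

```javascript
// Cosine similarity between two equal-length vectors, matching the
// ranking function used in rag-backup.html's semantic search.
function cosineSimilarity(a, b) {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
  const magB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
  // Guard against zero vectors (e.g. a query with no mapped words)
  if (magA === 0 || magB === 0) return 0;
  return dot / (magA * magB);
}

// Identical directions score 1, orthogonal directions score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
console.log(cosineSimilarity([0, 0], [1, 1])); // 0 (zero-vector guard)
```

Because the page normalizes embeddings to unit length when it generates them, the similarity reduces to a plain dot product there; the magnitude terms above make the function safe for unnormalized input too.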
rag-complete.html ADDED
@@ -0,0 +1,1466 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>🤖 AI-Powered Document Search & RAG Chat</title>
7
+ <script type="module">
8
+ // Import transformers.js 3.0.0 from CDN (new Hugging Face ownership)
9
+ import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0';
10
+
11
+ // Make available globally
12
+ window.transformers = { pipeline, env };
13
+ window.transformersLoaded = true;
14
+
15
+ console.log('✅ Transformers.js 3.0.0 loaded via ES modules (Hugging Face)');
16
+ </script>
17
+ <script src="https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.0/dist/transformers.min.js"></script>
18
+ <style>
19
+ * { margin: 0; padding: 0; box-sizing: border-box; }
20
+
21
+ body {
22
+ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
23
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
24
+ min-height: 100vh;
25
+ padding: 20px;
26
+ }
27
+
28
+ .container {
29
+ max-width: 1200px;
30
+ margin: 0 auto;
31
+ background: white;
32
+ border-radius: 20px;
33
+ box-shadow: 0 20px 60px rgba(0,0,0,0.1);
34
+ overflow: hidden;
35
+ }
36
+
37
+ .header {
38
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
39
+ color: white;
40
+ padding: 30px;
41
+ text-align: center;
42
+ }
43
+
44
+ .header h1 { font-size: 2.5em; margin-bottom: 10px; }
45
+ .header p { font-size: 1.2em; opacity: 0.9; }
46
+
47
+ .status {
48
+ background: #f8f9fa;
49
+ padding: 15px 30px;
50
+ border-bottom: 1px solid #e9ecef;
51
+ font-weight: 600;
52
+ color: #495057;
53
+ }
54
+
55
+ .tabs {
56
+ display: flex;
57
+ background: #f8f9fa;
58
+ border-bottom: 1px solid #e9ecef;
59
+ }
60
+
61
+ .tab {
62
+ flex: 1;
63
+ padding: 15px 20px;
64
+ background: none;
65
+ border: none;
66
+ cursor: pointer;
67
+ font-weight: 600;
68
+ font-size: 14px;
69
+ transition: all 0.3s;
70
+ border-bottom: 3px solid transparent;
71
+ }
72
+
73
+ .tab:hover { background: #e9ecef; }
74
+ .tab.active { background: white; border-bottom-color: #667eea; color: #667eea; }
75
+
76
+ .tab-content {
77
+ display: none;
78
+ padding: 30px;
79
+ }
80
+
81
+ .tab-content.active { display: block; }
82
+
83
+ .form-group {
84
+ margin-bottom: 20px;
85
+ }
86
+
87
+ label {
88
+ display: block;
89
+ margin-bottom: 5px;
90
+ font-weight: 600;
91
+ color: #495057;
92
+ }
93
+
94
+ input, textarea, select {
95
+ width: 100%;
96
+ padding: 12px;
97
+ border: 2px solid #e9ecef;
98
+ border-radius: 8px;
99
+ font-size: 16px;
100
+ transition: border-color 0.3s;
101
+ }
102
+
103
+ input:focus, textarea:focus, select:focus {
104
+ outline: none;
105
+ border-color: #667eea;
106
+ }
107
+
108
+ button {
109
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
110
+ color: white;
111
+ border: none;
112
+ padding: 12px 24px;
113
+ border-radius: 8px;
114
+ font-size: 16px;
115
+ font-weight: 600;
116
+ cursor: pointer;
117
+ transition: transform 0.2s;
118
+ }
119
+
120
+ button:hover { transform: translateY(-2px); }
121
+ button:disabled { opacity: 0.6; cursor: not-allowed; transform: none; }
122
+
123
+ .btn-secondary {
124
+ background: linear-gradient(135deg, #6c757d 0%, #495057 100%);
125
+ }
126
+
127
+ .result {
128
+ background: #f8f9fa;
129
+ border: 1px solid #e9ecef;
130
+ border-radius: 8px;
131
+ padding: 20px;
132
+ margin-top: 15px;
133
+ white-space: pre-wrap;
134
+ max-height: 400px;
135
+ overflow-y: auto;
136
+ }
137
+
138
+ .upload-section {
139
+ margin-bottom: 30px;
140
+ }
141
+
142
+ .upload-area {
143
+ border: 2px dashed #007bff;
144
+ border-radius: 12px;
145
+ padding: 40px;
146
+ text-align: center;
147
+ background: #f8f9ff;
148
+ cursor: pointer;
149
+ transition: all 0.3s ease;
150
+ margin: 20px 0;
151
+ }
152
+
153
+ .upload-area:hover {
154
+ border-color: #0056b3;
155
+ background: #e3f2fd;
156
+ }
157
+
158
+ .upload-area.dragover {
159
+ border-color: #28a745;
160
+ background: #e8f5e8;
161
+ }
162
+
163
+ .upload-content {
164
+ pointer-events: none;
165
+ }
166
+
167
+ .upload-icon {
168
+ font-size: 48px;
169
+ margin-bottom: 15px;
170
+ }
171
+
172
+ .upload-text {
173
+ color: #666;
174
+ font-size: 16px;
175
+ }
176
+
177
+ .divider {
178
+ text-align: center;
179
+ margin: 30px 0;
180
+ position: relative;
181
+ color: #666;
182
+ font-weight: bold;
183
+ background: white;
184
+ padding: 0 20px;
185
+ display: inline-block;
186
+ width: 100%;
187
+ }
188
+
189
+ .divider::before {
190
+ content: '';
191
+ position: absolute;
192
+ top: 50%;
193
+ left: 0;
194
+ right: 0;
195
+ height: 1px;
196
+ background: #ddd;
197
+ z-index: 1;
198
+ }
199
+
200
+ .manual-entry {
201
+ margin-top: 20px;
202
+ }
203
+
204
+ .progress-container {
205
+ background: #f0f0f0;
206
+ border-radius: 6px;
207
+ margin: 15px 0;
208
+ overflow: hidden;
209
+ position: relative;
210
+ }
211
+
+ /* Scoped to the upload progress container so it doesn't collide with the
+ .progress > .progress-bar rule defined further down */
+ .progress-container .progress-bar {
+ background: linear-gradient(45deg, #007bff, #0056b3);
+ height: 20px;
+ border-radius: 6px;
+ transition: width 0.3s ease;
+ width: 0%;
+ }
+
+ .progress-text {
+ position: absolute;
+ top: 50%;
+ left: 50%;
+ transform: translate(-50%, -50%);
+ font-size: 12px;
+ font-weight: bold;
+ color: #333;
+ white-space: nowrap;
+ }
+
+ .grid {
+ display: grid;
+ grid-template-columns: 1fr 1fr;
+ gap: 20px;
+ }
+
+ .alert {
+ padding: 15px;
+ border-radius: 8px;
+ margin-bottom: 20px;
+ }
+
+ .alert-info {
+ background: #d1ecf1;
+ border: 1px solid #b8daff;
+ color: #0c5460;
+ }
+
+ .alert-success {
+ background: #d4edda;
+ border: 1px solid #c3e6cb;
+ color: #155724;
+ }
+
+ .alert-warning {
+ background: #fff3cd;
+ border: 1px solid #ffeeba;
+ color: #856404;
+ }
+
+ .slider-container {
+ display: flex;
+ align-items: center;
+ gap: 15px;
+ }
+
+ .slider {
+ flex: 1;
+ }
+
+ .slider-value {
+ min-width: 40px;
+ text-align: center;
+ font-weight: 600;
+ color: #667eea;
+ }
+
+ .loading {
+ display: inline-block;
+ width: 20px;
+ height: 20px;
+ border: 2px solid #f3f3f3;
+ border-top: 2px solid #667eea;
+ border-radius: 50%;
+ animation: spin 1s linear infinite;
+ }
+
+ .progress {
+ width: 100%;
+ height: 8px;
+ background: #e9ecef;
+ border-radius: 4px;
+ overflow: hidden;
+ margin: 10px 0;
+ }
+
+ /* Scoped to the init-progress bar; the upload bar has its own rule above */
+ .progress .progress-bar {
+ height: 100%;
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+ transition: width 0.3s ease;
+ }
+
+ @keyframes spin {
+ 0% { transform: rotate(0deg); }
+ 100% { transform: rotate(360deg); }
+ }
+
+ .model-info {
+ background: #e8f4f8;
+ border: 1px solid #bee5eb;
+ border-radius: 8px;
+ padding: 15px;
+ margin: 15px 0;
+ }
+
+ .model-info h4 {
+ color: #0c5460;
+ margin-bottom: 8px;
+ }
+
+ .model-info p {
+ color: #0c5460;
+ font-size: 14px;
+ margin: 5px 0;
+ }
+ </style>
+ </head>
+ <body>
+ <div class="container">
+ <div class="header">
+ <h1>🤖 AI-Powered Document Search & RAG Chat</h1>
+ <p>Real transformer models running in your browser with Transformers.js</p>
+ </div>
+
+ <div class="status" id="status">
+ 📊 Documents: 3 | 🤖 AI Models: Not loaded | 🧠 Embedding Model: Not loaded
+ </div>
+
+ <div class="tabs">
+ <button class="tab active" onclick="showTab('init')">🚀 Initialize AI</button>
+ <button class="tab" onclick="showTab('chat')">🤖 AI Chat (RAG)</button>
+ <button class="tab" onclick="showTab('llm')">🚀 LLM Chat</button>
+ <button class="tab" onclick="showTab('search')">🔍 Semantic Search</button>
+ <button class="tab" onclick="showTab('add')">📝 Add Documents</button>
+ <button class="tab" onclick="showTab('test')">🧪 System Test</button>
+ </div>
+
+ <!-- Initialize AI Tab -->
+ <div id="init" class="tab-content active">
+ <div class="alert alert-info">
+ <strong>🚀 Real AI Models!</strong> This system uses actual transformer models via Transformers.js.
+ </div>
+
+ <div class="model-info">
+ <h4>🧠 Models Being Loaded:</h4>
+ <p><strong>Embedding Model:</strong> Xenova/all-MiniLM-L6-v2 (384-dimensional sentence embeddings)</p>
+ <p><strong>Q&A Model:</strong> Xenova/distilbert-base-cased-distilled-squad (Question Answering)</p>
+ <p><strong>LLM Model:</strong> Auto-selected GPT-2 or DistilGPT-2 (Transformers.js 3.0.0)</p>
+ <p><strong>Size:</strong> ~100MB total (cached after first load)</p>
+ <p><strong>Performance:</strong> CPU inference, ~2-8 seconds per operation</p>
+ <p><strong>Status:</strong> <span id="transformersStatus">⏳ Loading library...</span></p>
+ </div>
+
+ <div class="alert alert-warning">
+ <strong>⚠️ First Load:</strong> Model downloading may take 1-2 minutes depending on your internet connection. Models are cached for subsequent uses.
+ </div>
+
+ <button onclick="initializeModels()" id="initBtn" style="font-size: 18px; padding: 15px 30px;">
+ 🚀 Initialize Real AI Models
+ </button>
+
+ <div id="initProgress" style="display: none;">
+ <div class="progress">
+ <div class="progress-bar" id="progressBar" style="width: 0%"></div>
+ </div>
+ <p id="progressText">Preparing to load models...</p>
+ </div>
+
+ <div id="initStatus" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- AI Chat Tab -->
+ <div id="chat" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🤖 Real AI Chat!</strong> Ask questions and get answers from actual transformer models.
+ </div>
+ <div class="alert alert-success">
+ <strong>💡 Try asking:</strong><br>
+ • "What is artificial intelligence?"<br>
+ • "How does space exploration work?"<br>
+ • "What are renewable energy sources?"<br>
+ • "Explain machine learning in simple terms"
+ </div>
+ <div class="grid">
+ <div>
+ <label for="chatQuestion">Your Question</label>
+ <textarea id="chatQuestion" rows="3" placeholder="Ask anything about the documents..."></textarea>
+ </div>
+ <div>
+ <label for="maxContext">Context Documents</label>
+ <div class="slider-container">
+ <input type="range" id="maxContext" class="slider" min="1" max="5" value="3" oninput="updateSliderValue('maxContext')">
+ <span id="maxContextValue" class="slider-value">3</span>
+ </div>
+ </div>
+ </div>
+ <button onclick="chatWithRAG()" id="chatBtn">🤖 Ask AI</button>
+ <div id="chatResponse" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- LLM Chat Tab -->
+ <div id="llm" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🚀 Pure LLM Chat!</strong> Chat with a language model (GPT-2 or DistilGPT-2) running in your browser.
+ </div>
+ <div class="alert alert-success">
+ <strong>💡 Try these prompts:</strong><br>
+ • "Tell me a story about space exploration"<br>
+ • "Explain machine learning in simple terms"<br>
+ • "Write a poem about artificial intelligence"<br>
+ • "What are the benefits of renewable energy?"
+ </div>
+ <div class="grid">
+ <div>
+ <label for="llmPrompt">Your Prompt</label>
+ <textarea id="llmPrompt" rows="3" placeholder="Enter your prompt for the language model..."></textarea>
+ </div>
+ <div>
+ <label for="maxTokens">Max Tokens</label>
+ <div class="slider-container">
+ <input type="range" id="maxTokens" class="slider" min="20" max="200" value="100" oninput="updateSliderValue('maxTokens')">
+ <span id="maxTokensValue" class="slider-value">100</span>
+ </div>
+ <label for="temperature">Temperature</label>
+ <div class="slider-container">
+ <input type="range" id="temperature" class="slider" min="0.1" max="1.5" step="0.1" value="0.7" oninput="updateSliderValue('temperature')">
+ <span id="temperatureValue" class="slider-value">0.7</span>
+ </div>
+ </div>
+ </div>
+ <div style="display: flex; gap: 10px;">
+ <button onclick="chatWithLLM()" id="llmBtn">🚀 Generate Text</button>
+ <button class="btn-secondary" onclick="chatWithLLMRAG()" id="llmRagBtn">🤖 LLM + RAG</button>
+ </div>
+ <div id="llmResponse" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- Semantic Search Tab -->
+ <div id="search" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🔮 Real semantic search!</strong> Using transformer embeddings to find documents by meaning.
+ </div>
+ <div class="grid">
+ <div>
+ <label for="searchQuery">Search Query</label>
+ <input type="text" id="searchQuery" placeholder="Try: 'machine learning', 'Mars missions', 'solar power'">
+ </div>
+ <div>
+ <label for="maxResults">Max Results</label>
+ <div class="slider-container">
+ <input type="range" id="maxResults" class="slider" min="1" max="10" value="5" oninput="updateSliderValue('maxResults')">
+ <span id="maxResultsValue" class="slider-value">5</span>
+ </div>
+ </div>
+ </div>
+ <div style="display: flex; gap: 10px;">
+ <button onclick="searchDocumentsSemantic()" id="searchBtn">🔮 Semantic Search</button>
+ <button class="btn-secondary" onclick="searchDocumentsKeyword()">🔤 Keyword Search</button>
+ </div>
+ <div id="searchResults" class="result" style="display: none;"></div>
+ </div>
+
+ <!-- Add Documents Tab -->
+ <div id="add" class="tab-content">
+ <div class="alert alert-info">
+ <strong>📚 Expand your knowledge base!</strong> Upload files or paste text with real AI embeddings.
+ </div>
+
+ <!-- File Upload Section -->
+ <div class="upload-section">
+ <h4>📁 Upload Files</h4>
+ <div class="upload-area" id="uploadArea">
+ <div class="upload-content">
+ <div class="upload-icon">📄</div>
+ <div class="upload-text">
+ <strong>Drop files here or click to select</strong>
+ <br>Supports: .md, .txt, .json, .csv, .html, .js, .py, .xml
+ </div>
+ </div>
+ <input type="file" id="fileInput" accept=".md,.txt,.json,.csv,.html,.js,.py,.xml,.rst,.yaml,.yml" multiple style="display: none;">
+ </div>
+ <div id="uploadProgress" class="progress-container" style="display: none;">
+ <div class="progress-bar" id="uploadProgressBar"></div>
+ <div class="progress-text" id="uploadProgressText">Processing files...</div>
+ </div>
+ <div id="uploadStatus" class="result" style="display: none;"></div>
+ </div>
+
+ <div class="divider">OR</div>
+
+ <!-- Manual Entry Section -->
+ <div class="manual-entry">
+ <h4>✏️ Manual Entry</h4>
+ <div class="form-group">
+ <label for="docTitle">Document Title (optional)</label>
+ <input type="text" id="docTitle" placeholder="Enter document title...">
+ </div>
+ <div class="form-group">
+ <label for="docContent">Document Content</label>
+ <textarea id="docContent" rows="8" placeholder="Paste your document text here..."></textarea>
+ </div>
+ <button onclick="addDocumentManual()" id="addBtn">📝 Add Document</button>
+ <div class="grid">
+ <div id="addStatus" class="result" style="display: none;"></div>
+ <div id="docPreview" class="result" style="display: none;"></div>
+ </div>
+ </div>
+ </div>
+
+ <!-- System Test Tab -->
+ <div id="test" class="tab-content">
+ <div class="alert alert-info">
+ <strong>🧪 Test the system</strong> to verify AI models are working correctly.
+ </div>
+ <button onclick="testSystem()" id="testBtn">🧪 Run System Test</button>
+ <div id="testOutput" class="result" style="display: none;"></div>
+ </div>
+ </div>
+
+ <script>
+ // Global variables for transformers.js
+ let pipeline = null;
+ let env = null;
+ let transformersReady = false;
+
+ // Initialize transformers.js when the script loads
+ async function initTransformers() {
+ try {
+ console.log('🔄 Initializing Transformers.js...');
+
+ // Try ES modules first (preferred method)
+ if (window.transformers && window.transformersLoaded) {
+ console.log('✅ Using ES modules version (Transformers.js 3.0.0)');
+ ({ pipeline, env } = window.transformers);
+ }
+ // Fallback to UMD version
+ else if (window.Transformers) {
+ console.log('✅ Using UMD version (Transformers.js 3.0.0)');
+ ({ pipeline, env } = window.Transformers);
+ }
+ // Wait for library to load
+ else {
+ console.log('⏳ Waiting for library to load...');
+ let attempts = 0;
+ while (!window.Transformers && !window.transformersLoaded && attempts < 50) {
+ await new Promise(resolve => setTimeout(resolve, 200));
+ attempts++;
+ }
+
+ if (window.transformers && window.transformersLoaded) {
+ ({ pipeline, env } = window.transformers);
+ } else if (window.Transformers) {
+ ({ pipeline, env } = window.Transformers);
+ } else {
+ throw new Error('Failed to load Transformers.js library');
+ }
+ }
+
+ // Configure transformers.js with minimal settings
+ if (env) {
+ env.allowLocalModels = false;
+ env.allowRemoteModels = true;
+ // Let Transformers.js use default WASM paths for better compatibility
+ }
+
+ transformersReady = true;
+ console.log('✅ Transformers.js initialized successfully');
+
+ // Update UI to show ready state
+ updateStatus();
+
+ // Update status indicator
+ const statusSpan = document.getElementById('transformersStatus');
+ if (statusSpan) {
+ statusSpan.textContent = '✅ Ready!';
+ statusSpan.style.color = 'green';
+ }
+
+ } catch (error) {
+ console.error('❌ Error initializing Transformers.js:', error);
+
+ // Show error in UI
+ const statusDiv = document.getElementById('status');
+ if (statusDiv) {
+ statusDiv.textContent = `❌ Failed to load Transformers.js: ${error.message}`;
+ statusDiv.style.color = 'red';
+ }
+
+ // Update status indicator
+ const statusSpan = document.getElementById('transformersStatus');
+ if (statusSpan) {
+ statusSpan.textContent = `❌ Failed: ${error.message}`;
+ statusSpan.style.color = 'red';
+ }
+ }
+ }
+
+ // Initialize when page loads
+ document.addEventListener('DOMContentLoaded', function() {
+ initTransformers();
+ initFileUpload();
+ });
+
+ // Document storage and AI state
+ let documents = [
+ {
+ id: 0,
+ title: "Artificial Intelligence Overview",
+ content: "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines that work and react like humans. Some activities computers with AI are designed for include speech recognition, learning, planning, and problem-solving. AI is used in healthcare, finance, transportation, and entertainment. Machine learning enables computers to learn from experience without explicit programming. Deep learning uses neural networks to understand complex patterns in data.",
+ embedding: null
+ },
+ {
+ id: 1,
+ title: "Space Exploration",
+ content: "Space exploration is the ongoing discovery and exploration of celestial structures in outer space through evolving space technology. Physical exploration is conducted by unmanned robotic probes and human spaceflight. Space exploration has been used for geopolitical rivalries like the Cold War. The early era was driven by a Space Race between the Soviet Union and United States. Modern exploration includes Mars missions, the International Space Station, and satellite programs.",
+ embedding: null
+ },
+ {
+ id: 2,
+ title: "Renewable Energy",
+ content: "Renewable energy comes from naturally replenished resources on a human timescale. It includes sunlight, wind, rain, tides, waves, and geothermal heat. Renewable energy contrasts with fossil fuels that are used faster than replenished. Most renewable sources are sustainable. Solar energy is abundant and promising. Wind energy and hydroelectric power are major contributors to renewable generation worldwide.",
+ embedding: null
+ }
+ ];
+
+ let embeddingModel = null;
+ let qaModel = null;
+ let llmModel = null;
+ let loadedModelName = '';
+ let modelsInitialized = false;
+
+ // Calculate cosine similarity between two vectors
+ function cosineSimilarity(a, b) {
+ const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
+ const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
+ const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
+
+ if (magnitudeA === 0 || magnitudeB === 0) return 0;
+ return dotProduct / (magnitudeA * magnitudeB);
+ }
+
+ // UI Functions
+ function showTab(tabName) {
+ // Hide all tabs and deactivate all tab buttons
+ document.querySelectorAll('.tab-content').forEach(tab => {
+ tab.classList.remove('active');
+ });
+ document.querySelectorAll('.tab').forEach(button => {
+ button.classList.remove('active');
+ });
+
+ // Show selected tab and highlight its button (avoids the deprecated global `event`)
+ document.getElementById(tabName).classList.add('active');
+ const activeButton = document.querySelector(`.tab[onclick*="'${tabName}'"]`);
+ if (activeButton) activeButton.classList.add('active');
+ }
+
+ function updateSliderValue(sliderId) {
+ const slider = document.getElementById(sliderId);
+ const valueSpan = document.getElementById(sliderId + 'Value');
+ valueSpan.textContent = slider.value;
+ }
+
+ function updateStatus() {
+ const status = document.getElementById('status');
+ const transformersStatus = transformersReady ? 'Ready' : 'Not ready';
+ const embeddingStatus = embeddingModel ? 'Loaded' : 'Not loaded';
+ const qaStatus = qaModel ? 'Loaded' : 'Not loaded';
+ const llmStatus = llmModel ? 'Loaded' : 'Not loaded';
+ status.textContent = `📊 Documents: ${documents.length} | 🔧 Transformers.js: ${transformersStatus} | 🤖 QA: ${qaStatus} | 🧠 Embedding: ${embeddingStatus} | 🚀 LLM: ${llmStatus}`;
+ }
+
+ function updateProgress(percent, text) {
+ const progressBar = document.getElementById('progressBar');
+ const progressText = document.getElementById('progressText');
+ progressBar.style.width = percent + '%';
+ progressText.textContent = text;
+ }
+
+ // AI Functions
+ async function initializeModels() {
+ const statusDiv = document.getElementById('initStatus');
+ const progressDiv = document.getElementById('initProgress');
+ const initBtn = document.getElementById('initBtn');
+
+ statusDiv.style.display = 'block';
+ progressDiv.style.display = 'block';
+ initBtn.disabled = true;
+
+ try {
+ // Check if transformers.js is ready
+ if (!transformersReady || !pipeline) {
+ updateProgress(5, "Waiting for Transformers.js to initialize...");
+ statusDiv.innerHTML = '🔄 Initializing Transformers.js library...';
+
+ // Wait for transformers.js to be ready
+ let attempts = 0;
+ while (!transformersReady && attempts < 30) {
+ await new Promise(resolve => setTimeout(resolve, 1000));
+ attempts++;
+ }
+
+ if (!transformersReady) {
+ throw new Error('Transformers.js failed to initialize. Please refresh the page.');
+ }
+ }
+
+ updateProgress(10, "Loading embedding model...");
+ statusDiv.innerHTML = '🔄 Loading embedding model (Xenova/all-MiniLM-L6-v2)...';
+
+ // Load embedding model with progress tracking
+ // (Transformers.js reports download progress with status 'progress')
+ embeddingModel = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
+ progress_callback: (progress) => {
+ if (progress.status === 'progress' || progress.status === 'downloading') {
+ const percent = progress.loaded && progress.total ?
+ Math.round((progress.loaded / progress.total) * 100) : 0;
+ statusDiv.innerHTML = `🔄 Downloading embedding model: ${percent}%`;
+ }
+ }
+ });
+
+ updateProgress(40, "Loading question-answering model...");
+ statusDiv.innerHTML = '🔄 Loading QA model (Xenova/distilbert-base-cased-distilled-squad)...';
+
+ // Load QA model with progress tracking
+ qaModel = await pipeline('question-answering', 'Xenova/distilbert-base-cased-distilled-squad', {
+ progress_callback: (progress) => {
+ if (progress.status === 'progress' || progress.status === 'downloading') {
+ const percent = progress.loaded && progress.total ?
+ Math.round((progress.loaded / progress.total) * 100) : 0;
+ statusDiv.innerHTML = `🔄 Downloading QA model: ${percent}%`;
+ }
+ }
+ });
+
+ updateProgress(70, "Loading language model...");
+ statusDiv.innerHTML = '🔄 Loading LLM (trying GPT-2 models)...';
+
+ // Load LLM model - Stable Transformers.js 3.0.0 configuration
+ const modelsToTry = [
+ {
+ name: 'Xenova/gpt2',
+ options: {}
+ },
+ {
+ name: 'Xenova/distilgpt2',
+ options: {}
+ }
+ ];
+
+ let modelLoaded = false;
+ for (const model of modelsToTry) {
+ try {
+ console.log(`Trying to load ${model.name}...`);
+ statusDiv.innerHTML = `🔄 Loading LLM (${model.name})...`;
+
+ // Load LLM with progress tracking
+ llmModel = await pipeline('text-generation', model.name, {
+ progress_callback: (progress) => {
+ if (progress.status === 'progress' || progress.status === 'downloading') {
+ const percent = progress.loaded && progress.total ?
+ Math.round((progress.loaded / progress.total) * 100) : 0;
+ statusDiv.innerHTML = `🔄 Downloading ${model.name}: ${percent}%`;
+ }
+ }
+ });
+
+ console.log(`✅ Successfully loaded ${model.name}`);
+ loadedModelName = model.name;
+ modelLoaded = true;
+ break;
+ } catch (error) {
+ console.warn(`${model.name} failed:`, error);
+ }
+ }
+
+ if (!modelLoaded) {
+ throw new Error('Failed to load any LLM model');
+ }
+
+ updateProgress(85, "Generating embeddings for documents...");
+ statusDiv.innerHTML = '🔄 Generating embeddings for existing documents...';
+
+ // Generate embeddings for all existing documents
+ for (let i = 0; i < documents.length; i++) {
+ const doc = documents[i];
+ updateProgress(85 + (i / documents.length) * 10, `Processing document ${i + 1}/${documents.length}...`);
+ doc.embedding = await generateEmbedding(doc.content);
+ }
+
+ updateProgress(100, "Initialization complete!");
+ modelsInitialized = true;
+
+ statusDiv.innerHTML = `✅ AI Models initialized successfully!
+ 🧠 Embedding Model: Xenova/all-MiniLM-L6-v2 (384 dimensions)
+ 🤖 QA Model: Xenova/distilbert-base-cased-distilled-squad
+ 🚀 LLM Model: ${loadedModelName} (Language model for text generation)
+ 📚 Documents processed: ${documents.length}
+ 🔮 Ready for semantic search, Q&A, and LLM chat!
+
+ 📊 Model Info:
+ • Embedding model size: ~23MB
+ • QA model size: ~28MB
+ • LLM model size: ~15-50MB (depending on model loaded)
+ • Total memory usage: ~70-100MB
+ • Inference speed: ~2-8 seconds per operation`;
+
+ updateStatus();
+
+ } catch (error) {
+ console.error('Error initializing models:', error);
+ statusDiv.innerHTML = `❌ Error initializing models: ${error.message}
+
+ Please check your internet connection and try again.`;
+ updateProgress(0, "Initialization failed");
+ } finally {
+ initBtn.disabled = false;
+ setTimeout(() => {
+ progressDiv.style.display = 'none';
+ }, 2000);
+ }
+ }
+
+ async function generateEmbedding(text) {
+ if (!transformersReady || !pipeline) {
+ throw new Error('Transformers.js not initialized');
+ }
+
+ if (!embeddingModel) {
+ throw new Error('Embedding model not loaded');
+ }
+
+ try {
+ const output = await embeddingModel(text, { pooling: 'mean', normalize: true });
+ return Array.from(output.data);
+ } catch (error) {
+ console.error('Error generating embedding:', error);
+ throw error;
+ }
+ }
+
+ async function searchDocumentsSemantic() {
+ const query = document.getElementById('searchQuery').value;
+ const maxResults = parseInt(document.getElementById('maxResults').value);
+ const resultsDiv = document.getElementById('searchResults');
+ const searchBtn = document.getElementById('searchBtn');
+
+ if (!query.trim()) {
+ resultsDiv.style.display = 'block';
+ resultsDiv.textContent = '❌ Please enter a search query';
+ return;
+ }
+
+ if (!transformersReady || !modelsInitialized || !embeddingModel) {
+ resultsDiv.style.display = 'block';
+ resultsDiv.textContent = '❌ Please initialize AI models first!';
+ return;
+ }
+
+ resultsDiv.style.display = 'block';
+ resultsDiv.innerHTML = '<div class="loading"></div> Generating query embedding and searching...';
+ searchBtn.disabled = true;
+
+ try {
+ // Generate embedding for query
+ const queryEmbedding = await generateEmbedding(query);
+
+ // Calculate similarities
+ const results = [];
+ documents.forEach(doc => {
+ if (doc.embedding) {
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
+ results.push({ doc, similarity });
+ }
+ });
+
+ // Sort by similarity
+ results.sort((a, b) => b.similarity - a.similarity);
+
+ if (results.length === 0) {
+ resultsDiv.textContent = `❌ No documents with embeddings found for '${query}'`;
+ return;
+ }
+
+ let output = `🔍 Semantic search results for '${query}':\n\n`;
+ results.slice(0, maxResults).forEach((result, i) => {
+ const doc = result.doc;
+ const similarity = result.similarity;
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
+ output += `**Result ${i + 1}** (similarity: ${similarity.toFixed(3)})\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
+ });
+
+ resultsDiv.textContent = output;
+
+ } catch (error) {
+ console.error('Search error:', error);
+ resultsDiv.textContent = `❌ Error during search: ${error.message}`;
+ } finally {
+ searchBtn.disabled = false;
+ }
+ }
+
+ function searchDocumentsKeyword() {
+ const query = document.getElementById('searchQuery').value;
+ const maxResults = parseInt(document.getElementById('maxResults').value);
+ const resultsDiv = document.getElementById('searchResults');
+
+ if (!query.trim()) {
+ resultsDiv.style.display = 'block';
+ resultsDiv.textContent = '❌ Please enter a search query';
+ return;
+ }
+
+ resultsDiv.style.display = 'block';
+ resultsDiv.innerHTML = '<div class="loading"></div> Searching keywords...';
+
+ setTimeout(() => {
+ const results = [];
+ const queryWords = query.toLowerCase().split(/\s+/);
+
+ documents.forEach(doc => {
+ const contentLower = doc.content.toLowerCase();
+ const titleLower = doc.title.toLowerCase();
+
+ let matches = 0;
+ queryWords.forEach(word => {
+ // Escape regex metacharacters so queries like "c++" don't throw
+ const safeWord = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+ matches += (contentLower.match(new RegExp(safeWord, 'g')) || []).length;
+ matches += (titleLower.match(new RegExp(safeWord, 'g')) || []).length * 2;
+ });
+
+ if (matches > 0) {
+ results.push({ doc, score: matches });
+ }
+ });
+
+ results.sort((a, b) => b.score - a.score);
+
+ if (results.length === 0) {
+ resultsDiv.textContent = `❌ No documents found containing '${query}'`;
+ return;
+ }
+
+ let output = `🔍 Keyword search results for '${query}':\n\n`;
+ results.slice(0, maxResults).forEach((result, i) => {
+ const doc = result.doc;
+ const excerpt = doc.content.length > 200 ? doc.content.substring(0, 200) + '...' : doc.content;
+ output += `**Result ${i + 1}**\n📄 Title: ${doc.title}\n📝 Content: ${excerpt}\n\n`;
+ });
+
+ resultsDiv.textContent = output;
+ }, 500);
+ }
+
+ async function chatWithRAG() {
+ const question = document.getElementById('chatQuestion').value;
+ const maxContext = parseInt(document.getElementById('maxContext').value);
+ const responseDiv = document.getElementById('chatResponse');
+ const chatBtn = document.getElementById('chatBtn');
+
+ if (!question.trim()) {
+ responseDiv.style.display = 'block';
+ responseDiv.textContent = '❌ Please enter a question';
+ return;
+ }
+
+ if (!transformersReady || !modelsInitialized || !embeddingModel || !qaModel) {
+ responseDiv.style.display = 'block';
+ responseDiv.textContent = '❌ AI models not loaded yet. Please initialize them first!';
+ return;
+ }
+
+ responseDiv.style.display = 'block';
+ responseDiv.innerHTML = '<div class="loading"></div> Generating answer with real AI...';
+ chatBtn.disabled = true;
+
+ try {
+ // Generate embedding for the question
+ const questionEmbedding = await generateEmbedding(question);
+
+ // Find relevant documents using semantic similarity
+ const relevantDocs = [];
+ documents.forEach(doc => {
+ if (doc.embedding) {
+ const similarity = cosineSimilarity(questionEmbedding, doc.embedding);
+ if (similarity > 0.1) {
+ relevantDocs.push({ doc, similarity });
+ }
+ }
+ });
+
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
+ relevantDocs.splice(maxContext);
+
+ if (relevantDocs.length === 0) {
+ responseDiv.textContent = '❌ No relevant context found in the documents for your question.';
+ return;
+ }
+
+ // Combine context from top documents
+ const context = relevantDocs.map(item => item.doc.content).join(' ').substring(0, 2000);
+
+ // Use the QA model to generate an answer
+ const qaResult = await qaModel(question, context);
+
+ let response = `🤖 AI Answer:\n${qaResult.answer}\n\n`;
+ response += `📊 Confidence: ${(qaResult.score * 100).toFixed(1)}%\n\n`;
+ response += `📚 Sources: ${relevantDocs.length} documents\n`;
+ response += `🔍 Best match: "${relevantDocs[0].doc.title}" (similarity: ${relevantDocs[0].similarity.toFixed(3)})\n\n`;
+ response += `📝 Context used:\n${context.substring(0, 300)}...`;
+
+ responseDiv.textContent = response;
+
+ } catch (error) {
+ console.error('Chat error:', error);
+ responseDiv.textContent = `❌ Error generating response: ${error.message}`;
+ } finally {
+ chatBtn.disabled = false;
+ }
+ }
+
+ async function chatWithLLM() {
+ const prompt = document.getElementById('llmPrompt').value;
+ const maxTokens = parseInt(document.getElementById('maxTokens').value);
+ const temperature = parseFloat(document.getElementById('temperature').value);
+ const responseDiv = document.getElementById('llmResponse');
+ const llmBtn = document.getElementById('llmBtn');
+
+ if (!prompt.trim()) {
+ responseDiv.style.display = 'block';
+ responseDiv.textContent = '❌ Please enter a prompt';
+ return;
+ }
+
+ if (!transformersReady || !modelsInitialized || !llmModel) {
+ responseDiv.style.display = 'block';
+ responseDiv.textContent = '❌ LLM model not loaded yet. Please initialize models first!';
+ return;
+ }
+
+ responseDiv.style.display = 'block';
+ responseDiv.innerHTML = '<div class="loading"></div> Generating text with LLM...';
+ llmBtn.disabled = true;
+
+ try {
+ // Generate text with the LLM
+ const result = await llmModel(prompt, {
+ max_new_tokens: maxTokens,
+ temperature: temperature,
+ do_sample: true,
+ return_full_text: false
+ });
+
+ let generatedText = result[0].generated_text;
+
+ let response = `🚀 LLM Generated Text:\n\n"${generatedText}"\n\n`;
+ response += `📊 Settings: ${maxTokens} tokens, temperature ${temperature}\n`;
+ response += `🤖 Model: ${loadedModelName ? loadedModelName.split('/')[1] : 'Language Model'}\n`;
+ response += `⏱️ Generated in real-time by your browser!`;
+
+ responseDiv.textContent = response;
+
+ } catch (error) {
+ console.error('LLM error:', error);
+ responseDiv.textContent = `❌ Error generating text: ${error.message}`;
+ } finally {
+ llmBtn.disabled = false;
+ }
+ }
+
1079
+ async function chatWithLLMRAG() {
1080
+ const prompt = document.getElementById('llmPrompt').value;
1081
+ const maxTokens = parseInt(document.getElementById('maxTokens').value);
1082
+ const temperature = parseFloat(document.getElementById('temperature').value);
1083
+ const responseDiv = document.getElementById('llmResponse');
1084
+ const llmRagBtn = document.getElementById('llmRagBtn');
1085
+
1086
+ if (!prompt.trim()) {
1087
+ responseDiv.style.display = 'block';
1088
+ responseDiv.textContent = '❌ Please enter a prompt';
1089
+ return;
1090
+ }
1091
+
1092
+ if (!transformersReady || !modelsInitialized || !llmModel || !embeddingModel) {
1093
+ responseDiv.style.display = 'block';
1094
+ responseDiv.textContent = '❌ Models not loaded yet. Please initialize all models first!';
1095
+ return;
1096
+ }
1097
+
1098
+ responseDiv.style.display = 'block';
1099
+ responseDiv.innerHTML = '<div class="loading"></div> Finding relevant context and generating with LLM...';
1100
+ llmRagBtn.disabled = true;
1101
+
1102
+ try {
1103
+ // Find relevant documents using semantic search
1104
+ const queryEmbedding = await generateEmbedding(prompt);
1105
+ const relevantDocs = [];
1106
+
1107
+ documents.forEach(doc => {
1108
+ if (doc.embedding) {
1109
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
1110
+ if (similarity > 0.1) {
1111
+ relevantDocs.push({ doc, similarity });
1112
+ }
1113
+ }
1114
+ });
1115
+
1116
+ relevantDocs.sort((a, b) => b.similarity - a.similarity);
1117
+ relevantDocs.splice(3); // Limit to top 3 documents
1118
+
1119
+ // Create enhanced prompt with context
1120
+ let enhancedPrompt = prompt;
1121
+ if (relevantDocs.length > 0) {
1122
+ const context = relevantDocs.map(item => item.doc.content.substring(0, 300)).join(' ');
1123
+ enhancedPrompt = `Context: ${context}\n\nQuestion: ${prompt}\n\nAnswer:`;
1124
+ }
1125
+
1126
+ // Generate text with the LLM using enhanced prompt
1127
+ const result = await llmModel(enhancedPrompt, {
1128
+ max_new_tokens: maxTokens,
1129
+ temperature: temperature,
1130
+ do_sample: true,
1131
+ return_full_text: false
1132
+ });
1133
+
1134
+ let generatedText = result[0].generated_text;
1135
+
1136
+ let response = `🤖 LLM + RAG Generated Response:\n\n"${generatedText}"\n\n`;
1137
+ response += `📚 Context: ${relevantDocs.length} relevant documents used\n`;
1138
+ if (relevantDocs.length > 0) {
1139
+ response += `🔍 Best match: "${relevantDocs[0].doc.title}" (similarity: ${relevantDocs[0].similarity.toFixed(3)})\n`;
1140
+ }
1141
+ response += `📊 Settings: ${maxTokens} tokens, temperature ${temperature}\n`;
1142
+ response += `🚀 Model: ${loadedModelName ? loadedModelName.split('/')[1] : 'LLM'} enhanced with document retrieval`;
1143
+
1144
+ responseDiv.textContent = response;
1145
+
1146
+ } catch (error) {
1147
+ console.error('LLM+RAG error:', error);
1148
+ responseDiv.textContent = `❌ Error generating response: ${error.message}`;
1149
+ } finally {
1150
+ llmRagBtn.disabled = false;
1151
+ }
1152
+ }
1153
+
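The retrieval step above ranks documents by cosine similarity between the query embedding and each stored document embedding. The `cosineSimilarity` helper is defined earlier in this file (outside this diff hunk); a minimal, self-contained sketch of that scoring, assuming embeddings are plain arrays of numbers, looks like:

```javascript
// Hedged sketch of the similarity scoring used by the RAG path above.
// Embeddings are assumed to be plain number arrays; names are illustrative.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];     // dot product term
    normA += a[i] * a[i];   // squared magnitude of a
    normB += b[i] * b[i];   // squared magnitude of b
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors score 1, orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Because the score only depends on direction, not magnitude, it is a natural fit for the `> 0.1` relevance cutoff and top-3 sort used above.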
+     async function addDocument() {
+       const title = document.getElementById('docTitle').value || `User Document ${documents.length - 2}`;
+       const content = document.getElementById('docContent').value;
+       const statusDiv = document.getElementById('addStatus');
+       const previewDiv = document.getElementById('docPreview');
+       const addBtn = document.getElementById('addBtn');
+
+       if (!content.trim()) {
+         statusDiv.style.display = 'block';
+         statusDiv.textContent = '❌ Please enter document content';
+         previewDiv.style.display = 'none';
+         return;
+       }
+
+       statusDiv.style.display = 'block';
+       statusDiv.innerHTML = '<div class="loading"></div> Adding document...';
+       addBtn.disabled = true;
+
+       try {
+         const docId = documents.length;
+         const newDocument = {
+           id: docId,
+           title: title,
+           content: content.trim(),
+           embedding: null
+         };
+
+         // Generate embedding if models are initialized
+         if (transformersReady && modelsInitialized && embeddingModel) {
+           statusDiv.innerHTML = '<div class="loading"></div> Generating AI embedding...';
+           newDocument.embedding = await generateEmbedding(content);
+         }
+
+         documents.push(newDocument);
+
+         const preview = content.length > 300 ? content.substring(0, 300) + '...' : content;
+         const status = `✅ Document added successfully!
+ 📄 Title: ${title}
+ 📊 Size: ${content.length.toLocaleString()} characters
+ 📚 Total documents: ${documents.length}${(transformersReady && modelsInitialized) ? '\n🧠 AI embedding generated automatically' : '\n⚠️ AI embedding will be generated when models are loaded'}`;
+
+         statusDiv.textContent = status;
+         previewDiv.style.display = 'block';
+         previewDiv.textContent = `📖 Preview:\n${preview}`;
+
+         // Clear form
+         document.getElementById('docTitle').value = '';
+         document.getElementById('docContent').value = '';
+
+         updateStatus();
+
+       } catch (error) {
+         console.error('Error adding document:', error);
+         statusDiv.textContent = `❌ Error adding document: ${error.message}`;
+       } finally {
+         addBtn.disabled = false;
+       }
+     }
+
+     // File upload functionality
+     function initFileUpload() {
+       const uploadArea = document.getElementById('uploadArea');
+       const fileInput = document.getElementById('fileInput');
+
+       if (!uploadArea || !fileInput) return;
+
+       // Click to select files
+       uploadArea.addEventListener('click', () => {
+         fileInput.click();
+       });
+
+       // Drag and drop functionality
+       uploadArea.addEventListener('dragover', (e) => {
+         e.preventDefault();
+         uploadArea.classList.add('dragover');
+       });
+
+       uploadArea.addEventListener('dragleave', (e) => {
+         e.preventDefault();
+         uploadArea.classList.remove('dragover');
+       });
+
+       uploadArea.addEventListener('drop', (e) => {
+         e.preventDefault();
+         uploadArea.classList.remove('dragover');
+         const files = e.dataTransfer.files;
+         handleFiles(files);
+       });
+
+       // File input change
+       fileInput.addEventListener('change', (e) => {
+         handleFiles(e.target.files);
+       });
+     }
+
+     async function handleFiles(files) {
+       const uploadStatus = document.getElementById('uploadStatus');
+       const uploadProgress = document.getElementById('uploadProgress');
+       const uploadProgressBar = document.getElementById('uploadProgressBar');
+       const uploadProgressText = document.getElementById('uploadProgressText');
+
+       if (files.length === 0) return;
+
+       uploadStatus.style.display = 'block';
+       uploadProgress.style.display = 'block';
+       uploadStatus.textContent = '';
+
+       let successCount = 0;
+       let errorCount = 0;
+
+       for (let i = 0; i < files.length; i++) {
+         const file = files[i];
+         const progress = ((i + 1) / files.length) * 100;
+
+         uploadProgressBar.style.width = progress + '%';
+         if (file.size > 10000) {
+           uploadProgressText.textContent = `Processing large file: ${file.name} (${i + 1}/${files.length}) - chunking for better search...`;
+         } else {
+           uploadProgressText.textContent = `Processing ${file.name} (${i + 1}/${files.length})...`;
+         }
+
+         try {
+           await processFile(file);
+           successCount++;
+         } catch (error) {
+           console.error(`Error processing ${file.name}:`, error);
+           errorCount++;
+         }
+       }
+
+       uploadProgress.style.display = 'none';
+
+       let statusText = `✅ Upload complete!\n📁 ${successCount} files processed successfully`;
+       if (errorCount > 0) {
+         statusText += `\n❌ ${errorCount} files failed to process`;
+       }
+       statusText += `\n📊 Total documents: ${documents.length}`;
+       statusText += `\n🧩 Large files automatically chunked for better search`;
+
+       uploadStatus.textContent = statusText;
+       updateStatus();
+
+       // Clear file input
+       document.getElementById('fileInput').value = '';
+     }
+
+     // Document chunking function for large files
+     function chunkDocument(content, maxChunkSize = 1000) {
+       const sentences = content.split(/[.!?]+/).filter(s => s.trim().length > 0);
+       const chunks = [];
+       let currentChunk = '';
+
+       for (let sentence of sentences) {
+         sentence = sentence.trim();
+         if (currentChunk.length + sentence.length > maxChunkSize && currentChunk.length > 0) {
+           chunks.push(currentChunk.trim());
+           currentChunk = sentence;
+         } else {
+           currentChunk += (currentChunk ? '. ' : '') + sentence;
+         }
+       }
+
+       if (currentChunk.trim()) {
+         chunks.push(currentChunk.trim());
+       }
+
+       return chunks.length > 0 ? chunks : [content];
+     }
+
+
1323
+ async function processFile(file) {
1324
+ return new Promise((resolve, reject) => {
1325
+ const reader = new FileReader();
1326
+
1327
+ reader.onload = async function(e) {
1328
+ try {
1329
+ const content = e.target.result.trim();
1330
+ const baseTitle = file.name.replace(/\.[^/.]+$/, ""); // Remove file extension
1331
+
1332
+ // Check if document is large and needs chunking
1333
+ if (content.length > 2000) {
1334
+ // Chunk large documents
1335
+ const chunks = chunkDocument(content, 1500);
1336
+ console.log(`📄 Chunking large file: ${chunks.length} chunks created from ${content.length} characters`);
1337
+
1338
+ for (let i = 0; i < chunks.length; i++) {
1339
+ const chunkTitle = chunks.length > 1 ? `${baseTitle} (Part ${i + 1}/${chunks.length})` : baseTitle;
1340
+ const newDocument = {
1341
+ id: documents.length,
1342
+ title: chunkTitle,
1343
+ content: chunks[i],
1344
+ embedding: null
1345
+ };
1346
+
1347
+ // Generate embedding if models are loaded
1348
+ if (transformersReady && modelsInitialized && embeddingModel) {
1349
+ newDocument.embedding = await generateEmbedding(chunks[i]);
1350
+ }
1351
+
1352
+ documents.push(newDocument);
1353
+ }
1354
+ } else {
1355
+ // Small document - process as single document
1356
+ const newDocument = {
1357
+ id: documents.length,
1358
+ title: baseTitle,
1359
+ content: content,
1360
+ embedding: null
1361
+ };
1362
+
1363
+ // Generate embedding if models are loaded
1364
+ if (transformersReady && modelsInitialized && embeddingModel) {
1365
+ newDocument.embedding = await generateEmbedding(content);
1366
+ }
1367
+
1368
+ documents.push(newDocument);
1369
+ }
1370
+
1371
+ resolve();
1372
+
1373
+ } catch (error) {
1374
+ reject(error);
1375
+ }
1376
+ };
1377
+
1378
+ reader.onerror = function() {
1379
+ reject(new Error(`Failed to read file: ${file.name}`));
1380
+ };
1381
+
1382
+ // Read file as text
1383
+ reader.readAsText(file);
1384
+ });
1385
+ }
1386
+
1387
+ async function testSystem() {
1388
+ const outputDiv = document.getElementById('testOutput');
1389
+ const testBtn = document.getElementById('testBtn');
1390
+
1391
+ outputDiv.style.display = 'block';
1392
+ outputDiv.innerHTML = '<div class="loading"></div> Running system tests...';
1393
+ testBtn.disabled = true;
1394
+
1395
+ try {
1396
+ let output = `🧪 System Test Results:\n\n`;
1397
+ output += `📊 Documents: ${documents.length} loaded\n`;
1398
+ output += `🔧 Transformers.js: ${transformersReady ? '✅ Ready' : '❌ Not ready'}\n`;
1399
+ output += `🧠 Embedding Model: ${embeddingModel ? '✅ Loaded' : '❌ Not loaded'}\n`;
1400
+ output += `🤖 QA Model: ${qaModel ? '✅ Loaded' : '❌ Not loaded'}\n`;
1401
+ output += `🚀 LLM Model: ${llmModel ? '✅ Loaded' : '❌ Not loaded'}\n\n`;
1402
+
1403
+ if (transformersReady && modelsInitialized && embeddingModel) {
1404
+ output += `🔍 Testing embedding generation...\n`;
1405
+ const testEmbedding = await generateEmbedding("test sentence");
1406
+ output += `✅ Embedding test: Generated ${testEmbedding.length}D vector\n\n`;
1407
+
1408
+ output += `🔍 Testing semantic search...\n`;
1409
+ const testQuery = "artificial intelligence";
1410
+ const queryEmbedding = await generateEmbedding(testQuery);
1411
+
1412
+ let testResults = [];
1413
+ documents.forEach(doc => {
1414
+ if (doc.embedding) {
1415
+ const similarity = cosineSimilarity(queryEmbedding, doc.embedding);
1416
+ testResults.push({ doc, similarity });
1417
+ }
1418
+ });
1419
+ testResults.sort((a, b) => b.similarity - a.similarity);
1420
+
1421
+ if (testResults.length > 0) {
1422
+ output += `✅ Search test: Found ${testResults.length} results\n`;
1423
+ output += `📄 Top result: "${testResults[0].doc.title}" (similarity: ${testResults[0].similarity.toFixed(3)})\n\n`;
1424
+ }
1425
+
1426
+ if (qaModel) {
1427
+ output += `🤖 Testing QA model...\n`;
1428
+ const context = documents[0].content.substring(0, 500);
1429
+ const testQuestion = "What is artificial intelligence?";
1430
+ const qaResult = await qaModel(testQuestion, context);
1431
+ output += `✅ QA test: Generated answer with ${(qaResult.score * 100).toFixed(1)}% confidence\n`;
1432
+ output += `💬 Answer: ${qaResult.answer.substring(0, 100)}...\n\n`;
1433
+ }
1434
+
1435
+ if (llmModel) {
1436
+ output += `🚀 Testing LLM model...\n`;
1437
+ const testPrompt = "Explain artificial intelligence:";
1438
+ const llmResult = await llmModel(testPrompt, { max_new_tokens: 30, temperature: 0.7, do_sample: true, return_full_text: false });
1439
+ output += `✅ LLM test: Generated text completion\n`;
1440
+ output += `💬 Generated: "${llmResult[0].generated_text.substring(0, 100)}..."\n\n`;
1441
+ }
1442
+
1443
+ output += `🎉 All tests passed! System is fully operational.`;
1444
+ } else {
1445
+ output += `⚠️ Models not initialized. Click "Initialize AI Models" first.`;
1446
+ }
1447
+
1448
+ outputDiv.textContent = output;
1449
+
1450
+ } catch (error) {
1451
+ console.error('Test error:', error);
1452
+ outputDiv.textContent = `❌ Test failed: ${error.message}`;
1453
+ } finally {
1454
+ testBtn.disabled = false;
1455
+ }
1456
+ }
1457
+
1458
+ // Initialize UI
1459
+ updateStatus();
1460
+
1461
+ // Show version info in console
1462
+ console.log('🤖 AI-Powered RAG System with Transformers.js');
1463
+ console.log('Models: Xenova/all-MiniLM-L6-v2, Xenova/distilbert-base-cased-distilled-squad');
1464
+ </script>
1465
+ </body>
1466
+ </html>
start-simple.sh ADDED
@@ -0,0 +1,30 @@
+ #!/bin/bash
+
+ # Simple Document Search Startup Script (No Docker Required!)
+
+ echo "🚀 Starting Simple Document Search & RAG Chat"
+ echo "=============================================="
+ echo ""
+ echo "✅ No Docker required - everything runs in memory!"
+ echo "✅ No external dependencies needed"
+ echo "✅ Pure browser-based AI system"
+ echo ""
+
+ # Check if Python is available
+ if command -v python3 &> /dev/null; then
+     PYTHON_CMD="python3"
+ elif command -v python &> /dev/null; then
+     PYTHON_CMD="python"
+ else
+     echo "❌ Python is not installed. Please install Python 3."
+     exit 1
+ fi
+
+ # Start the web server
+ echo "🌐 Starting web server..."
+ echo "📁 Serving files from: $(pwd)"
+ echo "🔗 Open your browser and navigate to: http://localhost:8000"
+ echo "🛑 Press Ctrl+C to stop the server"
+ echo ""
+
+ $PYTHON_CMD -m http.server 8000
style.css CHANGED
@@ -1,76 +1,307 @@
- * {
-   box-sizing: border-box;
-   padding: 0;
-   margin: 0;
-   font-family: sans-serif;
- }
-
- html,
- body {
-   height: 100%;
- }
-
- body {
-   padding: 32px;
- }
-
- body,
- #container {
-   display: flex;
-   flex-direction: column;
-   justify-content: center;
-   align-items: center;
- }
-
- #container {
-   position: relative;
-   gap: 0.4rem;
-
-   width: 640px;
-   height: 640px;
-   max-width: 100%;
-   max-height: 100%;
-
-   border: 2px dashed #D1D5DB;
-   border-radius: 0.75rem;
-   overflow: hidden;
-   cursor: pointer;
-   margin: 1rem;
-
-   background-size: 100% 100%;
-   background-position: center;
-   background-repeat: no-repeat;
-   font-size: 18px;
- }
-
- #upload {
-   display: none;
- }
-
- svg {
-   pointer-events: none;
- }
-
- #example {
-   font-size: 14px;
-   text-decoration: underline;
-   cursor: pointer;
- }
-
- #example:hover {
-   color: #2563EB;
- }
-
- .bounding-box {
-   position: absolute;
-   box-sizing: border-box;
-   border: solid 2px;
- }
-
- .bounding-box-label {
-   color: white;
-   position: absolute;
-   font-size: 12px;
-   margin: -16px 0 0 -2px;
-   padding: 1px;
- }
+ * { margin: 0; padding: 0; box-sizing: border-box; }
+
+ body {
+     font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+     background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+     min-height: 100vh;
+     padding: 20px;
+ }
+
+ .container {
+     max-width: 1200px;
+     margin: 0 auto;
+     background: white;
+     border-radius: 20px;
+     box-shadow: 0 20px 60px rgba(0,0,0,0.1);
+     overflow: hidden;
+ }
+
+ .header {
+     background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+     color: white;
+     padding: 30px;
+     text-align: center;
+ }
+
+ .header h1 { font-size: 2.5em; margin-bottom: 10px; }
+ .header p { font-size: 1.2em; opacity: 0.9; }
+
+ .status {
+     background: #f8f9fa;
+     padding: 15px 30px;
+     border-bottom: 1px solid #e9ecef;
+     font-weight: 600;
+     color: #495057;
+ }
+
+ .tabs {
+     display: flex;
+     background: #f8f9fa;
+     border-bottom: 1px solid #e9ecef;
+ }
+
+ .tab {
+     flex: 1;
+     padding: 15px 20px;
+     background: none;
+     border: none;
+     cursor: pointer;
+     font-weight: 600;
+     font-size: 14px;
+     transition: all 0.3s;
+     border-bottom: 3px solid transparent;
+ }
+
+ .tab:hover { background: #e9ecef; }
+ .tab.active { background: white; border-bottom-color: #667eea; color: #667eea; }
+
+ .tab-content {
+     display: none;
+     padding: 30px;
+ }
+
+ .tab-content.active { display: block; }
+
+ .form-group {
+     margin-bottom: 20px;
+ }
+
+ label {
+     display: block;
+     margin-bottom: 5px;
+     font-weight: 600;
+     color: #495057;
+ }
+
+ input, textarea, select {
+     width: 100%;
+     padding: 12px;
+     border: 2px solid #e9ecef;
+     border-radius: 8px;
+     font-size: 16px;
+     transition: border-color 0.3s;
+ }
+
+ input:focus, textarea:focus, select:focus {
+     outline: none;
+     border-color: #667eea;
+ }
+
+ button {
+     background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+     color: white;
+     border: none;
+     padding: 12px 24px;
+     border-radius: 8px;
+     font-size: 16px;
+     font-weight: 600;
+     cursor: pointer;
+     transition: transform 0.2s;
+ }
+
+ button:hover { transform: translateY(-2px); }
+ button:disabled { opacity: 0.6; cursor: not-allowed; transform: none; }
+
+ .btn-secondary {
+     background: linear-gradient(135deg, #6c757d 0%, #495057 100%);
+ }
+
+ .result {
+     background: #f8f9fa;
+     border: 1px solid #e9ecef;
+     border-radius: 8px;
+     padding: 20px;
+     margin-top: 15px;
+     white-space: pre-wrap;
+     max-height: 400px;
+     overflow-y: auto;
+ }
+
+ .upload-section {
+     margin-bottom: 30px;
+ }
+
+ .upload-area {
+     border: 2px dashed #007bff;
+     border-radius: 12px;
+     padding: 40px;
+     text-align: center;
+     background: #f8f9ff;
+     cursor: pointer;
+     transition: all 0.3s ease;
+     margin: 20px 0;
+ }
+
+ .upload-area:hover {
+     border-color: #0056b3;
+     background: #e3f2fd;
+ }
+
+ .upload-area.dragover {
+     border-color: #28a745;
+     background: #e8f5e8;
+ }
+
+ .upload-content {
+     pointer-events: none;
+ }
+
+ .upload-icon {
+     font-size: 48px;
+     margin-bottom: 15px;
+ }
+
+ .upload-text {
+     color: #666;
+     font-size: 16px;
+ }
+
+ .divider {
+     text-align: center;
+     margin: 30px 0;
+     position: relative;
+     color: #666;
+     font-weight: bold;
+     background: white;
+     padding: 0 20px;
+     display: inline-block;
+     width: 100%;
+ }
+
+ .divider::before {
+     content: '';
+     position: absolute;
+     top: 50%;
+     left: 0;
+     right: 0;
+     height: 1px;
+     background: #ddd;
+     z-index: 1;
+ }
+
+ .manual-entry {
+     margin-top: 20px;
+ }
+
+ .progress-container {
+     background: #f0f0f0;
+     border-radius: 6px;
+     margin: 15px 0;
+     overflow: hidden;
+     position: relative;
+ }
+
+ .progress-bar {
+     background: linear-gradient(45deg, #007bff, #0056b3);
+     height: 20px;
+     border-radius: 6px;
+     transition: width 0.3s ease;
+     width: 0%;
+ }
+
+ .progress-text {
+     position: absolute;
+     top: 50%;
+     left: 50%;
+     transform: translate(-50%, -50%);
+     font-size: 12px;
+     font-weight: bold;
+     color: #333;
+     white-space: nowrap;
+ }
+
+ .grid {
+     display: grid;
+     grid-template-columns: 1fr 1fr;
+     gap: 20px;
+ }
+
+ .alert {
+     padding: 15px;
+     border-radius: 8px;
+     margin-bottom: 20px;
+ }
+
+ .alert-info {
+     background: #d1ecf1;
+     border: 1px solid #b8daff;
+     color: #0c5460;
+ }
+
+ .alert-success {
+     background: #d4edda;
+     border: 1px solid #c3e6cb;
+     color: #155724;
+ }
+
+ .alert-warning {
+     background: #fff3cd;
+     border: 1px solid #ffeeba;
+     color: #856404;
+ }
+
+ .slider-container {
+     display: flex;
+     align-items: center;
+     gap: 15px;
+ }
+
+ .slider {
+     flex: 1;
+ }
+
+ .slider-value {
+     min-width: 40px;
+     text-align: center;
+     font-weight: 600;
+     color: #667eea;
+ }
+
+ .loading {
+     display: inline-block;
+     width: 20px;
+     height: 20px;
+     border: 2px solid #f3f3f3;
+     border-top: 2px solid #667eea;
+     border-radius: 50%;
+     animation: spin 1s linear infinite;
+ }
+
+ .progress {
+     width: 100%;
+     height: 8px;
+     background: #e9ecef;
+     border-radius: 4px;
+     overflow: hidden;
+     margin: 10px 0;
+ }
+
+ .progress-bar {
+     height: 100%;
+     background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+     transition: width 0.3s ease;
+ }
+
+ @keyframes spin {
+     0% { transform: rotate(0deg); }
+     100% { transform: rotate(360deg); }
+ }
+
+ .model-info {
+     background: #e8f4f8;
+     border: 1px solid #bee5eb;
+     border-radius: 8px;
+     padding: 15px;
+     margin: 15px 0;
+ }
+
+ .model-info h4 {
+     color: #0c5460;
+     margin-bottom: 8px;
+ }
+
+ .model-info p {
+     color: #0c5460;
+     font-size: 14px;
+     margin: 5px 0;
+ }
test-smollm.html ADDED
@@ -0,0 +1,75 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>SmolLM Test</title>
+     <script type="module">
+         import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.2';
+         window.transformers = { pipeline, env };
+         window.transformersLoaded = true;
+         console.log('✅ Transformers.js loaded');
+     </script>
+ </head>
+ <body>
+     <h1>SmolLM Model Test</h1>
+     <div id="status">Loading...</div>
+     <button onclick="testSmolLM()">Test SmolLM Models</button>
+     <div id="result"></div>
+
+     <script>
+         async function testSmolLM() {
+             const statusDiv = document.getElementById('status');
+             const resultDiv = document.getElementById('result');
+
+             statusDiv.textContent = 'Testing SmolLM models...';
+             resultDiv.innerHTML = '';
+
+             const modelsToTest = [
+                 'onnx-community/Phi-3.5-mini-instruct-onnx-web',
+                 'Xenova/SmolLM-135M',
+                 'Xenova/SmolLM-360M',
+                 'HuggingFaceTB/SmolLM2-135M-Instruct',
+                 'HuggingFaceTB/SmolLM-135M'
+             ];
+
+             for (const modelName of modelsToTest) {
+                 try {
+                     resultDiv.innerHTML += `<p>🔄 Testing ${modelName}...</p>`;
+                     console.log(`Testing ${modelName}`);
+
+                     const { pipeline } = window.transformers;
+                     const generator = await pipeline('text-generation', modelName);
+
+                     const result = await generator('Hello, my name is', {
+                         max_new_tokens: 20,
+                         temperature: 0.7,
+                         do_sample: true,
+                         return_full_text: false
+                     });
+
+                     resultDiv.innerHTML += `<p>✅ ${modelName} works!</p>`;
+                     resultDiv.innerHTML += `<p>Generated: "${result[0].generated_text}"</p><hr>`;
+
+                     statusDiv.textContent = `✅ Found working model: ${modelName}`;
+                     break;
+
+                 } catch (error) {
+                     console.error(`${modelName} failed:`, error);
+                     resultDiv.innerHTML += `<p>❌ ${modelName} failed: ${error.message}</p>`;
+                 }
+             }
+
+             if (!resultDiv.innerHTML.includes('✅')) {
+                 statusDiv.textContent = '❌ No SmolLM models work with current setup';
+                 resultDiv.innerHTML += '<p><strong>Recommendation:</strong> Use DistilGPT-2 or GPT-2 as fallback</p>';
+             }
+         }
+
+         // Auto-test when page loads
+         document.addEventListener('DOMContentLoaded', () => {
+             setTimeout(testSmolLM, 2000);
+         });
+     </script>
+ </body>
+ </html>
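The test page above tries each candidate model in order and keeps the first one whose pipeline loads and generates. Stripped of the DOM details, that fallback pattern can be sketched generically (names here are illustrative, not part of the committed code):

```javascript
// Generic sketch of the "first model that works" fallback used above.
// `load` is any async factory that may throw for unsupported candidates.
async function firstWorking(candidates, load) {
  for (const name of candidates) {
    try {
      return { name, model: await load(name) };
    } catch (err) {
      // Log and fall through to the next candidate, as the test page does.
      console.error(`${name} failed:`, err.message);
    }
  }
  throw new Error('No candidate worked');
}

// Example with a toy loader that only accepts 'b'.
firstWorking(['a', 'b'], async (n) => {
  if (n !== 'b') throw new Error('unsupported');
  return `model:${n}`;
}).then(r => console.log(r.name)); // b
```

In the real page, `load` would be `name => pipeline('text-generation', name)` from Transformers.js, and the loop's `break` on success plays the role of the early `return` here.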