XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization Paper • 2508.10395 • Published Aug 14 • 42
Gradio WebRTC Cookbook ⚡️ Collection Collection of real-time voice and video demos built with gradio-webrtc custom component • 8 items • Updated Dec 10, 2024 • 18