Qwen3-VL-8B-Instruct

akhaliq (HF Staff) committed 19 days ago

Commit 5416826 (verified) · 1 parent: 609cf38

Update app.py

app.py +1 -40

@@ -1,6 +1,3 @@
-I'll create a chat application using the Qwen3-VL-4B-Instruct model that can handle both text and image inputs. This will be a multimodal chatbot that can analyze images and respond to questions about them.
-```python
 import gradio as gr
 import torch
 from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
@@ -320,40 +317,4 @@ if __name__ == "__main__":
         show_error=True,
         share=False,
         debug=True
-    )
-```
-Now let's create the requirements.txt file:
-```
-gradio
-transformers
-torch
-torchvision
-spaces
-Pillow
-numpy
-accelerate
-sentencepiece
-einops
-transformers_stream_generator
-```
-This application creates a multimodal chat interface with the following features:
-1. **Multimodal Input**: Users can send text messages, images, or both
-2. **Vision-Language Understanding**: The Qwen3-VL model can analyze images and answer questions about them
-3. **Chat History**: Maintains conversation context
-4. **Interactive Controls**: Retry, undo, and clear buttons for better user experience
-5. **GPU Optimization**: Uses the @spaces.GPU decorator for efficient inference
-6. **Clean UI**: Professional interface with helpful tips and examples
-The app can:
-- Describe images in detail
-- Answer questions about image content
-- Count objects in images
-- Read text from images
-- Discuss colors, composition, and mood
-- Maintain conversational context
-The interface is user-friendly with a clean design and provides guidance on how to use the multimodal capabilities effectively.

+    )
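
The lines removed by this commit are chat-style text (an introduction, markdown fences, a requirements list, and a feature summary) that had been saved into app.py around the actual program. A minimal sketch of how such output could be reduced to its first fenced Python block before it is written to a file is shown below. The helper name `extract_python_source` and the regular expression are hypothetical illustrations, not part of this Space or of any library.

```python
import re

# Hypothetical helper, not part of this Space: matches the first fenced block,
# with or without a "python"/"py" language tag. If the closing fence is
# missing, the match runs to the end of the text.
FENCE = re.compile(r"```(?:python|py)?[ \t]*\n(.*?)(?:\n```|\Z)", re.DOTALL)


def extract_python_source(generated: str) -> str:
    """Return the body of the first fenced block in `generated`.

    If no fence is found, the text is assumed to already be plain source
    and is returned stripped, with a single trailing newline.
    """
    match = FENCE.search(generated)
    if match is None:
        return generated.strip() + "\n"
    return match.group(1).strip("\n") + "\n"


if __name__ == "__main__":
    sample = (
        "I'll create a chat application.\n"
        "```python\n"
        "import gradio as gr\n"
        "print('hi')\n"
        "```\n"
        "Now let's create the requirements.txt file:\n"
        "```\n"
        "gradio\n"
        "```\n"
    )
    print(extract_python_source(sample), end="")
```

Only the first fenced block is kept, so a later block such as the requirements list would need to be extracted separately if it should be saved to its own file.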