utarn committed
Commit 9b7ef55 · 1 Parent(s): 1a8c8de

Add readme and model card

Files changed (4)
  1. README.md +24 -1
  2. model_card.md +110 -0
  3. requirements.txt +5 -0
  4. spaces_config.json +6 -0
README.md CHANGED
@@ -1,3 +1,14 @@
+---
+title: Omni API Gradio UI
+emoji: 🤖
+colorFrom: blue
+colorTo: purple
+sdk: gradio
+sdk_version: "4.0.0"
+app_file: app.py
+pinned: false
+---
+
 # Omni API Gradio UI
 
 A Gradio-based user interface for the Omni API that supports text, PDF, image, and audio file processing.
@@ -42,4 +53,16 @@ This will monitor for changes in Python files, Markdown files, and TOML configur
 
 - **PDFs**: Document processing
 - **Images**: JPG, PNG, GIF, BMP, WEBP
-- **Audio**: MP3, WAV, M4A, FLAC, OGG
+- **Audio**: MP3, WAV, M4A, FLAC, OGG
+
+---
+tags:
+- gradio
+- omni-api
+- multimodal
+- chat-interface
+- pdf-processing
+- image-processing
+- audio-processing
+- llm
+- api-client
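The README.md change adds two `---` blocks: one at the top of the file and one (with `tags:`) at the end. As far as I understand the Hub's convention, only a YAML block at the very top of README.md is treated as metadata, so the trailing `tags:` block would likely be rendered as ordinary body text. The sketch below illustrates that distinction with a small, hypothetical `split_front_matter` helper (not part of this commit) applied to a shortened README modeled on the one in this diff.

```python
def split_front_matter(text: str):
    """Return (front_matter, body).

    front_matter is None when the text does not open with a '---' line
    or when the opening delimiter is never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the first two delimiters is front matter.
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return None, text


# Shortened stand-in for the README.md produced by this commit.
readme = """---
title: Omni API Gradio UI
sdk: gradio
---

# Omni API Gradio UI

---
tags:
- gradio
"""

front_matter, body = split_front_matter(readme)
print(front_matter)
print("tags:" in front_matter)  # False: the trailing block is not front matter
print("tags:" in body)          # True: it stays in the rendered body
```

If the tags are meant to be metadata, moving them into the leading block is the likely fix, though this diff does not say which behavior was intended.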
model_card.md ADDED
@@ -0,0 +1,110 @@
+---
+license: mit
+tags:
+- gradio
+- omni-api
+- multimodal
+- chat-interface
+- pdf-processing
+- image-processing
+- audio-processing
+- llm
+- api-client
+- chatbot
+- text-generation
+- document-analysis
+- ocr
+- transcription
+widget:
+- src: https://api.modelharbor.com
+---
+
+# Omni API Gradio UI
+
+This is a Gradio-based user interface for the Omni API that supports multimodal interactions with various file types including text, PDF documents, images, and audio files.
+
+## Model Description
+
+The Omni API Gradio UI provides an easy-to-use web interface for interacting with the Omni API, which supports advanced multimodal AI capabilities. Users can send text prompts along with various file types and receive intelligent responses.
+
+### Supported Models
+
+The interface supports several state-of-the-art models:
+- typhoon-ocr-preview
+- openai/gpt-5
+- meta-llama/llama-4-maverick
+- qwen/qwen3-235b-a22b-instruct-2507
+- gemini/gemini-2.5-pro
+- gemini/gemini-2.5-flash
+
+## Features
+
+- **Multimodal Support**: Process text, PDFs, images, and audio files in a single interface
+- **File Ordering**: Upload multiple files in a specific order for precise control
+- **Configurable Models**: Switch between different AI models for different tasks
+- **Real-time Responses**: Get immediate feedback from the API
+- **Customizable Parameters**: Adjust max tokens and other settings
+
+## Intended Uses & Limitations
+
+### Intended Uses
+- Document analysis and summarization
+- Image OCR and analysis
+- Audio transcription and analysis
+- Multimodal chat applications
+- Content extraction from various file formats
+
+### Limitations
+- Requires access to the Omni API
+- Dependent on network connectivity
+- File size limitations based on API constraints
+- Some models may require API keys
+
+## How to Use
+
+1. Configure the API base URL (defaults to https://api.modelharbor.com)
+2. Select your preferred model from the dropdown
+3. Enter your text message in the input box
+4. Upload files (PDF, images, or audio) as needed
+5. Click "Send Request" to interact with the API
+6. View the response in the output panel
+
+### Supported File Types
+
+- **PDFs**: Document processing and analysis
+- **Images**: JPG, PNG, GIF, BMP, WEBP for OCR and visual analysis
+- **Audio**: MP3, WAV, M4A, FLAC, OGG for transcription
+
+## Technical Details
+
+### Frameworks and Libraries
+- Gradio 4.0+
+- Python 3.8+
+- Requests library for API communication
+
+### Installation
+```bash
+# Install dependencies
+uv sync
+
+# Run the application
+uv run python app.py
+```
+
+### Development Mode
+```bash
+# Run with auto-reload for development
+uv run python dev.py
+```
+
+## Citation
+
+If you use this interface in your work, please cite:
+
+```
+@misc{omni_api_gradio_ui,
+title={Omni API Gradio UI},
+author={ModelHarbor Team},
+year={2025},
+howpublished={\url{https://github.com/your-username/omni-api-gradio-ui}}
+}
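The "How to Use" steps and the "Supported File Types" list in model_card.md can be sketched as a small request-building routine. This is only an illustration: the diff does not show `app.py` or the Omni API's request schema, so the helper names (`classify_file`, `build_payload`) and every payload field (`message`, `attachments`, `kind`, `data`) are hypothetical. Only the extension list, the model name, and the idea of ordered uploads with a max-tokens setting come from the model card.

```python
import base64
from pathlib import Path

# Extensions taken from the "Supported File Types" section of model_card.md.
SUPPORTED = {
    "pdf": {".pdf"},
    "image": {".jpg", ".png", ".gif", ".bmp", ".webp"},
    "audio": {".mp3", ".wav", ".m4a", ".flac", ".ogg"},
}


def classify_file(name: str) -> str:
    """Map a filename to 'pdf', 'image', or 'audio' by its extension."""
    ext = Path(name).suffix.lower()
    for kind, extensions in SUPPORTED.items():
        if ext in extensions:
            return kind
    raise ValueError(f"unsupported file type: {name}")


def build_payload(model, message, files, max_tokens=1024):
    """Build a request body; `files` is a list of (filename, bytes) in upload order.

    The field names here are assumptions, not the real Omni API schema.
    """
    attachments = [
        {
            "name": name,
            "kind": classify_file(name),
            "data": base64.b64encode(data).decode("ascii"),
        }
        for name, data in files
    ]
    return {
        "model": model,
        "message": message,
        "max_tokens": max_tokens,
        "attachments": attachments,  # list order preserves upload order
    }


payload = build_payload(
    "typhoon-ocr-preview",
    "Summarize these files.",
    [("report.pdf", b"%PDF-1.4"), ("scan.PNG", b"\x89PNG")],
)
print([a["kind"] for a in payload["attachments"]])  # ['pdf', 'image']
```

Rejecting unknown extensions before any network call keeps the failure local and cheap; the actual upload limits and authentication requirements depend on the API and are not specified in this commit.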
requirements.txt ADDED
@@ -0,0 +1,5 @@
+gradio>=4.0.0
+requests>=2.31.0
+python-multipart>=0.0.6
+aiofiles>=23.0.0
+gradio-client>=1.3.0
spaces_config.json ADDED
@@ -0,0 +1,6 @@
+{
+"app_file": "app.py",
+"requirements": "requirements.txt",
+"sdk": "gradio",
+"sdk_version": "4.0.0"
+}
+ }