Request: DOI

#121

by jainkanishk - opened Sep 11

We are requesting access to Llama-3.2-11B-Vision-Instruct for evaluation and development.

Our primary objective is to develop an AI-powered form and document understanding system capable of extracting structured data from semi-structured scanned documents (e.g., certificates, government forms, business records).

Planned Use Cases:

Learning & Research: Investigating multimodal LLM capabilities for text–image understanding.

Business/Production: Integrating the model into a document intelligence pipeline to automate data entry, reduce manual effort, and improve accuracy in enterprise workflows.

Responsible Use: The model will not be used to generate harmful, biased, or inappropriate content. Its use will be strictly limited to OCR and structured data extraction.

This request aligns with Hugging Face’s mission of advancing safe and beneficial AI. Access will enable us to evaluate advanced vision-language reasoning for real-world document processing scenarios.
