gemma-3-270m-it - MLC WebGPU

This is a quantized version of google/gemma-3-270m-it optimized for WebGPU inference using MLC-LLM.

Model Details

  • Base Model: google/gemma-3-270m-it
  • Quantization: q4f16_1
  • Target: WebGPU
  • Library: MLC-LLM
  • WebLLM Compatible: Yes

Usage

This model is designed to be used with WebLLM in web browsers that support WebGPU with the shader-f16 feature.
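Before loading the model, you can verify that the browser exposes WebGPU and the shader-f16 feature required by the q4f16_1 quantization. A minimal sketch using the standard WebGPU API (the helper name is illustrative):

```javascript
// Check that the browser exposes WebGPU and the shader-f16
// feature required by the q4f16_1 weights.
async function checkWebGPUSupport() {
  if (typeof navigator === "undefined" || !("gpu" in navigator)) {
    return { supported: false, reason: "WebGPU is not available" };
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    return { supported: false, reason: "No suitable GPU adapter found" };
  }
  if (!adapter.features.has("shader-f16")) {
    return { supported: false, reason: "Adapter does not support shader-f16" };
  }
  return { supported: true };
}
```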

WebLLM Integration

import * as webllm from "@mlc-ai/web-llm";

// Register this model with WebLLM via a custom app config.
const appConfig = {
  model_list: [{
    model: "https://huggingface.co/llinguini/gemma-3-270m-it-q4f16_1-MLC",
    model_id: "gemma-3-270m-it-MLC",
    model_lib: "https://huggingface.co/llinguini/gemma-3-270m-it-q4f16_1-MLC/resolve/main/libs/gemma-3-270m-it-webgpu.wasm",
    required_features: ["shader-f16"] // q4f16_1 weights need f16 shader support
  }]
};

// Downloads the weights, compiles the WebGPU library, and initializes the engine.
const engine = await webllm.CreateMLCEngine("gemma-3-270m-it-MLC", { appConfig });
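Once the engine is ready, generation goes through WebLLM's OpenAI-compatible chat completions API. A minimal sketch, with a small helper to build the request (the prompt and sampling options are illustrative, not requirements of this model):

```javascript
// Build an OpenAI-style chat request for WebLLM.
// The prompt and sampling options shown here are illustrative.
function buildChatRequest(prompt) {
  return {
    messages: [{ role: "user", content: prompt }],
    temperature: 0.7,
    max_tokens: 128
  };
}

// In the browser, pass the request to the engine created above:
// const reply = await engine.chat.completions.create(buildChatRequest("Hello!"));
// console.log(reply.choices[0].message.content);
```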

Files

  • mlc-chat-config.json: Model configuration for MLC-LLM
  • ndarray-cache.json: Metadata for the quantized parameter shards
  • tokenizer.json: Fast tokenizer configuration
  • params_shard_*.bin: Quantized model parameters
  • libs/gemma-3-270m-it-webgpu.wasm: Compiled WebGPU library

License

This model is distributed under the same license as the base model (Gemma Terms of Use).
