Update README.md

Files changed (1)

README.md CHANGED

@@ -62,32 +62,13 @@ This release prioritizes **practical code generation quality** over benchmark sc
 ## Quickstart
-### 1) Transformers (merged weights)
-```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-import torch
-repo = "hokar3361/gpt-oss-coderjs-v0.1"
-tok  = AutoTokenizer.from_pretrained(repo, use_fast=True)
-model = AutoModelForCausalLM.from_pretrained(
-    repo,
-    torch_dtype=torch.bfloat16,
-    device_map="auto"
-)
-prompt = "```js\n// Write a function that flattens a nested array of numbers\n"
-inputs = tok(prompt, return_tensors="pt").to(model.device)
-out = model.generate(**inputs, max_new_tokens=128, temperature=0.3, do_sample=False)
-print(tok.decode(out[0], skip_special_tokens=True))
-2) vLLM (recommended)
-bash
-Copy code
 vllm serve hokar3361/gpt-oss-coderjs-v0.1 \
   --async-scheduling \
   --max-model-len 4096 \
   --gpu-memory-utilization 0.90
 For LoRA-only repos, add --lora-modules as per vLLM documentation.
 For merged weights, the above command is sufficient.

 ## Quickstart
+```bash
 vllm serve hokar3361/gpt-oss-coderjs-v0.1 \
   --async-scheduling \
   --max-model-len 4096 \
   --gpu-memory-utilization 0.90
 For LoRA-only repos, add --lora-modules as per vLLM documentation.
+```
 For merged weights, the above command is sufficient.
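
Since this change leaves `vllm serve` as the only Quickstart path, a short client sketch may help when checking the served model. It assumes the server runs locally on vLLM's default port 8000 and that the model is registered under its repo name. The helper names `build_completion_request` and `complete` are hypothetical and not part of the README.

```python
import json
import urllib.request

# Assumption: `vllm serve` is running locally on its default port (8000)
# and exposes the OpenAI-compatible API under /v1.
BASE_URL = "http://localhost:8000/v1"
MODEL = "hokar3361/gpt-oss-coderjs-v0.1"


def build_completion_request(prompt, max_tokens=128, temperature=0.0):
    """Build the JSON body for the OpenAI-compatible /v1/completions endpoint."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,  # 0.0 requests greedy decoding
    }


def complete(prompt, base_url=BASE_URL):
    """POST the request to the running server and return the generated text."""
    body = json.dumps(build_completion_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


# Example call (requires the server to be up):
# print(complete("// Write a function that flattens a nested array of numbers\n"))
```

The prompt in the commented call is the one from the removed Transformers example. The request body is built separately from the network call so it can be inspected without a running server.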