---
library_name: transformers
tags: []
---
# Model Card for CodeThinker
CodeThinker is a large language model fine-tuned from Qwen2.5-32B-Instruct, specializing in code generation with explicit step-by-step reasoning.
# Inference
```python
instruct_prompt = """Your role as a coding assistant is to provide thorough, well-reasoned, and precise solutions to coding questions by following a systematic long-reasoning process. Please think step-by-step and carefully before arriving at the final answer. Use the following step-by-step workflow:
@@ Detailed Instructions:
- Understanding the Problem: Begin by thoroughly reading and analyzing the task description, clarifying requirements and constraints.
- Planning: Devise a solution strategy by breaking down the task into smaller, manageable subproblems. Compare potential approaches and select the most efficient one based on constraints.
- Design: Outline the steps needed to implement the chosen approach. This could include pseudocode, notes on algorithms, or identifying edge cases.
- Implementation: Write clean, readable, and well-documented code. Use meaningful variable names and adhere to best practices.
- Testing: Test the code with a variety of cases, including edge cases, to ensure correctness and robustness.
- Optimization Cycle:
- Identify Failure: Encounter a realistic issue or inefficiency during testing.
- Analyze Root Cause: Understand why the failure occurred or why the solution is suboptimal.
- Refactor or Redesign: Make the necessary changes to address the issue or improve performance.
- Retest: Verify the improved solution.
- Share the polished solution under the header: ## Final Solution. Include the final implementation, neatly wrapped in code blocks with the header: ### Solution Code and followed by ```python[code]```
@@ Coding Question
{}"""
from vllm import LLM, SamplingParams

model = LLM(
    model="Chengran98/codethinker",
    tensor_parallel_size=N,  # N is the number of GPUs to shard the model across
    tokenizer="Qwen/Qwen2.5-32B-Instruct",
    dtype="auto",
    enable_prefix_caching=True,
    trust_remote_code=True,
)
sampling_params = SamplingParams(
    max_tokens=32768,
    temperature=0.7,
)
prompt = instruct_prompt.format(user_prompt)  # user_prompt holds the coding question
response = model.generate(prompt, sampling_params=sampling_params)
generated_text = response[0].outputs[0].text
```
# Evaluation
In the current release, the final solution appears in the generated text under the following structure:
```
## Final Solution -> ### Solution Code -> fenced Python code block
```
[TO DO]: An upcoming release will wrap the code solution in a dedicated `<solution>` token during inference to simplify parsing.