DistilBERT NEET Biology MCQ Classifier (NEET_BioBERT)

This model is a fine-tuned version of DistilBERT (base uncased) that selects the correct option for NEET-style multiple-choice biology questions. Given a question and four choices (A, B, C, D), it scores each choice and returns the highest-scoring one.
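A minimal inference sketch follows. It assumes the checkpoint loads through the Transformers `AutoModelForMultipleChoice` class, that the repository ships tokenizer files, and that each option was paired with the question text during fine-tuning; the card does not state the input formatting, so treat that pairing as a guess. The helper names `build_pairs` and `predict_option` are hypothetical.

```python
from typing import List, Tuple

MODEL_ID = "Neural-Hacker/NEET_BioBERT"  # repository name as shown on the model page
LABELS = ["A", "B", "C", "D"]  # assumed order of the four choices


def build_pairs(question: str, options: List[str]) -> Tuple[List[str], List[str]]:
    """Repeat the question once per option so each (question, option) pair is scored."""
    if len(options) != len(LABELS):
        raise ValueError("expected exactly four options")
    return [question] * len(options), list(options)


def predict_option(question: str, options: List[str]) -> str:
    """Return the letter of the highest-scoring option (downloads the model on first use)."""
    # Imported lazily so the pure-Python helper above works without these packages.
    import torch
    from transformers import AutoModelForMultipleChoice, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForMultipleChoice.from_pretrained(MODEL_ID)
    model.eval()

    first, second = build_pairs(question, options)
    enc = tokenizer(first, second, padding=True, truncation=True, return_tensors="pt")
    # Multiple-choice heads expect tensors shaped (batch, num_choices, seq_len).
    batch = {name: tensor.unsqueeze(0) for name, tensor in enc.items()}
    with torch.no_grad():
        logits = model(**batch).logits  # shape (1, 4)
    return LABELS[int(logits.argmax(dim=-1))]
```

Calling `predict_option(question, [option_a, option_b, option_c, option_d])` would return a single letter. If the repository turns out not to include tokenizer files, loading the tokenizer from `distilbert-base-uncased` is the natural fallback.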


Training Data

Source: sweatSmile / NEET Biology QA Dataset

Domain: NEET (National Eligibility cum Entrance Test, India's undergraduate medical entrance exam) – Biology

Format: Each question has 4 options with one correct answer

Dataset Size: 793 questions

Split: 80% train / 20% validation
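To make the split sizes concrete, here is a small sketch of an 80/20 shuffle split over 793 items. The seed, the rounding direction, and the function name `split_indices` are assumptions; the card does not say how the split was produced.

```python
import math
import random


def split_indices(n_items, val_fraction=0.2, seed=42):
    """Shuffle item indices and hold out a validation fraction (rounded up)."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary illustrative choice
    n_val = math.ceil(n_items * val_fraction)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(793)
print(len(train_idx), len(val_idx))  # 634 159
```

Rounding the validation share up gives 634 training and 159 validation questions; rounding down would give 635 and 158 instead.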


Training Configuration

Base Model: distilbert-base-uncased

Epochs: 10

Batch Size: 4

Learning Rate: 5e-5

Weight Decay: 0.01

Task Type: Multiple Choice Classification
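The settings above can be collected as a plain dictionary whose keys follow the parameter names of the Transformers `TrainingArguments` class. This is only a sketch: the card does not confirm that the `Trainer` API was used, and the training-set size of 634 is an assumption derived from an 80% share of 793 questions. The step count also assumes a single device and no gradient accumulation.

```python
import math

# Hyperparameters as listed in the card, keyed by TrainingArguments-style names.
hyperparameters = {
    "num_train_epochs": 10,
    "per_device_train_batch_size": 4,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
}


def total_optimizer_steps(n_train, batch_size, epochs):
    """Optimizer steps for a full run, counting the last partial batch of each epoch."""
    return math.ceil(n_train / batch_size) * epochs


N_TRAIN = 634  # assumption: roughly 80% of 793 questions
steps = total_optimizer_steps(
    N_TRAIN,
    hyperparameters["per_device_train_batch_size"],
    hyperparameters["num_train_epochs"],
)
print(steps)  # 1590
```

At about 1,590 updates on a few hundred examples, the run is short, which is consistent with the small-dataset caveat stated in the limitations.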


Results

Validation Accuracy: 72.96% (~73%)

Final Training Loss: ~0.35
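For reference, multiple-choice accuracy is the share of questions where the predicted option index matches the gold index. The sketch below uses toy predictions, not the model's real outputs. The 116-of-159 reading of the reported figure is an inference that happens to fit an 80/20 split of 793 questions; the card does not report raw counts.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the predicted option equals the gold option."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example: option indices 0-3 stand for A-D.
toy_predictions = [0, 2, 1, 3]
toy_labels = [0, 2, 3, 3]
print(accuracy(toy_predictions, toy_labels))  # 0.75

# One reading consistent with the reported score (assumed counts, not from the card).
print(round(100 * 116 / 159, 2))  # 72.96
```

With a validation set of roughly 159 questions, a single question changes the score by about 0.63 percentage points, so small differences between runs should be read cautiously.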


Limitations

Trained on a relatively small dataset (793 questions).

Limited to NEET-level biology content; not suitable for physics or chemistry.

Does not support:

Assertion-reasoning questions

Diagram-based questions

Paragraph/Case study type questions


Intended Use

Educational Research

AI-powered NEET Biology assistants

MCQ practice evaluation

Baseline model for future fine-tuning with larger datasets


NOTE:

Not recommended as a final exam-ready solution without further fine-tuning and validation.


License: MIT

Model Size: 67M parameters (Safetensors, F32)