Rajdeep Borgohain
rbgo
AI & ML interests
Solving language barriers.
Organizations
LLM-Alignment Papers
- Concrete Problems in AI Safety
  Paper • 1606.06565 • Published • 1
- The Off-Switch Game
  Paper • 1611.08219 • Published • 1
- Learning to summarize from human feedback
  Paper • 2009.01325 • Published • 4
- Truthful AI: Developing and governing AI that does not lie
  Paper • 2110.06674 • Published • 1
Finetuning
Models (8)
rbgo/Qwen3-gsm8k-GRPO • Text Generation • 4B • Updated
rbgo/SmolLM2-1.7B-R1-Distilled-GRPO • Text Generation • 2B • Updated • 1
rbgo/SmolLM2-1.7B-R1-Distilled • Text Generation • 2B • Updated
rbgo/SmolLM2-1-7B-Distill • Updated
rbgo/inferless-Llama-3-8B • Text Generation • 8B • Updated • 1 • 2
rbgo/infer-Llama-3-8B • Text Generation • 5B • Updated • 2
rbgo/gemma • Text Generation • 9B • Updated
rbgo/Super-phi-2-dpo • Text Generation • 3B • Updated • 3 • 1