AI & ML interests
AI/ML
Organizations
debisoft/smol-course-SmolVLM2-2.2B-Instruct-trl-sft-ChartQA
Updated
debisoft/smollm3-dpo-aligned-peft
Updated
debisoft/smollm3-dpo-aligned
Updated
debisoft/rl_course_vizdoom_health_gathering_supreme
Reinforcement Learning
•
Updated
debisoft/ppo-LunarLander-v2
Reinforcement Learning
•
Updated
•
3
Reinforcement Learning
•
Updated
Reinforcement Learning
•
Updated
debisoft/poca-SoccerTwos-2
Reinforcement Learning
•
Updated
•
575
Reinforcement Learning
•
Updated
•
680
debisoft/a2c-PandaPickAndPlace-v3
Reinforcement Learning
•
Updated
•
7
debisoft/sac-PandaPickAndPlace-v3
Reinforcement Learning
•
Updated
•
10
debisoft/tqc-PandaPickAndPlace-v3
Reinforcement Learning
•
Updated
•
9
debisoft/a2c-PandaReachDense-v3
Reinforcement Learning
•
Updated
•
7
debisoft/Reinforce-Pixelcopter-PLE-v0-checkpt
Reinforcement Learning
•
Updated
Reinforcement Learning
•
Updated
•
1
debisoft/ppo-SnowballTarget
Reinforcement Learning
•
Updated
debisoft/Reinforce-Pixelcopter-PLE-v0
Reinforcement Learning
•
Updated
debisoft/Reinforce-Cartpole-v1
Reinforcement Learning
•
Updated
debisoft/ppo-LunarLander-v3-x
Reinforcement Learning
•
Updated
•
5
debisoft/SpaceInvadersNoFrameskip-v4
Reinforcement Learning
•
Updated
•
8
debisoft/Taxi-v3-5x5-noRain
Reinforcement Learning
•
Updated
debisoft/q-FrozenLake-v1-4x4-noSlippery
Reinforcement Learning
•
Updated
Reinforcement Learning
•
Updated
debisoft/ppo-LunarLander-v2-x
Reinforcement Learning
•
Updated
debisoft/mistral-nemo-12b-instruct-thinking-function_calling-logic-capturing-V0
Updated
debisoft/mistral-nemo-12b-base-thinking-function_calling-logic-capturing-V0
Updated
debisoft/mistral-nemo-minitron-8b-base-thinking-function_calling-logic-capturing-V0
Updated
debisoft/mistral-nemo-minitron-8b-instruct-thinking-function_calling-logic-capturing-V0
Updated
debisoft/mistral-nemo-minitron-8B-Instruct-thinking-function_calling-V0
Updated
debisoft/openmath-mistral-7b-thinking-function_calling-logic-capturing-V0
Updated