BEV-SUSHI: Multi-Target Multi-Camera 3D Detection and Tracking in Bird's-Eye View Paper • 2412.00692 • Published Dec 1, 2024 • 1
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 22 items • Updated 7 days ago • 80
Article Llama‑Embed‑Nemotron‑8B Text Embedding Model Ranks First on Multilingual MTEB Leaderboard By nvidia and 4 others • 7 days ago • 11
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published 25 days ago • 93
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published 22 days ago • 107
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Paper • 2507.16784 • Published Jul 22 • 120
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 236
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17 • 257
Article Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm By nvidia and 5 others • Jun 11 • 104
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 185
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 130
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published May 25 • 144
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs Paper • 2504.18415 • Published Apr 25 • 47
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation Paper • 2503.06053 • Published Mar 8 • 138
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Paper • 2503.14478 • Published Mar 18 • 48
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model Paper • 2502.13449 • Published Feb 19 • 45