스큐

xcnqa

AI & ML interests

None yet

Recent Activity

replied to ibragim-bad's post about 22 hours ago

🎄 67,074 Qwen3-Coder OpenHands trajectories + 2 RFT checkpoints.

We release: 67,000+ trajectories from 3,800 resolved issues in 1,800+ Python repos. About 3x more successful trajectories and 1.5x more repos than our previous dataset. Trajectories are long: on average 64 turns, up to 100 turns and 131k context length.

> RFT on this data, SWE-bench Verified:
Qwen3-30B-Instruct: 25.7% → 50.3% Pass@1.
Qwen3-235B-Instruct: 46.2% → 61.7% Pass@1.
Also strong gains on SWE-rebench September.

> We also did massive evals. We run OpenHands with 100 and 500 turns. We compare models under both limits. We run on SWE-bench Verified and several months of SWE-rebench.

!!! We also check tests written by the models. We measure how often tests are correct. We check how often the final patch passes its own tests. This gives a pool of tests for verifiers and auto graders.

> Fully permissive licenses

Dataset and models: https://huggingface.co/collections/nebius/openhands-trajectories
Blog post: https://nebius.ai/blog/posts/openhands-trajectories-with-qwen3-instruct

replied to ronantakizawa's post about 24 hours ago

Thank you @clem (Co-Founder & CEO of Hugging Face) for sharing my dataset on X / Twitter! https://huggingface.co/datasets/ronantakizawa/github-top-developers #github #dataset

replied to CultriX's post 6 months ago

Hi all! I was hoping somebody would be willing to check out this thought experiment of mine with the aim to reduce tokens in inter-agent communications.

How It Works:
1. You provide a task in natural language (NL)
2. NL-to-CCL Agent: Converts your request into a structured Compressed Communication Language (CCL) task.
3. Inter-agent communication occurs in CCL
4. CCL is translated back to NL before being presented to the user.

I have a notebook with an example that claims to achieve these results:

--- Token Usage Summary ---
Total NL Tokens (User Input & Final Output): 364
Total CCL Tokens (for NL/CCL Conversions): 159
Total CCL Tokens (Internal Agent Communication): 194
Overall token savings on NL-to-CCL conversion portions: 56.32%
------------------------

When asking Gemini it concludes: "Yes, the methods used in this notebook are sensible. The multi-agent architecture is logical, and the introduction of a Compressed Communication Language (CCL) is a clever and practical solution to the real-world problems of token cost and ambiguity in LLM-based systems. While it's a proof-of-concept that would need more robust error handling and potentially more complex feedback loops for a production environment, it successfully demonstrates a viable and efficient strategy for automating a software development lifecycle."

However, I have no idea if it's actually working or if I'm just crazy. Would really like it if someone would be willing to provide thoughts on this!

The notebook: https://gist.github.com/CultriX-Github/7f9895bc5e4d99d2d4a3eb17d079f08b#file-token-reduction-ipynb

Thank you for taking the time! :)
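
The 56.32% figure in the token usage summary above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes the savings are computed as 1 - (CCL conversion tokens / NL tokens), which matches the quoted numbers but is not confirmed by the post, and the function name `token_savings_pct` is hypothetical rather than taken from the notebook.

```python
def token_savings_pct(nl_tokens: int, ccl_tokens: int) -> float:
    """Percentage of tokens saved when ccl_tokens replace nl_tokens.

    Assumed formula (not confirmed by the notebook):
        savings = (1 - ccl_tokens / nl_tokens) * 100
    """
    if nl_tokens <= 0:
        raise ValueError("nl_tokens must be positive")
    return (1 - ccl_tokens / nl_tokens) * 100


# Figures quoted in the summary: 364 NL tokens vs. 159 CCL conversion tokens.
savings = token_savings_pct(364, 159)
print(f"{savings:.2f}%")  # 56.32%
```

Note that this comparison leaves out the 194 CCL tokens used for internal agent communication, so it says nothing about the end-to-end token cost of the pipeline.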

Organizations

None yet

replied to ibragim-bad's post about 22 hours ago

huggingface_case01

replied to ronantakizawa's post about 24 hours ago

huggingface_comment01

replied to CultriX's post 6 months ago

reply_hugging

replied to CultriX's post 6 months ago

hugging_test_1

updated 2 models over 2 years ago

xcnqa/test

Updated Sep 6, 2023

xcnqa/model

Updated Aug 24, 2023