gitlost-murali's Collections
Agentic & Multi-turn Chat

updated Jul 19

Literature for evaluating agents and multi-turn chat. Blogs: https://arize.com/blog/prompt-learning-using-english-feedback-to-optimize-llm-systems

  • CodeACT: Code Adaptive Compute-efficient Tuning Framework for Code LLMs

    Paper • 2408.02193 • Published Aug 5, 2024 • 1

  • google/frames-benchmark

    Viewer • Updated Oct 15, 2024 • 824 • 10.5k • 230

  • gaia-benchmark/GAIA

    Viewer • Updated 6 days ago • 932 • 7.37k • 471

  • callanwu/WebWalkerQA

    Viewer • Updated Sep 8 • 14.3k • 6.5k • 44

  • WebSailor: Navigating Super-human Reasoning for Web Agent

    Paper • 2507.02592 • Published Jul 3 • 121

  • Establishing Best Practices for Building Rigorous Agentic Benchmarks

    Paper • 2507.02825 • Published Jul 3 • 1

  • promptfoo/CCP-sensitive-prompts

    Viewer • Updated Jan 28 • 1.36k • 211 • 52