Add agentic and tool usage tests
#388
by newdoria88 - opened
I think a test of how a model performs on agentic tasks would be relevant, since that's another main use of LLMs beyond chatbots and RP.
Censored models sometimes refuse even to google something if it triggers their safety guardrails, but many abliterated models that weren't fine-tuned to heal their lobotomy perform worse than their censored counterparts at tool usage and general agentic tasks. So it'd be useful to see not only how well a model does in the W/10 test but also how well it handles tool calling and agentic tasks when required.
newdoria88 changed discussion title from "Add agentic performance tests" to "Add agentic and tool usage performance tests"
newdoria88 changed discussion title from "Add agentic and tool usage performance tests" to "Add agentic and tool usage tests"