A leaderboard would be awesome, especially for comparing "usability" (from an alignment perspective).
There is one leaderboard on this topic: https://huggingface.co/spaces/AI-Secure/llm-trustworthy-leaderboard
But I think the points you raise are interesting in their own right.
Also, subjective work could be standardized to some extent :)
How did you obtain those scores?
Also, what do the values mean?
Maybe I'm missing the point; if so, please let me know, I would love to understand! :)
But otherwise I can't think of what "health -3" means or how it compares to "health +15".
(I really don't want to be rude, so sorry if it sounds that way! :) )