Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
burtenshaw 
posted an update about 7 hours ago
Post
383
The open LLM leaderboard is completed, retired, dead, ‘ascended to a higher plane’. And in its shadow we have an amazing range of leaderboards built and maintained by the community.

In this post, I just want to list some of those great leaderboards that you should bookmark for staying up to date:

- Chatbot Arena LLM Leaderboard is the first port of call for checking out the best model. It’s not the fastest because humans will need to use the models to get scores, but it’s worth the wait. lmarena-ai/chatbot-arena-leaderboard

- OpenVLM Leaderboard is great for getting scores on vision language models opencompass/open_vlm_leaderboard

- Ai2 are doing a great job on RewardBench and I hope they keep it up because reward models are the unsexy workhorse of the field. allenai/reward-bench

- The GAIA leaderboard is great for evaluating agent applications. gaia-benchmark/leaderboard

🤩 This seems like such a sustainable way of building for the long term, where rather than leaning on a single company to evaluate all LLMs, we share the load.

Dont forget the mteb leaderboard