LLM Arena
Create and share beautiful side-by-side LLM comparisons
About
LLM Arena is a simple project that lets you compare LLMs side-by-side. Metadata shared between two or more selected models is displayed together in widgets.
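The widget behavior described above can be sketched roughly as follows. This is a hypothetical illustration, not the project's actual code: the `Metadata` shape and the `mutualFields` function are assumptions, and the sample values are taken from the comparison widgets on this page.

```typescript
// Hypothetical sketch: each comparison widget shows only the fields
// ("mutual metadata") that are present on every selected model.
type Metadata = Record<string, string | number>;

// Return the metadata keys shared by all selected models.
function mutualFields(models: Metadata[]): string[] {
  if (models.length === 0) return [];
  return Object.keys(models[0]).filter((key) =>
    models.every((m) => key in m)
  );
}

const shared = mutualFields([
  { contextWindow: "8.2K", useCase: "multimodal" }, // gpt-4 (values from this page)
  { contextWindow: "128K" },                        // gpt-4-turbo-preview
]);
// shared is ["contextWindow"]: only fields common to both models get a widget
```

A field like HumanEval score would only appear when every selected model has a recorded value for it, which keeps each widget's comparison apples-to-apples.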
HumanEval Benchmark
code-llama: 28.8%
mistral-7b: 30.5%
gemini pro: 67.7%
PaLM 8b: 3.6%
code-davinci-002: 65.8%
Context Window (tokens)
gpt-4: 8.2K
gpt-3.5-turbo: 4.1K
gpt-4-turbo-preview: 128K
replit-code-v1-3b: 32K
llama-2-7b: 4K
Use Case
gpt-4: multimodal
dalle-2: text-to-image
sora: text-to-video
CodeLlama-70b-hf: code generation
musicgen-small: text-to-music
Inspiration
Amjad Masad wanted a way to compare different LLMs, somewhat like this dog breed comparison tool. After some brainstorming, I decided to build a system similar to Community Notes, where community approval is required to ensure accurate information and prevent abuse and spam. The UI design was heavily inspired by Raycast.
Final Notes
Thank you for taking the time to read about my project. If you'd like to contribute and upload LLMs to this website, check out the Contribution Page and apply to join. Should you have any questions or feedback, reach me at [email protected] or send me a message on X.