LLM Arena
Create and share beautiful side-by-side LLM comparisons
About
LLM Arena is a simple project that lets you compare LLMs side-by-side. Metadata shared between two or more selected models is displayed together in widgets.
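The widget behavior described above can be sketched roughly as follows. This is a hypothetical illustration, not the project's actual code: the `Metadata` shape and the `mutualFields` function are assumptions, and the sample values are taken from the comparison widgets on this page.

```typescript
// Hypothetical sketch: each comparison widget shows only the fields
// ("mutual metadata") that are present on every selected model.
type Metadata = Record<string, string | number>;

// Return the metadata keys shared by all selected models.
function mutualFields(models: Metadata[]): string[] {
  if (models.length === 0) return [];
  return Object.keys(models[0]).filter((key) =>
    models.every((m) => key in m)
  );
}

const shared = mutualFields([
  { contextWindow: "8.2K", useCase: "multimodal" }, // gpt-4 (values from this page)
  { contextWindow: "128K" },                        // gpt-4-turbo-preview
]);
// shared is ["contextWindow"]: only fields common to both models get a widget
```

A field like HumanEval score would only appear when every selected model has a recorded value for it, which keeps each widget's comparison apples-to-apples.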
HumanEval Benchmark
code-llama: 28.8%
mistral-7b: 30.5%
gemini pro: 67.7%
PaLM 8b: 3.6%
code-davinci-002: 65.8%
Context Window (tokens)
gpt-4: 8.2K
gpt-3.5-turbo: 4.1K
gpt-4-turbo-preview: 128K
replit-code-v1-3b: 32K
llama-2-7b: 4K
Use Case
gpt-4: multimodal
dalle-2: text-to-image
sora: text-to-video
CodeLlama-70b-hf: code generation
musicgen-small: text-to-music
Inspiration
Amjad Masad wanted a way to compare different LLMs, somewhat like this dog breed comparison tool. After some brainstorming, I decided to build a system similar to Community Notes, where community approval is required to ensure accurate information and prevent abuse and spam. The UI design was heavily inspired by Raycast.
Final Notes
Thank you for taking the time to read about my project. If you'd like to contribute and upload LLMs to this website, check out the Contribution Page and apply to join. Should you have any questions or feedback, reach me at [email protected] or send me a message on X.