Compass Academic Leaderboard (Full Version)

--WIP--

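| Index | Model Name | Parameters | OpenSource | IFEval | drop | bbh | GPQA_diamond | hellaswag | musr_average | korbench_single | math_prm800k_500 | aime2024 | cmmlu | mmlu | mmlu_pro | openai_humaneval | lcb_code_generation | bigcodebench_hard_instruct |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | DeepSeek-R1-Distill-Llama-70B | 456B | OpenSource | 83.55 | 89.37 | 80.92 | 58.08 | 88.46 | 68.55 | 55.12 | 76.80 | 10.00 | 83.86 | 88.92 | 75.09 | 89.02 | 36.25 | 22.97 |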