Long Context Understanding on Ada-LEval TSort

Metrics

128k
16k
2k
32k
4k
64k
8k

Results

Performance results of various models on this benchmark

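| Model Name | 128k | 16k | 2k | 32k | 4k | 64k | 8k | Paper Title | Repository |
|---|---|---|---|---|---|---|---|---|---|
| GPT-4-Turbo-0125 | 2.0 | 5.5 | 15.5 | 2.0 | 16.5 | 4.0 | 8.5 | GPT-4 Technical Report | - |
| GPT-3.5-Turbo-1106 | - | 5.5 | 4.0 | - | 4.5 | - | 4.5 | - | - |
| ChatGLM2-6b-32k | - | 0.9 | 0.9 | - | 0.2 | - | 0.7 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| ChatGLM3-6b-32k | - | 0.7 | 2.3 | - | 2.4 | - | 2.0 | GLM-130B: An Open Bilingual Pre-trained Model | - |
| LongChat-7b-v1.5-32k | - | 2.5 | 5.3 | - | 5.0 | - | 3.1 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| Vicuna-7b-v1.5-16k | - | 1.7 | 5.3 | - | 2.2 | - | 2.3 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| Claude-2 | - | 3.0 | 5.0 | 0.0 | 5.0 | 0.0 | 4.5 | - | - |
| Vicuna-13b-v1.5-16k | - | 3.1 | 5.4 | - | 5.0 | - | 2.4 | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena | - |
| GPT-4-Turbo-1106 | 6.0 | 3.5 | 18.5 | 6.0 | 15.5 | 6.0 | 7.5 | GPT-4 Technical Report | - |
| InternLM2-7b | - | 4.3 | 5.1 | - | 3.9 | - | 5.1 | InternLM2 Technical Report | - |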