Long Context Understanding on Ada-LEval TSort

Metrics

2k
4k
8k
16k
32k
64k
128k

Results

Performance results of various models on this benchmark

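| Model | 2k | 4k | 8k | 16k | 32k | 64k | 128k | Paper Title |
|---|---|---|---|---|---|---|---|---|
| GPT-4-Turbo-0125 | 15.5 | 16.5 | 8.5 | 5.5 | 2.0 | 4.0 | 2.0 | GPT-4 Technical Report |
| GPT-3.5-Turbo-1106 | 4.0 | 4.5 | 4.5 | 5.5 | - | - | - | - |
| InternLM2-7b | 5.1 | 3.9 | 5.1 | 4.3 | - | - | - | InternLM2 Technical Report |
| GPT-4-Turbo-1106 | 18.5 | 15.5 | 7.5 | 3.5 | 6.0 | 6.0 | 6.0 | GPT-4 Technical Report |
| Vicuna-13b-v1.5-16k | 5.4 | 5.0 | 2.4 | 3.1 | - | - | - | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena |
| Claude-2 | 5.0 | 5.0 | 4.5 | 3.0 | 0.0 | 0.0 | - | - |
| LongChat-7b-v1.5-32k | 5.3 | 5.0 | 3.1 | 2.5 | - | - | - | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena |
| Vicuna-7b-v1.5-16k | 5.3 | 2.2 | 2.3 | 1.7 | - | - | - | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena |
| ChatGLM2-6b-32k | 0.9 | 0.2 | 0.7 | 0.9 | - | - | - | GLM-130B: An Open Bilingual Pre-trained Model |
| ChatGLM3-6b-32k | 2.3 | 2.4 | 2.0 | 0.7 | - | - | - | GLM-130B: An Open Bilingual Pre-trained Model |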