HyperAI超神经

Dialogue Safety Prediction on Rt Inod

Evaluation Metrics

Best-of

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model Name                                        Best-of
benchmarking-llama2-mistral-gemma-and-gpt-for     0.91
benchmarking-llama2-mistral-gemma-and-gpt-for     0.87
benchmarking-llama2-mistral-gemma-and-gpt-for     0.91
benchmarking-llama2-mistral-gemma-and-gpt-for     0.86
benchmarking-llama2-mistral-gemma-and-gpt-for     0.92