📊 Showing 5 results | 📏 Metric: Prometheus-2 Answer Correctness
Rank | Model | Paper | Prometheus-2 Answer Correctness | Paper Date | Code |
---|---|---|---|---|---|
1 | GPT-3.5-Turbo-0613-16k | Language Models are Few-Shot Learners | 0.24 | 2020-05-28 | 📦 ggml-org/llama.cpp 📦 ggerganov/llama.cpp 📦 karpathy/llm.c |
2 | Llama-3-IT-8B-8k | The Llama 3 Herd of Models | 0.23 | 2024-07-31 | 📦 zhuzilin/ring-flash-attention 📦 wenet-e2e/west 📦 zechenli03/sensorllm 📦 ziye2chen/LLMs-for-Mathematical-Analysis 📦 willemsenbram/mention-detection-vgd |
3 | Llama-3-IT-8B-32k | The Llama 3 Herd of Models | 0.23 | 2024-07-31 | 📦 zhuzilin/ring-flash-attention 📦 wenet-e2e/west 📦 zechenli03/sensorllm 📦 ziye2chen/LLMs-for-Mathematical-Analysis 📦 willemsenbram/mention-detection-vgd |
4 | GPT-4o-2024-08-06-128k | GPT-4 Technical Report | 0.23 | 2023-03-15 | 📦 openai/evals 📦 shmsw25/factscore 📦 unispac/visual-adversarial-examples-jailbreak-large-language-models |
5 | Mistral-v02-7B-32k | Mistral 7B | 0.19 | 2023-10-10 | 📦 mistralai/mistral-src 📦 facebookresearch/fairseq2 📦 mgmalek/efficient_cross_entropy |