BLOOM 176B (one-shot) | 33.6 | 33.8 | 35.17 | BloombergGPT: A Large Language Model for Finance | - |
OPT 66B (one-shot) | 33.1 | 34.2 | 34.92 | BloombergGPT: A Large Language Model for Finance | - |
PaLM 540B (Self Consistency) | - | 64.5 | 63.4 | Large Language Models Can Self-Improve | - |
PaLM 540B (Self Improvement, Self Consistency) | - | 66.5 | 67.9 | Large Language Models Can Self-Improve | - |
GPT-NeoX (one-shot) | 32.6 | 33.8 | 36.17 | BloombergGPT: A Large Language Model for Finance | - |
PaLM 540B (CoT Prompting) | - | 58.9 | 60.6 | Large Language Models Can Self-Improve | - |
PaLM 540B (Self Improvement, Standard-Prompting) | - | 64.8 | 66.9 | Large Language Models Can Self-Improve | - |
BloombergGPT (one-shot) | 32.9 | 34.4 | 37.33 | BloombergGPT: A Large Language Model for Finance | - |