ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering

Zhiyu Chen Shiyang Li Charese Smiley Zhiqiang Ma Sameena Shah William Yang Wang

Abstract

With the recent advances in large pre-trained language models, researchers have achieved record performance on NLP tasks that mostly focus on language pattern matching. The challenge facing the community is shifting from how to model language to how to imitate the complex reasoning abilities of human beings. In this work, we investigate the application domain of finance, which involves real-world, complex numerical reasoning. We propose a new large-scale dataset, ConvFinQA, which aims to study the chain of numerical reasoning in conversational question answering. Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations. We conduct comprehensive experiments and analyses with both neural symbolic methods and prompting-based methods to provide insights into the reasoning mechanisms of these two families of approaches. We believe our new dataset should serve as a valuable resource to push forward the exploration of real-world, complex reasoning tasks as the next research focus. Our dataset and code are publicly available at https://github.com/czyssrs/ConvFinQA.
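
To make the idea of a "chain of numerical reasoning" across conversation turns more concrete, the following minimal Python sketch shows how a follow-up question might reuse the result of an earlier step. It is an illustration only, not the authors' implementation: the operation names, the `#k` back-reference convention, the `execute` helper, and all figures are assumptions made for this example.

```python
from typing import List, Tuple, Union

Arg = Union[float, str]
Step = Tuple[str, Arg, Arg]

# Hypothetical operation set for this sketch; the page does not list one.
OPS = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}


def execute(program: List[Step]) -> float:
    """Run (op, arg1, arg2) steps in order; '#k' refers to the result of step k."""
    results: List[float] = []

    def resolve(arg: Arg) -> float:
        # A string such as '#0' points back at an earlier step's result.
        if isinstance(arg, str) and arg.startswith("#"):
            return results[int(arg[1:])]
        return float(arg)

    for op, a, b in program:
        results.append(OPS[op](resolve(a), resolve(b)))
    return results[-1]


# Made-up two-turn conversation with made-up figures.
# Turn 1: "What was the change in revenue?"
turn1: List[Step] = [("subtract", 120.0, 100.0)]
# Turn 2: "And what is that as a fraction of the earlier value?"
# The second turn extends the chain by referring back to step 0.
turn2: List[Step] = turn1 + [("divide", "#0", 100.0)]

print(execute(turn1))
print(execute(turn2))
```

The second turn cannot be answered in isolation: its program depends on the value computed for the first turn, which is the kind of long-range dependency the abstract describes.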

Code Repositories

czyssrs/convfinqa (Official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
conversational-question-answering-on | FinQANet (RoBERTa-large) | Execution Accuracy: 68.90; Program Accuracy: 68.24
question-answering-on-convfinqa | FinQANet (RoBERTa-large) | Execution Accuracy: 68.9
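
The page does not define the two metrics. As a rough sketch, assuming the usual reading for program-generation question answering, execution accuracy is the share of examples whose executed result matches the gold answer, and program accuracy is the share whose predicted program matches the gold program. The function names, the tolerance, and the toy data below are assumptions for illustration, and exact string matching stands in for whatever program-equivalence check the official evaluation uses.

```python
from typing import List, Sequence


def execution_accuracy(pred_answers: Sequence[float],
                       gold_answers: Sequence[float],
                       tol: float = 1e-5) -> float:
    """Percentage of examples whose executed answer is within tol of the gold answer."""
    matches = sum(abs(p - g) <= tol for p, g in zip(pred_answers, gold_answers))
    return 100.0 * matches / len(gold_answers)


def program_accuracy(pred_programs: Sequence[str],
                     gold_programs: Sequence[str]) -> float:
    """Percentage of examples whose predicted program equals the gold program.

    Simplification: exact string match. A real evaluation may instead
    treat mathematically equivalent programs as matching.
    """
    matches = sum(p == g for p, g in zip(pred_programs, gold_programs))
    return 100.0 * matches / len(gold_programs)


# Toy data, invented for this sketch.
gold_programs: List[str] = ["subtract(120, 100)", "divide(#0, 100)",
                            "add(3, 3)", "multiply(2, 4)"]
pred_programs: List[str] = ["subtract(120, 100)", "divide(#0, 100)",
                            "multiply(2, 3)", "multiply(2, 5)"]
gold_answers: List[float] = [20.0, 0.2, 6.0, 8.0]
pred_answers: List[float] = [20.0, 0.2, 6.0, 10.0]

exec_acc = execution_accuracy(pred_answers, gold_answers)
prog_acc = program_accuracy(pred_programs, gold_programs)
print(exec_acc, prog_acc)
```

In the toy data, the third prediction reaches the right answer through a different program, so it counts toward execution accuracy but not program accuracy. Cases like this are one reason program accuracy can sit below execution accuracy.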
