Harnessing Zero-Shot LLMs for Financial Disclosure Insights
A new study tests whether a trained aggregator can improve zero-shot LLMs at classifying corporate disclosures. The results show higher accuracy than any individual model.
Zero-shot large language models (LLMs) can interpret text without task-specific fine-tuning. But can they read corporate disclosures consistently enough to predict stock movements? A recent study explores this by combining the judgments of diverse zero-shot LLMs with a lightweight trained aggregator.
Multi-Agent Framework
The research employs a multi-agent setup in which three zero-shot agents analyze each disclosure and output a sentiment label, a confidence score, and a short rationale. The twist? A logistic meta-classifier aggregates these outputs to predict next-day stock returns. The study draws on 18,420 U.S. corporate disclosures from Nasdaq and S&P 500 firms, spanning 2018 to 2024.
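To make the aggregation step concrete, here is a minimal sketch of a logistic meta-classifier over agent outputs. It is not the study's implementation: the signed-confidence feature encoding, the binary up/down target, the plain gradient-descent trainer, and the toy data are all assumptions chosen for illustration, and the agents' rationales are ignored.

```python
import math

def featurize(agent_outputs):
    """Assumed encoding: one feature per agent, label (-1, 0, +1) times confidence."""
    return [label * conf for label, conf in agent_outputs]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_up(w, b, x):
    """Estimated probability that the next-day return is positive."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up toy data: three (label, confidence) pairs per disclosure.
disclosures = [
    [(1, 0.9), (1, 0.7), (0, 0.5)],
    [(-1, 0.8), (-1, 0.6), (-1, 0.9)],
    [(1, 0.6), (-1, 0.9), (1, 0.4)],
    [(1, 0.8), (1, 0.5), (-1, 0.3)],
    [(-1, 0.7), (1, 0.4), (-1, 0.8)],
    [(1, 0.5), (1, 0.9), (1, 0.6)],
]
y = [1, 0, 0, 1, 0, 1]  # 1 = stock rose the next day (made-up labels)

X = [featurize(d) for d in disclosures]
w, b = train_logistic(X, y)
predictions = [int(predict_up(w, b, x) > 0.5) for x in X]
```

The point of training the aggregator rather than voting is that the learned weights let it trust some agents more than others. In the third toy disclosure, two agents say "positive" but the meta-classifier can still learn to side with the confident dissenter.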
Performance and Results
The trained aggregator outperformed individual agents, majority voting, confidence-weighted voting, and even a FinBERT baseline. Balanced accuracy rose from 0.561 for the best single agent to 0.612 with the aggregator, a gain of 0.051. Notably, the trained model excelled on disclosures that mixed strong current performance with weak guidance or elevated risk.
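For readers unfamiliar with the baselines and the metric, the sketch below shows one common way to define majority voting, confidence-weighted voting, and balanced accuracy. The function names, the "up"/"down" labels, and the example values are hypothetical; the study may define its baselines differently.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label (three agents, two labels: no ties)."""
    return Counter(labels).most_common(1)[0][0]

def confidence_weighted_vote(outputs):
    """Sum confidences per label and return the label with the largest total."""
    totals = {}
    for label, conf in outputs:
        totals[label] = totals.get(label, 0.0) + conf
    return max(totals, key=totals.get)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

# Made-up agent outputs where the two voting schemes disagree.
outputs = [("up", 0.5), ("up", 0.4), ("down", 0.95)]
by_count = majority_vote([label for label, _ in outputs])
by_confidence = confidence_weighted_vote(outputs)

# A classifier that always says "up" looks good on plain accuracy (3 of 4)
# but balanced accuracy penalizes it for missing every "down" case.
y_true = ["up", "up", "up", "down"]
y_pred = ["up", "up", "up", "up"]
score = balanced_accuracy(y_true, y_pred)
```

Balanced accuracy is a natural choice when up and down days are not equally frequent, because a model cannot score well simply by predicting the more common class.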
Significance and Implications
What does this mean for financial analysts and AI developers? The study suggests that zero-shot LLM agents capture different financial signals, and that supervised aggregation can combine those differences into more accurate classifications. While the improvements may seem incremental, they reflect a meaningful advance in financial AI applications. Are we on the brink of a new era where AI can reliably interpret financial disclosures with minimal human intervention? The paper's key contribution is showing how aggregation can make zero-shot models more useful in complex financial settings.
What's missing is perhaps a broader set of models and scenarios. The study focuses on a particular type of financial disclosure, but how would the approach fare with other types? That's an open question worth exploring.