Enhancing Legal Classifiers: The Role of LLMs in Question Generation

The FETCH classifier, an ensemble of low-cost large language models (LLMs), is designed to refine the classification of legal issues. The approach is promising, but one critical component remains difficult: generating high-quality follow-up questions.
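To make the ensemble idea concrete, here is a minimal sketch of one way several cheap classifiers could be combined. It assumes a simple majority vote, which is an assumption made for illustration: this article does not say how FETCH aggregates its members. The names `ensemble_classify`, `stub_a`, `stub_b`, and `stub_c` are hypothetical, and the stubs use keyword matching as stand-ins for real LLM calls.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical type: each ensemble member maps an intake narrative to a label.
Classifier = Callable[[str], str]


def ensemble_classify(text: str, members: Sequence[Classifier]) -> Tuple[str, float]:
    """Return the majority label and the share of members that voted for it.

    Majority voting is an assumed aggregation rule, not FETCH's documented one.
    """
    if not members:
        raise ValueError("at least one classifier is required")
    votes = Counter(member(text) for member in members)
    label, count = votes.most_common(1)[0]
    return label, count / len(members)


# Stand-in members; in practice each would wrap a call to a low-cost LLM.
def stub_a(text: str) -> str:
    return "housing" if "evict" in text.lower() else "other"


def stub_b(text: str) -> str:
    return "housing" if "landlord" in text.lower() else "other"


def stub_c(text: str) -> str:
    return "family" if "custody" in text.lower() else "other"


label, agreement = ensemble_classify(
    "My landlord is trying to evict me.", [stub_a, stub_b, stub_c]
)
print(label, round(agreement, 2))
```

Under this assumed design, a low agreement share is one plausible signal that a follow-up question is worth asking, although nothing here confirms that FETCH works this way.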

Low-Cost Models Fall Short

The core function of FETCH is categorizing legal problems, and low-cost LLMs handle that task well. Crafting plain-language follow-up questions is another matter. The questions these models generate lack the needed sophistication, which signals the need for more advanced, and more costly, models.

Why does this matter? As legal intake workers interact with clients, the ability to ask the right questions is essential. It ensures accurate classification and effective assistance. Can a low-cost approach truly serve the diverse complexities of legal problems?

Introducing GPT-5

GPT-5, a high-cost model, is then added to the pipeline. Its inclusion significantly improves the classifier's performance: its stronger questioning draws out pertinent details from applicants, which boosts classification accuracy.

However, this isn't a perfect solution. Prompt engineering alone isn't enough to raise question quality. LLM and human ratings of the generated questions also diverge, which underscores the nuances that models still struggle to capture despite recent advances.
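One common way to quantify how far two sets of ratings diverge is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the article does not say which statistic the paper used, the 1-to-3 quality scale is an assumption, and the ratings are made-up values, not results from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal frequency per score.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)


# Made-up ratings of six follow-up questions on an assumed 1-3 quality scale.
human_ratings = [1, 2, 3, 3, 2, 1]
llm_ratings = [1, 2, 3, 2, 2, 3]

kappa = cohen_kappa(human_ratings, llm_ratings)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 means the LLM judge tracks human raters closely, while a value near 0 means agreement is no better than chance. For ordinal scales, a weighted kappa or a rank correlation may be a better fit.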

Inconsistent Elicitation and Protocol Gaps

FETCH's performance isn't uniform across legal categories. In particular, issues like domestic violence don't align with existing family law screening protocols. This mismatch points to a broader need: specialized screening panels for certain legal domains.

That raises a question: are current legal intake systems prepared to integrate such advanced models effectively? The FETCH classifier's experience suggests there's room for improvement, not just in technology but in procedural adaptation.

The paper's key contribution lies in highlighting these gaps and proposing solutions. It is also a call to action for legal systems to embrace technological advances without sidelining human expertise.
