Mastering AI in Enterprise Workflows: A Case Study with...

When enterprises deploy large language models as autonomous agents, the results aren't just about flashy AI capabilities. The real challenge lies in managing verbose tool responses from enterprise systems. Overwhelming context can lead to errors, context overflow, and soaring inference costs. Microsoft Dynamics 365 Finance and Operations offers an intriguing case study in tackling these obstacles, especially in the area of automated expense itemization.

Breaking Down the Expense Itemization Problem

Microsoft Dynamics 365 conducted an evaluation of four different GPT-5 configurations using a benchmark of 50 hotel expense tasks. The options included: no user model, full conversation history, context pruned to the last 5 tool call/response pairs, and pruning with automated summarization. Let’s get into the numbers. The no-user-model baseline showed a dismal 8.0% complete itemization. Retaining the full context improved completion significantly to 71.0%, but it came at a hefty price, 1,480,996 tokens and over 14 hours of processing per benchmark.

What’s intriguing is how strategic pruning of the context changed the game. By cutting down to the last five tool interactions, completion went up to 79.0%. Token use dropped by over a million to 535,274, and the runtime was trimmed to just over five hours. Then, automated summarization pushed those numbers even further, achieving a 91.6% completion rate with 99.64% of the amounts accurately itemized. This setup used 553,374 tokens and ran for 5.79 hours.

The ROI of Selective Context Retention

Here’s where it gets interesting. For enterprise workflows, the lesson is clear: more data doesn't always lead to more insights. Enterprises don’t buy AI for the sake of AI. They buy outcomes. And in this case, the outcome is a dramatic increase in efficiency and reliability.

But why should we care? Because this is about more than just expense reports. It’s about the fundamental approach to deploying AI in complex systems. The gap between pilot and production, where most projects falter, can be bridged by focusing on what data truly adds value, selectively retaining relevant information. Could this methodology redefine how we approach AI implementation across other enterprise areas?

These results, while focused on expense itemization, have broader implications. They underscore the importance of context management in AI-powered operations. The real cost of AI isn't just in development or initial deployment but in how intelligent agents process and prioritize information to drive business outcomes efficiently.

What’s Next for Enterprises?

We can't ignore the evident gains seen with the inclusion of AI-driven summarization tools. The efficiency improvements suggest a promising path forward for enterprises grappling with increasingly complex data environments. Will enterprises take the cue and start pruning their processes for better ROI? The consulting deck often says transformation, but the P&L says different. One thing is certain: as AI continues to evolve, the strategies for its best use will need to be as dynamic and adaptive as the technology itself.

Mastering AI in Enterprise Workflows: A Case Study with Microsoft Dynamics

Breaking Down the Expense Itemization Problem

The ROI of Selective Context Retention

What’s Next for Enterprises?

Key Terms Explained