Can AI Master South Asian Classical Music?

Large language models (LLMs) have made waves in text and music processing, yet their abilities are tested primarily within Western paradigms, and even advanced models struggle with non-Western traditions. A recent study examines how well these models understand and generate music in South Asian classical genres.

Understanding Beyond Western Boundaries

Western music has long been the focus of AI research, leaving a critical question: Can AI handle the complexities of other musical traditions? This study evaluates LLMs on South Asian classical music, specifically Hindustani classical theory and Bengali forms like Rabindra and Nazrul Sangeet. These genres are governed by raga (the melodic framework) and tala (the rhythmic cycle), which introduce intricacies outside Western harmonic norms.

The paper's key contribution: a 504-question benchmark to assess LLMs on raga grammar, cultural knowledge, and symbolic notation. Frontier models such as Gemini 2.5 Pro excelled, achieving 85-90% accuracy. However, open-source models lagged significantly, stuck in the 23-40% accuracy range. These numbers underscore a disparity in AI's cultural competence.
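To make the accuracy figures concrete, here is a minimal sketch of how per-category accuracy on a multiple-choice benchmark like this one might be computed. The record fields (`category`, `answer`, `predicted`), the category labels, and the sample data are all assumptions made for illustration; they are not taken from the study's code or dataset.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: fraction of correct answers} for multiple-choice records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        totals[rec["category"]] += 1
        if rec["predicted"] == rec["answer"]:
            correct[rec["category"]] += 1
    return {cat: correct[cat] / totals[cat] for cat in totals}


# Made-up records for illustration only; not the benchmark's real questions.
sample_records = [
    {"category": "raga_grammar", "answer": "B", "predicted": "B"},
    {"category": "raga_grammar", "answer": "A", "predicted": "C"},
    {"category": "cultural_knowledge", "answer": "D", "predicted": "D"},
    {"category": "symbolic_notation", "answer": "A", "predicted": "A"},
]

scores = accuracy_by_category(sample_records)
for category, score in scores.items():
    print(f"{category}: {score:.0%}")
```

Reporting accuracy per category rather than as a single number would show whether a model's weakness lies in raga grammar, cultural knowledge, or notation.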

Generation Gaps: Style vs. Structure

When tasked with music generation, the LLMs faced another hurdle. A five-level controlled prompting setup revealed that even top models maintained stylistic accuracy only 40% of the time. This highlights a central challenge: producing music that is both structurally valid and stylistically faithful. Is AI trying to learn a new language without the proper dictionary?
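The distinction between structure and style can be illustrated with a toy check. The sketch below tests only one narrow structural property: whether every note in a generated phrase belongs to the raga's permitted note set. It is not the study's metric. The function name, the space-separated sargam notation, and the lookup table are assumptions, and the single entry uses Bhupali, which is commonly described as a pentatonic raga built on Sa, Re, Ga, Pa, and Dha.

```python
# Hypothetical lookup table for illustration; a real system would need far
# richer raga grammar (ascent/descent rules, characteristic phrases, and more).
ALLOWED_SWARAS = {
    "bhupali": {"S", "R", "G", "P", "D"},
}


def structural_validity(phrase, raga):
    """Return the fraction of notes in `phrase` that the raga permits.

    `phrase` is a space-separated string of sargam symbols, e.g. "S R G P".
    """
    allowed = ALLOWED_SWARAS[raga]
    notes = phrase.split()
    if not notes:
        return 0.0
    return sum(note in allowed for note in notes) / len(notes)


valid_phrase = "S R G P D P G R S"
invalid_phrase = "S R G M P"  # M (Ma) is not part of Bhupali

print(structural_validity(valid_phrase, "bhupali"))    # 1.0
print(structural_validity(invalid_phrase, "bhupali"))  # 0.8
```

A phrase can score 1.0 here and still sound nothing like the raga, which is the gap the generation results point to: membership in the note set is a structural property, while stylistic faithfulness depends on how those notes are ordered and emphasized.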

These findings are a wake-up call. The world is a mosaic of musical expressions, and AI needs to keep up. The study poses an urgent question: Can we build models that reflect cultural diversity, or will AI remain a Western-centric tool?

The Road Ahead

The study builds on prior work at the intersection of musicology and AI, yet much clearly remains to be done. AI developers must address these gaps if they aim for globally applicable models. The paper's ablation study shows where improvements are needed, but it falls to the broader research community to push this frontier.

Code and data are available in the project's repository, enabling further exploration and refinement. While the study doesn't offer all the answers, it marks an important step toward more inclusive AI systems, reminding us that cultural nuances can't be overlooked.