PromptPrint: Unmasking Identity through Short-Form Texts

In digital interactions, authorship attribution has traditionally focused on lengthy, expressive writing. PromptPrint shifts the spotlight to the brief, task-oriented prompts that people send to large language models (LLMs). The paper, published in Japanese, asks an intriguing question: do these short prompts still carry a unique, identifiable fingerprint of their author?

Findings: Lexical Stability and Identity

The study draws on a dataset of 20,680 prompts from 1,034 users, an average of 20 prompts per user. PromptPrint reports three major findings. First, lexical representations outperform semantic encoders, supporting the 'lexical stability hypothesis': a user's identity is bound more tightly to word choice than to underlying intent. Western coverage has largely overlooked this aspect, perhaps because AI discussions tend to emphasize semantics over surface-level wording.
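To make the idea of a lexical fingerprint concrete, here is a minimal sketch of attribution from surface wording alone. It is not the paper's method: the character-trigram features, the cosine scoring, the function names, and the toy prompts are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt

def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased, whitespace-normalized prompt."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_profiles(prompts_by_user):
    """Sum the n-gram counts of each user's prompts into one profile per user."""
    profiles = {}
    for user, prompts in prompts_by_user.items():
        total = Counter()
        for prompt in prompts:
            total.update(char_ngrams(prompt))
        profiles[user] = total
    return profiles

def attribute(prompt, profiles):
    """Return the user whose lexical profile is closest to the prompt."""
    query = char_ngrams(prompt)
    return max(profiles, key=lambda user: cosine(query, profiles[user]))

# Hypothetical toy data, not taken from the study's dataset.
toy_prompts = {
    "user_a": ["pls fix this python func asap", "pls explain this error asap"],
    "user_b": ["Could you kindly review the following function?",
               "Could you kindly summarise the following text?"],
}
profiles = build_profiles(toy_prompts)
guess = attribute("pls refactor this script asap", profiles)
```

In this toy setup the new prompt asks for something neither user has requested before, yet habitual wording such as "pls" and "asap" still pulls it toward user_a. That is the intuition behind favoring word choice over intent.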

Exploring the Uniqueness-Consistency Paradox

Second, the study highlights a 'uniqueness-consistency paradox.' Users can be highly distinctive, yet the same user writes inconsistently as the context changes. This raises a compelling question: how do we balance individuality with adaptability in digital behaviors?
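One way to picture the paradox is to score the two properties separately. The sketch below uses hypothetical definitions that are not taken from the paper: consistency is the mean word overlap among a user's own prompts, and uniqueness is one minus the mean vocabulary overlap with other users.

```python
from itertools import combinations
from statistics import mean

def tokens(text):
    """Lowercased whitespace tokens of a prompt, as a set."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def vocabulary(prompts):
    """All tokens a user has produced across their prompts."""
    return set().union(*[tokens(p) for p in prompts])

def consistency(prompts):
    """Assumed metric: mean pairwise overlap among one user's own prompts."""
    pairs = list(combinations(prompts, 2))
    if not pairs:
        return 0.0
    return mean([jaccard(tokens(p), tokens(q)) for p, q in pairs])

def uniqueness(user, prompts_by_user):
    """Assumed metric: 1 minus the mean vocabulary overlap with every other user."""
    own = vocabulary(prompts_by_user[user])
    others = [vocabulary(ps) for u, ps in prompts_by_user.items() if u != user]
    if not others:
        return 1.0
    return 1.0 - mean([jaccard(own, other) for other in others])

# Hypothetical toy data: u1 uses unusual verbs but switches topic between prompts.
data = {
    "u1": ["zorp the parser", "glim my essay"],
    "u2": ["fix the parser", "edit my essay"],
}
u1_uniqueness = uniqueness("u1", data)
u1_consistency = consistency(data["u1"])
```

Here u1 is moderately distinctive relative to u2 while showing no word overlap between their own two prompts, a small-scale version of a user who is identifiable yet inconsistent.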

Security and Privacy Implications at Scale

Finally, the study's adversarial analysis uncovers a vulnerability spectrum. Identity signals withstand minor lexical perturbations, but they degrade significantly under semantic paraphrasing. The benchmark results indicate that prompt-based identity can serve as a behavioral biometric. This has serious implications for security and privacy, especially given the potential for misuse or surveillance on digital platforms.
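The contrast between the two attack types can be illustrated with a small sketch. It assumes a character-trigram overlap score as a stand-in for an identity signal; the score, the example prompts, and the perturbations are all hypothetical and are not drawn from the paper's benchmark.

```python
def trigram_set(text):
    """Set of character trigrams in a lowercased, whitespace-normalized prompt."""
    text = " ".join(text.lower().split())
    return {text[i:i + 3] for i in range(len(text) - 2)}

def overlap(a, b):
    """Jaccard overlap between the trigram sets of two prompts."""
    ta, tb = trigram_set(a), trigram_set(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

# Hypothetical prompts used only for illustration.
original = "pls fix this python func asap"
typo = "pls fix this pyhton func asap"                        # minor lexical perturbation
paraphrase = "Could you repair the Python function quickly?"  # semantic paraphrase

lexical_score = overlap(original, typo)
paraphrase_score = overlap(original, paraphrase)
```

A single transposed pair of letters leaves most trigrams intact, so lexical_score stays high. The paraphrase keeps the intent but replaces nearly all of the wording, so paraphrase_score drops sharply, which mirrors the vulnerability pattern described above.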

PromptPrint offers a new perspective on user modeling in LLM interactions, and in doing so it challenges existing notions of privacy and security. The authors plan to release their data and code once the paper is accepted, so the study's impact may extend well beyond academic circles.
