Catch Me If You Can? Not Yet: LLMs Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors

Zhengxiang Wang*, Nafis Irtiza Tripto*, Solha Park, Zhenzhen Li, Jiawei Zhou
EMNLP 2025 (Findings)

Abstract

As large language models (LLMs) become increasingly integrated into personal writing tools, a critical question arises: can LLMs faithfully imitate an individual's writing style from just a few examples? Personal style is often subtle and implicit, making it difficult to specify through prompts yet essential for user-aligned generation. This work presents a comprehensive evaluation of state-of-the-art LLMs' ability to mimic personal writing styles via in-context learning from a small number of user-authored samples. We introduce an ensemble of complementary metrics, including authorship attribution, authorship verification, style matching, and AI detection, to robustly assess style imitation. Our evaluation spans over 40,000 generations per model across domains such as news, email, forums, and blogs, covering writing samples from more than 400 real-world authors. Results show that while LLMs can approximate user styles in structured formats like news and email, they struggle with nuanced, informal writing in blogs and forums. Further analysis of prompting strategies, such as the number of demonstrations, reveals key limitations in effective personalization. Our findings highlight a fundamental gap in personalized LLM adaptation and the need for improved techniques to support implicit, style-consistent generation. To aid future research and reproducibility, we open-source our data and code.
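
For readers who want a concrete picture of the few-shot in-context setup described above, the sketch below assembles a style-imitation prompt from a handful of user-authored samples and queries a chat model. It is a minimal illustration only, assuming the OpenAI chat completions API; the helper names (build_style_prompt, imitate), the prompt wording, and the model id are placeholders, not the paper's exact prompts or models.

# Minimal sketch of few-shot in-context style imitation (illustrative only;
# helper names, prompt wording, and model id are assumptions, not the paper's setup).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def build_style_prompt(user_samples: list[str], task: str) -> list[dict]:
    """Assemble a chat prompt asking the model to write `task` in the
    implicit style of the provided user-authored writing samples."""
    demos = "\n\n".join(f"Sample {i + 1}:\n{s}" for i, s in enumerate(user_samples))
    instruction = (
        "Below are writing samples from a single author.\n\n"
        f"{demos}\n\n"
        "Write the following text so that it reads as if the same author wrote it. "
        "Match their tone, sentence length, and word choice; do not mention the samples.\n\n"
        f"Task: {task}"
    )
    return [{"role": "user", "content": instruction}]


def imitate(user_samples: list[str], task: str, model: str = "gpt-4o-mini") -> str:
    """Generate one style-imitating completion for the given writing task."""
    response = client.chat.completions.create(
        model=model,
        messages=build_style_prompt(user_samples, task),
    )
    return response.choices[0].message.content

In the paper's terms, user_samples would be the few demonstrations drawn from one real-world author, and the resulting generation would then be scored by the evaluation ensemble.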

Main Results

Conclusion

This paper presents a comprehensive evaluation of state-of-the-art LLMs on their ability to mimic the implicit writing styles of everyday users through few-shot in-context learning. By combining authorship attribution, verification, stylometric modeling, and AI generation detection across four diverse datasets, we provide strong empirical evidence that, despite improvements from exemplar-based prompting, current LLMs still struggle to reproduce nuanced personal styles, especially in informal and stylistically diverse domains. Our analysis further shows that prompt design choices, such as length alignment and content similarity, moderately affect stylistic fidelity but do not close the personalization gap. These findings highlight fundamental limitations in the stylistic adaptability of LLMs and suggest that achieving truly personalized generation remains an open challenge. Future work should explore richer personalization signals and hybrid prompting or fine-tuning strategies to better capture the subtleties of individual writing styles in real-world settings.
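
As a concrete illustration of one metric family in that ensemble, the sketch below computes a simple style-matching score as cosine similarity between character n-gram profiles of an author's texts and an LLM generation. It is a generic stylometric baseline written with scikit-learn for illustration; it stands in for, and is not identical to, the attribution, verification, and AI-detection models used in the paper.

# Illustrative style-matching score: cosine similarity over character n-gram
# profiles. A generic stylometric baseline, not the paper's exact metrics.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def style_match_score(author_texts: list[str], generated_text: str) -> float:
    """Return cosine similarity between the author's averaged character
    n-gram profile and the generation's profile (higher = closer style)."""
    vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    matrix = vectorizer.fit_transform(author_texts + [generated_text])
    author_profile = np.asarray(matrix[: len(author_texts)].mean(axis=0))
    generated_profile = matrix[-1].toarray()
    return float(cosine_similarity(author_profile, generated_profile)[0, 0])


# Example: compare one generation against three short samples from an author.
score = style_match_score(
    ["I guess it rained again today.",
     "Honestly, the bus was late, as usual.",
     "Not sure I even want to go tomorrow."],
    "Honestly, not sure the weather will hold up tomorrow.",
)
print(f"style match: {score:.3f}")

A score like this only captures surface-level lexical and character-level habits; the paper's point is precisely that such signals, together with attribution, verification, and AI-detection judgments, still separate LLM imitations from genuine author writing in informal domains.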

BibTeX

@inproceedings{wang-etal-2025-catch,
    title = "Catch Me If You Can? Not Yet: {LLM}s Still Struggle to Imitate the Implicit Writing Styles of Everyday Authors",
    author = "Wang, Zhengxiang  and
      Tripto, Nafis Irtiza  and
      Park, Solha  and
      Li, Zhenzhen  and
      Zhou, Jiawei",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-emnlp.532/",
    doi = "10.18653/v1/2025.findings-emnlp.532",
    pages = "10040--10055",
    ISBN = "979-8-89176-335-7"
}