PLUM Study: Politeness Effects on LLMs Are Not Universal

Politeness Doesn't Work on Every LLM: The PLUM Verdict

A systematic study of politeness effects on five LLMs across three languages reveals that there is no universal benefit to being polite. The findings force a major rethinking of prompt engineering best practices.

Published May 19, 2026 | 1 min read | By SynapsFlow.com

A new study from researchers at multiple universities tested five leading LLMs across three languages, asking a simple question: does politeness matter? The answer, published on arXiv on April 17, 2026, is a resounding 'it depends', and that uncertainty is far more dangerous for developers than a simple 'no,' because a conditional effect has to be measured for each model and language rather than assumed.

Researchers created the PLUM corpus to test how politeness and impoliteness affect LLM responses across English, Hindi, and Spanish.
Five models were tested: Gemini-Pro, GPT-4o Mini, Claude 3.7 Sonnet, DeepSeek-Chat, and Llama 3.
The study found that politeness effects are model-specific, language-specific, and history-dependent — there is no universal rule.
This means that current prompt engineering advice ("always be polite") is dangerously oversimplified.

Source and attribution

arXiv
No Universal Courtesy: A Cross-Linguistic, Multi-Model Study of Politeness Effects on LLMs Using the PLUM Corpus

Article details

Author: SynapsFlow.com

Published: 19.05.2026 00:16

Updated: 26.05.2026 00:32


Published by SynapsFlow.com as a brand-led AI publication. Reporting, workflow, and corrections remain accountable to the SynapsFlow editorial standards.

Key implications: The PLUM study shatters the assumption that being nice to an LLM universally improves output quality. For developers and product managers, this means that prompt engineering playbooks must become far more granular, accounting for language, model, and even interaction history. The biggest losers are companies that peddle one-size-fits-all 'polite prompt' templates.
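
What a more granular playbook could look like in practice: instead of averaging a tone's effect over everything, a team might measure it separately for each model and language it ships. The Python sketch below is only an illustration of that idea and is not the PLUM authors' method. The model names, the records, and the scores are all made up, and the helper names (`tone_effect_by_cell`, `is_universal_benefit`) are hypothetical. Interaction history is left out to keep the example short.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical evaluation records: (model, language, tone, quality score in [0, 1]).
# These numbers are invented for illustration and are NOT results from the PLUM study.
RECORDS = [
    ("model_a", "en", "polite", 0.80),
    ("model_a", "en", "neutral", 0.70),
    ("model_a", "hi", "polite", 0.60),
    ("model_a", "hi", "neutral", 0.70),
    ("model_b", "en", "polite", 0.50),
    ("model_b", "en", "neutral", 0.50),
]


def tone_effect_by_cell(records, tone="polite", baseline="neutral"):
    """Return {(model, language): mean(tone) - mean(baseline)}.

    Cells that lack scores for either the tone or the baseline are skipped.
    """
    scores = defaultdict(list)
    for model, language, record_tone, score in records:
        scores[(model, language, record_tone)].append(score)

    cells = {(model, language) for (model, language, _tone) in scores}
    effects = {}
    for model, language in sorted(cells):
        treated = scores.get((model, language, tone))
        control = scores.get((model, language, baseline))
        if treated and control:
            effects[(model, language)] = mean(treated) - mean(control)
    return effects


def is_universal_benefit(effects, eps=1e-9):
    """True only if the tone helps in every measured (model, language) cell."""
    return bool(effects) and all(delta > eps for delta in effects.values())


if __name__ == "__main__":
    effects = tone_effect_by_cell(RECORDS)
    for (model, language), delta in effects.items():
        print(f"{model} / {language}: polite minus neutral = {delta:+.2f}")
    print("Universal benefit:", is_universal_benefit(effects))
```

Keeping the effect per (model, language) cell, rather than collapsing it into one average, is the design choice that matters here: a single pooled number could hide a tone that helps one model in one language and hurts it in another.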