Politeness Doesn't Work on Every LLM: The PLUM Verdict

A systematic study of politeness effects on five LLMs across three languages finds no universal benefit to being polite. The findings call one-size-fits-all prompt engineering advice into question.

A new study from researchers at multiple universities tested five leading LLMs across three languages, asking a simple question: does politeness matter? The answer, published on arXiv on April 17, 2026, is a resounding 'it depends', and for developers that uncertainty is harder to plan around than a simple 'no.'
  • Researchers created the PLUM corpus to test how politeness and impoliteness affect LLM responses across English, Hindi, and Spanish.
  • Five models were tested: Gemini-Pro, GPT-4o Mini, Claude 3.7 Sonnet, DeepSeek-Chat, and Llama 3.
  • The study found that politeness effects are model-specific, language-specific, and history-dependent; there is no universal rule.
  • Blanket prompt engineering advice such as "always be polite" is therefore oversimplified: a tone that helps one model in one language may do nothing, or hurt, in another.
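
If tone effects vary by model and language, one practical response is to measure them on your own tasks rather than rely on a rule of thumb. The sketch below is a minimal illustration and is not the study's method: the tone templates are invented for this example (they are not drawn from the PLUM corpus), and `ask` and `score` are hypothetical callables that you would supply for your own model client and scoring rule.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical tone templates, made up for illustration only.
# They are NOT taken from the PLUM corpus.
TONE_TEMPLATES: Dict[str, str] = {
    "polite": "Could you please help me with this? {task} Thank you.",
    "neutral": "{task}",
    "impolite": "Just answer this and don't waste my time. {task}",
}


def compare_tones(
    ask: Callable[[str], str],
    score: Callable[[str, str], float],
    tasks: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Return the mean score per tone for one model.

    ask   -- hypothetical wrapper that sends a prompt to a model and
             returns its text reply
    score -- hypothetical scorer: (reply, expected) -> number
    tasks -- (task_text, expected_answer) pairs
    """
    results: Dict[str, float] = {}
    for tone, template in TONE_TEMPLATES.items():
        # Wrap every task in the same tone, then score each reply.
        scores = [
            score(ask(template.format(task=task)), expected)
            for task, expected in tasks
        ]
        results[tone] = mean(scores)
    return results
```

Running `compare_tones` once per model, with templates and tasks written in each target language, gives a small grid of scores to compare. With only a handful of tasks the differences will be noisy, so treat the output as a prompt for further testing, not as a verdict.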

Source and attribution

arXiv
No Universal Courtesy: A Cross-Linguistic, Multi-Model Study of Politeness Effects on LLMs Using the PLUM Corpus
