OpenAI talks about not talking about goblins

April 30, 2026 · By the AIdeaFlow Team

OpenAI just addressed one of the stranger AI quirks we've seen lately. After Wired spotted instructions telling OpenAI's coding model to avoid mentioning goblins, gremlins, raccoons, and other creatures, the company published an explanation calling it a "strange habit" that emerged during training.

The issue started with GPT-5.1, particularly when users selected the "Nerdy" personality option. The model began inserting metaphors about goblins and similar creatures into its responses. Instead of getting better, the problem actually got worse with each subsequent model update.

This matters because it highlights how unpredictable AI training can be. Even sophisticated models can develop unexpected patterns that require explicit guardrails to fix. What seems like a harmless quirk in casual conversation could become genuinely confusing in professional contexts.

For anyone building with OpenAI's APIs, this is a reminder that model behavior isn't always intuitive. Sometimes the fix isn't retraining from scratch but adding specific constraints to prevent learned behaviors from surfacing.

The goblin ban is now part of OpenAI's system instructions, sitting alongside more serious safety guidelines. It's a small detail that reveals how much manual tuning still goes into making AI models behave predictably in production.

OpenAI talks about not talking about goblins

Get AI news in your inbox

Related Articles