Posts by: Javier Marín

May 5, 2026

Hallucinations in LLMs Are Not a Bug in the Data

Subtitle: It’s a feature of the architecture

Summary: Hallucination in LLMs is not a data quality problem. It is not a training problem. It is not a problem you can solve with more [RLHF](https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedbac), better filtering, or a larger context window. **It is a structural property of what these systems are optimized to do.** I have held this position for months, and the reaction is predictable: researchers working on retrieval augmentation, fine-tuning pipelines, and alignment techniques would prefer a more optimistic framing. I understand why. What has been missing from this argument is geometry. Intuition about objectives and architecture is necessary but not sufficient. We need to open the model and look at what is actually happening inside when a system produces a confident wrong answer. Not at the logits. Not at the attention patterns. At the internal trajectory of the representation itself, layer by layer, from input to output. That is what the work I am presenting here did.
