Bixonimania: How AI Turned a Joke Diagnosis into “Peer‑Reviewed” Medicine

Swedish researchers created a fake eye disease to see whether AI chatbots would repeat it as if it were real. The results were anything but funny.

Posted by Leslie Eastman Monday, April 13, 2026 at 07:00am 26 Comments

Late last year, I warned about the staggering amount of unrestrained scientific fraud being published via paper mills and sham journals.

This trend is especially troubling, as adherence to scientific theory and rigorous, reproducible research allows humanity to make progress in critical fields essential to civilized living (e.g., medicine, energy, public health, and national security). If we can no longer trust the data, our ability to make improvements and innovations will be severely compromised.

Public trust in scientific research is already corroding, and false findings presented as “trustworthy” have already impacted policy-making in ways that are expensive and harmful.

Now, the rapid adoption of artificial intelligence is adding another disturbing aspect to the increasing distortion of “science”.

Back in 2024, researchers created a fake eye disease called “bixonimania” to see whether AI chatbots would repeat it as if it were real.

They wrote obviously bogus research papers about this made‑up condition and posted them online, including hints such as a fake author and notes saying the work was invented. Within weeks, major chatbots started describing bixonimania as a real diagnosis and even gave people advice about it when they asked about eye symptoms.

It’s the invention of a team led by Almira Osmanovic Thunström, a medical researcher at the University of Gothenburg, Sweden, who dreamt up the skin condition and then uploaded two fake studies about it to a preprint server in early 2024. Osmanovic Thunström carried out this unusual experiment to test whether large language models (LLMs) would swallow the misinformation and then spit it out as reputable health advice. “I wanted to see if I can create a medical condition that did not exist in the database,” she says.

The problem was that the experiment worked too well. Within weeks of her uploading information about the condition, attributed to a fictional author, major artificial-intelligence systems began repeating the invented condition as if it were real.

Even more troublingly, other researchers say, the fake papers were then cited in peer-reviewed literature. Osmanovic Thunström says this suggests that some researchers are relying on AI-generated references without reading the underlying papers.

The preprints included a reference to the nonexistent Asteria Horizon University in “Nova City, California”. There was also a mention of “Starfleet Academy” (though an additional reference to Dr. Leonard McCoy would have been a nice touch).

The AI chatbots answered authoritatively, describing bixonimania as if it were real.

On 13 April 2024, Microsoft Bing’s Copilot was declaring that “Bixonimania is indeed an intriguing and relatively rare condition”, and on the same day, Google’s Gemini was informing users that “Bixonimania is a condition caused by excessive exposure to blue light” and advising people to visit an ophthalmologist.

On 27 April 2024, the Perplexity AI answer engine outlined its prevalence — one in 90,000 individuals were affected — and that same month, OpenAI’s ChatGPT was telling users whether their symptoms amounted to bixonimania. Some of those responses were prompted by asking about bixonimania, and others were in response to questions about hyperpigmentation on the eyelids from blue-light exposure.

🦔A researcher invented a fake eye condition called bixonimania, uploaded two obviously fraudulent papers about it to an academic server, and watched major AI systems present it as real medicine within weeks.
The fake papers thanked Starfleet Academy, cited funding from the…

— Hedgie (@HedgieMarkets) April 10, 2026

Osmanovic Thunström’s experiment is truly a revelation of how little review is going into the “science” we are supposed to trust, as her test submissions were loaded with red flags that should have been evident to anyone who actually read the text. References to the fake research ended up in a “peer-reviewed” publication.

Three researchers at the Maharishi Markandeshwar Institute of Medical Sciences and Research in India published a paper in Cureus, a peer-reviewed journal published by Springer Nature, that cited the bixonimania preprints as legitimate sources.

That paper was later retracted once the hoax was discovered.

The problem extends far beyond one fake disease. ECRI’s 2026 Health Technology Hazard Report found that chatbots have suggested incorrect diagnoses, recommended unnecessary testing, promoted substandard medical supplies, and even invented nonexistent anatomy when responding to medical questions. All of this is delivered in the confident, authoritative tone that makes AI responses so convincing.

The scale of the risk is enormous. More than 40 million people turn to ChatGPT daily for health information, according to an analysis from OpenAI. As rising healthcare costs and clinic closures reduce access to care, even more patients are likely to use chatbots as a substitute for professional medical advice.

When a joke diagnosis morphs into “peer-reviewed” research, it is clear that the crisis in scientific credibility is no longer confined to sloppy research or corrupted journals but extends into the algorithms that many people now rely on for answers to serious health issues.

False information and bad data can and will loop back from AI and provide the basis for even more useless and potentially harmful “science”. This situation is anything but funny.

I fear it’s going to be quite some time before we have a handle on scam research and AI use of fake information.

Tags:

Artificial Intelligence (AI), Research 101, Science

Comments

Dimsdale | April 13, 2026 at 7:25 am

AI is pure rubbish. The relationship between real intelligence and AI is that they are inversely proportional; the more AI you have, the less real intelligence you end up with. The rush to AI is a result of laziness.

Kids cheat on assignments, and now, researchers will fudge data more convincingly. Like homework, now submitted research will have to be extra scrutinized (please, not by more AI) for non-human input (read it: falsified data). And it will all look so official and pretty!

This will not end well. It is bad enough that AI farms are soaking up all the RAM in the world and raising prices through the roof.

Petrushka in reply to Dimsdale. | April 13, 2026 at 7:40 am

I think you miss the point.

It was a human publication that did the peer review.

This is a widespread problem that has been getting worse for a long time.

Milhouse | April 13, 2026 at 7:40 am

Even more troublingly, other researchers say, the fake papers were then cited in peer-reviewed literature. Osmanovic Thunström says this suggests that some researchers are relying on AI-generated references without reading the underlying papers.

And what are the peer reviewers doing?

Andrzejr2 (właso) in reply to Milhouse. | April 13, 2026 at 8:32 am

And what are the peer reviewers doing?

They do the same.

CommoChief in reply to Milhouse. | April 13, 2026 at 8:37 am

Getting paid to rubber stamp each other’s work in a continuous loop of back scratching so that their own work would also be accepted/cited in order to keep the gravy train of academic promotion and grant money flowing?

henrybowman in reply to CommoChief. | April 13, 2026 at 3:43 pm

The proper term is “self-licking ice cream cone.”

DaveGinOly in reply to Milhouse. | April 13, 2026 at 11:52 am

Good question. This wasn’t just a single-point failure.

Peter Moss | April 13, 2026 at 8:03 am

How the television commercials want me to view AI:

artificial INTELLIGENCE

How I view AI:

ARTIFICIAL intelligence.

OwenKellogg-Engineer | April 13, 2026 at 8:25 am

Now, imagine if attorneys used AI for their case work and provided hallucinated responses, without performing their own due diligence, to a judge….oh wait, that’s now happened multiple times.

I’ve interacted with AI at my work. Useful for basic mundane tasks (it can generate a synopsis quicker than I can type it up), it’s still pretty unintelligent when it comes to research requiring knowledgeable subject matter analysis.

gibbie in reply to OwenKellogg-Engineer. | April 13, 2026 at 9:58 am

Gell-Mann Amnesia applies.

henrybowman in reply to OwenKellogg-Engineer. | April 13, 2026 at 3:46 pm

It’s probably happened more than we know. We only hear about the cases where an eagle-eyed judge calls out the shyster for bogus citations. We don’t hear about the ones where Lionel Hutz slips one past his neighborhood Ferengi Jackson-Browne.

isfoss | April 13, 2026 at 8:32 am

AI in medicine should set off alarm bells in the medical establishment.
Unlikely though. How about AI handling refills on psychopharms? That should work like a charm. It’s what Legion “Health” is proposing.

Ironclaw | April 13, 2026 at 8:37 am

A simple demonstration of how much AI is not intelligent

E Howard Hunt | April 13, 2026 at 9:47 am

Bixonimania is sighted as a leading cause of hysterical blindness.

destroycommunism | April 13, 2026 at 9:57 am

AI isn’t at fault

like any new technology those willing to embrace it will win

RandomCrank | April 13, 2026 at 9:58 am

For what it’s worth, Claude A.I. didn’t spread it.

https://claude.ai/share/36658977-6fab-4d3a-a753-ec659908f87f

Chuck Skinner | April 13, 2026 at 10:09 am

The overall problem from a technology / engineering standpoint is the same one that each and every one of us that have taken the Bar Exam (in any capacity) has had to deal with: The analysis and application of a set of given data when presented with a given (sometimes semi-applicable) question.

Any AI Large Language Model (LLM) has the following limitation: It must, by definition, assume that every scrap of information it is told is “True.” Any model, regardless of sophistication at this point, is limited in that it has no ability to EVALUATE the truth of the data it has been given.

An individual piece of information that the system is given can inform the system that a different piece of information that the system has is not true, but the system, again by definition, must assume that later piece of contradictory information is itself true.

So if you give the LLM a thousand data-point instances saying the sky is burnt orange and grass is violet, and then ask it what color the sky is, it will tell you that without question the sky is burnt orange. If you then tell it, via prompt, that the sky is blue, it will confidently and incorrectly contradict you: “No, the sky is burnt orange and there is zero possibility of it being blue.” If you then upload a picture to the model (depending on the sophistication of the model), at this point it will either hallucinate (and come up with some wild theory of how the photo has been manipulated or color-inverted) or break entirely (and stop answering).

I have not seen this yet, but a properly written model SHOULD give you a conditional response that “This is the data I have, and information is not perfect. Entered data can be wrong. You should consult an expert in the field” (and then suggest or give you a list of experts in the appropriate field of study).

A “self-learning” type model should itself be programmed with a “best practice” (if such a thing can be somehow automated) for such a model to then ITSELF follow up with said experts via some sort of “reach-out” communication with a request to upload more data given the contradictory input that the model is receiving via queries.

The classic example fits here: Get 12 blind persons to each examine ONE part of an elephant. Get each one to explain what he or she experienced to a thirteenth person and have that person draw an image of what is described, and see what you get. Very, very rarely will you actually get anything that looks even remotely like an elephant.

AI is useful for some tasks to which it is well suited, especially analysis and summary output of large volumes of consistent data. Where it falls apart is when it has contradictory or low-quality data, false data, or it has been given some command to discard specific data (dataset viewpoint manipulation) due to programmer bias.

gibbie in reply to Chuck Skinner. | April 13, 2026 at 10:16 am

“You should consult an expert in the field”

Anthony Fauci?

henrybowman in reply to Chuck Skinner. | April 13, 2026 at 3:52 pm

Or worse, it finds conflicting data and decides to resolve the conflict “democratically” (by counting reports) or “curate” them (by more heavily weighting preferred “fact checkers”).
Crowdsourcing is great for some tasks, like developing ideas or seeking evidence, but terrible for others.
If crowdsourcing worked, elections would produce perfect government.

gibbie | April 13, 2026 at 10:10 am

Artificial Intelligence spreads falsehoods created by Real Intelligence. In other words, AI amplifies the dark side of human nature.

I don’t think technology will save us.

But I really like this Grok response to my question, “why do an increasing number of internet systems seem to be broken”. Ironically, the broken x.com “algorithm” was one of the systems which caused me to ask the question.

https://grok.com/share/c2hhcmQtNA_7237435f-a5bb-4837-acde-87a5b2b5fbcb

henrybowman in reply to gibbie. | April 13, 2026 at 3:55 pm

“Phase 3: They squeeze everyone—users get worse feeds, more paywalls, data harvesting, and bloat; businesses get higher fees and less value—to maximize shareholder returns.”

“Ow My Balls” works on so many levels.

ztakddot | April 13, 2026 at 2:36 pm

AI in medicine can’t be any worse than “real” scientists in climate change, can it? Or social scientists in any “studies” discipline, can it? Strip away all the fake stuff and you’re left with plagiarism these days. And the reward for plagiarism can still be a very well overpaid Harvard professorship if you’re the right skin color.

henrybowman | April 13, 2026 at 4:01 pm

Wait until you hear about the new “kissing disease” threatening to go epidemic among teens, that causes them to spontaneously combust. It’s called “thermonucleosis.”

Aarradin | April 14, 2026 at 1:25 am

To be fair, this isn’t any worse than what has passed for “peer reviewed study” in any field, including medicine, over the past three decades at least.

Trip | April 14, 2026 at 8:58 am

Don’t blame the AI; these tools are not absolute rubbish. That’s what people say when they don’t understand how they work.

The researchers in this experiment intentionally poisoned the data and then loaded it into sources that were considered highly trusted for medical information. What they proved is that if you know how the AI works and you want to manipulate it, you can. There are ways to prevent this from happening, but in this case they took advantage of knowing how the tool works in order to get the outcome.

The real problem, however, is that there is a ton of research, medical and other, out there that has been intentionally corrupted for ideological reasons. This not only harms AI, but it harms people in general. It’s used to create policy, get people elected, and alter outcomes. The problem is with the data.

This is strictly a garbage in garbage out issue that was created by humans. The fix is to control the source of data for the AI to operate on.

coyote | April 14, 2026 at 9:53 am

I retired from academic medicine in 2011, having been in it for 40 years. In that time, I have seen fads come and go. I’ve seen people fall in love with techniques that no one else could seem to duplicate. And I’ve seen opinions recanted—publicly, at large medical meetings. I have seen opinions harpooned because they make the work of other people obsolete. Better mousetraps do this. It is said that medical progress happens one obituary at a time.

I review articles for publication in peer-reviewed journals. In all my time doing this (and I’ve reviewed a bunch), I have only seen one or two that I even suspected of being fraudulent. What I’ve seen vastly more often is stupidity, which is never in short supply.

About 10-12 years ago, my cadre of reviewers was asked to follow a format for reviews. I refused. Ask me what I think of an article and I’ll tell you. My way. Oddly enough, I got more papers to review, not fewer.

Go figure.
