For years now, artificial intelligence has been saddled with accusations of bias—from politics to race and gender.
Even Democrats recognize that the likes of ChatGPT and Gemini demonstrate a clear left-leaning slant, favoring liberals over conservatives. And for good reason: Based on their responses to user queries, the most prominent large language models (LLMs) are indeed biased against Republicans.
However, the bias runs even deeper than that, and new research proves it.
Put aside politics for now and consider hiring, an area where LLMs are already being deployed in the real world.
AI recruitment is becoming increasingly common. We are seeing billion-dollar valuations for AI-native recruitment startups like Mercor ($2 billion) and Paradox ($1.5 billion), alongside the deployment of similar systems by established platforms such as Indeed and LinkedIn. These systems process hundreds of millions of candidate profiles and are already used by major companies; Paradox, for instance, lists GM, Dell, McDonald’s, and Marriott as customers.
According to a new study by machine learning researchers Adam Karvonen and Sam Marks, today’s LLMs, including the models behind ChatGPT and Gemini, consistently favor black candidates over white candidates, and female candidates over male candidates, in hiring decisions. The bias is stark: up to a 12% difference in interview rates for white men. A black woman named “Tamika Williams” is 25-50% more likely to be hired than a white man named “Todd Baker” with an identical resume.
This supports earlier findings that AI models value human lives unequally by nationality, proving willing, for example, to trade 20 American lives for one Nigerian life, or Malala Yousafzai for Donald Trump. Notably, these emergent values grow stronger as models become more powerful.
Some won’t care about bias against white men or Americans. They may even celebrate it. But if we are striving for true equality in an AI-powered America, LLM bias should concern us all.
And simple fixes don’t work. Prompting the models not to be biased, as is often suggested, does not significantly affect results in realistic hiring scenarios. Neither does removing names from resumes: LLMs have long been able to identify race and gender (along with many other demographic attributes) from writing style alone, better than humans can.
These findings are emblematic of a broader problem: AI companies are training and deploying models they don’t understand. It’s unlikely they are introducing these biases deliberately; rather, the field simply has not advanced enough for companies to understand their own models. When frontier companies reassure you that their models are unbiased, or that they won’t replace your job or end the world, you might want to think twice.
This all raises the question: What can we do?
First, demand transparency. As Dean Ball and Daniel Kokotajlo argued last year, we need to know how these models are trained. Anthropic has published the constitution it uses to align its models during training, as well as the system prompts it gives Claude. OpenAI has published its Model Spec, detailing ChatGPT’s “ideal” behavior. This is a good start.
Let’s be clear: We don’t need to force companies into disclosing proprietary details. The public doesn’t need to know the “how” behind a new AI capability or behavior, just that it was attained in the first place. Nor do we want the government dictating exactly how these models should be trained. Rather, we need greater transparency into how models are trained and what they can do, so that scrutiny of them can be democratized.
AI companies must also improve their evaluations. As the new study shows, current audits are mostly “compliance theater.” Karvonen and Marks found that models appearing unbiased in simplified tests exhibited severe discrimination once given real-world context, such as actual company names and hiring constraints.
This reflects a broader failure: whether testing for bias or bioweapons, AI companies rely on unrealistic scenarios that miss how their models actually behave in the wild. We need evaluations that mirror real deployment conditions, not simple benchmarks designed to pass investor or regulatory scrutiny.
Karvonen and Marks did find a mitigation that works: in the case of race and gender, simply preventing the model from internally “thinking about” those attributes effectively eliminates the bias. But this requires engineering the model’s internal representations, not just changing its prompts.
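To give a rough sense of what that kind of intervention involves, here is a minimal, hypothetical sketch (not the study’s actual method or code): estimate a “demographic concept” direction in a model’s activation space and project it out, so the signal is no longer available to downstream computation. The activations and direction below are random stand-ins for illustration only.

```python
# Hypothetical sketch: remove a "demographic concept" direction from a model's
# internal activations by orthogonal projection. The direction here is random;
# in practice it would be estimated from the model's activations on
# contrastive examples (e.g., resumes differing only in inferred demographics).
import numpy as np

def ablate_direction(activations: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Subtract each activation vector's component along `direction` (rows = vectors)."""
    d = direction / np.linalg.norm(direction)           # unit-length concept direction
    return activations - np.outer(activations @ d, d)   # orthogonal projection

# Toy usage: four 8-dimensional activation vectors and a stand-in direction.
rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 8))
concept_dir = rng.normal(size=8)

edited = ablate_direction(acts, concept_dir)
# After ablation, the activations carry no component along the concept direction.
print(np.allclose(edited @ (concept_dir / np.linalg.norm(concept_dir)), 0.0))  # True
```

Finding the right direction inside a real frontier model is the hard part, and it is work only those with access to the model’s internals are positioned to do.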
This points to where the government should focus its attention, and its limited technical capacity: on the frontier labs building these models, not the thousands of startups using them. A two-person startup in Silicon Valley using GPT-4o to “reform HR” can’t fix what OpenAI built wrong.
We don’t need new science or profound breakthroughs to mitigate bias. We need industry-wide willpower: AI companies must recognize bias as a threat and work to address it. Beyond that, we need substantial effort and resources devoted to understanding and aligning model behavior.
Otherwise, the rest of America will bear the consequences—from job-seekers to loan applicants and high school students applying to college. That is a scary world for us all.
Jason Hausenloy is an independent AI policy researcher. He blogs here.