Hackers Reportedly ‘Jailbroke’ Anthropic’s Chatbot, Stole Data From Mexico’s Government

Mexico has certainly hit a rough patch this year.

Earlier this week, the killing of a cartel kingpin led to widespread violence and disruptions by gangs throughout the country.

Now, hackers reportedly “jailbroke” Anthropic’s Claude chatbot and used it to help steal roughly 150 GB of sensitive data from multiple Mexican government entities, including tax and voter records.

Stealing 195 million taxpayer records shouldn’t be this easy, yet one hacker just proved that Anthropic’s Claude makes government data theft almost routine. Between December 2025 and January 2026, an unknown attacker exploited the popular AI chatbot to automate cyberattacks against multiple Mexican agencies, walking away with 150 GB of sensitive data including voter records, employee credentials, and civil registry files.

The breach reads like a cyberpunk fever dream, but the method was disturbingly simple. The hacker jailbroke Claude by framing malicious requests as a “bug bounty” security program, convincing the AI to act as an “elite hacker.” Once fooled, Claude produced thousands of detailed attack plans with ready-to-execute scripts, specifying exact targets and the credentials needed.

When Claude hit limits, the attacker switched to ChatGPT for lateral movement and evasion tactics, turning two consumer AI tools into a sophisticated hacking arsenal. This tag-team approach leveraged each platform’s strengths while bypassing their individual safeguards.

Claude is an AI assistant built on Anthropic’s Claude model family, a set of large language models trained to understand and generate human-like text, code, and structured outputs across a wide range of domains. It can draft and edit documents, summarize long materials, write and debug code, analyze data, and support decision-making by synthesizing information from many sources.

“Jailbroke” means the attacker deliberately manipulated the AI system so it would ignore or bypass its built‑in safety rules and provide help it was supposed to refuse. Common jailbreaking techniques include framing malicious requests as benign (for example, as part of a security exercise), crafting a series of prompts that gradually steer the model toward producing sensitive content, or exploiting loopholes in the model’s policy enforcement.

The hacker remains unidentified at present.

The attacks haven’t been attributed to a specific group, but Gambit Security did suggest they could be tied to a foreign government. It’s also unclear what the hacker wants to do with all of that data.

Mexico’s national digital agency hasn’t commented on the breach beyond noting that cybersecurity is a priority. The state government of Jalisco denies that it was breached, saying only federal networks were impacted, and Mexico’s national electoral institute likewise denied any breach or unauthorized access in recent months. It’s worth noting that Gambit found at least 20 security vulnerabilities during its research, which the country is likely not keen on highlighting.

Interestingly, beginning in January, Amazon’s security researchers reported that a “likely Russian‑speaking threat actor” (possibly a single person) used widely available generative AI tools to compromise more than 600 Fortinet FortiGate firewall devices across 55 countries.

A recent investigation illustrates this shift: Amazon Threat Intelligence observed a Russian-speaking, financially motivated threat actor leveraging multiple commercial generative AI services to compromise over 600 FortiGate devices across more than 55 countries from January 11 to February 18, 2026.

No exploitation of FortiGate vulnerabilities was observed. Instead, the campaign succeeded by exploiting exposed management ports and weak credentials with single-factor authentication, fundamental security gaps that AI helped an unsophisticated actor exploit at scale.

This activity is distinguished by the threat actor’s use of multiple commercial GenAI services to implement and scale well-known attack techniques throughout every phase of their operations, despite their limited technical capabilities. AWS infrastructure was not observed to be involved in this campaign. Amazon Threat Intelligence is sharing these findings to help the broader security community defend against this activity.
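The FortiGate campaign succeeded not through novel exploits but through management interfaces left reachable from the open internet. As a minimal defensive sketch (the IP address and port list below are placeholders, not details from Amazon’s report), an administrator can quickly check whether a device’s management ports answer connections from an untrusted network:

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address; substitute your own
    # firewall's WAN IP and the management ports your vendor actually uses.
    for port in (443, 8443, 22):
        state = "EXPOSED" if port_reachable("203.0.113.10", port) else "closed/filtered"
        print(f"management port {port}: {state}")
```

A port that reports as exposed here is exactly the kind of foothold the campaign relied on; the usual fixes are restricting management access to trusted hosts or a VPN and enforcing multi-factor authentication.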

Based on these developments, Mexico’s breach looks like a stark preview of how cheaply AI can turbo‑charge old‑fashioned bad security, and it would be a mistake to assume Mexico will be the last country caught flat‑footed this way.

It is chilling what even modestly skilled hackers can achieve with chatbots providing basic scripting. I fear this hacking technique will be replicated in other regions.

Image by perplexity.ai.

Tags: Amazon, Artificial Intelligence (AI), Crime, Mexico
