How McKinsey's AI Platform Was Breached in 2 Hours — And What It Means for Your Organisation
On February 28, 2026, an AI agent achieved full read-write access to McKinsey's Lilli platform, exposing 46.5 million messages, 728,000 files, and 57,000 user accounts. Technical analysis and lessons for enterprise AI security.
RedTeam Partners
CREST-Certified Security Team · 2026-03-13
On February 28, 2026, an autonomous AI agent breached McKinsey's internal AI platform in 2 hours — without any credentials. The attack exposed 46.5 million chat messages, 728,000 confidential files, and 57,000 user accounts. It is the most significant AI platform security incident to date, and it reveals critical lessons for every organisation deploying AI systems.
What Happened: The McKinsey Lilli Breach
McKinsey's Lilli is an internal AI-powered knowledge platform used by consultants to access proprietary research, client data, and institutional knowledge. The platform had been running in production for over two years when security researchers at CodeWall, an autonomous offensive security platform, turned their AI agent loose on it.
The agent discovered that 22 of McKinsey's 200+ documented API endpoints required no authentication. Among those unauthenticated endpoints, it found a SQL injection vulnerability in the user search functionality — specifically, JSON field names were concatenated directly into SQL queries while values were safely parameterised.
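This pattern is easy to miss because scanners typically fuzz values, not field names. The sketch below is a hypothetical illustration of the vulnerability class described in CodeWall's report, not Lilli's actual code: the field name from the request JSON is interpolated into the SQL string, while the value is bound safely.

```python
import sqlite3

def search_users_vulnerable(conn, filters):
    """filters: dict parsed from the request JSON, e.g. {"name": "alice"}."""
    clauses, params = [], []
    for field, value in filters.items():
        clauses.append(f"{field} = ?")   # UNSAFE: the field NAME is concatenated
        params.append(value)             # the VALUE is safely parameterised
    sql = "SELECT id, name FROM users WHERE " + " AND ".join(clauses)
    return conn.execute(sql, params).fetchall()
```

A request whose JSON key is `"1=1 OR name"` turns the WHERE clause into `WHERE 1=1 OR name = ?`, returning every row — the parameterised value never matters. The fix is to validate field names against an explicit allowlist of column names before building the query.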
"Hackers will be using the same technology to attack indiscriminately, with specific objectives like financial blackmail for data loss or ransomware."
— Paul Price, CEO, CodeWall
The Full Scope of Exposure
Within 2 hours, the AI agent achieved full read-write database access. The scale of exposure was staggering:
| Data Category | Records Exposed |
|---|---|
| Chat messages (plaintext) | 46.5 million |
| Files (PDFs, spreadsheets, presentations, documents) | 728,000 |
| User accounts | 57,000 |
| AI assistants | 384,000 |
| Workspaces | 94,000 |
| RAG document chunks (proprietary research) | 3.68 million |
| Files processed through external APIs | 1.1 million |
| OpenAI vector stores exposed | 266,000+ |
| System prompts (AI behaviour controls) | 95 |
The 728,000 files included 192,000 PDFs, 93,000 spreadsheets, 93,000 PowerPoint presentations, and 58,000 Word documents — containing confidential client deliverables, proprietary research, and internal strategy documents.
The Critical Finding: Write Access to System Prompts
Perhaps the most alarming discovery was that the agent achieved write access to the database containing Lilli's 95 system prompts — the foundational instructions that dictate how the AI behaves. According to CodeWall's researchers, an attacker could have silently modified these prompts using a single SQL UPDATE statement wrapped in an HTTP call.
This means an attacker could have:
- Reprogrammed the AI to exfiltrate data from every user query
- Injected bias or misinformation into AI responses across the organisation
- Used the AI as a persistence mechanism for ongoing espionage
- Manipulated AI-generated advice used for client engagements worth millions
Why Traditional Security Missed It
Lilli had been in production for over two years, and McKinsey's own internal security scanners failed to detect the vulnerabilities. According to CodeWall, the AI agent found the issues precisely because "it doesn't follow checklists" — it explores attack surfaces the way a real adversary would, probing paths that automated scanners don't test.
This is exactly the gap that AI red teaming addresses. Traditional penetration testing operates from predefined methodologies. AI red teaming uses the same autonomous, adversarial approach that real attackers now employ.
The Vulnerability Was Not Exotic
SQL injection has been in the OWASP Top 10 since 2003. It's one of the oldest vulnerability classes in cybersecurity. Yet it persisted in a platform built by one of the world's most prestigious consulting firms, handling some of the most sensitive business data imaginable.
The lesson: AI systems inherit all the vulnerabilities of traditional software — and add entirely new ones on top.
The Responsible Disclosure Timeline
| Date | Event |
|---|---|
| February 28, 2026 | SQL injection discovered; full database access achieved |
| March 1, 2026 | Responsible disclosure submitted to McKinsey security team |
| March 2, 2026 | All unauthenticated endpoints patched within 24 hours of disclosure |
| March 9, 2026 | Public disclosure via CodeWall blog |
Credit to McKinsey for their rapid response — patching within 24 hours of disclosure. But the question remains: how many other enterprise AI platforms have the same vulnerabilities sitting undetected?
Implications for Enterprise AI Security
1. AI API Security Is Non-Negotiable
Twenty-two unauthenticated endpoints in a production AI platform is not a minor oversight — it's a systemic failure in API governance. Every AI endpoint must enforce authentication, authorisation, and rate limiting. Our AI Security Configuration Review specifically tests for these gaps in Step 2 of our methodology.
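One way to make "no unauthenticated endpoints" structurally enforceable is deny-by-default route registration: a handler cannot be exposed without either passing the auth check or explicitly declaring itself public. This is a minimal framework-agnostic sketch with illustrative names (`endpoint`, `ROUTES`), not any vendor's API.

```python
import functools

ROUTES = {}

def endpoint(path, *, public=False):
    """Register a handler; authentication is required unless the route
    explicitly opts out — an unauthenticated endpoint must be deliberate."""
    def wrap(handler):
        @functools.wraps(handler)
        def guarded(request):
            if not public and not request.get("auth_token"):
                return {"status": 401, "body": "authentication required"}
            return handler(request)
        ROUTES[path] = guarded
        return guarded
    return wrap

@endpoint("/api/users/search")
def search_users(request):
    # Reached only after the auth check above has passed.
    return {"status": 200, "body": "results"}
```

With this shape, an audit of the 22-endpoint failure mode reduces to grepping for `public=True` and justifying each occurrence.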
2. System Prompt Security Is a New Attack Category
System prompts are the "source code" of AI behaviour. If an attacker can read or modify them, they control your AI. Prompts must be stored securely, separately from user-accessible databases, with strict access controls and integrity monitoring.
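Integrity monitoring for prompts can be as simple as pinning a cryptographic hash of each prompt at deploy time and alerting when the stored copy drifts. A sketch, assuming prompts are loaded from a store an attacker might gain write access to (the ids and texts below are made up):

```python
import hashlib

def fingerprint(prompt: str) -> str:
    """Deploy-time pin: a SHA-256 digest of the exact prompt text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

def tampered_prompts(loaded: dict, pinned: dict) -> list:
    """Return ids whose current hash no longer matches the deploy-time pin —
    a silent UPDATE to the prompt table shows up here immediately."""
    return [pid for pid, text in sorted(loaded.items())
            if fingerprint(text) != pinned.get(pid)]
```

The pins must live outside the database the prompts are served from (e.g. in the deployment artefact), otherwise an attacker with write access can update both.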
3. RAG Pipelines Need Security Boundaries
The 3.68 million RAG document chunks exposed in the breach represent McKinsey's proprietary knowledge base — their competitive advantage. RAG implementations must enforce document-level access controls and prevent cross-tenant data leakage.
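The key design point is that the access-control filter must run before ranking, so unauthorised chunks never enter the candidate set at all. A toy sketch with a naive term-overlap scorer standing in for vector similarity; the field names (`text`, `allowed_groups`) are assumptions, not Lilli's schema.

```python
def retrieve(chunks, user_groups, query_terms, k=3):
    """Rank only the chunks whose ACL intersects the caller's groups."""
    visible = [c for c in chunks
               if user_groups & set(c["allowed_groups"])]   # ACL filter FIRST
    visible.sort(key=lambda c: -sum(t in c["text"] for t in query_terms))
    return visible[:k]
```

Filtering after ranking is a common mistake: scores, snippet previews, or even result counts from forbidden documents can leak across tenants.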
4. Third-Party AI Services Expand Your Attack Surface
The 266,000+ OpenAI vector stores and 1.1 million externally processed files show how AI systems create dependencies on third-party services. Each integration is a potential data exfiltration path that must be secured and monitored.
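A first line of defence is an explicit egress allowlist: outbound calls from the AI pipeline are permitted only to reviewed third-party hosts, making any new exfiltration path a visible policy change. A minimal sketch (the allowlist contents are illustrative):

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.openai.com"}   # explicit, reviewed allowlist

def egress_permitted(url: str) -> bool:
    """Allow an outbound call only if the destination host is allowlisted;
    every denial is a candidate exfiltration attempt worth logging."""
    return urlparse(url).hostname in ALLOWED_HOSTS
```

In production this check belongs at the network layer (proxy or firewall) as well as in code, so a compromised application cannot simply bypass it.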
5. Autonomous AI Agents Change the Threat Model
The breach was discovered by an autonomous AI agent — the same technology that real attackers will use. As CodeWall's CEO noted, adversaries will deploy AI agents "to attack indiscriminately." Organisations need AI-powered defence to match AI-powered offence.
What Your Organisation Should Do Now
- Inventory all AI endpoints — Map every API, webhook, and integration point in your AI infrastructure
- Audit authentication on every AI API — No unauthenticated endpoints, no exceptions
- Secure system prompts — Store separately with integrity monitoring and strict access controls
- Conduct AI-specific security testing — Traditional penetration tests miss AI-specific vulnerabilities
- Prepare for EU AI Act compliance — Adversarial testing is mandatory for high-risk AI systems by August 2, 2026
- Test RAG pipeline security — Validate document-level access controls and cross-tenant isolation
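The first two checklist items can be partially automated: walk the endpoint inventory and flag every route that answers an anonymous request. The probe function is injected here so the policy is testable; a real implementation would issue unauthenticated HTTP requests (only with the system owner's permission) and return the status code.

```python
def find_unauthenticated(paths, fetch):
    """Return every path that answers an anonymous request with anything
    other than 401/403 — each one violates the 'no exceptions' rule.
    `fetch(path)` -> HTTP status code for a request with no credentials."""
    return [p for p in paths if fetch(p) not in (401, 403)]
```

Run against a complete inventory, an empty result is the only acceptable outcome — 22 flagged paths out of 200 is how the Lilli breach started.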
References
- CodeWall (2026). How We Hacked McKinsey's AI Platform. codewall.ai
- The Register (March 9, 2026). AI agent hacked McKinsey chatbot for read-write access. theregister.com
- Inc. Magazine (2026). An AI Agent Broke Into McKinsey's Internal Chatbot and Accessed Millions of Records in Just 2 Hours. inc.com
- NeuralTrust (2026). How an AI Agent Hacked McKinsey and Exposed 46 Million Messages. neuraltrust.ai
- OWASP (2025). OWASP Top 10 for LLM Applications. owasp.org
Is Your AI Infrastructure Secure?
Book a free 30-minute AI security analysis with our CREST-certified team. We'll show you what an attacker could exploit in your AI systems.
Book Free Analysis