Breach Analysis

How McKinsey's AI Platform Was Breached in 2 Hours — And What It Means for Your Organisation

On February 28, 2026, an AI agent achieved full read-write access to McKinsey's Lilli platform, exposing 46.5 million messages, 728,000 files, and 57,000 user accounts. Technical analysis and lessons for enterprise AI security.

RedTeam Partners

CREST-Certified Security Team · 2026-03-13

On February 28, 2026, an autonomous AI agent breached McKinsey's internal AI platform in 2 hours — without any credentials. The attack exposed 46.5 million chat messages, 728,000 confidential files, and 57,000 user accounts. It is the most significant AI platform security incident to date, and it reveals critical lessons for every organisation deploying AI systems.

What Happened: The McKinsey Lilli Breach

McKinsey's Lilli is an internal AI-powered knowledge platform used by consultants to access proprietary research, client data, and institutional knowledge. The platform had been running in production for over two years when security researchers at CodeWall, an autonomous offensive security platform, turned their AI agent loose on it.

The agent discovered that 22 of McKinsey's 200+ documented API endpoints required no authentication. Among those unauthenticated endpoints, it found a SQL injection vulnerability in the user search functionality — specifically, JSON field names were concatenated directly into SQL queries while values were safely parameterised.
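The reported flaw — field names concatenated, values parameterised — can be illustrated with a minimal sketch. The schema, function names, and allow-list below are invented for illustration; they are not Lilli's actual code.

```python
import sqlite3

ALLOWED_FIELDS = {"name", "email", "department"}  # hypothetical schema

def search_users_vulnerable(db, filters):
    # VULNERABLE: JSON field *names* are concatenated into the SQL text.
    # Values are parameterised, but a crafted key rewrites the WHERE clause.
    clauses = " AND ".join(f"{field} = ?" for field in filters)
    return db.execute(f"SELECT id, name FROM users WHERE {clauses}",
                      list(filters.values())).fetchall()

def search_users_safe(db, filters):
    # FIX: field names are validated against an allow-list before they touch
    # the query text; only values reach the driver as bound parameters.
    for field in filters:
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown search field: {field!r}")
    clauses = " AND ".join(f"{field} = ?" for field in filters)
    return db.execute(f"SELECT id, name FROM users WHERE {clauses}",
                      list(filters.values())).fetchall()
```

A key such as `"id = id OR name"` turns the vulnerable version's WHERE clause into a tautology and returns every row, while the safe version rejects it outright.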

"Hackers will be using the same technology to attack indiscriminately, with specific objectives like financial blackmail for data loss or ransomware."
Paul Price, CEO, CodeWall

The Full Scope of Exposure

Within 2 hours, the AI agent achieved full read-write database access. The scale of exposure was staggering:

Data Category | Records Exposed
Chat messages (plaintext) | 46.5 million
Files (PDFs, spreadsheets, presentations, documents) | 728,000
User accounts | 57,000
AI assistants | 384,000
Workspaces | 94,000
RAG document chunks (proprietary research) | 3.68 million
Files processed through external APIs | 1.1 million
OpenAI vector stores exposed | 266,000+
System prompts (AI behaviour controls) | 95

The 728,000 files included 192,000 PDFs, 93,000 spreadsheets, 93,000 PowerPoint presentations, and 58,000 Word documents — containing confidential client deliverables, proprietary research, and internal strategy documents.

The Critical Finding: Write Access to System Prompts

Perhaps the most alarming discovery was that the agent achieved write access to the database containing Lilli's 95 system prompts — the foundational instructions that dictate how the AI behaves. According to CodeWall's researchers, an attacker could have silently modified these prompts using a single SQL UPDATE statement wrapped in an HTTP call.

This means an attacker could have:

  • Reprogrammed the AI to exfiltrate data from every user query
  • Injected bias or misinformation into AI responses across the organisation
  • Used the AI as a persistence mechanism for ongoing espionage
  • Manipulated AI-generated advice used for client engagements worth millions

Why Traditional Security Missed It

McKinsey's Lilli had been in production for over two years. Their own internal security scanners failed to detect the vulnerabilities. According to CodeWall, the AI agent found the issues precisely because "it doesn't follow checklists" — it explores attack surfaces the way a real adversary would, probing paths that automated scanners don't test.

This is exactly the gap that AI red teaming addresses. Traditional penetration testing operates from predefined methodologies. AI red teaming uses the same autonomous, adversarial approach that real attackers now employ.

The Vulnerability Was Not Exotic

SQL injection has been in the OWASP Top 10 since 2003. It's one of the oldest vulnerability classes in cybersecurity. Yet it persisted in a platform built by one of the world's most prestigious consulting firms, handling some of the most sensitive business data imaginable.

The lesson: AI systems inherit all the vulnerabilities of traditional software — and add entirely new ones on top.

The Responsible Disclosure Timeline

Date | Event
February 28, 2026 | SQL injection discovered; full database access achieved
March 1, 2026 | Responsible disclosure submitted to McKinsey security team
March 2, 2026 | McKinsey patches all unauthenticated endpoints within 24 hours
March 9, 2026 | Public disclosure via CodeWall blog

Credit to McKinsey for their rapid response — patching within 24 hours of disclosure. But the question remains: how many other enterprise AI platforms have the same vulnerabilities sitting undetected?

Implications for Enterprise AI Security

1. AI API Security Is Non-Negotiable

Twenty-two unauthenticated endpoints in a production AI platform is not a minor oversight — it's a systemic failure in API governance. Every AI endpoint must enforce authentication, authorisation, and rate limiting. Our AI Security Configuration Review specifically tests for these gaps in Step 2 of our methodology.
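Deny-by-default authentication can be enforced uniformly rather than per endpoint. A minimal sketch, assuming a bearer-token scheme; the token store, `AuthError` class, and handler names are invented for illustration:

```python
import functools
import hmac

# Hypothetical token store; in production this would be a secrets manager
# or an identity provider, not an in-process dict.
API_TOKENS = {"analyst-team": "s3cret-token"}

class AuthError(Exception):
    """Raised when a request carries no valid credential."""

def require_auth(handler):
    # Deny by default: the handler is unreachable without a valid bearer token.
    @functools.wraps(handler)
    def wrapper(request, *args, **kwargs):
        header = request.get("headers", {}).get("Authorization", "")
        if not header.startswith("Bearer "):
            raise AuthError("missing bearer token")
        token = header[len("Bearer "):]
        # Constant-time comparison avoids leaking token bytes via timing.
        if not any(hmac.compare_digest(token, t) for t in API_TOKENS.values()):
            raise AuthError("invalid token")
        return handler(request, *args, **kwargs)
    return wrapper

@require_auth
def search_users(request):
    return {"results": []}
```

The point of the decorator pattern is that an endpoint with no auth check becomes a visible omission in code review, not a silent default.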

2. System Prompt Security Is a New Attack Category

System prompts are the "source code" of AI behaviour. If an attacker can read or modify them, they control your AI. Prompts must be stored securely, separately from user-accessible databases, with strict access controls and integrity monitoring.
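One form such integrity monitoring can take is a digest baseline captured at deploy time and checked continuously. A sketch, with invented function names and prompt content:

```python
import hashlib

def fingerprint_prompts(prompts):
    # Capture a SHA-256 digest of every system prompt at deploy time.
    # The baseline should live outside the database holding the prompts,
    # so an attacker with write access to one cannot silently fix up both.
    return {name: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for name, text in prompts.items()}

def tampered_prompts(prompts, baseline):
    # Any prompt whose current digest differs from the deploy-time digest
    # was modified outside the release process and should raise an alert.
    return sorted(
        name for name, text in prompts.items()
        if hashlib.sha256(text.encode("utf-8")).hexdigest() != baseline.get(name)
    )
```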

3. RAG Pipelines Need Security Boundaries

The 3.68 million RAG document chunks exposed in the breach represent McKinsey's proprietary knowledge base — their competitive advantage. RAG implementations must enforce document-level access controls and prevent cross-tenant data leakage.
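Document-level access control in a RAG pipeline amounts to filtering retrieved chunks against the caller's identity before they reach the model. A minimal sketch, with an invented chunk structure and role model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    doc_id: str
    tenant: str               # which client/engagement owns the source document
    allowed_roles: frozenset  # roles permitted to read the source document
    text: str

def authorised_chunks(retrieved, user_tenant, user_roles):
    # Enforce tenant isolation and document-level roles *after* vector
    # search but *before* any chunk reaches the model's context window.
    return [c for c in retrieved
            if c.tenant == user_tenant and c.allowed_roles & user_roles]
```

Filtering after retrieval is the minimum; stronger designs also scope the vector search itself to the caller's tenant so unauthorised embeddings are never scored.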

4. Third-Party AI Services Expand Your Attack Surface

The 266,000+ OpenAI vector stores and 1.1 million externally processed files show how AI systems create dependencies on third-party services. Each integration is a potential data exfiltration path that must be secured and monitored.
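Monitoring those paths can start with an egress allow-list that logs every outbound transfer, permitted or not. A sketch; the allow-list contents and function name are illustrative:

```python
from urllib.parse import urlparse

# Hypothetical egress policy: the only third-party AI service this
# deployment is permitted to send documents to.
ALLOWED_HOSTS = {"api.openai.com"}

def check_egress(url, audit_log):
    # Every outbound transfer is logged whether allowed or blocked, so
    # exfiltration attempts leave a trail even when they fail.
    host = urlparse(url).hostname or ""
    allowed = host in ALLOWED_HOSTS
    audit_log.append({"host": host, "allowed": allowed})
    return allowed
```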

5. Autonomous AI Agents Change the Threat Model

The breach was discovered by an autonomous AI agent — the same technology that real attackers will use. As CodeWall's CEO noted, adversaries will deploy AI agents "to attack indiscriminately." Organisations need AI-powered defence to match AI-powered offence.

What Your Organisation Should Do Now

  1. Inventory all AI endpoints — Map every API, webhook, and integration point in your AI infrastructure
  2. Audit authentication on every AI API — No unauthenticated endpoints, no exceptions
  3. Secure system prompts — Store separately with integrity monitoring and strict access controls
  4. Conduct AI-specific security testing — Traditional penetration tests miss AI-specific vulnerabilities
  5. Prepare for EU AI Act compliance — Adversarial testing is mandatory for high-risk AI systems by August 2, 2026
  6. Test RAG pipeline security — Validate document-level access controls and cross-tenant isolation
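For teams whose APIs are described in OpenAPI, steps 1 and 2 above can be partly automated: walk the spec and flag every operation with no effective security requirement. Per the OpenAPI specification, an operation-level `security` key overrides the global one, and an empty list means no authentication. A sketch:

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}

def unauthenticated_operations(spec):
    # Flag every operation whose effective security requirement is empty:
    # either it opts out with `security: []`, or there is no global default.
    global_security = spec.get("security", [])
    flagged = []
    for path, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method.lower() not in HTTP_METHODS:
                continue
            if not op.get("security", global_security):
                flagged.append(f"{method.upper()} {path}")
    return sorted(flagged)
```

This only catches what the spec documents — endpoints missing from the spec entirely still need discovery via testing.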

References

  • CodeWall (2026). How We Hacked McKinsey's AI Platform. codewall.ai
  • The Register (March 9, 2026). AI agent hacked McKinsey chatbot for read-write access. theregister.com
  • Inc. Magazine (2026). An AI Agent Broke Into McKinsey's Internal Chatbot and Accessed Millions of Records in Just 2 Hours. inc.com
  • NeuralTrust (2026). How an AI Agent Hacked McKinsey and Exposed 46 Million Messages. neuraltrust.ai
  • OWASP (2025). OWASP Top 10 for LLM Applications. owasp.org

Is Your AI Infrastructure Secure?

Book a free 30-minute AI security analysis with our CREST-certified team. We'll show you what an attacker could exploit in your AI systems.

Book Free Analysis