Security Researchers Sound the Alarm on AI Code Vulnerabilities

What the data shows about vulnerability rates in AI-generated code and what the security community recommends

The AI code vulnerability problem has moved from speculation to hard numbers. With 92% of US developers now using AI coding tools daily, security researchers have published converging findings on vulnerability rates that demand attention. Veracode's 2025 analysis found that 45% of AI-generated code introduces known OWASP vulnerabilities, and real-world breaches have confirmed those findings in production.

This article collects the key data, the real incidents, and the specific recommendations from the security research community.

What the Research Shows

The data comes from multiple independent sources, and they all point in the same direction.

Veracode's 2025 study analyzed AI-generated code across thousands of repositories and found that 45% introduced at least one OWASP vulnerability category. These are not obscure edge cases. They fall within the OWASP Top 10, the most common classes of web application security flaws, including SQL injection, cross-site scripting, and broken access control. These are the same categories that human developers have been trained to avoid for two decades.
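To make one of those categories concrete, here is a minimal sketch of SQL injection using Python's built-in sqlite3 module. It is an illustration, not code from any of the studies: the users table, the sample rows, and the function names are all hypothetical. The unsafe version splices user input into the SQL text, while the safe version passes it as a bound parameter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = f"SELECT name, email FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the value is bound separately from the SQL text.
    return conn.execute(
        "SELECT name, email FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print(find_user_unsafe(conn, payload))  # returns every row in the table
    print(find_user_safe(conn, payload))    # returns an empty list
```

Both functions behave identically for ordinary names, which is why a functional test alone would not tell them apart.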

Stanford researchers found that developers who used AI coding assistants produced significantly less secure code than those who wrote it by hand. The critical finding was that the AI-assisted developers were also more confident in the security of their code, despite it being measurably worse. The tools created a false sense of safety.

METR's research on AI coding productivity added another dimension. Their study found that AI tools did not meaningfully speed up experienced developers on real-world tasks, challenging the assumption that the productivity gains justify the security tradeoffs. If the speed benefit is smaller than expected and the vulnerability rate is higher than expected, the risk calculus shifts dramatically.

Key Takeaway

The findings from Veracode, Stanford, and METR tell a consistent story. AI-generated code frequently contains known vulnerabilities, AI-assisted code is measurably less secure than hand-written code, and the developers using these tools tend to overestimate the security of what they produce. METR's results add that the productivity gains meant to justify the risk may be smaller than assumed. The combination of more vulnerabilities and more confidence is what makes the problem dangerous.

CodeRabbit's analysis added a specific multiplier: AI-generated code carries a 2.74x higher vulnerability rate compared to human-written code. That number puts a concrete ratio on the risk: for a comparable amount of code, the AI-generated version is nearly three times as likely to contain a security flaw.

Real Breaches Confirm the Research

The statistics become tangible when you look at what happened in production.

CVE-2025-48757 exposed a systemic flaw in Lovable, a popular AI coding platform. Lovable had been generating Supabase database schemas without Row Level Security (RLS) policies. This meant that any authenticated user could read, modify, or delete data belonging to any other user. Over 170 production applications were affected. The vulnerability was not in one application. It was in the template that every application inherited, a structural failure embedded in the AI's default code generation.
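The idea behind row-level isolation can be sketched in a few lines. The example below is not Supabase RLS and not Lovable's generated code. It is an application-level analogue using Python's sqlite3 module, with a hypothetical notes table and a hypothetical NoteStore class, where every query is scoped to the current user's id. Real RLS is declared as policies inside Postgres itself, which is stronger because the database enforces it even when application code forgets.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory notes table with one row per hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes ("
        "id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO notes (owner_id, body) VALUES (?, ?)",
        [(1, "alice's note"), (2, "bob's note")],
    )
    return conn


class NoteStore:
    """Hypothetical data-access layer bound to a single user."""

    def __init__(self, conn: sqlite3.Connection, user_id: int) -> None:
        self._conn = conn
        self._user_id = user_id

    def list_notes(self) -> list:
        # Every read is filtered by owner, so other users' rows never appear.
        rows = self._conn.execute(
            "SELECT body FROM notes WHERE owner_id = ? ORDER BY id",
            (self._user_id,),
        )
        return [body for (body,) in rows]

    def delete_note(self, note_id: int) -> bool:
        # The owner check is part of the statement, not a separate step.
        cur = self._conn.execute(
            "DELETE FROM notes WHERE id = ? AND owner_id = ?",
            (note_id, self._user_id),
        )
        return cur.rowcount == 1
```

With this shape, a store created for user 1 cannot list or delete user 2's note, because no method offers an unscoped query.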

The Moltbook breach leaked 1.5 million authentication tokens and 35,000 email addresses. The root cause was API endpoints that returned sensitive data without authorization checks. The AI had built functional endpoints that served the right data to the right requests, but it never added the logic to verify whether the requester had permission to see that data. Authorization was simply absent.
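The gap between "functional" and "authorized" is small in code, which is part of why it is easy to miss. The sketch below is a hypothetical handler pair, not Moltbook's actual code: PROFILES, Response, and both function names are invented for illustration. The first handler returns the right data for the right id. The second adds the two checks the paragraph above describes as absent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Response:
    status: int
    body: dict


# Hypothetical in-memory data: profile id -> record.
PROFILES = {
    1: {"owner_id": 1, "email": "alice@example.com"},
    2: {"owner_id": 2, "email": "bob@example.com"},
}


def get_profile_unchecked(requester_id: Optional[int], profile_id: int) -> Response:
    """Functional but unsafe: serves the record to anyone who asks."""
    profile = PROFILES.get(profile_id)
    if profile is None:
        return Response(404, {})
    return Response(200, {"email": profile["email"]})


def get_profile(requester_id: Optional[int], profile_id: int) -> Response:
    """Same feature, with authentication and ownership checks added."""
    if requester_id is None:
        return Response(401, {})  # not logged in
    profile = PROFILES.get(profile_id)
    if profile is None or profile["owner_id"] != requester_id:
        # Same answer for "missing" and "not yours" so ids cannot be probed.
        return Response(404, {})
    return Response(200, {"email": profile["email"]})
```

Calling get_profile_unchecked(2, 1) returns user 1's email to user 2 with a 200 status, while get_profile(2, 1) returns 404.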

The Tea App incident exposed 72,000 user images and 1.1 million private messages. Again, the pattern was missing access controls. The application worked perfectly from a feature perspective. Users could upload images, send messages, and interact with the platform. But the boundaries between users did not exist at the data layer.

[Diagram: three incident cards (Lovable CVE-2025-48757, Moltbook, Tea App) above a shared bar reading "Common pattern: AI builds features without security boundaries."]
Three major breaches share the same root cause: AI-generated code that delivers functionality without implementing security controls.

These three incidents share a pattern that security researchers have identified as the defining characteristic of AI code vulnerabilities. The AI optimizes for functionality. It builds what you ask for. But security controls, the things that prevent unauthorized access, are constraints that exist outside the feature request. The AI does not add them because nobody asked for them, and the AI does not understand why they matter.

The Industry Responds With Alarm

The security community's reaction has gone beyond publishing papers. Some projects have taken direct action.

Daniel Stenberg, the creator and lead maintainer of cURL, one of the most widely used software libraries in the world, stopped accepting AI-generated bug bounty reports. The volume of low-quality, AI-fabricated vulnerability reports was consuming more time than real security research. Stenberg described reporters using AI to generate plausible-sounding but ultimately bogus vulnerability reports, flooding the system and making it harder to find actual bugs.

Mitchell Hashimoto, the creator of Ghostty, went further. He banned AI-generated code contributions entirely from the project. His reasoning was direct: the maintenance burden of reviewing AI-generated pull requests, which often introduced subtle issues, exceeded the value they provided.

These are not fringe voices. cURL is installed on billions of devices. These decisions reflect a growing segment of the security community that views AI-generated code as a source of systemic risk requiring active mitigation.

What Security Researchers Recommend

The security research community has coalesced around a set of practical recommendations. These are not theoretical. They come from researchers who analyzed the breaches, studied the vulnerability patterns, and understand the structural reasons AI code is insecure.

Treat every AI-generated line as untrusted input. This is the foundational principle. Security researchers recommend approaching AI output the way you would approach code from an unknown contributor: assume it contains flaws until proven otherwise. Read it. Understand it. Do not accept it because it looks correct.

Run automated security scanning on every AI-generated change. Static analysis tools like Snyk, Semgrep, and SonarQube can catch many of the vulnerability patterns that AI tools consistently reproduce. Integrate them into your CI/CD pipeline so that no AI-generated code reaches production without automated review.
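To give a feel for what such scanners do under the hood, here is a toy static check built on Python's standard ast module. It is not how Snyk, Semgrep, or SonarQube are implemented, and it is no substitute for them. The function name and the single rule are assumptions made for illustration: it flags any .execute() call whose first argument is an f-string or a binary expression such as string concatenation or %-formatting.

```python
import ast


def find_dynamic_sql(source: str) -> list:
    """Return line numbers of .execute() calls with dynamically built SQL."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only method-style calls such as conn.execute(...) are considered.
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "execute" or not node.args:
            continue
        first = node.args[0]
        # JoinedStr is an f-string; BinOp covers "..." + x and "..." % x.
        if isinstance(first, (ast.JoinedStr, ast.BinOp)):
            findings.append(node.lineno)
    return sorted(findings)


# Hypothetical snippet standing in for an AI-generated change.
SAMPLE = '''
def load(conn, name):
    conn.execute(f"SELECT * FROM users WHERE name = '{name}'")
    conn.execute("SELECT * FROM users WHERE name = ?", (name,))
    conn.execute("SELECT * FROM users WHERE name = '" + name + "'")
'''

if __name__ == "__main__":
    print(find_dynamic_sql(SAMPLE))  # [3, 5]
```

A CI step built on a real scanner works the same way in outline: parse the change, match it against known-bad patterns, and fail the build when anything is found.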

Apply the principle of least privilege everywhere. The Lovable, Moltbook, and Tea App breaches all involved overly permissive access. Security researchers recommend defaulting to the minimum permissions required and explicitly granting additional access only when needed. If your AI tool generates a database schema, add RLS. If it generates an API endpoint, add authorization. If it generates a file upload handler, add access controls.
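Least privilege usually comes down to one design choice: deny by default and grant explicitly. The sketch below shows that shape with hypothetical role names and a hypothetical GRANTS table. Anything not listed, including an unknown role, is refused.

```python
from enum import Enum


class Action(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"


# Hypothetical grants. Every permission must be listed here explicitly.
GRANTS = {
    "viewer": frozenset({Action.READ}),
    "editor": frozenset({Action.READ, Action.WRITE}),
}


def is_allowed(role: str, action: Action) -> bool:
    """Deny by default: unknown roles and unlisted actions return False."""
    return action in GRANTS.get(role, frozenset())
```

The opposite design, a blocklist of forbidden actions, fails open whenever a new role or action is added. This design fails closed.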

[Diagram: checklist titled "Security Researcher Recommendations" with four items: treat AI code as untrusted, run security scanning, apply least privilege, test beyond functionality.]
The security research community's four core recommendations for working with AI-generated code.

Test for what should not happen, not just what should. Functional tests verify that your application does what it is supposed to do. Security tests verify that it does not do what it is not supposed to do. Write tests that attempt unauthorized access. Try to read another user's data. Try to modify records without authentication. If any of those attempts succeeds, you have found the same class of vulnerability behind the breaches described above.
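Negative tests of this kind can be written with nothing more than the standard unittest module. In the sketch below, MessageService and PermissionDenied are hypothetical stand-ins for your real application. The point is the shape of the tests: one confirms the feature works, and two assert that unauthorized attempts are rejected.

```python
import unittest


class PermissionDenied(Exception):
    """Raised when a caller is not allowed to touch a message."""


class MessageService:
    """Hypothetical service under test; stands in for a real application."""

    def __init__(self) -> None:
        self._messages = {1: {"owner": "alice", "text": "hi"}}

    def _check_owner(self, user, message_id):
        msg = self._messages[message_id]
        if user is None or msg["owner"] != user:
            raise PermissionDenied
        return msg

    def read(self, user, message_id):
        return self._check_owner(user, message_id)["text"]

    def update(self, user, message_id, text):
        self._check_owner(user, message_id)["text"] = text


class SecurityTests(unittest.TestCase):
    def setUp(self) -> None:
        self.svc = MessageService()

    def test_owner_can_read(self):
        # Functional test: the feature works for the right user.
        self.assertEqual(self.svc.read("alice", 1), "hi")

    def test_other_user_cannot_read(self):
        # Security test: another user's read attempt must be rejected.
        with self.assertRaises(PermissionDenied):
            self.svc.read("bob", 1)

    def test_anonymous_cannot_modify(self):
        # Security test: an unauthenticated write must be rejected...
        with self.assertRaises(PermissionDenied):
            self.svc.update(None, 1, "tampered")
        # ...and must leave the stored data unchanged.
        self.assertEqual(self.svc.read("alice", 1), "hi")
```

Saved as a test module, this runs with python -m unittest. If the ownership check were removed from the service, the first test would still succeed and the other two would fail, which is exactly the signal functional tests alone cannot give.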

Common Mistake

The mistake is running only functional tests on AI-generated code and assuming that passing tests means the code is secure. Functional tests confirm that features work. They do not confirm that unauthorized users are blocked, that input is sanitized, or that sensitive data is protected. The breaches described above all involved applications that worked as intended from a feature perspective.

The Trust Paradox in Numbers

The adoption and trust numbers tell a story that security researchers find particularly concerning.

92% of US developers use AI coding tools daily. But developer trust in AI code accuracy has fallen from 77% to just 33% over two years. Developers are using tools they increasingly do not trust, driven by competitive pressure and productivity expectations rather than confidence in the output.

This trust collapse is not irrational. It is a direct response to experience. Developers who have used AI tools for two years have accumulated enough personal encounters with bugs, hallucinated APIs, and subtle security flaws to calibrate their trust downward. The 33% figure reflects learned skepticism, not pessimism.

If 45% of AI-generated code has vulnerabilities and developers are using these tools for the majority of their work, the aggregate vulnerability surface across the industry is expanding rapidly. The speed of code generation has outpaced the speed of security review.

Where AI Security Is Heading

The trajectory is toward more structure, more tooling, and eventually more regulation.

Vendor responses have been reactive but accelerating. After CVE-2025-48757, Lovable updated their code generation to include RLS by default. Supabase shipped one-click RLS audit tools. Other platforms have begun integrating static analysis into their generation pipelines, catching obvious vulnerabilities before presenting code to users.

But the fundamental tension remains. Security adds friction, and AI coding tools compete on speed. A tool that adds authentication prompts and security warnings after every generation request will lose users to a tool that just builds the thing. Until the cost of breaches exceeds the cost of friction, the market incentives favor speed over safety.

The regulatory landscape is shifting that calculus. Existing data protection frameworks like GDPR already hold deployers responsible for breaches regardless of how the code was written. "The AI generated it" is not a legal defense. The EU AI Act, state-level privacy laws in the US, and increasing FTC scrutiny are converging on a principle that security researchers have been stating for years: if you deploy it, you own the consequences.

The most likely near-term outcome is a formalization of what experienced developers already practice informally. Generate with AI, review with human judgment, scan with automated tools, test for security, then deploy. That workflow adds time. It also prevents the class of breaches that have defined the AI coding era so far.

What This Means For You

The security research community is not telling you to stop using AI coding tools. They are telling you to stop trusting them blindly.

  • If you are a senior developer, the 45% vulnerability rate and the 2.74x multiplier are your baseline assumptions. Build automated security scanning into your pipeline. Review AI-generated code with the same rigor you would apply to a contribution from someone you have never worked with before. The cURL and Ghostty responses show that even the open-source community is drawing boundaries around AI-generated contributions.
  • If you are a founder shipping an AI-built product, budget for security review before launch. The Lovable, Moltbook, and Tea App incidents demonstrate that a single missing security control can expose your entire user base. A pre-launch security audit is not optional. It is the minimum.

The data from security researchers is clear, consistent, and actionable. The AI code vulnerability problem is real, measurable, and preventable for anyone willing to add the review layer that the tools themselves skip.

Pranay Joshi

20+ years building products at scale. VP of Product & Engineering, startup founder, and AI coach. Helping dreamers turn ideas into reality with vibe coding.
