AppSec · 2026-04-08 · 7 min read

The Vulnerability Class That Arrived With AI Coding Assistants

LLM-generated code passes syntax checks, passes type checks, and fails security checks at higher rates than hand-written code. Here's why and what to do about it.

The AI Code Quality Problem Is a Security Problem

Multiple published sources (Stanford HAIS, NYU Tandon, Cybersecurity and Infrastructure Security Agency advisories) have found that code generated by LLMs contains security vulnerabilities at meaningful rates. NYU Tandon's "Asleep at the Keyboard?" study found that roughly 40% of GitHub Copilot completions in security-sensitive scenarios contained vulnerabilities.

This isn't a reason to avoid AI coding assistants. It's a reason to have write-time security scanning that runs on every line of code, regardless of whether a human or an LLM wrote it.

The Patterns AI Code Gets Wrong

Hardcoded Secrets (CWE-798)

LLMs generate example code with hardcoded API keys, passwords, and tokens. When that example code is accepted and committed, the secret is in the repository. AI coding assistants are now a meaningful source of CWE-798 findings in enterprise codebases.

# LLM-generated example, commonly accepted without modification
from openai import OpenAI

API_KEY = "sk-prod-1234567890abcdef"  # hardcoded secret (CWE-798)
client = OpenAI(api_key=API_KEY)
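For contrast, here is a minimal sketch of the usual alternative: read the secret from the environment at startup and fail loudly if it is missing. The helper name `load_api_key` and the variable name `OPENAI_API_KEY` are assumptions chosen for illustration, not something the completion above would produce on its own.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment (hypothetical helper).

    Raises RuntimeError instead of falling back to a default, so a
    missing secret is caught at startup rather than committed to the repo.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return key
```

The key then never appears in source control, and rotating it does not require a code change.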

Insecure Deserialization (CWE-502)

LLMs trained on older codebases reproduce pickle-based serialization, yaml.load() calls without a safe Loader, and other insecure deserialization patterns. These were common in older Python code and are well represented in training data.
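A rough sketch of two safer options, using only the standard library: parse untrusted input as JSON, or, where pickle cannot be avoided, restrict the unpickler so it refuses to import any class or function (the "restricting globals" approach described in the Python pickle documentation). The function and class names here are hypothetical. For YAML, the equivalent move is PyYAML's `yaml.safe_load()`.

```python
import io
import json
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that refuses to resolve any global name."""

    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a class or function.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def load_untrusted_json(raw: str) -> dict:
    """Parse untrusted input as JSON, which only yields plain data."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


def load_restricted_pickle(raw: bytes):
    """Load a pickle that contains only built-in containers and primitives."""
    return NoGlobalsUnpickler(io.BytesIO(raw)).load()
```

The restricted unpickler still accepts dicts, lists, strings, and numbers, but any payload that relies on `__reduce__` to call a function is rejected before that function runs.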

SQL Construction from String Formatting (CWE-89)

When asked to write database queries, LLMs often produce string-formatted SQL unless explicitly instructed to use parameterized queries. The formatting looks clean in the completion, and developers accept it without recognizing the injection risk.

# Common LLM completion: user input is interpolated into the SQL text, so it is injectable
query = f"SELECT * FROM users WHERE username = '{username}'"
cursor.execute(query)
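The parameterized version of that query might look like the sketch below. It assumes the standard-library `sqlite3` driver and a hypothetical `users` table with `id` and `username` columns. Other drivers use a different placeholder style (for example `%s`), but the idea is the same.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list[tuple]:
    """Look up a user with a parameterized query (hypothetical helper)."""
    cursor = conn.cursor()
    # The driver binds `username` as data; it is never parsed as SQL.
    cursor.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
    print(find_user(conn, "alice"))        # [(1, 'alice')]
    print(find_user(conn, "' OR '1'='1"))  # []
```

The classic injection string is treated as a literal username that matches no row, so it cannot rewrite the WHERE clause.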

Missing Input Validation at API Boundaries (CWE-20)

LLMs generate complete API handlers that handle the happy path but omit input validation. The generated code is functional and passes tests against valid input, but doesn't validate type, length, format, or encoding.
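As an illustration of what the missing validation could look like, here is a framework-free sketch that checks type, length, format, and range before a payload reaches the handler logic. The request shape (`username`, `age`) and every name in the block are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed policy: 3 to 32 characters, ASCII letters, digits, underscore.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,32}")


@dataclass(frozen=True)
class CreateUserRequest:
    username: str
    age: int


def parse_create_user(payload: object) -> CreateUserRequest:
    """Validate an already-decoded JSON payload (hypothetical handler input)."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    username = payload.get("username")
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        raise ValueError("username must be 3-32 characters of [A-Za-z0-9_]")

    age = payload.get("age")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(age, bool) or not isinstance(age, int) or not 0 <= age <= 150:
        raise ValueError("age must be an integer between 0 and 150")

    return CreateUserRequest(username=username, age=age)
```

A handler that calls `parse_create_user` first either gets a well-formed request object or raises before any business logic runs.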

The Trust Model Problem

AI coding assistants are trusted in a way that copy-pasted Stack Overflow code is not. When a developer pastes code from Stack Overflow, they often read it and modify it. When an LLM completes code inline, the natural tendency is to accept and move on.

This means AI-generated vulnerabilities have a higher probability of reaching production than manually introduced vulnerabilities, because they look authoritative and developers trust the completion.

The Compound Risk: AI + Speed

AI coding assistants increase developer velocity: more code ships per developer per day. Security review processes don't scale at the same rate. The net effect is a growing gap between code being written and code being reviewed, which is exactly the gap that write-time scanning closes.

How Deva Addresses AI-Introduced Vulnerabilities

Deva scans every line written, regardless of whether a human or an AI wrote it. The scan runs at write time, as the code is typed or generated. There's no lag between "AI generates insecure code" and "Deva flags it"; the detection is immediate.

The OWASP Top 10 2025 draft's new A10 category (LLM-Assisted Code Vulnerabilities) is covered by Deva's existing CWE detection rules, since the vulnerabilities introduced by LLMs are the same CWEs as always, just with a new source.

FAQ

Is AI-generated code more vulnerable than human-written code?

Published sources (Stanford HAIS 2023, NYU Tandon, CISA advisories) have found higher vulnerability rates in LLM-generated code in security-sensitive contexts. NYU Tandon's "Asleep at the Keyboard?" study found that roughly 40% of GitHub Copilot completions in security-sensitive scenarios contained vulnerabilities. The gap is closing but is still meaningful in 2026.

What vulnerabilities do LLMs commonly introduce?

Hardcoded secrets (CWE-798) in example code, insecure deserialization (CWE-502) from pickle and yaml.load patterns common in training data, string-formatted SQL queries (CWE-89) when developers do not ask for parameterized queries, and missing input validation at API boundaries (CWE-20).

Should I stop using GitHub Copilot or Cursor?

No. The productivity gains from AI coding assistants are real. The change is that write-time security scanning has to run on every line of code regardless of who or what wrote it. The combination is fast development and fast detection, not slow development.

How do you scan AI-generated code for security issues?

Use a scanner that runs at write time, not just on commit or in CI. The same CWE detection rules that catch human-introduced vulnerabilities catch LLM-introduced ones, because the vulnerabilities are the same CWEs. The new requirement is the scan cadence: as code is generated, not after.
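To make "the same rules, run per line" concrete, here is a deliberately tiny sketch of a line-level check. It is not how Deva or any production scanner works (real tools use parsing and data-flow analysis, not three regular expressions). The rule set and the names `RULES` and `scan_line` are assumptions for illustration only.

```python
import re

# Toy rules keyed by CWE. Each pattern is a rough heuristic, not a real detector.
RULES = [
    ("CWE-798", re.compile(
        r"\b(api_key|password|secret|token)\s*=\s*[\"'][^\"']+[\"']",
        re.IGNORECASE)),
    ("CWE-502", re.compile(
        r"\bpickle\.loads?\(|\byaml\.load\((?![^)]*Loader)")),
    ("CWE-89", re.compile(
        r"\bf[\"'](select|insert|update|delete)\b.*\{",
        re.IGNORECASE)),
]


def scan_line(line: str) -> list[str]:
    """Return the CWE ids whose toy pattern matches this line of code."""
    return [cwe for cwe, pattern in RULES if pattern.search(line)]


if __name__ == "__main__":
    for line in [
        'API_KEY = "sk-prod-1234567890abcdef"',
        "query = f\"SELECT * FROM users WHERE username = '{username}'\"",
        "data = yaml.load(stream)",
    ]:
        print(scan_line(line), line)
```

The check does not care whether a person or a model produced the line. What changes with AI assistants is how often it needs to run.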

Summer Ann

Threat research, application security analysis, and defensive engineering insights from the DevSecCode team.
