Why AI Security Assessments Miss the Vulnerabilities That Matter Most

July 1, 2026 · 9-minute read · Fairy

The short answer

AI security assessments systematically miss business logic vulnerabilities, multi-step authorization flaws, attack chains combining low-severity issues, and context-dependent risks. Automated tools excel at pattern-matching known CVEs and OWASP signatures but cannot understand application intent, trace authorization across endpoints, or evaluate how isolated findings combine into exploitable paths.

What AI Security Assessments Actually Miss

AI security tools miss business logic vulnerabilities, multi-step authorization flaws, attack chains combining low-severity issues, and context-dependent risks that require understanding application intent. Automated scanners excel at detecting known patterns—SQL injection signatures, XSS vectors, CVE matches—but they cannot reason about what your application is supposed to do, only what it does.

This isn't a flaw in any specific tool. It's a structural limitation of pattern-matching approaches to security assessment. The vulnerabilities that lead to breaches are increasingly the ones that don't match any signature in the database.

The Pattern-Matching Ceiling

Modern AI security tools are genuinely impressive at what they do well. They can scan millions of lines of code in minutes, identify known vulnerability patterns with high accuracy, and flag dependencies with published CVEs before you deploy. For OWASP Top 10 signatures and documented attack patterns, automated scanning is both faster and more consistent than manual review.

The problem is the ceiling. Pattern-matching requires a pattern to exist. When a vulnerability type has been documented, categorized, and added to training data, AI tools catch it reliably. When a vulnerability is novel, context-dependent, or requires understanding the application's business purpose, AI tools fail systematically—not occasionally, but as a category.

What the Training Data Contains

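AI security tools learn from what has already been documented: published CVEs, OWASP Top 10 signatures, and known attack patterns.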

This training produces tools that are excellent at finding yesterday's vulnerabilities. The critical gap is everything that doesn't look like a pattern the model has seen before.

Business Logic Vulnerabilities: The Intent Problem

Business logic vulnerabilities are the most significant category AI tools miss, and the reason is fundamental: these flaws require understanding what the code is supposed to do.

Consider a pricing function that calculates discounts. An AI scanner can verify the code executes without errors, doesn't contain injection vulnerabilities, and follows secure coding practices. What it cannot do is verify that applying a 50% discount twice shouldn't stack to 100% off, or that a coupon meant for first-time customers shouldn't work on the tenth order.
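The discount example can be sketched in a few lines. This is a minimal illustration, not code from any real system: the function names and the 50% cap are hypothetical, and the cap stands in for a business rule that exists only in someone's head or in a product spec.

```python
from decimal import Decimal

# Assumed business rule for illustration: total discount may never exceed 50%.
MAX_TOTAL_DISCOUNT = Decimal("0.50")


def apply_discounts_naive(price: Decimal, discounts: list[Decimal]) -> Decimal:
    """Clean, injection-free, and wrong: discounts stack without limit."""
    total = sum(discounts, Decimal("0"))
    return price * (Decimal("1") - total)


def apply_discounts_capped(price: Decimal, discounts: list[Decimal]) -> Decimal:
    """Same arithmetic, with the business rule made explicit."""
    total = min(sum(discounts, Decimal("0")), MAX_TOTAL_DISCOUNT)
    return price * (Decimal("1") - total)


def coupon_allowed(first_time_only: bool, prior_orders: int) -> bool:
    """A first-time-customer coupon is valid only when there are no prior orders."""
    return not first_time_only or prior_orders == 0
```

Both pricing functions would pass a signature-based scan. Only the stated rule distinguishes them, which is why the rule has to come from someone who knows the product.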

These aren't edge cases. In production codebases, business logic flaws appear consistently.

The Webhook Verification Problem

One pattern we see repeatedly in code reviews: payment webhooks that process events without signature verification. This is a critical vulnerability—anyone who can reach the endpoint can forge payment events, grant themselves subscriptions, or trigger fulfillment without paying.

# What automated scanners see: valid code, no injection, handles events
# (fulfill_order and webhook_secret are defined elsewhere in the application)
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook/stripe', methods=['POST'])
def handle_stripe_webhook():
    event = request.json  # parsed and trusted with no authenticity check
    if event['type'] == 'checkout.session.completed':
        fulfill_order(event['data']['object'])
    return '', 200

# What the code should do: verify the signature first
import stripe

@app.route('/webhook/stripe', methods=['POST'])
def handle_stripe_webhook():
    sig = request.headers.get('Stripe-Signature')
    try:
        # Verify against the raw request body, not the parsed JSON
        event = stripe.Webhook.construct_event(
            request.data, sig, webhook_secret
        )
    except (ValueError, stripe.error.SignatureVerificationError):
        # Malformed payload, or a missing or invalid signature
        return '', 400

    if event['type'] == 'checkout.session.completed':
        fulfill_order(event['data']['object'])
    return '', 200

An AI scanner looking for SQL injection or XSS will pass both versions. The first version is syntactically correct, follows Python best practices, and contains no known vulnerability signatures. But it's exploitable by anyone who can send an HTTP request.

This is the intent problem: the scanner doesn't know that Stripe webhooks require signature verification, or that payment events without verification are forgeable. It sees code that handles webhooks. A security engineer sees code that trusts arbitrary input.
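To make concrete what signature verification buys, here is a simplified sketch of HMAC-based webhook verification using only the Python standard library. It is an illustration of the general technique, not Stripe's implementation: the `sign` and `verify` helpers, the "timestamp.payload" message layout, and the 300-second tolerance are assumptions made for this example. In production, the provider's own library call should do this work.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(payload: bytes, secret: bytes, timestamp: int) -> str:
    """Compute HMAC-SHA256 over "<timestamp>.<payload>" (illustrative layout)."""
    message = str(timestamp).encode() + b"." + payload
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(payload: bytes, secret: bytes, timestamp: int, signature: str,
           tolerance_seconds: int = 300, now: Optional[float] = None) -> bool:
    """Return True only for a fresh timestamp and a matching signature."""
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance_seconds:
        return False  # stale or future-dated: reject to limit replay
    expected = sign(payload, secret, timestamp)
    # compare_digest avoids leaking how much of the signature matched via timing
    return hmac.compare_digest(expected, signature)
```

Without the shared secret, a forged request cannot produce a signature that passes `verify`, which is exactly the property the unverified handler lacks.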

Authorization Flaws Spanning Multiple Endpoints

Authorization vulnerabilities are another systematic blind spot for AI tools, particularly when the flaw spans multiple endpoints or requires understanding the relationship between components.

The Path-Matcher Gap

Most applications define which routes require authentication. The vulnerability isn't in the authentication code itself—it's in the paths that aren't covered.

// Auth middleware applied to most routes
app.use('/api/users', authMiddleware);
app.use('/api/orders', authMiddleware);
app.use('/api/admin', authMiddleware);

// But this route was added later and missed
app.get('/api/internal/user-export', async (req, res) => {
    // No authMiddleware: anyone who can reach the server gets every user record
    res.json(await db.users.findAll());
});

An AI scanner analyzing each route in isolation will find that /api/internal/user-export returns data without errors and doesn't contain injection vulnerabilities. What it misses: this endpoint should require authentication and doesn't.

Catching this requires understanding:

  1. The application's authentication architecture
  2. Which routes should be protected (based on what data they expose)
  3. That the middleware list is incomplete relative to the route list

This isn't a pattern—it's a gap. AI tools are trained to find things that are present and wrong, not things that are absent and should be present.
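Absence detection of this kind can be partly mechanized once someone has written down the intended policy. The sketch below assumes a hypothetical route inventory and a list of protected prefixes; `unprotected_routes` is an illustrative helper, not part of any framework. It matches on path-segment boundaries, so a prefix such as /api/users does not accidentally cover /api/usersearch.

```python
def unprotected_routes(routes: list[str], protected_prefixes: list[str]) -> list[str]:
    """Return the routes that no protected prefix covers."""

    def covered(route: str) -> bool:
        return any(
            route == prefix or route.startswith(prefix.rstrip("/") + "/")
            for prefix in protected_prefixes
        )

    return [route for route in routes if not covered(route)]


# Hypothetical inventory mirroring the example above
ROUTES = [
    "/api/users",
    "/api/users/42",
    "/api/orders",
    "/api/admin/settings",
    "/api/internal/user-export",
]
PROTECTED = ["/api/users", "/api/orders", "/api/admin"]

print(unprotected_routes(ROUTES, PROTECTED))
```

The output is a review list, not a verdict. Some uncovered routes are meant to be public, and deciding which ones still takes knowledge of what each route exposes.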

Cross-Endpoint Authorization

More subtle authorization flaws appear when permissions checked on one endpoint aren't enforced on related endpoints:

# Endpoint checks ownership correctly
@app.route('/documents/<doc_id>')
def view_document(doc_id):
    doc = Document.query.get_or_404(doc_id)  # 404 instead of crashing on a missing ID
    if doc.owner_id != current_user.id:
        return 'Forbidden', 403
    return doc.content

# But the export endpoint doesn't
@app.route('/documents/<doc_id>/export')
def export_document(doc_id):
    doc = Document.query.get_or_404(doc_id)
    # Missing: the same owner_id check as view_document
    return generate_pdf(doc.content)

The ownership check on /documents/<doc_id> doesn't extend to /documents/<doc_id>/export. Any authenticated user can export any document. Finding this requires understanding that these endpoints operate on the same resource and should have consistent authorization—context AI scanners don't have.
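One way to keep related endpoints consistent is to route every access to the resource through a single loader that performs the ownership check. The following is a framework-free sketch under stated assumptions: `Document`, `DOCUMENTS`, `load_owned_document`, and the exception classes are all hypothetical stand-ins for an application's real models and error handling.

```python
from dataclasses import dataclass


class NotFound(Exception):
    """Raised when the document ID does not exist."""


class Forbidden(Exception):
    """Raised when the caller does not own the document."""


@dataclass
class Document:
    id: int
    owner_id: int
    content: str


# In-memory stand-in for a database table
DOCUMENTS = {1: Document(id=1, owner_id=10, content="alice's notes")}


def load_owned_document(doc_id: int, user_id: int) -> Document:
    """Single choke point: every handler gets its document through here."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None:
        raise NotFound(doc_id)
    if doc.owner_id != user_id:
        raise Forbidden(doc_id)
    return doc


def view_document(doc_id: int, user_id: int) -> str:
    return load_owned_document(doc_id, user_id).content


def export_document(doc_id: int, user_id: int) -> str:
    # The export path inherits the ownership check instead of re-implementing it
    return "PDF:" + load_owned_document(doc_id, user_id).content
```

With this shape, a newly added endpoint that forgets the check has to bypass the loader on purpose, which is much easier to spot in review.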

Attack Chains: When Low Severity Becomes Critical

Individual findings rated "low" or "informational" by automated scanners can combine into critical exploitation paths. AI tools evaluate findings in isolation; attackers think in chains.

The Combination Problem

Consider three separate findings from an automated scan:

  1. Informational: Verbose error messages expose stack traces
  2. Low: User enumeration possible via password reset timing
  3. Low: Session tokens use predictable incrementing IDs

Individually, each finding might be deprioritized or accepted as low risk. Together, they form an attack:

  1. Use verbose errors to discover internal structure
  2. Enumerate valid usernames via the timing side-channel
  3. Predict session tokens to hijack authenticated sessions

AI scanners report these as three separate low-priority items. A security engineer sees them as steps in a single attack path.
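The chain above can be modeled as a small data problem: each finding grants an attacker a capability, and a chain is a set of capabilities that together enable something worse. This is a toy sketch; the `Finding` type, the capability labels, and the single `CHAINS` rule are hypothetical and encode an analyst's judgment rather than anything a scanner emits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str
    grants: str  # capability the attacker gains (hypothetical label)


# Analyst-authored rule: these capabilities together enable session hijacking
CHAINS = {
    "session hijacking": {"internal_structure", "valid_usernames", "token_prediction"},
}

FINDINGS = [
    Finding("Verbose error messages expose stack traces", "informational", "internal_structure"),
    Finding("User enumeration via password reset timing", "low", "valid_usernames"),
    Finding("Session tokens use predictable incrementing IDs", "low", "token_prediction"),
]


def chained_risks(findings: list[Finding]) -> list[str]:
    """Return the chains whose required capabilities are all present."""
    gained = {finding.grants for finding in findings}
    return [name for name, needed in CHAINS.items() if needed <= gained]
```

The lookup itself is trivial. The hard part is writing the rules in `CHAINS`, which is the attack-path reasoning a reviewer supplies.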

Idempotency and Replay

Another pattern we encounter regularly: webhook handlers that process events without checking if they've already been processed.

// No deduplication
app.post('/webhook', async (req, res) => {
    const event = req.body;
    if (event.type === 'order.fulfilled') {
        await fulfillOrder(event.data.orderId);
    }
    res.sendStatus(200);
});

An AI scanner sees: valid endpoint, handles events, returns 200. What it misses: if this webhook is called twice with the same event (which happens—Stripe retries on timeouts), the order fulfills twice. Depending on what fulfillment means—shipping physical goods, granting credits, creating accounts—this is exploitable.

The fix requires tracking processed event IDs:

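app.post('/webhook', async (req, res) => {
    const event = req.body;

    // Skip events that were already handled (providers retry deliveries)
    if (await isEventProcessed(event.id)) {
        return res.sendStatus(200);
    }

    if (event.type === 'order.fulfilled') {
        await fulfillOrder(event.data.orderId);
    }

    // Note: check-then-mark is not atomic; two concurrent deliveries can both
    // pass the check. A unique constraint on the event ID closes that window.
    await markEventProcessed(event.id);
    res.sendStatus(200);
});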

This isn't a vulnerability signature—it's the absence of a defense that should exist given the context. AI scanners don't flag missing idempotency because there's no pattern for "webhook handling without deduplication" in the CVE database.
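Deduplication is most robust when the "have we seen this event?" check and the "mark it seen" write are a single atomic operation. Here is a minimal sketch using SQLite from the Python standard library. The table name, `claim_event`, and `handle_event` are assumptions for illustration, and the `fulfilled` list stands in for a real fulfillment side effect.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory store with a uniqueness guarantee on event IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str) -> bool:
    """Atomically record the event; True only for the first caller."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
        return cursor.rowcount == 1  # 0 means the ID was already present


def handle_event(conn: sqlite3.Connection, event: dict, fulfilled: list) -> None:
    if not claim_event(conn, event["id"]):
        return  # duplicate delivery: acknowledge without side effects
    if event["type"] == "order.fulfilled":
        fulfilled.append(event["data"]["orderId"])
```

This sketch claims the event before fulfilling it, so a crash during fulfillment would leave the event marked but unprocessed. A real system would likely perform the claim and the fulfillment in one transaction, or record a status that allows a safe retry.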

Infrastructure-Level Risks Invisible in Code

Some security risks don't appear in source code at all. Database security configurations, for example, live in the infrastructure layer.

The RLS Gap

With platforms like Supabase, a common finding in security reviews is tables with Row-Level Security disabled. Under the platform's default grants, that leaves every row readable and writable by any client holding the project's public API key, not only by the row's owner.

The code might look secure:

const { data } = await supabase
    .from('documents')
    .select('*')
    .eq('user_id', userId);

But if RLS isn't enabled on the documents table, the .eq('user_id', userId) filter is a suggestion, not an enforcement. Any client can remove that filter and retrieve all documents.

This vulnerability exists in the database configuration, not the application code. An AI scanner analyzing the JavaScript will find nothing wrong—the code is correct. The security gap is in a separate layer entirely.
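Because the gap lives in the database, the check has to query the database. The sketch below pairs a catalog query against PostgreSQL's `pg_tables` view (which exposes a `rowsecurity` flag per table) with a small helper that filters the result. The `public` schema, the helper name, and the sample rows are assumptions for illustration; running the query requires a connection that this example deliberately leaves out.

```python
# Catalog query to run against the Postgres database behind the project.
RLS_AUDIT_SQL = """
SELECT tablename, rowsecurity
FROM pg_tables
WHERE schemaname = 'public'
"""


def tables_without_rls(rows: list[tuple[str, bool]]) -> list[str]:
    """Given (tablename, rowsecurity) rows, return tables with RLS disabled."""
    return sorted(name for name, enabled in rows if not enabled)


# Hypothetical query result for illustration
sample_rows = [("documents", False), ("profiles", True), ("audit_log", False)]
print(tables_without_rls(sample_rows))
```

Enabling RLS is only half the fix. A table with RLS enabled and no policies denies access through the API entirely, so each flagged table also needs policies that express who may read and write which rows.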

What Expert Review Adds

The patterns above share a common thread: they require understanding context that AI tools don't have access to. Expert security review complements automated scanning by providing:

Business Context

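Understanding what the application is supposed to do, not just what it does. This includes pricing and discount rules, coupon eligibility, and workflow constraints that cannot be inferred from code alone.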

Cross-Component Reasoning

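Evaluating how components interact, including whether middleware coverage matches the route list, whether related endpoints enforce the same ownership checks, and whether the database configuration actually enforces the filters that application code applies.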

Attack-Path Thinking

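Connecting individual findings into exploitation chains, such as verbose errors, user enumeration, and predictable session tokens combining into session hijacking.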

Absence Detection

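Finding what should exist but doesn't: signature verification on webhooks, authentication on routes added later, deduplication of retried events, and Row-Level Security on exposed tables.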

Structuring Security Assessment for Coverage

Automated scanning and expert review aren't competing approaches—they're complementary layers. The most effective security posture uses both:

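Automated scanning handles known vulnerability patterns: SQL injection signatures, XSS vectors, OWASP Top 10 signatures, and dependencies with published CVEs.

Expert review handles what scanners structurally miss: business logic flaws, authorization gaps, attack chains, and missing defenses.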

Running automated scans first gives experts a baseline to work from. They can review scanner output, validate findings, identify false positives, and focus their time on the categories scanners structurally miss.

The Reliability Layer

AI security tools are getting better. Each year brings improved detection of more vulnerability categories, larger training datasets, and more sophisticated analysis. But the structural limitations remain: pattern-matching cannot find vulnerabilities that don't match patterns, and understanding intent requires context that doesn't exist in code alone.

For teams deploying AI-assisted code to production, this creates a specific challenge: the code may have passed every automated check while containing vulnerabilities no scanner would catch. The solution isn't abandoning automated tools—they're valuable infrastructure. The solution is complementing them with review that covers what they structurally cannot.

Fairy provides this verification layer for AI-generated code, combining automated analysis with expert review to catch the vulnerabilities that matter most: business logic flaws, authorization gaps, and context-dependent risks that require understanding your specific application. For teams that want to see how this works on their own code, Fairy Scout offers free AI PR reviews that demonstrate what expert verification finds that automated scanning misses.

Frequently asked questions

Can AI replace penetration testing?

AI cannot fully replace penetration testing. Automated tools handle known vulnerability signatures efficiently, but penetration testing requires understanding business context, chaining findings into attack paths, and testing authorization logic across workflows—tasks that require human judgment.

What types of vulnerabilities do automated security scanners miss?

Automated scanners commonly miss business logic flaws (like payment bypass), authorization gaps spanning multiple endpoints, webhook signature verification issues, and attack chains where multiple low-severity findings combine into critical exploits.

Why do AI security tools miss business logic vulnerabilities?

Business logic vulnerabilities require understanding what the application is supposed to do, not just what it does. AI tools pattern-match against known vulnerability signatures but cannot infer intent, pricing rules, or workflow constraints from code alone.

How accurate are AI-powered security assessments?

AI security assessments are highly accurate for known vulnerability patterns like SQL injection and XSS. However, accuracy drops significantly for context-dependent issues like missing webhook verification, authorization flaws, and business logic bypasses that don't match established signatures.

Should I rely solely on automated security scanning?

No. Automated scanning should be your baseline, not your ceiling. The vulnerabilities that cause breaches are often the ones scanners cannot detect—missing signature verification, authorization gaps, and logic flaws that require understanding your specific application.

