Research · Industry · April 2026 · 8 min read

100 vibe apps, one scanner: what a random sweep of the AI-built web actually finds

100 apps scanned
12 platforms sampled
23 critical + high findings
5 live production leaks

90 minutes, 12 AI-coding / vibe-coding deploy platforms, random sample from Certificate Transparency logs.

TL;DR

The velocity trap. AI coding tools let you ship a full-stack app in an afternoon. Platform defaults still assume a team of engineers shows up on day two to lock things down. Most of the time, they don't.
The damage. Across 100 random production apps: 5 critical live leaks (Stripe live keys in client bundles, user-data tables queryable by anyone with a browser). 57 of 100 had no Content-Security-Policy at all. Platform choice, not app code, decided most of each app's security baseline.
The fix. Continuous security validation. Same reflex as npm test before push — one more check that says "is my production URL leaking anything right now?" before the pull request turns green.

AI coding tools are the most productive thing to happen to web development in a decade. Shipping a functional product in an afternoon used to be a demo trick; now it's a Tuesday. We love that. It's also exactly why we built Sekrd — speed without a safety net is how stupid, fatal mistakes reach real users.

To see the current state of the art, we spent a day running Sekrd against 100 real apps deployed across a dozen AI-coding hosting platforms. Not our handpicked test corpus. Not synthetic fixtures. Fresh hostnames pulled from Certificate Transparency logs via the Wayback Machine CDX API: a random sample of public deploys from 12 distinct providers, plus 10 reference URLs from a prior sweep as a sanity check.

Every scan was a deep scan: all 15 scanner providers firing, no integration credentials, no authenticated session. It's the view any visitor would get walking up to the site cold. Submissions were rate-limited to 2 per minute per key. Total run: about 90 minutes.

On naming: we anonymized platforms and sites. The goal is to show a structural problem, not to hand anyone a target list.

The ugly honest numbers

Of 112 hostnames submitted: 100 scanned cleanly, 12 failed at fetch (dead deploys, CDN blocks, SSRF-denylist hits). Among the 100 clean scans: 23 critical or high findings, five of which were real live-production issues on sites we don't own.

The 12 fetch failures carry a signal of their own. On some providers, most of our sampled URLs were already 404 by scan time — those platforms host high volumes of short-lived preview deploys that expire in days. On others, apps from the same CT-log vintage were still live months later. That's a hosting-model observation, not a security one, but it biases the sample so we flag it.

Platform fingerprints are very real

The most striking pattern isn't about any individual app — it's how tightly scores cluster within a platform.

On one deploy provider, all 8 sampled apps received exactly the same score.
On another, 9 of 10 sat at the same score. The tenth landed 12 points higher only because its owner had added a custom CSP on top.
On two more providers, every sampled app produced the exact same finding set: same missing headers, same server-version disclosure, same cookie flags.

This isn't the scanner being lazy. Apps on the same provider inherit that provider's default response headers — Strict-Transport-Security, Permissions-Policy, X-Frame-Options, whether the CDN leaks its fingerprint in Server:. The app-level code differs wildly across the sample — todo apps, chatbots, landing pages, e-commerce, AI tools. The platform-level posture is nearly uniform.

The spread between providers was about 30 points. The spread within any single provider was typically ≤4 points. That second number is the practical one: once you've picked a provider, nearly all of your security baseline is already decided.

Per-provider averages are deliberately not published — a leaderboard would serve attackers, not builders. Platform operators curious where their defaults land can reach us at research@sekrd.com.

The five real leaks

Most findings across the 100 scans were the expected medium-severity shape: missing X-Frame-Options, no privacy-policy link, unsafe-inline in CSP. Five sites had real critical issues a scanner must flag. All five were live when we hit them; all five have been notified.

1. Stripe live secret key in a client bundle

[critical] stripe-live-secret-in-client
Target: live production deploy (name withheld)
Evidence: matched sk_live_ pattern in JS bundle. Score: 0/100.

An sk_live_ key in client JavaScript is a full-compromise situation. Any visitor who opens DevTools can copy it and make Stripe API calls as the merchant: issue refunds to themselves, create charges, read the entire payment history. This isn't a technical curiosity. It's the end of the business: customers whose payment history is now exposed, chargebacks the founder can't absorb, a Stripe account that gets locked the moment fraud shows up. The deploy appeared to be a sample or demo accidentally shipped public. Fix: rotate the key, remove it from the bundle, redeploy.
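A check of this kind can be sketched in a few lines of Python. This is a minimal illustration, not Sekrd's actual detector: the regular expression and the name find_live_secrets are assumptions, and the 16-character minimum is a guess at a useful threshold rather than a documented Stripe key length.

```python
import re

# Assumption: a live secret key looks like "sk_live_" followed by a run of
# alphanumerics. The minimum length (16) is a guess chosen to cut noise.
LIVE_SECRET_RE = re.compile(r"\bsk_live_[0-9A-Za-z]{16,}")


def find_live_secrets(bundle_text: str) -> list[str]:
    """Return redacted matches so the report never repeats the full key."""
    hits = []
    for match in LIVE_SECRET_RE.finditer(bundle_text):
        key = match.group(0)
        # Keep only the prefix plus four characters as evidence.
        hits.append(key[:12] + "...")
    return hits


if __name__ == "__main__":
    # Fake bundle built by concatenation; neither value is a real key.
    bundle = 'const pub="pk_live_' + "A" * 24 + '";const sec="sk_live_' + "B" * 24 + '";'
    print(find_live_secrets(bundle))
```

Publishable keys (pk_live_) are meant to ship to the browser, so the sketch deliberately ignores them.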

2. Stripe webhook accepting unsigned requests

[critical] payment-webhook-no-sig
Target: small-business service (name withheld)
Evidence: POST /api/webhooks/stripe → 200 on forged payload without Stripe-Signature header.

An attacker sends a forged payment_intent.succeeded event. The server accepts it and marks the order paid, and the customer gets the service with zero dollars transferred. Stripe's webhook documentation tells integrators to verify the signature on every event. The fix is about two lines of code using the Stripe SDK. Vibe-generated scaffolds frequently skip it because the happy-path template doesn't emphasize it and the developer can't feel a missing check during normal testing.
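To make the missing check concrete, here is a standard-library sketch of what signature verification does: it parses the Stripe-Signature header (a timestamp t and one or more v1 signatures), recomputes an HMAC-SHA256 over "timestamp.payload" with the endpoint secret, and rejects stale or mismatched requests. The function name, the 300-second tolerance, and the now parameter are assumptions made for illustration. In a real handler the Stripe SDK's stripe.Webhook.construct_event does this work and is the better choice.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(
    payload: bytes,
    sig_header: Optional[str],
    secret: str,
    tolerance: int = 300,          # assumed replay window, in seconds
    now: Optional[float] = None,   # injectable clock, for testing only
) -> bool:
    """Return True only if the header carries a valid, fresh v1 signature."""
    if not sig_header:
        return False  # unsigned request: reject outright

    timestamp = None
    candidates = []
    for part in sig_header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False  # too old or too far in the future: possible replay

    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 signature in the header.
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in candidates
    )
```

A webhook route would call this on the raw request body, before any JSON parsing, and return a 4xx status when it yields False.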

3–5. Supabase RLS bypass on three live SaaS projects

[critical] supabase-rls-bypass (×3)
Three separate SaaS projects. User-data tables (profiles, sessions, documents) readable via the Supabase anon key alone.
Evidence: GET /rest/v1/<table>?select=*&limit=1 returned row-level data including user IDs, session IDs, timestamps.

All three projects ship the Supabase anon key in the browser, which is the intended architecture. The problem is the tables: RLS is either disabled, or the policy is USING (true). Any visitor can paginate the entire table: every customer row, every session ID, every document ID. In three different production apps run by three different founders, real users had their data handed to any browser that asked.

This is the most common and the most dangerous class of issue in the sample, and it's specific to the vibe-coding workflow. "Connect Supabase, create table, ship app" outpaces "now add a row-level policy saying who can read what." Sekrd detects it by trying the probe the attacker would: a cold GET with just the anon key. If rows come back, that's a bypass, full stop.
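The cold probe described above can be sketched as follows. This is an illustration under stated assumptions, not Sekrd's implementation: build_rls_probe and classify_probe are hypothetical names, and the project URL, table name, and key are whatever you supply for a project you own. The request shape (the /rest/v1/ path, plus the anon key in both the apikey and Authorization headers) follows the usual Supabase REST convention.

```python
import json
import urllib.parse
import urllib.request


def build_rls_probe(project_url: str, table: str, anon_key: str) -> urllib.request.Request:
    """Build the unauthenticated GET a stranger could send with only the anon key."""
    base = project_url.rstrip("/")
    url = f"{base}/rest/v1/{urllib.parse.quote(table)}?select=*&limit=1"
    headers = {"apikey": anon_key, "Authorization": f"Bearer {anon_key}"}
    return urllib.request.Request(url, headers=headers)


def classify_probe(status: int, body: str) -> str:
    """Interpret the response to the probe."""
    if status != 200:
        return "blocked"        # request refused before any rows came back
    try:
        rows = json.loads(body)
    except ValueError:
        return "inconclusive"   # not JSON, so probably not the REST endpoint
    if isinstance(rows, list) and rows:
        return "exposed"        # row-level data returned to an anonymous caller
    # An empty list is ambiguous: RLS may be filtering, or the table is empty.
    return "no-rows"
```

To run it for real, pass the request to urllib.request.urlopen with a timeout and feed the status and body to classify_probe; note that urlopen raises HTTPError on 4xx responses, which maps to "blocked". Only probe projects you own or are authorized to test.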

Why this matters past the technical

Every critical finding in this sweep is a user-trust event in waiting. Nobody renews a subscription after screenshots of an inbox full of someone else's data hit Hacker News. Nobody recommends a dev tool whose own user table is world-readable. The companies that get written up for data leaks rarely recover their growth curve, even when the technical fix was a two-line patch.

The founders shipping these apps aren't careless. They're just moving fast — which is exactly what AI coding tools are supposed to let them do. What's missing is the equivalent of the npm test reflex for production safety: a cheap, automatic check that says "before I ship this, does it expose anything it shouldn't?" That reflex used to require a dedicated security engineer. It doesn't anymore.

What the baseline looks like on defaults alone

The most-hit finding across 100 scans was the absence of a Content-Security-Policy header: 57 of 100 apps had no CSP at all. Second: absence of Permissions-Policy (57). Third: server-version disclosure (53) — the Server: response header leaking CDN or framework fingerprints.

Defaults win. Whatever the platform sends, your app ships. Stronger security requires the developer to go out of their way — write the CSP header, configure Permissions-Policy, strip Server. None of that is free; all of it matters.
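The three most common findings are easy to check from a set of response headers. A minimal sketch, assuming a plain dict of headers as input. The function name and finding labels are made up for illustration, and "the Server header contains a digit" is a deliberately crude stand-in for real version-disclosure detection.

```python
def missing_baseline_headers(headers: dict[str, str]) -> list[str]:
    """Flag the three most common findings from the sweep, given response headers."""
    # Header names are case-insensitive, so normalize before looking anything up.
    lowered = {name.lower(): value for name, value in headers.items()}

    findings = []
    for required in ("content-security-policy", "permissions-policy"):
        if required not in lowered:
            findings.append(f"missing-{required}")

    # Crude heuristic (an assumption): a digit in Server suggests a version string.
    server = lowered.get("server", "")
    if any(ch.isdigit() for ch in server):
        findings.append("server-version-disclosure")
    return findings


if __name__ == "__main__":
    sample = {
        "Server": "nginx/1.25.3",
        "Content-Security-Policy": "default-src 'self'",
    }
    print(missing_baseline_headers(sample))
```

Pointing this at the headers of your own production URL shows what your platform sends on your behalf before you have configured anything.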

What to do about it

Your platform sets your floor. Whatever posture your hosting provider sends by default is your starting point. Custom work moves the needle upward. Pick with this in mind.
Webhook signatures are non-negotiable. Any POST /api/webhooks/<provider> MUST verify the signature header that provider sends. Active-probe scanners will find you.
Row-level security is a practice, not a setting. Every table needs an explicit policy saying who reads / writes what. If a raw HTTP request with the anon key returns rows, that's a critical and will be exploited.
CSP is free. Four header lines move your score ~10 points on most hosting platforms. Ship them.
Make scanning automatic. Running a scanner once after launch is a snapshot. Wiring one into CI means every PR gets checked before merge, so issues like the ones in this post get caught by your build, not by a stranger on HN.

The shift-left playbook (with or without us)

The real point of this post isn't "find a scanner." It's "put a scanner in your pipeline so you never have to think about this again." Pick any tool that honestly maps findings to CWE + OWASP and gives you actionable evidence. Wire it into CI. Done.
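For a tool-agnostic picture of "wire it into CI", here is a sketch of a build gate that reads a SARIF report and returns a non-zero exit code when any result meets a chosen severity. It assumes your scanner can emit SARIF 2.1.0 and uses the standard result levels (none, note, warning, error). The gate function and its fail_on parameter are hypothetical, and how any particular scanner maps its own severities onto SARIF levels is not something this sketch knows.

```python
import json
import sys

# SARIF result levels, ordered from least to most severe.
LEVEL_RANK = {"none": 0, "note": 1, "warning": 2, "error": 3}


def gate(sarif_text: str, fail_on: str = "error") -> int:
    """Return 1 if any result is at or above fail_on, else 0."""
    report = json.loads(sarif_text)
    threshold = LEVEL_RANK[fail_on]
    for run in report.get("runs", []):
        for result in run.get("results", []):
            # Treat a missing level as "warning", the usual SARIF default.
            level = result.get("level", "warning")
            if LEVEL_RANK.get(level, LEVEL_RANK["warning"]) >= threshold:
                return 1
    return 0


if __name__ == "__main__":
    # Usage: python gate.py report.sarif [fail_on]
    with open(sys.argv[1], encoding="utf-8") as handle:
        text = handle.read()
    sys.exit(gate(text, sys.argv[2] if len(sys.argv) > 2 else "error"))
```

Run as the last step of a pipeline, the non-zero exit code is what turns a finding into a failed build.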

If you want the Sekrd version, the integrations ship today:

GitHub Action — one uses: line in .github/workflows/ci.yml. Submits the scan, polls, posts findings as SARIF into GitHub Code Scanning, fails the build on configured severity.
GitLab CI template — drop .gitlab-ci.yml in the repo root, set SEKRD_API_KEY as a masked variable, done.
Live badge — /badge/by-domain/<your-site>.svg embedded in your README flips red the moment verdict goes BLOCK.

Continuous, not once. That's the whole idea.

Methodology notes

Sample source: Certificate Transparency logs via the Wayback Machine CDX API, *.<tld> queries for each of 12 deploy-platform DNS patterns. Random sampling per platform, filtered for junk (wildcards, HTML-entity placeholders, parked subdomains). Unauthenticated, safe perimeter scanning against public internet endpoints — consistent with modern threat-intelligence practice (the same methodology Shodan, Censys, and commercial ASM tools use).
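The filter-then-sample step can be sketched as below. The names, the hostname regular expression, and the example suffix are assumptions for illustration, not the pipeline used for this sweep. Spotting parked subdomains needs a fetch, so the sketch leaves that out.

```python
import random
import re
from typing import Iterable, Optional

# Plain DNS hostname: dot-separated labels of letters, digits and hyphens.
# Wildcards ("*.") and HTML-entity debris ("&lt;name&gt;") fail this pattern.
HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def clean_hostnames(raw: Iterable[str], suffix: str) -> list[str]:
    """Normalize, de-duplicate, and keep only valid hostnames under the suffix."""
    kept = set()
    for host in raw:
        host = host.strip().lower().rstrip(".")
        if not host.endswith("." + suffix):
            continue
        if not HOSTNAME_RE.match(host):
            continue
        kept.add(host)
    return sorted(kept)


def sample_hostnames(
    raw: Iterable[str], suffix: str, k: int, seed: Optional[int] = None
) -> list[str]:
    """Draw up to k hostnames uniformly at random from the cleaned pool."""
    pool = clean_hostnames(raw, suffix)
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(pool, min(k, len(pool)))
```

Sorting the pool before sampling means the same seed yields the same sample regardless of the order in which log entries arrived.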

Scanner configuration: deep-scan profile, 15 providers, no integration credentials, no page crawl, no authenticated header set. Each scan is the view a stranger hitting the site cold would get. Active probes respect rate limits (2 submits/minute per key, Nuclei throttled at 5 req/sec per target).
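A submit limit like 2 per minute can be respected on the client side with a small minimum-interval throttle. This is a sketch with hypothetical names, not the scanner's own limiter. The clock and sleep functions are injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(
        self,
        per_minute: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next slot is free; return the seconds slept."""
        now = self.clock()
        delay = 0.0
        if self.next_allowed is not None and now < self.next_allowed:
            delay = self.next_allowed - now
            self.sleep(delay)
            now = self.next_allowed
        # Reserve the following slot one interval after this call's slot.
        self.next_allowed = now + self.interval
        return delay
```

Calling wait() before each submission with MinIntervalThrottle(2) spaces requests 30 seconds apart.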

All five affected operators have been contacted via email, Telegram, or the public security@ address listed on their domains. Public disclosure of per-site URLs, specific platform attribution, and reproducible exploit paths is withheld pending a 90-day remediation window, per the Sekrd responsible-disclosure policy.

Try it yourself

Free anonymous scan at sekrd.com/scan. Score + top 3 findings + evidence. Free sign-in unlocks all findings + copy-pasteable fix prompts for AI-coding IDEs. Pro adds unlimited scans, Deep Inspect (AI re-reads source captured during the scan), CI badges, scheduled rescans, and the full CI/CD integration above.
