Compliance Methodology
How we produce compliance findings
We don't ask a language model to invent legal opinions. We use patterns to find what's missing, structured extraction to read what's there, and a closed-set citation whitelist to guarantee that every cited article actually exists in our corpus. This page documents the chain of reasoning behind each finding so you can verify it.
The four layers
- Pattern detection. Deterministic Go code looks for specific markers in your live page: presence or absence of a privacy policy link in five languages, cookie consent platforms (25+ recognised CMPs), 11 tracker scripts loaded without consent, PII patterns in browser storage, account deletion mechanisms, the CCPA "Do Not Sell or Share" link, Global Privacy Control signal handling, sensitive PI markers (geolocation, biometrics, health), and App Store and Play Store privacy patterns. Each pattern produces a finding with explicit evidence we quote back to you.
- Structured extraction. Findings are sent to a large language model (Gemini 2.5 Pro) along with a corpus of verbatim regulation text. The model is constrained by a JSON schema to return only structured facts (controller named, retention disclosed, trackers consent-gated) and citation IDs from a fixed whitelist. It cannot return free text or invent new citation identifiers.
- Deterministic mapping. Extracted facts pass through Go code that maps them to applicable jurisdictions (EU GDPR, UK GDPR, US CCPA/CPRA, Brazil LGPD) and to specific articles. The model never generates article numbers; it only extracts facts. Article numbers come from our corpus lookup table.
- Closed-set whitelist. Every citation ID the model proposes is checked against our corpus. Any unknown ID is dropped and counted in a fabrication metric. We measure this rate continuously across our evaluation fixture set; if it exceeds 2% we swap models. The current rate, on our test corpus, is 0%.
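The closed-set check and the fabrication metric described above can be sketched in a few lines of Go. This is a minimal illustration, not our production code: the Citation type, the function names, and the citation IDs in the sample data are all hypothetical. The only value taken from this page is the 2% threshold.

```go
package main

import "fmt"

// Citation is a hypothetical corpus entry; the field names are illustrative.
type Citation struct {
	ID      string
	Article string
}

// maxFabricationRate mirrors the 2% threshold stated in the text.
const maxFabricationRate = 0.02

// filterCitations keeps only the proposed IDs that exist in the corpus.
// Unknown IDs are dropped and counted so they can feed the fabrication metric.
func filterCitations(proposed []string, corpus map[string]Citation) (kept []Citation, dropped int) {
	for _, id := range proposed {
		if c, ok := corpus[id]; ok {
			kept = append(kept, c)
		} else {
			dropped++
		}
	}
	return kept, dropped
}

// fabricationRate returns the share of proposed IDs that were unknown.
// It returns 0 when nothing was proposed, to avoid dividing by zero.
func fabricationRate(dropped, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(dropped) / float64(total)
}

func main() {
	// Assumed sample corpus; real IDs live in the published citations file.
	corpus := map[string]Citation{
		"gdpr-art-13":   {ID: "gdpr-art-13", Article: "GDPR Art. 13"},
		"ccpa-1798-135": {ID: "ccpa-1798-135", Article: "Cal. Civ. Code 1798.135"},
	}

	// The second ID does not exist in the corpus, so it is dropped.
	proposed := []string{"gdpr-art-13", "gdpr-art-999", "ccpa-1798-135"}

	kept, dropped := filterCitations(proposed, corpus)
	rate := fabricationRate(dropped, len(proposed))

	fmt.Printf("kept=%d dropped=%d rate=%.2f over_threshold=%t\n",
		len(kept), dropped, rate, rate > maxFabricationRate)
}
```

Because the article text comes from the corpus map rather than from the model's output, a dropped ID can never reach a report. It can only raise the measured rate.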
Jurisdictions covered
Compliance v1 covers the regulations that account for roughly 95% of our customer audience:
- EU General Data Protection Regulation (GDPR) — 19 article references
- UK GDPR + Data Protection Act 2018 + PECR — 7 references
- US California Consumer Privacy Act (CCPA) — 9 sections
- US California Privacy Rights Act (CPRA) — 5 sections
- Brazil Lei Geral de Proteção de Dados (LGPD) — 8 articles
- Apple App Store Privacy Manifest — pattern detection only
- Google Play Data Safety form — pattern detection only
Other jurisdictions (HIPAA, COPPA, additional US state privacy laws like VCDPA, CPA, CTDPA, plus PIPL, DPDP, APPI, POPIA) are on the roadmap. The absence of a finding under one jurisdiction does not imply compliance under another.
Update cadence
Privacy regulation changes. Our corpus refreshes quarterly:
- For EU GDPR and UK GDPR, an automated workflow pulls the consolidated text from the EUR-Lex SPARQL endpoint and opens a pull request when the official text changes.
- For US state laws and Apple / Google store policies, a manual quarterly review catches updates that aren't exposed via API.
- Each citation in our corpus carries an effective_date and retrieved_at stamp, rendered in your audit PDF so you know how current the source text is.
What we don't do
- We don't certify GDPR, CCPA, or any other compliance.
- We don't replace counsel. Generated documents and findings are starting points, not final products.
- We don't predict enforcement risk or regulator behaviour. Findings reference what the law says, not what a regulator might do.
- We don't store your raw HTML beyond the audit run that produced it.
Verifying a finding
Every finding in your audit PDF is annotated with the citation IDs the analyzer matched to it. Each citation ID corresponds to a specific article in our corpus, and the corpus quote is included verbatim in the report. If you want to verify the underlying source, the article number, the effective date, and the source URL are all printed alongside the citation.
Our compliance citation corpus is published as a public read-only repository at github.com/sekrdcom/compliance-rules under the Apache 2.0 license. Every citation_id in your audit report links to its line in citations.yaml in that repository, so you can verify the article number, the verbatim quote, and the source URL in one click.
The compliance audit is an automated detection tool, not legal advice. See Terms of Service Section 14 for the full legal framing.