Setting Up Lighthouse CI Thresholds for WCAG 2.2 AA

Pipeline failures frequently occur when Lighthouse CI score thresholds conflict with actual WCAG 2.2 AA compliance. Misconfigured assertions trigger false negatives on valid pages, stalling deployments and wasting engineering cycles. This guide provides exact lighthouserc.json assertion mapping, DOM-state validation steps, and false-positive mitigation strategies. For foundational context on audit selection and tool alignment, review Web Accessibility Testing Fundamentals & Tool Selection before adjusting pipeline gates. To decide whether Lighthouse’s score gate or a rule-based scanner should own the merge block, see axe-core vs Lighthouse CI for PR Gating.

Scored vs manual audit triage Lighthouse audits split into scored audits, which gate on minScore, and manual scoreDisplayMode audits, which return null and must be set to off and sent to a human reviewer. a11y audit scoreDisplayMode? scored: 0–1 gate on minScore manual: null no auto score error block warn log off + human review issue tracker
Manual audits return a null score and will stall a build if left to fail — set them to off and route them to human review; only scored audits belong on error or warn.

Root Cause: Threshold Misalignment & Audit ID Mapping

Lighthouse evaluates accessibility through heuristic scoring, not direct WCAG conformance checks. This architectural gap causes CI pipelines to fail even when manual audits confirm compliance.

  • WCAG 2.2 AA introduces updated success criteria, including Focus Not Obscured (2.4.11) and Target Size (Minimum) (2.5.8). Lighthouse audits approximate these via DOM inspection.
  • Default lhci configurations often enforce minScore: 0.90 across all categories. This blanket threshold triggers false negatives on heuristic-heavy audits.
  • Audits returning scoreDisplayMode: "manual" default to a score of null in Lighthouse’s internal model. Without explicit pipeline handling, they can block builds unexpectedly.

Diagnose failures by inspecting the raw Lighthouse JSON output. Cross-reference failing audit IDs against your actual DOM state before adjusting thresholds.

Exact Configuration: lighthouserc.json Assertions

Enforce WCAG 2.2 AA compliance by defining granular thresholds in the assertions object. Avoid global category gates that penalize heuristic variance.

{
  "ci": {
    "collect": {
      "settings": {
        "formFactor": "desktop",
        "onlyCategories": ["accessibility"]
      }
    },
    "assert": {
      "assertions": {
        "categories:accessibility": ["error", { "minScore": 0.95 }],
        "target-size": ["warn", { "minScore": 0.80 }],
        "focus-order-semantics": ["error", { "minScore": 1.0 }],
        "color-contrast": ["error", { "minScore": 1.0 }]
      }
    }
  }
}

This configuration sets strict error gating for deterministic criteria like focus order and color contrast. It downgrades heuristic-heavy audits like target-size to warn, preventing pipeline paralysis on edge cases. Note that the audit key focus-order used in some documentation is actually focus-order-semantics in Lighthouse’s audit catalog—use npx lhci collect locally and inspect the JSON report to confirm exact audit IDs in your Lighthouse version. To prevent threshold drift across sprints, align your persistent baseline storage with Lighthouse CI Baseline Configuration.

Validation & False-Positive Resolution

Validate threshold accuracy against staging environments before merging to main. Run targeted collections to isolate scanner artifacts.

  1. Execute npx lhci autorun against your staging environment to capture live DOM states.
  2. Inspect the generated report. Differentiate between score: 0 (actual failure) and scoreDisplayMode: "manual" (requires human review).
  3. Filter out manual audits from CI gating by setting them to off in your assertions, and route them to issue trackers for manual validation.
  4. Enforce desktop form factor in collection settings. Mobile viewport scaling artificially inflates target-size failures due to touch-area calculations.

If audits consistently fail despite valid markup, verify that ARIA attributes are injected before the load event fires.

Pipeline Impact & Gating Strategy

Integrate thresholds into your CI workflow without causing build paralysis. Implement tiered gating to balance velocity and compliance.

  • Apply warn thresholds on feature branches. Reserve error gating for main and release branches.
  • Cache baseline reports in your CI runner. This prevents flaky failures caused by transient network latency or third-party script injection.
  • Configure the upload step with LHCI_GITHUB_APP_TOKEN to surface PR annotations directly in the diff.
  • Monitor target-size and focus-order-semantics for dynamic DOM injection delays. Increase load timeouts if SPAs hydrate after initial paint.
- name: Run Lighthouse CI
  run: |
    npx lhci autorun \
      --collect.url=https://staging.your-domain.com \
      --collect.settings.maxWaitForLoad=15000 \
      --collect.settings.formFactor=desktop
  env:
    LHCI_GITHUB_APP_TOKEN: ${{ secrets.LHCI_GITHUB_APP_TOKEN }}

The extended maxWaitForLoad parameter (in milliseconds) ensures dynamically injected ARIA attributes and focus traps are captured before scoring begins.

Common Pitfalls

  • Setting global minScore: 1.0 thresholds causes unavoidable failures on heuristic audits.
  • Ignoring scoreDisplayMode: "manual" audits, which require human review and should be set to off in CI gating rather than left to fail the build.
  • Failing to account for dynamic content injection delays, causing focus-order-semantics scanner misses.
  • Overriding target-size thresholds without verifying actual touch target bounding boxes in the computed DOM.

FAQ

How do I map Lighthouse audit IDs to specific WCAG 2.2 AA success criteria? Run npx lhci collect locally and open the generated HTML report. Each audit card links to relevant WCAG criteria. For programmatic access, inspect lhr.audits in the JSON output: each audit object has a description and helpUrl field pointing to the corresponding WCAG criterion.

Why does Lighthouse CI fail on target-size despite passing manual WCAG checks? Lighthouse calculates target size from the computed CSSOM, which may exclude pseudo-elements or dynamically injected touch areas. Set the assertion to warn and validate manually, or use --collect.settings.formFactor=desktop since touch-target audits are less relevant for desktop viewports.

How do I prevent threshold flakiness from dynamic DOM updates? Increase maxWaitForLoad in lighthouserc.json to 15000–30000ms for heavy SPAs. Use ci.collect.startServerCommand to spin up a production-equivalent local server before collection. Cache baselines to ignore transient layout shifts that normalize between runs.