How It Works

Understand the evidence SDK philosophy - customer-run, read-only, and inspectable.

Customer-Run Architecture

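Traditional compliance tools scan your infrastructure from the vendor side. The evidence SDK inverts this model: collection runs inside your environment, and only the signed bundle leaves it, and only if you choose to upload or share it.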

You Control Collection

┌─────────────────────────────────────────────────────────────┐
│  Your Infrastructure                                         │
│                                                              │
│  ┌──────────────┐      ┌──────────────┐      ┌───────────┐ │
│  │   GitHub     │      │     AWS      │      │  Google   │ │
│  │              │      │              │      │ Workspace │ │
│  └──────┬───────┘      └──────┬───────┘      └─────┬─────┘ │
│         │                     │                     │        │
│         └─────────────────────┼─────────────────────┘        │
│                               │                              │
│                    ┌──────────▼──────────┐                   │
│                    │  evidence SDK       │                   │
│                    │  (Your CI/CD)       │                   │
│                    └──────────┬──────────┘                   │
│                               │                              │
│                    ┌──────────▼──────────┐                   │
│                    │  Signed Bundle      │                   │
│                    │  (.tar.gz)          │                   │
│                    └──────────┬──────────┘                   │
└───────────────────────────────┼──────────────────────────────┘

                     ┌──────────▼──────────┐
                     │  evidence Platform  │
                     │  (Optional Upload)  │
                     └─────────────────────┘

Key points:

  • Runs in your environment - Your CI/CD, your infrastructure, your control
  • You decide when - Manual runs, scheduled jobs, or triggered by events
  • You decide what - Configure which connectors and controls to collect
  • Direct collection - SDK collects directly from your APIs using your credentials

Why This Matters

Security: You provide signed evidence bundles, not API credentials. Credentials stay in your infrastructure.

Compliance: Meets auditor requirements for customer-controlled evidence collection. You can prove when and how evidence was collected.

Transparency: Complete visibility into what's collected. All artifacts are standard JSON files that can be inspected with any text editor.

Read-Only Enforcement

The evidence SDK never requests write permissions. Read-only scopes are enforced at multiple levels.

Scope Enforcement

Each connector has a hardcoded allowlist of API scopes (or, for AWS, API actions):

GitHub:

// Only these scopes allowed
const ALLOWED_SCOPES = [
  'repo:read',
  'public_repo',
  'read:org'
];

AWS:

// Only Get*, Describe*, List* actions
const ALLOWED_ACTIONS = [
  'iam:GetAccountPasswordPolicy',
  'cloudtrail:DescribeTrails',
  'cloudtrail:GetTrailStatus',
  'logs:DescribeLogGroups'
];

Google Workspace:

// Only readonly scopes
const ALLOWED_SCOPES = [
  'admin.directory.user.readonly',
  'admin.directory.rolemanagement.readonly'
];
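The comment on the AWS allowlist describes a prefix rule (Get*, Describe*, List*). As a minimal sketch of how such a rule might be checked, assuming IAM action strings of the form service:Action, the helper names isReadOnlyAction and findForbiddenActions below are hypothetical and not part of the SDK's documented API:

```javascript
// Hypothetical helpers; the SDK's real implementation is not shown in this document.
const READ_ONLY_PREFIXES = ['Get', 'Describe', 'List'];

// Returns true when an IAM action such as "cloudtrail:DescribeTrails"
// starts with one of the read-only verb prefixes.
function isReadOnlyAction(action) {
  const parts = action.split(':');
  if (parts.length !== 2 || parts[0] === '' || parts[1] === '') {
    return false; // malformed strings are rejected
  }
  return READ_ONLY_PREFIXES.some((prefix) => parts[1].startsWith(prefix));
}

// Returns every action that fails the read-only check, in input order.
function findForbiddenActions(actions) {
  return actions.filter((action) => !isReadOnlyAction(action));
}

const requested = [
  'iam:GetAccountPasswordPolicy',
  'cloudtrail:DescribeTrails',
  'iam:CreateUser',
  'iam:*',
];
console.log(findForbiddenActions(requested));
```

A prefix rule on its own is broader than the explicit allowlist: secretsmanager:GetSecretValue, for example, passes the prefix test but returns secret material. That is one reason an explicit list of actions is the stricter choice.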

Runtime Validation

The policy pack validates scopes before collection:

// Validation happens before any API calls
validateScopes(config.sources.github.scopes);
// Throws error if write scopes detected

// Then collection proceeds
collectEvidence(config);

If you accidentally configure write scopes:

✗ Configuration validation failed

Forbidden scope detected: repo (write access)

evidence SDK only uses read-only scopes. Allowed GitHub scopes:
  - repo:read
  - public_repo
  - read:org

Please update your token to use read-only scopes only.
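The validation step above can be sketched as a plain allowlist check. This is an illustration only: the body of validateScopes and the collectIfValid wrapper are assumptions, since the document shows the call but not the implementation.

```javascript
// Allowlist copied from the GitHub example in this document.
const ALLOWED_SCOPES = ['repo:read', 'public_repo', 'read:org'];

// Hypothetical sketch: throws if any configured scope is outside the allowlist.
function validateScopes(scopes, allowed = ALLOWED_SCOPES) {
  const forbidden = scopes.filter((scope) => !allowed.includes(scope));
  if (forbidden.length > 0) {
    throw new Error(
      `Forbidden scope detected: ${forbidden.join(', ')}. ` +
        `Allowed scopes: ${allowed.join(', ')}`
    );
  }
}

// Validation runs first, so the collect callback is never invoked
// when a scope outside the allowlist is configured.
function collectIfValid(scopes, collect) {
  validateScopes(scopes);
  return collect();
}

try {
  collectIfValid(['repo', 'read:org'], () => 'collected');
} catch (err) {
  console.log(err.message);
}
```

Checking against an allowlist, rather than searching for known write scopes, means any scope the SDK has never heard of is rejected by default.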

Why This Matters

Security: Principle of least privilege. Because the credentials you supply carry only read scopes, the SDK cannot modify your infrastructure, even if it is compromised.

Audit compliance: Auditors require proof that collection tools are read-only. Hardcoded allowlists provide this proof.

Trust: You can verify the SDK never requests write access by inspecting the source code.

Inspectable Bundles

evidence bundles use industry-standard formats. No proprietary tools required.

Standard Formats

TAR + GZIP

# Extract with standard tools
tar -xzf evidence-bundle-*.tar.gz

JSON

# View with any JSON tool
jq . sources/github/org_settings.json

SHA-256

# Verify with system tools
sha256sum -c checksums.sha256

Ed25519

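# Verify with OpenSSL 3.0+ (assumes signature.sig is a raw Ed25519 signature)
# Ed25519 signs the file directly, so there is no separate -sha256 digest step
openssl pkeyutl -verify -pubin -inkey public.pem -rawin -in checksums.sha256 -sigfile signature.sig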

Bundle Anatomy

evidence-bundle-20260109-123456.tar.gz

├── manifest.json              # Bundle metadata
│   ├── bundle_version: "1.0"
│   ├── framework: "soc2_type1"
│   ├── controls: ["CC6.1", "CC6.6", "CC7.2"]
│   ├── artifacts: [...]       # List of collected files
│   └── signer: {...}          # Public key info

├── run.json                   # Collection context
│   ├── started_at
│   ├── completed_at
│   ├── hostname
│   └── environment

├── checksums.sha256           # SHA-256 of all files
│   abc123...  manifest.json
│   def456...  run.json
│   ghi789...  sources/github/org_settings.json

├── signature.sig              # Ed25519 signature over checksums

├── sources/                   # Raw API responses
│   ├── github/
│   │   ├── org_settings.json
│   │   └── repo_*_branch_protection.json
│   ├── aws/
│   │   ├── iam_password_policy.json
│   │   └── cloudtrail_trails.json
│   └── google-workspace/
│       └── users_2sv.json

└── derived/                   # Normalized evidence
    ├── normalized.json        # Control mappings
    └── hints.json             # Compliance recommendations
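Because manifest.json lists the collected files, a reviewer can cross-check it against what is actually in the archive. The sketch below assumes that artifacts is an array of bundle-relative paths; the real manifest schema may differ, and compareManifest is a hypothetical name.

```javascript
// Hypothetical sketch: compares manifest.artifacts with the paths found in a bundle.
// Assumes "artifacts" is an array of bundle-relative paths under sources/.
function compareManifest(manifest, bundlePaths) {
  const listed = new Set(manifest.artifacts);
  const present = new Set(bundlePaths);
  return {
    // Listed in the manifest but absent from the archive.
    missing: [...listed].filter((p) => !present.has(p)).sort(),
    // Present under sources/ but not declared in the manifest.
    unlisted: [...present]
      .filter((p) => p.startsWith('sources/') && !listed.has(p))
      .sort(),
  };
}

const manifest = {
  bundle_version: '1.0',
  framework: 'soc2_type1',
  controls: ['CC6.1', 'CC6.6', 'CC7.2'],
  artifacts: [
    'sources/github/org_settings.json',
    'sources/aws/iam_password_policy.json',
  ],
};

const bundlePaths = [
  'manifest.json',
  'run.json',
  'sources/github/org_settings.json',
  'sources/aws/cloudtrail_trails.json',
];

console.log(compareManifest(manifest, bundlePaths));
```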

Verification Chain

1. Files → SHA-256 Checksums
   Each file hashed individually

2. Checksums → Ed25519 Signature
   Checksums file signed with private key

3. Signature → Public Key Verification
   Anyone with public key can verify

Chain of Trust:
  Tamper with ANY file → its checksum no longer matches → verification fails
  Rewrite the checksums to match → signature no longer verifies
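The three steps can be reproduced with Node's built-in crypto module. This is a minimal sketch rather than the SDK's code: buildChecksums and verifyBundle are hypothetical names, files are held in memory instead of read from disk, and the key pair is generated on the spot for the demo.

```javascript
const nodeCrypto = require('node:crypto');

// Step 1: hash each file and emit "<hex>  <path>" lines (the sha256sum format).
function buildChecksums(files) {
  return Object.keys(files)
    .sort()
    .map((path) => {
      const hex = nodeCrypto.createHash('sha256').update(files[path]).digest('hex');
      return `${hex}  ${path}\n`;
    })
    .join('');
}

// Step 3: verify the signature over the checksums text, then confirm
// that the files still hash to the recorded values.
function verifyBundle(files, checksums, signature, publicKey) {
  const signatureOk = nodeCrypto.verify(null, Buffer.from(checksums), publicKey, signature);
  const checksumsOk = buildChecksums(files) === checksums;
  return { signatureOk, checksumsOk };
}

// Throwaway key pair for the demo; a real signing key would be managed separately.
const { publicKey, privateKey } = nodeCrypto.generateKeyPairSync('ed25519');

const files = {
  'manifest.json': '{"bundle_version":"1.0"}',
  'run.json': '{"hostname":"ci-runner"}',
};

// Step 2: sign the checksums text. Ed25519 takes no separate digest, hence null.
const checksums = buildChecksums(files);
const signature = nodeCrypto.sign(null, Buffer.from(checksums), privateKey);

// Intact bundle: both checks pass.
console.log(verifyBundle(files, checksums, signature, publicKey));

// Tampered file: the signature still matches the old checksums,
// but the file no longer matches its recorded hash.
const tampered = { ...files, 'run.json': '{"hostname":"attacker"}' };
console.log(verifyBundle(tampered, checksums, signature, publicKey));

// Checksums rewritten to hide the change: now the signature check fails.
console.log(verifyBundle(tampered, buildChecksums(tampered), signature, publicKey));
```

Signing only the checksums file keeps the signature small and fixed-size while still covering every file it lists.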

Why This Matters

Auditor-friendly: Auditors can verify bundles with standard system tools (tar, sha256sum, openssl).

Legally defensible: A valid signature shows the bundle was produced by the holder of the signing key and has not changed since. This supports non-repudiation as long as the key is kept secure.

Future-proof: Standard formats ensure bundles remain accessible regardless of SDK changes.

No Sensitive Data Collection

The evidence SDK never collects source code, secrets, or credentials.

Forbidden Artifacts

Never collected:

  • Repository source code
  • Pull request content or diffs
  • Environment variables
  • Secrets or credentials
  • User passwords
  • API keys
  • Database contents

Only configuration:

  • GitHub: Org settings, branch protection rules, CODEOWNERS files
  • AWS: IAM policies, CloudTrail configuration, log settings
  • Google Workspace: 2SV enforcement, admin roles, user lifecycle

Policy Validation

The policy pack scans artifacts for forbidden patterns:

const FORBIDDEN_PATTERNS = [
  /password/i,
  /secret/i,
  /api[_-]?key/i,
  /credentials?/i,
  /token/i
];

// Scans each artifact
for (const artifact of artifacts) {
  validateNoSecrets(artifact);
}

If forbidden data is detected:

✗ Policy violation detected

Artifact may contain secrets: sources/github/env_vars.json
Pattern matched: /api[_-]?key/i

evidence SDK does not collect sensitive data. This artifact will be excluded.
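One way a check like validateNoSecrets could work is to walk each parsed artifact and test key names against the patterns. The sketch below does exactly that and nothing more. Matching on key names only is an assumption, since the document does not say whether the SDK inspects keys, values, or file names, and findForbiddenKeys is a hypothetical helper.

```javascript
const FORBIDDEN_PATTERNS = [
  /password/i,
  /secret/i,
  /api[_-]?key/i,
  /credentials?/i,
  /token/i,
];

// Hypothetical sketch: walks a parsed JSON artifact and reports every object key
// whose name matches a forbidden pattern, along with its JSON path.
function findForbiddenKeys(value, path = '$') {
  if (value === null || typeof value !== 'object') {
    return [];
  }
  const isArray = Array.isArray(value);
  const hits = [];
  for (const [key, child] of Object.entries(value)) {
    const childPath = isArray ? `${path}[${key}]` : `${path}.${key}`;
    if (!isArray) {
      // Array indices are not names, so only object keys are tested.
      const pattern = FORBIDDEN_PATTERNS.find((p) => p.test(key));
      if (pattern) {
        hits.push({ path: childPath, pattern: String(pattern) });
      }
    }
    hits.push(...findForbiddenKeys(child, childPath));
  }
  return hits;
}

const artifact = {
  org: 'acme',
  variables: [{ name: 'REGION' }, { API_KEY: 'redacted' }],
};
console.log(findForbiddenKeys(artifact));
```

Patterns this broad also match legitimate configuration names, such as a key containing "password" in a password policy, so a production check would probably need an exemption list or value-level inspection.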

Why This Matters

Security: No risk of leaking source code or credentials through evidence bundles.

Privacy: Only configuration metadata is collected, never user data or content.

Compliance: Meets data minimization requirements. Collect only what's necessary for compliance.

Collection Workflow

Here's how evidence collection works from start to finish:

1. Configuration

# evidence.yaml
framework: soc2_type1
controls:
  - CC6.1
  - CC6.6
  - CC7.2

sources:
  github:
    mode: token
    token_env: GITHUB_TOKEN
    org: acme
    repos:
      - acme/backend
      - acme/frontend

2. Validation (Pre-Collection)

✓ Configuration schema valid
✓ Framework supported: soc2_type1
✓ Controls valid: CC6.1, CC6.6, CC7.2
✓ GitHub token found: GITHUB_TOKEN
✓ GitHub scopes valid: repo:read, read:org
✓ Test connection successful
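The offline checks in this step (framework, controls, token presence) can be sketched as a function that returns a list of problems. The tables SUPPORTED_FRAMEWORKS and KNOWN_CONTROLS and the function validateConfig are illustrative assumptions. The config object mirrors the evidence.yaml example as a plain object, and the scope and connection checks are left out because they need network access.

```javascript
// Illustrative values only; the SDK's real tables are not shown in this document.
const SUPPORTED_FRAMEWORKS = ['soc2_type1'];
const KNOWN_CONTROLS = ['CC6.1', 'CC6.6', 'CC7.2'];

// Returns a list of human-readable problems; an empty list means the config passed.
function validateConfig(config, env) {
  const problems = [];
  if (!SUPPORTED_FRAMEWORKS.includes(config.framework)) {
    problems.push(`Unsupported framework: ${config.framework}`);
  }
  for (const control of config.controls ?? []) {
    if (!KNOWN_CONTROLS.includes(control)) {
      problems.push(`Unknown control: ${control}`);
    }
  }
  const github = config.sources?.github;
  if (github && github.mode === 'token' && !env[github.token_env]) {
    problems.push(`Token not found in environment: ${github.token_env}`);
  }
  return problems;
}

// Same shape as the evidence.yaml example, written as a plain object.
const config = {
  framework: 'soc2_type1',
  controls: ['CC6.1', 'CC6.6', 'CC7.2'],
  sources: {
    github: {
      mode: 'token',
      token_env: 'GITHUB_TOKEN',
      org: 'acme',
      repos: ['acme/backend', 'acme/frontend'],
    },
  },
};

console.log(validateConfig(config, { GITHUB_TOKEN: 'example' }));
console.log(validateConfig({ ...config, controls: ['CC9.9'] }, {}));
```

Collecting every problem before reporting, instead of stopping at the first one, lets a user fix a configuration in a single pass.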

3. Collection

Collecting evidence...

GitHub connector:
  ✓ Org settings (2FA enforcement)
  ✓ Branch protection: acme/backend
  ✓ Branch protection: acme/frontend
  ✓ CODEOWNERS: acme/backend
  ✓ CODEOWNERS: acme/frontend

5 artifacts collected (23.4 KB)

4. Normalization

Mapping artifacts to controls...

CC6.1 (Logical Access):
  ✓ GitHub 2FA enabled

CC6.6 (Access Modification):
  ✓ CODEOWNERS enforced (code review)

CC7.2 (Change Management):
  ✓ Branch protection (2+ reviewers)
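Normalization can be pictured as a table that maps each control to one or more checks over the collected artifacts. The sketch below is hypothetical: CONTROL_CHECKS and mapToControls are invented names, the labels are borrowed from the sample output, and the artifact fields are assumed to mirror GitHub's REST API responses.

```javascript
// Hypothetical control-to-check table. Labels come from the sample output;
// field names are assumed to mirror GitHub's REST API responses.
const CONTROL_CHECKS = {
  'CC6.1': [
    {
      label: 'GitHub 2FA enabled',
      test: (artifacts) =>
        artifacts.orgSettings?.two_factor_requirement_enabled === true,
    },
  ],
  'CC7.2': [
    {
      label: 'Branch protection (2+ reviewers)',
      test: (artifacts) =>
        (artifacts.branchProtection?.required_pull_request_reviews
          ?.required_approving_review_count ?? 0) >= 2,
    },
  ],
};

// Evaluates every check registered for each requested control.
// Controls with no registered checks get an empty results list.
function mapToControls(artifacts, controls) {
  return controls.map((control) => ({
    control,
    results: (CONTROL_CHECKS[control] ?? []).map(({ label, test }) => ({
      label,
      passed: test(artifacts),
    })),
  }));
}

const sampleArtifacts = {
  orgSettings: { two_factor_requirement_enabled: true },
  branchProtection: {
    required_pull_request_reviews: { required_approving_review_count: 1 },
  },
};

console.log(JSON.stringify(mapToControls(sampleArtifacts, ['CC6.1', 'CC7.2']), null, 2));
```

With this sample input the 2FA check passes and the reviewer check fails, because only one approving review is required.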

5. Bundle Creation

Creating bundle...
  ✓ Manifest created
  ✓ SHA-256 checksums generated
  ✓ Ed25519 signature created
  ✓ Compressed to tar.gz

Bundle: evidence-bundle-20260109-123456.tar.gz (24.1 KB)
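The bundle name in the sample output looks like a date and time stamp (YYYYMMDD-HHMMSS). A minimal sketch of producing such a name follows. Both that reading of the name and the use of UTC are assumptions, and bundleFileName is a hypothetical helper.

```javascript
// Hypothetical helper: formats a bundle file name like the one in the sample output.
// Using UTC is an assumption; the document does not state the SDK's time zone.
function bundleFileName(date) {
  const pad = (n) => String(n).padStart(2, '0');
  const day =
    `${date.getUTCFullYear()}${pad(date.getUTCMonth() + 1)}${pad(date.getUTCDate())}`;
  const time =
    `${pad(date.getUTCHours())}${pad(date.getUTCMinutes())}${pad(date.getUTCSeconds())}`;
  return `evidence-bundle-${day}-${time}.tar.gz`;
}

// 9 January 2026, 12:34:56 UTC (months are zero-based in Date.UTC).
console.log(bundleFileName(new Date(Date.UTC(2026, 0, 9, 12, 34, 56))));
// prints "evidence-bundle-20260109-123456.tar.gz"
```

Time-stamped names sort chronologically, which helps when bundles from scheduled runs accumulate in one directory.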

6. Verification

Verifying bundle...
  ✓ Checksums valid (7/7 files)
  ✓ Signature valid

Bundle ready for upload or auditor distribution.

Key Design Decisions

Why TAR/GZIP?

  • Universal: Supported on all platforms (Unix, Windows, macOS)
  • Standard: No proprietary compression needed
  • Auditor-friendly: Familiar format, easy to inspect
  • Future-proof: Both formats have been stable for decades and are likely to stay readable

Why Ed25519?

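  • Fast: Signs faster than RSA at comparable security, with much smaller keys and 64-byte signatures
  • Secure: Designed to avoid secret-dependent branches and memory lookups, which makes constant-time, timing-attack-resistant implementations practical
  • Simple: One fixed algorithm, with no key size, padding, or hash function to configure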
  • Standard: Widely supported (OpenSSL 1.1.1+, modern crypto libraries)

Why SHA-256?

  • Proven: Industry standard for integrity verification
  • Fast: Efficient computation on modern CPUs
  • Collision-resistant: No practical method is known for finding two files with the same hash
  • Standard: Command-line tools ship with, or are readily available for, all major operating systems

Why JSON for Artifacts?

  • Human-readable: Can inspect with text editor
  • Machine-parsable: Easy to process programmatically
  • Standard: Universal format, widely supported
  • Structured: Maintains data types and relationships

Deployment Patterns

Pattern 1: Manual Collection

Run locally on your machine:

# Set credentials
export GITHUB_TOKEN=ghp_...
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...

# Collect
evidence collect

# Review bundle
tar -xzf evidence-bundles/*.tar.gz
jq . evidence-bundle-*/manifest.json

# Share with auditor
# Send bundle + public key via separate channels

Use case: Ad-hoc collection, initial setup, testing

Pattern 2: Scheduled CI/CD

Run monthly in GitHub Actions:

# .github/workflows/evidence.yml
name: evidence collection

on:
  schedule:
    - cron: '0 0 1 * *'  # Monthly on 1st
  workflow_dispatch:

jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: evidence-sdk/action@v1
        with:
          command: collect
          upload: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          EVIDENCE_SIGNING_KEY: ${{ secrets.EVIDENCE_SIGNING_KEY }}

Use case: Continuous compliance, SOC 2 Type II (evidence across an observation period, commonly 3-12 months)

Pattern 3: Event-Triggered

Run on specific events:

on:
  push:
    branches: [main]
  pull_request:
    types: [closed]

Use case: Capture evidence at critical moments (releases, major changes)


Next Steps

Configure Your Setup: Learn about all available configuration options in evidence.yaml.

Configuration Guide →

Understand Bundles: Take a deep dive into bundle structure, checksums, and signatures.

Bundle Format →

Map to SOC 2 Controls: See how collected evidence maps to CC6.1, CC6.6, and CC7.2.

SOC 2 Controls →