Testing
Testing in CI
Continuous Integration (CI) means running tests automatically on every commit, before code reaches main or production. CI testing catches bugs immediately—a developer knows within minutes if their code broke something. This fast feedback is invaluable. Without CI, bugs hide until someone manually tests or a user finds them.
Why CI Testing Matters
CI testing provides instant feedback:
- Catch bugs early: A developer fixes their own code in the same session, not days later.
- Prevent breaking changes from reaching main: A test failure blocks a merge, protecting the main branch.
- Build confidence: Passing CI tests mean the code has been vetted by automated checks.
- Document assumptions: Tests document how code should behave, visible to reviewers.
- Enable refactoring: With CI tests, developers can refactor safely. Passing tests are strong evidence that nothing broke.
- Reduce manual testing: Automated tests replace tedious manual regression testing.
Teams without CI often discover bugs late—during manual testing, UAT, or in production. CI shifts testing left, catching bugs when they're cheapest to fix.
CI/CD Platforms
Popular CI/CD platforms:
| Platform | Pros | Cons |
|---|---|---|
| GitHub Actions | Native to GitHub, free tier, easy workflow files | Limited to GitHub, can be verbose |
| GitLab CI/CD | Built into GitLab, powerful, free tier | Only for GitLab |
| CircleCI | Great user experience, multi-platform support, free tier | Paid for advanced features |
| Jenkins | Open source, highly customizable, works with any git host | Self-hosted (requires maintenance), steeper learning curve |
| Travis CI | Simple to set up, good for open source | Paid tiers, smaller community |
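Many teams use GitHub Actions or GitLab CI/CD, largely because each is built into the code host they already use and has a free tier. If you need advanced features or multi-platform support, CircleCI is a strong option. Jenkins suits teams that need maximum control and are willing to run their own infrastructure.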
Setting Up a CI Pipeline
A basic CI pipeline has these stages:
- Trigger: On every commit (or pull request), the pipeline starts.
- Checkout: CI pulls the code.
- Install dependencies: npm install, pip install, etc.
- Lint/format: ESLint, Prettier, etc. fail the build if code doesn't meet standards.
- Type check: TypeScript, mypy, etc. catch type errors.
- Unit tests: Run fast unit tests.
- Integration tests: Run integration tests against test database.
- Build: Compile or bundle the code.
- Security scanning: Snyk, SonarQube, dependency scanning.
- E2E tests (optional): Run against staging environment (slower, run less frequently).
- Report: Generate reports and notify developers of results.
Not every pipeline includes all stages. A simple pipeline might only install dependencies, lint, run unit tests, and build. A complex one might include every stage plus deployment. Tailor the pipeline to your needs.
Test Parallelization and Sharding
Test suites can grow large (thousands of tests). Running them sequentially takes too long. Parallelization runs tests in parallel:
- Split tests across workers: CI system runs tests on multiple machines or processes. Worker 1 runs tests A-M, Worker 2 runs N-Z.
- Sharding: The suite is divided into shards, and each shard runs on its own worker. A simple scheme shards by category: unit tests on one machine, integration tests on another.
- Load balancing: Sophisticated systems distribute tests to balance work. Fast tests and slow tests mix so no worker is idle.
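With 4 workers, a 10-minute test suite takes about 2.5 minutes at best; with 10 workers, about 1 minute. Real speedups are smaller, because each worker pays its own setup cost and the slowest shard sets the total time. Parallelization is powerful but requires care: tests must be independent, so that they can run in any order and on any worker.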
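As a rough illustration of the splitting and load-balancing ideas above, here is a minimal Python sketch. The function names (`shard_by_hash`, `balance_by_duration`) and the test IDs are hypothetical; real CI systems and test runners ship their own sharding options, which may work differently.

```python
import hashlib


def shard_by_hash(test_ids, shard_index, total_shards):
    """Return the tests one worker should run, chosen by a stable hash.

    Every worker calls this with the same test list and its own index,
    so no coordination between workers is needed.
    """
    selected = []
    for test_id in test_ids:
        digest = hashlib.sha256(test_id.encode("utf-8")).hexdigest()
        if int(digest, 16) % total_shards == shard_index:
            selected.append(test_id)
    return selected


def balance_by_duration(durations, total_workers):
    """Greedy load balancing: give the slowest remaining test to the
    least-loaded worker. `durations` maps test ID to seconds, assumed
    to come from a previous run's timing report."""
    workers = [{"tests": [], "seconds": 0.0} for _ in range(total_workers)]
    slowest_first = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
    for test_id, seconds in slowest_first:
        target = min(workers, key=lambda w: w["seconds"])
        target["tests"].append(test_id)
        target["seconds"] += seconds
    return workers


# Hypothetical timings: 800 seconds of tests in total.
durations = {"a": 300, "b": 200, "c": 100, "d": 100, "e": 50, "f": 50}
plan = balance_by_duration(durations, total_workers=2)
wall_clock = max(worker["seconds"] for worker in plan)
```

The wall-clock time of a parallel run is the time of the slowest worker, not the average, which is why balancing by duration tends to matter more than balancing by test count.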
Failing Fast
Not all tests are equally important. Organize tests by speed and order them to fail fast:
- Linting and type checking: Run first (typically seconds). Catch obvious errors.
- Unit tests: Fast (seconds total). Most tests are here.
- Integration tests: Moderate (tens of seconds).
- E2E tests: Slow (minutes). Run last or skip for speed.
- Performance tests: Very slow (minutes/hours). Run on schedule, not per-commit.
If a lint check fails, there is no point running the tests. If unit tests fail, integration tests will likely fail too. Order the stages so that developers get feedback fast.
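The ordering above can be sketched as a small runner that executes stages from cheapest to most expensive and stops at the first failure. This is only an illustration: `run_stages` is a hypothetical helper, and the commands in `example_stages` are placeholders that stand in for a project's real lint and test tools.

```python
import subprocess
import sys


def run_stages(stages):
    """Run (name, command) stages in order and stop at the first failure.

    Returns (names_of_passed_stages, name_of_failed_stage_or_None).
    """
    passed = []
    for name, command in stages:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Fail fast: later, slower stages never start.
            return passed, name
        passed.append(name)
    return passed, None


# Placeholder commands; substitute your own linter and test runner.
example_stages = [
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
    ("unit", [sys.executable, "-c", "print('unit tests ok')"]),
    ("integration", [sys.executable, "-c", "print('integration ok')"]),
]
```

A CI script would call `run_stages(example_stages)` and exit non-zero when the second return value is not `None`.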
Test Caching
CI is slow when it rebuilds and retests unchanged code. Caching speeds things up:
- Dependency caching: Cache node_modules or equivalent, keyed on the lockfile (for example package-lock.json). Skip npm install when the lockfile hasn't changed.
- Build artifacts: Cache the build output. Skip rebuild if source hasn't changed.
- Test caching: Skip tests for files that haven't changed. If only the README changed, skip tests.
- Docker layer caching: Docker caches layers. If base layer hasn't changed, it's reused.
Caching must be smart: a stale or wrongly keyed cache makes CI results diverge from local ones, so tests pass in one place and fail in the other. Most modern CI systems handle caching well; configure it, but key each cache on everything that affects its contents.
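One common way to keep a dependency cache honest is to derive its key from the contents of the files that determine what gets installed. The sketch below assumes a hypothetical `cache_key` helper; hosted CI platforms generally provide their own built-in equivalent, so treat this as an illustration of the idea rather than something to deploy.

```python
import hashlib
from pathlib import Path


def cache_key(prefix, lockfile_paths):
    """Build a cache key that changes whenever any lockfile's content changes."""
    hasher = hashlib.sha256()
    for path in sorted(str(p) for p in lockfile_paths):
        hasher.update(path.encode("utf-8"))      # which file
        hasher.update(Path(path).read_bytes())   # its exact contents
    return f"{prefix}-{hasher.hexdigest()[:16]}"
```

For example, `cache_key("deps-linux", ["package-lock.json"])` yields the same key on every run until the lockfile changes, at which point the old cache is simply never looked up again.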
Branch Protection Rules
CI is only useful if you enforce its results. Branch protection rules on GitHub (or equivalent) prevent merging without passing CI:
- Require status checks to pass: CI pipeline must pass before merging.
- Require code review: At least one approval before merge (in addition to tests passing).
- Dismiss stale PR approvals: If new commits are pushed after a review, existing approvals are dismissed. New approvals are required for the updated code.
- Require branches to be up to date: The PR branch must include the latest main (by merge or rebase) before merging. This ensures the tests ran against the code that will actually land.
With branch protection rules in place, failing code cannot be merged through the normal workflow (unless someone with bypass rights overrides the rules). Developers must fix it first. This discipline keeps main clean.
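To make the interaction of these rules concrete, here is a small model of the merge decision in Python. It is not any platform's API: the `PullRequest` class, its fields, and `blocking_reasons` are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    head_sha: str                 # latest commit on the PR branch
    base_is_up_to_date: bool      # does the branch include the latest main?
    check_results: dict = field(default_factory=dict)  # check name -> "success" / "failure" / "pending"
    approvals: dict = field(default_factory=dict)      # reviewer -> commit SHA they approved


def blocking_reasons(pr, required_checks, required_approvals=1):
    """Return the reasons a merge is blocked; an empty list means mergeable."""
    reasons = []
    for check in required_checks:
        if pr.check_results.get(check) != "success":
            reasons.append(f"required check not passing: {check}")
    # Approvals given on an older commit count as stale.
    fresh = [who for who, sha in pr.approvals.items() if sha == pr.head_sha]
    if len(fresh) < required_approvals:
        reasons.append("not enough approvals on the latest commit")
    if not pr.base_is_up_to_date:
        reasons.append("branch is behind main")
    return reasons
```

Returning a list of reasons rather than a single boolean mirrors what developers need in practice: not just that the merge is blocked, but what to fix.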
Flaky Test Detection
A flaky test passes sometimes, fails other times, without code changes. Flaky tests are poison: developers stop trusting the test suite. Detect and quarantine them:
- Monitor test failures: If a test fails, then immediately passes on retry, it's probably flaky.
- Quarantine flaky tests: Mark them as quarantined. They still run but don't block merges.
- Investigate: Why is the test flaky? Timing issue? Race condition? Inconsistent test data?
- Fix and re-enable: Once fixed, re-enable the test.
Flakiness usually comes from E2E tests (timing, network), but can happen in unit tests (randomness, mock issues). A flaky test is worse than no test.
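The monitoring step can be approximated with a simple rule: a test that both passed and failed on the same commit changed outcome without a code change. The sketch below assumes run history is available as `(test_id, commit_sha, passed)` tuples; that record shape and the `find_flaky_tests` name are assumptions made for illustration.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return test IDs that both passed and failed on the same commit.

    `runs` is an iterable of (test_id, commit_sha, passed) tuples.
    """
    outcomes = defaultdict(set)
    for test_id, commit_sha, passed in runs:
        outcomes[(test_id, commit_sha)].add(bool(passed))
    flaky = {
        test_id
        for (test_id, _sha), seen in outcomes.items()
        if seen == {True, False}
    }
    return sorted(flaky)


# Hypothetical history: test_checkout failed, then passed on retry.
history = [
    ("test_login", "a1b2", True),
    ("test_checkout", "a1b2", False),
    ("test_checkout", "a1b2", True),
    ("test_search", "a1b2", False),
    ("test_search", "c3d4", True),
]
flaky = find_flaky_tests(history)
```

Here `test_search` is not flagged: it failed on one commit and passed on another, which is what a real fix looks like.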
Test Reporting
CI should provide clear test reports:
- Summary: X tests passed, Y failed, Z skipped.
- Failed test details: Which tests failed and why? Show the assertion error.
- Timing: How long did tests take? Are they getting slower?
- Coverage: Code coverage percentage. Trend over time.
- Artifacts: Logs, screenshots, videos from failed E2E tests.
- Annotations: GitHub/GitLab show test results in the PR interface directly.
Good reports make debugging easier. A developer can see which test failed and why without diving into CI logs.
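Many test runners can emit results in the JUnit XML format, which most CI systems understand. As a sketch of how the summary and failure details above might be extracted, the following uses Python's standard library; the `summarize_junit` name and the sample report are illustrative, and real reports vary somewhat between tools.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Summarize a JUnit-style XML report into counts, timing, and failures."""
    root = ET.fromstring(xml_text)
    summary = {"passed": 0, "failed": 0, "skipped": 0, "seconds": 0.0, "failures": []}
    for case in root.iter("testcase"):
        summary["seconds"] += float(case.get("time", 0))
        problem = case.find("failure")
        if problem is None:
            problem = case.find("error")
        if problem is not None:
            summary["failed"] += 1
            summary["failures"].append((case.get("name"), problem.get("message", "")))
        elif case.find("skipped") is not None:
            summary["skipped"] += 1
        else:
            summary["passed"] += 1
    return summary


sample_report = """
<testsuite name="unit" tests="3">
  <testcase classname="math" name="test_add" time="0.01"/>
  <testcase classname="math" name="test_div" time="0.02">
    <failure message="ZeroDivisionError">traceback text</failure>
  </testcase>
  <testcase classname="math" name="test_slow" time="0"><skipped/></testcase>
</testsuite>
""".strip()

report = summarize_junit(sample_report)
```

Storing the `seconds` value from each run is enough to chart whether the suite is getting slower over time.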
Different Test Suites on Different Triggers
Not every test needs to run on every trigger. Smart pipelines run different tests for different situations:
- On every commit: Linting, type checking, unit tests (fast). Target: under 5 minutes.
- On pull request: All of the above plus integration tests. Target: under 15 minutes.
- Before merging to main: All tests plus E2E tests. Target: under 30 minutes.
- On schedule (nightly): Full test suite plus performance tests and security scans. May take an hour or more.
- Before production deployment: Smoke tests against staging. Target: under 5 minutes.
This approach balances speed (developers get feedback fast) with thoroughness (critical tests run before production).
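The trigger-to-suite mapping above can be written down as plain data, which keeps the policy in one reviewable place. The trigger and suite names below are hypothetical labels chosen for this sketch, not identifiers from any CI platform.

```python
# Suites that run on every commit.
FAST = ["lint", "typecheck", "unit"]

SUITES_BY_TRIGGER = {
    "commit": FAST,
    "pull_request": FAST + ["integration"],
    "merge_to_main": FAST + ["integration", "e2e"],
    "nightly": FAST + ["integration", "e2e", "performance", "security_scan"],
    "pre_deploy": ["smoke"],
}


def suites_for(trigger):
    """Return the list of suites to run for a given trigger."""
    try:
        return list(SUITES_BY_TRIGGER[trigger])
    except KeyError:
        raise ValueError(f"unknown trigger: {trigger!r}") from None
```

Raising on an unknown trigger, rather than silently running nothing, avoids a pipeline that reports success because it ran zero tests.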
Secrets in CI
CI jobs often need secrets (database passwords, API keys). Never hardcode secrets in CI configuration:
- Use secret management: GitHub Actions secrets, GitLab CI/CD variables, and CircleCI contexts store secrets encrypted.
- Reference secrets in configuration: "$DATABASE_PASSWORD" is replaced with the actual password at runtime.
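- Don't log secrets: Make sure secrets aren't printed in logs. CI systems mask known secret values, but masking can miss transformed values (for example, base64-encoded ones), so avoid printing them at all.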
- Rotate secrets: If a CI secret is exposed, rotate it immediately.
Secrets in CI are powerful for testing against real services, but require care to keep them safe.
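Inside a test script, secrets injected by the CI system typically arrive as environment variables. A minimal sketch of reading one safely and scrubbing it from log output follows; `require_secret` and `redact` are hypothetical helpers, and `DATABASE_PASSWORD` is the example name used above.

```python
import os


def require_secret(name, environ=os.environ):
    """Fetch a secret from the environment, failing loudly if it is missing.

    The error message names the variable but never includes a value.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value


def redact(text, secrets):
    """Replace any known secret values in a log line with a placeholder."""
    for secret in secrets:
        if secret:  # skip empty strings, which would match everywhere
            text = text.replace(secret, "***")
    return text
```

A test harness might call `require_secret("DATABASE_PASSWORD")` once at startup and pass every log line through `redact` as a second line of defense behind the CI system's own masking.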
Docker-Based Test Environments
CI environments should be consistent. Using Docker ensures tests run the same everywhere:
- Container as test environment: Everything the app needs (base OS, runtime, dependencies) is in the container.
- Services in containers: Database, cache, message queue all run in containers. Spin up fresh for each test.
- Consistency: Developer's laptop, CI system, and production all use the same container image.
This prevents the "works on my machine, fails in CI" problem. Docker is powerful for consistent test environments.
Keeping CI Fast as Test Suite Grows
As your codebase grows, so does your test suite. CI can become slow (15+ minutes per commit). Strategies to stay fast:
- Parallelize: Use multiple workers. Split tests across machines.
- Cache aggressively: Cache dependencies, build artifacts, test databases.
- Remove slow tests: If a test is slow and doesn't add value, remove it.
- Optimize slow tests: Profile tests. Why is this one slow? Can you make it faster?
- Run different tests on different schedules: Quick tests on every commit, slow tests nightly.
- Filter tests by impact: Only run tests affected by the change. (Tools like Nx do this.)
- Hardware: Use better hardware in CI. Faster machines = faster tests.
A slow CI pipeline hurts productivity. Invest in speed. Developers should get feedback within 10 minutes.
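Filtering tests by impact, one of the strategies listed above, amounts to walking a dependency graph backwards from the changed files. The sketch below is a simplified illustration with a hand-written graph and hypothetical file paths; real tools derive the graph from the build system or import analysis.

```python
from collections import deque


def affected_tests(changed_files, depends_on, is_test=lambda path: path.startswith("tests/")):
    """Return test files that depend, directly or transitively, on a changed file.

    `depends_on` maps each file to the set of files it imports or uses.
    """
    # Invert the graph: for each file, which files depend on it?
    dependents = {}
    for path, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(path)

    seen = set(changed_files)
    queue = deque(changed_files)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(path for path in seen if is_test(path))


# Hypothetical project layout.
graph = {
    "tests/test_cart.py": {"src/cart.py"},
    "tests/test_pricing.py": {"src/pricing.py"},
    "tests/test_auth.py": {"src/auth.py"},
    "src/cart.py": {"src/pricing.py"},
}
to_run = affected_tests(["src/pricing.py"], graph)
```

In this example a change to `src/pricing.py` selects the pricing tests and, through `src/cart.py`, the cart tests, while the auth tests are skipped. A change that touches only `README.md` selects nothing. The approach is only as safe as the graph is complete, so it is usually paired with a full run on a schedule.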
Key Takeaways
CI (Continuous Integration) runs tests automatically on every commit. This catches bugs immediately and prevents broken code from reaching main. Set up a CI pipeline with stages: linting, type checking, unit tests, integration tests, build, security scanning. Parallelize tests for speed. Use branch protection rules to require passing CI before merging. Detect and quarantine flaky tests. Run different test suites on different triggers (quick tests per-commit, slow tests nightly). Monitor CI performance and optimize as the test suite grows. CI is the safety net that keeps your codebase healthy.