Safety Layer for AI Coding Assistants

SpecFlow Workflow Examples

Real-world scenarios showing how SpecFlow catches breaking changes before they ship

See how SpecFlow works alongside Claude Code to prevent production incidents when you modify legacy systems, building safety incrementally as you ship features.

Table of Contents

  1. Legacy Code: First Touch
  2. The 80/20 Rule in Action
  3. Catching Security Vulnerabilities
  4. Living Documentation
  5. Team Inheritance Scenario
  6. New Feature Development
  7. Handling Drift Conflicts

1. Legacy Code: First Touch

Scenario

You inherit a 50,000-line payment processing system. Zero tests. No documentation. The original developers are gone. Your boss says: "Add refund support by Friday." You have Claude Code to help, but how do you ensure it doesn't break existing payment flows?

❌ Claude Code alone:

The AI modifies code blindly, with no context on existing constraints. It breaks fraud detection, and the change ships to production. 😱

✅ Claude Code + SpecFlow:

SpecFlow documents the constraints, the AI preserves them, and tests verify safety. You ship confidently. ✅

Workflow (Auto Mode)

You: "Add refund support to src/legacy/payment-processor.js"
Claude Code: [File has no spec]
[SpecFlow cold-start kicks in]
Analyzing existing code...
Found payment processor with:
- Stripe integration (API v1)
- USD only
- Max transaction: $10,000
- Card validation before charge
- No refund support currently
Generated baseline spec: payment_processor_spec.json v1
Now adding refund functionality...
Added full refund support
Added partial refund logic
Added refund validation
Updated spec to v2
Generated 15 tests covering:
- Full refund scenarios (5 tests)
- Partial refund scenarios (4 tests)
- Validation edge cases (3 tests)
- Error handling (3 tests)
Saved to: tests/generated/payment_processor.spec.js
You: "Ship it!"
[You shipped on time]
[payment-processor.js now has specs + 15 tests]
[Can safely modify it in the future]
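
The refund rules from this workflow can be pictured as plain data plus a validation function. The following is a minimal sketch, not SpecFlow's actual output: the `paymentSpec` object shape and the `validateRefund` helper are hypothetical names chosen for illustration, and amounts are assumed to be integer cents.

```javascript
// Hypothetical, simplified view of the constraints from the transcript.
// The object shape is an assumption, not SpecFlow's real spec format.
const paymentSpec = {
  version: 2,
  currency: "USD", // "USD only" from the baseline spec
};

// Validate a refund request against an original charge.
// Returns { ok: true, kind } or { ok: false, reason }.
function validateRefund(charge, refundCents, alreadyRefundedCents = 0) {
  if (charge.currency !== paymentSpec.currency) {
    return { ok: false, reason: "unsupported currency" };
  }
  if (!Number.isInteger(refundCents) || refundCents <= 0) {
    return { ok: false, reason: "refund must be a positive integer (cents)" };
  }
  const remainingCents = charge.amountCents - alreadyRefundedCents;
  if (refundCents > remainingCents) {
    return { ok: false, reason: "refund exceeds remaining charge amount" };
  }
  const kind = refundCents === charge.amountCents ? "full" : "partial";
  return { ok: true, kind };
}

const charge = { amountCents: 5000, currency: "USD" };
console.log(validateRefund(charge, 5000)); // full refund
console.log(validateRefund(charge, 1500)); // partial refund
console.log(validateRefund(charge, 6000)); // rejected: exceeds the charge
```

Keeping the rule in one pure function makes it easy for generated tests to cover the full, partial, and rejected cases separately.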

Key Outcomes

  • Shipped on time - No stopping work to add tests
  • Built quality - Specs + tests generated automatically
  • Documented behavior - Spec shows "Stripe v1, USD only, $10k max"
  • Safe for next time - Future changes check against spec
  • 1 of 500 files - Covered the one file you touched (0.2% of the codebase, 100% of what you changed)

This is the SpecFlow approach: Build quality where you work.

2. The 80/20 Rule in Action

Scenario

Legacy 50,000-line codebase (500 files). Team of 3 developers. 6 months of feature work.

Reality: roughly 20% of the code receives 80% of the changes.

SpecFlow Strategy: Build specs + tests for the 20% you actually touch.

Month-by-Month Progression

Month 1: Payment Features

Team touches:

  • payment-processor.js (add refunds)
  • stripe-gateway.js (update API version)

SpecFlow auto-generates:

  • ✓ 2 specs documenting existing + new behavior
  • ✓ 27 tests covering edge cases

Coverage: 2/500 files (0.4%)

But these are the files you modify monthly!

Month 2: Authentication Updates

Team touches:

  • auth-middleware.js (add 2FA)
  • jwt-validator.js (add refresh tokens)
  • session-manager.js (extend timeout)

SpecFlow auto-generates:

  • ✓ 3 specs
  • ✓ 34 tests

Coverage: 5/500 files (1%)

These files change every sprint.

Month 3: User Management

Team touches:

  • user-service.js (add soft delete)
  • permissions.js (add role inheritance)

SpecFlow auto-generates:

  • ✓ 2 specs
  • ✓ 18 tests

Coverage: 7/500 files (1.4%)

Month 6: The Result

Total files touched: 15/500 (3%)

Files with full specs + tests: 15 (100% of touched files)

Total tests: 143 auto-generated

Files never touched: 485 (97% - still no tests, and that's FINE)

The critical 3% now has:

  • ✅ Full specification documentation
  • ✅ Comprehensive test coverage
  • ✅ Version history explaining changes
  • ✅ Drift protection enabled
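
The coverage figures in the progression are simple ratios of touched files to total files. As a quick check of the arithmetic, here is a small sketch; the `coveragePercent` helper is a hypothetical name, and the file counts are the cumulative totals quoted above.

```javascript
// Coverage = files with specs and tests / total files, as a percentage.
function coveragePercent(filesCovered, totalFiles) {
  return (filesCovered * 100) / totalFiles;
}

const TOTAL_FILES = 500; // codebase size used in the scenario

// Cumulative files touched at each checkpoint (from the progression above).
const checkpoints = [
  { label: "Month 1", files: 2 },
  { label: "Month 2", files: 5 },
  { label: "Month 3", files: 7 },
  { label: "Month 6", files: 15 },
];

for (const { label, files } of checkpoints) {
  const percent = coveragePercent(files, TOTAL_FILES);
  console.log(`${label}: ${files}/${TOTAL_FILES} files (${percent}%)`);
}
```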

The Reality Check

❌ Old approach: "We need to add tests to 50,000 lines!"

  • Time: 6 months
  • Cost: 3 developers full-time
  • Result: Never happens

✅ SpecFlow approach: "We'll add tests to code we touch"

  • Time: 0 extra (built while shipping)
  • Cost: Free
  • Result: Critical 3% has enterprise-grade quality

The 97% with no tests? Hasn't been touched in years. Doesn't need tests.

3. Catching Security Vulnerabilities

Scenario

Claude Code is refactoring your authentication middleware. It accidentally removes the rate limiting check. SpecFlow catches it before you even commit.

❌ Without SpecFlow:

Security regression ships to production. Hackers find it in hours. You spend the weekend firefighting.

✅ With SpecFlow:

SpecFlow detects the spec violation immediately. Claude fixes it. You never even notice.

Workflow Example

You: "Refactor auth middleware to use async/await"
Claude Code: Reading src/middleware/auth.js...
Converting callbacks to async/await...
Converted token validation
Converted session lookup
Simplified error handling
Running SpecFlow validation...
⚠️ SPEC VIOLATION DETECTED
File: src/middleware/auth.js
Spec: auth_middleware_spec.json v3
Violation:
REMOVED: Rate limiting (max 100 req/min per IP)
This is a CRITICAL security requirement
Original spec (v3) required:
✓ JWT validation
✓ Session check
✗ Rate limiting (MISSING)
✓ CORS headers
Claude Code: Oops! Let me fix that...
Re-adding rate limiting...
Rate limiter restored (async version)
Re-validating against spec...
✓ All spec requirements met
Updated spec to v4:
Changed: Async/await pattern (was callbacks)
Preserved: Rate limiting (100 req/min)
Preserved: JWT + session validation
You: *sleep soundly*
[Security vulnerability prevented]
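
Conceptually, the violation check in this transcript is a comparison between what the spec requires and what the refactored code still does. The sketch below illustrates that idea only; it is not SpecFlow's implementation. The requirement identifiers and the `findViolations` helper are hypothetical, and how behaviors are detected in real code is deliberately left out.

```javascript
// Requirements the spec marks as mandatory (taken from the transcript).
// The identifier strings are illustrative assumptions.
const requiredBySpec = [
  "jwt_validation",
  "session_check",
  "rate_limiting",
  "cors_headers",
];

// Compare required behaviors against those detected in the refactored code.
// Returns the requirements that are no longer present.
function findViolations(required, detected) {
  const present = new Set(detected);
  return required.filter((requirement) => !present.has(requirement));
}

// After the async/await refactor, rate limiting was dropped by accident.
const detectedAfterRefactor = ["jwt_validation", "session_check", "cors_headers"];

const missing = findViolations(requiredBySpec, detectedAfterRefactor);
if (missing.length > 0) {
  console.log(`SPEC VIOLATION: missing ${missing.join(", ")}`);
}
```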

Key Outcomes

  • Security guardrail - Spec caught critical requirement removal
  • Self-healing AI - Claude fixed its own mistake automatically
  • Zero human intervention - You never saw the security bug
  • Documented constraints - Rate limiting now explicit in spec
  • Safe refactoring - Implementation changed, behavior preserved
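
For a concrete picture of the requirement being preserved (max 100 requests per minute per IP), here is a minimal fixed-window rate limiter. It is a sketch under stated assumptions: `createRateLimiter` is a hypothetical helper, the fixed-window strategy is one of several possible algorithms, and the Express-style middleware signature is assumed rather than taken from the source.

```javascript
// Fixed-window rate limiter: at most `limit` requests per `windowMs` per key.
// `now` is injectable so the limiter can be tested without real time passing.
function createRateLimiter({ limit = 100, windowMs = 60000, now = Date.now } = {}) {
  const windows = new Map(); // key (e.g. client IP) -> { start, count }

  return function isAllowed(key) {
    const currentTime = now();
    const current = windows.get(key);
    if (!current || currentTime - current.start >= windowMs) {
      // First request from this key, or the previous window has expired.
      windows.set(key, { start: currentTime, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false; // over the limit for this window
  };
}

// Usage in an async middleware (Express-style signature assumed).
const isAllowed = createRateLimiter();

async function authMiddleware(req, res, next) {
  if (!isAllowed(req.ip)) {
    res.statusCode = 429;
    return res.end("Too Many Requests");
  }
  // ...JWT validation and session lookup would follow here...
  return next();
}
```

The in-memory `Map` keeps the example self-contained; a multi-process deployment would need shared storage instead.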

SpecFlow is your safety net when AI refactors critical code.

4. Living Documentation

Scenario

New developer joins your team. Asks: "How does the payment processor work?" You point them to the spec file. It's always accurate because it's enforced by SpecFlow.

❌ Traditional docs:

README says "Supports Stripe v1". Code uses Stripe v3. Doc is 2 years stale. New dev gets confused.

✅ SpecFlow docs:

Spec file is generated from code and enforced on every change. Always accurate. Always up to date.

Workflow Example

New Dev: "Where's the documentation for payment processing?"
You: "Check .specflow/specs/payment_processor_spec.json"
New Dev: *opens spec file*
# Payment Processor Specification v2
# Auto-generated by SpecFlow on 2024-03-15
# Last updated: 2024-03-20 (refund support added)
integration:
  provider: Stripe
  api_version: v1
  mode: live (production), test (development)
supported_operations:
  - charge (single payment)
  - refund_full (100% refund)
  - refund_partial (custom amount)
constraints:
  currency: USD only
  max_transaction: $10,000
  min_transaction: $0.50
  refund_window: 180 days
validation_rules:
  - Card validation before charge
  - Amount must be positive integer (cents)
  - Refund cannot exceed original charge
error_handling:
  - Invalid card → PaymentError with card decline reason
  - Network failure → Retry 3 times with exponential backoff
  - Duplicate charge → Return existing transaction ID
New Dev: "Nice! This is super clear. When was this written?"
You: "It wasn't written. SpecFlow generated it from the code."
New Dev: "How do I know it's accurate?"
You: "SpecFlow enforces it. If code drifts from spec, build fails."
New Dev: "So this is like executable documentation?"
You: "Exactly. It can't go stale. Try changing the code without updating the spec."
New Dev: *makes test change*
SpecFlow: ⚠️ Spec violation detected. Run 'specflow update' to sync.
New Dev: "Okay, I'm sold."
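
The "code drifts from spec, build fails" idea can be illustrated as a field-by-field comparison between documented and observed values. This is only a sketch: `detectDrift` is a hypothetical helper, the flat key names are assumptions, and extracting observed values from real code is not shown.

```javascript
// Spec values as documented (a flattened subset of the spec shown above).
const spec = {
  provider: "Stripe",
  api_version: "v1",
  currency: "USD",
  refund_window_days: 180,
};

// Return one entry per field whose observed value differs from the spec.
function detectDrift(specValues, observedValues) {
  const drift = [];
  for (const [field, expected] of Object.entries(specValues)) {
    const actual = observedValues[field];
    if (actual !== expected) {
      drift.push({ field, expected, actual });
    }
  }
  return drift;
}

// Example: the code was upgraded to Stripe v3 but the spec was not updated.
const observed = { ...spec, api_version: "v3" };

const drift = detectDrift(spec, observed);
if (drift.length > 0) {
  console.log("Spec violation detected:", drift);
}
```

A build step could exit with a non-zero status whenever the returned list is non-empty, which is what makes the document enforceable rather than advisory.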

Why This Matters

  • Always accurate - Spec enforced on every code change
  • Zero maintenance - Auto-generated, auto-updated
  • Onboarding is instant - New devs read specs, understand system
  • Human-readable - YAML format, not cryptic code comments
  • AI-friendly - Claude reads specs before making changes

Documentation that can't lie to you.

5. Team Inheritance Scenario

Scenario

Sarah wrote the email service 3 years ago. She left the company. Now you need to add SMS support. SpecFlow has the complete behavioral contract she left behind.

❌ Without specs:

Spend 2 days reading code, guessing what's safe to change. Break retry logic accidentally. Users complain.

✅ With SpecFlow:

Read spec in 5 minutes. Understand all requirements. Add SMS without breaking email. Ship confidently.

Workflow Example

You: "Add SMS support to notification service"
Claude Code: Reading src/services/notifications.js...
Found existing spec: notification_service_spec.json v5
Current Behavior (from spec v5):
Email Provider: SendGrid
Retry Logic: 3 attempts, exponential backoff (1s, 2s, 4s)
Failure Handling: Log to Sentry, don't throw
Rate Limiting: 100 emails/min
Template Engine: Handlebars
Required Fields: to, subject, body
Note from spec:
"Retry logic is CRITICAL. Don't remove it."
"Template engine switched from EJS to Handlebars in v3"
"Rate limiting added in v4 after SendGrid ban incident"
Claude Code: Now adding SMS support...
Added Twilio SMS provider
Applied same retry logic (3 attempts)
Added SMS rate limiting (50/min per Twilio limits)
Preserved email behavior (no changes)
Updated spec to v6:
ADDED:
SMS Provider: Twilio
SMS Rate Limit: 50/min
SMS Required Fields: to (phone), body
PRESERVED:
Email retry logic (unchanged)
Email rate limiting (unchanged)
Template engine (still Handlebars)
Generated 12 new SMS tests
All 24 existing email tests still pass
You: "Perfect! Did we break anything?"
Claude Code:
✓ All email tests passing (24/24)
✓ All SMS tests passing (12/12)
✓ Spec validation passed
No regressions detected
[Sarah's email code is untouched and working]
[SMS support added safely]
[Future developers will inherit both specs]
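
The retry behavior recorded in spec v5 can be sketched as follows. This is an illustration under stated assumptions, not the service's actual code: it reads "3 attempts, exponential backoff (1s, 2s, 4s)" as three retries after the initial call, waiting 1s, 2s, then 4s, and the `backoffDelays` and `sendWithRetry` names are hypothetical. `console.error` stands in for the Sentry logging mentioned in the spec.

```javascript
// Delays before each retry: base, 2x base, 4x base, ...
function backoffDelays(retries = 3, baseMs = 1000) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `send` once, then retry after each delay. On final failure, log and
// return a failure result instead of throwing ("Log to Sentry, don't throw").
async function sendWithRetry(send, options = {}) {
  const { delays = backoffDelays(), wait = sleep, log = console.error } = options;
  let lastError;
  for (let attempt = 0; attempt <= delays.length; attempt += 1) {
    try {
      return { ok: true, result: await send() };
    } catch (error) {
      lastError = error;
      if (attempt < delays.length) {
        await wait(delays[attempt]);
      }
    }
  }
  log("notification failed after retries:", lastError); // stand-in for Sentry
  return { ok: false, error: lastError };
}
```

Because the function takes any `send` callback, the same wrapper could serve both the email and SMS providers, which mirrors the "applied same retry logic" step in the transcript.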

Key Outcomes

  • Understood legacy behavior - Spec showed retry logic, rate limits, critical notes
  • No regression - Email code preserved, all tests still pass
  • Consistent patterns - SMS inherited same retry/rate-limiting approach
  • Knowledge transfer - Sarah's decisions documented in spec forever
  • Future-proof - Next dev will inherit v6 spec with both email + SMS
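
The v5 to v6 update in this scenario follows an "add without overwriting" pattern. A minimal sketch of that pattern is below; the `extendSpec` helper and the spec object shape are hypothetical and are not SpecFlow's real format.

```javascript
// Produce the next spec version by adding new behavior keys.
// Throws if an addition would overwrite an existing (preserved) entry.
function extendSpec(spec, additions) {
  for (const key of Object.keys(additions)) {
    if (Object.prototype.hasOwnProperty.call(spec.behavior, key)) {
      throw new Error(`"${key}" already exists in spec v${spec.version}; refusing to overwrite`);
    }
  }
  return {
    version: spec.version + 1,
    behavior: { ...spec.behavior, ...additions },
    changelog: {
      added: Object.keys(additions),
      preserved: Object.keys(spec.behavior),
    },
  };
}

const specV5 = {
  version: 5,
  behavior: {
    email_provider: "SendGrid",
    email_retry: "3 attempts, exponential backoff",
    email_rate_limit: "100/min",
    template_engine: "Handlebars",
  },
};

const specV6 = extendSpec(specV5, {
  sms_provider: "Twilio",
  sms_rate_limit: "50/min",
  sms_required_fields: ["to", "body"],
});

console.log(specV6.version, specV6.changelog);
```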

Specs are the institutional memory of your codebase.

6. New Feature Development

Scenario

You're building a brand new API endpoint. SpecFlow guides you through creating specs first, then implementing with AI, then generating tests. True spec-driven development.

❌ Traditional flow:

Write code, realize you forgot edge cases, add them, tests break, fix tests. Messy iteration.

✅ SpecFlow guided:

Define the behavior spec, Claude implements it, tests are auto-generated, and the edge cases you specified are covered up front.

Workflow Example (Guided Mode)

You: "Create new endpoint: POST /api/subscriptions"
SpecFlow: Let's define the behavior first!
Starting guided spec creation...
What should this endpoint do?
You: "Create monthly subscriptions with Stripe integration"
SpecFlow: Great! Let me ask some questions...
1. What data is required?
You: "email, plan_id, payment_method_id"
2. What plans are supported?
You: "basic ($10/mo), pro ($30/mo), enterprise ($100/mo)"
3. What should happen if payment fails?
You: "Return 402 with error details, don't create subscription"
4. What about duplicate subscriptions?
You: "Check if email already has active subscription. Return 409 if yes."
SpecFlow: Generated spec draft:
endpoint: POST /api/subscriptions
auth: Required (JWT)
input:
  email: string (required, valid email format)
  plan_id: enum [basic, pro, enterprise]
  payment_method_id: string (Stripe PM ID)
behavior:
  1. Validate email format
  2. Check for existing active subscription → 409 if exists
  3. Charge via Stripe ($10/$30/$100 based on plan)
  4. If payment fails → 402 with error, no subscription
  5. If payment succeeds → Create subscription, return 201
success_response: 201 + {id, email, plan, status, next_billing_date}
error_responses:
  400: Invalid input
  401: Missing/invalid JWT
  402: Payment failed
  409: Duplicate subscription
SpecFlow: Does this look correct?
You: "Yes, implement it!"
Claude Code: Implementing from spec...
Created route handler
Added input validation
Added duplicate check
Integrated Stripe payment
Added error handling
Generated 18 tests covering:
- Happy path (3 plans × success) = 3 tests
- Validation errors = 5 tests
- Payment failures = 4 tests
- Duplicate subscription = 2 tests
- Auth failures = 2 tests
- Edge cases = 2 tests
✓ All 18 tests passing
✓ Spec saved: subscriptions_api_spec.json v1
[Feature complete with full test coverage]
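
To show how the spec draft maps onto control flow, here is a minimal handler sketch. It is not the code Claude Code generated in the transcript. The `createSubscription` function and the injected `deps` object are hypothetical, the dependencies are synchronous to keep the example short (real Stripe and database calls would be asynchronous), and `request.user` is assumed to be set by an earlier JWT check.

```javascript
const PLAN_PRICES_USD = { basic: 10, pro: 30, enterprise: 100 };
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/; // deliberately simple check

// `deps` supplies the side effects so the flow can run without Stripe:
//   hasActiveSubscription(email) -> boolean
//   charge(paymentMethodId, amountUsd) -> { ok, error? }
//   save(subscription) -> stored record (assumed to include id, next_billing_date)
function createSubscription(request, deps) {
  if (!request.user) {
    return { status: 401, body: { error: "Missing/invalid JWT" } };
  }

  const { email, plan_id, payment_method_id } = request.body || {};
  const validInput =
    typeof email === "string" &&
    EMAIL_PATTERN.test(email) &&
    Object.prototype.hasOwnProperty.call(PLAN_PRICES_USD, plan_id) &&
    typeof payment_method_id === "string" &&
    payment_method_id.length > 0;
  if (!validInput) {
    return { status: 400, body: { error: "Invalid input" } };
  }

  if (deps.hasActiveSubscription(email)) {
    return { status: 409, body: { error: "Duplicate subscription" } };
  }

  const payment = deps.charge(payment_method_id, PLAN_PRICES_USD[plan_id]);
  if (!payment.ok) {
    // No subscription is created when the payment fails.
    return { status: 402, body: { error: "Payment failed", details: payment.error } };
  }

  const subscription = deps.save({ email, plan: plan_id, status: "active" });
  return { status: 201, body: subscription };
}
```

Each early return corresponds to one entry under error_responses, so a test per branch lines up naturally with the test categories listed in the transcript.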

Key Outcomes

  • Spec-driven development - Defined behavior before writing code
  • All edge cases upfront - Duplicate check, payment failures covered in spec
  • Complete test coverage - 18 tests generated from spec automatically
  • Clear API contract - Spec documents all inputs, outputs, error codes
  • Future-proof - Next dev reads spec, understands complete behavior

Think about behavior first. Let AI handle the implementation.

7. Handling Drift Conflicts

Scenario

Your teammate merged a hotfix directly to production, bypassing CI/CD. Now the code and spec are out of sync. SpecFlow detects it and helps you resolve the conflict intelligently.

❌ Without SpecFlow:

Silent drift. Production code differs from repo. Tests pass locally, fail in prod. Mystery bugs.

✅ With SpecFlow:

Drift detected immediately. SpecFlow shows exact differences. You choose: update spec or revert code.

Workflow Example

You: "Deploy latest changes"
SpecFlow: Running pre-deploy validation...
⚠️ DRIFT DETECTED
File: src/api/orders.js
Spec: orders_api_spec.json v3
Production code has changes not in spec:
What changed:
+ Added field: "priority" (enum: normal, urgent)
+ Urgent orders skip fraud check for speed
- Removed: Fraud check for all orders
Last modified: 2024-03-20 11pm by alex@company.com
Git commit: HOTFIX: urgent order support (merged to prod)
⚠️ This is a SECURITY change (fraud check removed)
SpecFlow: What should I do?
1. Update spec to match new code (accept hotfix)
2. Revert code to match spec (reject hotfix)
3. Show me the full diff first
You: "Show me the diff"
SpecFlow:
Spec v3 (last known good):
  1. Validate order data
  2. Run fraud check (ALWAYS)
  3. Process payment
  4. Create order record
Production code (current):
  1. Validate order data
  2. Check if priority === 'urgent'
  3. Skip fraud check if urgent
  4. Run fraud check only for normal orders
  5. Process payment
  6. Create order record
You: "That's a security issue. Revert the code."
SpecFlow: Reverting to spec v3...
Restored fraud check for ALL orders
Removed priority field
Code now matches spec v3
Created revert commit: "Revert unsafe hotfix, restore fraud check"
You: "Thanks. Now let's add urgent support the RIGHT way."
You: "Add priority field, but keep fraud check for all orders."
Claude Code:
Added priority field (normal, urgent)
Urgent orders get priority queue processing
Fraud check STILL runs for all orders
Updated spec to v4:
ADDED: priority field (doesn't skip security)
PRESERVED: Fraud check (ALWAYS runs)
[Security preserved, feature added safely]
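
The "right way" version described at the end of this transcript can be sketched as an order flow where priority only selects a queue and never gates the fraud check. The `processOrder` function, the `deps` object, and the queue names are hypothetical and stand in for the real order service.

```javascript
// Priority only decides which queue an order lands in.
const QUEUES = { normal: "standard", urgent: "priority" };

// `deps` provides validate, fraudCheck, processPayment, and createOrder.
function processOrder(order, deps) {
  deps.validate(order); // 1. validate order data (expected to throw on bad input)

  const priority = order.priority ?? "normal";
  if (!Object.prototype.hasOwnProperty.call(QUEUES, priority)) {
    throw new Error(`Unknown priority: ${priority}`);
  }

  // 2. The fraud check runs for every order, urgent or not.
  if (!deps.fraudCheck(order)) {
    return { status: "rejected", reason: "fraud check failed" };
  }

  const payment = deps.processPayment(order); // 3. process payment

  // 4. Create the order record; priority affects queueing only.
  return deps.createOrder({
    ...order,
    priority,
    queue: QUEUES[priority],
    paymentId: payment.id,
  });
}
```

Because there is no branch around `fraudCheck`, a later change that skips it for urgent orders would have to alter step 2 directly, which is the kind of change a spec check is meant to flag.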

Key Outcomes

  • Drift detected immediately - Production hotfix caught before it spread
  • Security preserved - Unsafe fraud check removal caught and reverted
  • Clear diff analysis - Exactly what changed shown side-by-side
  • Safe re-implementation - Feature added properly with security intact
  • Spec as source of truth - When conflict arose, spec won

SpecFlow keeps spec drift from becoming a production disaster.

Summary: The SpecFlow Philosophy

What We've Learned

SpecFlow transforms how you work with legacy code and AI assistants. Instead of choosing between shipping fast and building quality, you get both.

For Legacy Code

  • Build quality incrementally as you ship
  • Focus on the 20% of code you actually touch
  • Never write tests for dormant files
  • Specs auto-generate on first touch

For AI Safety

  • Catch AI mistakes before commit
  • Detect security regressions automatically
  • Self-healing when spec violations occur
  • Specs guide Claude's code generation

For Documentation

  • Living docs that can't go stale
  • Zero-maintenance specification
  • Instant onboarding for new developers
  • Institutional memory preserved forever

For Teams

  • Inherit knowledge from departed developers
  • Prevent drift from hotfixes
  • Consistent patterns across codebase
  • Spec-driven development for new features

The Core Principle

Don't boil the ocean. Build quality where you work.

SpecFlow meets you where you are: shipping features on legacy code with AI assistance. It doesn't force you to stop and write tests. It builds them automatically as you work, making your codebase better with every change.

Next Steps

Ready to try SpecFlow?

  1. Install SpecFlow - Follow the Quick Start Guide to set up SpecFlow in your project
  2. Pick one file to start - Don't try to spec your entire codebase. Start with one file you're actively working on
  3. Enable Auto Mode - Let SpecFlow generate specs automatically as Claude Code makes changes
  4. Ship with confidence - Watch as specs and tests build up organically, file by file, as you ship features