Scenarios
Test your AI system's resilience with "what-if" scenario planning.
Overview
Scenarios are controlled "what-if" exercises that help you test how your AI systems would respond to provider changes, outages, or policy shifts. Unlike signals (which are real-time events), scenarios are proactive planning tools that let you model disruptions before they happen.
Why Scenarios Matter
- Proactive risk management: Identify vulnerabilities before they become incidents
- Business continuity planning: Test fallback strategies and recovery plans
- Stakeholder communication: Demonstrate resilience planning to leadership and auditors
- Compliance readiness: Meet requirements for continuity planning (ISO 22301, NIST CSF)
Scenarios vs. Signals
| Feature | Signals | Scenarios |
|---|---|---|
| Source | Real-time provider events | User-defined "what-if" tests |
| Purpose | React to actual changes | Proactively test resilience |
| Status | Always current | Draft → Ready → Executed → Archived |
| Impacts | Detected automatically | Generated during execution |
| Timing | Happens to you | You control when to test |
Use both together: Create scenarios from signals (turn real events into reusable test cases) or use templates to prepare for risks you haven't experienced yet.
Scenario Types
SignalBreak supports 6 scenario types, each modeling a common disruption drawn from real-world AI governance risks:
1. Provider Unavailable 🔴
What it tests: Your AI provider experiences an outage (API down, service degraded, network issues).
Business impact:
- Workflows that depend on this provider cannot execute
- Customer-facing features may fail or degrade
- Manual fallback processes activated
Example use case: "What happens if Claude API is down for 4 hours during our busiest support period?"
Key questions this answers:
- Which workflows have no fallback providers?
- What's the business impact (revenue, users, downtime)?
- Do we have manual procedures documented?
2. Model Deprecated 📦
What it tests: A specific AI model you depend on is deprecated with X days' notice.
Business impact:
- Need to migrate workflows to newer models
- May require prompt/parameter tuning
- Quality or cost trade-offs with alternatives
Example use case: "GPT-4 is deprecated in 90 days. Which workflows are affected?"
Key questions this answers:
- How many workflows use this model?
- What's the migration timeline (testing, QA, rollout)?
- Are there suitable replacement models?
3. Price Increase 💰
What it tests: API pricing increases by X% (e.g., +50%, +100%).
Business impact:
- Unit economics change (cost per user, per transaction)
- ROI models need recalculation
- May need to reduce automation scope or switch providers
Example use case: "If Azure OpenAI prices increase 75%, what's the financial impact?"
Key questions this answers:
- Which high-volume workflows become cost-prohibitive?
- What's the monthly/annual cost delta?
- Are cheaper alternatives available?
4. Performance Degradation 📉
What it tests: Model quality, accuracy, or latency degrades (e.g., slower responses, lower accuracy).
Business impact:
- Customer experience affected
- Increased manual review/correction needed
- May breach SLAs or quality standards
Example use case: "What if our contract extraction accuracy drops from 95% to 80%?"
Key questions this answers:
- Which workflows have tight accuracy/latency requirements?
- What's the downstream business impact?
- Do we have quality monitoring in place?
5. Policy Change 📜
What it tests: Provider changes terms of service (data retention, training data usage, jurisdictions).
Business impact:
- May conflict with compliance requirements (GDPR, HIPAA, client NDAs)
- Legal review required
- Potential forced migration
Example use case: "Provider changes data retention from 30 to 90 days. Does this violate client agreements?"
Key questions this answers:
- Which workflows process sensitive data?
- Are we compliant with industry regulations?
- Do we need legal sign-off?
6. Rate Limit Exceeded ⏱️
What it tests: API rate limits imposed or reduced during high-traffic periods.
Business impact:
- Workflows queue or fail during peak times
- Reduced capacity when you need it most
- Customer experience degrades
Example use case: "What if fraud detection is rate-limited during Black Friday?"
Key questions this answers:
- Which workflows are critical during peak periods?
- Do we have queuing/retry logic?
- Can we request higher limits in advance?
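If the answer to "Do we have queuing/retry logic?" is no, a client-side wrapper is a common starting point. Below is a minimal sketch of exponential backoff with jitter; `call_model` and `RateLimitError` are stand-ins for your own provider client and its rate-limit exception, not SignalBreak features.

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for your provider SDK's rate-limit (HTTP 429) exception."""

def call_with_backoff(call_model, prompt, max_retries=5):
    """Retry a rate-limited call with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return call_model(prompt)
        except RateLimitError:
            # Wait 1s, 2s, 4s, ... plus jitter so clients don't retry in lockstep.
            time.sleep(2 ** attempt + random.uniform(0, 1))
    raise RuntimeError("Rate limit persisted after retries; escalate or fail over")
```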
Scenario Lifecycle
Scenarios move through 4 states as you plan, test, and act on results:
draft → ready → executed → archived

1. Draft (Gray Badge)
What it means: Scenario is being prepared but not yet ready to execute.
Typical actions:
- Fill in scenario name and description
- Select scenario type
- Choose target product (the AI model/provider to test)
- Refine business context
When to use: Initial creation, gathering requirements, stakeholder input needed.
Execute button: Disabled (need target product first).
2. Ready (Blue Badge)
What it means: Scenario is configured and ready to execute.
Requirements to reach Ready:
- Scenario name filled in
- Scenario type selected
- Target product specified (which AI model/provider to test against)
When to use: Scenario is complete and you're ready to run the impact analysis.
Execute button: Enabled (click to run scenario).
3. Executed (Green Badge)
What it means: Scenario has been run. Impact analysis and mitigations are available.
What happens during execution:
- Workflow detection: Identifies all workflows using the target product
- Impact calculation: Estimates severity, downtime, affected users, revenue impact
- Mitigation generation: Creates ranked mitigation options for each impact
- Results stored: Impacts and mitigations saved to database
When to use: After running the scenario, review impacts and plan next steps.
Execute button: Shows "Re-run Scenario" (to refresh impacts with latest data).
4. Archived (Slate Badge)
What it means: Scenario is no longer active (completed, superseded, or no longer relevant).
When to use:
- Scenario was tested and mitigations implemented
- Risk is no longer relevant (e.g., migrated away from that provider)
- Replaced by a newer scenario
Note: Archived scenarios are read-only but retain historical impact data for audit trails.
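For teams automating scenario hygiene, the lifecycle above can be treated as a small state machine. A minimal sketch (illustrative only, inferred from the descriptions above; not SignalBreak's actual implementation):

```python
# Allowed status transitions, per the lifecycle described above.
TRANSITIONS = {
    "draft":    {"ready", "executed", "archived"},  # executing a Draft jumps straight to Executed
    "ready":    {"executed", "archived"},
    "executed": {"executed", "archived"},           # re-running keeps it Executed
    "archived": {"ready", "executed"},              # restoring from archive
}

def can_transition(current: str, target: str) -> bool:
    return target in TRANSITIONS.get(current, set())

assert can_transition("ready", "executed")
assert not can_transition("archived", "draft")
```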
Creating Scenarios
You have 3 ways to create scenarios in SignalBreak:
Option 1: Quick Start (Templates)
Best for: First-time users, common scenarios, onboarding
How it works:
- Navigate to Dashboard → Scenarios
- Click "Quick Start"
- Select your archetype (industry/use case)
- Browse pre-built templates
- Click "Use This Template"
- Review pre-populated fields
- Select target product
- Click "Create Scenario"
Available archetypes:
- SaaS / Support: Customer chatbots, ticket triage, help desk automation
- Consulting / Agency: Proposal generation, research, document analysis
- Legal / Compliance: Contract review, e-discovery, legal research
- Finance / Operations: Invoice processing, fraud detection, reconciliation
- Public Sector: Citizen services, FOI processing, case management
- Healthcare: Clinical notes, triage, medical coding, HIPAA scenarios
- Retail / E-commerce: Product recommendations, review analysis, content generation
- Manufacturing: Quality inspection, predictive maintenance, safety classification
- Media / Content: Content moderation, transcription, UGC analysis
- Education: AI tutors, assignment feedback, student data privacy
Template example (SaaS Support → Provider Outage):
Scenario Name: Support Chatbot Provider Outage
Description: Your primary AI provider goes down for 24 hours. Customer support chatbot is offline.
Type: Provider Unavailable
Severity: Critical
Business Context: Immediate impact on Tier 1 support capacity and customer satisfaction

Option 2: Create from Scratch
Best for: Custom scenarios, specific risk modeling, advanced users
How it works:
- Navigate to Dashboard → Scenarios
- Click "+ New Scenario"
- Fill in form:
- Scenario Name: Clear, descriptive title (max 200 chars)
- Description: What you're testing and why (optional but recommended)
- Scenario Type: Select from 6 types (dropdown)
- Target Product: Select the AI model/provider to test (required for execution)
- Click "Create Scenario"
Example (custom scenario):
Scenario Name: Azure OpenAI GPT-4 EU West Outage During Tax Season
Description: Testing resilience during our busiest period (Jan-April) if our primary region fails. This scenario combines provider unavailability with peak load.
Type: Provider Unavailable
Target Product: Azure OpenAI GPT-4 (EU West)

Pro tip: Write descriptions that explain:
- What you're testing (specific provider, model, or condition)
- Why it matters (business context, timing, impact)
- When this might happen (seasonal peaks, known maintenance windows)
Option 3: Create from Signal
Best for: Turning real events into reusable tests, incident response planning
How it works:
- Navigate to Dashboard → Signals
- Find a signal you want to model (e.g., "OpenAI Deprecates GPT-3.5")
- Click "Create Scenario from Signal" button
- SignalBreak auto-populates:
- Scenario name from signal title
- Scenario type mapped from signal change type
- Description from signal body
- Target product auto-linked if signal has model impacts matching your enabled products
- Review and edit fields
- Click "Create Scenario"
Change type → Scenario type mapping:
- Signal: deprecation → Scenario: model_deprecated
- Signal: pricing → Scenario: price_increase
- Signal: incident → Scenario: provider_unavailable
- Signal: policy → Scenario: policy_change
- Signal: capability → Scenario: model_deprecated
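Expressed as a lookup table, the mapping is straightforward. A sketch (the string values mirror the mapping above; treating them as schema field values is an assumption for illustration):

```python
# Signal change type → scenario type, per the mapping above.
SIGNAL_TO_SCENARIO = {
    "deprecation": "model_deprecated",
    "pricing":     "price_increase",
    "incident":    "provider_unavailable",
    "policy":      "policy_change",
    "capability":  "model_deprecated",
}

def scenario_type_for(change_type: str) -> str:
    return SIGNAL_TO_SCENARIO[change_type]

print(scenario_type_for("pricing"))  # price_increase
```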
Example workflow:
Real signal: "Anthropic announces Claude 2 sunset in 6 months"
↓
Create scenario from signal
↓
Scenario: "Claude 2 Sunset: Document Analysis Impact"
Type: Model Deprecated
Target Product: Claude 2 (Anthropic)
↓
Execute scenario
↓
Results: 12 workflows affected, 45 days to migrate, cost impact £2,400/month

Why this is powerful: You turn reactive signal monitoring into proactive resilience testing. When a signal arrives, immediately model the impact and plan your response.
Executing Scenarios
Execution is when SignalBreak analyzes your scenario and generates:
- Impacts: Which workflows are affected and how severely
- Mitigations: Ranked options to reduce or eliminate the impact
Prerequisites for Execution
Before you can execute a scenario, you must:
- ✅ Scenario is in Draft or Ready status
- ✅ Target product is selected (the AI model/provider to test)
- ✅ You have at least 1 workflow configured in SignalBreak
Note: Scenarios without a target product cannot be executed (you'll see a disabled "Execute" button).
How to Execute
From Scenario List Page:
- Navigate to Dashboard → Scenarios
- Click on the scenario card to open detail view
- Click "Execute Scenario" button (top right)
- Wait for execution (typically 2-10 seconds)
- Results appear automatically
From Scenario Detail Page:
- Already on the scenario detail page
- Click "Execute Scenario" button
- Execution runs
- Impacts and Mitigations sections populate below
Re-running a scenario:
- If scenario status is Executed, button shows "Re-run Scenario"
- Re-running replaces previous impacts and mitigations (idempotent operation)
- Useful after adding new workflows or changing bindings
What Happens During Execution
SignalBreak performs 8 steps in a database transaction:
Step 1: Workflow Detection
Identifies all workflows that use the target product (the AI model being tested).
SQL query logic:
SELECT workflow_id, workflow_name
FROM signalbreak.workflows
WHERE product_id = <target_product_id>
AND tenant_id = <your_tenant_id>

Result: List of affected workflows (e.g., 12 workflows use Claude 2).
Step 2: Fallback Check
For each affected workflow, checks if fallback bindings exist.
Logic:
- Does the workflow have bindings to other products (not the failing one)?
- If yes: Impact severity is reduced (you have backup options)
- If no: Impact severity is higher (single point of failure)
Example:
- Workflow "Contract Review" uses Claude 2 only → No fallback → Critical impact
- Workflow "Email Triage" uses Claude 2 + GPT-4 fallback → Has fallback → Medium impact
Step 3: Impact Calculation
For each affected workflow, calculates:
- Impact severity: Critical, High, Medium, or Low
- Estimated downtime: Minutes of potential disruption
- Affected users: Number of users impacted
- Revenue impact: Financial loss estimate (£ GBP)
Severity logic:
Critical: No fallback + high-traffic workflow
High: No fallback + medium-traffic workflow OR fallback exists but critical workflow
Medium: Fallback exists + medium-traffic workflow
Low: Fallback exists + low-traffic workflow

(A code sketch of this severity logic follows the step list below.)

Example impact:
Workflow: "Customer Support Chatbot"
Severity: Critical
Downtime: 240 mins (4 hours)
Affected Users: 5,000 customers
Revenue Impact: £15,000 (lost support capacity, customer churn)
Business Notes: "Tier 1 support offline. Call centre overwhelmed.
Manual ticket routing required."

Step 4: Delete Existing Impacts (Idempotent)
If scenario was previously executed, deletes old impacts to ensure fresh results.
Step 5: Insert New Impacts
Stores calculated impacts to signalbreak.scenario_impacts table.
Step 6: Generate Mitigations
For each impact, generates 2-4 ranked mitigation options.
Mitigation types:
- Switch Provider: Use an alternative AI provider
- Fallback Model: Use a different model from the same provider
- Reduce Scope: Limit automation to critical use cases only
- Manual Process: Temporarily revert to human workflows
- Cache/Queue: Use cached responses or queue requests until recovery
Mitigation ranking factors:
- Effectiveness score (0-10): How well does this solve the problem?
- Implementation cost (£): One-time cost to implement
- Time to implement (hours): How long to deploy this fix?
Example mitigations for "Support Chatbot Down":
1. Switch Provider (Effectiveness: 9/10, Cost: £500, Time: 2 hours)
→ Failover to backup provider (OpenAI GPT-4)
2. Manual Process (Effectiveness: 6/10, Cost: £0, Time: 0.5 hours)
→ Route all tickets to human agents, extend wait times
3. Reduce Scope (Effectiveness: 7/10, Cost: £200, Time: 1 hour)
→ Limit chatbot to FAQs only, escalate complex queries

Step 7: Insert Mitigations
Stores generated mitigations to signalbreak.mitigation_options table.
Step 8: Update Scenario Status
Changes scenario status from Draft/Ready to Executed.
Transaction commit: All 8 steps succeed together or roll back (atomic operation).
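As a rough mental model, Steps 2-3 combine fallback coverage and traffic volume into a severity rating. Here is a minimal sketch of that logic, assuming a simple high/medium/low traffic classification; the production algorithm may weigh additional factors:

```python
def impact_severity(has_fallback: bool, traffic: str, is_critical: bool) -> str:
    """Apply the Step 3 severity rules (sketch; exact thresholds are assumptions).

    traffic: "high", "medium", or "low", e.g. derived from daily call volume.
    """
    if not has_fallback:
        # Single point of failure: severity driven by traffic volume.
        return "critical" if traffic == "high" else "high"
    if is_critical:
        return "high"        # fallback exists, but the workflow itself is critical
    if traffic in ("high", "medium"):
        return "medium"
    return "low"

# "Contract Review" uses Claude 2 only and is high traffic → critical
print(impact_severity(has_fallback=False, traffic="high", is_critical=True))
```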
Execution Results
After execution completes, you'll see:
Execution Summary Card:
✅ Scenario Executed Successfully
Analyzed 12 affected workflows.
3 critical impacts identified - immediate action required.
5 high severity impacts identified.
Mitigation options generated and ranked by effectiveness.
Executed: 15 Jan 2025, 14:23 GMT

Impacts Section:
- Table of affected workflows
- Severity badges (Critical/High/Medium/Low)
- Business impact notes
- Downtime/users/revenue estimates
- Click workflow name to view details
Mitigations Section:
- Grouped by impact
- Ranked by effectiveness
- Implementation cost and time shown
- Click to expand full mitigation details
Execution Errors
If execution fails, you'll see an error message. Common causes:
| Error | Cause | Solution |
|---|---|---|
| "No target product specified" | Scenario has no target_product_id | Edit scenario, select a product |
| "No workflows affected" | No workflows use this product | Add workflows or choose different product |
| "Unauthorized" | Not logged in or session expired | Log in again |
| "Feature limit exceeded" | Scenario quota reached on your plan | Upgrade plan or archive old scenarios |
| "Database error" | Connection issue or constraint violation | Retry, contact support if persists |
Understanding Impacts
Impacts represent the estimated effect of a scenario on each affected workflow.
Impact Severity Levels
Scenarios generate impacts at 4 severity levels:
Critical (Red Badge)
What it means: Immediate, severe disruption with high business impact.
Typical conditions:
- Workflow has no fallback bindings
- High-volume or customer-facing workflow
- Significant revenue/user impact
Example:
Workflow: "Payment Fraud Detection"
Severity: Critical
Downtime: 120 mins
Affected Users: 10,000 transactions/day
Revenue Impact: £50,000 (fraud exposure)
Business Notes: "Real-time fraud checks offline. Manual review required for
all high-value transactions. Increased fraud risk."

Action required: Immediate mitigation planning. This is a single point of failure.
High (Orange Badge)
What it means: Significant disruption but with some mitigation options.
Typical conditions:
- Workflow has fallback but quality/cost trade-off
- Medium-volume workflow with no fallback
- Important but not critical business function
Example:
Workflow: "Legal Contract Review"
Severity: High
Downtime: 240 mins
Affected Users: 50 contracts/day
Revenue Impact: £8,000 (delayed deals)
Business Notes: "M&A due diligence delayed. Manual review capacity limited.
Fallback model has lower accuracy for legal terminology."

Action required: Mitigation planning within days. Monitor closely.
Medium (Yellow Badge)
What it means: Noticeable disruption but manageable.
Typical conditions:
- Workflow has fallback bindings in place
- Medium-volume workflow
- Workarounds available
Example:
Workflow: "Product Description Generation"
Severity: Medium
Downtime: 60 mins
Affected Users: 200 SKUs/day
Revenue Impact: £1,500 (delayed catalog updates)
Business Notes: "Catalogue updates delayed. Fallback to GPT-3.5 increases
generation time by 2x but maintains quality."

Action required: Mitigation planning within weeks. Low urgency.
Low (Green Badge)
What it means: Minor disruption with minimal business impact.
Typical conditions:
- Workflow has fallback bindings
- Low-volume or non-critical workflow
- Multiple alternatives available
Example:
Workflow: "Internal Meeting Summaries"
Severity: Low
Downtime: 30 mins
Affected Users: 20 meetings/week
Revenue Impact: £200 (lost productivity)
Business Notes: "Internal productivity tool. Fallback model works fine.
Manual note-taking viable if needed."

Action required: No immediate action. Monitor trends.
Impact Metrics
Each impact includes quantitative estimates:
| Metric | Description | How It's Calculated |
|---|---|---|
| Estimated Downtime | Minutes of potential disruption | Based on scenario type + workflow criticality |
| Affected Users | Number of users/transactions impacted | Workflow metadata + historical volume |
| Revenue Impact | Financial loss estimate (£ GBP) | Downtime × users × average transaction value |
| Business Impact Notes | Qualitative description | Template + workflow context |
Note on estimates: Impact metrics are estimates based on workflow configuration and historical patterns. Use them for prioritization and planning, not as exact predictions.
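For intuition, here is the Revenue Impact formula from the table applied to illustrative numbers. This sketch assumes traffic is spread evenly across the day; the platform's estimate also factors in workflow context (churn, peak weighting), so exact figures will differ:

```python
def revenue_impact_gbp(downtime_mins: float, txns_per_day: float,
                       avg_txn_value_gbp: float) -> float:
    """Downtime × users × average transaction value, pro-rated over a 24h day."""
    downtime_fraction = downtime_mins / (24 * 60)
    return txns_per_day * downtime_fraction * avg_txn_value_gbp

# 4-hour outage, 5,000 transactions/day, £10 per transaction:
print(round(revenue_impact_gbp(240, 5_000, 10.0)))  # ≈ £8,333 at risk
```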
Viewing Impact Details
From Scenario Detail Page:
- Scroll to "Impacts" section
- Table shows all affected workflows:
- Workflow name (clickable link to workflow detail)
- Severity badge (Critical/High/Medium/Low)
- Business impact notes (truncated)
- Click workflow name to navigate to workflow detail page
- Click impact row to expand full business notes
Empty state:
No Impacts Yet
Execute this scenario to identify affected workflows and estimate
business impact. Click "Execute Scenario" above to begin.

Understanding Mitigations
Mitigations are ranked options to reduce or eliminate the impact of a scenario.
Mitigation Types
SignalBreak generates mitigations across 5 categories:
1. Switch Provider
What it does: Move workflow to a different AI provider entirely.
When to use:
- Current provider frequently unreliable
- Better pricing/features available elsewhere
- Compliance requires multi-provider strategy
Example:
Mitigation: Switch to Azure OpenAI GPT-4
Effectiveness: 9/10
Cost: £800 (API integration + testing)
Time: 4 hours
Description: "Migrate workflow to Azure OpenAI. Maintains quality and
performance. Requires API key configuration and prompt tuning."

Implementation checklist:
- [ ] Acquire API key from new provider
- [ ] Update workflow binding in SignalBreak
- [ ] Test prompts/parameters on new model
- [ ] Run QA validation
- [ ] Deploy to production
- [ ] Monitor quality metrics for 48 hours
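In application code, a provider switch is often implemented as a failover wrapper so the cutover is a config change rather than a rewrite. A minimal sketch; the provider callables are your own client wrappers, not SignalBreak APIs:

```python
import logging
from typing import Callable

log = logging.getLogger("failover")

def with_failover(primary: Callable[[str], str],
                  backup: Callable[[str], str]) -> Callable[[str], str]:
    """Return a callable that tries the primary provider, then the backup."""
    def call(prompt: str) -> str:
        try:
            return primary(prompt)
        except Exception:
            # Alert so the team knows the backup path is live and
            # quality monitoring should tighten (see checklist above).
            log.warning("Primary provider failed; using backup")
            return backup(prompt)
    return call

# Usage: generate = with_failover(call_claude, call_azure_openai)
# where call_claude / call_azure_openai are your own client functions.
```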
2. Fallback Model
What it does: Use a different model from the same provider.
When to use:
- Provider is reliable overall
- Specific model deprecated or expensive
- Prefer continuity with existing vendor relationship
Example:
Mitigation: Fallback to GPT-3.5 Turbo
Effectiveness: 7/10
Cost: £200 (prompt optimization)
Time: 2 hours
Description: "Use GPT-3.5 Turbo as fallback. 50% cheaper but slightly lower
quality for complex tasks. Requires prompt adjustments."

Implementation checklist:
- [ ] Configure fallback binding in workflow
- [ ] Test fallback model performance
- [ ] Adjust prompts if needed (fallback models may need simpler instructions)
- [ ] Set up monitoring for quality degradation
- [ ] Document fallback trigger conditions
3. Reduce Scope
What it does: Limit automation to highest-priority use cases, manual fallback for others.
When to use:
- Cost containment needed
- Quality concerns with alternatives
- Temporary measure during migration
Example:
Mitigation: Limit chatbot to Tier 1 queries only
Effectiveness: 6/10
Cost: £500 (logic changes)
Time: 3 hours
Description: "Route only simple FAQs to chatbot. Escalate complex queries
to human agents immediately. Reduces API usage by 60%."

Implementation checklist:
- [ ] Define scope criteria (what stays automated?)
- [ ] Implement routing logic
- [ ] Train staff on manual procedures
- [ ] Update customer-facing messaging (e.g., "Live agent wait times increased")
- [ ] Monitor customer satisfaction metrics
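The routing logic itself can be as simple as an intent allow-list. A sketch, with hypothetical intent names:

```python
# Intents that stay automated during reduced-scope operation (examples only).
TIER1_INTENTS = {"faq", "order_status", "password_reset"}

def route(intent: str) -> str:
    """Send only pre-approved simple intents to the chatbot; escalate the rest."""
    return "chatbot" if intent in TIER1_INTENTS else "human_agent"

print(route("faq"))             # chatbot
print(route("refund_dispute"))  # human_agent
```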
4. Manual Process
What it does: Temporarily revert to human workflows until automation restored.
When to use:
- Short-term outage expected (hours, not days)
- No suitable automated alternatives
- High-stakes scenarios where quality cannot be compromised
Example:
Mitigation: Manual contract review by legal team
Effectiveness: 5/10
Cost: £0 (existing staff)
Time: 0.5 hours
Description: "Route all contracts to legal team for manual review.
Processing time increases from 2 mins to 45 mins.
Throughput drops by 90%."

Implementation checklist:
- [ ] Alert staff to activation of manual procedures
- [ ] Distribute work queue to team
- [ ] Track backlog size and velocity
- [ ] Communicate delays to stakeholders
- [ ] Monitor for automation recovery
- [ ] Document lessons learned for future incidents
5. Cache/Queue
What it does: Use cached responses or queue requests until service recovers.
When to use:
- Short outage expected (< 1 hour)
- Requests can tolerate delays
- High likelihood of similar queries (cacheable)
Example:
Mitigation: Cache recent chatbot responses for 1 hour
Effectiveness: 8/10
Cost: £300 (caching infrastructure)
Time: 2 hours
Description: "Serve cached responses for common queries (FAQ, status checks).
Queue unique queries for 30 mins. 70% of traffic handled."

Implementation checklist:
- [ ] Implement caching layer (Redis, CDN)
- [ ] Define cache TTL (time-to-live)
- [ ] Set up queue with max wait time
- [ ] Implement fallback if queue exceeds capacity
- [ ] Monitor cache hit rate
- [ ] Flush cache when service recovers
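A minimal in-process version of the cache-plus-queue pattern looks like the sketch below. Production setups would use Redis or a CDN per the checklist; `call_provider` is a placeholder for your own client:

```python
import time

CACHE_TTL_SECS = 3600                        # 1-hour TTL, matching the example above
_cache: dict[str, tuple[float, str]] = {}    # query → (timestamp, response)
_queue: list[str] = []                       # queries to replay once service recovers

def call_provider(query: str) -> str:
    return f"(live answer to {query!r})"     # stand-in for your provider client

def answer(query: str, provider_up: bool) -> str | None:
    """Serve cached answers during an outage; queue unique queries otherwise."""
    hit = _cache.get(query)
    if hit and time.time() - hit[0] < CACHE_TTL_SECS:
        return hit[1]                        # cache hit: no provider call needed
    if not provider_up:
        _queue.append(query)                 # caller shows a "delayed" message
        return None
    response = call_provider(query)
    _cache[query] = (time.time(), response)
    return response
```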
Mitigation Ranking
Mitigations are ranked by effectiveness score (0-10 scale).
Ranking logic:
Effectiveness = Quality Score × Feasibility Score
Quality Score:
- How well does this solve the problem? (0-10)
- Does it maintain customer experience?
- Does it preserve business outcomes?
Feasibility Score:
- How quickly can this be implemented? (faster = higher score)
- What's the cost? (cheaper = higher score)
- Do we have the skills/resources? (yes = higher score)

Example ranking:
Rank 1: Switch to Azure OpenAI (9/10) ← Best quality, reasonable cost/time
Rank 2: Fallback to GPT-3.5 (7/10) ← Good balance of speed and effectiveness
Rank 3: Manual Process (5/10) ← Slow and limited capacity
Rank 4: Reduce Scope (4/10) ← Degrades customer experience

How to use rankings:
- Start with Rank 1 (highest effectiveness) if time/budget allow
- Rank 2-3 are good compromises if urgent or budget-constrained
- Lowest ranks are last-resort options (manual processes, severe scope reduction)
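The documentation gives the ranking factors but not their exact weighting, so here is one plausible way to combine them into a 0-10 score. The normalization constants are assumptions, chosen so quality dominates and cheaper/faster options score higher:

```python
def effectiveness(quality: float, cost_gbp: float, hours: float) -> float:
    """Combine quality (0-10) with a cost/speed feasibility adjustment (sketch)."""
    cost_score = 10 / (1 + cost_gbp / 500)       # £0 → 10, £500 → 5, ...
    speed_score = 10 / (1 + hours / 2)           # 0h → 10, 2h → 5, ...
    feasibility = (cost_score + speed_score) / 2
    # Quality dominates; feasibility nudges the score up or down.
    return round(quality * (0.6 + 0.4 * feasibility / 10), 1)

options = [("Switch Provider", effectiveness(9, 500, 2)),
           ("Manual Process",  effectiveness(6, 0, 0.5))]
for name, score in sorted(options, key=lambda o: -o[1]):
    print(f"{name}: {score}/10")   # Switch Provider ranks first
```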
Viewing Mitigation Details
From Scenario Detail Page:
- Scroll to "Mitigations" section
- Grouped by impact (each affected workflow has its own mitigation options)
- Each mitigation shows:
- Mitigation type badge (Switch Provider, Fallback Model, etc.)
- Effectiveness score (0-10)
- Implementation cost (£ GBP)
- Time to implement (hours)
- Description (what to do)
- Click mitigation card to expand full details
Example display:
Mitigations for: Customer Support Chatbot (Critical)
┌─────────────────────────────────────────────────────┐
│ 🔄 Switch Provider Rank 1 │
│ Effectiveness: 9/10 │
│ Cost: £500 | Time: 2 hours │
│ │
│ Switch to Azure OpenAI GPT-4. Maintains quality │
│ and performance. Requires API key configuration. │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│ 📋 Manual Process Rank 2 │
│ Effectiveness: 6/10 │
│ Cost: £0 | Time: 0.5 hours │
│ │
│ Route all tickets to human agents. Extend wait │
│ times. Temporary measure for short outages. │
└─────────────────────────────────────────────────────┘

Empty state:
No Mitigations Yet
Execute this scenario to generate ranked mitigation options.
Mitigations are automatically generated based on your workflow
configuration and available alternatives.

Scenario Management
Editing Scenarios
You can edit scenarios in Draft or Ready status (not Executed or Archived).
To edit:
- Navigate to scenario detail page
- Click "Edit" button (top right)
- Modify fields:
- Scenario name
- Description
- Scenario type
- Target product
- Click "Save Changes"
Note: An Executed scenario's impacts reflect its configuration at execution time; after any change, re-execute to refresh them.
Deleting Scenarios
You can delete scenarios at any lifecycle stage.
To delete:
- Navigate to scenario detail page
- Click "Delete" button (top right, red)
- Confirm deletion in modal dialog
- Scenario and all associated impacts/mitigations are permanently deleted
⚠️ Warning: Deletion is permanent and cannot be undone. Impacts and mitigations are also deleted.
When to delete:
- Duplicate scenarios
- Test scenarios (created during training)
- Scenarios no longer relevant to your organization
Alternative to deletion: Archive the scenario instead (preserves historical data for audits).
Archiving Scenarios
Archive scenarios to mark them as complete without deleting historical data.
To archive:
- Navigate to scenario detail page
- Click "Edit"
- Change status to "Archived"
- Click "Save"
What happens when archived:
- Scenario is read-only (cannot edit or re-execute)
- Impacts and mitigations are preserved (for audit trail)
- Scenario is hidden from main list by default (but can be filtered to view)
When to archive:
- Scenario was tested and mitigations implemented
- Risk is no longer relevant (e.g., migrated away from that provider)
- Replaced by a newer, more accurate scenario
Restoring from archive: Edit and change status back to Ready or Executed.
Filtering Scenarios
On the scenarios list page, you can filter by:
Status:
- All
- Draft
- Ready
- Executed
- Archived
Type:
- All
- Provider Unavailable
- Model Deprecated
- Price Increase
- Performance Degradation
- Policy Change
- Rate Limit Exceeded
How to filter:
- Navigate to Dashboard → Scenarios
- Click "Filter" button (top right)
- Select status and/or type
- Click "Apply Filters"
- Scenario list updates to show matching scenarios only
Clear filters: Click "Clear Filters" to reset to "All" view.
Best Practices
1. Run Scenarios Quarterly
Why: AI landscape changes rapidly. Providers deprecate models, change pricing, and update policies frequently.
Recommended schedule:
- Q1 (Jan): Test outage scenarios (post-holiday traffic spikes)
- Q2 (Apr): Test price increase scenarios (budget planning season)
- Q3 (Jul): Test model deprecation scenarios (major provider updates common in summer)
- Q4 (Oct): Test policy change scenarios (pre-holiday compliance review)
How to implement:
- Create recurring calendar reminder: "Run SignalBreak scenarios"
- Re-execute existing scenarios to refresh impacts with latest workflows
- Review mitigation effectiveness and update strategies
2. Test Your Highest-Risk Workflows First
Why: 80/20 rule applies—80% of business impact comes from 20% of workflows.
How to identify high-risk workflows:
- Customer-facing (chatbots, recommendations, content moderation)
- Revenue-critical (fraud detection, pricing, lead scoring)
- Compliance-sensitive (legal, healthcare, financial services)
Workflow priority matrix:
High Business Impact + No Fallback = Test First (Critical Risk)
High Business Impact + Has Fallback = Test Second (High Risk)
Low Business Impact + No Fallback = Test Third (Medium Risk)
Low Business Impact + Has Fallback = Test Last (Low Risk)

Action: Start with 1-3 scenarios covering your most critical workflows before expanding to full coverage.
3. Document Mitigations in Your Runbooks
Why: Scenario results are only valuable if your team can act on them during an incident.
What to document:
- Trigger conditions: When to activate this mitigation (e.g., "Provider downtime > 15 mins")
- Step-by-step instructions: Exactly what to do (API keys, config changes, commands)
- Rollback procedure: How to revert if mitigation fails
- Owner/contact: Who executes this (on-call engineer, ops team)
Example runbook entry:
## Mitigation: Chatbot Fallback to Azure OpenAI
**Trigger:** Claude API downtime > 15 minutes OR error rate > 5%
**Steps:**
1. SSH to app server: `ssh ops@chatbot-prod-1`
2. Edit config: `sudo nano /etc/chatbot/config.yaml`
3. Change provider: `provider: anthropic` → `provider: azure_openai`
4. Restart service: `sudo systemctl restart chatbot`
5. Verify: `curl https://chatbot.example.com/health` (expect HTTP 200)
6. Monitor logs: `tail -f /var/log/chatbot/app.log`
**Rollback:** Reverse step 3, restart service
**Owner:** On-call SRE (see PagerDuty schedule)
**Azure OpenAI API Key:** Stored in 1Password vault: "Production/Azure OpenAI"

Where to store runbooks:
- Internal wiki (Confluence, Notion, GitHub Wiki)
- Incident management tool (PagerDuty, Opsgenie)
- Git repo (infrastructure-as-code)
4. Share Scenario Results with Stakeholders
Why: Resilience planning is a team sport. Leadership, legal, finance, and ops all have a role.
Who to share with:
| Stakeholder | What They Care About | What to Share |
|---|---|---|
| Executives | Business continuity, revenue risk | Critical/High impacts, revenue estimates, mitigation costs |
| Legal/Compliance | Regulatory exposure, policy changes | Policy change scenarios, data residency impacts |
| Finance | Budget planning, cost exposure | Price increase scenarios, mitigation costs, ROI analysis |
| Engineering | Technical implementation | Mitigation details, fallback architectures, implementation time |
| Customer Success | Customer impact, SLAs | Downtime estimates, affected user counts, communication plans |
How to share:
- Export scenario results (PDF or CSV)
- Create executive summary (1 page: key risks + top mitigations)
- Present in risk review meetings (quarterly business reviews, audit prep)
- Include in incident postmortems ("We predicted this in Scenario #42...")
Sample executive summary:
Q1 2025 AI Resilience Scenario Testing Summary
Scenarios Tested: 8
Critical Impacts Identified: 3
High Impacts Identified: 7
Top Risk: Claude API Outage (Scenario #12)
- 12 workflows affected (including customer support chatbot)
- Estimated downtime: 4 hours
- Revenue impact: £50,000
- Mitigation: Fallback to Azure OpenAI (£800, 2 hours to implement)
Status: Implemented ✅
Recommendation: Implement multi-provider strategy for all customer-facing workflows by Q2.
Budget Required: £5,000 (API integrations + testing)

5. Use Templates for Onboarding New Team Members
Why: Templates encode organizational knowledge. New hires can see "how we think about AI risk."
How to use templates for training:
- Assign 2-3 scenarios from templates matching your industry
- Have new hire execute scenarios and review results
- Discuss as team: "Are these the right mitigations? What would you add?"
- Update templates based on team feedback (continuous improvement)
Benefits:
- Faster onboarding (learn by doing)
- Shared mental models (everyone tests the same scenarios)
- Improved templates (fresh perspectives catch blind spots)
6. Combine Scenarios with Signals for Complete Coverage
Strategy:
- Signals: Real-time monitoring (what's happening now?)
- Scenarios: Proactive planning (what could happen next?)
Workflow:
Signal arrives → Assess impact → Is this a new risk we haven't modeled?
↓
Yes: Create scenario from signal
↓
Execute scenario to quantify full impact
↓
Implement top-ranked mitigation
↓
Archive scenario when resolvedExample:
Signal: "OpenAI increases GPT-4 pricing by 30%"
↓
Create Scenario: "GPT-4 Price Increase: Budget Impact"
↓
Execute: 8 workflows affected, £4,500/month increase
↓
Mitigation: Migrate 5 low-priority workflows to GPT-3.5 (saves £3,000/month)
↓
Implement migration over 2 weeks
↓
Archive scenario: "Resolved - Hybrid pricing strategy deployed"

Scenario Limits by Plan
SignalBreak enforces scenario limits based on your subscription tier:
| Plan | Scenarios | Executions/Month | Notes |
|---|---|---|---|
| Free | 5 | 10 | Good for testing, proof-of-concept |
| Starter | 25 | 100 | Small teams, single product focus |
| Professional | 100 | Unlimited | Mid-size teams, multiple products |
| Enterprise | Unlimited | Unlimited | Large orgs, compliance requirements |
What happens when you hit the limit:
- Scenario creation is blocked (HTTP 403 error)
- Error message: "Scenario limit reached. Please upgrade your plan."
- Existing scenarios remain accessible (read-only)
How to manage limits:
- Archive or delete old/irrelevant scenarios to free up quota
- Upgrade plan if consistently hitting limits
- Prioritize high-risk scenarios (test critical workflows first)
Check your usage:
- Navigate to Settings → Billing
- View "Scenarios" usage meter
- Shows: X / Y scenarios used (Z% of quota)
Compliance & Audit
Scenarios for Compliance Frameworks
Scenarios help demonstrate compliance with business continuity and risk management requirements:
ISO 22301 (Business Continuity Management)
Requirement: Organizations must identify potential disruptions and test recovery procedures.
How scenarios help:
- ✅ Proactive identification of AI supply chain risks
- ✅ Documented impact analysis for each disruption scenario
- ✅ Tested and ranked mitigation strategies
- ✅ Periodic review cycle (re-run scenarios quarterly)
Evidence for auditors: Export scenario results showing regular testing and mitigation planning.
NIST Cybersecurity Framework (CSF)
Function: Identify (ID.RM) - Risk Management Strategy
How scenarios help:
- ✅ Risk identification (AI provider dependencies)
- ✅ Impact assessment (business, financial, operational)
- ✅ Response strategies (ranked mitigations)
Function: Protect (PR.IP) - Information Protection Processes and Procedures
How scenarios help:
- ✅ Resilience testing (test continuity plans)
- ✅ Improvement cycle (update strategies based on results)
Evidence for auditors: Scenario execution logs, mitigation runbooks, quarterly review summaries.
ISO 42001 (AI Management System)
Clause 8.1.2: Organizations shall identify and assess risks related to AI system dependencies.
How scenarios help:
- ✅ Dependency mapping (which workflows depend on which providers)
- ✅ Risk quantification (severity levels, business impact)
- ✅ Control effectiveness (fallback strategies tested)
Evidence for auditors: Scenario templates, execution results, archived scenarios showing risk management lifecycle.
Audit Trail
SignalBreak maintains a complete audit trail for scenarios:
What's logged:
- Creation: Who created the scenario, when, and from what source (template, scratch, signal)
- Modifications: All edits (scenario name, type, target product changes)
- Executions: Timestamp, impacts generated, mitigations calculated
- Status changes: Draft → Ready → Executed → Archived transitions
- Deletion: Who deleted, when (soft delete available for Enterprise plans)
How to access audit logs:
- Navigate to Settings → Audit Logs
- Filter by "Scenarios" category
- View chronological log of all scenario activity
Retention:
- Free/Starter: 90 days
- Professional: 1 year
- Enterprise: 7 years + custom retention policies
Troubleshooting
Problem: "Execute Scenario" Button is Disabled
Possible causes:
- No target product selected
- Solution: Edit scenario, select a target product from dropdown
- Scenario is archived
- Solution: Edit scenario, change status to "Ready"
- Feature gate hit (scenario limit reached)
- Solution: Archive old scenarios or upgrade plan
How to diagnose:
- Hover over disabled button (tooltip shows reason)
- Check scenario detail page: Is target product shown? Is status "Archived"?
Problem: "No Workflows Affected"
Meaning: Scenario executed successfully, but no workflows use the target product.
Possible causes:
- Wrong product selected
- Solution: Edit scenario, choose a different target product
- No workflows configured yet
- Solution: Create workflows first (Dashboard → Workflows → + New Workflow)
- Workflows use different bindings
- Solution: Check workflow bindings (Workflows → Edit → Provider Bindings)
How to diagnose:
- Navigate to Dashboard → Workflows
- Filter by target product (use search or product filter)
- Do any workflows appear? If no, that's the issue.
Problem: Impacts Show Zero Revenue/Users
Meaning: Impacts were generated, but metrics are all zeros or nulls.
Possible causes:
- Workflow metadata incomplete
- Solution: Edit workflows, add business context (users, revenue, criticality)
- No historical data available
- Solution: Run workflows in production first to establish baselines
- Low-priority workflows
- Solution: This may be correct (internal tools, low-traffic workflows)
How to improve estimates:
- Navigate to Dashboard → Workflows → [workflow] → Edit
- Fill in metadata:
- Daily transaction volume (e.g., 500 API calls/day)
- User base size (e.g., 1,000 active users)
- Revenue per transaction (e.g., £10/transaction)
- Re-run scenario to refresh impacts
Problem: Mitigations Don't Make Sense
Example: "Switch to Provider X" suggested, but you don't use Provider X.
Possible causes:
- Mitigation generator uses generic templates
- Solution: SignalBreak generates options based on best practices, not your exact setup
- Missing product bindings
- Solution: Configure your available products (Settings → Integrations → Providers)
How to interpret:
- Mitigations are suggestions, not commands
- Use them as a starting point for your own planning
- Adapt mitigations to your org's constraints (budget, skills, vendor relationships)
Custom mitigations:
- SignalBreak generates 2-4 options automatically
- You can document additional custom mitigations in scenario description or external runbooks
Problem: Scenario Execution is Slow (> 10 seconds)
Typical execution time: 2-10 seconds for 1-50 workflows.
Possible causes:
- Large number of workflows (100+ workflows affected)
- Solution: This is expected. Execution time scales with workflow count.
- Database connection issues
- Solution: Retry. Contact support if consistently slow.
- Complex impact calculations
- Solution: Execution is CPU-intensive for critical workflows. Wait for completion.
Performance tips:
- Avoid creating scenarios during peak usage times
- If you have 500+ workflows, consider narrowing scope (target specific product subsets)
FAQ
What's the difference between a signal and a scenario?
Signals:
- Real-time events from AI providers (deprecations, pricing changes, incidents)
- You react to signals as they happen
- Always reflect current reality
Scenarios:
- Hypothetical "what-if" exercises you design
- You proactively test scenarios before they happen
- You control timing and scope
Use both:
- Monitor signals to stay informed of real changes
- Create scenarios to test resilience and plan mitigations
Can I create scenarios for providers I don't use yet?
Yes. You can test scenarios for any provider/model in SignalBreak's directory, even if you haven't integrated them yet.
Why this is useful:
- Vendor evaluation: "If we switch to Provider X, what's the impact if they have an outage?"
- Negotiation: "We need 99.9% uptime SLA because our scenario testing shows £50k/day exposure."
- Future-proofing: "We're planning to adopt Provider Y in Q3. Let's test risks now."
How to do it:
- Create scenario, select target product from full provider directory
- Execute scenario (may show "No workflows affected" if you haven't integrated yet)
- Manually estimate impacts based on planned usage
How often should I re-run scenarios?
Recommended frequency:
- After major workflow changes: New workflows added, bindings changed, providers switched
- Quarterly: Re-run all critical scenarios to refresh impacts with latest data
- After signals: When a signal arrives, re-run related scenario to assess current impact
Why re-run:
- Your workflow landscape changes (new workflows, deleted workflows, updated bindings)
- Impact estimates should reflect current reality
- Mitigations may change (new providers available, costs updated)
Example:
Jan 2025: Execute "Claude API Outage" scenario
Result: 8 workflows affected, £15k impact
April 2025: Add 4 new customer-facing workflows using Claude
April 2025: Re-run "Claude API Outage" scenario
Result: 12 workflows affected, £32k impact ← Impact doubled!
Action: Implement fallback bindings for new workflows

Can I export scenario results?
Yes. SignalBreak provides export options for scenario data:
Export formats:
- PDF: Executive summary (scenario overview, impacts table, top mitigations)
- CSV: Impacts and mitigations data (import into Excel, Google Sheets)
- JSON: Full scenario data (for integration with other tools)
How to export:
- Navigate to scenario detail page
- Click "Export" button (top right)
- Select format (PDF, CSV, JSON)
- Download file
Use cases:
- Reporting: Share with executives, auditors, compliance teams
- Analysis: Import into BI tools for trend analysis
- Archival: Store in company wiki or document management system
Do scenarios affect my AI providers?
No. Scenarios are simulations only. They do not:
- Send API requests to your AI providers
- Modify workflows or bindings
- Trigger actual disruptions
- Cost API credits
What scenarios DO:
- Query your SignalBreak database (workflows, products, bindings)
- Calculate estimated impacts based on configuration
- Generate mitigation suggestions
- Store results for review
Scenarios are safe to run in production environments.
Can I create scenarios for multiple products at once?
No. Each scenario tests one target product (one AI model/provider).
Why:
- Scenarios model specific, localized disruptions (e.g., "GPT-4 is down" not "All AI is down")
- Impacts and mitigations are product-specific
If you need to test multiple products:
- Create separate scenarios for each product
- Run them in sequence or parallel
- Compare results to identify highest-risk dependencies
Example:
Scenario 1: "Claude 2 Deprecated" (target: Claude 2)
Scenario 2: "GPT-4 Deprecated" (target: GPT-4)
Scenario 3: "Gemini Deprecated" (target: Gemini)
Compare:
- Claude 2: 12 workflows, £50k impact ← Highest risk
- GPT-4: 5 workflows, £15k impact
- Gemini: 2 workflows, £3k impact

What happens to scenarios if I delete a workflow?
Scenario remains, impacts updated on next execution.
What happens:
- You delete a workflow (e.g., "Email Triage")
- Existing scenario results still reference that workflow (historical data preserved)
- Re-running the scenario removes the deleted workflow from impacts
Why this matters:
- Scenarios maintain historical audit trail (what was tested when)
- Re-running scenarios keeps them current with your active workflows
Best practice:
- Re-run scenarios quarterly to ensure impacts reflect current workflow inventory
- Archive scenarios that are no longer relevant after major architecture changes
Can I test scenarios against staging/dev environments?
Not directly. SignalBreak scenarios analyze your production workflow configuration (as defined in the platform).
Workaround for testing:
- Create test tenant (separate SignalBreak account)
- Configure staging workflows in test tenant
- Run scenarios against test tenant
- Apply learnings to production
Future feature: Multi-environment support is on the roadmap (test scenarios in staging before applying to production).
Related Features
- Signals: Real-time monitoring of AI provider changes (combine with scenarios for complete risk coverage)
- Workflows: Configure your AI workflows and provider bindings (required before running scenarios)
- Provider Bindings: Set up fallback providers and model alternatives (improves scenario resilience)
- Governance Reports: Export scenario results for compliance and audits
- Dashboard: Monitor high-risk scenarios and recent executions at a glance
Support
Need help with scenarios?
- 📧 Email: support@signal-break.com
- 💬 Live Chat: Click chat icon (bottom right) for instant support
- 📚 Knowledge Base: docs.signal-break.com
- 🎥 Video Tutorial: How to Run Your First Scenario (5 mins)
Common requests:
- Customizing mitigation templates
- Integrating scenario results with incident management tools (PagerDuty, Opsgenie)
- Bulk scenario creation via API
- Advanced impact calculation customization
Enterprise support:
- Dedicated scenario planning sessions with our risk advisory team
- Custom scenario templates for your industry
- Quarterly resilience review meetings
Last updated: January 2025