All Skills
AI Product
AI Feature Rollout
ai-feature-rollout.md · updated 2026-06-12
Plans a staged production rollout for an AI feature: blast-radius classification, feature-flag and kill-switch architecture, guardrails, staged exposure with exit criteria, and the metrics that tell you whether to keep ramping. Drawn from replacing rigid sprint releases with environment-based continuous deployment on a production AI platform.
Use this when
- ›Shipping a new LLM-powered feature to real users
- ›Swapping the model behind an existing feature
- ›Replacing rule-based logic with AI and needing a safety net
SKILL.md
---
name: ai-feature-rollout
description: Plan a safe production rollout for an AI feature. Use when shipping a new LLM-powered feature, swapping models, or replacing rule-based logic with AI — produces a flag strategy, guardrails, metrics, and a kill-switch plan instead of a big-bang release.
---
# AI Feature Rollout Plan
You are planning the production rollout of an AI feature. AI features fail probabilistically — quality regressions, cost spikes, and trust-destroying outputs won't show up in CI. The answer is staged exposure with measurement at every stage. Produce a rollout plan tailored to the feature described, using the framework below.
## Step 1 — Classify the blast radius
Determine and state:
- **Output exposure**: internal-only → assistive (human reviews before use) → autonomous (acts directly on user/business data). Autonomous features need every guardrail below; assistive features can move faster.
- **Reversibility**: can a bad output be undone? Generated draft = yes; sent message / charged payment = no.
- **Cost shape**: per-use inference cost and the worst-case (abuse, retry storms, long inputs).
## Step 2 — Flag architecture
- One feature flag per user-visible capability, evaluated server-side, with environment-based defaults (off in prod, on in staging) — not a code branch deleted at launch.
- Targeting dimensions the flag must support: percentage of users, explicit allowlist (internal/beta), per-tenant or per-plan, and per-model-variant for A/B.
- A separate **kill switch** distinct from the rollout flag: flipping it must degrade gracefully (hide the feature or fall back to the pre-AI path), take effect without deploy, and be flippable by on-call — not just the feature team.
## Step 3 — Guardrails before first external user
- Input limits: max length, rate limits per user/tenant, abuse filtering if inputs are user-authored.
- Output checks appropriate to the feature: schema validation, banned-content filters, grounding checks against source data for factual features.
- Cost ceilings: per-request token caps, per-user daily caps, and a global daily budget alarm.
- Fallback behavior defined for: model timeout, provider outage, guardrail rejection. "Show an error" is acceptable only for assistive features.
## Step 4 — Staged exposure plan
Produce a concrete stage table for the feature:
1. **Shadow** (if replacing existing logic): run AI alongside, log both, compare offline. No user impact.
2. **Internal**: team + allowlist. Exit criteria: N uses, zero guardrail breaches, quality spot-check pass.
3. **Beta cohort**: 1–5% or opt-in. Exit criteria: quality metric ≥ target, cost/use within budget, support-ticket rate unchanged.
4. **Ramp**: 25% → 50% → 100% with a defined soak time per step and an automatic halt condition (error rate, cost, or quality regression threshold).
Every stage names its metric thresholds and who decides progression.
## Step 5 — Measurement
- **Quality**: task-specific success signal (acceptance rate of suggestions, edit distance from final, thumbs, escalation rate) — pick ones that exist in product telemetry, not ideal-world metrics.
- **Cost**: tokens and spend per use, attributed to this feature's tag.
- **Trust**: opt-out rate, feature abandonment, support contacts mentioning the feature.
- A/B when swapping models: same metrics, paired cohorts, minimum sample size stated before starting.
## Output format
Deliver: blast-radius classification, flag + kill-switch spec, guardrail checklist with status, the staged exposure table with exit criteria, and the metric definitions. Flag anything in the existing codebase that blocks this plan (no flag system, no per-feature cost attribution) as prerequisite work.
Install in Claude Code
mkdir -p ~/.claude/skills/ai-feature-rollout && curl -fsSL https://harshrastogi.tech/skills/ai-feature-rollout.md -o ~/.claude/skills/ai-feature-rollout/SKILL.mdThen ask Claude Code for the task — the skill is picked up automatically. For a project-scoped install, use .claude/skills/ inside your repo instead.
Using a different agent?
Skills are plain markdown. Paste the file into any capable AI assistant alongside your task, or wire it into any agent framework that supports system instructions.
Tags
Feature FlagsA/B TestingRolloutAI Product