Stages as lenses: learning, testing, and scaling responsibly


For more than a decade, the rhythm of Discovery, Alpha, Beta and Live has structured how digital public services come to life in the UK. That rhythm remains vital: civic scaffolding that helps us learn safely, stage by stage, on the path from curiosity to confidence. 

In a vibe-coded world, where ideas can be prototyped as deployed code before a meeting ends, the meaning of those stages deepens. They are not hurdles to clear, but lenses for understanding: of users, of systems, and of what it takes to deliver a safe, sustainable service.

Discovery is where we see clearly

The core discipline – begin with users and their needs – does not change. If anything, speed makes the early stages of Discovery more important, not less. Discovery isn’t about what can be built; it’s about what should be built, and why.

It’s where we pause long enough to see the world through our users’ eyes, map the journeys and constraints that shape their experience, and surface the organisational, policy and data realities that will make or break the service later. When building becomes fast, understanding becomes precious. Discovery endures because it forces us to look before we leap. Its essence is not documentation but discernment. 

AI tools can help: quick visualisations, simulated flows, prompts co-created with users. But quick sketches are not the same as insight. Discovery demands the discipline to slow down and keep asking: is this still the right problem? Prototypes will start to appear, but their job is to illuminate the problem, not to jump to the solution.

The discipline is to stay curious: keep asking “why does this exist?” before racing to answer “what could we build?”

Alpha is where we learn cheaply

Here, understanding is tested. Ideas take form – lightly and provisionally – so we can explore directions without committing too early. AI-assisted delivery teams might generate multiple prototypes in a morning, test with users the same day, and carry forward the most promising. The artefacts may multiply but the output is still insight. 

Product and design shift from planning builds to curating and comparing them: which best meets the need we uncovered? Which breaks on policy, legal, operational, or data constraints? What hidden assumptions have surfaced? 

Alpha becomes a rhythm of rapid, responsible experiments: the smallest thing that tests the riskiest assumption, with the right mix of people involved in the learning. 

Alpha is allowed to be untidy, and abundantly so: it should produce multiple versions and expect to throw most of them away. But too often “Alpha” runs as a polite waterfall to deliver the One True Service. When building gets cheaper, we lose some excuses: it becomes practical, and healthier, to explore properly and let evidence choose what survives.

And yet, AI assistance introduces a new tension. The old adage – discard Alpha code – exists for good reason: prototypes are built for learning, not for longevity. But what if your AI pair can generate something secure, accessible, tested, observable and operable by default? Isn’t it wasteful to throw it away?

If an Alpha slice meets the standards you’d demand in Beta, and the team has capacity to maintain it, then build on top of it. Otherwise, keep the learning and let the code go. We must never smuggle Alpha code into production simply because it looks tidy. The point is not to build something beautiful. It’s to learn something true.

You don’t end Alpha because your prototype is polished but because you have greater clarity: you’ve understood the problem, tested the main risks, and identified a direction worth pursuing.

This isn’t linear. Research informs design; design provokes better research. A Discovery question today might become an Alpha experiment tomorrow. You can hold both mindsets at once – learning about users while showing them working ideas to test your own assumptions. The teams who thrive will be those who keep evidence flowing in both directions – research shaping what gets built, and working ideas surfacing new questions to explore. 

Beta is where learning meets reality

It’s where teams build something that works end-to-end for real users, integrates with real systems, and operates under real-world conditions. Here, performance, reliability, accessibility and operational readiness get tested properly. 

Beta is also the bridge to long-term ownership. 

Teams must start thinking about lifecycle: who will care for the service when the spotlight moves on? What does ‘good’ look like for frontline staff? Going fast here doesn’t mean cutting corners; it means building in the open, showing your working, and inviting feedback from users, colleagues and peers so that decisions are tested socially as well as technically.

With AI in the loop, Beta can also expand its scope. The question becomes not just “does it work?” but “is it working fairly, securely and as we intended?” This is where new assurance shows up: checking model outputs for bias, confirming prompts don’t leak sensitive data, validating that AI-generated content still meets accessibility and clarity standards. 

AI assistance changes what must be tested, not whether we test.

In practice, Beta may fragment into progressive rollouts, feeding live data straight back into iteration. A service might scale from 5% to 20% in days, but only if each step earns its place through evidence. This is evidence, not vibes. You don’t skip evidence just because you’re getting more confident in how you’re orchestrating AI assistance.
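
As a rough illustration of a rollout step "earning its place", the sketch below gates each increase in traffic on a small set of evidence checks. Everything in it is an assumption made for illustration: the step sizes beyond 5% and 20%, the metric names, and the thresholds are hypothetical, and a real team would agree its own measures with users, operations and assurance colleagues.

```python
from dataclasses import dataclass

# Hypothetical rollout steps (percentage of users). The 5% and 20% figures
# come from the text; 50 and 100 are illustrative.
ROLLOUT_STEPS = [5, 20, 50, 100]

# Illustrative thresholds only, not recommended values.
MAX_ERROR_RATE = 0.01
MIN_COMPLETION_RATE = 0.85


@dataclass
class Evidence:
    """Evidence gathered at the current rollout step (hypothetical metrics)."""
    error_rate: float               # fraction of requests that failed
    task_completion_rate: float     # fraction of users who completed their task
    open_accessibility_issues: int  # unresolved accessibility defects


def step_is_earned(evidence: Evidence) -> bool:
    """Return True only if every check passes."""
    return (
        evidence.error_rate <= MAX_ERROR_RATE
        and evidence.task_completion_rate >= MIN_COMPLETION_RATE
        and evidence.open_accessibility_issues == 0
    )


def next_rollout_percentage(current: int, evidence: Evidence) -> int:
    """Move to the next step if the evidence supports it, else stay put."""
    if not step_is_earned(evidence):
        return current
    later_steps = [step for step in ROLLOUT_STEPS if step > current]
    return later_steps[0] if later_steps else current
```

The design choice worth noting is that the default is to stay at the current percentage: confidence in the tooling never advances the rollout on its own, only evidence does.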

Live is not the end

It’s where stewardship begins: running safely, at scale, with a clear owner, published metrics, and a rhythm of continuous improvement. In a vibe-coded world, a daily rhythm of improvement should be expected, not exceptional. The line between late Beta and Live can blur, but Live is where accountability crystallises: a commitment to own consequences, listen to users, and fix issues promptly.

And beyond Live is the unglamorous but necessary discipline of retiring a service well – handling data properly, redirecting users cleanly, and ensuring that institutional learning isn’t lost. Ending well is as important as starting well.

Across all these stages, the theme is evidence, not assumption. 

We move on because we’ve learned enough to proceed responsibly – not because we get the paperwork right; not because a prototype looks good; not because the code runs. 

So while loops may shorten and transitions become more fluid, their purpose remains. The stages are not gates to rush through but disciplines of reflection:

  • Discovery asks why this?
  • Alpha asks how might we?
  • Beta asks can this work for real?
  • Live asks how do we sustain and improve it?

AI-assisted delivery can change the medium of evidence – more tangible, more immediate, more participatory – but it doesn’t change the need for evidence itself.

The challenge for teams in a vibe-coded world is not to skip stages but to tighten feedback loops while preserving their integrity. Building might have got easier; learning is still hard. The Service Manual remains our guide not because it slows us down, but because it translates speed into safety, quality and value. 

The future of delivery won’t abandon these stages. It will fold them tighter. Discovery will happen within Beta; Alpha may re-open after Live. Teams will move through them in days, not months, circling back as new questions emerge. The boundaries between the stages might soften but the learning will deepen.

What holds it together is a commitment to earn confidence with proof, not promise. To stay honest at speed. In that sense, AI-assisted delivery gets us closer to our aspirations: delivery as a practice of learning in public – disciplined enough to keep pace with the tools it uses, humble enough to keep asking what users actually need, and rigorous enough to keep the whole endeavour safe.
