the aboard newsletter

A Legal Framework for Understanding Bad AI-Generated Bugs

With spec-first approaches, we’re trying to bring order to AI coding by creating a set of rules and processes.

[Image: Aerial view of yellow robots on an assembly line.]

Now that’s a process.

AI is changing how we build software in a few ways. The most commonly discussed right now is “vibe coding”: You say what you want, and the AI coding tools attempt to make it. This works best for churning through well-understood problems in popular programming languages, because LLM spiders ate up GitHub and other public code sources, which gave them tons of JavaScript to chomp through. LLMs are “good” at the web-app frontend framework React…because there’s so much React code on the web.

It’s pretty magical to watch an LLM take a simple prompt and just start feverishly coding, but as everyone keeps finding out, it’s hard to get an LLM to finish anything. Enter “spec-first.” When you’re working spec-first, you use AI as an assistant to plan, think, define, design, and document it all. Then you feed that spec to the LLM, often in small pieces, and have it build to the specification. App projects with any sort of complexity benefit greatly from this approach, which is how we build software at Aboard. We’ve made a particular bet against vibe coding. Without definition and prep work, vibe coding simply doesn’t get you there.
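To make “often in small pieces” concrete, here is a minimal sketch of the idea, not a description of how Aboard or Kiro actually work. It assumes a plain-text spec whose sections are separated by blank lines; the names split_spec, build_prompts, and SPEC, and the wording of the prompt, are all made up for illustration.

```python
def split_spec(spec: str) -> list[str]:
    """Split a plain-text spec into sections separated by blank lines."""
    return [part.strip() for part in spec.split("\n\n") if part.strip()]


def build_prompts(spec: str) -> list[str]:
    """Turn each spec section into one small prompt (hypothetical wording)."""
    sections = split_spec(spec)
    prompts = []
    for number, section in enumerate(sections, start=1):
        prompts.append(
            f"Implement only part {number} of {len(sections)} of the spec. "
            f"Do not add anything the spec does not ask for.\n\n{section}"
        )
    return prompts


# A toy spec, invented for this example.
SPEC = """Users can sign up with an email address.

Users can create a to-do list with a title.

Users can mark a to-do item as done."""

for prompt in build_prompts(SPEC):
    print(prompt)
    print("---")
```

Each prompt carries one piece of the spec, so when the output drifts, there is a specific written requirement to hold it against.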

We’re not alone here. The latest AI tool to take a “spec-first” approach is Amazon’s just-announced Kiro:

Kiro specs are artifacts that prove useful anytime you need to think through a feature in-depth, refactor work that needs upfront planning, or when you want to understand the behavior of systems—in short, most things you need to get to production. Requirements are usually uncertain when you start building, which is why developers use specs for planning and clarity. Specs can guide AI agents to a better implementation in the same way.

[Image: Screenshot of Kiro (black background and white text).]
Here’s Kiro taking a crack at requirements documents.

They’re making a similar bet: Rushing to code feels good at first, but it doesn’t get you a complete, stable, secure application that groups of people can reliably use.

To boil it down:

  • Vibe coding is for learning, playing, and prototyping software.
  • Spec-first coding is for building a robust alpha version of an app.
  • Humans—with or without AI assistance—take it from there.

What About Bugs?

If you ask a very senior engineer about their job, they’ll tell you that it’s not really about coding: It’s about debugging. A few weeks ago, I wrote about bugs—why they aren’t only about “broken” software. Bugs are defects that happen when software deviates from human expectations.

In product liability law, there’s a key distinction between design defects and manufacturing defects. 

  • A design defect exists when a product is inherently unsafe due to a flaw in its original design, even if manufactured perfectly. 
  • A manufacturing defect occurs when the product was designed properly, but mistakes in the production process caused it to deviate from the original design.

Design defects are far more severe. A design defect means every car coming off the assembly line is flawed. The workers in the factory did exactly as they were told, but the plan (the spec) was bad.

Through this lens, AI fails miserably when you rush to code. Why? Even if it makes a working app, that app is not based on any design. There is no design to serve as a reference point. It’s not a design defect; it’s devoid of design.

So then what? Kiro (and Aboard) have decided to draft a spec first (i.e., a reference design that must be followed). Now mind you, AI hates to follow. It loves to riff. But it’s a start. We can at least find design defects because we have a design in hand. That’s more than vibes.
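The distinction can be sketched in a few lines of code. This is a toy model under loud assumptions: it treats what people wanted, what the spec says, and what got built as plain strings that either match or don’t, which real software never does. The names Finding and classify are invented for this example.

```python
from enum import Enum
from typing import Optional


class Finding(Enum):
    OK = "matches the spec, and the spec matches what people wanted"
    MANUFACTURING_DEFECT = "the build deviates from the spec"
    DESIGN_DEFECT = "the build follows the spec, but the spec is wrong"
    NO_DESIGN = "nothing was written down, so there is nothing to compare against"


def classify(wanted: str, built: str, spec: Optional[str]) -> Finding:
    """Classify one behavior, given what was wanted, built, and specified."""
    if spec is None:
        # Vibe coding: even a working result has no reference design.
        return Finding.NO_DESIGN
    if built != spec:
        # Checked first, so a build that ignores a bad spec lands here too.
        return Finding.MANUFACTURING_DEFECT
    if spec != wanted:
        return Finding.DESIGN_DEFECT
    return Finding.OK


wanted = "price includes tax"
print(classify(wanted, "price includes tax", None).name)                  # NO_DESIGN
print(classify(wanted, "price excludes tax", "price includes tax").name)  # MANUFACTURING_DEFECT
print(classify(wanted, "price excludes tax", "price excludes tax").name)  # DESIGN_DEFECT
print(classify(wanted, "price includes tax", "price includes tax").name)  # OK
```

Note the first case: the app does what people wanted, and it still comes back as NO_DESIGN, because without a spec there is no way to say why it works or to notice when it stops.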

Let’s stretch that metaphor: With LLMs, companies like OpenAI and Anthropic have built a factory that generates totally random cars. Some of them explode, some of them are incredibly fast, some have dashboards where their windows should be. That sounds really bad, but the factory turns out cars at a rate of one per minute, for a dollar per car.

With spec-first approaches, we’re building an assembly line that will increase the likelihood that the thing that gets produced has wheels and an engine. We’re trying to bring order to the factory by creating a set of rules and processes. We know that sometimes the dashboard will be replaced by a small bird, or that instead of a car, the LLM will occasionally produce a cow. But we try to anticipate that in the spec, so it’s less likely each time we make a new car.

We all love the AI demos (Aboard has one!), but there’s a lot of work to do. If we want to use AI seriously in software development, we have to stop treating it like a shortcut and start treating it like infrastructure. That means guiding it, not reacting to it. It means putting in the effort to define what we’re building before we ask AI to build it. 

I’m not a fan of process—it often gets weaponized—but I am a fan of delivery, and it’s clear that we need new processes that people can agree on, with testable results, more than we need LLM-assisted coding bots. A big part of our work at Aboard is in creating that process. Because the speedup is real and exciting, and we are finding that, even in these early days, we’re able to give people more good software, more quickly than ever before.

Spec-first isn’t nostalgia for old processes—it’s a commitment to delegating work without delegating responsibility, and to outcomes that make sense. So we’ll stick with it, and keep pushing until we get to the next phase.