The Three Sacred Guardrails of AI
Solving all of your LLM problems with just nine words.

They’re there for a very good reason.
More than a year ago, we realized AI’s ability to create code would drastically change our product roadmap. After all, our product, Aboard, was designed to speed up data-driven software development. Our chosen industry, with acronyms like SaaS, ERP, CRM, CMS, B2B, ARPU, and CAC, was in the crosshairs of the new giant LLM companies like Anthropic and OpenAI, not to mention Google and Microsoft.
As a tiny bootstrapped enterprise, we had to adapt—or perish—or adapt then perish, or…what? There were no books and no solid guides, only LinkedIn posts on how to crush it with ChatGPT and YouTube videos about AI passive income. The world offered a vast range of opinions married to an extreme paucity of expertise.
Today, however, we can see a craft of working with LLMs emerging. It usually comes down to one word: “Guardrails.” (Listen to the recent podcast episodes with our VP of Engineering Kevin Barrett, then with our CTO Adam Pash—we joke that listening for the word “guardrails” is our official drinking game.)
I think it’s worth taking a moment and talking about guardrails: What the term means here, and how they work in practice. It’s been an education, and hopefully we can share some of what we’ve learned. The good news is there are only three guardrails, and they total nine words.
Let’s get pedantic!
The Challenge
First, zoom way out: LLMs are general-purpose systems that produce extremely specific nonsense. The result looks less like what we think of as a classic, “computer-ey” database output and more like a conversation. You can ask an LLM for anything, with no limits: Code, prose, poems, summaries, literally any form of expression. It will almost always produce something. That is its nature. We interpret it and assign meaning to its output. That is our nature.
Guardrail #1: Expect nonsense.
But wait! Nonsense? It helped me write my college application! And sure, often it produces a clear, cogent block of text that seems like a sensible reply to your prompt. But unfortunately, that breaks down in surprising ways: The LLM might produce unrelated text, or offer too much information, or too little. It might be too formal, too casual, too boring, or too random. It might produce biased information or repeat harmful stereotypes, or it might produce text that looks factual but is in fact a “hallucination,” with names and details that no one has ever seen before.
These aren’t bugs as computer people have understood them over the last 70 years. They’re a “natural” part of how LLMs work. If you’re serious about working with the technology in a scalable way, you need to assume what it’s generating is meaningless or error-filled, even if it looks pretty good—because that’s how they get you.
Guardrail #2: Aim to summarize.
Where to begin? As Laurie Voss noted some time ago, an LLM is more accurate when it summarizes, and less predictable otherwise. Ask it to write a letter to your sister, and it’ll fill in the details. You can insist that it “summarize the trip to Slovenia” and it’ll write, “Lake Bled was as stunning as you’d imagine, with the island church and the castle above the water.” I just tested ChatGPT 5 and it did that for me—and I’ve never been to Slovenia, nor do I have a sister. But when Google Gemini automatically summarizes my emails to my brother, who does exist, lives in Maryland, and is frankly a great guy, it does a good job, with the accuracy you’d expect.
So if we just have a prompt, how do we add guardrails? Well, it’s pretty simple: You need to (1) expand the inputs; and (2) filter and format the outputs.
Take a product like Deep Research: When you give it a prompt, it asks questions, makes a plan, and searches the web to gather more text. Then it churns through all of that to generate a report. It might produce code and run it, then read the output of that code to guide its next steps.
It can also read local files and PDFs. It converts the end result into structured data—text with sections, headings, and citations—which can be exported as a document or a PDF. There’s an intense mix of old-school programming and LLM action feeding into each other. But in the end: It expands, then it summarizes.
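To make that concrete, here’s a minimal sketch of the expand-then-summarize shape, written in Elixir (the language our team favors). Everything in it is a stand-in: fetch_related_docs/1 and call_llm/1 are hypothetical stubs, not a real search or LLM API.

```elixir
defmodule ExpandThenSummarize do
  # Guardrail #2 in miniature: widen the inputs first, then ask the
  # model only to condense what was gathered, not to invent.
  def run(prompt) do
    prompt
    |> expand()
    |> summarize()
  end

  # Step 1: expand. Pull in more text than the user gave us.
  defp expand(prompt) do
    %{prompt: prompt, context: fetch_related_docs(prompt)}
  end

  # Step 2: summarize. Framing the request as summarization keeps the
  # model on its most accurate footing.
  defp summarize(%{prompt: prompt, context: context}) do
    call_llm("Summarize the material below to answer: #{prompt}\n\n#{context}")
  end

  # Hypothetical stubs, so the sketch runs on its own.
  defp fetch_related_docs(_prompt), do: "...text gathered from search, files, and PDFs..."
  defp call_llm(request), do: "[summary of #{byte_size(request)} bytes of input]"
end

ExpandThenSummarize.run("Plan a data migration")
```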
Guardrail #3: Target one data structure.
There’s a great set of programming epigrams by a brilliant computer scientist named Alan Perlis. They’re timeless—every one of them applies to this new world of AI as much as it did to the old world of software development.
My favorite is: “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” A more universal rule might be, “Keep like with like.” We’ve spent the last year defining our “Blueprint” along these lines: It’s a compact but expressive data structure that describes a whole app.
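We haven’t published the Blueprint itself, so here’s a hypothetical miniature in Elixir, purely to show the shape of the idea: one compact structure that describes a whole app, which everything else in the system can share.

```elixir
defmodule MiniBlueprint do
  # A hypothetical, drastically simplified blueprint: one structure
  # that describes a whole app. (The real thing is far richer.)
  defstruct name: "",
            entities: [],  # the data the app manages
            views: [],     # how users see and edit that data
            workflows: []  # what happens when the data changes
end
```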
When you prompt Aboard to make software, we follow Guardrail #1—we ask questions before we generate anything at all. Then we get to work on Guardrail #2, and spin up around 30 separate agents that work together to add context and define requirements, constantly interrogating each other’s work. After that, we have lots and lots of content to summarize. When we summarize, we summarize into a well-defined data structure: That’s Guardrail #3.
We do everything we can to fill out every field. As our VP of Engineering Kevin told me, “This turns the LLM from a black box into a smoke-filled, occasionally translucent box.” (The entire company is ironic, not just me.) Our CTO Adam expanded on this: We take all the output from our agents, then transform that into “complex but strictly typed JSON schemas and structured outputs.” We write custom code validations to test correctness, and send errors back to the LLM to see if we can fix them.
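In spirit, that loop looks something like the sketch below. To be clear, this is an illustration, not Aboard’s actual code: the schema check is a toy, and call_llm/1 is a hypothetical stub.

```elixir
defmodule ValidateAndRetry do
  @max_attempts 3

  # Ask the model for structured output, validate it with ordinary
  # deterministic code, and feed any errors back for another try.
  def generate(request, attempt \\ 1) do
    output = call_llm(request)

    case validate(output) do
      :ok ->
        {:ok, output}

      {:error, problems} when attempt < @max_attempts ->
        # Send the errors back to the LLM and try again.
        generate(request <> "\n\nFix these problems: #{inspect(problems)}", attempt + 1)

      {:error, problems} ->
        {:error, {:gave_up, problems}}
    end
  end

  # Custom validations: cheap, strict checks on the structure itself.
  defp validate(%{"name" => name, "entities" => entities})
       when is_binary(name) and is_list(entities) and entities != [],
       do: :ok

  defp validate(_output), do: {:error, ["output did not match the blueprint schema"]}

  # Hypothetical stub standing in for a real LLM client.
  defp call_llm(_request), do: %{"name" => "Light CRM", "entities" => ["Contact"]}
end
```

The important part is that the validation is ordinary code: fast, deterministic, and immune to persuasion.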

Why does this matter? As Alan Perlis said, now we can actually operate on that structure. We can interrogate it, edit it, add new nodes, convert it to software, make documentation, or transform it into a web API. We’ve gone from highly specific nonsense to reasonable, working applications. And similar inputs yield similar outputs.
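Here are three such operations on a plain map shaped like the miniature blueprint above. Again, this is hypothetical, but the pattern is the point: every function reads from and writes to the same structure.

```elixir
# A plain map shaped like the MiniBlueprint sketch above.
blueprint = %{name: "Light CRM", entities: [%{name: "Contact", fields: [:email, :phone]}]}

# Interrogate it: what entities does this app have?
Enum.map(blueprint.entities, & &1.name)

# Edit it: add a new node.
blueprint = Map.update!(blueprint, :entities, &(&1 ++ [%{name: "Deal", fields: [:value]}]))

# Transform it: render a scrap of documentation.
Enum.map_join(blueprint.entities, "\n", fn e ->
  "## #{e.name}: #{Enum.join(e.fields, ", ")}"
end)
```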
While I love our specific approach, this is also what most vibe-coding tools do—we’re all swimming in the same pond. But those tools don’t really nail down their data structures in the same way. They aim to produce “working code,” which is an ambitious and mind-bogglingly vast surface area to target; the user might want to make a 3D game, or they might want to make a CRM.
But I don’t know if there’s a general solution that will allow for consistent, predictable results. My guess is that vibe-coding platforms, just like the big LLM vendors, will start to offer lots of sub-products, like “Game Builder” or “CRM Expert,” that guide and focus the user. Otherwise they can’t offer predictability. Each one of those products will take a long time for those companies to build—but then again, they have a lot of funding. It’ll be wild to watch.
Our company chose a very big but much more tractable target: Business applications. We put in guardrails that lead the user to that destination. As a result, you can’t use Aboard to build a game, but you can build a light CRM, as noted in the recent article about us in Fast Company. (“When I gave them a similar prompt, neither Replit nor Lovable hit all those marks without throwing errors, running out of starter credits, or asking me for upsells.”) Our LLM bills are very low, because we focus on one compact data structure; thinking this way brings discipline and efficiency to the business.
Summing It Up
There are many other sub-guardrails and approaches we’re taking, some of which are pretty novel, and we’re starting to really like our process. It’s getting easier and easier for Aboard’s engineers to expand our system, too. It’s a new way of working—and no one wants to go back to the old way—but it’s also legible to anyone who’s worked in software development over the last 50 years. You still have to do a lot of thinking and planning, and catch as many errors as possible.
Our engineering team is huge on “functional” programming (we use the language Elixir for a lot of stuff). As a result, they think in recursive processes based around one very large data structure—our Blueprint—that is simultaneously code and data.
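As a tiny illustration of that habit (a sketch, not our production code), here’s a recursive walk that counts every node in a nested, blueprint-ish structure:

```elixir
defmodule Walk do
  # Recursively count every node in a nested, blueprint-ish structure.
  # The code takes the same shape as the data it traverses.
  def count(map) when is_map(map),
    do: 1 + (map |> Map.values() |> Enum.map(&count/1) |> Enum.sum())

  def count(list) when is_list(list),
    do: Enum.sum(Enum.map(list, &count/1))

  def count(_leaf), do: 1
end

Walk.count(%{name: "Light CRM", entities: [%{name: "Contact", fields: [:email]}]})
```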
I know this is very, very nerdy, but what I just described—that blend of code, data, and recursion—is basically the LISP programming language of the 1950s. LISP was the original “AI” language, too. The best way we’ve found to make LLMs behave is to bring them into the most advanced programming constructs of the 50s. That isn’t as cool as AGI, I guess. But I think it’s pretty neat.