Notes on Agentic Engineering

Published on

March 27, 2026

https://andamp-1.webflow.io/blog/notes-on-agentic-engineering

For the past year, I have wanted to set up a blog and personal website for myself. I had a particular tech stack in mind, so I could also learn more about databases, DNS, and hosting on a VPS. The specification was clear in my head, but between work and being a parent, I never quite found the momentum to start.

Since coding agents arrived, a growing number of developers have been handing coding tasks off to them, for better or worse, in order to focus on the product or the specification of a feature. I decided to hand the blog idea to a coding agent, hoping it would finally give the project the momentum it needed. The plan was to let the agent write the code while I steered the engineering through prompts, keeping the final say over the code, design, and infrastructure of the project. Unlike vibe coding, this approach relied on my knowledge of the tech stack I had chosen to guide the agent.

I went from local development to a live site in less than a week, working with the coding agent whenever I found time to give it a task. Because I was hesitant to give the agent total control of the project, I stayed in the loop to double-check its output and the functionality of the blog, and I made a point of steering it intentionally. Steering means engineers let the agent drive but can take the wheel at any time. Without steering, an agent might go left at a fork in the road where the engineer would have gone right. The real question is not how much to intervene, but how to give the agent enough clarity that it rarely needs redirecting. I used the following techniques to steer my coding agent toward a blog I am excited to deploy to production.

Spec-Driven Development

If a coding agent receives the prompt “Generate a blog for me where I can write my thoughts,” it will produce a vague result: a generic blog built with whatever tooling it happens to pick. If it is instead given a specification that covers everything from user interface designs to architectural decisions, the output will be specific, coherent, and built to spec. Coding agents highlight a long-standing challenge in the software industry: writing the right code is easier with an unambiguous specification. The specification isn’t just a one-shot prompt, though; it can be the context for every prompt within a project.

The specification and documentation of a project are converging in the prompts sent to coding agents. Including a specification in a prompt or an agent’s context has emerged as a clear technique to improve an agent’s outputs within a given project. Johnny Boursiquot has suggested adding a specification folder of markdown files to a project, to give the agent richer context about it. Along similar lines, Laura Tacho argues that AI has incentivized developers to invest in their documentation:

Docs immediately influence the quality of AI-generated code. Capturing intent and writing specs has real impact on the work you’re doing today.

Teams that previously shrugged off documentation now have a concrete reason to write it. As an example of what AI can do with good specifications, an engineer at Cloudflare was able to create his own flavor of Next.js, because Next.js is so well specified. Specifications and documentation don’t just help agents write code; they help agents test it too.
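
To make the specification folder idea concrete, here is a minimal sketch of how its markdown files could be gathered into the context for a prompt. The `spec/` folder name, the `build_context` helper, and the prompt layout are all assumptions for illustration; they do not come from Boursiquot’s suggestion or from any particular agent’s API, and many agents have their own conventions for reading project files.

```python
from pathlib import Path


def build_context(spec_dir: str, task: str) -> str:
    """Concatenate every markdown file in spec_dir, then append the task.

    The folder layout and the prompt format are assumptions for illustration.
    """
    sections = []
    # Sort the files so the context is identical from one run to the next.
    for path in sorted(Path(spec_dir).glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {path.name}\n\n{body}")
    spec = "\n\n".join(sections)
    return f"# Project specification\n\n{spec}\n\n# Task\n\n{task}\n"
```

Calling `build_context("spec", "Add an about page.")` would return a single string with the specification files in alphabetical order, followed by the task. Whether that string is pasted into a prompt or the agent reads the folder on its own depends on the tool.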

Red, Green, then Click Around

Developers who have deployed untested code to production all have something in common: urgent messages from stakeholders about regressions. Coding agents can make this experience less likely, but only if directed to do what so many teams skip: write tests. Despite the known benefits of testing, many teams treat it as a nice-to-have and spend the time on more features instead. Merging untested code written by a coding agent is a surefire way to get an e-mail from a frustrated stakeholder.

It is easy to imagine how quickly skipping tests leads to developer paralysis when coding agents generate all or most of the code. Developers who neither know the code nor have tests to guide them would have no confidence to make changes. The argument that there is not enough time to write tests no longer stands. There will still be time pressure, but the time it takes to write tests and set up testing infrastructure has dropped significantly. Simon Willison notes:

The old excuses for not writing [tests]— that they’re time consuming and expensive to constantly rewrite while a codebase is rapidly evolving — no longer hold when an agent can knock them into shape in just a few minutes.

One of the early tasks coding agents excelled at was writing tests, and writing tests is one of the clearest ways to get reliable output from a coding agent.

Test-Driven Development

The concept of test-driven development (TDD) has long been debated but rarely resolved. Some developers prefer to write the code first and test later, while others value the reliability of TDD. But there are real risks when agents skip tests, as Willison notes:

A significant risk with coding agents is that they might write code that doesn’t work, or build code that is unnecessary and never gets used, or both.

TDD is well suited to coding agents, as it gives them clear criteria for success and failure. While developers might not write all of the code themselves in the near future, the specification and its translation into functioning code will be paramount, which reinforces the case for spec-driven development. If agents write the tests first based on a specification, run them to confirm they all fail (red), and then make the simplest change that turns them green, developers can have more confidence in the generated code. These tests become a kind of documentation of their own and can give the agent context in future sessions. As with TDD done by hand, developers can tell agents to take small or large steps depending on the task, but the option to move slowly and with intention is always there. Writing automated tests is the first step towards a reliable application, but manual testing often reveals bugs the automated tests missed.
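
The red and green steps described above can be sketched with a small example. The `slugify` helper, which turns a post title into a URL slug, is hypothetical and not taken from the blog’s code; it only illustrates the order of work an agent could be asked to follow.

```python
import re
import unittest


# Step 1 (red): the tests are written first, from the specification.
# Run before slugify exists, every one of them fails.
class SlugifyTest(unittest.TestCase):
    def test_lowercases_and_hyphenates_words(self):
        self.assertEqual(
            slugify("Notes on Agentic Engineering"),
            "notes-on-agentic-engineering",
        )

    def test_drops_punctuation(self):
        self.assertEqual(
            slugify("Red, Green, then Click Around"),
            "red-green-then-click-around",
        )

    def test_trims_leading_and_trailing_separators(self):
        self.assertEqual(slugify("  Hello, World!  "), "hello-world")


# Step 2 (green): the simplest implementation that satisfies the tests.
def slugify(title: str) -> str:
    # Collapse each run of non-alphanumeric characters into one hyphen,
    # then remove hyphens left at either end.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
```

Saved under a hypothetical name such as `test_slugify.py`, the suite runs with `python -m unittest test_slugify.py`. The failing run comes from executing the tests before step 2 is written; the passing run comes after.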

Test the Native Interface

Testing a system using its native interface is one of the most reliable ways to find real bugs. Developers spend countless hours manually testing their colleagues’ code as well as their own. This remains true for code generated by agents, but developers can give agents command line interfaces (CLIs) as tools for testing code changes within an application’s native environment. An interesting example of this is testing browser applications.

Testing the user interface of a web application has long been a pain point for frontend developers, and the pain of writing automated browser tests can lead to no tests at all. Setting up a testing framework for local use as well as for CI pipelines can turn into its own role at a company, which speaks to the complexity and time commitment of these setups. Testing frameworks such as Playwright, developed by Microsoft, arose from this pain and are now being used as tools that let coding agents automate browser tests.

Developers can give coding agents the Playwright CLI, or programs built on Playwright such as Vercel’s agent-browser, which let coding agents check that the code produces the desired outcomes in the interface. This setup is particularly advantageous, as developers can use coding agents to write the initial tests, update these tests based on changes to the interface, and execute tests and fix bugs in a closed loop. This loop enables agents to generate, test, and improve code using command line tools, which lends credibility to the agent’s code. The idea of giving a framework a CLI is a powerful pattern which we will explore in the next section.
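
The closed loop of running tests and fixing bugs can be pictured with a rough sketch. The `run_until_green` function and its `fix` callback are hypothetical stand-ins: `fix` represents whatever hands the failure output back to the agent, and no particular agent API is implied. The test command is a parameter; for a Playwright suite it could be `npx playwright test`.

```python
import subprocess
from typing import Callable, Sequence


def run_until_green(
    test_command: Sequence[str],
    fix: Callable[[str], None],
    max_fixes: int = 3,
) -> bool:
    """Run the tests, hand any failure output to fix(), and run them again.

    test_command could be something like ["npx", "playwright", "test"].
    fix is a hypothetical stand-in for asking the agent to change the code.
    """
    for attempt in range(max_fixes + 1):
        result = subprocess.run(test_command, capture_output=True, text=True)
        if result.returncode == 0:
            return True  # green: the test command exited successfully
        if attempt < max_fixes:
            # red: pass the runner's output back so the agent can react to it
            fix(result.stdout + result.stderr)
    return False  # still failing after max_fixes attempts; a human should look
```

Capping the number of fix attempts is a deliberate choice in this sketch: it keeps the engineer in the loop when the agent cannot get the tests to pass on its own.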

Give Your Framework a CLI

Before coding agents, developers used web frameworks such as Django, Spring Boot and Ruby on Rails to increase their output. Someone with deep knowledge of one of these frameworks could scale a product with a small team, because web frameworks take repetitive tasks away from developers, allowing them to focus on the product. One particular strength of Ruby on Rails is its command line interface, which gives developers the option to generate new features with a single command. Developers come up with a data structure they want to implement, and Rails generates models, views, and controllers for the new structure. Hand those same capabilities to a coding agent and the combination becomes something else entirely.
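
As a rough illustration of going from a data structure to a single generator command, the sketch below builds the arguments for Rails’ scaffold generator. The `Post` resource, its fields, and the `scaffold_command` helper are hypothetical and not taken from my blog; the `bin/rails generate scaffold NAME field:type` form follows the Rails command line documentation.

```python
import shlex


def scaffold_command(resource: str, fields: dict[str, str]) -> list[str]:
    """Build the argv for Rails' scaffold generator from a field-to-type mapping."""
    columns = [f"{name}:{column_type}" for name, column_type in fields.items()]
    return ["bin/rails", "generate", "scaffold", resource, *columns]


# Hypothetical blog post resource, for illustration only.
command = scaffold_command(
    "Post",
    {"title": "string", "body": "text", "published_at": "datetime"},
)
print(shlex.join(command))
# prints: bin/rails generate scaffold Post title:string body:text published_at:datetime
```

Run inside a Rails project, a command of this shape generates a migration, model, controller, and views for the resource, which is the kind of conventional output an agent can then build on.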

I used Ruby on Rails to build my blog, and as soon as I asked my coding agent to build new pages for the website, I saw the power of an opinionated framework with a CLI in combination with an agent. The agent knew the Rails CLI, and I knew which levers to pull, how I wanted the code to look, and which libraries I wanted to use. Instead of generating files and code without constraints, the agent generated the conventional Rails files and patterns. The great danger of allowing agents to act outside of a framework or set of constraints is that they can go in any direction they want. Structure, strong opinions, and taste are more valuable than ever in a world where code is cheap.

Conclusion

The concepts discussed in the previous sections are not new in the software industry. What is new is figuring out how they can improve the outputs of coding agents. In the simplest terms, it is important to give coding agents a specification, which they can use to write tests first to drive development, and then use CLIs to build and execute actions in a reproducible way.

What most surprised me about letting my coding agent write the code was a newfound ability to focus on what I wanted to build next. Instead of feeling overwhelmed, I felt empowered to build the application I had in my imagination. While the stakes for this project were low, and I built it alone, I was able to get a feel for how to work with a coding agent to build software that matched my vision. I’m looking forward to building my next idea, perhaps a personalized CLI, with a coding agent.

If you’re exploring agentic engineering in your own work, or have thoughts on where it’s heading, we’d love to hear from you.
