Zencoder… and the art of AI software engineering
Zen and the Art of Motorcycle Maintenance was Robert Pirsig’s seminal book that looked into the values of life.
By uncommon coincidence, perhaps, AI coding orchestration company Zencoder has now launched the Zenflow desktop app, a free orchestration platform designed to move the industry from the perceived value of so-called vibe coding to what it defines as AI-first engineering.
Zenflow is essentially an orchestration tool that turns chaotic AI interactions into repeatable workflows with multi-agent verification and spec-driven development.
Chat interfaces have popularised much of the move towards AI coding. In so-called vibe coding, a developer "describes" the functionality they want from an application and sets agentic controls off on a journey to deliver it in the form of workable, secure code. Zencoder argues, however, that this process has hit a ceiling.
Slop code in the road
Why so? Because, says the firm, uncoordinated agents produce “slop code” that looks correct but fails in production or degrades with iteration.
“Chat-based user interfaces were fine for copilots, but they break down when you try to scale,” asserts Andrew Filev, CEO of Zencoder. “Teams are hitting a wall where speed without structure creates technical debt. Zenflow replaces what we can define as ‘prompt roulette’ with an engineering assembly line where agents plan, implement and, crucially, verify each other’s work.”
Internal data from Zencoder’s research team suggests that replacing standard prompting with Zenflow’s orchestration layer improved code correctness on average by about 20% and, in some cases, by up to around 5%.
The four pillars of AI orchestration
Keen to use its latest platform release as something of a defining moment on the company’s roadmap, Filev and team have laid down what they define as the four pillars of AI orchestration.
Pillar #1 – Structured AI Workflows
In high-performing engineering teams, quality might be said to come from repeatable processes. Zenflow applies the same principle to AI implementation and subsequent orchestration, replacing ad-hoc prompting with disciplined workflows (for example, Plan > Implement > Test > Review), complete with smart defaults and full customisation.
Pillar #2 – Spec-Driven Development
To prevent iteration drift, AI agents are anchored to evolving technical specifications. Errors are caught at the spec level – before code is written – reducing downstream rework and eliminating “code slop.”
Pillar #3 – Multi-Agent Verification
Also known as the “committee approach”, Zenflow uses model diversity (e.g. having Claude critique code written by OpenAI models) to eliminate blind spots. Research indicates this cross-verification produces quality improvements comparable to a next-generation model release, but available immediately.
Pillar #4 – Parallel Execution
Developers can move from chatting with a single bot to commanding a fleet – implementing new features, fixing bugs and running refactors simultaneously in isolated sandboxes.
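The "fleet" idea maps naturally onto running independent workflows concurrently, each against its own working copy. The snippet below is a minimal sketch using Python's standard library; `run_workflow` is a placeholder for a full Plan > Implement > Test > Review run, and the sandbox path is purely illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def run_workflow(task: str) -> str:
    # Placeholder: a real run would clone the repo into an isolated
    # sandbox and execute the full workflow there.
    sandbox = f"/tmp/sandbox-{abs(hash(task)) % 1000}"
    return f"{task}: done in {sandbox}"

tasks = ["add feature X", "fix bug #123", "refactor auth module"]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_workflow, tasks))

print(len(results))  # 3
```

Isolation matters here: because each workflow touches its own sandbox, a failed refactor cannot corrupt the feature branch running beside it.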
“The hard part of engineering isn’t writing code; it’s understanding intent and maintaining quality,” said Will Fleury, head of engineering at Zencoder. “By moving to an orchestrated SDD workflow, our internal team now ships features at nearly twice the pace of our pre-AI baseline, with agents handling the vast majority of implementation.”
Zenflow is model-agnostic, supporting major providers including Anthropic, OpenAI and Google Gemini. The desktop app provides a command centre for complex multi-agent projects, while updated plugins for VS Code and IntelliJ bring workflow capabilities directly into the IDE.
Deeper dive
The Computer Weekly Developer Network spoke to Zencoder CEO Filev to gain some deeper insight into the mechanics of the platform on offer here and the software application development culture the company seeks to seed and propagate.
CWDN: You advocate spec-driven development (SDD) to prevent iteration drift. Does this approach translate across all current software engineering methodologies from agile and onwards, or does it work best in certain teams with specific structures?
Filev: People usually associate specs with heavier, waterfall-driven methodologies. In this case, there is “terminology overload” as SDD is used to fix bugs and implement features in agile teams day in and day out.
Many readers have heard of the “chain of thought” prompting technique, where a model is asked to think before acting. This behaviour has recently been trained into new “thinking” models, which produce that reasoning chain without explicit prompting. In Spec-Driven Development, the agent is asked to produce a spec based on your prompt. This unlocks several key advantages:
- Research: It allows the agent to do research on your project and collect relevant information. Remember, by default, LLMs are stateless and know nothing about your specific codebase.
- Context: When agents work for a while, they start to “forget” things. A spec acts as a succinct context anchor that the agent can always refer back to.
- Clarification: Good SDD tooling, like Zenflow, allows the agent to ask clarifying questions if, and only when, necessary.
- Review: When the spec is ready, you can review it and align with the agent on the work to be done. If you catch an error early, you save time and tokens. It’s also much easier to review a short spec than a huge diff with 20 files. This helps you maintain understanding and ownership of the code, which is vital in the mid-to-long term.
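The four steps above can be sketched as a loop with a human review gate between spec and implementation. This is a hypothetical illustration; `generate_spec` and `implement` stand in for model calls, and `approve_spec` represents the human review step, none of it Zenflow's real interface.

```python
# Minimal spec-driven development loop: research feeds a spec, a human
# reviews the spec, and only an approved spec reaches implementation.

def generate_spec(prompt: str, codebase_notes: str) -> str:
    # The agent researches the project first, then drafts a spec.
    return f"SPEC: {prompt}\nContext: {codebase_notes}\nSteps: 1) ... 2) ..."

def approve_spec(spec: str) -> bool:
    # Human-in-the-loop gate: cheaper to review a short spec
    # than a 20-file diff after the fact.
    return spec.startswith("SPEC:")

def implement(spec: str) -> str:
    # The spec travels with every agent turn as a context anchor.
    return f"code generated against:\n{spec}"

spec = generate_spec("add rate limiting", "FastAPI service, Redis available")
code = implement(spec) if approve_spec(spec) else ""
```

The review gate is where errors get caught at the spec level, before any code exists to rework.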
So, SDD is about giving AI the context it lacks to be effective within any methodology. In a human-only team, a lot of context is implicit: we know what “good” looks like. AI doesn’t. Without a spec, AI relies on “Prompt Roulette,” where you gamble on the output. By implementing SDD in Zenflow, we see teams moving from constantly rewriting agent code to barely touching it, shipping features at nearly 2x the pace of their pre-AI baseline.
CWDN: The committee approach is a neat idea. Are you worried about the major model developers knowing that multi-agent verification happens and, at some level, starting to engineer in controls that are aware of that interconnection point, creating abnormal skews of some kind?
Filev: I would consider that unlikely for a variety of reasons. First, LLMs are stateless and the verification happens in the orchestration layer above, in Zenflow. To the model, both the input submitted for review and the feedback received look like standard user messages.
Second, deliberately sabotaging the work of paying customers would be a questionable legal and moral move without an obvious upside. A model provider’s business is predicated on the successful use of their models, the key word being “successful” right? If anything improves that success rate on top of the model, it’s in their business interest to be supportive.
CWDN: Although you’re saying that your technology represents a chance to go beyond vibe coding, your head of engineering still says that the secret to software development is “understanding intent”… so can you see some element of vibe still in the groove in future?
Filev: My favourite company t-shirt features a button that reads “Generate Good Vibes.” But speaking seriously about understanding intent – asking the right questions and thinking about them ahead of time, be it UX or architecture, is indeed a cherished skill and one area where we as humans still hold the frontier over AI. This is what separates brilliant products from average ones and durable architectures from rewrites.
So, back to the vibe coding question: the king is dead, long live the king. Vibe coding unlocked creativity by giving everyone an opportunity to quickly prototype ideas. Now, it’s time to unlock that creativity in engineering teams working on production solutions.
CWDN: Just as multi-cloud (i.e. hybrid) has become the de facto standard, do you feel that multi-agent should also be a core working principle in AI-first software development?
Filev: Absolutely. We believe “AI orchestration” is the new software category required to scale AI use in production engineering organisations. Repeatable outcomes require repeatable workflows, paving the way for multi-agent pipelines.
If you get to a repeatable outcome with your AI workflow, the next immediate thought is, “Why don’t I run five of these at the same time?” – which leads to additional orchestration load. Just as cloud computing required orchestration tools (like Kubernetes) to manage complexity, AI-First Engineering requires multi-agent orchestration to manage reliability.

