Do one thing well: the case for small open source testing tools

This is a guest post by Mikhail Golikov, a Hove-based software engineer who works as the sole QA on a backend team at a high-load e-commerce platform.

A software engineer, he drove regression testing for PHP runtime upgrades across more than seven teams’ services. He builds small, single-purpose open-source Python testing tools, among them postman2pytest and secure-log2test, and writes about testing at scale from one person’s seat.

His work turns the artefacts teams already have, the collections, logs and traces, into tests that actually run in CI.

Golikov writes in full as follows…

When a runtime upgrade has to land across seven teams’ backend services and you are the one regression-testing it, the sensible-sounding move is to build a framework.

One place for everything: the fixtures, the helpers, the config, the house style. I started down that road more than once. Each time I stopped, and I ended up instead with a handful of small open source tools that each do a single job and know nothing about one another.

On a solo maintenance budget, that has aged far better than any framework I could have built.

Deliberately boring, in a good way

I can count five of them, all Python and deliberately boring.

postman2pytest turns a Postman collection into a runnable pytest module. secure-log2test does the same from a Kibana or Splunk log export, scrubbing auth headers and secret-looking fields before they reach the output. pytest-conversational asserts on multi-turn chatbot dialogue with plain rules and ships with an empty dependency list.

pytest-resilience-agent injects gateway failures, timeouts and rate limits so you can test how an AI-backed service behaves when the things it depends on misbehave. phoenix2pytest turns captured traces into test cases. Five inputs, one output shape: a test file you can read, run and commit.

None of them imports one another. There is no shared core, no plugin registry, no platform. If you only need to get your Postman drawer into CI, you install one tool and never hear about the other four.

That separation is the point, and it matters more for open source than inside a company.

Adoption before acceptance

Adoption comes first.

A stranger can pick up one tool and get it running in CI in an afternoon, without buying into my worldview. A framework asks people to adopt your abstractions, your naming, and your idea of how testing should feel. A small tool asks for one command and a file path. The distance to a first successful run is the single biggest thing that decides whether an open-source project gets used at all, and small tools keep that distance short.

Then there is composition without coupling. The tools share a convention, not a codebase: whatever goes in, what comes out is committable pytest. So you can chain them in your own pipeline: logs to tests in one place, traces to tests in another, without any of them depending on the others.

The user composes; the tools stay ignorant of each other. Unix pipes have worked this way for decades, for the same reason: agree on the format, keep the parts independent.

Know your blast radius

Maintenance is the honest one. I am one person, and a framework fails as a unit. One bad release, one leaky abstraction, and everything built on it is down at once. Five small tools fail in isolation. Each has its own release cadence, its own issue tracker, its own blast radius. A bug in the log converter cannot break the resilience plugin, because the two have never met. When you maintain open source in the margins of a full-time job, that isolation is what keeps the tools people rely on standing.

Golikov: Build the smallest thing that stands on its own. Give it one job and let people compose the rest.

Small is not free, and it would be dishonest to pretend otherwise. There is duplication: each tool carries its own argument parsing, its own README, its own release workflow, and I have fixed the same class of bug in more than one of them.

There is no unified API, so anyone who wants all five has five things to learn rather than one. And discovery is harder, because five repositories are easier to miss than one well-marketed framework. I take that trade every time, because the costs land on me, the maintainer, while the benefits land on the person trying to use the thing. For open source, that is the right way round.

The same instinct runs inside the tools.

pytest-conversational ships with no runtime dependencies on purpose: nothing to install, audit, or outlive. Not every tool can be that strict, but the leaning is deliberate. Every dependency you add to an open source tool is one you are asking every adopter to trust and every future version of Python to keep working with.

The smaller that surface, the longer the tool stays useful without you.

Do one thing…

The framework urge never goes away. It looks like tidiness, and it promises that everything will live in one well-organised place.

But for open source, tidiness for the author is friction for the user, and a monolith is a single point of failure for a maintainer who is also asleep, on holiday, or busy with a day job.

So then, build the smallest thing that stands on its own. Give it one job and one honest README, then let people compose the rest. Do one thing well has aged into the most practical advice I know for keeping open source alive when you are the only one keeping it.