This is a guest post for the Computer Weekly Developer Network written by Taariq Lewis, CEO and founder of Promise, a company that delivers onchain credit reputations worldwide using digital signatures.
For businesses that want to conduct confidential, cross-border transactions, the company uses proxy re-signature cryptography to build a network with credit and payment automation features.
The full title of Lewis’s intended piece here reads: The three pillars of decentralized protocol testing: Test Again, Test Better, Test Again.
Promise’s Lewis contends that when complex software development is coordinated across remote teams, avoiding dreaded delays is a challenge. In crypto ecosystems, failures to ship can be detrimental, damaging community perception as well as token value.
Lewis writes as follows…
The successful release of the Cosmos Stargate software upgrade is a testament to a hard-won victory over this (above stated) challenge. Stargate introduces some of the most groundbreaking features the Cosmos ecosystem has seen so far. This massive task was taken on by seven different entities around the globe with a common passion for the Internet of Blockchains. After months of collaboration, the release candidate for Stargate is finally out. However, releasing software is just half of the equation.
The other half is testing it.
With so many new features, including state sync, cosmovisor, protobuf, IBC and more, it is crucial that these features are tested in the hands of those who will use them the most.
When anyone says that a software release contains ‘breaking changes’, it often means that some behaviour has changed without a full understanding of what these changes entail. Even when engineers conduct thorough testing, there’s always a risk that a bug is lurking deep beneath the surface. Building open source distributed systems software that forms the foundation for production blockchain networks securing billions of dollars in market capitalization can be messy and complex.
Here is what we have learned so far:
Run simulations & run profiles
As blockchain protocols grow in complexity due to speed of innovation and growing interchain interaction requirements, we need to ship software that will break compatibility with prior features. Breaking changes are common in software development. However, when we speak of value transfer systems involving cryptocurrency and digital assets, breaking changes are liable to have impacts that remain unseen until a substantial number of transactions have completed. We assume that digital transaction volume increases over time. As such, it is very difficult to test for these hidden issues, even with extensive integration and unit tests.
An informative simulation and profiling test of Stargate requires that we run a version of the protocol for several thousand blocks. Subtle errors in the decimal number implementation, fee calculation, or state machine transitions can be detected by generating and sending randomised messages. The goal of simulations is to detect failures that could halt a chain and to provide as much detail as possible, such as log files and the application state at which a failure occurred. Profiling node starts also helps highlight performance regressions introduced with new code.
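To make the idea of randomised simulation concrete, here is a minimal sketch in Python. It is not the Cosmos SDK simulator: the `ToyLedger` class, its 1% integer fee, the account names and the `run_simulation` helper are all hypothetical, chosen only to show the shape of the technique. Seeded random messages are applied block after block, an invariant is checked after every message, and the first violation is reported together with the seed, height, message and state needed to reproduce it.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyLedger:
    """Hypothetical stand-in for an application state machine."""
    balances: dict
    fee_pool: int = 0

    def apply_transfer(self, sender, recipient, amount):
        # Reject invalid messages and leave the state untouched.
        if amount <= 0 or self.balances[sender] < amount:
            return False
        fee = amount // 100  # assumed 1% fee in integer units, rounded down
        self.balances[sender] -= amount
        self.balances[recipient] += amount - fee
        self.fee_pool += fee
        return True

    def total_supply(self):
        return sum(self.balances.values()) + self.fee_pool


def run_simulation(seed, num_blocks, msgs_per_block=5):
    """Apply random transfers; return a failure report or None."""
    rng = random.Random(seed)  # seeded so any failure can be replayed
    accounts = ["a", "b", "c", "d"]
    ledger = ToyLedger({name: 1_000 for name in accounts})
    expected_supply = ledger.total_supply()

    for height in range(1, num_blocks + 1):
        for _ in range(msgs_per_block):
            sender, recipient = rng.sample(accounts, 2)
            # The range deliberately includes invalid and oversized amounts.
            amount = rng.randint(-10, 1_500)
            ledger.apply_transfer(sender, recipient, amount)

            # Invariants: supply is conserved and no balance goes negative.
            supply_ok = ledger.total_supply() == expected_supply
            balances_ok = min(ledger.balances.values()) >= 0
            if not (supply_ok and balances_ok):
                return {
                    "seed": seed,
                    "height": height,
                    "msg": (sender, recipient, amount),
                    "balances": dict(ledger.balances),
                    "fee_pool": ledger.fee_pool,
                }
    return None
```

Calling `run_simulation(seed=42, num_blocks=2000)` returns `None` when no invariant breaks. Keeping the seed in the failure report is the design choice that matters most here, because it turns a rare failure into a reproducible one.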
Formal verification must exist
Building and verifying software, especially in a multi-entity context, is a delicate dance requiring continuous feedback between protocol researchers and software engineers. At the end of the day, the most important thing is working software and if verification isn’t in service of that, then what’s it really good for?
While formally specifying the protocols has helped us think more clearly about them and surface a number of important bugs, the most valuable artifact is likely to be the model-based testing — complex test cases generated directly from the TLA+ specifications that can be run against any implementation.
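The sketch below illustrates the general shape of model-based testing, under loud assumptions. In the workflow described above the test cases come from TLA+ specifications. Here a small hand-written Python function, `model_step`, stands in for the specification, and the `Channel` class is a hypothetical implementation under test. Every action sequence up to a given length is generated from the model's action set and replayed against both, and the first sequence on which they disagree is returned.

```python
from itertools import product

ACTIONS = ("open", "send", "close")


def model_step(state, action):
    """Hypothetical reference model; state is (is_open, sent_count)."""
    is_open, sent = state
    if action == "open":
        return (True, sent)
    if action == "send":
        # Sending on a closed channel is a no-op in this model.
        return (is_open, sent + 1) if is_open else state
    if action == "close":
        return (False, sent)
    raise ValueError(f"unknown action: {action}")


class Channel:
    """Hypothetical implementation under test."""

    def __init__(self):
        self.is_open = False
        self.sent = 0

    def apply(self, action):
        if action == "open":
            self.is_open = True
        elif action == "send":
            if self.is_open:
                self.sent += 1
        elif action == "close":
            self.is_open = False
        else:
            raise ValueError(f"unknown action: {action}")

    def snapshot(self):
        return (self.is_open, self.sent)


def generate_traces(max_len):
    """Yield every action sequence of length 1..max_len."""
    for length in range(1, max_len + 1):
        yield from product(ACTIONS, repeat=length)


def check_against_model(max_len=4):
    """Return the shortest diverging trace prefix, or None if none is found."""
    for trace in generate_traces(max_len):
        state = (False, 0)
        impl = Channel()
        for step, action in enumerate(trace):
            state = model_step(state, action)
            impl.apply(action)
            if impl.snapshot() != state:
                return trace[: step + 1]
    return None
```

Because the traces depend only on the model, the same generated cases could in principle be replayed against a second implementation written in another language, which is what makes this kind of artifact reusable.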
As decentralised economic systems proliferate, we may have to extend our verification to economic performance verification. Do our economic designs execute with specificity and do we understand the possible economic issues that may arise?
Serialisation is the ultimate implementation detail
It is very important to separate serialisation code from the rest of your logic, pushing it as far to the edges of your software as possible. This frees developers from having to worry too much about serialisation, makes it easier to provide backwards compatibility and simplifies the work of upgrading the serialisation in the future.
Check out how tendermint-rs implemented its domain types, separate from its protobuf types, as an example of this.
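As a rough illustration of that separation, and not a reproduction of the tendermint-rs code, the Python sketch below keeps a validated domain type apart from its wire encoding. The `Transfer` type, the field names and the use of JSON in place of protobuf are all assumptions made for the example. Only `encode_transfer` and `decode_transfer` know about bytes, while domain logic such as `fee_for` works on the domain type alone.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Domain type: validated on construction, unaware of any wire format."""
    sender: str
    recipient: str
    amount: int

    def __post_init__(self):
        if self.amount <= 0:
            raise ValueError("amount must be positive")
        if self.sender == self.recipient:
            raise ValueError("sender and recipient must differ")


# --- Edge of the system: the only code that touches the wire format ---

def encode_transfer(transfer: Transfer) -> bytes:
    """Map the domain type to a hypothetical JSON wire format."""
    wire = {
        "from": transfer.sender,
        "to": transfer.recipient,
        "amt": str(transfer.amount),  # amounts travel as strings on the wire
    }
    return json.dumps(wire, sort_keys=True).encode("utf-8")


def decode_transfer(raw: bytes) -> Transfer:
    """Parse wire bytes; validation happens in the domain constructor."""
    wire = json.loads(raw.decode("utf-8"))
    return Transfer(
        sender=wire["from"],
        recipient=wire["to"],
        amount=int(wire["amt"]),
    )


# --- Core logic: depends on the domain type only ---

def fee_for(transfer: Transfer) -> int:
    """Assumed 1% fee, rounded down, computed without any wire knowledge."""
    return transfer.amount // 100
```

With this layout, changing the wire format means rewriting the two edge functions, while `Transfer` and `fee_for` stay as they are. Invalid input is also rejected at the boundary, since decoding has to go through the validating constructor.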
Distributed software systems are the foundation for many critical infrastructures in our society, spanning financial services, transport, healthcare, cloud services and more. Building and maintaining them is difficult, expensive and error-prone. Performance testing is continuously evolving alongside modern software development, and it is becoming more and more vital in building confidence in distributed software.
Building assurance tools that help ensure the software does what it is intended to do, and that smooth the transition from whitepapers to proofs of concept to tangible production systems that hold funds, is paramount for the success of cryptoeconomic systems.
At Cosmos, we are always pushing the boundaries of great protocol software development. Community engagement is important to us, especially when it comes to software testing. If you’d like to share your thoughts on this topic, please join our Cosmos Stargate community and help us test our protocols further.