Trent McConaghy: Blockchain Infrastructure Landscape: A First Principles Framing

Share with:

FacebookFacebookTwitterTwitterGoogleGoogleTumblrTumblrStumbleUponStumbleUponLinkedInLinkedInRedditRedditPinterestPinterestPocketPocketDiggDiggEmail this pageEmail this pagePrint this pagePrint this page WhatsappWhatsapp

How are Ethereum, IPFS/Filecoin, and BigchainDB complementary? What about Golem, Polkadot, or Interledger? I often get questions like this. So, I decided to write about how I answer those questions, via a first-principles framing.

The quick answer:

…there’s no one magic system called “Blockchain” that magically does everything. Rather, there are really good building blocks of computing that can be used together to create effective decentralized applications. Ethereum can play a role, BigchainDB can play a role, and many more as well. Let’s explore…

Background

The elements of computing are storage, compute, and communications. Mainframes, PCs, mobile, and cloud all manifest these elements in their own unique ways. Specialized building blocks emerge to reconcile tradeoffs within a given element.

For example, in the storage element we have both file systems and databases, where file systems are for storing blobs like mp3s with a hierarchy of directories and files, and databases are for storing structured metadata with a query interface like SQL [1]. In the centralized cloud, we might use Amazon S3 for blob storage, MongoDB Atlas for databases, and Amazon EC2 for processing.

This article focuses on the Blockchain landscape: the blocks for each element of computing, and some examples of systems manifesting each block. For each block, I will focus on being illustrative over thorough.

Blockchain Building Blocks

Here is each element of computing, with related decentralized building blocks:

  • Storage: token storage, database, file system / blobs
  • Processing: stateful business logic, stateless business logic, high performance compute
  • Communications: connect networks of data, of value, and of state

Blockchain Infrastructure Landscape

Blockchain technology is manifesting in each block, as this image shows:

Storage

The fundamental computing element of storage has the following building blocks.

Token storage. Tokens are stores of value (e.g. assets, securities) whether it’s Bitcoins, air miles, or digital art copyright. The main actions on a token storage system are to issue and transfer tokens (with many variants), while preventing double-spends and the like.

Bitcoin and Zcash are two prominent “pure play” systems focusing solely on tokens. Ethereum happens to use tokens in service towards its mission of being a world computer. These are all examples of tokens given out as internal incentives to run the network infrastructure.

Other tokens aren’t internal to a network to power the network itself, but are used for incentives in a higher-level network where the lower-level infrastructure actually stores the tokens. One example is ERC20 tokens like Golem (GNT) running on top of the Ethereum mainnet. Another example is Envoke’s IP licensing tokens, running on the IPDB network.

Finally, I have listed a “.*” to illustrate that most Blockchain systems have a mechanism for token storage.

Database. Databases specialize in storing structured metadata, for example as tables (relational DB), document stores (e.g. JSON), key-value stores, time series, or graphs; and then rapidly retrieving that data via queries (e.g. SQL).

Traditional distributed (but centralized) databases like MongoDB and Cassandra routinely store hundreds of Terabytes and even Petabytes of data, with throughput that can exceed 1 million writes per second, like here.

Query languages like SQL are profound because they separate implementation from specification, and are therefore not bound to any particular application. SQL has been a standard for decades. This is why the same database system can be used across many different industries.

Put another way:

…to generalize beyond Bitcoin to more applications without any application-specific code, you don’t need to go all the way to Turing completeness. You just need a database. This has corresponding benefits in simplicity and scale. There are still great reasons to have Turing completeness in some places; we discuss this further in the “decentralized processing” section.

BigchainDB is decentralized database software; specifically a document store. Being built on MongoDB (or RethinkDB), it inherits the querying and scale of Mongo. But it also has Blockchain-y characteristics like decentralized control, tamper-resistance, and token support. IPDB is a public net instance of BigchainDB, with governance.

Also in the Blockchain space, we can think of IOTA as a time-series database, if we squint a bit.

File system / data blob storage. These are systems to store large files (movies, mp3s, large datasets), organ