Blockchain, Explained

by John Bezark

Introduction

It is no exaggeration to say that the blockchain is one of the most influential and misunderstood technologies on the World Wide Web today. If you’re reading this, odds are you have heard of blockchains and know they are an important part of contemporary internet infrastructure, but you might be fuzzy on the details of what a blockchain actually IS or DOES (don’t worry, I am too). However, understanding what blockchains are and how they function is key to understanding the mechanisms behind many important trends in recent internet history. Digital currency exchanges, inventory management, art collection and auctioning, decentralized collectives and organizations, and much more have all been built on top of blockchains. Each implementation of a blockchain comes with its own set of design choices and ethical implications, so it’s worthwhile to understand how blockchains typically work so you can make more informed decisions when engaging with them.

A Very Brief Overview and History

A blockchain is simply a distributed database: a constantly growing list of records, called blocks, that are linked together using cryptography.

The blockchain was originally articulated in David Chaum’s 1982 dissertation, Computer Systems Established, Maintained, and Trusted by Mutually Suspicious Groups.

Some of the key components:

  • Public and Private Key Encryption
  • Blocks and chaining blocks together

Public and Private Key Encryption is a cryptographic method of sending information securely back and forth. Essentially, a pair of mathematically linked keys lets you take a piece of data and encrypt it with your Secret Key, after which anyone with the matching public key can decrypt the message. This allows many people to read your message but ensures that only you could have written it, which is the idea behind a digital signature. More on Public Private Key Encryption
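To make the "many can read, only one can write" idea concrete, here is a minimal sketch using a toy textbook RSA key pair. The tiny primes, the helper names (`digest`, `sign`, `verify`), and the trick of shrinking the hash to fit the toy modulus are all assumptions made purely for illustration; they are wildly insecure, and real systems rely on vetted libraries and far larger keys.

```python
import hashlib

# Toy textbook RSA parameters (hypothetical and insecure: illustration only).
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """Hash the message and shrink it into the range of the toy modulus."""
    h = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % N


def sign(message: str) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone holding the public values (E, N) can check the signature."""
    return pow(signature, E, N) == digest(message)


signature = sign("Alice pays Bob 5 coins")
print(verify("Alice pays Bob 5 coins", signature))   # True
```

Checking the same signature against a different message will almost always fail, although with a modulus this small an accidental match is possible, which is one reason real keys are enormous.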

Blocks are just what they sound like- discrete chunks of information. Chaum described how they could be linked together with cryptography to create a chain of information that was very difficult to alter (more on that below.)

The idea of cryptographically linked chains of blocks then sort of sat around for more than 25 years until “Satoshi Nakamoto” (who apparently may or may not be a real person…) coupled the idea with an electronic currency in 2008 and invented Bitcoin.

So how does the blockchain actually work?

The best blockchain explainer I’ve found is hands down the 3Blue1Brown YouTube video and accompanying article. It explains the blockchain through the lens of Bitcoin, but is still a very helpful guide.

When thinking about the fundamentals of how a blockchain works, it’s really important to remember that this is a DECENTRALIZED technology, meaning there is no one person or institution in charge of it. It is an agreed upon protocol- everyone who is using this protocol agrees to follow the conventions of the protocol and they also agree to reject any messages that do not follow the conventions of the protocol. 

Bitcoin uses all of this chained together data to create a public ledger. This is essentially just a record of all the different bitcoin transactions in history, but it is the heart of bitcoin itself. By making this ledger publicly available, the currency gains its value organically: there is no governing body setting the price of a bitcoin, but because the ledger is public, everyone knows and agrees upon the history of bitcoin transactions and the current distribution of those coins. This shared ledger is what gives users of bitcoins trust in the fact that they can freely exchange coins with total strangers as often as they like without fear of fraud.

Since blockchains are decentralized, they are designed to help structure streams of information coming in from a variety of different sources all at once. 

In order for any blockchain to work, it needs to have a couple of key ingredients:

  • A mechanism for linking information together in a chain that has a fixed order (if nobody is in charge, how do I know that everyone else has the same previous blocks in their chain as I do?)
  • A mechanism for agreeing on the trustworthiness of the new blocks you’re receiving (how do you know you’re not getting fake news??)

Most blockchains solve the first problem with cryptographic hashing and the second with a consensus mechanism.

Hashing

Cryptographic hash functions are essential to blockchains and to a lot of other cybersecurity. A full explainer on cryptographic hash functions is beyond the scope of this article, but in brief, a cryptographic hash function is a mathematical algorithm that transforms a message of any length into a bit array (a long string of numbers) of a FIXED length, which is often called a hash. What’s also important to understand is that these are one-way functions: the hash that is produced bears no resemblance to the message used to produce it, and if you only have the hash of a message, it is not practically possible to use it to recreate the original message. These functions are, however, deterministic: the same input message will always produce the same output hash. Yet altering the input by even one bit produces a completely different output. For example:

SHA256("Whatever you want to say") = "5249ca39c589af9f9d7b84a1ab7f266f99103bc68df06786fe79dc4b970f6166"

SHA256("Whatever you want to soy") = "3184b66f39d9c231aa7836423013ecdb55c4ade09a2f2a02785157b9b5898a9a"

These two strings are very similar, but the hashes are TOTALLY different.
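If you’d like to reproduce this yourself, here is a minimal sketch using Python’s standard `hashlib` module. The helper name `sha256_hex` is made up for this example, and the sketch assumes the text is encoded as UTF-8 before hashing; a different encoding would give different digests.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


first = sha256_hex("Whatever you want to say")
second = sha256_hex("Whatever you want to soy")

print(first)
print(second)

# Count how many of the 64 hex positions happen to match.
matching = sum(a == b for a, b in zip(first, second))
print(f"{matching} of {len(first)} hex characters match")
```

Running it twice prints the same digests both times (deterministic), while the one-letter change between the two inputs leaves only a handful of positions matching by chance.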

Try out an interactive example over here!

That’s cool, but what does that have to do with trustworthy chains??

Well, remember that you can use any sequence of letters and numbers to generate a hash. This includes a previous hash.

So if you want to create a CHAIN of information, what you can do now is:

  • Generate a message.
  • Hash it.
  • Generate the next message.
  • Take the hash of the previous message, combine it with the new message, and hash both of them together.
  • Attach the new hash to the next message.
  • Rinse and repeat.
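The steps above can be sketched in a few lines of Python. This is only an illustration: the function names are invented for the example, and "combining" the previous hash with the new message is assumed here to mean simple string concatenation, whereas real blockchains define a precise binary block format.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_chain(messages: list[str]) -> list[str]:
    """Return one hash per message, each covering the previous hash too."""
    hashes = []
    previous_hash = ""          # the first message has nothing before it
    for message in messages:
        current_hash = sha256_hex(previous_hash + message)
        hashes.append(current_hash)
        previous_hash = current_hash
    return hashes


chain = build_chain(["Message A", "Message B", "Message C"])
for message_hash in chain:
    print(message_hash)
```

Each printed hash depends on every message that came before it, which is what turns a pile of separate messages into a chain.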
Image depicts the flow of Blockchain information: Message A gets hashed into Hash A. This points to Message B which, when combined with hash A produces Hash B. Finally, message C and Hash B get combined to produce Hash C.

This image depicts blocks in a blockchain. Each block contains a header with a previous hash attribute and a body with Transactions and information. The Transactions/Information get hashed and passed into the “Previous Hash” attribute of the header of the next block.

This allows a decentralized group of computers to trust that they all have the same message history. Let’s say that you and I have a shared blockchain. We are each keeping our own copies of course! All you need to do to verify that our chains are in sync is compare the latest hash we each have- if the hashes match, that means that every previous piece of data in our chain must be the same.

Remember, if just one bit of the input to a hash function is different then the entire hash is different, so if any block has been altered it will create a completely different hash which, when fed into the next block, will also create a completely different hash, and the change will cascade all the way down the rest of the chain! It is not practically possible to alter any block in a chain without this happening, so when you get two matching hashes from different chains, you can be very confident that the two chains are in agreement.
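Here is a small sketch of that comparison, under the same simplifying assumptions as before (messages are plain strings, combining means concatenation, and the function name `tip_hash` and the sample transactions are hypothetical). Two participants each compute only the latest hash of their own copy and compare the results.

```python
import hashlib


def tip_hash(messages: list[str]) -> str:
    """Fold every message into one running hash and return the final value."""
    running = ""
    for message in messages:
        running = hashlib.sha256((running + message).encode("utf-8")).hexdigest()
    return running


mine = ["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dan 1"]
yours = list(mine)
print(tip_hash(mine) == tip_hash(yours))      # True

# Quietly alter the very first record in one copy of the history.
tampered = list(mine)
tampered[0] = "Alice pays Bob 50"
print(tip_hash(mine) == tip_hash(tampered))   # False
```

Even though only the oldest record was changed, the mismatch shows up in the most recent hash, so nobody has to compare the full history line by line.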

Bitcoin uses all of this chained together data to create a public ledger. This is essentially just a record of all the different bitcoin transactions in history, but it is the heart of bitcoin itself. By making this ledger publicly available, the currency gains its value emergently: there is no governing body setting the price of a bitcoin, but because the ledger is public, everyone knows and agrees upon the history of bitcoin transactions and the current distribution of those coins. This shared ledger is what gives users of bitcoins the ability to exchange their coins with anyone else in the world, because both parties will trust the transactions recorded in the public ledger.

That’s all well and good for agreeing on what has already happened, but the next big challenge is how do you agree on what new information to trust?

Consensus Mechanism

This part is a little tricky and ultimately reflects the values of your blockchain. There are a variety of ways to come to consensus on a network, but the two big ones in blockchains are currently known as Proof of Work and Proof of Stake. These two are very different from each other and have very different side effects.

Proof of work

Proof of Work is the consensus mechanism implemented in the Bitcoin blockchain, and it essentially places trust in computational work.

Specifically, it requires proof that the author of the block went through a large amount of computational work. In Bitcoin, essentially a special number (called the nonce) is appended to the end of each block so that when the block is input into the SHA256 hash function, the resulting hash has a large number of zeros at the front. Because the output of a cryptographic hash function cannot be predicted from its input, the only way to find such a number is to guess one number at a time. This means that figuring it out REQUIRES a large amount of computational work. The result, however, is easy to verify because all that is required is to attach the number to the block and run the algorithm once. Try it out over here!

Basically- it takes a long time for a computer to find the right answer, but when the answer is distributed across the network, anyone can verify the result by just running the block through the hash function with the included nonce. In bitcoin specifically, the first computer that correctly solves the proof of work equation for each block is rewarded with some currency.
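The asymmetry between finding and checking a nonce can be sketched like this. It is a simplified illustration, not Bitcoin’s actual procedure: the function names, the string block contents, and the "leading hex zeros" difficulty rule are assumptions for the example (real Bitcoin hashes a binary block header twice with SHA-256 and compares the result against a numeric target).

```python
import hashlib
from itertools import count


def block_hash(block_data: str, nonce: int) -> str:
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()


def mine(block_data: str, difficulty: int) -> int:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros."""
    target_prefix = "0" * difficulty
    for nonce in count():
        if block_hash(block_data, nonce).startswith(target_prefix):
            return nonce


def is_valid(block_data: str, nonce: int, difficulty: int) -> bool:
    """Verification is a single hash, however long the search took."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)


nonce = mine("previous-hash + transactions", difficulty=4)
print(nonce, block_hash("previous-hash + transactions", nonce))
print(is_valid("previous-hash + transactions", nonce, difficulty=4))   # True
```

With four hex zeros the search takes tens of thousands of guesses on average, and every extra zero multiplies that by sixteen, while checking the answer always costs exactly one hash.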

A Bitcoin Block. The hash of this block begins with 00000000000000000000000000003deed5… when the Nonce (which is just the number 2988763826) was appended to the rest of the information in this block, the resulting hash of this block began with 32 0’s and thus satisfied the proof of work requirement. Check out more blocks at https://www.blockchain.com/explorer?view=btc

This method of reaching consensus is fairly secure because in order to forge parts of the blockchain, a malicious actor would need to control more than half (the proverbial 51%) of all the computing power on the Bitcoin network. And because the currency rewards creating new blocks as quickly as possible, there is always a very large amount of computing power on the network, so the prospect of one person controlling 51% of it is infeasible for all practical purposes.

However, because this consensus mechanism places trust in and financially rewards computational work, it has had several unintended consequences, most notably the rapidly growing energy usage of all the computers competing to create (or “mine”) new bitcoins.

Proof of Stake

Proof of Stake has recently emerged as an alternative consensus mechanism to Proof of Work and is being adopted by several major cryptocurrencies, most notably Ethereum. Proof of Stake was first suggested on a bitcoin forum in 2011.

Screenshot of the first Suggestion of Proof of Stake. The essential idea expressed is the notion that blocks would be voted on and your votes would be weighted by how many bitcoins you owned.

Proof of Stake has many different implementations, but essentially the underlying idea is that instead of placing your trust into computational work, users of a blockchain would place trust in whatever was being ‘staked’ on the validation of new blocks. This makes the most sense when applied to currency blockchains- validators (people who want to add blocks to the blockchain) stake (or wager) some of their currency on the transaction. If it’s later found out that the transaction was forged or incorrect, then the validator loses the amount of money that was staked. If, however, the validation is deemed correct, then the block gets added to the chain and the validator may receive a financial reward. This thus incentivizes good behavior by applying a financial consequence to noncompliance.
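As a rough illustration of the incentive structure, here is a toy sketch. Everything in it is an assumption made for the example rather than any real blockchain’s rules: the validator names, the reward and penalty sizes, and the choice to pick a validator at random with probability proportional to stake (one common approach, but far from the only one).

```python
import random

# Hypothetical validators and the amount each has staked.
stakes = {"alice": 50.0, "bob": 30.0, "carol": 20.0}

REWARD = 1.0          # assumed payout for a block that is accepted
SLASH_FRACTION = 0.5  # assumed share of the stake lost for a bad block


def pick_validator(stakes: dict[str, float], rng: random.Random) -> str:
    """Choose a validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def settle(stakes: dict[str, float], validator: str, block_is_valid: bool) -> None:
    """Reward an honest validator, or slash part of a dishonest one's stake."""
    if block_is_valid:
        stakes[validator] += REWARD
    else:
        stakes[validator] -= stakes[validator] * SLASH_FRACTION


rng = random.Random(0)   # seeded only so repeated runs behave the same
validator = pick_validator(stakes, rng)
settle(stakes, validator, block_is_valid=True)
print(validator, stakes)
```

Notice that in this sketch a validator holding half of the total stake gets picked about half of the time, so larger holders earn rewards more often.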

However, this consensus mechanism is not without its own drawbacks: because it places trust in those who have the ability to stake value on validation transactions, it also gives power to those who already have significant vested interests in the system. In the case of a currency, this means that those who already have a large amount of the currency may be able to exert more control over that blockchain than those who don’t. As of this writing, Proof of Stake blockchains are still in their early stages of deployment and it will be a while before any unintended side effects come to light.

In general, however, it’s important to remember that all blockchains need a consensus mechanism- a set of rules used to bring conflicting copies into agreement, and very often the design of these different mechanisms reveals a lot about what that blockchain and users of it value and prioritize.

Common Questions/Misconceptions

So in light of all that, let’s now clear up some common blockchain misconceptions.

How do changes to blockchains get made?

Each blockchain is different, but many utilize the concept of forking. In Bitcoin, for example, a fork is essentially a proposed change to the underlying protocol: what constitutes a block and how it gets exchanged between computers. Each computer interacting with the Bitcoin blockchain must choose which version of the protocol it accepts, much like choosing a path at a fork in the road (hence the name). Forks come in two flavors: a soft fork is a change to the protocol that is backwards compatible, and a hard fork is a change that is not backwards compatible. Anyone is free to make these changes to their own copies of the blockchain and propose them to others, and if enough users implement the changes then they become the majority. In this way, the Bitcoin protocol continues to evolve without relying on a centralized authority, similar to an open source software project.

Is the blockchain a crypto-currency?

NO!!! The blockchain on its own is NOT a currency. A blockchain is just a list of records (called blocks) which are cryptographically linked together using the mechanisms described above. These Blocks are each generated using information from the previous blocks. All these blocks are hashed together into a chain, and this is why it’s called a block-CHAIN. That’s all that the blockchain is- cryptographically linked data.

Cryptocurrencies USE this blockchain data structure to create decentralized public ledgers. A blockchain is essential to cryptocurrencies because it enables everyone who uses the currency to agree on one public ledger of the currency without having to rely on a centralized authority like a bank or government to verify accuracy. Cryptocurrencies DEPEND on a blockchain in the same way a car DEPENDS on having an engine, but a car is not its engine.

The blockchain will destroy the planet

Yes and no. As of this writing, cryptocurrencies use a LOT of electricity and are therefore responsible for emitting a boat-load of CO2 into the atmosphere. This is primarily due to reliance on the Proof of Work consensus mechanism in major cryptocurrency blockchains such as Bitcoin and Ethereum. However, other consensus mechanisms like Proof of Stake are far less energy intensive.

The Proof of Work consensus mechanism that was implemented in the original Bitcoin blockchain ignored the network effects of millions of participating computers all executing similar computationally intensive calculations, and thus produced runaway energy usage. However, not all blockchains make these design choices.

The blockchain will save the planet!

Meh. This one is also misleading. Blockchains certainly have power: prior to their invention there was no way to share verifiable and immutable data in peer-to-peer networks without reliance on centralized systems. Not only that, but certain blockchains are now Turing complete- this means that a blockchain can now be used to share code which then can be used to run programs and applications. This basically means that programs and computation can now be executed on decentralized computer networks.

A great example of the power of this is the creation of Decentralized Autonomous Organizations, or DAOs: these are essentially groups of people who all contribute funds to a collective and then come up with rules for how to distribute those funds towards a common goal. All of these DAOs are different and have wildly different intentions: some are geared towards financial speculation, others are engaging in nonprofit work, while some function as social clubs. Oftentimes the amount of funds a member brings to the collective correlates to a certain number of votes they get on setting the rules of the collective. This may seem very similar to a traditional corporate or nonprofit structure, but what’s important to remember is that again, all of the infrastructure for these organizations is distributed across a network of computers. The funds the collective owns as well as the computer code representing its rules of governance all exist on a blockchain.

In conclusion, blockchains are nothing more than a distributed database composed of cryptographically linked blocks. The implications and applications of such a technology, however, are quite novel and fairly powerful: prior to their invention, information on the internet lived primarily on proprietary servers. But now with the advent of blockchains, it’s possible for large numbers of people to autonomously agree on a shared dataset and thus to do many different things at scale without being beholden to platforms or institutions. As we’ve seen, however, there are many different design patterns for blockchains and they all come with their own values and priorities. Some have caused dramatic ecological damage while others have perpetuated patterns of income inequality. Furthermore, the rush to apply blockchains to all sorts of new domains opens up the possibility of complicating or damaging existing systems and infrastructure that might not have needed blockchain technology in the first place. That’s not to say that the whole endeavor should be canceled, merely that when thinking about interacting with and designing blockchains it’s very important to take all of these factors into consideration. For an in-depth look at how to weigh all these different variables, The Blockchain Ethical Design Framework is a great place to start.


Other Notable Links: