Howard Poston

View Original

Introduction to Blockchain: Fundamentals

This article is the first in a multi-part series describing how blockchain works and some of the security assumptions associated with it. Each article will describe a different level of how the blockchain’s distributed ledger operates, starting with the fundamentals.

At the most basic level, blockchain technology is composed of cryptographic algorithms. The creator of blockchain, Satoshi Nakamoto, developed a system in which the trust that we traditionally place in organizations to maintain trusted records (like banks) is transferred to the blockchain and the cryptographic algorithms that it uses.

The Cryptography Behind Blockchain

The goal of the blockchain is to create a distributed, decentralized, and trusted record of the history of the system. The most famous blockchain, Bitcoin, uses this record to store the history of transactions, so people can make and receive payments on the Bitcoin blockchain and trust that their money won’t be lost or stolen.

In order to achieve this level of trust, the blockchain uses a couple of cryptographic algorithms as building blocks. Hash functions and public key cryptography are crucial to both the functionality and security of the blockchain ecosystem.

Hash Functions

A hash function is a mathematical function that can take any number as an input and produces an output in a fixed range of numbers. For example, 256-bit hash functions (which are commonly used in blockchain), produce outputs in the range 0-2256.

In order to be considered secure, a hash function needs to be collision-resistant, this means that it’s extremely difficult (to the point of being nearly impossible) to find two inputs that create the same hash output. Accomplishing this requires a few different features:

  • No weaknesses in the hash function

  • A large number of possible outputs

  • A one-way hash function (can’t derive the input from the output)

  • Similar inputs produce very different outputs

If a hash function meets these requirements, it can be used in blockchain. However, if any of these requirements are violated, then the security of the blockchain is at risk. Blockchain relies heavily on secure hash functions to ensure that transactions cannot be modified after being stored in the ledger.

Public Key Cryptography

The other cryptographic algorithm used in blockchain technology is public key cryptography. This type of cryptography is also widely used on the Internet as well since it has so many useful properties.  With public key cryptography, you can:

  • Encrypt a message so that only the intended recipient can read it

  • Generate a digital signature proving that you sent a given message

  • Use a digital signature to verify that a message was not modified in transit

In public key cryptography, everyone has two different encryption keys: a private one and a public one. Your private key is a random number that you generate and keep secret. It is used for decrypting messages and generating digital signatures.

Your public key is derived from your private key and, as the name suggests, is designed to be publicly distributed. It’s used for encrypting messages to you and generating digital signatures. Your address (where people sent transactions to) on the blockchain is typically derived from your public key.

The security of public key cryptography is based on two things. The first is the secrecy of your private key. If someone can guess or steal your private key, they have complete control of your account on the blockchain. This allows them to perform transactions on your behalf and decrypt data meant for you. The most common way that blockchain is “hacked” is people failing to protect their private key.

The other main assumption of public key cryptography is that the algorithms used are secure. Public key cryptography is based off of mathematical “hard” problems, where performing an operation is much easier than reversing it. For example, it’s relatively easy to multiply two numbers together but hard to factor the result. Similarly, it’s easy to perform exponentiation but hard to calculate logarithms. As a result, it’s possible to create schemes where computers are capable of performing the easy operation but not the hard one.

The security of these “hard” problems are why you’ll often see articles about quantum computers breaking blockchain. Due to how quantum computers work, factoring and logarithms aren’t much harder than multiplication and exponentiation, so traditional public key cryptography no longer works. However, other problems exist that are still “hard” for quantum computers, so the threat of quantum computers to blockchain can be fixed with a simple upgrade. 

How Blockchains are Put Together

As its name suggests, the blockchain is a collection of blocks that are chained together to create a continuous whole. In this section, we explore how this works.

Blocks

The purpose of the blockchain is to act as a distributed ledger that stores data in a secure fashion. The blocks are the place where this data is stored.

The image above illustrates the basic structure of a block in a blockchain. We’ll talk about every part of this image throughout the series, but for now focus on the green sections. Each green piece represents a transaction within the block. While a transaction may represent a literal transaction (i.e. a transfer of value) on blockchains like Bitcoin, this is not the only option. As we’ll see later, smart contract platforms store other things (like computer code) as transactions as well.

The security of the blocks in the digital ledger depends on the security of public key cryptography. Every transaction and block in the blockchain is digitally signed by its creator. This allows anyone with access to the blockchain to easy validate that every transaction is authenticated (i.e. sent by someone who owns the associated account) and has not been modified since creation. The integrity and authenticity of the blocks in the chain is also assured by the digital signature of the block creator.

Chaining

Each block is equivalent to a single page in a bank’s account ledger; it only represents a slice of the history of the network’s history. In order to combine these slides into a continuous whole, the blockchain makes use of hash functions.

In the image above, you can see the hash functions linking each block together. Each block contains the hash of the previous block as part of its block header (the section not containing transaction data).

The fact that every block is dependent on the previous one is significant because of the collision-resistance of hash functions. If someone wanted to forge block 51 in the image, they have two options: find another version of block 51 that has the same hash or forge every block after 52 as well. The first is supposed to be impossible (due to collision resistance) and the other should be difficult or impossible since the blockchain is designed to make forging even a single block difficult (more on this later).

The security of the “chain” part of blockchain is based upon the collision resistance of the hash function that it uses. If someone can find a way to generate another version of block 51 that has the same hash, the immutability assumptions of blockchain break down and you can’t trust that any transaction will remain in the distributed ledger.

What's Next...

At this point, we’ve covered the fundamental cryptographic algorithms and data structures used in blockchain. The next two articles in this series are focused on the infrastructure behind the blockchain: the nodes and how they’re networked together.