Blockchain - Simple Example
Published:
Lesson: 6
Topic: Blockchain
Author: Ashley Rosilier
The central feature of the blockchain is a cryptographically signed chain of data blocks. These blocks can store transactions, such as the transfer of Bitcoin currency, or any other type of data. To understand the blockchain concept, it’s not important WHAT data is stored, but rather HOW it is stored. This example describes a simple implementation of a blockchain and how data gets added to the distributed ledger.
Block Creation
When new data is to be added into the distributed ledger, a node must first create a “block” that typically contains the following information:
- Data - whatever data is to be stored
- Hash of data - a unique “fingerprint” of the block’s data
- Hash of previous block - the “fingerprint” of the previous block in the chain
The specific block structure varies based on the system, but generally the hash values are stored in the block header while the transaction data is stored in the block body. The block header often contains other information as well, such as a time stamp.
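As a rough illustration, the three fields listed above could be modeled as follows. This is a minimal sketch, not the layout of any particular system: the class name Block, the field names, and the choice of SHA-256 for the data hash are all assumptions made for this example.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical block layout used only for illustration."""
    data: str           # body: whatever data is to be stored
    previous_hash: str  # header: fingerprint of the previous block
    timestamp: float = field(default_factory=time.time)  # header: time stamp
    data_hash: str = field(init=False, default="")       # header: fingerprint of the data

    def __post_init__(self):
        # Compute the fingerprint of the block's data when the block is created.
        self.data_hash = hashlib.sha256(self.data.encode("utf-8")).hexdigest()


# An all-zero previous hash is an assumed placeholder for the first block.
block = Block(data="Alice pays Bob 5", previous_hash="0" * 64)
print(block.data_hash)
```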
Hash
In cryptography, a “hash” is a special function that creates a “fingerprint” for any data. You can think of it as a very complicated set of mathematical manipulations that takes any data and turns it into a fixed-length, seemingly random string of letters and numbers. There are many different hashing algorithms, but one of the most common is SHA-256.
This website has an online SHA-256 calculator to compute the hash of any data entered into the text box. For any data typed in, a string of 256 bits (32 bytes, usually displayed as 64 hexadecimal characters) will be generated that corresponds to the data. If even one character is changed in the text box, the hash output will change as well.
Just like a fingerprint, it is practically impossible for two different sets of data to produce the same hash output. Collisions exist in theory because the output has a fixed length, but for SHA-256 no practical way of finding one is known. This is very useful for comparing two large sets of data. If their hashes are the same, then the underlying data can be treated as the same. Hashes are also useful for determining if data has been corrupted. If a hash is calculated for a particular set of data but the data is later modified, the hash will no longer match the data set, making it evident that the data has been changed.
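The fingerprint behavior can be tried locally with Python's standard hashlib module. This is only a small sketch; the helper name fingerprint and the sample strings are made up for the example.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return the SHA-256 hash of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = fingerprint("Alice pays Bob 5")
modified = fingerprint("Alice pays Bob 6")  # one character changed

print(original)
print(modified)
print(len(original))                               # 64 hexadecimal characters
print(original == fingerprint("Alice pays Bob 5")) # True: same data, same hash
print(original == modified)                        # False: different data
```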
Chaining
A chain is created by having each block refer to the previous block through its hash. This creates the chronological relationship between all data blocks and makes it possible to follow the audit trail of all transactions. The first block in the chain is called the “genesis” block.
Referring to the previous block’s hash makes it possible for the chain to identify any blocks that have been tampered with. If a block’s data gets changed, its hash value will also change, and the blockchain will no longer be connected properly. The system will recognize this fault but will be able to recover because other nodes still have copies of the entire blockchain without the tampered data.
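The chaining and tamper detection described above can be sketched in a few lines. The code below is an illustration under stated assumptions: blocks are plain dictionaries, a block's own hash is taken over its data hash plus its previous-block hash, and the genesis block points to an all-zero placeholder. Real systems define these details differently.

```python
import hashlib

GENESIS_PREVIOUS_HASH = "0" * 64  # assumed placeholder for the first block


def sha256_hex(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def block_hash(block):
    # Assumed definition: fingerprint of the data hash plus the backward link.
    return sha256_hex(block["data_hash"] + block["previous_hash"])


def build_chain(items):
    chain, previous = [], GENESIS_PREVIOUS_HASH
    for item in items:
        block = {"data": item, "data_hash": sha256_hex(item),
                 "previous_hash": previous}
        chain.append(block)
        previous = block_hash(block)
    return chain


def first_invalid_index(chain):
    """Return the index of the first inconsistent block, or None if all link up."""
    previous = GENESIS_PREVIOUS_HASH
    for index, block in enumerate(chain):
        if block["data_hash"] != sha256_hex(block["data"]):
            return index  # the data no longer matches its fingerprint
        if block["previous_hash"] != previous:
            return index  # the link to the previous block is broken
        previous = block_hash(block)
    return None


chain = build_chain(["genesis", "Alice pays Bob 5", "Bob pays Carol 2"])
print(first_invalid_index(chain))  # None: the chain is intact

chain[1]["data"] = "Alice pays Bob 500"  # tamper with the second block
print(first_invalid_index(chain))  # 1: the tampered block is detected
```

Even if the tamperer also recomputes the data hash of the altered block, the block's own hash changes, so the next block's stored previous-block hash no longer matches and the break shows up one position later.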
A graphical depiction of this chain is shown below in the figure from [9].
Block Validation
Any newly proposed block must be validated before it is added to the distributed ledger throughout the system. This is done by means of a “consensus algorithm” to gain agreement by all nodes that the block should be added. In Bitcoin, the specific consensus algorithm used is called “Proof of Work” and requires the node to expend a large amount of CPU processing to solve a mathematical puzzle in order to demonstrate credibility.
More details on Proof of Work and other consensus algorithms will be covered in a later lesson, but for this example we will assume this mathematical puzzle:
- Find a numerical value (called a “nonce”) that, when combined with the data in the block, results in a hash value starting with four zeros
In order to find a specific nonce value that will result in a hash starting with four zeros, the CPU will have to try many, many possible values. Since the output of the hash function is seemingly random, there is no shortcut to finding the nonce, and the only method is to keep trying different numbers and recalculating the hash. The effect of this is that the validation of the block takes a significant amount of time and the node must have a significant amount of CPU processing power behind it.
Slowing down the validation of blocks is important because it allows the network to identify and delete invalid data before it propagates. It is also important that the CPU requirement for validation is costly, because that keeps bad actors from spamming the network with invalid blocks. In Bitcoin this Proof of Work validation is called “mining” and the node that performed the validation is rewarded with currency.
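The brute-force search can be sketched as a simple loop. This is a toy version of the puzzle stated above, not Bitcoin's actual procedure: it assumes the nonce is combined with the data by appending it as text, and that “four zeros” means the first four hexadecimal characters of the SHA-256 hash. The function name mine is chosen for the example.

```python
import hashlib
from typing import Tuple


def mine(data: str, prefix: str = "0000") -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash of data + nonce starts with prefix."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1  # no shortcut exists: just try the next value


nonce, digest = mine("Alice pays Bob 5")
print(nonce, digest)
```

With four hexadecimal zeros, roughly one attempt in 65,536 (16 to the power of 4) succeeds, so the loop finishes quickly on a modern computer. Each additional required zero multiplies the expected work by 16, which is how such a puzzle can be made arbitrarily expensive.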
Insertion into the Chain
Once the block is created and validated by a node, it is then sent to all other nodes in the network to be added to their copies of the distributed ledger. There are different methods of broadcasting and synchronizing the block insertion, but in all cases each node independently verifies that the block is valid and has not been tampered with before adding it to its copy of the ledger. If a node identifies tampering (i.e., if the hash values don’t validate properly), the block is rejected and not added to the ledger.
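The check that each receiving node performs might look like the sketch below. It is a simplified, single-process illustration with no networking. It assumes a block's hash covers the previous hash, the data, and the nonce, and that a valid hash starts with four hexadecimal zeros. The names create_block and try_append are hypothetical.

```python
import hashlib

DIFFICULTY_PREFIX = "0000"        # assumed puzzle: hash starts with four zeros
GENESIS_PREVIOUS_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(data, nonce, previous_hash):
    text = f"{previous_hash}{data}{nonce}"
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def create_block(data, previous_hash):
    """Creating node: search for a nonce that solves the puzzle."""
    nonce = 0
    while not block_hash(data, nonce, previous_hash).startswith(DIFFICULTY_PREFIX):
        nonce += 1
    return {"data": data, "nonce": nonce, "previous_hash": previous_hash,
            "hash": block_hash(data, nonce, previous_hash)}


def try_append(ledger, block):
    """Receiving node: verify the block independently before adding it."""
    expected_previous = ledger[-1]["hash"] if ledger else GENESIS_PREVIOUS_HASH
    recomputed = block_hash(block["data"], block["nonce"], block["previous_hash"])
    is_valid = (
        block["previous_hash"] == expected_previous      # links to our latest block
        and block["hash"] == recomputed                  # contents match the hash
        and recomputed.startswith(DIFFICULTY_PREFIX)     # puzzle was really solved
    )
    if is_valid:
        ledger.append(block)
    return is_valid


ledger = []
genesis = create_block("genesis", GENESIS_PREVIOUS_HASH)
print(try_append(ledger, genesis))   # True

good = create_block("Alice pays Bob 5", genesis["hash"])
tampered = dict(good, data="Alice pays Bob 500")  # altered after validation
print(try_append(ledger, tampered))  # False: the block is rejected
print(try_append(ledger, good))      # True
```

Verification costs a single hash computation, while creating the block required many attempts. This asymmetry is what lets every node check incoming blocks cheaply.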
Blockchain Demo
Anders Brownworth has created an excellent sandbox tool that lets you get hands-on with the blockchain mechanics. The following two YouTube videos walk through the tool and how it visually demonstrates the blockchain concept.
You can try out this tool yourself to understand how the blockchain operates by visiting https://andersbrownworth.com/blockchain/
References
[9] Madaan, L., Kumar, A., & Bhushan, B. (2020). Working principle, Application areas and Challenges for Blockchain Technology. 2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT), 254–259. https://doi.org/10.1109/CSNT48778.2020.9115794