Blockchain technology can seem complex; however, it can be simplified by examining each component individually. At a high level, blockchain technology utilizes well-known computer science mechanisms and cryptographic primitives (cryptographic hash functions, digital signatures, asymmetric-key cryptography) mixed with record keeping concepts (such as append only ledgers). This section discusses each individual main component: cryptographic hash functions, transactions, asymmetric-key cryptography, addresses, ledgers, blocks, and how blocks are chained together.

Cryptographic Hash Functions :

An important component of blockchain technology is the use of cryptographic hash functions for many operations. Hashing is a method of applying a cryptographic hash function to data, which calculates a relatively unique output (called a message digest, or just digest) for an input of nearly any size (e.g., a file, text, or image). It allows individuals to independently take input data, hash that data, and derive the same result, proving that there was no change in the data. Even the smallest change to the input (e.g., changing a single bit) will result in a completely different output digest. Table 1 shows simple examples of this.

Cryptographic hash functions have these important security properties:

1. They are preimage resistant. This means that they are one-way; it is computationally infeasible to compute the correct input value given some output value (e.g., given a digest, find x such that hash(x) = digest).

2. They are second preimage resistant. This means one cannot find an input that hashes to a specific output. More specifically, cryptographic hash functions are designed so that given a specific input, it is computationally infeasible to find a second input which produces the same output (e.g., given x, find y ≠ x such that hash(x) = hash(y)). The only approach available is to exhaustively search the input space, but this is computationally infeasible to do with any chance of success.

3. They are collision resistant. This means that one cannot find two inputs that hash to the same output. More specifically, it is computationally infeasible to find any two inputs that produce the same digest (e.g., find an x and y such that hash(x) = hash(y)).

A specific cryptographic hash function used in many blockchain implementations is the Secure Hash Algorithm (SHA) with an output size of 256 bits (SHA-256). Many computers support this algorithm in hardware, making it fast to compute. SHA-256 has an output of 32 bytes (1 byte = 8 bits, 32 bytes = 256 bits), generally displayed as a 64-character hexadecimal string (see Table 1 below).

This means that there are 2²⁵⁶ ≈ 10⁷⁷, or 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936 possible digest values. The algorithm for SHA-256, as well as others, is specified in Federal Information Processing Standard (FIPS) 180-4 [5]. The NIST Secure Hashing website [6] contains FIPS specifications for all NIST-approved hashing algorithms.
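The hashing behaviour described above can be tried with a short Python sketch. It uses the standard library's hashlib module; the helper name sha256_hex and the two sample inputs are illustrative choices, not taken from the original text.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


# Two sample inputs (illustrative) that differ only in the final character.
original = b"Hello, World!"
altered = b"Hello, World?"

digest_a = sha256_hex(original)
digest_b = sha256_hex(altered)

# Hashing the same input again always yields the same digest.
assert sha256_hex(original) == digest_a

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(1 for a, b in zip(digest_a, digest_b) if a != b)

print(digest_a)
print(digest_b)
print(f"{differing} of {len(digest_a)} hex characters differ")
```

Running it prints two 64-character digests that share no obvious relationship, even though the inputs differ by a single character.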

Collision Resistance and the Role of Hash Functions in Blockchain:

Since there are infinitely many possible input values and only a finite number of possible output digest values, it is theoretically possible for a collision to occur—where:

hash(x) = hash(y)
even though x ≠ y

In other words, two different inputs could produce the same hash digest.

However, modern cryptographic hash functions such as SHA-256 are designed to be collision resistant, meaning that finding such a collision is computationally impractical.

How Difficult Is It to Find a SHA-256 Collision?

To find a collision in SHA-256, one would need to perform approximately 2¹²⁸ hash operations on average.

This equals:

340,282,366,920,938,463,463,374,607,431,768,211,456

(approximately 3.402 × 10³⁸) attempts.

This number is astronomically large.

Putting This into Perspective

In 2015, the entire Bitcoin network had a hash rate of approximately:

300 quadrillion hashes per second

(300,000,000,000,000,000 hashes/second)

Even at that enormous rate, it would take roughly:

35,942,991,748,521 years

(approximately 3.6 × 10¹³ years)

to manufacture a single SHA-256 collision.

For comparison, the age of the universe is estimated to be:

1.37 × 10¹⁰ years

This demonstrates how practically impossible it is to generate a collision using current or foreseeable computational power.
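The estimate above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the one added assumption is the length of a year (365.25 days), which the text does not state.

```python
# Figures quoted in the text above.
COLLISION_ATTEMPTS = 2 ** 128          # expected hash operations to find a collision
HASHES_PER_SECOND = 300 * 10 ** 15     # 300 quadrillion hashes per second (2015)
AGE_OF_UNIVERSE_YEARS = 1.37e10

# Assumption: one year is 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

seconds_needed = COLLISION_ATTEMPTS / HASHES_PER_SECOND
years_needed = seconds_needed / SECONDS_PER_YEAR
universe_ages = years_needed / AGE_OF_UNIVERSE_YEARS

print(f"years needed: {years_needed:.3e}")
print(f"multiples of the age of the universe: {universe_ages:,.0f}")
```

Under that assumption the result is about 3.6 × 10¹³ years, or roughly 2,600 times the estimated age of the universe.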

Furthermore, even if two inputs x and y were found that produce the same digest, it would be highly unlikely that both inputs are valid blockchain transactions and that both are accepted by the network under protocol rules.

Uses of Cryptographic Hash Functions in Blockchain

Within a blockchain network, cryptographic hash functions serve several critical purposes:

1. Address Derivation

Hash functions are used to derive blockchain addresses from public keys, ensuring both uniqueness and security.

2. Creating Unique Identifiers

Transactions and blocks are identified using hash digests, which act as unique fingerprints.

3. Securing Block Data

When a node publishes a block:

It hashes the block’s data.
The resulting digest is stored in the block header.
Any modification to the block data changes the hash, immediately revealing tampering.

4. Securing the Block Header

The publishing node also hashes the block header itself.

If the blockchain uses a Proof-of-Work (PoW) consensus mechanism:

The node repeatedly hashes the block header.
It changes a special value called the nonce each time.
This process continues until the resulting hash satisfies the network’s difficulty requirements.
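The nonce search described above can be sketched in Python. This is a toy model, not any network's actual rule: the difficulty requirement is assumed to be "the hex digest starts with a given number of zeros", the header is an arbitrary byte string, and the function name mine is hypothetical.

```python
import hashlib
from typing import Tuple


def mine(header: bytes, difficulty: int) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until SHA-256(header + nonce) starts with
    `difficulty` zero hex characters (an assumed, simplified difficulty rule)."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(header + str(nonce).encode()).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


# Illustrative header contents; a real header has a defined binary layout.
nonce, digest = mine(b"example block header", difficulty=4)
print(f"nonce={nonce} digest={digest}")
```

Each extra zero required multiplies the expected number of attempts by 16, which is how a rule of this kind makes publishing a block more or less expensive.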

The hash of the current block header is then included in the next block’s header. This creates a chain of cryptographic links, ensuring that:

Altering any previous block would change its hash.
That change would invalidate all subsequent blocks.
The integrity of the entire blockchain is preserved.
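The linking of blocks through header hashes can be illustrated with a minimal sketch. The block layout (a dict with "data" and a "header" holding data_hash and previous_hash), the JSON serialisation, and all function names are assumptions made for the example, not a real blockchain format.

```python
import hashlib
import json
from typing import List


def sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


def header_hash(header: dict) -> str:
    """Hash a header after serialising it deterministically to JSON."""
    return sha256_text(json.dumps(header, sort_keys=True))


def make_block(data: str, previous_hash: str) -> dict:
    """Build a toy block whose header stores the data digest and the
    hash of the previous block's header."""
    return {
        "data": data,
        "header": {"data_hash": sha256_text(data), "previous_hash": previous_hash},
    }


def chain_is_valid(chain: List[dict]) -> bool:
    for index, block in enumerate(chain):
        # The digest stored in the header must match the block's data.
        if sha256_text(block["data"]) != block["header"]["data_hash"]:
            return False
        # Each header must reference the hash of the header before it.
        if index > 0:
            expected = header_hash(chain[index - 1]["header"])
            if block["header"]["previous_hash"] != expected:
                return False
    return True


chain = [make_block("genesis", previous_hash="0" * 64)]
for payload in ("tx: A pays B", "tx: B pays C"):
    chain.append(make_block(payload, header_hash(chain[-1]["header"])))
print("untouched chain valid:", chain_is_valid(chain))

# Tamper with the first block and recompute its own data digest.
chain[0]["data"] = "tampered"
chain[0]["header"]["data_hash"] = sha256_text("tampered")
print("tampered chain valid:", chain_is_valid(chain))
```

Even though the altered block is internally consistent again, its header hash has changed, so the reference stored in the next block no longer matches and the check fails.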

Conclusion

Although hash collisions are theoretically possible because the set of possible inputs is infinite while the set of digests is finite, SHA-256’s collision resistance makes finding one computationally infeasible in practice. This security property is fundamental to blockchain systems, ensuring data integrity, immutability, and trust in decentralized networks.

Cryptographic Nonce :

A cryptographic nonce is an arbitrary number that is only used once. A cryptographic nonce can be combined with data to produce different hash digests per nonce:

hash(data + nonce) = digest

Changing only the nonce value provides a mechanism for obtaining different digest values while keeping the same data. This technique is utilized in the proof of work consensus model (see Section 4.1).

Transactions :

A transaction represents an interaction between parties. With cryptocurrencies, for example, a transaction represents a transfer of the cryptocurrency between blockchain network users. For business-to-business scenarios, a transaction could be a way of recording activities occurring on digital or physical assets. Figure 1 shows a notional example of a cryptocurrency transaction. Each block in a blockchain can contain zero or more transactions.

For some blockchain implementations, a constant supply of new blocks (even with zero transactions) is critical to maintain the security of the blockchain network; a constant supply of newly published blocks prevents malicious users from ever “catching up” and manufacturing a longer, altered blockchain (see Section 4.7). The data which comprises a transaction can be different for every blockchain implementation; however, the mechanism for transacting is largely the same. A blockchain network user sends information to the blockchain network.

The information sent may include the sender’s address (or another relevant identifier), sender’s public key, a digital signature, transaction inputs and transaction outputs. A single cryptocurrency transaction typically requires at least the following information, but can contain more:

• Inputs – The inputs are usually a list of the digital assets to be transferred. A transaction will reference the source of the digital asset (providing provenance) – either the previous transaction where it was given to the sender, or for the case of new digital assets, the origin event. Since the input to the transaction is a reference to past events, the digital assets do not change. In the case of cryptocurrencies this means that value cannot be added or removed from existing digital assets. Instead, a single digital asset can be split into multiple new digital assets (each with lesser value) or multiple digital assets can be combined to form fewer new digital assets (with a correspondingly greater value). The splitting or joining of assets will be specified within the transaction output.
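The idea that inputs only reference earlier outputs, and that assets are split or joined rather than modified, can be sketched as a small data model. Everything here is hypothetical: the class names, the field names, the integer values, and the simplifying assumption that a transaction's outputs must carry exactly the value of its inputs (fees and other implementation details are ignored).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Output:
    recipient: str  # hypothetical address label
    value: int


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    inputs: List[Tuple[str, int]]  # (id of earlier transaction, index of its output)
    outputs: List[Output]


def conserves_value(tx: Transaction, ledger: Dict[str, Transaction]) -> bool:
    """True if the outputs carry exactly the value referenced by the inputs.
    Assumption: no fees; origin events (no inputs) are validated elsewhere."""
    spent = sum(ledger[source].outputs[index].value for source, index in tx.inputs)
    created = sum(output.value for output in tx.outputs)
    return spent == created


# Origin event: a new asset of value 10 is created for "alice".
origin = Transaction("tx0", inputs=[], outputs=[Output("alice", 10)])
ledger = {origin.tx_id: origin}

# Split: one asset of value 10 becomes two assets of value 7 and 3.
split = Transaction("tx1", [("tx0", 0)], [Output("bob", 7), Output("alice", 3)])
ledger[split.tx_id] = split

# Join: the two assets are combined into a single asset of value 10.
join = Transaction("tx2", [("tx1", 0), ("tx1", 1)], [Output("carol", 10)])

print(conserves_value(split, ledger), conserves_value(join, ledger))
```

The earlier transactions are never edited; each new transaction simply points back at the outputs it consumes, which is what provides provenance.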

Asymmetric-Key Cryptography:

Blockchain technology uses asymmetric-key cryptography (also referred to as public key cryptography). Asymmetric-key cryptography uses a pair of keys: a public key and a private key that are mathematically related to each other. The public key is made public without reducing the security of the process, but the private key must remain secret if the data is to retain its cryptographic protection. Even though there is a relationship between the two keys, the private key cannot efficiently be determined based on knowledge of the public key.

One can encrypt with a private key and then decrypt with the public key. Alternately, one can encrypt with a public key and then decrypt with a private key. Asymmetric-key cryptography enables a trust relationship between users who do not know or trust one another, by providing a mechanism to verify the integrity and authenticity of transactions while at the same time allowing transactions to remain public.

To do this, the transactions are ‘digitally signed’. This means that a private key is used to encrypt a transaction such that anyone with the public key can decrypt it. Since the public key is freely available, encrypting the transaction with the private key proves that the signer of the transaction has access to the private key.
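The signing idea above can be illustrated with a deliberately simplified sketch. It uses textbook RSA only because that can be written with Python's built-in pow; many blockchain networks use elliptic-curve signature schemes instead. The key is built from two small, publicly known primes, there is no padding, and the function names are hypothetical, so this is insecure and for illustration only. As is common in practice, the private-key operation is applied to a hash of the transaction rather than to the transaction itself.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Illustration only: real keys
# use large random primes, and real schemes add padding.
P = 2 ** 61 - 1
Q = 2 ** 89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """SHA-256 digest of the message, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Apply the private key to the message digest."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Undo the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


transaction = b"alice pays bob 5"      # illustrative transaction bytes
signature = sign(transaction)

print(verify(transaction, signature))            # original transaction
print(verify(b"alice pays bob 500", signature))  # altered transaction
```

Anyone holding the public values N and E can run verify, but producing a signature that verifies requires D, which is what shows that the signer had access to the private key.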

Alternately, one can encrypt data with a user’s public key such that only users with access to the private key can decrypt it. A drawback is that asymmetric-key cryptography is often slow to compute. This contrasts with symmetric-key cryptography, in which a single secret key is used to both encrypt and decrypt. With symmetric-key cryptography, users must already have a trust relationship established with one another to exchange the pre-shared key. In a symmetric system, any encrypted data that can be decrypted with the pre-shared key confirms it was sent by another user with access to the pre-shared key; no user without access to the pre-shared key will be able to view the decrypted data. Compared to asymmetric-key cryptography, symmetric-key cryptography is very fast to compute. Because of this, when one claims to be encrypting something using asymmetric-key cryptography, oftentimes the data is encrypted with symmetric-key cryptography and then the symmetric key is encrypted using asymmetric-key cryptography.

This ‘trick’ can greatly speed up asymmetric-key cryptography.

Here is a summary of the use of asymmetric-key cryptography in many blockchain networks:

• Private keys are used to digitally sign transactions.
• Public keys are used to derive addresses.
• Public keys are used to verify signatures generated with private keys.
• Asymmetric-key cryptography provides the ability to verify that the user transferring value to another user is in possession of the private key capable of signing the transaction.

Private Key Storage:

With some blockchain networks (especially with permissionless blockchain networks), users must manage and securely store their own private keys. Instead of recording them manually, they often use software to securely store them. This software is often referred to as a wallet. The wallet can store private keys, public keys, and associated addresses. It may also perform other functions, such as calculating the total number of digital assets a user may have. If a user loses a private key, then any digital asset associated with that key is lost, because it is computationally infeasible to regenerate the same private key. If a private key is stolen, the attacker will have full access to all digital assets controlled by that private key.

The security of private keys is so important that many users use special secure hardware to store them; alternatively, users may take advantage of an emerging industry of private key escrow services. In addition to storing private keys, these key escrow services can also satisfy Know Your Customer (KYC) laws, as users must provide proof of their identity when creating an account.

Private key storage is an extremely important aspect of blockchain technology. When it is reported in the news that “Cryptocurrency XYZ was stolen from…”, it almost certainly means some private keys were found and used to sign a transaction sending the money to a new account, not that the blockchain network itself was compromised. Note that because blockchain data cannot generally be changed, once a criminal steals a private key and publicly transfers the associated funds to another account, that transaction generally cannot be undone.

Ledgers :

A ledger is a collection of transactions. Throughout history, pen and paper ledgers have been used to keep track of the exchange of goods and services. In modern times, ledgers have been stored digitally, often in large databases owned and operated by a centralized trusted third party (i.e., the owner of the ledger) on behalf of a community of users. These ledgers with centralized ownership can be implemented in a centralized or distributed fashion (i.e., just one server or a coordinating cluster of servers).

There is growing interest in exploring distributed ownership of the ledger. Blockchain technology enables such an approach using both distributed ownership and a distributed physical architecture. The distributed physical architecture of blockchain networks often involves a much larger set of computers than is typical for a centrally managed distributed physical architecture. The growing interest in distributed ownership of ledgers is due to possible trust, security, and reliability concerns related to ledgers with centralized ownership: