Introduction to Hash Functions
Instead of using public and private keys for encryption and decryption (as with ECC and RSA), a hash function can be used to protect data. A hash function processes input of any size and produces a fixed-length identifier, called a hash or digest. In other words, a hash function takes arbitrary text (like this article) and turns it into a short string of numbers and letters. The same message always produces the same hash, provided that the message doesn’t change, and even a small change to the message produces an entirely different hash. Because the output has a fixed length while the number of possible inputs is unlimited, two different messages can in principle share a hash (a collision). A hash function is called collision-resistant when finding two such messages is computationally infeasible, so in practice the hash acts as a unique fingerprint of the message. The resulting hash usually has a size between 128 and 512 bits.
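The properties described above can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard hashlib module; the helper name sha256_hex and the sample messages are chosen here for illustration only.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


short_digest = sha256_hex("Message A")
long_digest = sha256_hex("Message A" * 10_000)   # a much larger input
changed_digest = sha256_hex("Message B")         # one character changed

# Fixed length: 256 bits are 64 hex characters, whatever the input size.
print(len(short_digest), len(long_digest))

# Deterministic: hashing the same message again gives the same digest.
print(sha256_hex("Message A") == short_digest)

# A one-character change yields a digest with no visible resemblance.
print(short_digest)
print(changed_digest)
```

Comparing the last two printed lines by eye shows that the digests of "Message A" and "Message B" share no obvious structure, even though the inputs differ in a single character.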
In contrast to RSA and ECC asymmetric cryptography, hash functions are not used to exchange confidential messages between parties. They are used to store a verifiable fingerprint of a message, or to let different parties agree on a message, without revealing the plaintext. A hash cannot be decrypted: there is no key that turns it back into the original message. An attacker can only guess candidate plaintexts and hash each one, which is practical only when something about the plaintext is known (for example, that it is a short or common password). This makes hash functions well suited to storing passwords on third-party servers. In this scenario, when the user registers a password, the password is hashed with a chosen hash function, and only the resulting hash is stored by the platform. When the user later logs into the platform, the same hash function is applied to the provided password and the result is compared to the stored hash. The user is granted access only if the original hash and the new hash match. Essentially, the platform matches hashes instead of passwords, so it never needs to store the password itself.
In addition, the security of stored passwords can be increased through a process known as salting. With salting, a random value (called the salt) is added to the plaintext password prior to hashing, so that identical passwords produce different hashes and precomputed tables of common password hashes become useless. However, salting is only effective if the salt is long and random enough that an attacker cannot feasibly generate a table of hashes for every dictionary word combined with every possible salt.
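The registration and login flow with salting might look like the sketch below. It is not a production recipe. As an assumption beyond the text, it uses PBKDF2 (hashlib.pbkdf2_hmac), a deliberately slow construction built on SHA-256, rather than a single plain hash, because fast hashes make password guessing cheap. The function names, the 16-byte salt, and the iteration count are illustrative choices.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for the target hardware


def register(password: str) -> tuple[bytes, bytes]:
    """Return (salt, password_hash); this pair is all the platform stores."""
    salt = os.urandom(16)  # fresh random 128-bit salt for every user
    password_hash = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, password_hash


def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare the hashes."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, stored_hash)


salt, stored = register("correct horse battery staple")
print(verify("correct horse battery staple", salt, stored))
print(verify("wrong guess", salt, stored))
```

Because each call to register draws a new salt, two users with the same password end up with different stored hashes.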
Various properties of hash functions include:
- Efficient to compute: the hash of any message has to be relatively easy to compute, which makes the implementation practical.
- One-way: it should be computationally infeasible to recover the original message from a hash. For example, for Alice to authenticate herself to Bob, she would hash a secret ‘S’ together with the current timestamp and send Bob the timestamp and the hash. The only requirement is that Bob knows the secret too. Once Bob receives the message, he can hash the secret ‘S’ together with the received timestamp and check that he obtains the same hash. The scheme is only secure if it is computationally infeasible to recover the secret from the hash.
- Deterministic: the same message must always result in the same hash.
- Avalanche effect: any change to the original plaintext should change the hash value completely, so that the hashes of similar messages show no apparent correlation.
- Collision-resistant: it should be computationally infeasible to find two different messages that share the same hash value.
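The Alice and Bob example from the list above can be sketched as follows. One deviation from the text, stated plainly: instead of hashing the secret and timestamp concatenated, the sketch uses HMAC-SHA-256, the standard way to combine a secret key with a message under a hash function. The function names, the placeholder secret, and the 30-second freshness window are assumptions for illustration.

```python
import hashlib
import hmac
import time

SHARED_SECRET = b"S"  # placeholder; a real secret would be long and random


def make_token(secret: bytes, timestamp: int) -> str:
    """Alice's side: keyed hash of the timestamp under the shared secret."""
    message = str(timestamp).encode("ascii")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def check_token(secret: bytes, timestamp: int, token: str,
                now: int, max_age: int = 30) -> bool:
    """Bob's side: recompute the hash and reject stale timestamps."""
    if abs(now - timestamp) > max_age:
        return False  # too old (or too far in the future): possible replay
    expected = make_token(secret, timestamp)
    return hmac.compare_digest(expected, token)


# Alice sends (timestamp, token); the secret itself never travels.
timestamp = int(time.time())
token = make_token(SHARED_SECRET, timestamp)

# Bob recomputes the hash from his own copy of the secret.
print(check_token(SHARED_SECRET, timestamp, token, now=int(time.time())))
```

The timestamp keeps an eavesdropper from reusing an old token later, and the one-way property keeps the secret hidden even though the token is sent in the clear.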
Note that a hash function can show that data has not changed over a period of time or during transmission, since anyone who applies the same hash function to the same data obtains the same hash. This property is utilized by blockchain architecture. Information is chained together via linked hashes. Each additional unit is called a block. Each block stores the hash of the previous block, so it indirectly commits to the entire history of the blockchain. Thus, any change to historical inputs changes every subsequent hash in the chain.
However, the hash alone cannot determine which user has authority over the hash-represented data. A system must implement digital signatures based on public-key cryptography to associate specific data with a user. The user signs the hash with their private key, and anyone can verify the signature with the corresponding public key.
Hash functions differ in the size of the hash they generate. Commonly used hash functions include SHA-224, SHA-256 (used by Bitcoin), SHA-384, and SHA-512, where the number denotes the output length in bits.
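A minimal sketch of such a chain of linked hashes is shown below. The block layout (index, data, prev_hash, hash), the all-zero hash for the first block, and the function names are assumptions made for illustration, not the format of any particular blockchain.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(items: list[str]) -> list[dict]:
    """Link each item to its predecessor through the predecessor's hash."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(items):
        current = block_hash(index, data, prev_hash)
        chain.append({"index": index, "data": data,
                      "prev_hash": prev_hash, "hash": current})
        prev_hash = current
    return chain


def is_valid(chain: list[dict]) -> bool:
    """Recompute every hash and check that each link matches."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"],
                                       block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["A pays B", "B pays C", "C pays A"])
print(is_valid(chain))         # True: untouched chain verifies

chain[0]["data"] = "A pays Z"  # tamper with the oldest block
print(is_valid(chain))         # False: stored hash no longer matches
```

Repairing the tampered block would require recomputing its hash, which in turn changes the prev_hash expected by every later block.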
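As a quick check of the naming convention, the snippet below hashes the same input with each of the four functions via Python's hashlib and records the digest length in bits. The input b"hello" is arbitrary.

```python
import hashlib

sizes = {}
for name in ("sha224", "sha256", "sha384", "sha512"):
    digest = hashlib.new(name, b"hello").digest()
    sizes[name] = len(digest) * 8  # bytes to bits
    print(name, sizes[name], "bits")
```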
Common applications of hash functions include:
- Verification of the integrity of a message: Hash functions can be used to show that a message has not been changed. Any change to the message produces a dramatically different hash, so tampering is easy to detect. Blockchains gain their immutability through this characteristic of hashes.
- Password storing and verification: Centralised servers don’t store plaintext passwords, but rather a hash of each password. When the user provides their password to the system, the hash of the password is generated with the same hash function and compared with the stored hash.
- Proof of Work: In Proof of Work, the miner has to solve a cryptographic puzzle by finding an input whose hash meets a given condition. The difficulty of the puzzle depends on the length of the hash and the number of zeros required at the front. Bitcoin popularised this technique as a consensus mechanism for blockchains.
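The Proof of Work puzzle from the last item can be sketched as a brute-force search for a nonce. This is a simplified illustration, not Bitcoin's actual rule: it counts leading zero hex characters of a single SHA-256 hash, and the function name mine and the sample block data are assumptions.

```python
import hashlib


def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Search for a nonce whose hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = f"{block_data}{nonce}".encode("utf-8")
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1  # no shortcut exists: just try the next nonce


nonce, digest = mine("example block", difficulty=4)
print(nonce, digest)
```

Each additional required zero hex digit multiplies the expected number of attempts by 16, while checking a proposed nonce always costs a single hash.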
Hash functions vary in their application as well as in their speed, security, and availability. The type of hash function chosen depends on the intended application.
Chameleon hash function
Chameleon hash functions are also called trapdoor hash functions, because they include a trapdoor that makes it possible to find collisions in the domain of the function. Without the trapdoor, a chameleon hash function is collision-resistant: it is computationally infeasible to find two different inputs that produce the same hash. Whoever holds the trapdoor, however, can compute such collisions efficiently. “These hash functions are characterised by the nonstandard property of being collision-resistant for the signer but collision tractable for the recipient.” (7)
Chameleon signatures evolved from chameleon functions. These signatures allow the signer to sign a digital document. The recipient of the signed document can verify that it has been signed, but cannot convincingly prove the signed contents to a third party without the signer's cooperation, so the document details remain private.
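To make the trapdoor idea concrete, here is a toy sketch of one well-known discrete-logarithm-based chameleon hash, CH(m, r) = g^m * y^r mod p with y = g^x mod p, where x is the trapdoor. It is offered as an illustration under assumptions, not as the construction the quoted source has in mind. The parameters are far too small to be secure, and messages are represented directly as small integers (a real scheme would first map the message to an integer, for example with an ordinary hash).

```python
# Toy parameters: P = 2*Q + 1 with P and Q prime. Insecure, illustration only.
P = 23  # modulus
Q = 11  # prime order of the subgroup generated by G
G = 2   # generator of the order-Q subgroup of the integers modulo P

TRAPDOOR = 7                      # secret x, held by the trapdoor owner
PUBLIC_KEY = pow(G, TRAPDOOR, P)  # y = g^x mod p, known to everyone


def chameleon_hash(message: int, randomness: int) -> int:
    """CH(m, r) = g^m * y^r mod p; computable by anyone from public values."""
    return (pow(G, message % Q, P) * pow(PUBLIC_KEY, randomness % Q, P)) % P


def find_collision(message: int, randomness: int, new_message: int) -> int:
    """Use the trapdoor to find r' such that CH(m, r) == CH(m', r')."""
    # Equal hashes require m + x*r = m' + x*r' (mod Q),
    # hence r' = r + (m - m') * x^-1 (mod Q).
    x_inverse = pow(TRAPDOOR, -1, Q)  # modular inverse (Python 3.8+)
    return (randomness + (message - new_message) * x_inverse) % Q


m, r = 3, 5
m_new = 9
r_new = find_collision(m, r, m_new)

print(chameleon_hash(m, r) == chameleon_hash(m_new, r_new))  # True
```

Without TRAPDOOR, computing r_new would amount to solving a discrete logarithm, which is what makes the function collision-resistant for everyone else at realistic parameter sizes.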
In contrast to encryption schemes like RSA and ECC, hash functions are one-way computable. Hash functions allow users to generate the hash of a message of any size and store it securely. Even if the hash is obtained by a third party, the plaintext cannot be recovered from it unless the third party already knows enough about the plaintext to guess it. The type of hash function employed depends on the application and the type of data stored. Hash functions vary in security, speed, and availability. As a result, hash functions with longer outputs are used where higher security is needed, while shorter ones offer speed advantages. The applications of hashing are diverse and include PoW mining, use cases in blockchain architecture, password management, and data security. Novel implementation ideas involving hashing continue to be developed.