An Illustrated Guide to Cryptographic Hashes
by Steve Friedl
15 pages of text
Update 2006.02.11: clearer explanation of CTFP preimage resistance.
This is a very good introduction to what a hash algorithm is, what it is for and what collisions are all about. It does not cover specific details, only the general understanding. It’s a quick read so I’ll forgo summarizing the contents.
The article explains the common terms used in most papers that discuss collisions. These terms classify the types of collision attack that are possible, and you need to understand them when reading other papers:
- Collision resistance measures how difficult it is to create two different inputs that produce the same hash value. Any hash value will do, as long as it is the same for both inputs. In this scenario the attacker controls both inputs.
- Preimage resistance measures how difficult it is to create one input which matches the hash value of an unknown input. Here the attacker does not know the other input and is restricted by needing to create a specific hash value.
- Second preimage resistance measures how difficult it is to create one input which matches the hash value of a known input. Here the attacker can see both inputs but only controls one. The attacker is still restricted by having to create an input which matches the specific hash value of the other. However, knowing the input that produced the hash might be of assistance.
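To make the difference between the three definitions concrete, here is a minimal sketch that brute-forces each attack against a deliberately weak hash. The function `toy_hash` is my own assumption for illustration (SHA-256 truncated to 16 bits), and the helper names and message formats are hypothetical. Against a generic hash, brute force is the only strategy shown, so preimage and second preimage look the same here; real attacks exploit the structure of a particular algorithm.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes, bits: int = 16) -> int:
    """Hypothetical weak hash: the top `bits` bits of SHA-256 (illustration only)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def find_collision(bits: int = 16):
    """Collision: the attacker picks BOTH inputs, and any shared hash value will do."""
    seen = {}  # hash value -> first message that produced it
    for i in count():
        msg = b"msg-%d" % i
        h = toy_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

def find_preimage(target: int, bits: int = 16):
    """Preimage: only the hash value is known; one specific value must be hit."""
    for i in count():
        msg = b"try-%d" % i
        if toy_hash(msg, bits) == target:
            return msg, i + 1

def find_second_preimage(known: bytes, bits: int = 16):
    """Second preimage: the other input is visible, but its hash is still fixed."""
    target = toy_hash(known, bits)
    for i in count():
        msg = b"try-%d" % i
        if msg != known and toy_hash(msg, bits) == target:
            return msg, i + 1

if __name__ == "__main__":
    a, b, tries = find_collision()
    print("collision after", tries, "hashes:", a, b)
    m, tries = find_preimage(toy_hash(b"some unknown input"))
    print("preimage after", tries, "hashes:", m)
    m2, tries = find_second_preimage(b"a known input")
    print("second preimage after", tries, "hashes:", m2)
```

With an n-bit hash, the collision search is expected to succeed after roughly 2^(n/2) attempts (the birthday bound), while the two preimage searches need roughly 2^n attempts, which is why collision resistance is the first property to fail in practice.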
Both preimage and second preimage are similar in that the objective is to get one input to match a predefined hash which is not controlled by the attacker. Also, in the Herding Hash Functions paper, John Kelsey and Tadayoshi Kohno describe a fourth resistance property:
- Chosen Target Forced Prefix preimage resistance measures how difficult it is to create a collision when the first input is known while the second input is not known yet. This is similar to preimage resistance except that here the attacker controls the first input and not the second. Well, almost. The attacker is permitted to append data to the second input. The attacker must determine the hash first using the first input and then “herd” the second input to the same hash. Herding is done by adding data to the second input to make it collide. It is a process that involves carefully predetermining the first input and using internal states from its hash generation in the data appended to the second.
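The herding idea can be sketched on a toy scale. The code below is my own simplified illustration, not the paper's construction in full: it assumes a toy iterated hash with a 16-bit internal state, 4-byte blocks and no length padding (real hashes append the message length, which the actual attack has to account for). All names (`compress`, `toy_md`, `build_diamond`, `herd`) are hypothetical. The attacker first builds a small "diamond" of colliding internal states that all funnel into one committed hash, and only later receives the forced prefix.

```python
import hashlib
from itertools import count

BLOCK = 4  # toy block size in bytes (assumption)
IV = 0     # toy initial state (assumption)

def compress(state: int, block: bytes) -> int:
    """Toy compression function with a 16-bit state, built from SHA-256."""
    data = state.to_bytes(2, "big") + block
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

def pad(msg: bytes) -> bytes:
    """Zero-pad to whole blocks. Unlike real hashes, no length field is added."""
    return msg + b"\x00" * (-len(msg) % BLOCK)

def toy_md(msg: bytes) -> int:
    """Iterate the compression function over the padded message."""
    state, msg = IV, pad(msg)
    for i in range(0, len(msg), BLOCK):
        state = compress(state, msg[i:i + BLOCK])
    return state

def collide(sa: int, sb: int, taken: set):
    """Find blocks ma, mb with compress(sa, ma) == compress(sb, mb), not in `taken`."""
    seen_a, seen_b = {}, {}
    for i in count():
        blk = i.to_bytes(BLOCK, "big")
        ha, hb = compress(sa, blk), compress(sb, blk)
        seen_a.setdefault(ha, blk)
        seen_b.setdefault(hb, blk)
        for h in (ha, hb):
            if h in seen_a and h in seen_b and h not in taken:
                return seen_a[h], seen_b[h], h

def build_diamond(k: int):
    """Merge 2**k arbitrary states pairwise, level by level, into one final state."""
    level = list(range(1, 2 ** k + 1))  # arbitrary distinct starting states
    leaves, edges, depth = set(level), {}, 0
    while len(level) > 1:
        nxt = []
        for sa, sb in zip(level[0::2], level[1::2]):
            ma, mb, h = collide(sa, sb, set(nxt))
            edges[(depth, sa)] = (ma, h)
            edges[(depth, sb)] = (mb, h)
            nxt.append(h)
        level, depth = nxt, depth + 1
    return leaves, edges, level[0]  # level[0] is the hash committed to in advance

def herd(prefix: bytes, leaves: set, edges: dict) -> bytes:
    """Given a forced prefix, append blocks that steer it to the committed hash."""
    head = pad(prefix)
    state = toy_md(head)
    for i in count():  # search for a linking block that lands on any leaf state
        link = i.to_bytes(BLOCK, "big")
        s = compress(state, link)
        if s in leaves:
            break
    suffix, depth = link, 0
    while (depth, s) in edges:  # follow the precomputed path to the final state
        blk, s = edges[(depth, s)]
        suffix += blk
        depth += 1
    return head + suffix

if __name__ == "__main__":
    leaves, edges, committed = build_diamond(3)
    msg = herd(b"prefix chosen later by someone else", leaves, edges)
    print(hex(committed), hex(toy_md(msg)))
```

The wider the diamond, the cheaper the linking step becomes: with 2^k leaf states, the search for a linking block needs about 2^(n-k) attempts for an n-bit state instead of the 2^n of a plain preimage search.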