Newer
Older
labs / tiddlers / content / labs / 03 / _Labs_03_Password hashing schemes.md

As mentioned before, a hashing scheme such as MD5 has characteristics that make it unsuitable or even dangerous for use as a password hashing scheme. Superior algorithms such as scrypt ("ess-scrypt") and Argon2 require much greater workloads (particularly CPU and/or memory utilisation) to compute password hashes, greatly increasing the workload for any brute-force attacks. They usually also use tuneable parameters so that system administrators can adjust the workloads to give adequate security in the face of advancing technology (faster processors, more memory, new hardware such as GPUs and FPGAs).

Whereas MD5 can apply directly to an input value, scrypt requires some additional parameters:

  • A salt (which ideally would be random bytes).
  • A cost factor n, which influences CPU and memory load. This must be a power of 2, and the standard suggested value is 16384.
  • A block size r, which influences memory access patterns. A common value for this is 8.
  • A parallelisation factor p.

Here is an example demonstrating its use in Python:

import hashlib

password = b'PasswordPlainText'
scrypted = hashlib.scrypt(password, salt=b'IdeallyWouldBeRandom', n=2, r=8, p=1)
target_hash = scrypted.hex()

(Note the slightly different method of converting the scrypt hash output into a hexadecimal string.)

Exercise: What is the default length of scrypt hashes (in bits)?

Exercise: Make a copy of the Python crack function named crack_scrypt, and modify the code so that it uses the hashlib.scrypt() function instead of hashlib.md5(). Add an additional parameter work_factor which maps to the scrypt n value; your function's signature should be def crack_scrypt(target_hash, work_factor, alphabet, max_length):. To keep things simple, use the values shown in the example above for the other parameters.

Exercise: Perform a series of timing tests using your crack_scrypt function, to find out how the speed of scrypt compares with plain MD5, and how it varies with the work factor n. To keep the timings reasonable, use the lowercase letters alphabet and a max_length of 2. Start with the recommended work factor of 16384, and halve this number for each successive test (8192, 4096, etc.), plotting your results in a spreadsheet. You only need to test work factors down to 128.

To help you carry out the timing tests easily, use the following code as a single line (and if you are comfortable with Python programming already, you could automate the timing tests using a loop):

start = timer(); crack_scrypt(target_hash, work_factor=1024, alphabet=letters, max_length=4); end = timer(); print(end - start, "seconds")

How do the computation times compare with the MD5 case? How does the time vary as you increase the value of n?

Exercise: Discuss the significance of the salt parameter, especially in terms of how it would affect a brute-force attack like the ones in this lab. What would be the implications of using a predictable salt value?