As mentioned in "Hashing - The Basics", I felt that a separate post on hashing for passwords was required.

Principle

The principle of storing hashed passwords is fairly simple and can be described in these steps:

  1. When an account is created the plain text password is entered
  2. The password is run through a hashing algorithm
  3. The output of the hash is stored

 

  1. When the user logs in they enter their plain text password again
  2. The password is run through the hashing algorithm
  3. The hash just generated is checked against the stored one. If the two match then the user has entered the same password the account was created with.
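The two sets of steps above could be sketched roughly as follows. The post does not name a hashing algorithm, so SHA-256 is used here purely as a stand-in, and the class and method names (UnsaltedPasswordDemo, HashPassword, VerifyPassword) are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UnsaltedPasswordDemo
{
    // Account creation: run the plain text password through the hash
    // and return the Base64 text that would be stored.
    // SHA-256 is an assumption; the post does not name an algorithm.
    public static string HashPassword(string password)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(password));
            return Convert.ToBase64String(hash);
        }
    }

    // Login: hash the attempt the same way and compare with the stored hash.
    public static bool VerifyPassword(string attempt, string storedHash)
    {
        return HashPassword(attempt) == storedHash;
    }
}
```

Calling UnsaltedPasswordDemo.HashPassword("mypass") at account creation and passing the result to VerifyPassword at login returns true only when the same password is entered again.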

Salting

One of the most important techniques with hashing is salting. Salting is the process of injecting random data into a string before it is hashed. There are a few reasons for this.

  1. Collision avoidance. Without a salt, two users with the same password will have the same hash in the database. This means that if we crack one password we have also cracked the other.
  2. Additional complexity. By adding good random data to the string, the value being hashed becomes far too long and unpredictable to appear in a precomputed table of common passwords and their hashes. Without a salt an attacker could easily recover "mypass" as the password; with a salt the table would have to contain mypass#$^q∩gwjoεai←cyuw3b5asdφ♂
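The first point can be seen in a small sketch. SHA-256, the 16-byte salt size, and the names SaltEffectDemo, Hash and NewSalt are all assumptions made for illustration; RandomNumberGenerator.Create() is used here as the general .NET entry point for cryptographically strong random bytes.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SaltEffectDemo
{
    // Hashes any string with SHA-256 (a stand-in algorithm) and returns Base64 text.
    public static string Hash(string input)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(Encoding.UTF8.GetBytes(input)));
        }
    }

    // Produces a fresh 16-byte random salt as Base64 text (size is an assumption).
    public static string NewSalt()
    {
        byte[] salt = new byte[16];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(salt);
        }
        return Convert.ToBase64String(salt);
    }
}
```

Here SaltEffectDemo.Hash("mypass") always returns the same text, so two users sharing that password would share a hash. Hashing "mypass" + SaltEffectDemo.NewSalt() for each user gives each of them a different hash.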

One thing to note here is that the random data must be good. System.Random does not supply cryptographically strong random numbers. Use RNGCryptoServiceProvider instead.

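using System;
using System.Security.Cryptography;

public static string GenerateSalt(int sizeInBits)
{
    // The provider is IDisposable, so release it once the salt is generated
    using (RNGCryptoServiceProvider provider = new RNGCryptoServiceProvider())
    {
        byte[] salt = new byte[sizeInBits / 8]; // convert bits to bytes
        provider.GetBytes(salt); // fill the buffer with strong random bytes
        return Convert.ToBase64String(salt);
    }
}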

Adding salting makes the account creation steps a little more involved. The reason is that when the user logs in we need the original salt that the hash was created with. Normally we store the salt alongside the hash, but there are some other techniques.

Here are the steps again with salting involved:

  1. When an account is created the plain text password is entered
  2. A random salt is generated and appended onto the password (e.g. "mypass" + "#$^q∩gwjoεai←cyuw3b5asdφ♂")
  3. The password+salt is run through a hashing algorithm
  4. The output of the hash is stored
  5. The salt used is also stored

 

  1. When the user logs in they enter their plain text password again
  2. The salt is loaded
  3. The plain text password + salt is run through the hashing algorithm
  4. The hash just generated is checked against the stored one. If the two match then the user has entered the same password as the account was created with.
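Put together, the salted steps might look something like the sketch below. The StoredPassword type, the SaltedPasswordDemo class, the 128-bit salt size and the choice of SHA-256 are all assumptions for illustration, not something the post prescribes.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical record holding what would be saved for each account.
public sealed class StoredPassword
{
    public string Salt;
    public string Hash;
}

public static class SaltedPasswordDemo
{
    // Account creation: generate a salt, hash password + salt, keep both.
    public static StoredPassword Create(string password)
    {
        byte[] saltBytes = new byte[16]; // 128-bit salt, an assumed size
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(saltBytes);
        }
        string salt = Convert.ToBase64String(saltBytes);
        return new StoredPassword { Salt = salt, Hash = HashWithSalt(password, salt) };
    }

    // Login: load the stored salt, hash attempt + salt, compare with the stored hash.
    public static bool Verify(string attempt, StoredPassword stored)
    {
        return HashWithSalt(attempt, stored.Salt) == stored.Hash;
    }

    // Uses the 'password' + 'salt' ordering; SHA-256 is a stand-in algorithm.
    private static string HashWithSalt(string password, string salt)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] input = Encoding.UTF8.GetBytes(password + salt);
            return Convert.ToBase64String(sha.ComputeHash(input));
        }
    }
}
```

Because HashWithSalt is the only place the password and salt are combined, creation and login cannot drift apart in how they build the hash.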

Now it does not always have to be 'password' + 'salt'. We could reverse it to be 'salt' + 'password' or even 'salt' + 'password' + 'salt'. As long as the procedure we use to create the hash is the same one we use to check it, the possibilities are endless. In fact, as I write this, I would say that 'password' + 'salt' is such a common pattern that, to add more security to your hashes, you could mix the salt in differently (e.g. 'salt' + 'password' + 'salt').

IMPORTANT: Do not use the same salt for each record. Having different random data thrown into each record will make each record harder to crack. I will discuss this idea a bit more in my next post on attacking hashed passwords.

Drawbacks

Hashing passwords does have its drawbacks like anything else:

1. Passwords cannot be retrieved for a user. This is because hashes are just a fingerprint of the data; they never actually contain the password. The way around this is to offer the ability to reset a user's password (i.e. generate a new hash/salt and overwrite the old one in the database).

2. Creation / login disparities. I once had an issue where the password did not get its whitespace trimmed when the account was created. The account was created with a password of "blah " but the user could not log in with "blah", because "blah" and "blah " result in two totally different hashes. This took quite a while to debug.

3. Database/code truncation. A truncated plain text password is easy to spot because it looks like "HelloMyNameIsSi", but a truncated hash looks like "A56FY#SDFH^(EU", which at a glance is indistinguishable from a complete one. These issues can be hard to spot and debug.

4. Performance. The performance impact is not that large but it is still higher than a plain text login system. I would never sacrifice security for performance myself and there are usually better places to improve performance of a system.
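Drawbacks 2 and 3 can both be guarded against in code. The sketch below is one possible approach under stated assumptions: the policy that surrounding whitespace is never part of a password, the use of SHA-256 with Base64 output, and the names HashGuards, Normalize, Hash, Base64Length and LooksComplete are all hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashGuards
{
    // Assumed policy: surrounding whitespace is never part of a password.
    // Both account creation and login must go through this before hashing.
    public static string Normalize(string rawPassword)
    {
        return rawPassword.Trim();
    }

    // Hashes the normalized password with SHA-256 (a stand-in algorithm).
    public static string Hash(string rawPassword)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] input = Encoding.UTF8.GetBytes(Normalize(rawPassword));
            return Convert.ToBase64String(sha.ComputeHash(input));
        }
    }

    // Base64 encodes every 3 bytes as 4 characters, padding the last group.
    public static int Base64Length(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    // True when a stored Base64 hash has the full length expected for
    // the algorithm's output size (32 bytes for SHA-256).
    public static bool LooksComplete(string storedHash, int hashSizeInBytes)
    {
        return storedHash != null && storedHash.Length == Base64Length(hashSizeInBytes);
    }
}
```

With this in place, HashGuards.Hash("blah ") and HashGuards.Hash("blah") produce the same value, and HashGuards.LooksComplete("A56FY#SDFH^(EU", 32) returns false because a Base64-encoded SHA-256 hash is always 44 characters long.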