More about passwords

In my previous post I said that proper password hashing is not harder to code than MD5 and here I’m going to demonstrate that. Here’s how it works:

 

//calculate hash from cleartext $password
$hash = password_hash($password, PASSWORD_BCRYPT, array('cost' => 12));
//now store it in database

//when registered user wants to log in
if(password_verify($password, $hash)){
    //OK, password is valid
}
else{
    //wrong password
}

This is it in its simplest form. The second argument specifies the algorithm. PASSWORD_DEFAULT currently uses bcrypt, but it may change over time, so the length of the hash string might change too. PASSWORD_BCRYPT uses CRYPT_BLOWFISH and always produces a 60 character hash; however, it only processes the first 72 bytes of the input and discards the rest. The optional third argument specifies salt and cost. If the salt is omitted, a random salt is generated and stored in the hash string. This is the intended mode of usage. The beauty of password_hash() is that it stores the algorithm, cost and salt in the hash, so you don’t have to worry about these things when you call password_verify().
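
To make that last point concrete, here is a small sketch of how a bcrypt hash string is laid out. It is in JavaScript, like the snippet at the end of this post. The helper name parseBcryptHash and the sample string are made up for illustration; the sample only has the right shape and is not a real hash. In PHP itself, password_get_info() reports the algorithm and options for you.

```javascript
// Layout of a bcrypt hash string: $<version>$<2-digit cost>$<22-char salt><31-char digest>
// parseBcryptHash is a hypothetical helper, not part of PHP or of any library.
function parseBcryptHash(hash) {
    var m = /^\$(2[abxy])\$(\d{2})\$([.\/A-Za-z0-9]{22})([.\/A-Za-z0-9]{31})$/.exec(hash);
    if (!m) {
        return null; // not shaped like a bcrypt hash
    }
    return {
        version: m[1],
        cost: parseInt(m[2], 10),
        salt: m[3],
        digest: m[4]
    };
}

// Placeholder with the same shape as a real hash (made up, it verifies nothing)
var sample = '$2y$12$' + 'a'.repeat(22) + 'b'.repeat(31);
var info = parseBcryptHash(sample);
console.log(sample.length, info.version, info.cost);
```

Everything password_verify() needs is in that one string, which is why a single database column is enough.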

Cost is the more interesting option: it specifies how much CPU work is needed to calculate the hash. Values must be in the range 04-31 and the scale is logarithmic (base 2), meaning that each step is 2 times slower (as you go up) or 2 times faster (as you go down). The default cost is 10, which is a good baseline, but you should experiment on your server to see what suits you. Here’s a simple benchmark test:
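
Because the scale is logarithmic, one measurement is enough to roughly estimate the others. A minimal sketch in JavaScript, where estimateHashTime is a hypothetical helper and the 50 ms figure is an assumed measurement, not a real benchmark result:

```javascript
// bcrypt repeats its key setup 2^cost times, so each cost step doubles the time.
// estimateHashTime is a hypothetical helper; real timings include some fixed overhead.
function estimateHashTime(measuredMs, measuredCost, targetCost) {
    return measuredMs * Math.pow(2, targetCost - measuredCost);
}

// Assumption: hashing took 50 ms at cost 10 on this machine
console.log(estimateHashTime(50, 10, 12)); // 200
console.log(estimateHashTime(50, 10, 9));  // 25
```

Treat the result as a rough guide only and confirm it with a real measurement on the target server.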

$start = microtime(true);
$hash = password_hash('test', PASSWORD_BCRYPT, array('cost' => 11));
$end = microtime(true);
$time = $end - $start;
echo "Elapsed time: $time\n";
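
Once you have a timing, you can pick the highest cost that still fits your time budget. The sketch below shows the idea in JavaScript; pickCost is a hypothetical helper, and the timings and budgets are assumed numbers rather than measurements.

```javascript
// Returns the highest cost whose estimated time stays within budgetMs,
// starting from one measurement (measuredMs at measuredCost).
// Assumes time doubles per cost step; the result stays inside bcrypt's 4-31 range.
function pickCost(measuredMs, measuredCost, budgetMs) {
    var cost = measuredCost;
    var ms = measuredMs;
    while (cost < 31 && ms * 2 <= budgetMs) { // room to go up
        cost++;
        ms *= 2;
    }
    while (cost > 4 && ms > budgetMs) { // over budget, go down
        cost--;
        ms /= 2;
    }
    return cost;
}

// Assumption: the benchmark above printed about 0.06 seconds at cost 11
console.log(pickCost(60, 11, 250)); // 13
```

After picking a value this way, run the benchmark again with that cost to check the real time.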

I also spoke of password entropy. Entropy is a measure of password strength expressed as the base 2 logarithm of the number of guesses it would take to break the password. For example, a 30 bit password has 2^30 possible combinations. So, how do you calculate this when most calculators don’t have a base 2 log, and how do you calculate password strength in the first place?

Password strength is calculated as the number of characters in the character set raised to the power of the password length. So, if we use a password with 8 characters made of lowercase letters and digits, the password strength is (26+10)^8. This is a theoretical value that assumes the password is random, which is never the case, but that’s another story. The base 2 log of a number N is easily calculated from the base 10 log or the natural logarithm (ln) like this: log10(N)/log10(2), or ln(N)/ln(2).
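
As a worked example, the calculation above fits in a few lines of JavaScript. entropyBits is a hypothetical helper name, and it relies on the identity log2(C^L) = L * log2(C), which avoids computing the huge number C^L itself.

```javascript
// Entropy in bits of a random password:
// log2(charsetSize ^ length) = length * ln(charsetSize) / ln(2)
function entropyBits(charsetSize, length) {
    return length * Math.log(charsetSize) / Math.log(2);
}

// 8 characters drawn from lowercase letters and digits: (26+10)^8 combinations
console.log(entropyBits(26 + 10, 8).toFixed(1)); // 41.4
```

As noted above, this is the theoretical value for a truly random password, so real passwords will score lower.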

And finally, here’s a snippet of jQuery code I use on my Numbers Relay Page to calculate password strength on registration. It gives only an approximate value.

var password = $('#regPassword').val(); //grab password from the form
var chBase = 0; //initialize counter to 0
if(password.match(/[a-z]/g)){
    chBase += 26; //if password contains lowercase letters increase counter by 26
}
if(password.match(/[A-Z]/g)){
    chBase += 26;  //the same for uppercase letters
}
if(password.match(/[0-9]/g)){
    chBase += 10; //digits
}
if(password.match(/[^a-zA-Z0-9]/g)){
    chBase += 32;  //standard US keyboard has 32 non-alphanumeric characters
}

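//length * log2(chBase) equals log2(chBase^length) but does not overflow on very long passwords
var entropy = password.length ? Math.round(password.length * Math.log(chBase) / Math.log(2)) : 0;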

If you want to find out more about this topic I suggest you start with an excellent Wikipedia article.


Hashing passwords properly

Hopefully everyone knows that storing passwords in plain text is a very bad idea. That one should be obvious. What’s not so obvious is that not all hashing algorithms are safe.

One of the least safe and, unfortunately, most widely used is MD5. SHA1, although in theory a little safer, is just as bad. The problem with these algorithms is that they are fast. They are designed that way, yet many books and tutorials still use them for passwords. For example, an average computer with a fast GPU could try about 8 billion MD5 hashes or 3 billion SHA1 hashes each second. To put this into perspective, a 6 character password made of random letters, single case, has about 300 million possible combinations, so every one of them can be tried in well under a second.
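
The arithmetic behind that comparison is easy to check. Here is a short JavaScript sketch; secondsToExhaust is a hypothetical helper, and the hash rates are simply the figures quoted above, not new measurements.

```javascript
// Worst-case time to try every password of a given length over a given character set.
// guessesPerSecond is the attacker's hash rate (the figures quoted above, not measurements).
function secondsToExhaust(charsetSize, length, guessesPerSecond) {
    return Math.pow(charsetSize, length) / guessesPerSecond;
}

console.log(Math.pow(26, 6));              // 308915776 combinations
console.log(secondsToExhaust(26, 6, 8e9)); // MD5: roughly 0.04 seconds
console.log(secondsToExhaust(26, 6, 3e9)); // SHA1: roughly 0.1 seconds
```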

There are techniques used to make these insecure algorithms more secure. Adding salt, for example, is good protection against rainbow tables, but it does little to protect against brute force attacks. Key stretching and peppering appear to be better, but in the field of security appearances are not good enough. As Bruce Schneier said:

“Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can’t break.”

Reinventing the wheel is bad, especially in cryptography. General-purpose hash functions were designed for signing, and they are designed to be fast. Using them for hashing passwords is therefore a bad idea, even if peppering is applied. GPU attacks are one of the reasons for this: GPUs can calculate these hashes very fast, but they are far less effective against bcrypt.

To store passwords securely, use the Password Hashing API. It has been available since PHP 5.5 and it’s easy to use. Bear in mind that the 60 character hash string may grow in the future and that manually specifying the salt is deprecated as of PHP 7.0.

Now, why is this so important? You might be thinking that your database is secure and that, by imposing password length and complexity requirements, you made your users choose secure passwords. The problem is that no database is perfectly safe and, more importantly, we humans can’t generate secure passwords. Even something like a passphrase made of 5 random words has only 30 bits of entropy, which is really insecure. And finally, if you are thinking that you don’t have to worry about your users’ passwords being cracked because your web service is not very important, think again. Password reuse is a big problem: many of your users will be using the same password for their email and other important accounts.
