Understand PHP Hash function to enhance password security

  • 2020-03-31 21:36:31
  • OfStack

1. Disclaimer
Cryptography is a complex topic, and I'm not an expert on it. Many universities and research institutes have conducted long-term research in this field. In this article, I hope to show you, in an easy-to-understand way, how to securely store passwords for Web applications.
2. What does "Hash" do?
"A Hash converts a piece of data (small or big) into a relatively small piece of data, such as a string or integer."
This is done with a one-way hash function. "One-way" means that it is difficult (or virtually impossible) to reverse the result back into the original data. A common example of a hash function is md5(), which is available in many programming languages and systems.
 
$data = "Hello World"; 
$hash = md5($data); 
echo $hash; // b10a8db164e0754105b7a99be72e3fe5 

The result of an md5() operation is always a 32-character string, but it contains only hexadecimal characters, which can technically be represented as a 128-bit (16-byte) integer. You can use md5() to handle long strings and data, but you'll always get a fixed-length hash value, which may also help you understand why the function is "one-way."
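The fixed output length can be checked directly. The following is a minimal sketch; the input values are arbitrary and chosen only for illustration:

```php
<?php
// Inputs of very different sizes (chosen only for illustration).
$inputs = ["a", "Hello World", str_repeat("x", 100000)];

foreach ($inputs as $input) {
    $hash = md5($input);
    // strlen($hash) is the same no matter how long the input is.
    printf("%6d bytes in -> %d hex characters out\n", strlen($input), strlen($hash));
}
```

Because 100,000 bytes of input and a single byte of input both map to 32 hexadecimal characters, many different inputs must share each output, so the original data cannot be recovered from the hash alone.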
3. Using hash functions to store passwords
Typical user registration process:
The user fills in the registration form, which contains a password field;
The program stores all the information filled in by the user in the database;
However, the password is hashed before it is stored in the database;
The original password is not stored anywhere; it is discarded.
User login process:
The user enters a username and password;
The program hashes the entered password with the same hash function used at registration;
The program looks up the user in the database and reads the stored password hash;
The program compares the two hashes and grants access if they match.
Later in this article we discuss how to choose an appropriate method for hashing passwords.
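The two flows above can be sketched as follows. This is only an illustration under stated assumptions: the in-memory $users array stands in for a real database, the function names register_user(), login_user() and placeholder_hash() are hypothetical, and md5() is used purely as a placeholder for whichever hash function ends up being chosen.

```php
<?php
// Hypothetical in-memory "database": username => stored password hash.
$users = [];

// Placeholder hash function; NOT a recommendation for real systems.
function placeholder_hash($password) {
    return md5($password);
}

function register_user(array &$users, $username, $password) {
    // Only the hash is stored; the original password is discarded.
    $users[$username] = placeholder_hash($password);
}

function login_user(array $users, $username, $password) {
    if (!isset($users[$username])) {
        return false; // unknown user
    }
    // Hash the attempt the same way and compare it with the stored hash.
    return placeholder_hash($password) === $users[$username];
}

register_user($users, 'alice', 'supersecretpassword');
var_dump(login_user($users, 'alice', 'supersecretpassword')); // bool(true)
var_dump(login_user($users, 'alice', 'wrongpassword'));       // bool(false)
```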
4. Problem 1: hash collisions
A hash collision is when you hash two different things to get the same hash value. The likelihood of a hash collision depends on the hash algorithm used.
How can this be exploited?
For example, some older programs use crc32() to hash a password, which produces a 32-bit integer as the hash result, meaning that there are only 2^32 (i.e., 4,294,967,296) possible outputs.
Let's hash a password:
 
echo crc32('supersecretpassword'); 
// outputs: 323322056 

Now let's say someone steals the database and gets the hashed password. The attacker may not be able to reverse 323322056 back into 'supersecretpassword', but can find another password that hashes to the same value. All it takes is a very simple program:
 
set_time_limit(0);
$i = 0;
while (true) {
    if (crc32(base64_encode($i)) == 323322056) {
        echo base64_encode($i);
        exit;
    }
    $i++;
}

The program may take a while to run, but eventually it will return a string. We can use this string instead of 'supersecretpassword' to log in to the user's account successfully.
For example, after running the above program on my computer for a while, I got the string 'MTIxMjY5MTAwNg==' (the base64 encoding of 1212691006, so the loop ran about 1.2 billion times). Let's test it out:
 
echo crc32('supersecretpassword'); 
// outputs: 323322056 
echo crc32('MTIxMjY5MTAwNg=='); 
// outputs: 323322056 

How to solve it?
Now that a reasonably powerful home PC can compute a billion hashes a second, we need a hash function with a much larger range of outputs. md5() is more appropriate, for example: it produces a 128-bit hash value, which means 340,282,366,920,938,463,463,374,607,431,768,211,456 possible outputs. It is therefore not generally feasible to loop that many times to find a collision, although some people have still found ways to produce md5() collisions.
sha1() is a better alternative because it produces a 160-bit hash value.
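The output sizes mentioned above can be compared with PHP's generic hash() function. This is a small sketch; it assumes the hash extension is available (it is part of standard PHP builds) and uses 'crc32b', the hash() algorithm name that corresponds to the crc32() function.

```php
<?php
$data = 'supersecretpassword';

// hash() with raw output (third argument true) returns the digest as bytes,
// so its length times 8 is the digest size in bits.
$bits = [];
foreach (['crc32b', 'md5', 'sha1'] as $algo) {
    $bits[$algo] = strlen(hash($algo, $data, true)) * 8;
    echo $algo . ': ' . $bits[$algo] . " bits\n";
}
```

Each additional bit doubles the number of possible outputs, which is why moving from 32 to 128 or 160 bits makes the brute-force loop shown earlier impractical.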
5. Problem 2: rainbow tables
Even if we solve the collision problem, it's still not safe enough.
"The rainbow table is a table built by calculating the hash values of common words and their combinations."
This table may store millions or even billions of pieces of data. Storage is now very cheap, so you can build very large rainbow tables.
Now let's say someone steals the database and gets millions of hashed passwords. The thief can easily look up these hash values one by one in the rainbow table and get the original password. Not all hash values will be found in the rainbow table, but some will be.
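The lookup described above can be sketched with a plain precomputed array. This is a simplification: real rainbow tables use a more compact structure, and the word list and variable names here are made up for illustration.

```php
<?php
// A tiny word list; real tables hold millions or billions of entries.
$common_passwords = ['password', '123456', 'letmein', 'easypassword'];

// Precompute hash => original password once, ahead of time.
$lookup = [];
foreach ($common_passwords as $word) {
    $lookup[sha1($word)] = $word;
}

// Pretend these hashes came from a stolen database.
$stolen_hashes = [sha1('easypassword'), sha1('Tr0ub4dor&3 horse staple')];

$recovered = [];
foreach ($stolen_hashes as $hash) {
    // Each lookup is a cheap array access; nothing is hashed again.
    $recovered[] = isset($lookup[$hash]) ? $lookup[$hash] : null;
}
var_dump($recovered);
```

The first hash is recovered because its password is in the word list; the second is not, which matches the point that only some hashes will be found.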
How to solve it?
We can add a salt, a string of interference, to the password, as in the following example:
 
$password = "easypassword"; 
// this may be found in a rainbow table 
// because the password contains 2 common words 
echo sha1($password); // 6c94d3b42518febd4ad747801d50a8972022f956 
// use bunch of random characters, and it can be longer than this 
$salt = "f#@V)Hu^%Hgfds"; 
// this will NOT be found in any pre-built rainbow table 
echo sha1($salt . $password); // cd56a16759623378628c0d9336af69b74d9d71a5 

All we've done here is prepend a salt to each password before hashing it. As long as the salt is sufficiently complex, the resulting hash will not be found in any pre-built rainbow table. But it's still not safe enough.
6. Problem 3: rainbow tables again
Note that a rainbow table can be created from scratch after the salt has been stolen. The salt can be stolen along with the database, and the attacker can then use it to build a new rainbow table. The hash value for "easypassword" may exist in an ordinary rainbow table, but the new table will also contain the hash value for "f#@V)Hu^%Hgfdseasypassword".
How to solve it?
We can use a unique salt for each user. One possible solution is to use the user's id from the database:
 
$hash = sha1($user_id . $password); 

The premise of this approach is that the user's id never changes (which is the case in most applications).
We can also randomly generate a unique salt for each user, but then we also need to store that salt:
 
// generates a 22 character long random string
function unique_salt() {
    return substr(sha1(mt_rand()), 0, 22);
}
$unique_salt = unique_salt();
$hash = sha1($unique_salt . $password);
// and save the $unique_salt with the user record
// ...

This protects us against rainbow tables, because each password is salted with a different string. It is impractical for an attacker to build a separate rainbow table for every password.
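One caveat about the salt generator: mt_rand() is not designed for cryptographic use. As a hedged alternative, and not the article's own method, the sketch below swaps in random_bytes(). It assumes PHP 7 or later, and random_salt() is a hypothetical name.

```php
<?php
// Assumption: PHP 7 or later, where random_bytes() is available.
// random_salt() is a hypothetical replacement for unique_salt().
function random_salt() {
    // 11 random bytes -> 22 hexadecimal characters.
    return bin2hex(random_bytes(11));
}

$password = 'easypassword';
$salt_a = random_salt();
$salt_b = random_salt();

// The same password salted for two users gives two unrelated hash values.
$hash_a = sha1($salt_a . $password);
$hash_b = sha1($salt_b . $password);
var_dump($hash_a === $hash_b);
```

As with unique_salt(), each salt has to be stored alongside the user record so that the hash can be recomputed at login.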
7. Problem 4: hash speed
Most hash algorithms are designed with speed in mind, because they are commonly used to calculate the hash values of large data sets or files to verify the correctness and integrity of the data.
How can this be exploited?
As mentioned earlier, a powerful PC can now compute a billion hashes a second, making it easy to try every password by brute force. You might think that passwords of eight or more characters would be safe from brute force, but let's see if that's the case:
If the password can contain lowercase letters, uppercase letters and digits, there are 62 (26+26+10) characters to choose from.
An 8-character password has 62^8 possible combinations, which is slightly more than 218 trillion.
At a rate of a billion hashes a second, trying them all would take only about 60 hours.
For a 6-character password, which is also very common, it takes less than a minute. Requiring 9- or 10-character passwords would be more secure, but it can be a hassle for some users.
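The arithmetic above can be reproduced with a few lines of code. brute_force_seconds() is a hypothetical helper, and the rate of one billion hashes per second is the assumption used in the text:

```php
<?php
// Seconds needed to try every password of a given length.
function brute_force_seconds($alphabet_size, $length, $hashes_per_second) {
    return pow($alphabet_size, $length) / $hashes_per_second;
}

$rate = 1e9; // one billion hashes per second, as assumed in the text

printf("8 characters: %.1f hours\n", brute_force_seconds(62, 8, $rate) / 3600);
printf("6 characters: %.1f seconds\n", brute_force_seconds(62, 6, $rate));
```

This gives roughly 60 hours for 8 characters and a little under a minute for 6. On average an attacker finds a given password after trying about half of the combinations, so these figures are upper bounds.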
How to solve it?
Use a slower hash function.
"If you use an algorithm that only runs a million times a second on the same hardware instead of a billion times a second, an attacker would take 1,000 times as long to do a brute force attack: 60 hours would turn into nearly 7 years!"
You can do this yourself:
 
function myhash($password, $unique_salt) {
    $hash = sha1($unique_salt . $password);
    // make it take 1000 times longer
    for ($i = 0; $i < 1000; $i++) {
        $hash = sha1($hash);
    }
    return $hash;
}
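To see how the number of rounds trades speed for attack resistance, here is a hedged variant with a configurable round count. stretched_hash() is a hypothetical name, and the timings it prints depend entirely on the machine.

```php
<?php
// Hypothetical variant of myhash() with a configurable number of rounds.
function stretched_hash($password, $unique_salt, $rounds) {
    $hash = sha1($unique_salt . $password);
    for ($i = 0; $i < $rounds; $i++) {
        $hash = sha1($hash);
    }
    return $hash;
}

foreach ([1000, 100000] as $rounds) {
    $start = microtime(true);
    $hash = stretched_hash('verysecret', 'f#@V)Hu^%Hgfds', $rounds);
    $elapsed_ms = (microtime(true) - $start) * 1000;
    printf("%6d rounds: %s (%.2f ms)\n", $rounds, $hash, $elapsed_ms);
}
```

The round count has to be the same at registration and at login, otherwise the recomputed hash will not match the stored one.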

You can also use an algorithm that supports a "cost parameter," such as BLOWFISH. In PHP, you can use the crypt() function to do this:
 
function myhash($password, $unique_salt) {
    // the salt for blowfish should be 22 characters long
    return crypt($password, '$2a$10$' . $unique_salt);
}

The second argument to this function contains several values separated by "$" signs. The first value is "$2a", indicating that the BLOWFISH algorithm should be used. The second value, "$10" in this case, is the cost parameter. It is the base-2 logarithm of the number of iterations of the calculation loop (10 => 2^10 = 1024 iterations).
For example:
 
function myhash($password, $unique_salt) {
    return crypt($password, '$2a$10$' . $unique_salt);
}
function unique_salt() {
    return substr(sha1(mt_rand()), 0, 22);
}
$password = "verysecret";
echo myhash($password, unique_salt());
// example result (varies per run, since the salt is random):
// $2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC

The resulting hash value contains the algorithm ($2a), the cost parameter ($10), and the 22-character salt that we used. The rest is the calculated hash value. Let's run a test program:
 
// assume this was pulled from the database
$hash = '$2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC';
// assume this is the password the user entered to log back in
$password = "verysecret";
if (check_password($hash, $password)) {
    echo "Access Granted!";
} else {
    echo "Access Denied!";
}
function check_password($hash, $password) {
    // first 29 characters include algorithm, cost and salt
    // let's call it $full_salt
    $full_salt = substr($hash, 0, 29);
    // run the hash function on $password
    $new_hash = crypt($password, $full_salt);
    // returns true or false
    return ($hash === $new_hash);
}

Run it and we'll see "Access Granted!"
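The layout of the stored string can be inspected without calling crypt() at all. The sketch below splits the example hash from this article into its parts; the $parts array and its key names are made up for illustration.

```php
<?php
// The example hash from the article.
$hash = '$2a$10$dfda807d832b094184faeu1elwhtR2Xhtuvs3R9J1nfRGBCudCCzC';

// Splitting on "$" gives: "", algorithm, cost, salt followed by the hash value.
list(, $algo, $cost, $rest) = explode('$', $hash);

$parts = [
    'algorithm'  => $algo,                 // "2a"
    'cost'       => (int) $cost,           // 10, i.e. 2^10 = 1024 iterations
    'salt'       => substr($rest, 0, 22),  // first 22 characters
    'hash_value' => substr($rest, 22),     // remaining 31 characters
    'full_salt'  => substr($hash, 0, 29),  // what check_password() passes to crypt()
];
print_r($parts);
```

Because the algorithm, cost and salt travel inside the stored string, no separate salt column is needed for this scheme.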
8. Putting it all together
Based on the above discussion, here is a utility class:
 
class PassHash {
    // blowfish
    private static $algo = '$2a';
    // cost parameter
    private static $cost = '$10';
    // mainly for internal use
    public static function unique_salt() {
        return substr(sha1(mt_rand()), 0, 22);
    }
    // this will be used to generate a hash
    public static function hash($password) {
        return crypt($password,
            self::$algo .
            self::$cost .
            '$' . self::unique_salt());
    }
    // this will be used to compare a password against a hash
    public static function check_password($hash, $password) {
        $full_salt = substr($hash, 0, 29);
        $new_hash = crypt($password, $full_salt);
        return ($hash === $new_hash);
    }
}

Here's how to register:
 
// include the class 
require ("PassHash.php"); 
// read all form input from $_POST 
// ... 
// do your regular form validation stuff 
// ... 
// hash the password 
$pass_hash = PassHash::hash($_POST['password']); 
// store all user info in the DB, excluding $_POST['password'] 
// store $pass_hash instead 
// ... 

Here's how to log in:
 
// include the class 
require ("PassHash.php"); 
// read all form input from $_POST 
// ... 
// fetch the user record based on $_POST['username'] or similar 
// ... 
// check the password the user tried to login with 
if (PassHash::check_password($user['pass_hash'], $_POST['password'])) {
    // grant access
    // ...
} else {
    // deny access
    // ...
}
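On newer PHP versions the same salted, cost-based scheme is available through built-in functions. This is a hedged alternative to the PassHash class, not a replacement the article itself proposes; it assumes PHP 5.5 or later, where password_hash() and password_verify() exist.

```php
<?php
// Assumption: PHP 5.5 or later, where password_hash() and password_verify() exist.
$password = 'verysecret';

// Generates a random salt and returns algorithm, cost, salt and hash in one string.
$pass_hash = password_hash($password, PASSWORD_BCRYPT, ['cost' => 10]);
echo $pass_hash . "\n"; // differs on each run because the salt is random

// Re-hashes the attempt using the salt and cost stored inside $pass_hash.
var_dump(password_verify('verysecret', $pass_hash));
var_dump(password_verify('wrongguess', $pass_hash));
```

Registration would store $pass_hash, and login would call password_verify() with the stored value, mirroring PassHash::hash() and PassHash::check_password().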

9. Is Blowfish available?
Blowfish is now common, but not all systems support it. You can check whether your system does with the following code:
 
if (CRYPT_BLOWFISH == 1) {
    echo "Yes";
} else {
    echo "No";
}

As of PHP 5.3, however, you don't have to worry about this, because PHP ships with its own implementation of the algorithm.
10. Conclusion
Passwords hashed this way are secure enough for most Web applications. Don't forget, though, that you can still require users to choose stronger passwords by enforcing a minimum length and a mix of letters, digits and special characters.
