do… Web Application Development and Security

Q & A: Risk of Duplicates When Using MD5?


Update April 27th, 2013 – WARNING – if your intention is to encrypt (hash) passwords, this isn’t the code you’re looking for. Because of massive advances in attacker capabilities the code in the original post is obsolete for that purpose. It’s better to use the built-in password hashing functions or an alternative made with appropriate techniques and expert review.


Yes, MD5 can produce hash collisions in a very small percentage of cases. For many uses this shouldn’t be significant, but for security there are better options.

I prefer the SHA-2 series, referred to as SHA-224/256/384/512, because the algorithms are strong and widely supported.

If you need the hashes to be un-guessable then I’d recommend hashing more than just the input data. A well accepted strategy is to include a secret key in the computation, resulting in a keyed-Hash Message Authentication Code (HMAC), and another useful technique is to concatenate a “salt”, which may or may not be secret, with the input.

PHP versions >= 5.1.2 have the hash_hmac() and hash() functions:

$hmac = hash_hmac('sha256', $data, $key); // hex string output

$hmac = base64_encode(hash_hmac('sha256', $data, $key, TRUE)); // force binary output before encoding

$hash = hash('sha256', $data . $salt);

PHP versions < 5.3 have the mhash() function:

$hmac = base64_encode(mhash(MHASH_SHA256, $data, $key)); // mhash produces binary output

$hash = bin2hex(mhash(MHASH_SHA256, $data . $salt));

There’s a nice table of algorithms and their properties on Wikipedia.

Original email discussion was on the DC PHP Developers Group list.

2 Responses to “Q & A: Risk of Duplicates When Using MD5?”

  • Vladimir says:

    thanks for mentioning this, I wonder if there’s a way to compute the chances of duplication in the md5 algorithm though.


  • Barry says:


    More detailed discussions are available at and

  • Leave a Reply