How often have you read about entire user databases being stolen from vulnerable websites? And how often do people use the same password in multiple places? Let me answer both questions: too often.
Let us assume that Ada is running a website for her guild. She has decided to code the thing from scratch, and is about to implement the user management system. She has decided she only needs to store username, password, and user type in the accounts table.
She knows that storing passwords in plain text is unacceptable, and thinks they need to be hashed. Therefore she decides to store the MD5 hashes of the passwords. When someone tries to log in, the code checks whether the MD5 hash of the submitted password equals the one stored in the database. Her reasoning is that if someone somehow obtained a copy of the users table, they wouldn’t be able to find the real passwords in a sensible amount of time, because the passwords are hashed and they’d have to mount a brute force attack to recover them.
She’s wrong. A direct hash of a password is vulnerable to rainbow table attacks, which use precomputed tables to map hashes back to their inputs. While a few conditions apply (the password has to be covered by the table), this makes retrieving a password from the user database close to instant. Even though she didn’t store the passwords in plain text, they’d still be easily exposed. Knowing this, she decides to salt the passwords. This means the password is modified in some way before being hashed, usually by appending some string to the input password. She knows of two ways to do this.
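To make the weakness concrete, here is a minimal Python sketch. The usernames, passwords and leaked rows are made up for illustration, and the lookup is a plain dictionary rather than a real rainbow table (which uses hash chains to save space), but the effect on unsalted hashes is the same: the work is done once, in advance, and reused.

```python
import hashlib

def md5_hex(password: str) -> str:
    """Unsalted MD5 of a password, as in Ada's first design."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

# Hypothetical leaked rows: (username, unsalted MD5 hash).
leaked = [
    ("alice", md5_hex("hunter2")),
    ("bob", md5_hex("letmein")),
    ("carol", md5_hex("hunter2")),
]

# Built once, ahead of time, and reusable against any site
# that stores unsalted MD5 hashes.
common_passwords = ["123456", "password", "letmein", "hunter2"]
precomputed = {md5_hex(p): p for p in common_passwords}

# Recovering a password is now a single dictionary lookup per row.
recovered = {user: precomputed.get(h) for user, h in leaked}
print(recovered)
```

Note that alice and carol get identical hashes because they picked the same password, which is a leak in itself.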
Salting Passwords
The first way may sound easier to implement, and it might be tempting to consider it “good enough”: she could generate a single salt that is unique to her website. This salt would be stored in the website config and used for all passwords. The immediate benefit is that the stored password hashes are no longer subject to general-purpose, pre-generated rainbow tables.
Although that’s right, there’s at least one problem: all users’ passwords have the same salt. Given enough users, the bad guy will consider it worthwhile to generate a new rainbow table for this particular site. This may not seem like a big problem at first, but sites grow over time and become more lucrative targets. It’s better to be safe than sorry, especially since the alternative isn’t any harder to implement.
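A small Python sketch of the site-wide salt shows both the benefit and the problem. The salt value and the passwords are invented for the example; in Ada's setup the salt would come from the website config.

```python
import hashlib

# Hypothetical site-wide salt; in practice it would be read from the config.
SITE_SALT = "example-site-salt"

def site_salted_hash(password: str) -> str:
    """MD5 of the password with the site salt appended to the end."""
    return hashlib.md5((password + SITE_SALT).encode("utf-8")).hexdigest()

alice_hash = site_salted_hash("hunter2")
carol_hash = site_salted_hash("hunter2")
unsalted_hash = hashlib.md5(b"hunter2").hexdigest()

# Benefit: a generic table of unsalted MD5 hashes no longer matches.
generic_table_misses = alice_hash != unsalted_hash

# Problem: everyone shares the salt, so equal passwords still give equal
# hashes, and one table built for this salt covers every user on the site.
same_password_same_hash = alice_hash == carol_hash
site_table = {site_salted_hash(p): p for p in ["password", "letmein", "hunter2"]}
cracked = [site_table.get(h) for h in (alice_hash, carol_hash)]
```

The attacker pays the cost of building `site_table` once and then looks up every row of the leaked table against it.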
The alternative is to generate a random salt each time a password is set or changed, and to store this salt in the user table. The performance impact is negligible, but the bad guy would have to brute force each password separately. It’s pointless to generate rainbow tables, because the generated values can’t be reused across users. If she decides to loop the hashing process as well, the brute force attack would take much longer. Looping basically means the first generated hash is re-salted and hashed again, and the same applies to the resulting hash. Do this X times, and the computing time required to brute force a password increases linearly with X.
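The per-user salt and the loop could look roughly like the following Python sketch. It is an illustration under a few assumptions, not a recommended implementation: the function names, the 16-byte salt and the iteration count are my own choices, and SHA-256 is swapped in for MD5. In practice a standard construction such as `hashlib.pbkdf2_hmac` from Python's standard library does the same kind of salted looping for you.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # the "X" from the text; an arbitrary example value

def hash_password(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Hash password + salt, then re-salt and re-hash the result in a loop."""
    digest = hashlib.sha256(password.encode("utf-8") + salt).digest()
    for _ in range(iterations - 1):
        digest = hashlib.sha256(digest + salt).digest()
    return digest

def new_credentials(password: str) -> tuple[bytes, bytes]:
    """Generate a fresh random salt; both values go into the user table."""
    salt = secrets.token_bytes(16)
    return salt, hash_password(password, salt)

def verify(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt), stored_hash)
```

Two users with the same password now end up with different salts and therefore different hashes, so a table computed for one row is useless for the next. Doubling `ITERATIONS` doubles the cost of every guess, for the attacker and for the login check alike.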
Preventing the leak in the first place
Although there is no way to guarantee the bad guys won’t get hold of the users’ password hashes, there are a few steps which may be taken to significantly reduce the risk. Ada could use different database users for different operations: one with “full access” (A) for doing maintenance (altering table structures, for example), one which has read/write/delete/update access (B) but cannot change the database schema or drop tables, and one which can only read (C) information that is intended to be viewed by the public. The website would use these three different users, and would most likely store the passwords for B and C in order to operate correctly. Even so, this segregation would limit the damage from SQL injection or other code execution vulnerabilities that might otherwise lead to the user table being leaked.
To increase security even further, none of the database users that the website has stored credentials for should have access to the users’ password hash and password salt fields. Many databases, including MySQL, can restrict access at this level.
The website would instead use stored procedures. When these procedures are created, they should be configured to run as a database user that has access to read and modify those fields.
Procedures with the self-explanatory names CreateUser, SetPassword and CheckPassword should be created. The SetPassword procedure should require the current password as well, in order to verify that the request is legitimate. These procedures would of course have to handle salting as discussed earlier. Since the passwords will be transferred across the SQL connection, that connection should be secure. A local UNIX socket or an SSL-encrypted TCP connection should be enough. With this setup, the website would never know or have access to the password hash or the salt; SQL injection or arbitrary code execution wouldn’t be able to retrieve this information from the database.
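The shape of that interface can be modelled in a few lines of Python. This is only a stand-in for the database side, not stored procedure code: the class name, the in-memory dictionary and the default user type are invented for the example, and the salted, looped hash is done with PBKDF2 from Python's standard library. The point is what the caller gets back: a yes or no, never the hash or the salt.

```python
import hashlib
import hmac
import secrets

class AccountStore:
    """Stand-in for the three procedures; callers only ever see booleans."""

    _ITERATIONS = 100_000  # arbitrary example value

    def __init__(self) -> None:
        # username -> (salt, password hash, user type); never handed out.
        self._rows: dict[str, tuple[bytes, bytes, str]] = {}

    def _hash(self, password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, self._ITERATIONS
        )

    def create_user(self, username: str, password: str, user_type: str = "member") -> bool:
        """CreateUser: refuse duplicates, store a fresh salt and hash."""
        if username in self._rows:
            return False
        salt = secrets.token_bytes(16)
        self._rows[username] = (salt, self._hash(password, salt), user_type)
        return True

    def check_password(self, username: str, password: str) -> bool:
        """CheckPassword: recompute with the stored salt and compare."""
        row = self._rows.get(username)
        if row is None:
            return False
        salt, stored_hash, _ = row
        return hmac.compare_digest(self._hash(password, salt), stored_hash)

    def set_password(self, username: str, current: str, new: str) -> bool:
        """SetPassword: require the current password, then re-salt."""
        if not self.check_password(username, current):
            return False
        user_type = self._rows[username][2]
        salt = secrets.token_bytes(16)
        self._rows[username] = (salt, self._hash(new, salt), user_type)
        return True
```

In the real setup the dictionary is the accounts table, and the three methods are procedures running as a database user the website has no credentials for.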
Short of the database server itself being hacked, the password hashes should be unobtainable by the bad guy. And even if the bad guy got hold of the hashes, the salting procedure would make sure they’d still have quite a wait before they could use them. They’d most likely make a very limited attempt to brute force the passwords, then move on to the next hacked site’s passwords instead.