http://lists.grok.org.uk/pipermail/full-disclosure/2005-November/038719.html
Looks like it's time to move anything you have still using MD5 away from it. 45 minutes to generate a collision on a slow P4 is about as broken as you can get, makes it completely useless for any practical purposes now.
Wow...a post...from Skywing...
Anyways,
QuoteAverage runtime on P4 1.6ghz - 5 seconds
Damn....but I guess it's still good for a hash of a program, as I assume it would be pretty hard to get executable code (or whatever) to collide with other other executable code.
/me runs to update his website. :(!
Wow...
Quote from: Warrior on November 14, 2005, 07:38 PM
/me runs to update his website. :(!
I told you! Now what I thought would happen, actually did.
Good thing PHP has built in sha1(). I'll have to make everyone resign up though.
Not something top priority however.
Now to see what the experts say about this. I'm surprised there was no discussion for that thread.
Quote from: Warrior on November 14, 2005, 08:18 PM
Good thing PHP has built in sha1(). I'll have to make everyone resign up though.
Not something top priority however.
Aint sha1 the structure of MD5 I could be wrong but that is what it says on wiki.
Wasn't MD5 cracked last year? hm...
Quote from: Topaz on November 14, 2005, 10:33 PM
Wasn't MD5 cracked last year? hm...
Not with a mechanism for a 45 minute collision...
Quote from: rabbit on November 14, 2005, 06:46 PM
Wow...a post...from Skywing...
Anyways, QuoteAverage runtime on P4 1.6ghz - 5 seconds
Damn....but I guess it's still good for a hash of a program, as I assume it would be pretty hard to get executable code (or whatever) to collide with other other executable code.
No, not at all. Programs are routinely quite large these days, and it's no trouble at all to pad your .rdata segment with whatever bytes are necessary to get the hash you want. Failing that, put the colliding bytes into your .text segment and just ensure that the code flow never tries to execute it.
It may have been cracked, but that doesn't make it useless. DES is still used after all.
Quote from: noob on November 15, 2005, 07:46 PM
It may have been cracked, but that doesn't make it useless. DES is still used after all.
It's safer to move onto newer, more advanced hashing algorithm's however.
Quote from: Skywing on November 14, 2005, 01:53 PM
makes it completely useless for any practical purposes now.
I disagree. As far as I know, there's still no practicle way (without a full brute force) to obtain the original data of MD5 based on the hash. In other words, it's still safe for storing passwords.
Quote from: Warrior on November 14, 2005, 08:18 PM
Good thing PHP has built in sha1(). I'll have to make everyone resign up though.
Not something top priority however.
You're replacing a dead algorithm with a dying algorithm. Not a good idea :-P
I'd recommend that you start storing the SHA1 of the MD5. The advantages are:
- Extra layer of security; even though if both are broken then together they're broken, it'll still keep away the riff raff.
- You don't have to make people re-sign up. Just SHA1() their MD5() hashes.
But if you're using it to store a password, I still don't think it's an issue. Odds are, for most passwords, it can be bruteforced with a dictionary attack anyways :)
Yea, I thought of that after I posted.
It's an interesting read. I tried the provided sample Win32 executable (http://pub1.positivenetworks.net/~dispensa/md5coll.zip), but it failed with an error of 0x8009000F.
Quote from: tA-Kane on November 16, 2005, 11:12 AM
It's an interesting read. I tried the provided sample Win32 executable (http://pub1.positivenetworks.net/~dispensa/md5coll.zip), but it failed with an error of 0x8009000F.
I'll let him know and ask him to post an updated version. That's because he was doing something wrong with how he was generating a random number with cryptoapi. It should probably work if you reboot after running it, since the keyset it creates is temporary I think.
(For that version, executions after the first are likely to break due to that bug.)
What I did is I loaded the source up in VS2005 and simply replaced "CRYPT_NEWKEYSET" with 0.
However, since I'm not too familiar with the Cryptography API, I fear I may have broken the speed at which it works: it has been running for about four hours and has displayed "block #1 done" for several hours and has yet to display "block #2 done". Perhaps I misinterpreted the 45 minutes as being best-time rather than average-time?
Quote from: iago on November 15, 2005, 08:46 PM
I disagree. As far as I know, there's still no practicle way (without a full brute force) to obtain the original data of MD5 based on the hash. In other words, it's still safe for storing passwords.
Not true at all -- if you can generate a collision, you don't need the original data.
To generate a collision, you need the password, don't you? Or is it actually possible to generate a collision based purely on the hashed data?
It is. Generating a collision entails finding data to match a given hash. The hash is needed for comparison, but not the original data. I'm not sure if a brute-force attack on an unknown hash is feasible yet.
Quote from: Arta[vL] on November 17, 2005, 05:11 PM
It is. Generating a collision entails finding data to match a given hash. The hash is needed for comparison, but not the original data. I'm not sure if a brute-force attack on an unknown hash is feasible yet.
I thought that the collisions discussed here entailed finding two pieces of data that generates the same hash. Am I mistaken about that, or did we make different assumptions? I haven't read the paper or anything, so I could very well be wrong there.
Quote from: iago on November 17, 2005, 07:57 PMQuote from: Arta[vL] on November 17, 2005, 05:11 PMIt is. Generating a collision entails finding data to match a given hash. The hash is needed for comparison, but not the original data. I'm not sure if a brute-force attack on an unknown hash is feasible yet.
I thought that the collisions discussed here entailed finding two pieces of data that generates the same hash. Am I mistaken about that, or did we make different assumptions? I haven't read the paper or anything, so I could very well be wrong there.
It depends what use of MD5 you're trying to defeat. If you obtain the MD5 of someone's password and you simply want to masquerade as that user (but are satisfied with never learning their original password), you need only come up with a data sequence which suffers a collision with the known hash. Then when you send that sequence off claiming it to be the password, the server will compute its MD5, get the same value as the MD5 as the correct password (since you picked this data sequence specifically for this property!), and conclude that you're the legitimate owner. If the target site is badly thought out, you could then edit his account settings to point the e-mail associated with it to an account you own, then trigger the "password recovery" feature to get his real password (instead of some random sequence which happens to collide with his password). On the other hand, if you're trying to spoof a document whose MD5 signature is known, you'd need the original document so you could check if your forgery looked even vaguely like the original.
Though, given that we're dealing with hashing algorithms, if you have the original input data, you can compute the hash result easily enough. :)
Quote from: Kp on November 17, 2005, 09:18 PM
Quote from: iago on November 17, 2005, 07:57 PMQuote from: Arta[vL] on November 17, 2005, 05:11 PMIt is. Generating a collision entails finding data to match a given hash. The hash is needed for comparison, but not the original data. I'm not sure if a brute-force attack on an unknown hash is feasible yet.
I thought that the collisions discussed here entailed finding two pieces of data that generates the same hash. Am I mistaken about that, or did we make different assumptions? I haven't read the paper or anything, so I could very well be wrong there.
It depends what use of MD5 you're trying to defeat. If you obtain the MD5 of someone's password and you simply want to masquerade as that user (but are satisfied with never learning their original password), you need only come up with a data sequence which suffers a collision with the known hash. Then when you send that sequence off claiming it to be the password, the server will compute its MD5, get the same value as the MD5 as the correct password (since you picked this data sequence specifically for this property!), and conclude that you're the legitimate owner. If the target site is badly thought out, you could then edit his account settings to point the e-mail associated with it to an account you own, then trigger the "password recovery" feature to get his real password (instead of some random sequence which happens to collide with his password). On the other hand, if you're trying to spoof a document whose MD5 signature is known, you'd need the original document so you could check if your forgery looked even vaguely like the original.
Though, given that we're dealing with hashing algorithms, if you have the original input data, you can compute the hash result easily enough. :)
The first scenario is the one I was considering. I fully agree that it's useless for signing documents now, that's not even an issue.
For it to be an issue with passwords, it seems like it depends on how the passwords are used, then. To be a threat, you have to find some way to obtain the MD5 of the password. That could likely be done in 2 ways:
1. Sniff traffic containing the MD5 -- if you can sniff their traffic, odds are (in any webapp) you can see their password go by in plaintext. It's still rare to see SSL used for sites. So that's irrelevant.
2. Have access to the password database -- if they have access to the database of MD5's, odds are it's already game over.
However, if all you have is access to the MD5 hash, and the server requires that you
send it an MD5 hash, which I don't think is very common at all, then yeah, this MD5 weakness is a danger. But I don't believe I've ever seen a server that does that.
For a web-app like, for example, Warrior's, I think the only weakenss in MD5 that could adversely affect its security is if there was a large amount of clustering, narrowing the number of brute-force attempts needed to guess the correct password. But I don't believe that MD5 contains
that particular weakness, so that's not a problem.
I guess what I'm saying is that whether or not it's a danger to an application depends rather strongly on how it's being used, and in the most common cases, web apps aren't secure anyway, so using MD5 to store passwords really doesn't matter.
Quote from: Kp on November 17, 2005, 09:18 PM
If the target site is badly thought out, you could then edit his account settings to point the e-mail associated with it to an account you own, then trigger the "password recovery" feature to get his real password (instead of some random sequence which happens to collide with his password).
Minor nit-pick: Badly thought out as in storing a hash of the password and the password plain text. If the website only stores the hash, you're probably never going to recover the original password.
Note that numerous web apps don't guard password hashes very carefully. They are often stored in a cookie to facilitate autologon, and are easily exposed by XSS attacks.
Personally, I bruteforced a password (a friend of mine's, we were exploiting a IPB, but enough about that) in about 45 seconds. It was two numbers and three letters, so not all that long, but eh?
Quote from: Arta[vL] on November 18, 2005, 08:36 AM
Note that numerous web apps don't guard password hashes very carefully. They are often stored in a cookie to facilitate autologon, and are easily exposed by XSS attacks.
Well regardless, autologon is a pretty insecure concept as anyone with physical access can impersonate a user (unless you have the stored cookies expire after some set amount of time). I'm sure many websites don't even require physical access, just the contents of the stored cookie unless that cookie contains a hash of variables unique to a computer (i.e. IP address).
Quote from: Arta[vL] on November 18, 2005, 08:36 AM
Note that numerous web apps don't guard password hashes very carefully. They are often stored in a cookie to facilitate autologon, and are easily exposed by XSS attacks.
If they are stored in a cookie, then you don't need to find a collision. You just copy/paste the hash into your own cookie and proceed to log in.
That's an interesting point, though. How would you recommend storing passwords in cookies? I was thinking of storing a salted hash, as well as the salt, but that wouldn't create any advantages versus packetlogs. Maybe the IP and the password (or password hash) together, hashed. But that would be annoying for dial-up users.
You can't do it securely. It's just one of those times when lots of convinience outweighs a small security risk.
Ok, that's pretty much what I figured. Hashing it with the IP is probably the best plan, if I had to choose.
But back to the MD5 thing: passwords stored in MD5 in cookies don't help you any unless you want to find a collision with his password for a different site that only accepts plaintext password, then compares it to MD5, which isn't really realistic.
My point? MD5 is still ok for storing passwords for webapps, in general. :)
Quote from: iago on November 18, 2005, 06:20 PM
But back to the MD5 thing: passwords stored in MD5 in cookies don't help you any unless you want to find a collision with his password for a different site that only accepts plaintext password, then compares it to MD5, which isn't really realistic.
Don't most sites accept a plain-text password, hash it server-side and compare? I wish there was a nice standard for hashing client-side (i.e. <input type="password" hash="md5"> or something).
Quote from: dxoigmn on November 18, 2005, 07:40 PM
Quote from: iago on November 18, 2005, 06:20 PM
But back to the MD5 thing: passwords stored in MD5 in cookies don't help you any unless you want to find a collision with his password for a different site that only accepts plaintext password, then compares it to MD5, which isn't really realistic.
Don't most sites accept a plain-text password, hash it server-side and compare?
Yeah, but if you're doing that you're sending your password plaintext over the network, so if you can sniff their traffic it doesn't matter if you can break md5..
Quote from: iago on November 18, 2005, 11:21 PM
Quote from: dxoigmn on November 18, 2005, 07:40 PM
Quote from: iago on November 18, 2005, 06:20 PM
But back to the MD5 thing: passwords stored in MD5 in cookies don't help you any unless you want to find a collision with his password for a different site that only accepts plaintext password, then compares it to MD5, which isn't really realistic.
Don't most sites accept a plain-text password, hash it server-side and compare?
Yeah, but if you're doing that you're sending your password plaintext over the network, so if you can sniff their traffic it doesn't matter if you can break md5..
But most sites use SSL to encrypt the whole stream.
Quote from: dxoigmn on November 19, 2005, 12:11 AM
Quote from: iago on November 18, 2005, 11:21 PM
Quote from: dxoigmn on November 18, 2005, 07:40 PM
Quote from: iago on November 18, 2005, 06:20 PM
But back to the MD5 thing: passwords stored in MD5 in cookies don't help you any unless you want to find a collision with his password for a different site that only accepts plaintext password, then compares it to MD5, which isn't really realistic.
Don't most sites accept a plain-text password, hash it server-side and compare?
Yeah, but if you're doing that you're sending your password plaintext over the network, so if you can sniff their traffic it doesn't matter if you can break md5..
But most sites use SSL to encrypt the whole stream.
I can't think of any normal sites that use SSL. eCommerce sites do, obviously. And porn sites do, which I suppose is a form of eCommerce. But it's definitely not common.
MD5CALC(MD5CALC(data)+data) = secure md5
Quote from: Mephisto on November 29, 2005, 10:24 PM
MD5CALC(MD5CALC(data)+data) = secure md5
Except not. If MD5CALC(Data) == MD5CALC(FalsifiedData), then MD5CALC(MD5CALC(Data)+Data) == MD5CALC(MD5CALC(FalsifiedData)+FalsifiedData) == MD5CALC(MD5CALC(Data)+FalsifiedData) == MD5CALC(MD5CALC(FalsifiedData)+Data)
Quote from: tA-Kane on November 29, 2005, 11:09 PM
Quote from: Mephisto on November 29, 2005, 10:24 PM
MD5CALC(MD5CALC(data)+data) = secure md5
Except not. If MD5CALC(Data) == MD5CALC(FalsifiedData), then MD5CALC(MD5CALC(Data)+Data) == MD5CALC(MD5CALC(FalsifiedData)+FalsifiedData) == MD5CALC(MD5CALC(Data)+FalsifiedData) == MD5CALC(MD5CALC(FalsifiedData)+Data)
:(
Quote from: Mephisto on November 29, 2005, 10:24 PM
MD5CALC(MD5CALC(data)+data) = secure md5
MD5 is broken... calling it multiple times doesn't make it any less broken.
How is this the death of MD5?
What this does is generate two sets of data that have the same hash. You can't generate false data from a given hash using this.
Quote from: Maddox on December 02, 2005, 12:07 PM
How is this the death of MD5?
What this does is generate two sets of data that have the same hash. You can't generate false data from a given hash using this.
http://www.schneier.com/essay-074.html
Quote from: dxoigmn on December 02, 2005, 01:19 PM
Quote from: Maddox on December 02, 2005, 12:07 PM
How is this the death of MD5?
What this does is generate two sets of data that have the same hash. You can't generate false data from a given hash using this.
http://www.schneier.com/essay-074.html
Ok?
Quote from: Maddox on December 03, 2005, 04:33 AM
Quote from: dxoigmn on December 02, 2005, 01:19 PM
Quote from: Maddox on December 02, 2005, 12:07 PM
How is this the death of MD5?
What this does is generate two sets of data that have the same hash. You can't generate false data from a given hash using this.
http://www.schneier.com/essay-074.html
Ok?
Well let me outline the main points:
- One-way hash functions are supposed to have two properties.
- One, they're one-way.
- Two, they're collision-free.
- Breaking a hash function means showing that either -- or both -- of those properties aren't true.
You might argue that mathematically no hash function can be collision free. But that isn't the point. The point is it should be sufficiently difficult to find such a collision. Well no longer is it difficult. "Attacks always get better; they never get worse."
Quote from: dxoigmn on December 03, 2005, 04:04 PM
[li]Two, they're collision-free.[/li]
Not so much collision-free (since that'll be pretty much impossible for any source value longer than the length of the hash) as relatively impossible to locate collisions in a timely manner (eg, even a few years is too soon).
Quote from: tA-Kane on December 03, 2005, 08:09 PM
Quote from: dxoigmn on December 03, 2005, 04:04 PM
[li]Two, they're collision-free.[/li]
Not so much collision-free (since that'll be pretty much impossible for any source value longer than the length of the hash) as relatively impossible to locate collisions in a timely manner (eg, even a few years is too soon).
Yea, I said that in my post...
Still wonder about the uselessness of it. If you cannot find the original text from a hash, and you cannot find a collision for a hash given a hash, it still seems pretty useful. Being able to generate two completely plaintexts that give the same hash is not all that useful. For it to be useful, those two plaintexts need to both be meaningful.
Quote
and you cannot find a collision for a hash given a hash
I think you can do that.
Quote
For it to be useful, those two plaintexts need to both be meaningful.
That's not as hard as it sounds. For example, you can often stretegically pad documents with whitespace to achieve this, which most people wouldn't notice. Also, I'm not sure that both plaintexts need to be meaningful: that's only the case if the plaintext needs to 'fool' a person.
Quote from: Maddox on December 07, 2005, 08:52 PM
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
I think you can do that.
Prove it.
Easy. There are an infinite number of possible inputs to MD5, but only 2
128 outputs. Therefore, there exist multiple inputs which generate the same output. Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
Quote from: Kp on December 08, 2005, 01:32 AM
Quote from: Maddox on December 07, 2005, 08:52 PM
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
I think you can do that.
Prove it.
Easy. There are an infinite number of possible inputs to MD5, but only 2128 outputs. Therefore, there exist multiple inputs which generate the same output. Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
To rephrase: prove that it can be done in a reasonable amount of time.
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
That's not as hard as it sounds. For example, you can often stretegically pad documents with whitespace to achieve this, which most people wouldn't notice. Also, I'm not sure that both plaintexts need to be meaningful: that's only the case if the plaintext needs to 'fool' a person.
Well, the paper describes an attack by flipping bits. If they can only attack md5 in a way that works when the data is random binary, I would say that is a limitation. I do not think their method for breaking md5 in a reasonable time could at all apply if the inputs are not arbitrary binary data, but limited to adding or removing whitespace.
See here: http://www.schneier.com/blog/archives/2005/06/more_md5_collis.html
Quote from: Arta[vL] on December 09, 2005, 09:15 AM
See here: http://www.schneier.com/blog/archives/2005/06/more_md5_collis.html
Ah, but again, it is a random string that they insert. Someone producing a document like this would easily be revealed as a fraud.
Quote from: Adron on December 09, 2005, 02:38 PM
Quote from: Arta[vL] on December 09, 2005, 09:15 AMSee here: http://www.schneier.com/blog/archives/2005/06/more_md5_collis.html
Ah, but again, it is a random string that they insert. Someone producing a document like this would easily be revealed as a fraud.
Not necessarily. Many modern document formats (for instance, Microsoft Office Document) have quite a bit of garbage inside them that the UI never shows you. If the random string is inserted in one of these dead zones, you'd have to take the file apart to discover it. MD5 is also still used for validating that programs, source archives, etc. have not been altered. These formats also have the potential to remain "valid data" after having arbitrary bits inserted, provided that the arbitrary bits are put in the right place. For instance, put your junk string in the slack space of the .data segment, or destroy a debug message with your new data.
Ah, yes, you can insert your junk overwriting a debug message or you can put it in a space that is not used, but then you as the author are conspiring to make a bad executable from the start. You could do that by just inserting obfuscated evil code.
In the other case, where you produce a document and ask someone to sign that, then applying that signature to a different document, the existence of such inserted data means you can identify this as an attempt at fraud.
Quote from: Kp on December 08, 2005, 01:32 AM
Quote from: Maddox on December 07, 2005, 08:52 PM
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
I think you can do that.
Prove it.
Easy. There are an infinite number of possible inputs to MD5, but only 2128 outputs. Therefore, there exist multiple inputs which generate the same output. Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
Are you being particularly dense on purpose?
Quote from: Kp on December 08, 2005, 01:32 AM
Easy. There are an infinite number of possible inputs to MD5, but only 2128 outputs. Therefore, there exist multiple inputs which generate the same output. Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
Is it possible to find a collision for every 128-bit combination?
Quote from: Adron on December 09, 2005, 07:21 PM
Ah, yes, you can insert your junk overwriting a debug message or you can put it in a space that is not used, but then you as the author are conspiring to make a bad executable from the start. You could do that by just inserting obfuscated evil code.
Similarly, you'd also be conspiring to create a forged document. What's the difference?
Quote from: Adron on December 09, 2005, 07:21 PMIn the other case, where you produce a document and ask someone to sign that, then applying that signature to a different document, the existence of such inserted data means you can identify this as an attempt at fraud.
You're missing the point, Adron. The inserted junk data would be junk -- probably in an unprintable area of the document, or perhaps just whitespace. But nonetheless, it would not alter the look of the document (except of course the wording perhaps, so you'd have (for example, in an executive order) "go to place Y" instead of "go to place X"). Thus, while it
is a fraudulent document, anyone signing the document would note that the MD5 sum of the fraud matches the MD5 sum of the real document,
and the fraud looks like a real document with which has not been tampered.
Quote from: tA-Kane on December 10, 2005, 04:10 AM
Quote from: Adron on December 09, 2005, 07:21 PM
Ah, yes, you can insert your junk overwriting a debug message or you can put it in a space that is not used, but then you as the author are conspiring to make a bad executable from the start. You could do that by just inserting obfuscated evil code.
Similarly, you'd also be conspiring to create a forged document. What's the difference?
If you do not trust the author from the start, any amount of signatures is not going to make a difference. If you do trust the author not to prepare to fool you, there is no problem.
Quote from: tA-Kane on December 10, 2005, 04:10 AM
Quote from: Adron on December 09, 2005, 07:21 PMIn the other case, where you produce a document and ask someone to sign that, then applying that signature to a different document, the existence of such inserted data means you can identify this as an attempt at fraud.
You're missing the point, Adron. The inserted junk data would be junk -- probably in an unprintable area of the document, or perhaps just whitespace. But nonetheless, it would not alter the look of the document (except of course the wording perhaps, so you'd have (for example, in an executive order) "go to place Y" instead of "go to place X"). Thus, while it is a fraudulent document, anyone signing the document would note that the MD5 sum of the fraud matches the MD5 sum of the real document, and the fraud looks like a real document with which has not been tampered.
No, you are missing the point. The important fact here is that there
would be junk inserted into the document. Upon investigation, it would be clear that this
is indeed a forgery.
as i pointed out on the aim dev forum, md5 can be reversed simply using precomputed hash tables, you can squeeze a rather large alphanumeric table onto a dvd, or use some of the available tools to generate your own tables based on what characters you want. I know of a couple sites that store precomputed tables for very very large tables (alphanumeric + symbols + lots of chars) but they all charge you to perform a look up (you submit an md5 and it returns to you all matches to that md5). finding the same tables already generated probably isnt that hard with a lil looking around though. And performing a lookup on said table takes very little time.
just some 2 cents.
http://thepiratebay.org/details.php?id=3408209
1-7 character alphanumeric table
yes i know this thread is oldnews
Quote from: AcidAngel on January 23, 2006, 11:02 AM
as i pointed out on the aim dev forum, md5 can be reversed simply using precomputed hash tables, you can squeeze a rather large alphanumeric table onto a dvd, or use some of the available tools to generate your own tables based on what characters you want. I know of a couple sites that store precomputed tables for very very large tables (alphanumeric + symbols + lots of chars) but they all charge you to perform a look up (you submit an md5 and it returns to you all matches to that md5). finding the same tables already generated probably isnt that hard with a lil looking around though. And performing a lookup on said table takes very little time.
just some 2 cents.
There are also free tables.
But any programmer with half a brain for security adds a salt, which makes pre-computed tables effectively useless.
any programmer with half a brain for security would not have used md5 as their authentication protection in the first place =)
Why not? MD5 is still fine for passwords, and I still use it for passwords. If there's some reason not to hash passwords with it, please tell me.