• Welcome to Valhalla Legends Archive.
 

The death of MD5

Started by Skywing, November 14, 2005, 01:53 PM

Previous topic - Next topic

Adron

Still wonder about the uselessness of it. If you cannot find the original text from a hash, and you cannot find a collision for a hash given a hash, it still seems pretty useful. Being able to generate two completely plaintexts that give the same hash is not all that useful. For it to be useful, those two plaintexts need to both be meaningful.

Arta

Quote
and you cannot find a collision for a hash given a hash

I think you can do that.


Quote
For it to be useful, those two plaintexts need to both be meaningful.

That's not as hard as it sounds. For example, you can often stretegically pad documents with whitespace to achieve this, which most people wouldn't notice. Also, I'm not sure that both plaintexts need to be meaningful: that's only the case if the plaintext needs to 'fool' a person.

Maddox

asdf.

Kp

Quote from: Maddox on December 07, 2005, 08:52 PM
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
I think you can do that.
Prove it.

Easy.  There are an infinite number of possible inputs to MD5, but only 2128 outputs.  Therefore, there exist multiple inputs which generate the same output.  Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
[19:20:23] (BotNet) <[vL]Kp> Any idiot can make a bot with CSB, and many do!

iago

Quote from: Kp on December 08, 2005, 01:32 AM
Quote from: Maddox on December 07, 2005, 08:52 PM
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
I think you can do that.
Prove it.

Easy.  There are an infinite number of possible inputs to MD5, but only 2128 outputs.  Therefore, there exist multiple inputs which generate the same output.  Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
To rephrase: prove that it can be done in a reasonable amount of time.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Adron

Quote from: Arta[vL] on December 07, 2005, 09:28 AM
That's not as hard as it sounds. For example, you can often stretegically pad documents with whitespace to achieve this, which most people wouldn't notice. Also, I'm not sure that both plaintexts need to be meaningful: that's only the case if the plaintext needs to 'fool' a person.

Well, the paper describes an attack by flipping bits. If they can only attack md5 in a way that works when the data is random binary, I would say that is a limitation. I do not think their method for breaking md5 in a reasonable time could at all apply if the inputs are not arbitrary binary data, but limited to adding or removing whitespace.


Arta


Adron

Quote from: Arta[vL] on December 09, 2005, 09:15 AM
See here: http://www.schneier.com/blog/archives/2005/06/more_md5_collis.html

Ah, but again, it is a random string that they insert. Someone producing a document like this would easily be revealed as a fraud.

Kp

Quote from: Adron on December 09, 2005, 02:38 PM
Quote from: Arta[vL] on December 09, 2005, 09:15 AMSee here: http://www.schneier.com/blog/archives/2005/06/more_md5_collis.html
Ah, but again, it is a random string that they insert. Someone producing a document like this would easily be revealed as a fraud.

Not necessarily.  Many modern document formats (for instance, Microsoft Office Document) have quite a bit of garbage inside them that the UI never shows you.  If the random string is inserted in one of these dead zones, you'd have to take the file apart to discover it.  MD5 is also still used for validating that programs, source archives, etc. have not been altered.  These formats also have the potential to remain "valid data" after having arbitrary bits inserted, provided that the arbitrary bits are put in the right place.  For instance, put your junk string in the slack space of the .data segment, or destroy a debug message with your new data.
[19:20:23] (BotNet) <[vL]Kp> Any idiot can make a bot with CSB, and many do!

Adron

Ah, yes, you can insert your junk overwriting a debug message or you can put it in a space that is not used, but then you as the author are conspiring to make a bad executable from the start. You could do that by just inserting obfuscated evil code.

In the other case, where you produce a document and ask someone to sign that, then applying that signature to a different document, the existence of such inserted data means you can identify this as an attempt at fraud.

Maddox

Quote from: Kp on December 08, 2005, 01:32 AM
Quote from: Maddox on December 07, 2005, 08:52 PM
Quote from: Arta[vL] on December 07, 2005, 09:28 AM
I think you can do that.
Prove it.

Easy.  There are an infinite number of possible inputs to MD5, but only 2128 outputs.  Therefore, there exist multiple inputs which generate the same output.  Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.
Are you being particularly dense on purpose?
asdf.

Adron

Quote from: Kp on December 08, 2005, 01:32 AM
Easy.  There are an infinite number of possible inputs to MD5, but only 2128 outputs.  Therefore, there exist multiple inputs which generate the same output.  Given some particular output, a collision can be found by computing the hash of each member of the lexicographic enumeration of the possible inputs.

Is it possible to find a collision for every 128-bit combination?

tA-Kane

Quote from: Adron on December 09, 2005, 07:21 PM
Ah, yes, you can insert your junk overwriting a debug message or you can put it in a space that is not used, but then you as the author are conspiring to make a bad executable from the start. You could do that by just inserting obfuscated evil code.
Similarly, you'd also be conspiring to create a forged document. What's the difference?

Quote from: Adron on December 09, 2005, 07:21 PMIn the other case, where you produce a document and ask someone to sign that, then applying that signature to a different document, the existence of such inserted data means you can identify this as an attempt at fraud.
You're missing the point, Adron. The inserted junk data would be junk -- probably in an unprintable area of the document, or perhaps just whitespace. But nonetheless, it would not alter the look of the document (except of course the wording perhaps, so you'd have (for example, in an executive order) "go to place Y" instead of "go to place X"). Thus, while it is a fraudulent document, anyone signing the document would note that the MD5 sum of the fraud matches the MD5 sum of the real document, and the fraud looks like a real document with which has not been tampered.
Macintosh programmer and enthusiast.
Battle.net Bot Programming: http://www.bash.org/?240059
I can write programs. Can you right them?

http://www.clan-mac.com
http://www.eve-online.com

Adron

Quote from: tA-Kane on December 10, 2005, 04:10 AM
Quote from: Adron on December 09, 2005, 07:21 PM
Ah, yes, you can insert your junk overwriting a debug message or you can put it in a space that is not used, but then you as the author are conspiring to make a bad executable from the start. You could do that by just inserting obfuscated evil code.
Similarly, you'd also be conspiring to create a forged document. What's the difference?

If you do not trust the author from the start, any amount of signatures is not going to make a difference. If you do trust the author not to prepare to fool you, there is no problem.


Quote from: tA-Kane on December 10, 2005, 04:10 AM
Quote from: Adron on December 09, 2005, 07:21 PMIn the other case, where you produce a document and ask someone to sign that, then applying that signature to a different document, the existence of such inserted data means you can identify this as an attempt at fraud.
You're missing the point, Adron. The inserted junk data would be junk -- probably in an unprintable area of the document, or perhaps just whitespace. But nonetheless, it would not alter the look of the document (except of course the wording perhaps, so you'd have (for example, in an executive order) "go to place Y" instead of "go to place X"). Thus, while it is a fraudulent document, anyone signing the document would note that the MD5 sum of the fraud matches the MD5 sum of the real document, and the fraud looks like a real document with which has not been tampered.

No, you are missing the point. The important fact here is that there would be junk inserted into the document. Upon investigation, it would be clear that this is indeed a forgery.

AcidAngel

as i pointed out on the aim dev forum, md5 can be reversed simply using precomputed hash tables, you can squeeze a rather large alphanumeric table onto a dvd, or use some of the available tools to generate your own tables based on what characters you want. I know of a couple sites that store precomputed tables for very very large tables (alphanumeric + symbols + lots of chars) but they all charge you to perform a look up (you submit an md5 and it returns to you all matches to that md5). finding the same tables already generated probably isnt that hard with a lil looking around though. And performing a lookup on said table takes very little time.

just some 2 cents.

|