md5(x) = x [and other properties of md5]

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Larson, Moderators General, Prelates

md5(x) = x [and other properties of md5]

Postby mgcclx » Fri Oct 24, 2008 6:38 pm UTC

Is there a integer x exist, so that md5(x) = x?
just wondering...
If such x exists, how many are there.
If non exists, what's the least amount of md5(x) require to be applied to an x until it equals to itself?

I've sort of hijacked this thread for other properties of MD5 too, and renamed it appropriately. If discussion picks up on exists x. md5(x) = x again, I'll split them.
mgcclx
 
Posts: 43
Joined: Mon Aug 25, 2008 6:37 am UTC

Re: md5(x) = x

Postby Xanthir » Fri Oct 24, 2008 7:50 pm UTC

md5 does have fixed points. I don't know what they are, or how many of them there are.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))
User avatar
Xanthir
 
Posts: 2872
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: Deep in the heart of Texas

Re: md5(x) = x

Postby Orborde » Sun Oct 26, 2008 3:54 am UTC

Xanthir wrote:md5 does have fixed points. I don't know what they are, or how many of them there are.

Can you provide a citation for that?
Orborde
 
Posts: 29
Joined: Mon May 12, 2008 5:45 am UTC

Re: md5(x) = x

Postby effective_ » Sun Oct 26, 2008 5:20 am UTC

d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70

and

d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70

Each of these blocks has MD5 hash 79054025255fb1a26e4bc422aef54eb4.

http://www.mscs.dal.ca/~selinger/md5collision/

md5 hash collision has been proven as seen by the link. but there does not exist an x where md5(x) = x. sorry not possible.
I'm on the brink of insanity, between extreme intelligence and split personalities
But I elevate to the point of reversing gravity
Revolutionary conceptuality spitting out of me
Even the dead people in my family tell me they proud of me
User avatar
effective_
 
Posts: 21
Joined: Sun Oct 26, 2008 1:29 am UTC

Re: md5(x) = x

Postby davean » Sun Oct 26, 2008 5:34 am UTC

effective_ wrote:md5 hash collision has been proven as seen by the link. but there does not exist an x where md5(x) = x. sorry not possible.


I think we all know about md5's collision issue, but the claim that it has no fixed points is interesting. Proof or citation to back that up? While it is common for functions not to, saying that it can't have one is interesting and requires a reason.
User avatar
davean
Site Ninja
 
Posts: 2214
Joined: Sat Apr 08, 2006 7:50 am UTC

Re: md5(x) = x

Postby effective_ » Sun Oct 26, 2008 6:05 am UTC

davean wrote:
effective_ wrote:md5 hash collision has been proven as seen by the link. but there does not exist an x where md5(x) = x. sorry not possible.


I think we all know about md5's collision issue, but the claim that it has no fixed points is interesting. Proof or citation to back that up? While it is common for functions not to, saying that it can't have one is interesting and requires a reason.
My induction sucks but I'm sure it's possible to disprove that especially if you're just talking about integers. I don't have a source for that, but by looking at the md5 algorithm it's looks pretty unlikely to have an integer that would produce an output of itself. Irrational numbers though I'm not so sure.
I'm on the brink of insanity, between extreme intelligence and split personalities
But I elevate to the point of reversing gravity
Revolutionary conceptuality spitting out of me
Even the dead people in my family tell me they proud of me
User avatar
effective_
 
Posts: 21
Joined: Sun Oct 26, 2008 1:29 am UTC

Re: md5(x) = x

Postby evilbeanfiend » Sun Oct 26, 2008 9:55 am UTC

well the output can't be an irrational number as it has a finite representation
in ur beanz makin u eveel
User avatar
evilbeanfiend
 
Posts: 2641
Joined: Tue Mar 13, 2007 7:05 am UTC
Location: the old world

Re: md5(x) = x

Postby davean » Sun Oct 26, 2008 10:06 pm UTC

effective_ wrote:My induction sucks but I'm sure it's possible to disprove that especially if you're just talking about integers.


Disprove what? And MD5 only deals with binary, which can be represented as integers but the operations it does on them are all logical operations. I'd be very impressed if you could prove anything about it with induction, this isn't the sort of problem that usually lends itself to induction at all.

Anyway, the answer would be smaller then 512 bits, the chunk size MD5 operates on.

effective_ wrote:I don't have a source for that, but by looking at the md5 algorithm it's looks pretty unlikely to have an integer that would produce an output of itself. Irrational numbers though I'm not so sure.


Unlikely is meaningless. Anyway, MD5 operates on binary, it doesn't apply to irrationals. Also, did you mean real? It seems odd to skip from integers to irrationals. Of course, if you meant real, any continuous function mapping from one range to its self must have at least one fixed point. Not that MD5 is continuous bit it would have to satisfy a fairly strict set of requirements not to have a fixed point with reals ... though it can't operate on reals ...

You aren't making any sense.
User avatar
davean
Site Ninja
 
Posts: 2214
Joined: Sat Apr 08, 2006 7:50 am UTC

Re: md5(x) = x

Postby Notch » Mon Oct 27, 2008 1:24 pm UTC

Assuming md5 has perfectly normal distribution:

The odds of a random normal integers in the range 0<=x<n being anything other than 0 is 1-1/n.
The odds of n random normal integers in the same range being anything other than 0 is (1-1/n)^n.

For md5, n is 2^128, so the odds that there are no md5(x)=x is (1-1/(2^128))^(2^128), which is about 0.37.


So I'd say, yes, it's likely there exists an md5(x)=x.

Disclaimer: I might suck at math.
Notch
 
Posts: 315
Joined: Tue Dec 12, 2006 5:52 pm UTC
Location: Stockholm, Sweden

Re: md5(x) = x

Postby '; DROP DATABASE;-- » Wed Oct 29, 2008 6:52 am UTC

So with MD5 or SHA-1, is there a known string length, that you can be sure no two strings of this length or less will ever collide?
poxic wrote:You suck. And simultaneously rock. I think you've invented a new state of being.
Image
User avatar
'; DROP DATABASE;--
 
Posts: 3242
Joined: Thu Nov 22, 2007 9:38 am UTC
Location: Midwest Alberta, where it's been raining for a week, which is odd

Re: md5(x) = x

Postby evilbeanfiend » Wed Oct 29, 2008 9:32 am UTC

well you could test it yourself for strings of length 1 pretty quickly (i'd be surprised if that got you any collisions) if we assume there is a string length at which there are no collisions then the really interesting question is what is the maximum string length with no collisions

do we need to split this? MD5(x) = MD5(y) is a somewhat different discussion to MD5(x) = x

I don't think so. It's related, and I'm not convinced there's a whole lot more to say on the original topic. I will change the name of the thread though.
in ur beanz makin u eveel
User avatar
evilbeanfiend
 
Posts: 2641
Joined: Tue Mar 13, 2007 7:05 am UTC
Location: the old world

Re: md5(x) = x

Postby Notch » Wed Oct 29, 2008 9:37 am UTC

A quick brute force check reveals no collisions in md5 for strings of length 2 or less.
(Technical note: I checked all chars between 32 (space) and 127 (⌂))

[edit:]

I think a split is a good idea..
md5(x) = md5(y) is guaranteed to be true for some values of x and y, but md5(x) = x might never be.
Notch
 
Posts: 315
Joined: Tue Dec 12, 2006 5:52 pm UTC
Location: Stockholm, Sweden

Re: md5(x) = x

Postby Sana » Wed Oct 29, 2008 10:59 am UTC

I don't understand what there is to discuss about md5(x) == md5(y). Of course this has to be true for some values of x and y because there are infinitely many strings, and only finitely many MD5 hashes.
Sana
 
Posts: 121
Joined: Wed Sep 26, 2007 3:53 am UTC

Re: md5(x) = x

Postby evilbeanfiend » Wed Oct 29, 2008 11:22 am UTC

yes but there must be a shortest string length for which MD5(x) = MD5(y) which is interesting as if you limit your strings to less than this length you can guarantee MD5(x) != MD5(y)
in ur beanz makin u eveel
User avatar
evilbeanfiend
 
Posts: 2641
Joined: Tue Mar 13, 2007 7:05 am UTC
Location: the old world

Re: md5(x) = x [and other properties of md5]

Postby MrSparkle » Thu Oct 30, 2008 12:11 am UTC

There can never be an X such that MD5(x) = x because The input to MD5 is different size than the output.

The input to MD5 is actually 512 bit blocks, the minimum length that you can input is 512 bits. For all inputs not divisible by 512 then input is padded with 0s to be a multiple of 512 bits. The output of MD5 is 128 bits.

Thus if you take an X128 and input it into a MD5 the actual calculation is not on X128 but on X128 with 384 zero bits tacked on to the end.

If you modify your question to account for this (basically asking what MD5(X appended with 384 zeros) = X) you could still ask the question I guess... but its not as simple as you think it is.

The answer to that question I am pretty sure is no, I lack the dedication to this thread to prove it to you. But I don't really understand why you seem to be convinced that such an X has to exist.
"Intuition, like a flash of lightning, lasts only for a second. [...] Suddenly the light breaks through and one finds after a few minutes what previous days of labor were unable to reveal."
~Cryptonomicon
User avatar
MrSparkle
 
Posts: 27
Joined: Thu May 10, 2007 3:40 am UTC

Re: md5(x) = x [and other properties of md5]

Postby Notch » Thu Oct 30, 2008 9:52 am UTC

MrSparkle wrote:There can never be an X such that MD5(x) = x because The input to MD5 is different size than the output.

The input to MD5 is actually 512 bit blocks, the minimum length that you can input is 512 bits. For all inputs not divisible by 512 then input is padded with 0s to be a multiple of 512 bits. The output of MD5 is 128 bits.


I can most certainly calculate the md5 hash value of a password 16 letters long (128 bits), and the output could most certainly be the same as the input (although it's fairly unlikely).
Just because md5 works on 512 bit blocks, it doesn't mean it can't handle inputs of a different size. That's exactly why it does the padding, to allow for different sizes.


Haha, I started brute forcing it, then did the math:

Brute forcing all md5(x)=x means checking 2.4*10^38 values. My quick test implementation can test some 2.3*10^9 values per hour, meaning it would take almost exactly 10^29 hours to brute force it.
Let's say I get a million people to help me out, then we're down to 10^23 years.. And let's say the algorithm gets a million times faster with some clever optimization, and we're down to 10^17 years. And let's pretend computers get a million times faster over night, and we're down to 10^11 years, which is significantly longer than the universe has existed for.

I stopped the program.
Notch
 
Posts: 315
Joined: Tue Dec 12, 2006 5:52 pm UTC
Location: Stockholm, Sweden

Re: md5(x) = x [and other properties of md5]

Postby davean » Thu Oct 30, 2008 3:08 pm UTC

MrSparkle wrote:There can never be an X such that MD5(x) = x because The input to MD5 is different size than the output.

The input to MD5 is actually 512 bit blocks, the minimum length that you can input is 512 bits. For all inputs not divisible by 512 then input is padded with 0s to be a multiple of 512 bits. The output of MD5 is 128 bits.

Thus if you take an X128 and input it into a MD5 the actual calculation is not on X128 but on X128 with 384 zero bits tacked on to the end.

If you modify your question to account for this (basically asking what MD5(X appended with 384 zeros) = X) you could still ask the question I guess... but its not as simple as you think it is.

The answer to that question I am pretty sure is no, I lack the dedication to this thread to prove it to you. But I don't really understand why you seem to be convinced that such an X has to exist.


As already said, the zero padding is defined as part of the function, so it does handle smaller input. That aside, no one said it had to exist, it very well might not but there hasn't been any reason put forth to believe it does not.

Read the original post, it asks "Is there a integer x exist, so that md5(x) = x?", the answer so far is "we don't know, maybe". For there to be a belief one way or another some reasoning has to be put forward.
User avatar
davean
Site Ninja
 
Posts: 2214
Joined: Sat Apr 08, 2006 7:50 am UTC

Re: md5(x) = x [and other properties of md5]

Postby mrbaggins » Wed Nov 05, 2008 9:45 am UTC

It's not a matter of probability based on bit allowances...

The function could be something as simple as adding one. You'll never get f(x) = x out of that.

I'd say no, there isn't... But that's purely speculation.
Why is it that 4chan is either infinitely awesome, infinitely bad, or "lolwut", but never any intermediary level?
User avatar
mrbaggins
 
Posts: 1565
Joined: Tue Jan 15, 2008 3:23 am UTC
Location: Wagga, Australia

Re: md5(x) = x [and other properties of md5]

Postby jimrandomh » Wed Nov 12, 2008 2:37 am UTC

MD5 has a fixed size (128-bit) output, so md5(x)=x implies that x is 128 bits long. MD5 operates on 512-bit blocks, so this gets padded out with zeroes (on the right). Since the length is fixed, each output bit o[1..128] is a fixed boolean function of the 128 inputs i[1..128], and we can get that function just by following through the algorithm. Finding a fixpoint is then equivalent to finding a solution to the boolean expression ((o[1]&i[1])|(~o[1]&~i[1])) & ((o[2]&i[2])|(~o[2]&~i[2])) & ... ((o[128]&i[128])|(~o[128]&~i[128])). We can then substitute formulas for the o[i]s, simplify, and solve.

Boolean satisfiability is NP-complete, so it's entirely possible that the resulting equation is impossible to compute. But it also might simplify to something easy, or allow an easy proof of its unsolvability. Determining which it is would take a fair bit of work, for no practical application. If you were to try to tackle this, the best strategy would be to use C++ to create a vector-of-boolean-expressions type, overload the bitwise operators, and use a stock implementation of MD5 with only the datatypes changed.
jimrandomh
 
Posts: 110
Joined: Sat Feb 09, 2008 7:03 am UTC

Re: md5(x) = x [and other properties of md5]

Postby Rysto » Wed Nov 12, 2008 2:47 am UTC

jimrandomh wrote:Boolean satisfiability is NP-complete, so it's entirely possible that the resulting equation is impossible to compute.

Say what? NP-complete problems might not be tractable, but they most certainly are computable.
Rysto
 
Posts: 1357
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: md5(x) = x [and other properties of md5]

Postby Yakk » Wed Nov 12, 2008 5:16 am UTC

2^128 =~ (10^3)^( 12.8 ) =~ 10^38.4

There are roughly 10^80 atoms in the visible universe, so it seems possible to build a computer capable of calculating a problem with complexity on the order of 2^128 without using all of the known matter in the universe.

Going closer to home, can we do it using mass already in the solar system? We'll presume we keep the sun as a power source (as finding an energy source to work against that gravity gradient would suck, especially without the sun to power it!)

Jupiter weighs 10^27 kg, and the solar system consists of Jupiter plus unimportant debris. If converted into a computer that engaged in 1 trillion operations per second per kg (10^12 ops/kg), it would take about 1 second for the Jupiter-computer to walk through all numbers from 0 to 2^128. If it takes 1 million instructions to do an MD5 checksum and compare it to the source value, that makes it take about 10 days for the Jupiter-computer to answer the question. This is an extremely ballpark figure: your computronium could kick the crap out of 10^12 ops/kg, and it probably takes far less than 10^6 instructions to do an MD5 checksum and compare, but it does give you an idea of the scale of the problem.

I would give someone a pass if they called a problem on that scale "impossible to compute". But that's just me.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
 
Posts: 6303
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: md5(x) = x [and other properties of md5]

Postby Notch » Wed Nov 12, 2008 8:44 am UTC

Why would you need that many components?

You can calculate it with a C64, given enough time, as you don't have to store past results.
Notch
 
Posts: 315
Joined: Tue Dec 12, 2006 5:52 pm UTC
Location: Stockholm, Sweden

Re: md5(x) = x [and other properties of md5]

Postby Why Two Kay » Wed Nov 12, 2008 2:14 pm UTC

Notch wrote:given enough time


That's kinda the issue.
tl;dr - I said nothing important.
User avatar
Why Two Kay
 
Posts: 266
Joined: Sun Mar 23, 2008 6:25 pm UTC
Location: Plano, TX

Re: md5(x) = x [and other properties of md5]

Postby Yakk » Wed Nov 12, 2008 4:33 pm UTC

So you have a computer that can examine a MD-5 self-equality in 1 / 10^9 seconds -- ie, a billion a second.

There are more than 10^38.4 such self-equalities to solve. That means it will take 10^29.4 seconds.

There are roughly 31,557,600 seconds in a year, or about 10^7.5 seconds in a year. So to solve the problem, it will take 10^21.9 years. That is a bit too long for my taste.

If you built 1 billion boxes, each with 1 million CPUs inside of them, each solving 10^9 of them per second, that reduces the time required down to 10^6.9 years, or roughly 7,943,282 years 126 days 17 hours 50 minutes and 49 seconds. Give or take an order of magnitude.

And that's a lot of computer power.

So yes, you can solve it in geological time using more computing power than the entire planet Earth currently has. You can solve it in days using the entire mass of Jupiter turned into a giant computer. And if you built an entire universe-computer to solve the problem, you could solve it in the blink of an eye.

But given any reasonable time and CPU limits, iterating over every MD5 checksum and testing for self-equality isn't possible.
Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.
User avatar
Yakk
 
Posts: 6303
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: md5(x) = x [and other properties of md5]

Postby Rysto » Thu Nov 13, 2008 12:16 am UTC

Well, personally I think that it's important to be correct in our use of terminology, especially in the CS forum. "Computable" has a well-defined meaning in CS.
Rysto
 
Posts: 1357
Joined: Wed Mar 21, 2007 4:07 am UTC

Re: md5(x) = x [and other properties of md5]

Postby davean » Thu Nov 13, 2008 2:09 am UTC

Rysto wrote:Well, personally I think that it's important to be correct in our use of terminology, especially in the CS forum. "Computable" has a well-defined meaning in CS.


Yes, yes it does, and this is defiantly in the computable set. Also, looking at MD5, it doesn't seem unreasonable that the bits wouldn't be entirely independent; a fact that could be well exploited. Though I haven't read them, the attacks against MD5 would probably also help shrink the search space a lot here.

While it probably would be a pretty large project, I don't see a reason that this should be written off as not solvable yet.
User avatar
davean
Site Ninja
 
Posts: 2214
Joined: Sat Apr 08, 2006 7:50 am UTC

Re: md5(x) = x [and other properties of md5]

Postby mroctogon » Thu Nov 13, 2008 10:56 pm UTC

Also, you don't need to calculated the entire md5 for every string, just enough to verify that it is not itself. And you could also hope that you get really really lucky with the values you choose and get it really fast.
mroctogon
 
Posts: 15
Joined: Fri Feb 29, 2008 1:26 am UTC

Re: md5(x) = x [and other properties of md5]

Postby davean » Fri Nov 14, 2008 2:45 am UTC

mroctogon wrote:hope that you get really really lucky with the values you choose and get it really fast.


Well A) thats only if such a fixed points exists; B) That will only answer "Yes" fast, not no or what they (all) are, just give an example; so yes it partially satisfies but isn't exactly stellar.
User avatar
davean
Site Ninja
 
Posts: 2214
Joined: Sat Apr 08, 2006 7:50 am UTC


Return to Computer Science

Who is online

Users browsing this forum: No registered users and 4 guests