## 32 hex character strings md5 hashing to themselves

A place to discuss the science of computers and programs, from algorithms to computability.

Formal proofs preferred.

Moderators: phlip, Moderators General, Prelates

Blipo
Posts: 89
Joined: Mon Jul 30, 2007 8:15 pm UTC

### 32 hex character strings md5 hashing to themselves

Is it possible for a 32-hex-character-string to md5 hash to... itself?

Had this thought while on #stats. Anyone know if this is possible?
Last edited by Blipo on Thu Nov 22, 2007 3:35 am UTC, edited 1 time in total.

Rysto
Posts: 1458
Joined: Wed Mar 21, 2007 4:07 am UTC

### Re: 32 hex character strings hashing to themselves

Blipo
Posts: 89
Joined: Mon Jul 30, 2007 8:15 pm UTC

### Re: 32 hex character strings hashing to themselves

Whoa. Fix'd. =\

joeframbach
Posts: 1478
Joined: Sun Nov 05, 2006 12:49 am UTC
Location: The City of Steel
Contact:

### Re: 32 hex character strings md5 hashing to themselves

md5 you say?

well I guess there's only one way to find out. There's a finite number of possible strings. Get crackin.
xxv/♂/♫

Blipo
Posts: 89
Joined: Mon Jul 30, 2007 8:15 pm UTC

### Re: 32 hex character strings md5 hashing to themselves

joeframbach wrote:md5 you say?

well I guess there's only one way to find out. There's a finite number of possible strings. Get crackin.

It'd take rather more time than I have to figure this out, and I don't have nearly enough talent to write code to check for me.

Anyways, it's hypothetical. Is there anything preventing it from happening?

Notch
Posts: 318
Joined: Tue Dec 12, 2006 5:52 pm UTC
Location: Stockholm, Sweden
Contact:

### Re: 32 hex character strings md5 hashing to themselves

There are 16^32 possible strings like that.
A 128 bit md5 has has 16^32 (2^128) possible values.

If we pretend md5 is totally random, the odds of the first string ("0000000000000000") hashing to itself is 1/(16^32). The second string ("0000000000000001") also has a 1/(16^32) chance of hashing to itself.

So the odds of no string hashing to itself is (1-1/(16^32))^(16^32), which works out to.. uh.. something like 0.37.
In other words there probably is a md5 hash that hashes to itself. And would someone please double check my math?

Rysto
Posts: 1458
Joined: Wed Mar 21, 2007 4:07 am UTC

### Re: 32 hex character strings md5 hashing to themselves

I got the same. The chances seem to come out to about 1/e.

Yakk
Poster with most posts but no title.
Posts: 11045
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: 32 hex character strings md5 hashing to themselves

Just to note: there are hash functions that do not hash to themselves. The hash function "+5" doesn't, as an example.

Code: Select all

`#include <vector>#include <assert.h>static bool progress_report = true;static unsigned int progress_freq = 100000;typedef unsigned int (*hash_function)( unsigned char* data, size_t length );bool find_to_itself( hash_function hash,  unsigned int* result, size_t min_length_ = 0 ) {  unsigned int min_length = (min_length_<sizeof(unsigned int))?sizeof(unsigned int):min_length);  std::vector<unsigned char> buff(min_length);  unsigned char* buff_ptr = &buff.front();  unsigned int* data_ptr = reinterpret_cast<unsigned int*>(buff_ptr);  if (!data_ptr) {    assert(false);    return false;  }  unsigned int& data = *data_ptr;  data = 0;  unsigned int hash_result = hash_function( &buff.front(), min_length );  if (hash_result == data) {    if (result) *result = data;    return true;  }  while( (++data) != 0 ) {    hash_result = hash_function( &buff.front(), min_length );    if (hash_result == data) {      if (result) *result = data;      return true;    }    if (progress_report) {      if ((progress_freq % data) == 0) {        printf("Progress: up to %d checked\n", data);      }    }  }  return false;}void main() {  unsigned int result = 0;  bool found = find_to_itself( md5hash, &result );  if (found) {    printf("Found: %d hashes to itself\n", result);  } else {    printf("No self-hash found\n");  }}`

Write or find an md5 function, and that should do it.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Rysto
Posts: 1458
Joined: Wed Mar 21, 2007 4:07 am UTC

### Re: 32 hex character strings md5 hashing to themselves

There are 2^128 different possible strings. Surely that would take a prohibitively long time to check.

Yakk
Poster with most posts but no title.
Posts: 11045
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: 32 hex character strings md5 hashing to themselves

sizeof(unsigned int) is usually 4, and char's usually have 8 bits.

So that's 2^32 different unsigned ints to check, which is about 4 billion. I'd be shocked if the above program took more than a week to run, and if you remove the printf's and do some optimization you should be able to do this on the order of hours (if not minutes or seconds) on a modern consumer computer.

The real problem becomes "how expensive is generating an md5 checksum". And the easiest way to check that is to run the above program and time how long it takes to hit 1 million md5s. Multiply that by 4000, and that is your estimated time of computation.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Yakk
Poster with most posts but no title.
Posts: 11045
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: 32 hex character strings md5 hashing to themselves

Oh god, I'm sorry. I misread the original problem.

32 character hex string? Ahhhh! PAIN!

Ya, that's far to large to test exhastively.

I thought he said 32 bit number.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

phlip
Restorer of Worlds
Posts: 7535
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

### Re: 32 hex character strings md5 hashing to themselves

Now the real question is whether you're trying to hash the 16 bytes of binary data, or 32 bytes of that data as hexadecimal in ASCII... and in that latter case, whether it's uppercase or lowercase... or whether you'll accept a hex string with mixed cases that still hashes to itself... or maybe EBCDIC is OK...

Better test them all, just to be sure.

Code: Select all

`enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}`
[he/him/his]

Yakk
Poster with most posts but no title.
Posts: 11045
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: 32 hex character strings md5 hashing to themselves

Oh ya: regardless of the medium, if the 32 bit result equals the input, the input must fit in 32 bits of freedom.

So it is only a 2^32 search, which is quite doable. My code should work, you just have to have the hash function take the integer input, turn it into the format you want to CRC from, then do the CRC32, and return the result as a 32 bit int.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Rysto
Posts: 1458
Joined: Wed Mar 21, 2007 4:07 am UTC

### Re: 32 hex character strings md5 hashing to themselves

SHA-5 gives a 128-bit hash.

rrenaud
Posts: 47
Joined: Fri Jul 28, 2006 12:34 am UTC

### Re: 32 hex character strings md5 hashing to themselves

In case you are curious, the md5 hash function introduces some permutation on 32 hex character strings, as you realized. You are asking the question if the md5 induced permutation is (not) a derangement. You can read about them on wikipedia here.

http://en.wikipedia.org/wiki/Derangement

Marvin
Posts: 153
Joined: Thu Nov 22, 2007 8:03 am UTC
Location: Croatia
Contact:

### Re: 32 hex character strings md5 hashing to themselves

i find taht very improbable, when you look how md5 works. First it makes a block of 512bits, then makes 16blocks of 32bits, then makes some nuts functions, rotations and XORing on that. You'd have to have mighty luck to find such a hash... and i still think it doesn't exist and i'm just not able to se how to prove it by looking at md5...
42
--
If God intended us to program we would be born with serial I/O ports.

xyzzy
Meta-Titled
Posts: 263
Joined: Sun Mar 18, 2007 10:02 pm UTC
Location: Colossal Cave
Contact:

### Re: 32 hex character strings md5 hashing to themselves

Simple.

First, generate all possible MD5 hashes

Second, hash them all.

Third, compare the two for any items that match themselves

Done.

You'll be wanting a supercomputer to do this quickly though.
"Wile E. Coyote was a theoretical mathematician." - Leliel
"Modern life can be so boring without elements of the bizarre or the fantastical. Hence, we have steampunk." - Me

Marvin
Posts: 153
Joined: Thu Nov 22, 2007 8:03 am UTC
Location: Croatia
Contact:

### Re: 32 hex character strings md5 hashing to themselves

xyzzy wrote:Simple.

First, generate all possible MD5 hashes

Second, hash them all.

Third, compare the two for any items that match themselves

Done.

You'll be wanting a supercomputer to do this quickly though.

many of those i think...
42
--
If God intended us to program we would be born with serial I/O ports.

Yakk
Poster with most posts but no title.
Posts: 11045
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

### Re: 32 hex character strings md5 hashing to themselves

Rysto wrote:SHA-5 gives a 128-bit hash.

That is what I get for having CRC-32 on the brain.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.