In regard to signed executable checks? And why are you focusing only on MD5 when SHA-1 fingerprints sometimes are, and (in my opinion) always should be, coupled with it as a verification mechanism? Many different approaches could be used, depending on the goal you wish to accomplish. What are you doing with it specifically?
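For what it's worth, checking both fingerprints costs almost nothing extra. Here is a minimal sketch in Python using the standard hashlib module. The function names and the chunk size are my own picks for illustration, not anything from a particular tool:

```python
import hashlib

def md5_and_sha1(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at `path`, read in one pass."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def verify(path, expected_md5, expected_sha1):
    """True only if BOTH digests match the published values."""
    md5_hex, sha1_hex = md5_and_sha1(path)
    return md5_hex == expected_md5.lower() and sha1_hex == expected_sha1.lower()
```

Someone tampering with the file would then have to match two different digests at once, not just one.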
I am specifically looking for information on whether, when an MD5 checksum is taken of a file (any file), the contents of the file can be altered in such a way that the MD5 checksum of the altered file still comes out the same. I seem to remember reading an article somewhere about MD5 not being trustworthy because it could be manipulated.
Nothing is hackproof; if you believe something is, there's a surprise to be had. The hash is only as good as the software that authenticates it. If the authenticating software has been altered, it could report a hash that appears valid but actually is not. I have never tried to alter something and keep the MD5 checksum the same without modifying the authenticating tool itself, so I cannot say for certain whether it is possible or not.
To Quote Michalson:
"MD5 is easily crackable for any "small" piece of data. While many message boards and such use MD5 to "encrypt" passwords, the only real use of MD5 is as a hash for large pieces of data (at least several KB). Standard passwords encoded with MD5 can be broken easily; on my machine I can break "normal" (i.e. 6-10 characters using [A-Z,a-z,0-9]) passwords in less than 2 minutes given the MD5 hash (that's using brute force). Now fortunately MD5 is designed in such a way that you can't nudge data to get a particular MD5 hash (except via brute force of approx. 2^128), and so it is very hard to change data and still have the same MD5 hash."
He's a cryptanalysis expert and a moderator on a few boards, and his answer is valid in my eyes.
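To show what brute forcing a short password from its MD5 hash looks like, here is a rough sketch. It is not Michalson's tool. The function name, the lowercase-only alphabet and the 4-character limit are assumptions I made so that it finishes quickly:

```python
import hashlib
import itertools
import string

def brute_force_md5(target_hex, alphabet=string.ascii_lowercase, max_len=4):
    """Try every string over `alphabet` up to `max_len` characters.

    Returns the first candidate whose MD5 digest equals `target_hex`,
    or None if nothing in the search space matches.
    """
    target = target_hex.lower()
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.md5(candidate.encode("ascii")).hexdigest() == target:
                return candidate
    return None

# Hypothetical example: an unsalted MD5 hash of a 3-letter password.
stored = hashlib.md5(b"cat").hexdigest()
print(brute_force_md5(stored))
```

The search space grows as alphabet size to the power of the password length, which is why short passwords fall quickly and long inputs do not.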
Not a problem. He tends to explain things in much more depth without writing a story and confusing the hell out of people, as I tend to do from time to time.
There's a project where they're trying to find duplicate MD5s, I think. It's a large project, on the order of the distributed.net stuff.
Obviously for any hash cipher there are an infinite number of data sets which will hash to the same digest. Does this negate the usefulness of hash ciphers? No.
If you are hashing some sort of structured data, it should be statistically impossible to generate a similarly structured data set which hashes to the same digest value. If it is not impossible, the security properties of that hash cipher have been compromised, and use of that particular cipher should be abandoned.
Needless to say, I don't think we'll see anyone abandoning MD5 any time soon.
MD5 is used by a number of operating systems for secure storage of passwords in the shadow file. HMAC-MD5/CRAM-MD5 is a standardized authentication system used by countless servers throughout the world to provide secure authentication on a number of services.
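For anyone who hasn't seen how the HMAC-MD5 challenge/response works, here is a minimal sketch with Python's standard hmac module. The function names and the sample username, password and challenge in the usage note are hypothetical. Real CRAM-MD5 also base64-encodes the challenge and response on the wire, which this leaves out:

```python
import hashlib
import hmac

def cram_md5_response(username, password, challenge):
    """Client side: key the HMAC with the shared password, sign the challenge."""
    digest = hmac.new(password.encode("utf-8"),
                      challenge.encode("utf-8"),
                      hashlib.md5).hexdigest()
    return f"{username} {digest}"

def server_check(response, username, password, challenge):
    """Server side: recompute the expected response and compare in constant time."""
    expected = cram_md5_response(username, password, challenge)
    return hmac.compare_digest(response.encode("utf-8"), expected.encode("utf-8"))
```

A call would look like `server_check(cram_md5_response("alice", "s3cret", "<123.456@example.org>"), "alice", "s3cret", "<123.456@example.org>")`. The password itself never crosses the wire, and each fresh challenge produces a different response.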
Simply stated, you cannot alter a single bit of the file and get the same checksum back. You would have to change other bits to exactly the right values to arrive at the same checksum again. The better idea is to discover the process by which the file is digitally signed, and duplicate that process to calculate a new signature. This is commonly done on the Xbox for save games, and you can examine how the process works using an Xbox game's XBE and a program like XBE tool and Xboxsavsig. XBE Tool will examine the executable and discover the key used for signing files, and xbesavesig will use that key to re-sign altered files. MD5 is certainly not perfect. But not trustworthy? I don't think you have to worry about your MD5 checksums failing to report an altered file... it's much more likely a hacker would update your MD5 checksum than try to rewrite a file to match the old one.
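The single-bit claim is easy to check for yourself. A small sketch, with helper names and sample text of my own choosing, that flips one bit of the input and compares the two MD5 digests:

```python
import hashlib

def flip_bit(data, bit_index):
    """Return a copy of `data` with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

def differing_bits(a, b):
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"The quick brown fox jumps over the lazy dog"
altered = flip_bit(original, 0)  # inputs now differ in exactly one bit

d1 = hashlib.md5(original).digest()
d2 = hashlib.md5(altered).digest()
print(d1.hex())
print(d2.hex())
print(differing_bits(d1, d2), "of 128 digest bits differ")
```

With a well-mixed hash you would typically expect somewhere around half of the digest bits to change, which is why a casual edit never goes unnoticed.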
Was that page written by a bunch of teenagers? Their entire argument is horribly flawed. For any hash cipher there exists an infinite number of data sets which produce the same digest... that's basic information theory. Obviously the time required to brute force duplicate digests is proportional to the size of the input data set... and if you read their methodology, they are employing brute force... but on ridiculously small data sets.
They intend to find two arbitrary 32 byte data sets which produce the same MD5 digest. Excuse me, but how often are MD5 checksums performed on 32 byte data sets when there isn't some intervening mechanism like HMAC to provide an extra degree of security (and the HMAC method has been found to be quite sound)? By using such a small amount of input data, and with the only requirement being that they find two arbitrary data sets which produce the same MD5 hash, they drastically reduce the time and computational power needed for their effort.
If they really wanted to demonstrate the cryptographic insecurity of MD5, they should generate a random data set of a reasonably long length (16kB?), calculate its MD5 digest, and then find a data set of equal length which produces the same MD5 digest.
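The difference between the two goals (any two inputs that match each other, versus a second input that matches a given one) can be seen on a toy scale. The sketch below keeps only the first 16 bits of the MD5 digest so both searches finish quickly. The truncation, the function names and the message formats are assumptions of mine, and it says nothing directly about the cost against full 128-bit MD5:

```python
import hashlib

PREFIX_BYTES = 2  # keep only 16 bits of the digest so the search finishes fast

def short_md5(data):
    """First PREFIX_BYTES bytes of the MD5 digest (a deliberately weak toy hash)."""
    return hashlib.md5(data).digest()[:PREFIX_BYTES]

def find_collision():
    """Find ANY two distinct inputs with the same truncated digest."""
    seen = {}
    counter = 0
    while True:  # must end within 2**16 + 1 inputs by the pigeonhole principle
        msg = b"msg-%d" % counter
        h = short_md5(msg)
        if h in seen:
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1

def find_second_preimage(target_msg, max_tries=2_000_000):
    """Find a different input matching the truncated digest of a GIVEN input."""
    target = short_md5(target_msg)
    for counter in range(max_tries):
        msg = b"try-%d" % counter
        if msg != target_msg and short_md5(msg) == target:
            return msg, counter + 1
    return None, max_tries

a, b, collision_tries = find_collision()
other, preimage_tries = find_second_preimage(b"a fixed, given file")
print("collision found after", collision_tries, "hashes")
print("second preimage found after", preimage_tries, "hashes")
```

For an n-bit digest the first search needs on the order of 2^(n/2) hashes and the second on the order of 2^n, so finding two arbitrary matching inputs is a much weaker result than matching a digest that someone else chose.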
If not this, they should apply the same methodology to SHA1 to make it a fair comparison.
Given the sheer volume of off-the-cuff numbers found throughout their writeup with absolutely no mathematical backing, I'd simply say they're full of shit.
While I admit the "safety margin" of MD5 is shrinking, and operating systems currently employing MD5 for passwd hashing should switch to SHA1 in the near future, there are several applications where security is irrelevant and verification of data integrity is the real concern. If security were the only factor, why does CRC32 still see widespread use? Perhaps because an algorithm as complicated as MD5 or SHA1 is too much for your typical microcontroller to see acceptable throughput? Using SHA1 over MD5 will reduce hashing throughput by about 20%. Conversely, for their MD4 example, the difference is more like 5%. The MD4/MD5 comparison also ignores the massive proliferation of MD5 compared to MD4.
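If you want to see the throughput difference on your own hardware, a rough benchmark like this will do. The buffer size, repeat count and function name are arbitrary choices of mine. The results depend heavily on the CPU and the library build, so don't expect them to reproduce the percentages above:

```python
import hashlib
import time
import zlib

def throughput_mb_s(func, data, repeats=20):
    """Hash `data` `repeats` times and return the rate in megabytes per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(data)
    elapsed = time.perf_counter() - start
    return (len(data) * repeats) / elapsed / 1e6

data = bytes(4 * 1024 * 1024)  # 4 MiB of zero bytes as a stand-in payload
candidates = {
    "crc32": zlib.crc32,
    "md5": lambda d: hashlib.md5(d).digest(),
    "sha1": lambda d: hashlib.sha1(d).digest(),
}
results = {name: throughput_mb_s(fn, data) for name, fn in candidates.items()}
for name, rate in results.items():
    print(f"{name:6s} {rate:10.1f} MB/s")
```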
For one of my projects, I chose to use MD5 to verify file integrity, and HMAC-SHA1 for authorization. When selecting a hash cipher, there will always be a performance/security tradeoff, so selection criteria should be based on how problematic usage of a weaker cipher could potentially be in the future. I don't believe we'll see anyone creating a 16kB data set which produces the same MD5 digest as a different 16kB data set any time in the next 20 years.