I guess it's simply an off-by-one error. Most likely they read the file data into buffers of size 128, but then compare only the first 127 bytes of each buffer (e.g., because they use < instead of <= in their loop).
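To make that concrete, here's a minimal sketch of what such a loop might look like. This is purely hypothetical (nobody here has seen the FC.EXE source); the function names and structure are made up for illustration, only the 128-byte chunk size comes from the reports.

    /* Hypothetical sketch of the suspected off-by-one -- not the actual
     * FC.EXE code.  Files are compared in 128-byte chunks, but the loop
     * bound skips the last byte of every chunk. */
    #include <stdio.h>

    #define CHUNK 128

    static int chunks_differ(const unsigned char *a, const unsigned char *b,
                             size_t n)
    {
        /* BUG: with this 1-based loop the bound should be "i <= n";
         * using "<" means byte n (the 128th) is never compared. */
        for (size_t i = 1; i < n; i++)
            if (a[i - 1] != b[i - 1])
                return 1;
        return 0;
    }

    static int files_differ(FILE *f1, FILE *f2)
    {
        unsigned char buf1[CHUNK], buf2[CHUNK];
        size_t n1, n2;

        do {
            n1 = fread(buf1, 1, CHUNK, f1);
            n2 = fread(buf2, 1, CHUNK, f2);
            if (n1 != n2)
                return 1;                     /* lengths differ */
            if (n1 > 0 && chunks_differ(buf1, buf2, n1))
                return 1;                     /* contents differ */
        } while (n1 == CHUNK);
        return 0;                             /* reports "no differences" */
    }

With a loop like that, two files that differ only in the last byte of some 128-byte chunk come back as identical, which matches the reported symptom.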
Probably something like loading the 128 bytes into a byte array, then passing it to something that requires it to be \0-terminated, so the last data byte gets overwritten by the terminator.
EDIT: The suggested workaround is to run it in binary mode. Since binary mode has no concept of \0-terminated strings, that seems to back up my theory.
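That version of the theory would look something like this -- again a hypothetical sketch, not the real code; the string-based handling is the assumption being illustrated:

    /* Hypothetical sketch of the \0-termination theory.  A chunk is read
     * into a 128-byte buffer and then treated as a C string, so whenever a
     * full chunk is read the terminator clobbers the 128th data byte. */
    #include <stdio.h>

    #define CHUNK 128

    static size_t read_chunk_as_string(FILE *f, char buf[CHUNK])
    {
        size_t n = fread(buf, 1, CHUNK, f);
        /* BUG: when n == CHUNK there is no room left for the terminator,
         * so the last real byte of the chunk gets overwritten with '\0'. */
        buf[n == CHUNK ? CHUNK - 1 : n] = '\0';
        return n;
    }

Two such buffers compared with strcmp() can never disagree in the 128th byte of a chunk, and any embedded \0 in the data would cut the comparison short as well -- either way, a plain byte-wise comparison (which is what binary mode implies) wouldn't have the problem.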
If it's something that simple, then how on earth could MS ship it? What tests could they have run that would miss this? I'd think a test of a file-comparison program would include cycling a single-bit alteration through the file and checking that it was caught, up to some arbitrary number of bytes (and I'd think you'd let it run past 128 bytes too).
I'm not a coder; does this sort of assumption seem reasonable? If you can't get the basics right ...?
C is a fantastic language for failing to get the basics right. It takes only a moment's inattention from even a C master and you've got an error like this. In almost every more modern language I can think of, the language itself affords constructs that make this particular error much harder to write, unless you write C-in-(Python/Ruby/Haskell/etc.) in the first place. And of course those other languages all have their own problems, but they tend to be fewer, which is why we can write in them faster.
Your test suite seems broken (which shows why test suites are hard to write in the first place). What if there is a bug when you have two bits changed -- one in the 128th byte and one in the 256th byte? Here's a possible test suite that could catch those bugs.
Take two files. Generate all possible combinations of byte-differences between the two files up to a length of 256 bytes and flag invalid comparisons.
Now, what is the running time of this test suite? I'll give you a hint: it is unlikely to finish before the heat death of the universe.
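To put rough numbers on that: if each of the 256 byte positions can independently either match or differ, that's already 2^256 (about 1.2 x 10^77) difference patterns before you even choose the differing byte values; at a generous billion comparisons per second, the sweep takes on the order of 10^60 years.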
In general it's very hard to write comprehensive test suites, and the only thing this proves is that Microsoft developers are humans, not gods. In fact, this is exactly the type of weird corner-case bug that I would expect to find in code written by good developers.
>Your test suite seems broken (which shows why test suites are hard to write in the first place). What if there is a bug when you have two bits changed
I said "a test ... would include ...".
This just seems like the first, most obvious pattern to check against. You need to be sure the program compares every bit, and that just seems the simplest way to be sure it's looping through every bit of both files and making a proper comparison. Knowing the internals of the program and the functions used would give you the data length to check against.
You seem to be saying that because a comprehensive test is impossible no test should have been performed.
In any case, wouldn't you see it in the ASCII readout of a watch routine or some such - hey, look, [made up variable] currentChunk has string terminators at the end which aren't in the corresponding chunks of the test file.
You say good coders would miss this sort of bug; I'd think a good coder would realise the function they're using puts a string terminator in.
Without seeing the code I guess we wouldn't know, but direct comparison is surely one of the more basic operations to code (although I grant that doing it quickly maybe isn't). Surely the fundamental part is to XOR registers and look for 1s?
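For what it's worth, the core really is that simple. A minimal sketch of the "XOR and look for 1s" idea, assuming aligned 64-bit words and a length that's a multiple of eight bytes -- an illustration, not how fc actually does it:

    #include <stdint.h>
    #include <stddef.h>

    /* Returns nonzero if the two buffers differ anywhere.  Each XOR leaves
     * a 1 bit wherever the corresponding bits differ; OR-ing the results
     * together means one test at the end catches any difference. */
    static int buffers_differ(const uint64_t *a, const uint64_t *b,
                              size_t n_words)
    {
        uint64_t diff = 0;
        for (size_t i = 0; i < n_words; i++)
            diff |= a[i] ^ b[i];
        return diff != 0;
    }

The hard part is the bookkeeping around it -- chunking, buffer lengths, tail bytes -- which is exactly where an off-by-one like this tends to live.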
He didn't suggest that the test suite should account for all possible combinations of byte differences. He suggested that it should account for all possible SINGLE byte differences. I.e., the test suite should have checked two files with only the first byte different, with only the second byte different, etc. Such a test suite would be linear in complexity, not exponential, and could easily be run before the heat death of anything.
(However, to play my own devil's advocate, I'd have to say it's easy in retrospect to say "yes! there is an easy test for this that should have been written!" when in fact often the number of possible tests is astronomically large and it can be hard to pick the right ones. What if the bug was that FC.EXE didn't correctly register a difference when both the 127th and 128th bytes were the only differences? The proposed test suite would not have caught it.)
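Here's a sketch of that linear single-difference sweep, to show how cheap it is. buffers_report_difference is a hypothetical stand-in for whatever comparison routine is under test; the buggy chunked loop sketched earlier in the thread would trip the assert at i = 127, 255, and so on:

    #include <assert.h>
    #include <string.h>
    #include <stddef.h>

    /* Stand-in for the routine under test; swap in the real chunked
     * comparison to exercise it. */
    static int buffers_report_difference(const unsigned char *a,
                                         const unsigned char *b, size_t n)
    {
        return memcmp(a, b, n) != 0;
    }

    static void sweep_single_byte_differences(size_t len)
    {
        unsigned char a[512], b[512];
        memset(a, 'x', len);
        for (size_t i = 0; i < len; i++) {
            memcpy(b, a, len);
            b[i] ^= 0x01;                  /* flip one bit in byte i */
            assert(buffers_report_difference(a, b, len));
        }
    }

    int main(void)
    {
        sweep_single_byte_differences(384);   /* 384 cases, not 2^384 */
        return 0;
    }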
Your devil's advocate argument is pretty much just my post. I was trying to show that, while the parent's test suite has linear running time, the number of tests in a fully comprehensive suite is exponential, so the runtime of any completely comprehensive test suite is exponential too. Choosing the right tests to spend resources on is a very difficult problem.
This is an easy mistake to make, and miss when looking over your code. With hindsight - and having it explained to you - it is also a very easy mistake to understand.
That doesn't mean it was an easy mistake to find. I bet it went without being detected for a long time.
I don't know the root cause, but something this esoteric would most likely be discoverable only in a code review, or possibly a unit test. It's hard to imagine someone thinking of testing the case of two files that differ only at every 128th byte without reading the code.
The most common case: any situation in which only one character in a file is different. For one in every 128 pairs of these files, more or less, fc.exe will say "no differences found".
Are you asking why anyone would ever change only one character in a file? Typo corrections come to mind, or mild file corruption. I recovered a ton of files from a failing drive a few years ago, and quite a few of the text files had just a character or two corrupted.
I'm not sure what people generally use fc for -- I don't -- so it's hard to say how serious the impact might be when it fails.
This was from XP before they re-architected the DLL management. I'm not sure blaming Microsoft for a mistake made more than 9 years ago in OS design is relevant or helpful.
fc.exe doesn't use any DLLs. It's a trivial console application, deployed as a single .exe that can be overwritten as long as it's not currently running.
Did you look any of this information up before making your statements? It looks like ulib.dll is a DLL for file utilities, which would serve a purpose analogous to that of librt and libc on a UNIX system.
You can rest assured that the actual bug was not in a CRT library DLL. They may have had a good reason for updating ulib.dll which wasn't mentioned in the KB article, but fixing an off-by-one bug in an app-specific memory-compare loop wasn't it.
Sigh... 30+ years on, and MS still cannot figure out how to change a basic setting or update something without also requiring a reboot in the process.
Microsoft's single-purpose 'File Compare' utility actually fails to do the one thing it was designed for? It boggles the mind. (Sorry for the lack of content, but OMG and WTF!)