Over the last year, there has been some discussion in the newsgroups about a
time conversion bug in COREDLL for debug builds. Just in the last week, we
diagnosed the two bugs -- one in the Windows CE code and one in the ARMV4I
compiler -- that were causing this (and other related problems).
Coincidentally, Microsoft just released QFE 030915_KB828098 for 4.2 that
attempts to address the problem, but it does so unsatisfactorily, for
several reasons:
- it doesn't fix the problem in CE .NET 4.1, only in 4.2
- it doesn't fix the offending CoreMain code properly; it hacks around the
problem
- the same offending code exists in other places in Windows CE, which this
QFE doesn't fix
- it doesn't fix the underlying ARMV4I compiler bug, which still exists
We'd like to see new QFEs released in the next few months that address this
properly, for all affected parts of Windows CE as well as the compiler, and
for 4.1 and 4.2 (this may also appear in 4.0, but we didn't check). What
follows below is the definition of the two bugs that caused the behavior,
but which still have not been fixed fully even with the new QFE applied.
BUG 1: mul64_32_64 uses (__int64) instead of (unsigned __int64)
The function mul64_32_64 in time.c, which is part of COREDLL in the private
area, multiplies a 64-bit FILETIME value by a 32-bit DWORD, and does by
multiplying the dwLowDateTime and dwHighDateTime fields of FILETIME with the
DWORD separately, then adding them together appropriately. In order to do
this, it must first cast the 32-bit values used in the multiplication to a
64-bit value, so that the compiler will not throw out the overflow above
32-bits, as it usually does with 32-bit multiplicaiton.
The bug is that mul64_32_64 casts these values to (__int64), which is
signed; it should be casting them to (unsigned __int64) and using an
unsigned __int64 temporary variable for accumulating the results. As a
result, the multiplicaiton is done using the smull instruction, which sign
extends the result. Therefore, if bit 31 of the dwLowDateTime is set, the
result of this first step will be a negative number. When this result is
added to the result of the multiplication between the DWORD and
dwHighDateTime, it will subtract rather than add to this value, which is
incorrect.
As an example, if the FILETIME arg contains the 64-bit number 0xb8e8b296740
and this is multiplied by 0x2710, the result should be 0x1c3701c01513400,
but mul64_32_64 returns 0x1c3490c01513400 in this case. The reason is that,
as a first step, mul64_32_64 takes the low 32-bits of FILETIME and
0x8b296740 and multiplies it with 0x2710. But before it does this, it casts
both operands to __int64, which causes the processor to treat the first
operand as a negative number. The result of this multiplication should be
0x0000153c01513400 to achieve the proper result, but will actually be
0xffffee2c01513400, yielding the incorrect result above.
So, if that's the fix, what's the problem with Microsoft's QFE patch?
Besides only addressing 4.2, there are two more fundamental issues:
A) Microsoft's QFE does not fix this by changing the __int64 values to
unsigned __int64, as the purpose of the function clearly requires. Instead,
it hacks around this by changing separating the casting of the values to
__int64 from the multiplication. Of course, that shouldn't make any
difference, since they are still casting them as signed __int64 values, but
it turns out it does (see the Bug #2 described below).
B) Microsoft's QFE only patches this in the version of mul64_32_64 that from
time.c in CoreMain. It doesn't fix it in the version of mul64_32_64 from
dhcp.c in Windows CE's DHCP code.
C) Microsoft also uses __int64 elsewhere in the OS, so it's possible that
this same issue -- assuming that int64's are unsigned -- could occur
elsewhere. It might make sense for Microsoft to explicitly define all 64-bit
variables in CE as "unsigned" unless they know that these variables must
support negative numbers.
BUG2: The ARMV4I Compiler Handles 64-bit Integer Multiplication Incorrectly
In Retail Builds
Before we start with how Microsoft's QFE is able to work despite incorrect
code, let's look at why the original code worked in retail builds, but not
debug builds, despite the problem described above, since that illustrates a
problem with the ARMV4I handling of 64-bit multiplication. In the bug above,
the first part of mul64_32_64, as originally implemented, resolves to the
simplified code below:
__int64 result;
DWORD op1, op2, op3;
...
result = (__int64) op1 * (__int64) op2;
result += ((__int64) op3 * (__int64) op2) << 32;
In a debug build, the above multiplication operations are generated using
ARM smull instructions, which perform a multiplication of two signed 32-bit
integers and yield a signed 64-bit result. Since __int64 is a signed data
type, these instructions match the source code. But when you compile the
above code in a retail build, the above operations are generated as follows:
umull r2, r0, r3, r1
mla r1, r4, r1, r0
In other words, umull is being used for the first operation, which is
incorrect given that the operands and the result of the first operation are
all signed integers. So mul64_32_64 should have been using unsigned __int64
values all along, but this second bug saved the day for retail builds on the
ARMV4I version of CE by forcing an unsigned multiplication despite what the
source code indicated.
And, of course, it turns out that Microsoft's QFE is saved by the same
compiler bug. In the patch, Microsoft replaces the above code with,
essentially, this:
__int64 result, m1, m2;
DWORD op1, op2, op3;
...
m1 = op1;
m2 = op2;
result = m1 * m2;
result += ((__int64) op3 * (__int64) op2) << 32;
The reason this works for both debug and retail builds is that, in this
case, the unsigned values of op1 and op2 are assigned to signed 64-bit
variables, something the ARMV4I compiler knows how to do correctly. In this
case, even if bit 31 of either op1 or op2 is set to 1, the compiler does not
treat these values as negative. So m1 and m2 both end up with positive
integers, and m1 * m2 works fine for both debug and retail builds.
But the ARMV4I compiler error for 64-bit multiplication still rears its head
even with Microsoft's QFE in place. If you look at the instructions
generated for a debug build of the above code, it's clearly performing the
multiplicaiton of m1 and m2 as signed integers (it requires about 16
instructions, including three multiplication instructions). In the case of a
retail build of the same code, however, the multiplication operation
generates:
umull r2, r0, r1, r4
mla r1, r3, r4, r0
In other words, Microsoft's QFE is a workaround (and not the best one) for
the fact that the ARMV4I compiler treats __int64 as a signed integer in
debug builds, and as an unsigned integer in retail builds, when performing
integer mutliplication. Whether the ARMV4I compiler handles this correctly
for other operations on __int64 integers is something we haven't explored.
Clearly, this is a serious issue considering the extent to which Microsoft
uses __int64 values in Windows CE code, since it causes the operating system
to function differently when debugging than for retail builds. It also
creates an issue for developers working with 64-bit integers in our own
code: we originally tried to use them in an iButton driver earlier this
year, but wound up storing the iButton's 64-bit ID as two separate 32-bit
integers because of flakiness with 64-bit arithmetic.
Nathan Tennies
Director, Software Engineering
InHand Electronics
(240) 558-2014 x.215
(240) 558-2019 (fax)