Crash from HeapLock

One of my routine tasks is investigating crash dumps collected by the Windows Error Reporting service (also known as Winqual). This morning I found that cabs were available for some of the high-volume buckets. I downloaded the first one, opened it in WinDbg, fixed up the symbols, and here is the stack after the .ecxr and kb commands:

0:019> kb
  *** Stack trace for last set context - .thread/.cxr resets it
ChildEBP RetAddr  Args to Child              
01a1ac80 74b76dee 01c70000 01a1ed04 62d54ab3 ntdll!RtlLockHeap+0x16
01a1ac8c 62d54ab3 01c70000 bfb3fcb0 01a1f050 KERNELBASE!HeapLock+0xe
01a1ed04 62d55239 01b42b71 00000200 01a1f064 MyModule!IsAddressOnHeap+0x83
...

Here is the pseudocode for MyModule!IsAddressOnHeap. It takes a void* parameter, walks through the heaps in the current process, and returns whether the address might come from one of them.

BOOL IsAddressOnHeap(void* p) {
	HANDLE handles[4096];
	int heaps = GetProcessHeaps(sizeof(handles)/sizeof(HANDLE), handles);	
	if(heaps == 0)
		return FALSE;
	for(int i=0 ; i < heaps ; i++) {
		if (HeapLock(handles[i])) {
			__try {
				// use HeapWalk to determine whether the address is on heap
			}
			__finally {
				HeapUnlock(handles[i]);
			}
		}
	}
	return FALSE;
}

I already had a good guess about what went wrong, but let’s confirm it. First, let’s see how the execution was transferred from MyModule!IsAddressOnHeap to KERNELBASE!HeapLock:

0:019> ub 62d54ab3 
MyModule!IsAddressOnHeap+0x62
62d54a92 0f8447010000    je      MyModule!IsAddressOnHeap+0x1af (62d54bdf)
62d54a98 33db            xor     ebx,ebx
62d54a9a 895de4          mov     dword ptr [ebp-1Ch],ebx
62d54a9d 3bdf            cmp     ebx,edi
62d54a9f 0f8d34010000    jge     MyModule!IsAddressOnHeap+0x1a9 (62d54bd9)
62d54aa5 8b8c9da4bfffff  mov     ecx,dword ptr [ebp+ebx*4-405Ch]
62d54aac 51              push    ecx
62d54aad ff155ca0d662    call    dword ptr [MyModule!_imp__HeapLock (62d6a05c)]

The parameter passed to KERNELBASE!HeapLock was pushed onto the stack from ecx, which in turn was loaded from the address ebp+ebx*4-405Ch. We also know that ebx is used as the index (variable i) and edi holds the count (variable heaps). We can get the register values from the context record saved at the time of the crash, but they might not be the same as they were in MyModule!IsAddressOnHeap, because there are two more frames (KERNELBASE!HeapLock+0xe and ntdll!RtlLockHeap+0x16) on the stack above it.

Things are simple in this case because these two functions had barely run (the offsets are 0xe and 0x16), so we can disassemble each one from its start to that offset and check that ebx and edi were not changed (I did, and they were not). Had they been changed, they would still be retrievable: ebx and edi are non-volatile registers, so the callee is responsible for preserving the old values (by pushing them onto the stack and popping them before returning). Since they were not changed here, let’s use the registers in the crash context record to see what was in the handles array:

0:019> .ecxr
eax=3b94b284 ebx=00000004 ecx=01c70000 edx=00000000 esi=01c70000 edi=00000005
eip=77481641 esp=01a1ac54 ebp=01a1ac80 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
ntdll!RtlLockHeap+0x16:
77481641 f7464400000001  test    dword ptr [esi+44h],1000000h ds:0023:01c70044=????????

* 01a1ed04 is the ChildEBP of MyModule!IsAddressOnHeap from the kb output; ebx is 4, so this reads handles[4], which matches the first argument shown by kb
0:019> dd 01a1ed04+4*4-405c l1
01a1acb8  01c70000

* edi has the value 5, so dump the first 5 elements of the handles array
0:019> dd 01a1ed04-405c l5
01a1aca8  006d0000 004b0000 00690000 01a80000
01a1acb8  01c70000
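
The two dd addresses above come straight from the operand of the mov instruction, [ebp+ebx*4-405Ch]. As a small illustration, here is the same arithmetic in C. The helper name handle_slot_address is hypothetical, and 32-bit addresses are assumed.

```c
#include <stdint.h>

/* Distance of handles[0] below the frame pointer, taken from
   "mov ecx,dword ptr [ebp+ebx*4-405Ch]" in the disassembly. */
#define HANDLES_OFFSET_FROM_EBP 0x405Cu

/* Hypothetical helper: address of handles[index] in a 32-bit frame,
   where each HANDLE occupies 4 bytes. */
static uint32_t handle_slot_address(uint32_t frame_ebp, uint32_t index)
{
    return frame_ebp + index * 4u - HANDLES_OFFSET_FROM_EBP;
}
```

With the frame's ChildEBP of 01a1ed04, index 4 gives 01a1acb8 and index 0 gives 01a1aca8, the two addresses passed to dd.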

Now we know what was in the handles array. Let’s take a look at the heaps in the PEB:

0:019> dt _PEB @$peb
MyModule!_PEB
   ...
   +0x088 NumberOfHeaps    : 4
   +0x08c MaximumNumberOfHeaps : 0x10
   +0x090 ProcessHeaps     : 0x7754d3a0  -> 0x006d0000 Void
   ...
0:019> dd 0x7754d3a0 l4
7754d3a0  006d0000 004b0000 00690000 01a80000

So the PEB reported 4 heaps while our array had 5. The first 4 were the same as in the PEB; the fifth was not in the PEB's heap list and was not in a committed page:

0:019> !address 01c70000
Usage:                  Free
Base Address:           01b48000
End Address:            5ffd0000
Region Size:            5e488000
State:                  00010000	MEM_FREE
Protect:                00000001	PAGE_NOACCESS
Type:                   <info not present at the target>

So it looks like by the time MyModule!IsAddressOnHeap reached the last heap, that heap had been destroyed and its memory freed. We need to put HeapLock into another SEH block to catch the access violation. The existing block cannot be reused, as its sole purpose is to unlock the heap in the __finally block (if it was locked successfully).

I wrote a simple test application and validated the theory: after calling HeapDestroy on a heap, calling HeapLock with the same heap handle causes an access violation. HeapValidate can be used to check whether a heap is valid (even with a handle that has already been passed to HeapDestroy). I thought of calling it before HeapLock, but quickly ditched the idea because of the potential race condition: HeapDestroy could be called in between.

Posted on May 27, 2013, in debug.
