Blue Screen: Who Is to Blame?

A bug always catches you off guard where you least expect it, only what it brings is not a pleasant surprise but the Blue Screen of Death!

So be it; all we can do is deal with it as it comes.

First, an outline of the program's overall flow:

NTSTATUS
XXXProcessDirents(…)
{    
    do {
        KeEnterCriticalRegion();
        ExAcquireResourceSharedLite(&fcb->Resource, TRUE);

        /* access several members of fcb structure */
        ExReleaseResourceLite(&fcb->Resource);
        KeLeaveCriticalRegion();

        XXXXProcessDirent(…);

    } while (list_is_not_empty(….));

    return status;
}

NTSTATUS
XXXXProcessDirent(…)
{
    HANDLE handle = NULL;
    XXXX_FILE_HEADE fileHead;
    ……

    /* open file */
    status = ZwCreateFile(&handle, GENERIC_READ, &oa, &iosb, NULL, 0,
                          FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                          FILE_OPEN, 0, NULL, 0);

    /* read file header*/
    status = ZwReadFile(handle, ioevent, NULL, NULL, &iosb, (PVOID)&fileHead,
                        sizeof(XXXX_FILE_HEADE), &offset, NULL);

    /* check whether file is interesting to us */
    if (status == STATUS_SUCCESS && iosb.Information == sizeof(……)) {
        /* it’s my taste, haha */
    }

    /* close file, not interested in it any more */

    if (handle){
        ZwClose(handle);
    }

    return status;
}

The flow is fairly simple: XXXProcessDirents() calls XXXXProcessDirent() in a loop until every entry in the list has been checked.

Now for the WinDbg analysis:

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0abc9867, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000001, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 806e7a2a, address which referenced memory

Debugging Details:
------------------

WRITE_ADDRESS:  0abc9867

CURRENT_IRQL:  2

FAULTING_IP:
hal!KeAcquireInStackQueuedSpinLock+3a
806e7a2a 8902            mov     dword ptr [edx],eax

DEFAULT_BUCKET_ID:  DRIVER_FAULT

BUGCHECK_STR:  0xA

PROCESS_NAME:  System

TRAP_FRAME:  b9019bbc -- (.trap 0xffffffffb9019bbc)
ErrCode = 00000002
eax=b9019c40 ebx=00000000 ecx=c0000211 edx=0abc9867 esi=c0000128 edi=8842d268
eip=806e7a2a esp=b9019c30 ebp=b9019c68 iopl=0         nv up ei ng nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010286
hal!KeAcquireInStackQueuedSpinLock+0x3a:
806e7a2a 8902            mov     dword ptr [edx],eax  ds:0023:0abc9867=????????
Resetting default scope

LAST_CONTROL_TRANSFER:  from 806e7a2a to 80544768

STACK_TEXT:
b9019bbc 806e7a2a badb0d00 0abc9867 804f4e77 nt!KiTrap0E+0x238
b9019c68 806e7ef2 00000000 00000000 b9019c80 hal!KeAcquireInStackQueuedSpinLock+0x3a
b9019c68 b9019d24 00000000 00000000 b9019c80 hal!HalpApcInterrupt+0xc6
WARNING: Frame IP not in any known module. Following frames may be wrong.
b9019cf0 80535873 00000000 8896fb20 00000000 0xb9019d24
b9019d10 b79d87ff ba668a30 8859b7e8 00000440 nt!ExReleaseResourceLite+0x8d
b9019d2c b79d8a5c 8a3ff2f0 00000003 ba6685f0 XXXXX!XXXProcessDirents+0xef
b9019d88 b79e163a e2f6b170 00000001 00000001 XXXXX!XXXKernelQueryDirectory+0x20c
b9019ddc 8054616e b79e1530 88a8ae00 00000000 nt!PspSystemThreadStartup+0x34
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

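The fault shows up in the system routines ExReleaseResourceLite() and KeAcquireInStackQueuedSpinLock(), and the address being written, 0abc9867, is clearly bogus, so stack corruption is a reasonable first hypothesis.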

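The first suspect is the lock-protected part of XXXProcessDirents(), since it does contain buffer copy operations, the kind of code most likely to corrupt a stack. After careful review and testing, though, that part was ruled out.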

With the first suspect cleared, there was no obvious target left, so it was back to the WinDbg log:

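It appears that the address KeAcquireInStackQueuedSpinLock() is trying to write is LockQueue->Next of its LockHandle, and a LockHandle is normally allocated on the current stack, which confirms the earlier stack-corruption hypothesis. The question is who corrupted the stack.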

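The stack contains a call to hal!HalpApcInterrupt(), the software interrupt that handles APCs. hal!HalpApcInterrupt() normally calls nt!KiDeliverApc() to process the thread's APC queues. When ExReleaseResourceLite() is called, however, the thread is still inside a critical region, where user-mode APCs and normal kernel-mode APCs are disabled but special kernel-mode APCs are not.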

The most common source of a special kernel APC is IoCompleteRequest(): it queues one so that IopCompleteRequest() runs at APC_LEVEL to perform the stage 2 cleanup of the IRP.

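At this point the problem finally starts to take shape. The only place in the code under analysis that can cause an APC to be queued is the ZwReadFile() call in XXXXProcessDirent(), and fileHead is allocated on the stack.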

With that, the root cause of the bug surfaces:

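XXXXProcessDirent() does not handle the case where ZwReadFile() returns STATUS_PENDING. In that case XXXXProcessDirent() returns and execution continues while the IRP for the earlier ZwReadFile() is still being completed, and that completion writes to the address of fileHead, which lies in stack space that has long since been released and reused.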

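Once that was clear, the fix was to call ZwWaitForSingleObject() after ZwReadFile() specifically when it returns STATUS_PENDING, so that the read has fully completed before moving on to the next step.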
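To make the ordering concrete, here is a minimal user-mode sketch of the pattern. It is a model, not driver code: every name in it (model_read_file, model_complete_io, model_wait, read_header_fixed, the Model* types and the "HDR1" data) is hypothetical and exists only for illustration. In the real driver the corresponding calls would be ZwReadFile() and ZwWaitForSingleObject() on the event passed to the read, with the final status taken from the IO_STATUS_BLOCK after the wait.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-mode model; none of these names are real WDK APIs. */
typedef enum { MODEL_SUCCESS = 0, MODEL_PENDING = 1 } ModelStatus;

typedef struct { int signaled; } ModelEvent;   /* stands in for the read's event */

typedef struct {
    ModelStatus status;      /* stands in for IO_STATUS_BLOCK.Status */
    size_t information;      /* stands in for IO_STATUS_BLOCK.Information */
} ModelIosb;

/* One in-flight read: the "IRP" remembers where the caller told it to write. */
typedef struct {
    int active;
    void *buffer;
    size_t length;
    ModelEvent *event;
    ModelIosb *iosb;
} ModelPendingRead;

static ModelPendingRead g_pending;      /* this model allows one request at a time */
static const char g_disk[] = "HDR1";    /* pretend on-disk header, 5 bytes with NUL */

/* Starts a read that always completes later, like a read returning STATUS_PENDING. */
ModelStatus model_read_file(ModelEvent *event, ModelIosb *iosb,
                            void *buffer, size_t length)
{
    assert(!g_pending.active);
    event->signaled = 0;
    g_pending.active = 1;
    g_pending.buffer = buffer;
    g_pending.length = length;
    g_pending.event = event;
    g_pending.iosb = iosb;
    return MODEL_PENDING;
}

/* Completion step: stands in for the special kernel APC that finishes the IRP.
 * It writes through the pointers saved at submit time, whatever they point to now. */
void model_complete_io(void)
{
    size_t n;
    if (!g_pending.active)
        return;
    n = g_pending.length < sizeof(g_disk) ? g_pending.length : sizeof(g_disk);
    memcpy(g_pending.buffer, g_disk, n);
    g_pending.iosb->status = MODEL_SUCCESS;
    g_pending.iosb->information = n;
    g_pending.event->signaled = 1;
    g_pending.active = 0;
}

/* Stands in for the wait: returns only once the event has been signaled. */
void model_wait(ModelEvent *event)
{
    if (!event->signaled)
        model_complete_io();
    assert(event->signaled);
}

/* Fixed caller: the locals stay alive until the pending read has completed. */
ModelStatus read_header_fixed(char *header, size_t length, size_t *bytes_read)
{
    ModelEvent event = {0};
    ModelIosb iosb = {MODEL_PENDING, 0};
    ModelStatus status = model_read_file(&event, &iosb, header, length);

    if (status == MODEL_PENDING) {
        model_wait(&event);      /* do not return while the request is in flight */
        status = iosb.status;    /* the real result is only known after the wait */
    }
    *bytes_read = iosb.information;
    return status;
}
```

Calling model_read_file() and inspecting the buffer immediately shows it untouched; the data only arrives when model_complete_io() runs. In the driver that late write landed on a stack frame that had already been reused, which is why the wait has to happen before XXXXProcessDirent() returns.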

And with that, problem solved!

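That a single blue screen could be this convoluted reminds me of Liu Zhenyun's 《一句顶一万句》 ("One Sentence Worth Ten Thousand"). But which sentence is the one that is worth ten thousand?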

<Next I plan to write something about APCs. The operating system hides them so deeply that they always feel hard to pin down!>
