This article, collected by the 靠谱客 blogger 美满小兔子, introduces kernel panic debugging and is shared here for reference.

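Overview

Sometimes the system crashes and it is not obvious where to start. In that situation, locating the exact position of the panic is the most important step.

We need the objdump tool. A short introduction follows; see the appendix for the full option list.

objdump disassembles a binary file into assembly. On ARM systems the cross-toolchain version is arm-linux-objdump, and it is used the same way.

Example

I ran into the following panic, which points to an error in the GPU driver.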

[ 1628.353295] Unable to handle kernel NULL pointer dereference at virtual address 00000003
[ 1628.361376] pgd = c0004000
[ 1628.363966] msm_isp_process_error_info: Stream[0]: dropped 1 frames
[ 1628.370222] [00000003] *pgd=00000000[ 1628.373598] msm_isp_process_error_info: Stream[0]: dropped 1 frames
[ 1628.379911] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1628.385129] Modules linked in:
[ 1628.388170] CPU: 0 PID: 13178 Comm: kworker/u8:2 Not tainted 3.10.28 #63
[ 1628.394860] Workqueue: kgsl-3d0 kgsl_process_events
[ 1628.399714] task: e3cf1a40 ti: df6fc000 task.ti: df6fc000
[ 1628.405096] PC is at kgsl_process_events+0x30/0x214
[ 1628.409957] LR is at kgsl_process_events+0x13c/0x214
[ 1628.414905] pc : [<c042df84>]    lr : [<c042e090>]    psr: 200f0013
[ 1628.414905] sp : df6fded8  ip : 00000003  fp : c1057bb0
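How do we locate the code? The fault occurred while executing kgsl_process_events: the instruction at PC tried to access virtual address 00000003, which is an invalid, near-NULL pointer. Start by pulling the function out of the source.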
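The "PC is at" line carries three pieces of information: the symbol, the offset into it, and the total size of the function. As a minimal sketch (the helper name parse_location is made up for illustration and is not part of any kernel tooling), the line can be split apart like this:

```python
import re

# Matches oops lines such as "PC is at kgsl_process_events+0x30/0x214".
# The symbol may contain dots (compiler-generated suffixes), hence [\w.]+.
LOCATION = re.compile(r"(PC|LR) is at ([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")

def parse_location(line):
    """Return (register, symbol, offset, size) from an oops line, or None."""
    m = LOCATION.search(line)
    if m is None:
        return None
    reg, symbol, offset, size = m.groups()
    return reg, symbol, int(offset, 16), int(size, 16)

line = "[ 1628.405096] PC is at kgsl_process_events+0x30/0x214"
print(parse_location(line))
```

Here the offset is 0x30 (48 bytes into the function) and the function is 0x214 (532) bytes long.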

Function source:

void kgsl_process_events(struct work_struct *work)
{
    struct kgsl_event_group *group;
    struct kgsl_device *device = container_of(work, struct kgsl_device,
        event_work);

    read_lock(&group_lock);
    list_for_each_entry(group, &group_list, group)
        retire_events(device, group);
    read_unlock(&group_lock);
}

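Several pointers and macros appear here. Macros (and inline functions) are expanded directly into the function body, which makes the search harder.

We need to know which line of code "PC is at kgsl_process_events+0x30/0x214" refers to. It is offset 0x30 from the start of the function, out of a total function size of 0x214 bytes. The offset counts bytes of machine instructions, not source lines.

That is the location shown by pc : [<c042df84>].

The Android kernel builds the same way as Linux. After compilation, the obj/KERNEL_OBJ/ directory contains each driver's .o files, System.map, and vmlinux.

Function address table:

System.map is the address table of the kernel's symbols. This step is optional, because the function can also be located directly by name.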

c042ddf8 T kgsl_cancel_events
c042de88 T kgsl_cancel_event
c042df54 T kgsl_process_events
c042e168 T kgsl_add_event
c042e408 t _kgsl_event_worker

From this we know that kgsl_process_events starts at address c042df54. This is the starting point for our search.
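Adding the offset from the oops to this base address should reproduce the pc value in the register dump. The sketch below does that arithmetic using the System.map excerpt above; the helper name load_symbols is illustrative only.

```python
# Excerpt of System.map copied from the article.
SYSTEM_MAP = """\
c042ddf8 T kgsl_cancel_events
c042de88 T kgsl_cancel_event
c042df54 T kgsl_process_events
c042e168 T kgsl_add_event
c042e408 t _kgsl_event_worker
"""

def load_symbols(text):
    """Map symbol name -> address for 'addr type name' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _sym_type, name = parts
            symbols[name] = int(addr, 16)
    return symbols

symbols = load_symbols(SYSTEM_MAP)
base = symbols["kgsl_process_events"]

# Offset 0x30 comes from "PC is at kgsl_process_events+0x30/0x214".
fault_addr = base + 0x30
print(hex(fault_addr))  # 0xc042df84, the same as pc : [<c042df84>]

# The distance to the next symbol matches the 0x214 size in the oops.
print(hex(symbols["kgsl_add_event"] - base))  # 0x214
```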

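Converting to assembly

Write the disassembly to dump.txt. This takes a while, about 5 to 10 minutes, and the output is roughly 200 MB. See the appendix for the options.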

arm-linux-objdump -dS vmlinux >dump.txt
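If only one function is of interest, the --start-address and --stop-address options listed in the appendix can restrict the disassembly to that function, which should give a far smaller output file. The following sketch only builds the command line and does not run it; the function name objdump_command is hypothetical, and the address and size are the values found in System.map and the oops.

```python
def objdump_command(vmlinux, start, size, objdump="arm-linux-objdump"):
    """Build an objdump argument list limited to [start, start + size]."""
    return [
        objdump,
        "-dS",
        "--start-address=0x%x" % start,
        "--stop-address=0x%x" % (start + size),
        vmlinux,
    ]

# kgsl_process_events starts at c042df54 and is 0x214 bytes long.
cmd = objdump_command("vmlinux", 0xc042df54, 0x214)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run with stdout redirected to a file, assuming the cross-toolchain is on PATH.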

Analyzing the disassembly

Open dump.txt and search for c042df54 or kgsl_process_events.

void kgsl_process_events(struct work_struct *work)
{
    c042df60:    e24dd014     sub    sp, sp, #20
    struct kgsl_event_group *group;
    struct kgsl_device *device = container_of(work, struct kgsl_device,
        event_work);

    read_lock(&group_lock);
    c042df64:    e59f01ec     ldr    r0, [pc, #492]    ; c042e158 <kgsl_process_events+0x204>
 .... (irrelevant lines omitted)
    static inline int _kgsl_context_get(struct kgsl_context *context)

.... (irrelevant lines omitted)

    static inline int __atomic_add_unless(atomic_t *v, int a, int u)
    {
        int c, old;

    c = atomic_read(v);
    c042df84:    15943000     ldrne    r3, [r4]
    c042df88:    1a000001     bne    c042df94 <kgsl_process_events+0x40>
    c042df8c:    ea00000d     b    c042dfc8 <kgsl_process_events+0x74>
    while (c != u && (old = atomic_cmpxchg((v), c, c + a)) != c)
        c = old;

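We need to find address c042df84, which corresponds to the statement c = atomic_read(v).

We can quickly guess that atomic_read is a macro. Here is its definition:

#define atomic_read(v)    (*(volatile int *)&(v)->counter)

A pointer is dereferenced here, so the problem is very likely in this spot.

In addition, ldrne r3, [r4] is a conditional load: when the NE condition holds, it loads the word at the address held in r4 into r3. That matches c = (*(volatile int *)&(v)->counter). Since counter is the first member of atomic_t, r4 holds v itself, and the faulting address 00000003 suggests that v was the invalid pointer 0x3.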
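To double-check the reading of that instruction, the raw word 15943000 from the listing can be decoded by hand. The sketch below covers only the classic ARM (A32) load/store encoding with a 12-bit immediate offset; the function name decode_ldr_str_imm is made up for this example, and real work should rely on objdump itself.

```python
# Condition mnemonics for bits 31:28 (0b1111 is a separate encoding space).
COND = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al"]

def decode_ldr_str_imm(word):
    """Decode an A32 LDR/STR with immediate offset (bits 27:25 == 0b010)."""
    cond = (word >> 28) & 0xF
    if cond == 0xF or (word >> 25) & 0x7 != 0b010:
        raise ValueError("not an immediate-offset LDR/STR encoding")
    return {
        "cond": COND[cond],
        "pre_indexed": bool((word >> 24) & 1),   # P bit
        "add_offset": bool((word >> 23) & 1),    # U bit
        "byte": bool((word >> 22) & 1),          # B bit
        "writeback": bool((word >> 21) & 1),     # W bit
        "load": bool((word >> 20) & 1),          # L bit: 1 = LDR, 0 = STR
        "rn": (word >> 16) & 0xF,                # base register
        "rd": (word >> 12) & 0xF,                # destination register
        "imm12": word & 0xFFF,                   # offset
    }

fields = decode_ldr_str_imm(0x15943000)
print(fields)
```

The decoded fields are cond ne, load, rn 4, rd 3 and offset 0, that is ldrne r3, [r4]: a word load from the address in r4, which is consistent with a fault caused by a bad base pointer.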

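So this is where the problem lies. A simple workaround is to check that v is valid before calling atomic_read.

Some will ask: this is one of the most basic kernel functions, so how can it fail? Many of the lowest-level functions perform no checks at all, for the sake of efficiency and modularity. The caller must validate its arguments before calling them, and clearly the caller did not do its job here.

Fixing the root cause means tracing back to the place that calls it. That is outside the scope of today's discussion.

Appendix

objdump usage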

Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
  -S, --source             Intermix source code with disassembly
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRt] or
  --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
          =frames-interp,=str,=loc,=Ranges,=pubtypes,
          =gdb_index,=trace_info,=trace_abbrev,=trace_aranges]
                           Display DWARF info in the file
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information

 The following switches are optional:
  -b, --target=BFDNAME           Specify the target object format as BFDNAME
  -m, --architecture=MACHINE     Specify the target architecture as MACHINE
  -j, --section=NAME             Only display information for section NAME
  -M, --disassembler-options=OPT Pass text OPT on to the disassembler
  -EB --endian=big               Assume big endian format when disassembling
  -EL --endian=little            Assume little endian format when disassembling
      --file-start-context       Include context from start of file (with -S)
  -I, --include=DIR              Add DIR to search list for source files
  -l, --line-numbers             Include line numbers and filenames in output
  -F, --file-offsets             Include file offsets when displaying information
  -C, --demangle[=STYLE]         Decode mangled/processed symbol names
                                  The STYLE, if specified, can be `auto', `gnu',
                                  `lucid', `arm', `hp', `edg', `gnu-v3', `java'
                                  or `gnat'
  -w, --wide                     Format output for more than 80 columns
  -z, --disassemble-zeroes       Do not skip blocks of zeroes when disassembling
      --start-address=ADDR       Only process data whose address is >= ADDR
      --stop-address=ADDR        Only process data whose address is <= ADDR
      --prefix-addresses         Print complete address alongside disassembly
      --[no-]show-raw-insn       Display hex alongside symbolic disassembly
      --insn-width=WIDTH         Display WIDTH bytes on a single line for -d
      --adjust-vma=OFFSET        Add OFFSET to all displayed section addresses
      --special-syms             Include special symbols in symbol dumps
      --prefix=PREFIX            Add PREFIX to absolute paths for -S
      --prefix-strip=LEVEL       Strip initial directory names for -S
      --dwarf-depth=N        Do not display DIEs at depth N or greater
      --dwarf-start=N        Display DIEs starting with N, at the same depth
                             or deeper

