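Overview
When the system crashes and you have no idea where to start, pinpointing where the panic happened is critical.
The tool for this job is objdump. A short introduction follows; the full option list is in the appendix.
objdump disassembles a binary file into assembly. For ARM targets the cross toolchain ships arm-linux-objdump, which is used in exactly the same way.
Example
I ran into the following panic, an error in the GPU driver.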
[ 1628.353295] Unable to handle kernel NULL pointer dereference at virtual address 00000003
[ 1628.361376] pgd = c0004000
[ 1628.363966] msm_isp_process_error_info: Stream[0]: dropped 1 frames
[ 1628.370222] [00000003] *pgd=00000000[ 1628.373598] msm_isp_process_error_info: Stream[0]: dropped 1 frames
[ 1628.379911] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1628.385129] Modules linked in:
[ 1628.388170] CPU: 0 PID: 13178 Comm: kworker/u8:2 Not tainted 3.10.28 #63
[ 1628.394860] Workqueue: kgsl-3d0 kgsl_process_events
[ 1628.399714] task: e3cf1a40 ti: df6fc000 task.ti: df6fc000
[ 1628.405096] PC is at kgsl_process_events+0x30/0x214
[ 1628.409957] LR is at kgsl_process_events+0x13c/0x214
[ 1628.414905] pc : [<c042df84>] lr : [<c042e090>] psr: 200f0013
[ 1628.414905] sp : df6fded8 ip : 00000003 fp : c1057bb0
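How do we locate the code? The fault happened while kgsl_process_events was executing: the instruction at PC dereferenced a NULL (or near-NULL) pointer, here virtual address 00000003. Start by pulling the function out of the source.
Function source: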
void kgsl_process_events(struct work_struct *work)
{
    struct kgsl_event_group *group;
    struct kgsl_device *device = container_of(work, struct kgsl_device,
            event_work);

    read_lock(&group_lock);
    list_for_each_entry(group, &group_list, group)
        retire_events(device, group);
    read_unlock(&group_lock);
}
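Several pointers and macros appear here. Macros are expanded directly into the function body, so the faulting statement is harder to spot from the source alone.
We need to find out which line of code "PC is at kgsl_process_events+0x30/0x214" refers to. It means offset 0x30 into the function, out of a total function size of 0x214. The offset is counted in bytes of machine code, not in source lines.
That is the address shown in pc : [<c042df84>].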
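To make the symbol+offset/size notation concrete, here is a minimal Python sketch that pulls the fields out of a "PC is at" or "LR is at" line. The names PC_LINE and parse_location are my own, chosen for illustration; the regular expression only covers the line format shown in the log above.

```python
import re

# Matches e.g. "PC is at kgsl_process_events+0x30/0x214".
# Symbol names may contain dots (compiler-generated suffixes), hence [\w.]+.
PC_LINE = re.compile(r"(PC|LR) is at ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

def parse_location(line):
    """Return (register, symbol, offset, size) from an oops line, or None."""
    m = PC_LINE.search(line)
    if m is None:
        return None
    reg, symbol, offset, size = m.groups()
    return reg, symbol, int(offset, 16), int(size, 16)

line = "[ 1628.405096] PC is at kgsl_process_events+0x30/0x214"
reg, symbol, offset, size = parse_location(line)
print(reg, symbol, hex(offset), hex(size))
```

The offset (0x30) is what we will add to the function's start address; the size (0x214) tells us where the function ends.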
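The Android kernel builds like any other Linux kernel: after compilation, the obj/KERNEL_OBJ/ directory contains the .o file of each driver, plus System.map and vmlinux.
Function address table:
System.map is the table of symbol addresses. This step is optional, because you can also search the disassembly by function name directly.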
c042ddf8 T kgsl_cancel_events
c042de88 T kgsl_cancel_event
c042df54 T kgsl_process_events
c042e168 T kgsl_add_event
c042e408 t _kgsl_event_worker
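From this we know that kgsl_process_events starts at c042df54, which is where our search begins. Adding the offset 0x30 gives c042df84, the same value as pc in the log.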
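The arithmetic can be checked with a few lines of Python. This is only a sketch: SYSTEM_MAP holds the five lines quoted above rather than the real file, and load_symbols and resolve are hypothetical helper names.

```python
# The System.map lines quoted in this post; a real script would read the file.
SYSTEM_MAP = """\
c042ddf8 T kgsl_cancel_events
c042de88 T kgsl_cancel_event
c042df54 T kgsl_process_events
c042e168 T kgsl_add_event
c042e408 t _kgsl_event_worker
"""

def load_symbols(text):
    """Map symbol name -> address from 'address type name' lines."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _kind, name = parts
            symbols[name] = int(addr, 16)
    return symbols

def resolve(symbols, name, offset):
    """Turn symbol+offset into an absolute address."""
    return symbols[name] + offset

symbols = load_symbols(SYSTEM_MAP)
fault = resolve(symbols, "kgsl_process_events", 0x30)
end = resolve(symbols, "kgsl_process_events", 0x214)
print(f"fault address: {fault:08x}")
print(f"function end:  {end:08x}")
```

The function end computed from the size 0x214 lands on c042e168, which is the next symbol in the table (kgsl_add_event). That is a handy sanity check that the System.map and the log come from the same build.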
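Disassembling
Dump the disassembly to dump.txt. This takes a while, roughly 5 to 10 minutes, and the output is around 200 MB. See the appendix for the options.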
arm-linux-objdump -dS vmlinux >dump.txt
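If the full dump is too slow or too large, the --start-address and --stop-address options listed in the appendix can restrict the output to one function. The sketch below only builds the command line from the start address and size we already know; objdump_command is a hypothetical helper, and I have not measured how much time this saves on a real vmlinux.

```python
def objdump_command(vmlinux, start, size, tool="arm-linux-objdump"):
    """Build an objdump invocation limited to [start, start + size)."""
    stop = start + size
    return [
        tool,
        "-dS",
        f"--start-address={start:#x}",
        f"--stop-address={stop:#x}",
        vmlinux,
    ]

# kgsl_process_events: start c042df54 (System.map), size 0x214 (oops log).
cmd = objdump_command("vmlinux", 0xc042df54, 0x214)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run with stdout redirected to a file, in the same way as the manual command above.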
Analyzing the disassembly
Open dump.txt and search for c042df54 or kgsl_process_events.
void kgsl_process_events(struct work_struct *work)
{
c042df60: e24dd014 sub sp, sp, #20
struct kgsl_event_group *group;
struct kgsl_device *device = container_of(work, struct kgsl_device,
event_work);
read_lock(&group_lock);
c042df64: e59f01ec ldr r0, [pc, #492] ; c042e158 <kgsl_process_events+0x204>
....(irrelevant lines omitted)
static inline int _kgsl_context_get(struct kgsl_context *context)
....(irrelevant lines omitted)
static inline int __atomic_add_unless(atomic_t *v, int a, int u)
{
int c, old;
c = atomic_read(v);
c042df84: 15943000 ldrne r3, [r4]
c042df88: 1a000001 bne c042df94 <kgsl_process_events+0x40>
c042df8c: ea00000d b c042dfc8 <kgsl_process_events+0x74>
while (c != u && (old = atomic_cmpxchg((v), c, c + a)) != c)
c = old;
The address we are looking for is c042df84, which corresponds to the statement c = atomic_read(v).
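Searching a 200 MB text file by hand is tedious, so here is a rough Python sketch of the same lookup: walk the dump, remember the last non-instruction line, and stop at the wanted address. DUMP_EXCERPT is a few lines copied from the listing above, source_for_address is a name I made up, and the pattern assumes 8-digit (32-bit) addresses. Keep in mind that with optimized code the nearest preceding source line is a hint, not proof.

```python
import re

# A few lines copied from the objdump -dS listing in this post.
DUMP_EXCERPT = """\
static inline int __atomic_add_unless(atomic_t *v, int a, int u)
{
int c, old;
c = atomic_read(v);
c042df84: 15943000 ldrne r3, [r4]
c042df88: 1a000001 bne c042df94 <kgsl_process_events+0x40>
"""

# Instruction lines start with an 8-digit hex address followed by ':'.
INSN = re.compile(r"^\s*([0-9a-f]{8}):")

def source_for_address(lines, address):
    """Return (nearest preceding source line, instruction line) or None."""
    last_source = None
    for line in lines:
        m = INSN.match(line)
        if m is None:
            if line.strip():
                last_source = line.strip()
        elif int(m.group(1), 16) == address:
            return last_source, line.strip()
    return None

hit = source_for_address(DUMP_EXCERPT.splitlines(), 0xc042df84)
print(hit)
```

For the real file, pass the open file object instead of the excerpt so the dump is read line by line rather than loaded into memory.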
It is easy to guess that atomic_read is a macro, so let's look at its definition.
#define atomic_read(v) (*(volatile int *)&(v)->counter)
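A pointer is dereferenced here, so this is very likely where the problem lies.
The instruction agrees: ldrne r3, [r4] is a load that reads the word at the address held in r4 into r3 (the ne suffix means it only executes when the Z flag is clear), which is exactly c = (*(volatile int *)&(v)->counter);
So this is where the fault occurs. A simple workaround is to check that v is valid before calling atomic_read.
Some will ask: this is one of the most basic kernel functions, how can it go wrong? Many low-level functions deliberately skip validation for the sake of efficiency and modularity. The caller is responsible for checking its arguments before calling, and here the caller clearly did not.
A proper fix means tracing back to the call site that passed the bad pointer, which is beyond the scope of this post.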
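As a cross-check on the reading of ldrne r3, [r4], the raw instruction word 15943000 from the dump can be decoded by hand. The Python sketch below handles only the A32 LDR/STR immediate-offset encoding (bits 27..25 equal to 010) and is not a general disassembler; decode_ldr_str_imm is a hypothetical name.

```python
# ARM condition mnemonics indexed by the top four bits (0xF is not a condition).
CONDITIONS = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
              "hi", "ls", "ge", "lt", "gt", "le", "al"]

def decode_ldr_str_imm(word):
    """Decode an A32 LDR/STR (immediate offset) word, or return None."""
    cond = (word >> 28) & 0xF
    if cond == 0xF or (word >> 25) & 0x7 != 0b010:
        return None
    add = (word >> 23) & 1      # U bit: add (1) or subtract (0) the offset
    byte = (word >> 22) & 1     # B bit: byte (1) or word (0) access
    load = (word >> 20) & 1     # L bit: load (1) or store (0)
    imm12 = word & 0xFFF
    return {
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "cond": CONDITIONS[cond],
        "rt": (word >> 12) & 0xF,   # destination/source register
        "rn": (word >> 16) & 0xF,   # base register
        "offset": imm12 if add else -imm12,
    }

insn = decode_ldr_str_imm(0x15943000)
print(insn)
```

The decoded fields are a word load, condition ne, destination r3, base r4, offset 0. With a zero offset the faulting address is simply the value of r4, so the 00000003 reported by the oops would be the bad pointer itself. The register dump quoted above is cut off before r4, so this cannot be confirmed from the excerpt.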
Appendix
objdump usage
Usage: objdump <option(s)> <file(s)>
Display information from object <file(s)>.
At least one of the following switches must be given:
-a, --archive-headers Display archive header information
-f, --file-headers Display the contents of the overall file header
-p, --private-headers Display object format specific file header contents
-P, --private=OPT,OPT... Display object format specific contents
-h, --[section-]headers Display the contents of the section headers
-x, --all-headers Display the contents of all headers
-d, --disassemble Display assembler contents of executable sections
-D, --disassemble-all Display assembler contents of all sections
-S, --source Intermix source code with disassembly
-s, --full-contents Display the full contents of all sections requested
-g, --debugging Display debug information in object file
-e, --debugging-tags Display debug information using ctags style
-G, --stabs Display (in raw form) any STABS info in the file
-W[lLiaprmfFsoRt] or
--dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
=frames-interp,=str,=loc,=Ranges,=pubtypes,
=gdb_index,=trace_info,=trace_abbrev,=trace_aranges]
Display DWARF info in the file
-t, --syms Display the contents of the symbol table(s)
-T, --dynamic-syms Display the contents of the dynamic symbol table
-r, --reloc Display the relocation entries in the file
-R, --dynamic-reloc Display the dynamic relocation entries in the file
@<file> Read options from <file>
-v, --version Display this program's version number
-i, --info List object formats and architectures supported
-H, --help Display this information
The following switches are optional:
-b, --target=BFDNAME Specify the target object format as BFDNAME
-m, --architecture=MACHINE Specify the target architecture as MACHINE
-j, --section=NAME Only display information for section NAME
-M, --disassembler-options=OPT Pass text OPT on to the disassembler
-EB --endian=big Assume big endian format when disassembling
-EL --endian=little Assume little endian format when disassembling
--file-start-context Include context from start of file (with -S)
-I, --include=DIR Add DIR to search list for source files
-l, --line-numbers Include line numbers and filenames in output
-F, --file-offsets Include file offsets when displaying information
-C, --demangle[=STYLE] Decode mangled/processed symbol names
The STYLE, if specified, can be `auto', `gnu',
`lucid', `arm', `hp', `edg', `gnu-v3', `java'
or `gnat'
-w, --wide Format output for more than 80 columns
-z, --disassemble-zeroes Do not skip blocks of zeroes when disassembling
--start-address=ADDR Only process data whose address is >= ADDR
--stop-address=ADDR Only process data whose address is <= ADDR
--prefix-addresses Print complete address alongside disassembly
--[no-]show-raw-insn Display hex alongside symbolic disassembly
--insn-width=WIDTH Display WIDTH bytes on a single line for -d
--adjust-vma=OFFSET Add OFFSET to all displayed section addresses
--special-syms Include special symbols in symbol dumps
--prefix=PREFIX Add PREFIX to absolute paths for -S
--prefix-strip=LEVEL Strip initial directory names for -S
--dwarf-depth=N Do not display DIEs at depth N or greater
--dwarf-start=N Display DIEs starting with N, at the same depth
or deeper