Java Application Performance Analysis Tool: async-profiler



Overview

https://www.jianshu.com/p/9364028cca4e

The Good Stuff

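Profiling a project regularly, analyzing the results, and finding hot spots is worthwhile work, because a single piece of hot code can slow down the entire system. Once a hot spot is found, optimize it promptly, then measure again from several angles so that no performance bottleneck is overlooked and the running project stays as well tuned as possible. This is part of every developer's job.

Finding hot spots requires profile data from the running Java application, and collecting that data takes fairly low-level techniques. Fortunately, many open-source tools can do this demanding work, although each tool seems to collect slightly different data (not yet verified). This article introduces a powerful yet lightweight tool for collecting runtime profile data from Java applications. Before reading on, you can refer to the article "Using Flame Graphs for Java Application Performance Analysis" for a general picture of Java performance analysis. The tool it covers, lightweight-java-profiler, is similar to async-profiler, the subject of this article. I could not get the former to start on my own machine (macOS Sierra 10.12.4); the error was roughly a segmentation fault. It does start and collect data on Linux, and combined with the flame graph generator FlameGraph it can be used for Java performance analysis. You should also learn how to find hot spots in a flame graph, that is, how to read one. It is an important skill and the basis of application performance analysis. The rest of this article explains in detail how to use async-profiler to analyze the performance of a Java application.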

Environment Setup

First, download the source code from GitHub:

https://github.com/jvm-profiling-tools/async-profiler#async-profiler


git clone https://github.com/jvm-profiling-tools/async-profiler

Then change into the downloaded project and build it:


cd async-profiler
make

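When the build finishes, the project contains a new build folder, which is what we need. Notably, async-profiler is one of the few tools I have compiled without running into any problems, which says something about how easy it is to use. The following are required, though: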

JAVA_HOME
GCC

What async-profiler can do is summarized in the following description:

async-profiler can trace the following kinds of events:

CPU cycles
Hardware and Software performance counters like cache misses, branch misses, page faults, context switches etc.
Allocations in Java Heap
Contended lock attempts of Java monitors

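I mainly care about CPU profiling, so this article focuses on it; you can explore the other features on your own. For how async-profiler implements CPU profiling and why it does so, see the README on GitHub; I won't repeat it here. Now let's see how to use the tool to analyze a Java application's performance.

The async-profiler project contains a script named "profiler.sh". Running it prints the following usage information: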


Usage: ./profiler.sh [action] [options] <pid>
Actions:
  start             start profiling and return immediately
  stop              stop profiling
  status            print profiling status
  list              list profiling events supported by the target JVM
  collect           collect profile for the specified period of time
                    and then stop (default action)
Options:
  -e event          profiling event: cpu|alloc|lock|cache-misses etc.
  -d duration       run profiling for <duration> seconds
  -f filename       dump output to <filename>
  -i interval       sampling interval in nanoseconds
  -b bufsize        frame buffer size
  -t                profile different threads separately
  -o fmt[,fmt...]   output format: summary|traces|flat|collapsed

<pid> is a numeric process ID of the target JVM
      or 'jps' keyword to find running JVM automatically using jps tool

Example: ./profiler.sh -d 30 -f profile.fg -o collapsed 3456
         ./profiler.sh start -i 999000 jps
         ./profiler.sh stop -o summary,flat jps

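The most important actions and options are:

start: begins collecting profile data from the application; if no duration is set, collection continues until a stop command is issued
stop: used together with start to stop collecting profile data
status: reports the profiler's state, for example whether it is unavailable or how long it has been running
list: prints the types of profile data that can be collected
-d N: sets how long to collect profile data, in seconds
-e event: specifies the type of data to collect, for example cpu

For the other options, see the usage text and try them out to see their effect. Next, we use async-profiler to collect CPU profile data and the flame graph generator FlameGraph to produce a CPU flame graph, and then find the hot spots in it. FlameGraph can be used as soon as it is downloaded: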


git clone https://github.com/brendangregg/FlameGraph

First start a Java application. You can test with the following code:


import java.io.File;

class Target {
    private static volatile int value;

    private static void method1() {
        for (int i = 0; i < 1000000; ++i)
            ++value;
    }

    private static void method2() {
        for (int i = 0; i < 1000000; ++i)
            ++value;
    }

    private static void method3() throws Exception {
        for (int i = 0; i < 1000; ++i) {
            for (String s : new File("/tmp").list()) {
                value += s.hashCode();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        while (true) {
            method1();
            method2();
            method3();
        }
    }
}
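
As an aside, the profiler needs the pid of the target JVM. The sketch below shows one way a test program could print its own pid at startup. It assumes Java 9 or later, where `ProcessHandle` is available; the class name `PidPrinter` and the method `currentPid` are hypothetical and not part of the original example.

```java
// Hypothetical helper for illustration; requires Java 9+ (ProcessHandle).
class PidPrinter {
    // Returns the process ID of the JVM running this code.
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        // Print the pid so it can be passed to profiler.sh.
        System.out.println("pid = " + currentPid());
    }
}
```

Calling `currentPid()` at the top of a test program's `main` method and printing the result would make the pid visible in the program's own output.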

Once it is running, use the jps command to find the pid of the Java application, then start collecting CPU profile data with the following command:


./profiler.sh start $pid

After a while, say 30 seconds, stop the collection with the following command:


./profiler.sh stop $pid

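The tool then prints the profiling results.

The output shows clearly that method3 uses the most CPU time, 93.06%, followed by method2 and method1 with 2.93% and 2.77% respectively. method3 is obviously the bottleneck, the so-called hot spot, and is where optimization should start. The commands used above are fairly simple. A more powerful command sets the collection duration and dumps the collected data to a file, which FlameGraph can then turn into a flame graph for visual analysis. As before, start the program, find its pid with jps, and then collect data with the following command: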


./profiler.sh -d 10 -o collapsed -f /tmp/collapsed.txt $pid

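This command collects data for 10 seconds and dumps it in the collapsed format to /tmp/collapsed.txt. After 10 seconds the tool stops automatically and writes the CPU profile data to the given path in the given format. You can open /tmp/collapsed.txt to look at the contents, but they are hard to read in raw form, so FlameGraph is used to process them. The following command generates the flame graph: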


~/github/FlameGraph/flamegraph.pl --colors=java /tmp/collapsed.txt > flamegraph.svg

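Replace the path with the location of your own FlameGraph checkout; the one in the command above is mine. A flamegraph.svg file soon appears in the current directory. Open it in Chrome to see the flame graph (you can click to zoom in):

method3 is the widest frame, which means it uses the most CPU time. This is much easier to see than in the text output.

Next, let's look at how alloc data is produced and what it can tell us. Run the following code: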
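
To make the link between the collapsed file and frame width concrete, here is a rough sketch that reads collapsed-format lines and, for each frame, sums the samples of every stack containing it. It assumes each line has the form `frame1;frame2;...;frameN count`, which is the folded-stack input flamegraph.pl expects. The class `CollapsedStacks` and its methods are hypothetical helpers for illustration and are not part of async-profiler or FlameGraph.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of async-profiler or FlameGraph.
class CollapsedStacks {
    // Parses the trailing sample count of one collapsed line; returns 0 if malformed.
    static long sampleCount(String line) {
        int split = line.lastIndexOf(' ');
        if (split <= 0) return 0;
        try {
            return Long.parseLong(line.substring(split + 1).trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // For each frame, sums the sample counts of all stacks in which it appears.
    static Map<String, Long> inclusiveSamples(List<String> lines) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String line : lines) {
            long count = sampleCount(line);
            if (count <= 0) continue; // skip blank or malformed lines
            String stack = line.substring(0, line.lastIndexOf(' '));
            Set<String> seen = new HashSet<>(); // count a recursive frame once per stack
            for (String frame : stack.split(";")) {
                if (seen.add(frame)) totals.merge(frame, count, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a collapsed file, e.g. /tmp/collapsed.txt
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        long total = lines.stream().mapToLong(CollapsedStacks::sampleCount).sum();
        inclusiveSamples(lines).forEach((frame, n) ->
                System.out.printf("%6.2f%%  %s%n", 100.0 * n / total, frame));
    }
}
```

Running it with the collapsed file as the only argument prints each frame with its share of all samples, which corresponds roughly to that frame's width in the flame graph.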


import java.util.concurrent.ThreadLocalRandom;

public class AllocatingTarget implements Runnable {
    public static volatile Object sink;

    public static void main(String[] args) {
        new Thread(new AllocatingTarget(), "AllocThread-1").start();
        new Thread(new AllocatingTarget(), "AllocThread-2").start();
    }

    @Override
    public void run() {
        while (true) {
            allocate();
        }
    }

    private static void allocate() {
        if (ThreadLocalRandom.current().nextBoolean()) {
            sink = new int[128 * 1000];
        } else {
            sink = new Integer[128 * 1000];
        }
    }
}

Then get the application's pid with jps and run the following command: