背景

java相比于c++的一个比较大的特性就是垃圾回收,也就是我们常说的garbage collector(gc)。在使用c++的时候,很多人都有这样的困扰: 在 手动申请了一块内存后,必须记得释放这些内存,否则就会导致内存泄漏。而gc的引入从本质商解决了这个问题,我们只需要申请对象(内存)即可, 对象的释放(内存的释放)由jvm来控制,它会根据实际情况决定合适什么时候回收对象。gc带来巨大便利的同时,也引入了一些问题。因为对象的释放不是 及时的,而是集中进行的,所以当有大量的对象需要释放的时候,会导致一定的延迟,在某些情况下延迟甚至达到秒级别,这对于低延迟应用是不可接受的。 所以我们需要做针对的优化,这包括jvm堆的配置调整,gc算法或者gc参数的调整,或者单机流量的限制等,而gc日志则是我们优化的最直接依据,所以学会 看懂gc日志就显得尤为重要。下面通过一个小例子来介绍下如何解读gc日志,以及通过gc日志来优化程序配置。

例子

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import java.util.Arrays;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class GC {

    public static class LargeObject {
        private int[] data;

        private LargeObject() {
            data = new int[1024 * 1024 / 4];
            Arrays.fill(data, 10);
        }
    }

    public static Cache<Long, LargeObject> cache = CacheBuilder.newBuilder()
            .expireAfterWrite(500, TimeUnit.MILLISECONDS)
            .build();

    public static void main(String[] args) {
        Executor executor = Executors.newFixedThreadPool(4);
        final AtomicLong j = new AtomicLong();
        final long start = System.currentTimeMillis();
        for (int i = 0; i < 4; i++) {
            executor.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < Integer.MAX_VALUE; i++) {
                        LargeObject largeObject = new LargeObject();
                        long t1 = System.currentTimeMillis();
                        if (Math.random() < 0.2) {
                            cache.put(System.nanoTime(), largeObject);
                        }

                        long t2 = System.currentTimeMillis();
                        long t = t2 - t1;
                        if (t > 50) {
                            System.out.println(t + " 第" + (t2 - start) / 1000 + "秒"); // 1
                        } else if ( t > 10){
                            System.out.println(t + " 第" + (t2 - start) / 1000 + "秒"); // 2
                        }
                    }
                }
            });
        }
    }
}


上面的代码启动了四个线程,然后不停的创建对象LargeObject=1M。这些对象有百分之20的概率进入到guava的缓存中,缓存的有效期是500ms。当我们以下面的参数运行程序的时候,可以看到很多的输出:

-Xms4g
-Xmx4g
-XX:NewSize=512M
-XX:SurvivorRatio=2
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70

输出的内容多为,

29 第0秒
19 第0秒
16 第0秒
18 第0秒
20 第0秒
27 第0秒
24 第0秒
24 第0秒
23 第0秒
23 第0秒
23 第1秒
25 第1秒
23 第1秒
23 第1秒

也就是说很多时候t2、t1时刻相隔了20+ms,理论上这两个时刻应该相隔很近,不会有什么输出才对。很多同学可能会想到是gc的问题,那么如何验证呢?

分析、验证

为了分析是否gc导致的问题,我们需要一些线索,这时候需要在启动参数里面加上一些打印gc的参数。修改后的启动参数如下所示,

-Xms1g
-Xmx1g
-XX:NewSize=512M
-XX:SurvivorRatio=2
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-XX:PrintCMSStatistics=1
-XX:+PrintHeapAtGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70

这里介绍下几个参数

  1. -XX:+PrintGCDetails
    打印gc信息
  2. -XX:+PrintGCTimeStamps
    打印时间戳信息
  3. -XX:+PrintTenuringDistribution
    打印新生代中对象年代分布情况,比如经过2轮gc后仍然存活的对象的总量。
  4. -XX:PrintCMSStatistics=1
    打印CMS的统计信息
  5. -XX:+PrintHeapAtGC
    打印堆信息
  6. -XX:+PrintGCApplicationStoppedTime
    打印gc暂停时间
  7. -XX:+PrintGCApplicationConcurrentTime 打印程序运行时间

用上面的参数再次运行程序,可以看到如下的信息,由于输出较多这里只截取部分内容

0.338: Application time: 0.1426909 seconds
{Heap before GC invocations=0 (full 0):
 par new generation   total 511168K, used 340228K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,  99% used [0x00000006c0000000, 0x00000006d4c41388, 0x00000006d4cd0000)
  from space 170368K,   0% used [0x00000006d4cd0000, 0x00000006d4cd0000, 0x00000006df330000)
  to   space 170368K,   0% used [0x00000006df330000, 0x00000006df330000, 0x00000006e9990000)
 concurrent mark-sweep generation total 3512768K, used 0K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4284K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
0.338: [GC (Allocation Failure) 0.338: [ParNew
Desired survivor size 87228416 bytes, new threshold 15 (max 15)
- age   1:   65370168 bytes,   65370168 total
: 340228K->63888K(511168K), 0.0149605 secs] 340228K->63888K(4023936K), 0.0150078 secs] [Times: user=0.00 sys=0.13, real=0.02 secs] 
Heap after GC invocations=1 (full 0):
 par new generation   total 511168K, used 63888K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x00000006d4cd0000)
  from space 170368K,  37% used [0x00000006df330000, 0x00000006e3194178, 0x00000006e9990000)
  to   space 170368K,   0% used [0x00000006d4cd0000, 0x00000006d4cd0000, 0x00000006df330000)
 concurrent mark-sweep generation total 3512768K, used 0K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4284K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
}
0.353: Total time for which application threads were stopped: 0.0156427 seconds, Stopping threads took: 0.0004197 seconds



0.404: Application time: 0.0509319 seconds
{Heap before GC invocations=1 (full 0):
 par new generation   total 511168K, used 404315K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,  99% used [0x00000006c0000000, 0x00000006d4c72d00, 0x00000006d4cd0000)
  from space 170368K,  37% used [0x00000006df330000, 0x00000006e3194178, 0x00000006e9990000)
  to   space 170368K,   0% used [0x00000006d4cd0000, 0x00000006d4cd0000, 0x00000006df330000)
 concurrent mark-sweep generation total 3512768K, used 0K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4290K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
0.405: [GC (Allocation Failure) 0.405: [ParNew
Desired survivor size 87228416 bytes, new threshold 2 (max 15)
- age   1:   73410072 bytes,   73410072 total
- age   2:   62890336 bytes,  136300408 total
: 404315K->133492K(511168K), 0.0273827 secs] 404315K->133492K(4023936K), 0.0274192 secs] [Times: user=0.06 sys=0.13, real=0.03 secs] 
Heap after GC invocations=2 (full 0):
 par new generation   total 511168K, used 133492K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x00000006d4cd0000)
  from space 170368K,  78% used [0x00000006d4cd0000, 0x00000006dcf2d1f8, 0x00000006df330000)
  to   space 170368K,   0% used [0x00000006df330000, 0x00000006df330000, 0x00000006e9990000)
 concurrent mark-sweep generation total 3512768K, used 0K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4290K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
}
0.432: Total time for which application threads were stopped: 0.0278286 seconds, Stopping threads took: 0.0001922 seconds



0.486: Application time: 0.0534380 seconds
{Heap before GC invocations=2 (full 0):
 par new generation   total 511168K, used 473359K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,  99% used [0x00000006c0000000, 0x00000006d4be6a18, 0x00000006d4cd0000)
  from space 170368K,  78% used [0x00000006d4cd0000, 0x00000006dcf2d1f8, 0x00000006df330000)
  to   space 170368K,   0% used [0x00000006df330000, 0x00000006df330000, 0x00000006e9990000)
 concurrent mark-sweep generation total 3512768K, used 0K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4293K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
0.486: [GC (Allocation Failure) 0.486: [ParNew
Desired survivor size 87228416 bytes, new threshold 2 (max 15)
- age   1:   65019712 bytes,   65019712 total
- age   2:   71312504 bytes,  136332216 total
: 473359K->134027K(511168K), 0.0346654 secs] 473359K->195468K(4023936K), 0.0347067 secs] [Times: user=0.11 sys=0.14, real=0.04 secs] 
Heap after GC invocations=3 (full 0):
 par new generation   total 511168K, used 134027K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x00000006d4cd0000)
  from space 170368K,  78% used [0x00000006df330000, 0x00000006e7612d60, 0x00000006e9990000)
  to   space 170368K,   0% used [0x00000006d4cd0000, 0x00000006d4cd0000, 0x00000006df330000)
 concurrent mark-sweep generation total 3512768K, used 61440K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4293K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
}
0.521: Total time for which application threads were stopped: 0.0353197 seconds, Stopping threads took: 0.0003413 second


2.556: Application time: 0.0001004 seconds
2.556: [GC (CMS Initial Mark)  (Survivor:9chunks) Finished young gen initial mark scan work in 4th thread: 0.000 sec
Finished young gen initial mark scan work in 6th thread: 0.000 sec
Finished young gen initial mark scan work in 5th thread: 0.000 sec
Finished young gen initial mark scan work in 1th thread: 0.000 sec
Finished young gen initial mark scan work in 3th thread: 0.000 sec
Finished young gen initial mark scan work in 0th thread: 0.000 sec
Finished remaining root initial mark scan work in 6th thread: 0.000 sec
Finished young gen initial mark scan work in 2th thread: 0.000 sec
Finished remaining root initial mark scan work in 3th thread: 0.000 sec
Finished remaining root initial mark scan work in 1th thread: 0.000 sec
Finished remaining root initial mark scan work in 5th thread: 0.000 sec
Finished remaining root initial mark scan work in 4th thread: 0.000 sec
Finished young gen initial mark scan work in 7th thread: 0.000 sec
Finished remaining root initial mark scan work in 2th thread: 0.000 sec
Finished remaining root initial mark scan work in 0th thread: 0.000 sec
Finished remaining root initial mark scan work in 7th thread: 0.000 sec
[1 CMS-initial-mark: 1851656K(3512768K)] 1939801K(4023936K), 0.0003299 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2.556: Total time for which application threads were stopped: 0.0006027 seconds, Stopping threads took: 0.0002155 seconds
2.556: [CMS-concurrent-mark-start]
Finished cms space scanning in 1th thread: 0.018 sec
Finished cms space scanning in 0th thread: 0.018 sec
Finished work stealing in 0th thread: 0.000 sec
Finished work stealing in 1th thread: 0.000 sec
2.575: [CMS-concurrent-mark: 0.019/0.019 secs] (CMS-concurrent-mark yielded 0 times)
 [Times: user=0.06 sys=0.02, real=0.02 secs] 
2.575: [CMS-concurrent-preclean-start]
 (cardTable: 56 cards, re-scanned 56 cards, 1 iterations)
2.585: [CMS-concurrent-preclean: 0.010/0.010 secs] (CMS-concurrent-preclean yielded 0 times)
 [Times: user=0.05 sys=0.00, real=0.01 secs] 
2.586: [CMS-concurrent-abortable-preclean-start]
 (cardTable: 40 cards, re-scanned 40 cards, 1 iterations)
 (cardTable: 21 cards, re-scanned 99 cards, 1 iterations)
 (cardTable: 27 cards, re-scanned 205 cards, 1 iterations)
 (cardTable: 25 cards, re-scanned 25 cards, 1 iterations)
 (cardTable: 34 cards, re-scanned 212 cards, 1 iterations)
 (cardTable: 15 cards, re-scanned 15 cards, 1 iterations)
 (cardTable: 32 cards, re-scanned 199 cards, 1 iterations)
 (cardTable: 35 cards, re-scanned 35 cards, 1 iterations)
 (cardTable: 11 cards, re-scanned 180 cards, 1 iterations)
 (cardTable: 26 cards, re-scanned 26 cards, 1 iterations)
 (cardTable: 6 cards, re-scanned 165 cards, 1 iterations)
 (cardTable: 21 cards, re-scanned 21 cards, 1 iterations)
 (cardTable: 13 cards, re-scanned 182 cards, 1 iterations)
 (cardTable: 35 cards, re-scanned 35 cards, 1 iterations)
 (cardTable: 25 cards, re-scanned 194 cards, 1 iterations)
 (cardTable: 16 cards, re-scanned 16 cards, 1 iterations)
 (cardTable: 26 cards, re-scanned 171 cards, 1 iterations)
 (cardTable: 21 cards, re-scanned 21 cards, 1 iterations)
 (cardTable: 35 cards, re-scanned 167 cards, 1 iterations)
 (cardTable: 22 cards, re-scanned 22 cards, 1 iterations)
 (cardTable: 44 cards, re-scanned 202 cards, 1 iterations)
 (cardTable: 22 cards, re-scanned 22 cards, 1 iterations)
 [22 iterations, 12 waits, 2254 cards)] 
3.511: [CMS-concurrent-abortable-preclean: 0.229/0.925 secs] (CMS-concurrent-abortable-preclean yielded 0 times)
 [Times: user=3.86 sys=0.72, real=0.92 secs] 
 


3.511: Application time: 0.0002506 seconds
3.511: [GC (CMS Final Remark) [YG occupancy: 142976 K (511168 K)]3.511: [Rescan (parallel)  (Survivor:12chunks) Finished young gen rescan work in 1th thread: 0.000 sec
Finished young gen rescan work in 4th thread: 0.000 sec
Finished young gen rescan work in 3th thread: 0.000 sec
Finished young gen rescan work in 2th thread: 0.000 sec
Finished remaining root rescan work in 1th thread: 0.000 sec
Finished young gen rescan work in 7th thread: 0.000 sec
Finished young gen rescan work in 6th thread: 0.000 sec
Finished young gen rescan work in 0th thread: 0.000 sec
Finished young gen rescan work in 5th thread: 0.000 sec
Finished remaining root rescan work in 0th thread: 0.000 sec
Finished remaining root rescan work in 6th thread: 0.000 sec
Finished remaining root rescan work in 3th thread: 0.000 sec
Finished remaining root rescan work in 7th thread: 0.000 sec
Finished remaining root rescan work in 4th thread: 0.000 sec
Finished remaining root rescan work in 2th thread: 0.000 sec
Finished unhandled CLD scanning work in 0th thread: 0.000 sec
Finished remaining root rescan work in 5th thread: 0.000 sec
Finished dirty klass scanning work in 0th thread: 0.000 sec
Finished dirty card rescan work in 0th thread: 0.001 sec
Finished dirty card rescan work in 6th thread: 0.001 sec
Finished dirty card rescan work in 2th thread: 0.001 sec
Finished dirty card rescan work in 3th thread: 0.001 sec
Finished dirty card rescan work in 1th thread: 0.001 sec
Finished dirty card rescan work in 5th thread: 0.001 sec
Finished dirty card rescan work in 4th thread: 0.001 sec
Finished dirty card rescan work in 7th thread: 0.001 sec
Finished work stealing in 7th thread: 0.000 sec
Finished work stealing in 5th thread: 0.000 sec
Finished work stealing in 1th thread: 0.000 sec
Finished work stealing in 3th thread: 0.000 sec
Finished work stealing in 4th thread: 0.000 sec
Finished work stealing in 0th thread: 0.000 sec
Finished work stealing in 6th thread: 0.000 sec
Finished work stealing in 2th thread: 0.000 sec
, 0.0014281 secs]3.513: [weak refs processing, 0.0000165 secs]3.513: [class unloading, 0.0007096 secs]3.514: [scrub symbol table, 0.0010190 secs]3.515: [scrub string table, 0.0001269 secs][1 CMS-remark: 2581867K(3512768K)] 2724843K(4023936K), 0.0034235 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
3.515: Total time for which application threads were stopped: 0.0037337 seconds, Stopping threads took: 0.0002355 seconds
3.516: [CMS-concurrent-sweep-start]
3.523: [CMS-concurrent-sweep: 0.007/0.007 secs] (CMS-concurrent-sweep yielded 0 times)
 [Times: user=0.08 sys=0.00, real=0.01 secs] 
3.524: [CMS-concurrent-reset-start]
3.549: [CMS-concurrent-reset: 0.025/0.025 secs] (CMS-concurrent-reset yielded 0 times)
 [Times: user=0.06 sys=0.02, real=0.02 secs] 

为了更清晰的展示,上面四段日志使用换行符隔开,前三段代表young gc,后一段代表old gc。下面分别来解读下这些日志,同时看看能否根据这些日志做出一些优化。

新生代日志

首先看一下young gc的日志,这里以第三段作为例子。

应用时间

0.338: Application time: 0.1426909 seconds

上面的日志代表了应用在gc前运行了0.142s,然后开始gc。

gc前堆的使用情况

Heap before GC invocations=2 (full 0):
 par new generation   total 511168K, used 473359K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,  99% used [0x00000006c0000000, 0x00000006d4be6a18, 0x00000006d4cd0000)
  from space 170368K,  78% used [0x00000006d4cd0000, 0x00000006dcf2d1f8, 0x00000006df330000)
  to   space 170368K,   0% used [0x00000006df330000, 0x00000006df330000, 0x00000006e9990000)
 concurrent mark-sweep generation total 3512768, used 0K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4293K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K

日志打印了gc开始前新生代、老年代和metaspace的使用情况,我们可以看出如下信息,

  1. eden区总共有340800KB/1024=332M,使用率99%
  2. from区总共有170368KB/1024=166M,使用率78%
  3. to区总共有170368KB/1024=166M,使用率0%,
  4. 新生代总共用了473359K=462M

由于eden区已经满了,所以触发了young gc。

### gc情况

 0.486: [GC (Allocation Failure) 0.486: [ParNew
Desired survivor size 87228416 bytes, new threshold 2 (max 15)
- age   1:   65019712 bytes,   65019712 total
- age   2:   71312504 bytes,  136332216 total
: 473359K->134027K(511168K), 0.0346654secs] 473359K->195468K(4023936K), 0.0347067 secs] [Times: user=0.11 sys=0.14, real=0.04 secs] 

日志记录了这次gc的详细情况。

  1. gc后最新的new generation threshold是2,也就是下一轮gc的时候,age > 2的对象会进入到老年代。
  2. gc后,age为1的对象有65019712B/1024/1024=62M,age为1的对象有71312504B/1024/1024=68M,加起来总共130MB。
  3. 期望的survivor大小为83M,大小为原始survivor的一半。这个值是由TargetSurvivorRatio控制的
  4. 新生代使用的大小从473359K降到了134027K=130M,新生代和老年代使用的总和从473359K到195468K=190M,也就是说老年代新增了60M。新生代和老年代总大小为4023936K=3G。
  5. 这一次GC的真实时间real是0.04s,user是花费在用户态的时间,sys是花费在内核态的时间(可能包括系统调用等,写日志等)。通常情况下real都远小于user和sys,这是因为进程是多线程运行的,所以user和sys对应的时间是各个线程的user和sys时间的加和。理论上real约等于user/4,约等于sys/4。

gc后堆使用情况

Heap after GC invocations=3 (full 0):
 par new generation   total 511168K, used 134027K [0x00000006c0000000, 0x00000006e9990000, 0x00000006e9990000)
  eden space 340800K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x00000006d4cd0000)
  from space 170368K,  78% used [0x00000006df330000, 0x00000006e7612d60, 0x00000006e9990000)
  to   space 170368K,   0% used [0x00000006d4cd0000, 0x00000006d4cd0000, 0x00000006df330000)
 concurrent mark-sweep generation total 3512768K, used 61440K [0x00000006e9990000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4293K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
}

新生代或者的对象从eden,和一个survivor移动到了另一个survivor。gc前后from区的使用都很高,需要注意的是gc后from对应的是之前的to,只可以从其地址0x00000006df330000看出。老年代的使用大小为61440K=60M,这与上一小节(gc情况)的推测一致。

应用暂停时间

0.521: Total time for which application threads were stopped: 0.0353197 seconds, Stopping threads took: 0.0003413 second

这里可以看出应用暂定了大概35ms,与real时间大致相等

young gc的日志的含义已经清楚了,但是想从中分析出系统的问题还是不够的,需要全局的来看。把上面三段young gc日志综合起来看可以发现如下的规律:

  1. 三段日志开始的时间分别是
    0.338
    0.404
    0.486
    也就是说每隔80ms左右就会产生一次gc
  2. 三段日志结束的时间分别是 0.353
    0.432
    0.521 和上面的开始时间做减法后,可以看出每次gc暂停的时间大概为20ms

根据这些信息,我们大概能知道了运行程序的时候频繁的打印出日志的原因,因为日志里面的时间间隔(t2-t1)多在20ms之间,和gc暂停的时间大致吻合。频繁的young gc会大大的影响系统的吞吐量。

那么有哪些因素会影响young gc的频率及暂停时间呢?

回顾下之前设置的参数,新生代只有512M,同时SurvivorRatio=2,也就是说eden区只有大约300M,很容易就会被新生成的对象占满,eden区越小越容易产生young gc。

这里我们调整下新生代的大小,

-XX:NewSize=2048M

然后运行程序可以看到,触发young gc的间隔为了原来的两倍,也就是每160ms左右才会产生一次gc,但是gc暂停的时间也相应的翻倍了。

0.433: Application time: 0.2661536 seconds
{Heap before GC invocations=0 (full 0):
 par new generation   total 1572864K, used 1047893K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1048576K,  99% used [0x00000006c0000000, 0x00000006fff55518, 0x0000000700000000)
  from space 524288K,   0% used [0x0000000700000000, 0x0000000700000000, 0x0000000720000000)
  to   space 524288K,   0% used [0x0000000720000000, 0x0000000720000000, 0x0000000740000000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4291K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
0.434: [GC (Allocation Failure) 0.434: [ParNew
Desired survivor size 268435456 bytes, new threshold 15 (max 15)
- age   1:  199604016 bytes,  199604016 total
: 1047893K->194969K(1572864K), 0.0379891 secs] 1047893K->194969K(3670016K), 0.0380353 secs] [Times: user=0.06 sys=0.24, real=0.04 secs] 
Heap after GC invocations=1 (full 0):
 par new generation   total 1572864K, used 194969K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1048576K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x0000000700000000)
  from space 524288K,  37% used [0x0000000720000000, 0x000000072be667a0, 0x0000000740000000)
  to   space 524288K,   0% used [0x0000000700000000, 0x0000000700000000, 0x0000000720000000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4291K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
}
0.472: Total time for which application threads were stopped: 0.0386414 seconds, Stopping threads took: 0.0003695 seconds
0.624: Application time: 0.1526262 seconds
{Heap before GC invocations=1 (full 0):
 par new generation   total 1572864K, used 1243343K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1048576K,  99% used [0x00000006c0000000, 0x00000006fffcd678, 0x0000000700000000)
  from space 524288K,  37% used [0x0000000720000000, 0x000000072be667a0, 0x0000000740000000)
  to   space 524288K,   0% used [0x0000000700000000, 0x0000000700000000, 0x0000000720000000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4322K, capacity 4960K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
0.625: [GC (Allocation Failure) 0.625: [ParNew
Desired survivor size 268435456 bytes, new threshold 2 (max 15)
- age   1:  209742920 bytes,  209742920 total
- age   2:  197122600 bytes,  406865520 total
: 1243343K->397829K(1572864K), 0.0773231 secs] 1243343K->397829K(3670016K), 0.0773578 secs] [Times: user=0.19 sys=0.39, real=0.08 secs] 
Heap after GC invocations=2 (full 0):
 par new generation   total 1572864K, used 397829K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1048576K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x0000000700000000)
  from space 524288K,  75% used [0x0000000700000000, 0x0000000718481450, 0x0000000720000000)
  to   space 524288K,   0% used [0x0000000720000000, 0x0000000720000000, 0x0000000740000000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4322K, capacity 4960K, committed 5248K, reserved 1056768K
  class space    used 484K, capacity 528K, committed 640K, reserved 1048576K
}
0.702: Total time for which application threads were stopped: 0.0779363 seconds, Stopping threads took: 0.0003222 seconds

如果继续修改-XX:SurvivorRatio=2为-XX:SurvivorRatio=4, 可以看到gc的间隔时间再次变长,同时gc暂停的时间也再次增长。

0.545: Application time: 0.3504132 seconds
{Heap before GC invocations=0 (full 0):
 par new generation   total 1747648K, used 1398144K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K, 100% used [0x00000006c0000000, 0x0000000715560000, 0x0000000715560000)
  from space 349504K,   0% used [0x0000000715560000, 0x0000000715560000, 0x000000072aab0000)
  to   space 349504K,   0% used [0x000000072aab0000, 0x000000072aab0000, 0x0000000740000000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4302K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 485K, capacity 528K, committed 640K, reserved 1048576K
0.546: [GC (Allocation Failure) 0.546: [ParNew
Desired survivor size 178946048 bytes, new threshold 1 (max 15)
- age   1:  261483200 bytes,  261483200 total
: 1398144K->255389K(1747648K), 0.0488136 secs] 1398144K->255389K(3844800K), 0.0488638 secs] [Times: user=0.09 sys=0.20, real=0.05 secs] 
Heap after GC invocations=1 (full 0):
 par new generation   total 1747648K, used 255389K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x0000000715560000)
  from space 349504K,  73% used [0x000000072aab0000, 0x000000073a417750, 0x0000000740000000)
  to   space 349504K,   0% used [0x0000000715560000, 0x0000000715560000, 0x000000072aab0000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4302K, capacity 4896K, committed 5248K, reserved 1056768K
  class space    used 485K, capacity 528K, committed 640K, reserved 1048576K
}
0.595: Total time for which application threads were stopped: 0.0496173 seconds, Stopping threads took: 0.0005163 seconds
0.776: Application time: 0.1812291 seconds
{Heap before GC invocations=1 (full 0):
 par new generation   total 1747648K, used 1653014K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,  99% used [0x00000006c0000000, 0x00000007154de180, 0x0000000715560000)
  from space 349504K,  73% used [0x000000072aab0000, 0x000000073a417750, 0x0000000740000000)
  to   space 349504K,   0% used [0x0000000715560000, 0x0000000715560000, 0x000000072aab0000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4333K, capacity 4960K, committed 5248K, reserved 1056768K
  class space    used 485K, capacity 528K, committed 640K, reserved 1048576K
0.778: [GC (Allocation Failure) 0.778: [ParNew
Desired survivor size 178946048 bytes, new threshold 1 (max 15)
- age   1:  275812568 bytes,  275812568 total
: 1653014K->270161K(1747648K), 0.1033287 secs] 1653014K->522095K(3844800K), 0.1033612 secs] [Times: user=0.14 sys=0.73, real=0.10 secs] 
Heap after GC invocations=2 (full 0):
 par new generation   total 1747648K, used 270161K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x0000000715560000)
  from space 349504K,  77% used [0x0000000715560000, 0x0000000725d34430, 0x000000072aab0000)
  to   space 349504K,   0% used [0x000000072aab0000, 0x000000072aab0000, 0x0000000740000000)
 concurrent mark-sweep generation total 2097152K, used 251933K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4333K, capacity 4960K, committed 5248K, reserved 1056768K
  class space    used 485K, capacity 528K, committed 640K, reserved 1048576K
}
0.882: Total time for which application threads were stopped: 0.1057652 seconds, Stopping threads took: 0.0020995 seconds

新生代gc触发的频率和暂停时间与新生代大小,以及eden区和survivor区的比例有直接关系,eden区越大,触发gc的概率越小,而相应的暂停时间就越长。

在相同的配置的情况下,如果对象生存的周期越长,单次gc的时间也就越长。回到之前的代码,guava缓存的对象生命周期是500ms,如果我们将其改为200ms,会发现在young gc间隔不变的情况下,gc的暂停时间降低了。

0.537: Application time: 0.3536742 seconds
{Heap before GC invocations=0 (full 0):
 par new generation   total 1747648K, used 1397200K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,  99% used [0x00000006c0000000, 0x00000007154743a8, 0x0000000715560000)
  from space 349504K,   0% used [0x0000000715560000, 0x0000000715560000, 0x000000072aab0000)
  to   space 349504K,   0% used [0x000000072aab0000, 0x000000072aab0000, 0x0000000740000000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4380K, capacity 4960K, committed 5248K, reserved 1056768K
  class space    used 499K, capacity 528K, committed 640K, reserved 1048576K
0.537: [GC (Allocation Failure) 0.537: [ParNew
Desired survivor size 178946048 bytes, new threshold 1 (max 15)
- age   1:  195491160 bytes,  195491160 total
: 1397200K->190946K(1747648K), 0.0376362 secs] 1397200K->190946K(3844800K), 0.0376798 secs] [Times: user=0.03 sys=0.17, real=0.04 secs] 
Heap after GC invocations=1 (full 0):
 par new generation   total 1747648K, used 190946K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x0000000715560000)
  from space 349504K,  54% used [0x000000072aab0000, 0x0000000736528898, 0x0000000740000000)
  to   space 349504K,   0% used [0x0000000715560000, 0x0000000715560000, 0x000000072aab0000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4380K, capacity 4960K, committed 5248K, reserved 1056768K
  class space    used 499K, capacity 528K, committed 640K, reserved 1048576K
}
0.575: Total time for which application threads were stopped: 0.0381014 seconds, Stopping threads took: 0.0001808 seconds
0.764: Application time: 0.1888610 seconds
{Heap before GC invocations=1 (full 0):
 par new generation   total 1747648K, used 1588337K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,  99% used [0x00000006c0000000, 0x00000007154a3ee8, 0x0000000715560000)
  from space 349504K,  54% used [0x000000072aab0000, 0x0000000736528898, 0x0000000740000000)
  to   space 349504K,   0% used [0x0000000715560000, 0x0000000715560000, 0x000000072aab0000)
 concurrent mark-sweep generation total 2097152K, used 0K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4407K, capacity 5024K, committed 5248K, reserved 1056768K
  class space    used 499K, capacity 528K, committed 640K, reserved 1048576K
0.764: [GC (Allocation Failure) 0.764: [ParNew
Desired survivor size 178946048 bytes, new threshold 1 (max 15)
- age   1:  286295720 bytes,  286295720 total
: 1588337K->280407K(1747648K), 0.0568310 secs] 1588337K->281442K(3844800K), 0.0568624 secs] [Times: user=0.14 sys=0.31, real=0.06 secs] 
Heap after GC invocations=2 (full 0):
 par new generation   total 1747648K, used 280407K [0x00000006c0000000, 0x0000000740000000, 0x0000000740000000)
  eden space 1398144K,   0% used [0x00000006c0000000, 0x00000006c0000000, 0x0000000715560000)
  from space 349504K,  80% used [0x0000000715560000, 0x0000000726735dd0, 0x000000072aab0000)
  to   space 349504K,   0% used [0x000000072aab0000, 0x000000072aab0000, 0x0000000740000000)
 concurrent mark-sweep generation total 2097152K, used 1035K [0x0000000740000000, 0x00000007c0000000, 0x00000007c0000000)
 Metaspace       used 4407K, capacity 5024K, committed 5248K, reserved 1056768K
  class space    used 499K, capacity 528K, committed 640K, reserved 1048576K
}
0.821: Total time for which application threads were stopped: 0.0572199 seconds, Stopping threads took: 0.0001468 seconds

在配置相同,以及内存增长速度相同的情况下,对象的生命周期越长,单次gc所造成的暂停就越长。

当然除了上面提到的eden区大小,对象的生命周期会影响young gc的频率和时间外,还会有其他的因素影响

  1. 对象创建的频率以及速度
  2. survivor区的大小,survivor区太小可能会导致一部分对象需要被复制到老年带,增加耗时
  3. 老年代的大小,老年代过小会导致新生代提升到老年代的时候失败,触发old gc
  4. 新生代对象的大小,对象过大进入老年代的时候,找不到合适的空间导致cms触发old gc甚至compact。

等等

老年代日志

回到最上面提到的四段日志,其中第四段属于old gc的日志,我们可以看到CMS的整个过程可以划分为下面的几个阶段

  1. CMS Initial Mark
  2. CMS-concurrent-mark-start
  3. CMS-concurrent-mark
  4. CMS-concurrent-preclean-start
  5. CMS-concurrent-preclean
  6. CMS-concurrent-abortable-preclean-start
  7. CMS-concurrent-abortable-preclean
  8. CMS Final Remark
  9. CMS-concurrent-sweep-start
  10. CMS-concurrent-sweep
  11. CMS-concurrent-reset-start
  12. CMS-concurrent-reset

CMS Initial Mark

初始化标记阶段。

2.556: [GC (CMS Initial Mark)  (Survivor:9chunks) Finished young gen initial mark scan work in 4th thread: 0.000 sec
Finished young gen initial mark scan work in 6th thread: 0.000 sec
Finished young gen initial mark scan work in 5th thread: 0.000 sec
Finished young gen initial mark scan work in 1th thread: 0.000 sec
Finished young gen initial mark scan work in 3th thread: 0.000 sec
Finished young gen initial mark scan work in 0th thread: 0.000 sec
Finished remaining root initial mark scan work in 6th thread: 0.000 sec
Finished young gen initial mark scan work in 2th thread: 0.000 sec
Finished remaining root initial mark scan work in 3th thread: 0.000 sec
Finished remaining root initial mark scan work in 1th thread: 0.000 sec
Finished remaining root initial mark scan work in 5th thread: 0.000 sec
Finished remaining root initial mark scan work in 4th thread: 0.000 sec
Finished young gen initial mark scan work in 7th thread: 0.000 sec
Finished remaining root initial mark scan work in 2th thread: 0.000 sec
Finished remaining root initial mark scan work in 0th thread: 0.000 sec
Finished remaining root initial mark scan work in 7th thread: 0.000 sec
[1 CMS-initial-mark: 1851656K(3512768K)] 1939801K(4023936K), 0.0003299 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2.556: Total time for which application threads were stopped: 0.0006027 seconds, Stopping threads took: 0.0002155 seconds

该阶段是STOP WORLD的,这可以通过关键词Stopping threads took: xxx得知。有8个线程在进行initial mark操作,线程数受参数ParallelGCThreads控制,默认应该运行程序所在的服务器的核数一致。线程的initial mark操作包含两类,一个是young gen initial mark,另一个是remaining root initial mark。

CMS-concurrent-mark

并发标记阶段,非STOP WORLD

CMS Final Remark

最终标记阶段,这个是STOP WORLD的。

CMS-concurrent-sweep

并发清理