Java Garbage Collection in Depth (5)

The CMS Garbage Collector in Detail, Part 2: A Deeper Analysis of CMS

Posted by Jason Lee on 2020-08-10

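Recap

The previous installment covered the CMS GC workflow. This post discusses several questions it left open.

Drawbacks of CMS

Differences between CMS GC, Full GC, and Minor GC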

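  • CMS GC

    It is triggered by a background thread. By default the thread checks every 2 seconds whether old-generation occupancy has reached a threshold (the actual trigger conditions are more involved than this). If it has, a CMS GC cycle starts. The cycle only marks live objects and then sweeps dead ones, so it leaves fragmented free space behind.

  • Full GC

    It is executed by the VM thread, and the whole process is stop-the-world. During a full GC the collector decides whether to compact, that is, to move live objects to one end of the memory region, which effectively removes the fragmentation left by CMS GC.

  • Minor GC

    Reclaiming memory from the young generation (Eden and the Survivor spaces) is called Minor GC. The definition is clear and easy to understand, but a few points about Minor GC events are worth noting:

    A Minor GC is triggered when the JVM cannot allocate space for a new object, for example when Eden is full. The higher the allocation rate, the more often Minor GC runs.
    It marks and copies objects in Eden and the Survivor spaces, so it leaves no fragmentation.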

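How CMS GC is triggered

For CMS GC the trigger logic is straightforward. It lives in the ConcurrentMarkSweepThread class, the HotSpot counterpart of a Java Thread, and the thread is created when the heap is initialized. Its run method contains the following logic: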

    while (!_should_terminate) {
      sleepBeforeNextCycle();
      if (_should_terminate) break;
      GCCause::Cause cause = _collector->_full_gc_requested ?
        _collector->_full_gc_cause : GCCause::_cms_concurrent_mark;
      _collector->collect_in_background(false, cause);
    }

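sleepBeforeNextCycle() guarantees that the check runs at least once every 2 seconds (the default of -XX:CMSWaitDuration, which is given in milliseconds). It is implemented as follows: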

    void ConcurrentMarkSweepThread::sleepBeforeNextCycle() {
      while (!_should_terminate) {
        if (CMSIncrementalMode) {
          icms_wait();
          return;
        } else {
          // Wait until the next synchronous GC, a concurrent full gc
          // request or a timeout, whichever is earlier.
          wait_on_cms_lock(CMSWaitDuration);
        }
        // Check if we should start a CMS collection cycle
        if (_collector->shouldConcurrentCollect()) {
          return;
        }
        // .. collection criterion not yet met, let's go back
        // and wait some more
      }
    }

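The shouldConcurrentCollect() method decides whether this CMS GC cycle may start. It considers the following cases:

  1. If _full_gc_requested is true, a GC has been explicitly requested, for example by a System.gc() call that is routed to the concurrent collector (-XX:+ExplicitGCInvokesConcurrent);

  2. By default CMS uses runtime statistics gathered by the JVM to decide whether a CMS GC is needed. To base the decision on the value of CMSInitiatingOccupancyFraction instead, set -XX:+UseCMSInitiatingOccupancyOnly;

  3. If UseCMSInitiatingOccupancyOnly is enabled, a CMS GC is triggered when old-generation occupancy exceeds a threshold. The threshold can be set with -XX:CMSInitiatingOccupancyFraction; if it is not set, it defaults to 92%;

  4. If a previous young GC (ygc) failed, or the next young GC is predicted to fail, a CMS GC is triggered in either case;

  5. By default CMS does not collect the permanent generation. To collect it, set -XX:+CMSClassUnloadingEnabled. With that flag enabled, permanent-generation occupancy is also used to decide whether to trigger a CMS GC;

  6. …and a few other cases.

If any of the cases above applies, a CMS GC is needed. It is started by calling _collector->collect_in_background(false, cause). Note the in_background in the method name.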
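The cases in the list above can be read as a chain of checks evaluated in order. The following is a minimal sketch of that chain, not HotSpot's code: the struct `CmsTriggerInputs`, its fields, and the function `should_start_cms_cycle` are all hypothetical names introduced for illustration, and the "other cases" of item 6 are deliberately left out.

```cpp
// Hypothetical snapshot of the inputs the list refers to.
// None of these names exist in HotSpot; they are illustrative only.
struct CmsTriggerInputs {
  bool   full_gc_requested;            // case 1: an explicit request is pending
  bool   use_occupancy_only;           // -XX:+UseCMSInitiatingOccupancyOnly
  bool   stats_predict_start;          // case 2: runtime statistics say "start now"
  double old_gen_occupancy;            // fraction of the old generation in use, 0..1
  double initiating_occupancy;         // case 3 threshold, e.g. 0.92
  bool   young_gc_failed_or_may_fail;  // case 4
  bool   class_unloading_enabled;      // -XX:+CMSClassUnloadingEnabled
  bool   perm_gen_over_threshold;      // case 5
};

// Returns true if any of the modelled cases asks for a CMS cycle.
// With use_occupancy_only set, the statistics are ignored and only the
// occupancy threshold is consulted, as items 2 and 3 describe.
constexpr bool should_start_cms_cycle(const CmsTriggerInputs& in) {
  return in.full_gc_requested
      || (in.use_occupancy_only
              ? in.old_gen_occupancy > in.initiating_occupancy
              : in.stats_predict_start)
      || in.young_gc_failed_or_may_fail
      || (in.class_unloading_enabled && in.perm_gen_over_threshold);
}
```

Under this model, an old generation that is 95% full with a 92% threshold starts a cycle when UseCMSInitiatingOccupancyOnly is on, while a permanent generation over its threshold is ignored unless class unloading is enabled.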

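How Full GC is triggered

The main cause of a full GC is a failure to allocate memory for an object or a TLAB in Eden, which leads to a young GC. The satisfy_failed_allocation() method of the GenCollectorPolicy class contains the following logic: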

    if (!gch->incremental_collection_will_fail(false /* don't consult_young */)) {
      // Do an incremental collection.
      gch->do_collection(false            /* full */,
                         false            /* clear_all_soft_refs */,
                         size             /* size */,
                         is_tlab          /* is_tlab */,
                         number_of_generations() - 1 /* max_level */);
    } else {
      if (Verbose && PrintGCDetails) {
        gclog_or_tty->print(" :: Trying full because partial may fail :: ");
      }
      // Try a full collection; see delta for bug id 6266275
      // for the original code and why this has been simplified
      // with from-space allocation criteria modified and
      // such allocation moved out of the safepoint path.
      gch->do_collection(true             /* full */,
                         false            /* clear_all_soft_refs */,
                         size             /* size */,
                         is_tlab          /* is_tlab */,
                         number_of_generations() - 1 /* max_level */);
    }

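This method is executed by the VM thread, and the whole process is stop-the-world. If incremental_collection_will_fail returns true, the young GC is abandoned and a full GC is triggered directly. incremental_collection_will_fail is implemented as follows: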

    bool incremental_collection_will_fail(bool consult_young) {
      // Assumes a 2-generation system; the first disjunct remembers if an
      // incremental collection failed, even when we thought (second disjunct)
      // that it would not.
      assert(heap()->collector_policy()->is_two_generation_policy(),
             "the following definition may not be suitable for an n(>2)-generation system");
      return incremental_collection_failed() ||
             (consult_young && !get_gen(0)->collection_attempt_is_safe());
    }

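Here the consult_young argument is false, so only incremental_collection_failed() matters. If it returns true, the result is a very, very slow full GC. When a promotion failure occurs during a young GC, _incremental_collection_failed is set to true, so incremental_collection_failed() returns true at the next allocation failure, which amounts to triggering a full GC.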
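To make the branch direction concrete, here is a small sketch of the predicate and of the choice it drives. It assumes a simplified state: `YoungGcState`, `Collection`, and `choose_collection` are hypothetical names, and only the boolean expression itself is taken from the listing above.

```cpp
// Hypothetical stand-in for the heap state the predicate consults.
struct YoungGcState {
  bool incremental_collection_failed;  // remembered after a promotion failure
  bool collection_attempt_is_safe;     // the young generation's own forecast
};

// Same boolean expression as the quoted incremental_collection_will_fail().
constexpr bool incremental_collection_will_fail(const YoungGcState& s,
                                                bool consult_young) {
  return s.incremental_collection_failed ||
         (consult_young && !s.collection_attempt_is_safe);
}

enum class Collection { Young, Full };

// Models the branch taken on an allocation failure: the predicate is
// called with consult_young = false, so the forecast is ignored and only
// a remembered failure escalates to a full collection.
constexpr Collection choose_collection(const YoungGcState& s) {
  return incremental_collection_will_fail(s, /* consult_young */ false)
             ? Collection::Full
             : Collection::Young;
}
```

In this model a pessimistic forecast alone still yields a young collection, because the allocation-failure path passes consult_young = false. The forecast only matters to callers that pass true.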

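In another case, there is still not enough memory for the allocation even after a young GC, and a full GC is triggered as a follow-up. The implementation is as follows: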

    // If we reach this point, we're really out of memory. Try every trick
    // we can to reclaim memory. Force collection of soft references. Force
    // a complete compaction of the heap. Any additional methods for finding
    // free memory should be here, especially if they are expensive. If this
    // attempt fails, an OOM exception will be thrown.
    {
      IntFlagSetting flag_change(MarkSweepAlwaysCompactCount, 1); // Make sure the heap is fully compacted

      gch->do_collection(true             /* full */,
                         true             /* clear_all_soft_refs */,
                         size             /* size */,
                         is_tlab          /* is_tlab */,
                         number_of_generations() - 1 /* max_level */);
    }

Compaction in Full GC

Each time a full GC is triggered, the should_compact flag determines whether compaction is performed. The flag is computed as follows:

    *should_compact =
      UseCMSCompactAtFullCollection &&
      ((_full_gcs_since_conc_gc >= CMSFullGCsBeforeCompaction) ||
       GCCause::is_user_requested_gc(gch->gc_cause()) ||
       gch->incremental_collection_will_fail(true /* consult_young */));

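UseCMSCompactAtFullCollection is enabled by default, but whether compaction actually happens depends on the remaining conditions:

  1. The number of full GCs since the last CMS GC, _full_gcs_since_conc_gc (reset to 0 each time the sweeping phase of a CMS GC completes), has reached the threshold CMSFullGCsBeforeCompaction. That threshold defaults to 0, in which case this condition always holds.
  2. The user forced a GC, for example with System.gc().
  3. The previous young GC failed (a promotion failure occurred), or the next young GC is predicted not to succeed.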
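The three conditions combine exactly as in the quoted expression: the flag gates everything, and any one of the conditions is enough. A minimal sketch, assuming a plain struct of inputs (`CompactInputs` and its field names are hypothetical, chosen to echo the HotSpot identifiers they stand in for):

```cpp
// Hypothetical inputs for the compaction decision; illustrative names only.
struct CompactInputs {
  bool     use_cms_compact_at_full_collection;  // -XX:+UseCMSCompactAtFullCollection
  unsigned full_gcs_since_conc_gc;              // reset to 0 after each CMS sweep
  unsigned cms_full_gcs_before_compaction;      // -XX:CMSFullGCsBeforeCompaction
  bool     user_requested_gc;                   // e.g. System.gc()
  bool     incremental_collection_will_fail;    // evaluated with consult_young = true
};

// Mirrors the structure of the quoted should_compact expression.
constexpr bool should_compact(const CompactInputs& in) {
  return in.use_cms_compact_at_full_collection &&
         (in.full_gcs_since_conc_gc >= in.cms_full_gcs_before_compaction ||
          in.user_requested_gc ||
          in.incremental_collection_will_fail);
}
```

For example, with a threshold of 2 and no full GC since the last CMS cycle, a full GC does not compact unless it was user-requested or a young GC is expected to fail. After two such full GCs the counter reaches the threshold and the next one compacts.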

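If none of these conditions holds, would compaction never happen, leaving the fragmentation problem unrelieved? Fortunately there is one more chance to compact, implemented as follows: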

    if (clear_all_soft_refs && !*should_compact) {
      // We are about to do a last ditch collection attempt
      // so it would normally make sense to do a compaction
      // to reclaim as much space as possible.
      if (CMSCompactWhenClearAllSoftRefs) {
        // Default: The rationale is that in this case either
        // we are past the final marking phase, in which case
        // we'd have to start over, or so little has been done
        // that there's little point in saving that work. Compaction
        // appears to be the sensible choice in either case.
        *should_compact = true;
      } else {
        // We have been asked to clear all soft refs, but not to
        // compact. Make sure that we aren't past the final checkpoint
        // phase, for that is where we process soft refs. If we are already
        // past that phase, we'll need to redo the refs discovery phase and
        // if necessary clear soft refs that weren't previously
        // cleared. We do so by remembering the phase in which
        // we came in, and if we are past the refs processing
        // phase, we'll choose to just redo the mark-sweep
        // collection from scratch.
        if (_collectorState > FinalMarking) {
          // We are past the refs processing phase;
          // start over and do a fresh synchronous CMS cycle
          _collectorState = Resetting; // skip to reset to start new cycle
          reset(false /* == !asynch */);
          *should_start_over = true;
        } // else we can continue a possibly ongoing current cycle
      }
    }

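In an ordinary full GC the clear_all_soft_refs argument is false, so soft references are not cleared. If space is still insufficient after that full GC, a last-ditch full GC is performed that tries to clear all soft references and reclaims memory by every available means. In this case clear_all_soft_refs is true, and because CMSCompactWhenClearAllSoftRefs defaults to true, the collection also compacts. If execution really gets this far, it is time to review the application code carefully, because the pause for this GC is already very, very long.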
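The escalation described above (ordinary collection first, then a last-ditch collection that clears soft references and compacts, then an out-of-memory error) can be sketched as a small decision function. This is a toy model under stated assumptions: `HeapModel`, `Outcome`, and `escalate_allocation_failure` are hypothetical names, and the two "free after" figures are supplied by the caller instead of being measured from a real heap.

```cpp
#include <cstddef>

// Possible results of handling a failed allocation in this toy model.
enum class Outcome { SatisfiedByOrdinaryGc, SatisfiedByLastDitchGc, OutOfMemory };

// Hypothetical description of how much memory each attempt would free.
struct HeapModel {
  std::size_t free_after_ordinary_gc;    // clear_all_soft_refs = false
  std::size_t free_after_last_ditch_gc;  // soft refs cleared, heap compacted
};

// Tries the cheaper collection first and escalates only when the request
// still does not fit, mirroring the order of attempts described above.
constexpr Outcome escalate_allocation_failure(const HeapModel& heap,
                                              std::size_t size) {
  return size <= heap.free_after_ordinary_gc
             ? Outcome::SatisfiedByOrdinaryGc
             : size <= heap.free_after_last_ditch_gc
                   ? Outcome::SatisfiedByLastDitchGc
                   : Outcome::OutOfMemory;
}
```

With 64 bytes free after an ordinary collection and 256 after the last-ditch one, a 128-byte request is only satisfied by the last-ditch attempt, and a 512-byte request ends in an out-of-memory result.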

Depending on the value of should_compact, the full GC runs one of two algorithms, implemented as follows:

    if (should_compact) {
      // If the collection is being acquired from the background
      // collector, there may be references on the discovered
      // references lists that have NULL referents (being those
      // that were concurrently cleared by a mutator) or
      // that are no longer active (having been enqueued concurrently
      // by the mutator).
      // Scrub the list of those references because Mark-Sweep-Compact
      // code assumes referents are not NULL and that all discovered
      // Reference objects are active.
      ref_processor()->clean_up_discovered_references();

      if (first_state > Idling) {
        save_heap_summary();
      }

      do_compaction_work(clear_all_soft_refs);

      // Has the GC time limit been exceeded?
      DefNewGeneration* young_gen = _young_gen->as_DefNewGeneration();
      size_t max_eden_size = young_gen->max_capacity() -
                             young_gen->to()->capacity() -
                             young_gen->from()->capacity();
      GenCollectedHeap* gch = GenCollectedHeap::heap();
      GCCause::Cause gc_cause = gch->gc_cause();
      size_policy()->check_gc_overhead_limit(_young_gen->used(),
                                             young_gen->eden()->used(),
                                             _cmsGen->max_capacity(),
                                             max_eden_size,
                                             full,
                                             gc_cause,
                                             gch->collector_policy());
    } else {
      do_mark_sweep_work(clear_all_soft_refs, first_state, should_start_over);
    }

References and how the GC processes them will be discussed in the next installment.



