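Using the Eclipse MAT tool from the command line
Recently an application hit an Out Of Memory error during testing. Checking with jmap showed that the JVM heap was completely full.
There are many tools for inspecting the JVM heap: commercial ones such as JProfiler and YourKit, and free ones such as visualvm and jhat (both bundled with the Oracle JDK) and Eclipse MAT.
The application runs on an AWS instance with no graphical interface and relatively little memory, so starting visualvm or MAT through a VNC remote desktop was not an option. Analyzing the dumped snapshot (about 4.3 GB) with jhat was also very slow; it had still not finished after a long wait, so I gave up on that approach too.
In the end I analyzed the dumped snapshot with MAT's command-line tool and tracked down the culprit behind the OOM.
Using the MAT command-line tool
First, look at the heap with jstat or jmap. For example, the heap summary from jmap: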
JVM version is 25.31-b07
using thread-local object allocation.
Parallel GC with 4 thread(s)
Heap Configuration:
MinHeapFreeRatio = 0
MaxHeapFreeRatio = 100
MaxHeapSize = 4294967296 ( 4096.0 MB)
NewSize = 1431306240 ( 1365.0 MB)
MaxNewSize = 1431306240 ( 1365.0 MB)
OldSize = 2863661056 ( 2731.0 MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 21807104 ( 20.796875 MB)
CompressedClassSpaceSize = 1073741824 ( 1024.0 MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 0 ( 0.0 MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 482344960 ( 460.0 MB)
used = 468288384 ( 446.5946044921875 MB)
free = 14056576 ( 13.4053955078125 MB)
97.08578358525816 % used
From Space:
capacity = 278921216 ( 266.0 MB)
used = 0 ( 0.0 MB)
free = 278921216 ( 266.0 MB)
0.0 % used
To Space:
capacity = 477102080 ( 455.0 MB)
used = 0 ( 0.0 MB)
free = 477102080 ( 455.0 MB)
0.0 % used
PS Old Generation
capacity = 2863661056 ( 2731.0 MB)
used = 2863365080 ( 2730.7177352905273 MB)
free = 295976 ( 0.28226470947265625 MB)
99.98966441927965 % used
12340 interned Strings occupying 1051736 bytes.
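As a quick sanity check on how to read this output, here is a minimal sketch that recomputes the "% used" figures from the used and capacity values shown above. The class and method names (HeapUsage, percentUsed) are my own, made up for illustration; the numbers are copied from the jmap output.

```java
public class HeapUsage {
    /** Percentage of a space that is in use, like the "% used" line in the jmap output. */
    public static double percentUsed(long usedBytes, long capacityBytes) {
        return 100.0 * usedBytes / capacityBytes;
    }

    public static void main(String[] args) {
        // Figures copied from the jmap output above.
        System.out.printf("Eden:    %.2f%% used%n", percentUsed(468288384L, 482344960L));
        System.out.printf("Old gen: %.2f%% used%n", percentUsed(2863365080L, 2863661056L));
    }
}
```

The old generation is effectively full, with well under 1 MB free, which is why the JVM could no longer promote objects and ran out of memory.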
The classes with the most instances:
num #instances #bytes class name
----------------------------------------------
1: 21606534 1530253752 [C
2: 21606239 518549736 java.lang.String
3: 19198980 460775520 scala.collection.immutable.ListSet$Node
4: 4568546 109645104 scala.collection.immutable.HashSet$HashSetCollision1
5: 103739 63212992 [B
6: 1487034 53464560 [Lscala.collection.immutable.HashSet;
7: 1487034 35688816 scala.collection.immutable.HashSet$HashTrieSet
8: 1350368 32408832 scala.collection.immutable.$colon$colon
9: 1090897 26181528 scala.collection.immutable.HashSet$HashSet1
10: 200035 17603080 akka.actor.ActorCell
11: 100536 8042880 java.lang.reflect.Constructor
12: 500026 8000416 scala.runtime.ObjectRef
From this I guessed that the akka actor mailbox was holding too many string messages.
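To put a number on that guess, here is a small sketch that parses histogram rows like the ones above and adds up the bytes held by char arrays ([C) and java.lang.String objects. The class HistoShare and its method are hypothetical helpers written for this post; the rows and the heap figures are copied from the jmap output. The heap summary and the histogram may have been taken at slightly different moments, so treat the resulting share as a rough estimate.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HistoShare {
    /** Parses "num: #instances #bytes class name" rows and returns the bytes per class name. */
    public static Map<String, Long> bytesByClass(String histo) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String line : histo.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            // Skip the header and the separator: data rows start with a rank such as "1:".
            if (parts.length < 4 || !parts[0].endsWith(":")) {
                continue;
            }
            result.put(parts[3], Long.parseLong(parts[2]));
        }
        return result;
    }

    public static void main(String[] args) {
        String histo = String.join("\n",
                "num #instances #bytes class name",
                "1: 21606534 1530253752 [C",
                "2: 21606239 518549736 java.lang.String",
                "3: 19198980 460775520 scala.collection.immutable.ListSet$Node");
        Map<String, Long> bytes = bytesByClass(histo);
        long stringBytes = bytes.get("[C") + bytes.get("java.lang.String");
        // Used heap = old generation used + eden used, from the heap summary above.
        long usedHeap = 2863365080L + 468288384L;
        System.out.printf("String data: %d bytes, %.1f%% of used heap%n",
                stringBytes, 100.0 * stringBytes / usedHeap);
    }
}
```

Strings and their backing char arrays account for well over half of the used heap, which fits the suspicion that string messages are piling up somewhere.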
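Since visualvm and MAT could not be started graphically, I used ParseHeapDump.sh from the MAT folder, which is particularly well suited to analyzing large heaps.
First, adjust the Xmx value in MemoryAnalyzer.ini and make sure there is enough free disk space (at least twice the size of the dump file).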
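As a rough illustration of that Xmx change: in the MAT distributions I have seen, MemoryAnalyzer.ini ends with a -vmargs line followed by JVM options, one per line. The 4g value below is only an assumed example, not a recommendation; use whatever the machine can spare and leave the lines above -vmargs untouched.

```
-vmargs
-Xmx4g
```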
Then run
./ParseHeapDump.sh heap.bin org.eclipse.mat.api:suspects org.eclipse.mat.api:overview org.eclipse.mat.api:top_components
This produces three reports: suspects, overview, and top_components.
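On a headless machine the reports still have to be read somewhere. In the MAT versions I am aware of, each report is written as a zip of HTML pages next to the dump, with a name derived from the dump's file name (for heap.bin, something like heap_Leak_Suspects.zip); that naming is an assumption here, so check what actually appears in the directory. The sketch below, with a hypothetical class ReportLister, lists the HTML pages inside such a zip so you know what to copy off the server and open in a browser.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ReportLister {
    /** Returns the names of all .html entries in the given zip archive. */
    public static List<String> htmlPages(Path zip) throws IOException {
        List<String> pages = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".html")) {
                    pages.add(name);
                }
            }
        }
        return pages;
    }

    public static void main(String[] args) throws IOException {
        // The default file name is an assumption about MAT's report naming.
        Path zip = Paths.get(args.length > 0 ? args[0] : "heap_Leak_Suspects.zip");
        for (String page : htmlPages(zip)) {
            System.out.println(page);
        }
    }
}
```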

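The report shows that a single instance of akka.dispatch.Dispatcher$$anon$1 holds 2.4 GB of memory; this is the culprit. It is actually the java.util.concurrent.ConcurrentLinkedQueue in the akka dispatcher's mailbox, and each Node holds 81 MB: the message bodies are far too large.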
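Writing a program to get the information you need
You can also call MAT's classes directly to read information from the heap dump. MAT is built on the Eclipse RCP framework and an OSGi architecture, which makes it awkward to use as a library, so you can use a MAT library that someone else has extracted, such as https://bitbucket.org/joebowbeer/andromat,
and then write a command-line program. The example below prints the value of every string: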
package com.colobu.mat;

import org.eclipse.mat.SnapshotException;
import org.eclipse.mat.parser.internal.SnapshotFactory;
import org.eclipse.mat.parser.model.PrimitiveArrayImpl;
import org.eclipse.mat.snapshot.ISnapshot;
import org.eclipse.mat.snapshot.model.IClass;
import org.eclipse.mat.snapshot.model.IObject;
import org.eclipse.mat.util.ConsoleProgressListener;
import org.eclipse.mat.util.IProgressListener;

import java.io.File;
import java.io.IOException;
import java.util.Collection;
import java.util.HashMap;

public class Main {
    public static void main(String[] args) throws SnapshotException, IOException {
        if (args.length == 0) {
            System.err.println("Usage: Main <heap dump file>");
            return;
        }
        // The heap dump path is the last command-line argument.
        String fileName = args[args.length - 1];

        // Parse the dump (or reuse existing index files), reporting progress on stdout.
        IProgressListener listener = new ConsoleProgressListener(System.out);
        SnapshotFactory sf = new SnapshotFactory();
        ISnapshot snapshot = sf.openSnapshot(new File(fileName), new HashMap<String, String>(), listener);
        System.out.println(snapshot.getSnapshotInfo());
        System.out.println();

        String[] classNames = {"java.lang.String"};
        for (String name : classNames) {
            Collection<IClass> classes = snapshot.getClassesByName(name, false);
            if (classes == null || classes.isEmpty()) {
                System.out.println(String.format("Cannot find class %s in heap dump", name));
                continue;
            }
            assert classes.size() == 1;
            IClass clazz = classes.iterator().next();

            // Summary line: instance count and a lower bound on the retained size.
            int[] objIds = clazz.getObjectIds();
            long minRetainedSize = snapshot.getMinRetainedSize(objIds, listener);
            System.out.println(String.format("%s instances = %d, retained size >= %d",
                    clazz.getName(), objIds.length, minRetainedSize));

            // Print every instance. The cast assumes String.value is a char[],
            // as in the JDK 8 dump analyzed here.
            for (int i = 0; i < objIds.length; i++) {
                IObject str = snapshot.getObject(objIds[i]);
                String address = Long.toHexString(snapshot.mapIdToAddress(objIds[i]));
                PrimitiveArrayImpl chars = (PrimitiveArrayImpl) str.resolveValue("value");
                String value = new String((char[]) chars.getValueArray());
                System.out.println(String.format("id=%d, address=%s, value=%s", objIds[i], address, value));
            }
        }
    }
}
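With more than 21 million strings in this dump, printing every value is a lot to read. One possible variation, sketched here as a standalone helper with no MAT dependency, is to count how often each value occurs and print only the most frequent ones. TopStrings and topN are hypothetical names of my own; in the program above you would collect each value into a list (or feed a counter directly) instead of printing it. Keep in mind that the map holds every distinct value in memory, so this only works when the machine running the analysis can afford that.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TopStrings {
    /** Returns the n most frequent values with their counts, most frequent first. */
    public static List<Map.Entry<String, Long>> topN(Iterable<String> values, int n) {
        Map<String, Long> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1L, Long::sum);
        }
        return counts.entrySet().stream()
                // Sort by count descending; break ties by value so the order is stable.
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for strings read from a heap dump.
        List<String> sample = Arrays.asList("ping", "pong", "ping", "tick", "pong", "ping");
        for (Map.Entry<String, Long> e : topN(sample, 2)) {
            System.out.println(e.getValue() + " x " + e.getKey());
        }
    }
}
```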
ParseHeapDump.sh basically gave me everything I needed, and slimming down the content of the akka actor messages solved my problem.