type
status
date
slug
summary
tags
category
icon
password
标签
背景
最近发现线上某应用从8月初开始重启越来越频繁,折腾了近一周,过程有点曲折,这里记录一下大概过程。
线上情况
线上服务器平时运行良好,只是在某个瞬间服务器负载突然飙升,检测脚本检测应用请求无响应自动对应用进行重启操作。
重启瞬间服务器负载突然升高,得并不太高,流量也比平时高一倍左右,后台日志并无报错信息。
线上虚拟机配置:4核CPU,5G内存;
JDK版本:1.6
排查
DUMP日志分析
根据dump下来的文件分析95%的线程都在等待同一个资源。
"http-app-a-8080-145$893127005" daemon prio=10 tid=0x00002aab4c0f3400 nid=0x4b36 in Object.wait() [0x0000000044719000..0x000000004471ac10] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x00002aaacfd05698> (a java.util.ArrayList) at com.caucho.jca.ConnectionPool.create(ConnectionPool.java:913) - locked <0x00002aaacfd05698> (a java.util.ArrayList) at com.caucho.jca.ConnectionPool.allocatePool(ConnectionPool.java:795) at com.caucho.jca.ConnectionPool.allocate(ConnectionPool.java:756) at com.caucho.jca.ConnectionPool.allocateConnection(ConnectionPool.java:567) at com.caucho.sql.DataSourceImpl.getConnection(DataSourceImpl.java:65) at com.caucho.sql.DBPool.getConnection(DBPool.java:701) at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:113) at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79) at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:577) at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:641) at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:670) at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:678) at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:710) at org.springframework.jdbc.core.simple.SimpleJdbcTemplate.query(SimpleJdbcTemplate.java:187) Locked ownable synchronizers: - None
GC分析
从GC统计看,Eden区和Old代都是100%。
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 92 377528 186180 1639548 0 0 0 22 0 0 8 2 90 0 0 1 0 92 377536 186180 1639588 0 0 0 15 1714 794 26 2 72 0 0 1 0 92 377884 186180 1639612 0 0 0 87 1759 7236 44 5 51 0 0 S0 S1 E O P YGC YGCT FGC FGCT GCT 100.00 0.00 100.00 100.00 53.23 805 206.792 85 315.321 522.113 100.00 0.00 100.00 100.00 53.23 805 206.792 86 326.422 533.214 100.00 0.00 100.00 100.00 53.23 805 206.792 86 326.422 533.214
打开JVM的GC详情日志后发现,GC日志偶尔会出现并发模式失败(
concurrent mode failure
)的错误,在应用重启前Old代使用比例迅速升高,然后频繁进行Full GC操作,Full GC伴随着 concurrent mode failure
错误,Old代使用空间一直没法缩小,最终造成应用无响应而重启。165481.820: [CMS-concurrent-mark-start] 165481.847: [GC 165481.847: [ParNew: 419456K->52416K(471872K), 0.1388080 secs] 1678939K->1315679K(2044736K), 0.1389840 secs] [Times: user=0.30 sys=0.00, real=0.14 secs] 165482.711: [GC 165482.711: [ParNew: 471872K->52416K(471872K), 0.2897670 secs] 1735135K->1415603K(2044736K), 0.2901120 secs] [Times: user=0.68 sys=0.04, real=0.30 secs] 165484.160: [GC 165484.160: [ParNew: 471872K->52416K(471872K), 0.1511710 secs] 1835059K->1468617K(2044736K), 0.1514160 secs] [Times: user=0.35 sys=0.01, real=0.15 secs] 165485.025: [GC 165485.025: [ParNew: 471872K->52416K(471872K), 0.1989200 secs] 1888073K->1513178K(2044736K), 0.1991520 secs] [Times: user=0.48 sys=0.02, real=0.20 secs] 165485.499: [CMS-concurrent-mark: 2.867/3.678 secs] [Times: user=10.39 sys=0.60, real=3.68 secs] 165485.499: [CMS-concurrent-preclean-start] 165485.597: [CMS-concurrent-preclean: 0.095/0.098 secs] [Times: user=0.29 sys=0.05, real=0.10 secs] 165485.597: [CMS-concurrent-abortable-preclean-start] 165486.287: [GC 165486.287: [ParNew: 471872K->52416K(471872K), 0.4610860 secs] 1932634K->1567513K(2044736K), 0.4614320 secs] [Times: user=1.01 sys=0.05, real=0.46 secs] 165486.938: [CMS-concurrent-abortable-preclean: 0.580/1.341 secs] [Times: user=4.14 sys=0.28, real=1.34 secs] 165486.941: [GC[YG occupancy: 128659 K (471872 K)]165486.941: [Rescan (parallel) , 0.2358900 secs]165487.177: [weak refs processing, 0.0020640 secs] [1 CMS-remark: 1515097K(1572864K)] 1643757K(2044736K), 0.2381370 secs] [Times: user=0.47 sys=0.01, real=0.24 secs] 165487.181: [CMS-concurrent-sweep-start] 165488.042: [Full GC 165488.043: [CMS165488.518: [CMS-concurrent-sweep: 1.213/1.337 secs] [Times: user=3.37 sys=0.23, real=1.34 secs] (concurrent mode failure) (concurrent mode failure): 1514835K->1485873K(1572864K), 0.4853610 secs] 1986707K->1957745K(2044736K), [CMS Perm : 81695K->81695K(136412K)], 0.4856970 secs] [Times: user=0.37 sys=0.01, real=0.48 secs] 165488.532: [GC 165488.532: [ParNew: 471872K->44512K(471872K), 0.3130160 secs] 1957745K->1573798K(2044736K), 0.3133460 secs] [Times: user=0.75 sys=0.02, real=0.31 secs] 165489.568: [Full GC 165489.568: [CMS[Unloading class sun.reflect.GeneratedMethodAccessor1602] [Unloading class sun.reflect.GeneratedMethodAccessor1597] [Unloading class sun.reflect.GeneratedMethodAccessor1592] [Unloading class sun.reflect.GeneratedMethodAccessor1594] [Unloading class sun.reflect.GeneratedMethodAccessor1598] [Unloading class sun.reflect.GeneratedMethodAccessor1555] [Unloading class sun.reflect.GeneratedMethodAccessor1599] [Unloading class sun.reflect.GeneratedMethodAccessor1600] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor77] [Unloading class sun.reflect.GeneratedMethodAccessor1604] [Unloading class sun.reflect.GeneratedMethodAccessor1593] [Unloading class sun.reflect.GeneratedMethodAccessor1603] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor75] [Unloading class sun.reflect.GeneratedMethodAccessor1596] [Unloading class sun.reflect.GeneratedMethodAccessor1595] [Unloading class sun.reflect.GeneratedMethodAccessor1567] [Unloading class sun.reflect.GeneratedMethodAccessor395] [Unloading class sun.reflect.GeneratedMethodAccessor1601] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor26] : 1529286K->1392790K(1572864K), 4.5172760 secs] 1993254K->1392790K(2044736K), [CMS Perm : 81695K->81629K(136412K)], 4.5176710 secs] [Times: user=4.44 sys=0.02, real=4.52 secs] 165494.145: [GC [1 CMS-initial-mark: 1392790K(1572864K)] 1414063K(2044736K), 0.2010890 secs] [Times: user=0.20 sys=0.00, real=0.20 secs] 165494.347: [CMS-concurrent-mark-start] 165495.237: [GC 165495.237: [ParNew: 419456K->52416K(471872K), 0.2333590 secs] 1812246K->1460179K(2044736K), 0.2337080 secs] [Times: user=0.50 sys=0.01, real=0.23 secs] 165496.118: [GC 165496.118: [ParNew: 471872K->52416K(471872K), 0.1903580 secs] 1879635K->1515864K(2044736K), 0.1905860 secs] [Times: user=0.44 sys=0.03, real=0.19 secs] 165496.715: [GC 165496.715: [ParNew: 471872K->52416K(471872K), 0.2422860 secs] 1935320K->1563468K(2044736K), 0.2425610 secs] [Times: user=0.58 sys=0.02, real=0.24 secs] 165497.487: [Full GC 165497.487: [CMS165498.311: [CMS-concurrent-mark: 3.262/3.965 secs] [Times: user=11.02 sys=0.73, real=3.96 secs] (concurrent mode failure): 1511052K->1366128K(1572864K), 5.3419340 secs] 1982924K->1366128K(2044736K), [CMS Perm : 81638K->81638K(136412K)], 5.3422540 secs] [Times: user=5.28 sys=0.00, real=5.34 secs] 165503.637: [GC 165503.637: [ParNew: 419456K->52416K(471872K), 0.1635850 secs] 1785584K->1421481K(2044736K), 0.1638860 secs] [Times: user=0.39 sys=0.01, real=0.16 secs] 165503.804: [GC [1 CMS-initial-mark: 1369065K(1572864K)] 1424544K(2044736K), 0.1566690 secs] [Times: user=0.15 sys=0.00, real=0.15 secs] 165503.961: [CMS-concurrent-mark-start] 165505.133: [GC 165505.133: [ParNew: 471866K->52416K(471872K), 0.4184870 secs] 1840931K->1480980K(2044736K), 0.4188850 secs] [Times: user=0.91 sys=0.04, real=0.42 secs] 165506.175: [GC 165506.176: [ParNew: 471872K->52416K(471872K), 0.2192270 secs] 1900436K->1525506K(2044736K), 0.2194620 secs] [Times: user=0.49 sys=0.01, real=0.22 secs] 165507.094: [GC 165507.094: [ParNew: 471872K->52416K(471872K), 0.2391910 secs] 1944962K->1569157K(2044736K), 0.2394130 secs] [Times: user=0.49 sys=0.02, real=0.24 secs] 165508.178: [Full GC 165508.178: [CMS165509.030: [CMS-concurrent-mark: 4.169/5.068 secs] [Times: user=14.16 sys=1.11, real=5.07 secs] (concurrent mode failure): 1516741K->1484032K(1572864K), 8.9510320 secs] 1988613K->1484032K(2044736K), [CMS Perm : 81663K->81634K(136412K)], 8.9515060 secs] [Times: user=7.56 sys=0.15, real=8.95 secs] 165517.659: [GC 165517.659: [ParNew: 419456K->52416K(471872K), 0.1610900 secs] 1903488K->1541664K(2044736K), 0.1613260 secs] [Times: user=0.31 sys=0.00, real=0.16 secs] 165518.036: [GC [1 CMS-initial-mark: 1489248K(1572864K)] 1717780K(2044736K), 0.5143650 secs] [Times: user=0.51 sys=0.00, real=0.51 secs] 165518.552: [CMS-concurrent-mark-start] 165519.175: [GC 165519.175: [ParNew: 471872K->52416K(471872K), 0.5417870 secs] 1961120K->1606922K(2044736K), 0.5421660 secs] [Times: user=1.07 sys=0.04, real=0.54 secs] 165520.553: [Full GC 165520.553: [CMS165521.788: [CMS-concurrent-mark: 2.683/3.236 secs] [Times: user=7.05 sys=0.65, real=3.24 secs] (concurrent mode failure): 1554506K->1522781K(1572864K), 6.0295590 secs] 2026378K->1522781K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 6.0298480 secs] [Times: user=5.97 sys=0.01, real=6.03 secs] 165526.774: [GC 165526.774: [ParNew: 227496K->227496K(471872K), 0.0000320 secs] 1750278K->1750278K(2044736K), 0.0001730 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] GC locker: Trying a full collection because scavenge failed 165526.774: [Full GC 165526.774: [CMS: 1522781K->1532156K(1572864K), 5.7417660 secs] 1750278K->1532156K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 5.7420650 secs] [Times: user=5.67 sys=0.01, real=5.74 secs] 165532.560: [GC [1 CMS-initial-mark: 1532156K(1572864K)] 1559772K(2044736K), 0.3628570 secs] [Times: user=0.33 sys=0.02, real=0.37 secs] 165532.923: [CMS-concurrent-mark-start] 165533.668: [GC 165533.668: [ParNew: 419456K->419456K(471872K), 0.0000250 secs]165533.668: [CMS165535.512: [CMS-concurrent-mark: 2.581/2.589 secs] [Times: user=4.38 sys=0.27, real=2.59 secs] (concurrent mode failure): 1532156K->1544103K(1572864K), 7.4693280 secs] 1951612K->1544103K(2044736K), 7.4697110 secs] [Times: user=7.30 sys=0.04, real=7.47 secs] 165542.182: [GC 165542.183: [ParNew: 419456K->419456K(471872K), 0.0000450 secs]165542.183: [CMS: 1544103K->1546768K(1572864K), 4.2923010 secs] 1963559K->1546768K(2044736K), 4.2927530 secs] [Times: user=4.25 sys=0.00, real=4.29 secs] 165546.939: [GC [1 CMS-initial-mark: 1546768K(1572864K)] 1966224K(2044736K), 0.3814880 secs] [Times: user=0.37 sys=0.00, real=0.38 secs] 165547.321: [CMS-concurrent-mark-start] 165547.322: [GC 165547.323: [ParNew: 419456K->419456K(471872K), 0.0000240 secs]165547.323: [CMS165549.212: [CMS-concurrent-mark: 1.885/1.891 secs] [Times: user=1.88 sys=0.00, real=1.90 secs] (concurrent mode failure): 1546768K->1558536K(1572864K), 7.2698510 secs] 1966224K->1558536K(2044736K), 7.2701150 secs] [Times: user=7.20 sys=0.00, real=7.27 secs] 165555.162: [GC 165555.162: [ParNew: 419456K->419456K(471872K), 0.0000440 secs]165555.162: [CMS: 1558536K->1572419K(1572864K), 4.6586080 secs] 1977992K->1572419K(2044736K), 4.6590020 secs] [Times: user=4.59 sys=0.00, real=4.65 secs] 165559.919: [GC [1 CMS-initial-mark: 1572419K(1572864K)] 1693916K(2044736K), 0.3145610 secs] [Times: user=0.31 sys=0.00, real=0.32 secs] 165560.236: [CMS-concurrent-mark-start] 165560.628: [GC 165560.628: [ParNew: 419456K->419456K(471872K), 0.0000280 secs]165560.628: [CMS165562.196: [CMS-concurrent-mark: 1.954/1.961 secs] [Times: user=2.84 sys=0.19, real=1.96 secs] (concurrent mode failure): 1572419K->1572863K(1572864K), 8.8879030 secs] 1991875K->1607194K(2044736K), 8.8882080 secs] [Times: user=8.80 sys=0.00, real=8.89 secs] 165569.881: [Full GC 165569.881: [CMS: 1572863K->1572863K(1572864K), 4.8703640 secs] 1992319K->1634385K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 4.8706150 secs] [Times: user=4.82 sys=0.00, real=4.88 secs] 165574.775: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1656915K(2044736K), 0.1364280 secs] [Times: user=0.14 sys=0.00, real=0.14 secs] 165574.911: [CMS-concurrent-mark-start] 165575.648: [Full GC 165575.649: [CMS165576.932: [CMS-concurrent-mark: 2.013/2.020 secs] [Times: user=3.41 sys=0.21, real=2.01 secs] (concurrent mode failure): 1572863K->1572863K(1572864K), 5.9651530 secs] 2044735K->1682474K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 5.9654160 secs] [Times: user=5.91 sys=0.00, real=5.97 secs] 165582.423: [Full GC 165582.423: [CMS: 1572863K->1572864K(1572864K), 5.0236220 secs] 2044735K->1695427K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 5.0239250 secs] [Times: user=4.98 sys=0.00, real=5.02 secs] 165587.448: [GC [1 CMS-initial-mark: 1572864K(1572864K)] 1697093K(2044736K), 0.1697460 secs] [Times: user=0.17 sys=0.00, real=0.17 secs] 165587.618: [CMS-concurrent-mark-start] 165588.609: [Full GC 165588.609: [CMS165589.719: [CMS-concurrent-mark: 2.093/2.100 secs] [Times: user=4.07 sys=0.63, real=2.10 secs] (concurrent mode failure): 1572864K->1572864K(1572864K), 6.7291670 secs] 2044735K->1724617K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 6.7294860 secs] [Times: user=6.62 sys=0.03, real=6.73 secs] 165596.592: [Full GC 165596.592: [CMS: 1572864K->1572864K(1572864K), 4.4670560 secs] 2044735K->1735633K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 4.4673550 secs] [Times: user=4.43 sys=0.00, real=4.47 secs] 165601.346: [GC [1 CMS-initial-mark: 1572864K(1572864K)] 1992320K(2044736K), 0.5660880 secs] [Times: user=0.56 sys=0.00, real=0.57 secs] 165601.913: [CMS-concurrent-mark-start] 165602.192: [Full GC 165602.192: [CMS165603.600: [CMS-concurrent-mark: 1.681/1.687 secs] [Times: user=2.27 sys=0.20, real=1.68 secs] (concurrent mode failure): 1572864K->1572863K(1572864K), 7.8641150 secs] 2044734K->1739724K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 7.8643680 secs] [Times: user=7.78 sys=0.00, real=7.86 secs] 165610.597: [Full GC 165610.597: [CMS: 1572863K->1572864K(1572864K), 4.9612280 secs] 2044735K->1738303K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 4.9614800 secs] [Times: user=4.92 sys=0.00, real=4.96 secs] 165615.797: [GC [1 CMS-initial-mark: 1572864K(1572864K)] 1992913K(2044736K), 0.3312260 secs] [Times: user=0.32 sys=0.00, real=0.33 secs] 165616.129: [CMS-concurrent-mark-start] 165616.507: [Full GC 165616.507: [CMS165618.227: [CMS-concurrent-mark: 2.085/2.098 secs] [Times: user=2.89 sys=0.26, real=2.10 secs] (concurrent mode failure): 1572864K->1572863K(1572864K), 8.0011150 secs] 2044735K->1733747K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 8.0014230 secs] [Times: user=7.91 sys=0.00, real=8.00 secs] 165624.791: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1997838K(2044736K), 0.3768850 secs] [Times: user=0.38 sys=0.00, real=0.38 secs] 165625.168: [CMS-concurrent-mark-start] 165625.514: [Full GC 165625.514: [CMS165627.373: [CMS-concurrent-mark: 2.191/2.204 secs] [Times: user=2.95 sys=0.23, real=2.21 secs] (concurrent mode failure): 1572863K->1572863K(1572864K), 8.6134440 secs] 2044735K->1745403K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 8.6138280 secs] [Times: user=8.50 sys=0.02, real=8.61 secs] 165634.940: [Full GC 165634.940: [CMS: 1572863K->1572863K(1572864K), 5.7606810 secs] 2044735K->1763133K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 5.7610340 secs] [Times: user=5.70 sys=0.00, real=5.76 secs] 165640.853: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1992760K(2044736K), 0.5402960 secs] [Times: user=0.54 sys=0.00, real=0.54 secs] 165641.394: [CMS-concurrent-mark-start] 165641.711: [Full GC 165641.711: [CMS165643.146: [CMS-concurrent-mark: 1.745/1.752 secs] [Times: user=2.40 sys=0.24, real=1.75 secs] (concurrent mode failure): 1572863K->1572863K(1572864K), 7.6559090 secs] 2044735K->1806228K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 7.6561430 secs] [Times: user=7.58 sys=0.00, real=7.66 secs] 165649.836: [Full GC 165649.836: [CMS: 1572863K->1572863K(1572864K), 6.5024530 secs] 2044735K->1808957K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 6.5028710 secs] [Times: user=6.42 sys=0.00, real=6.50 secs] 165656.702: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1992334K(2044736K), 0.6510610 secs] [Times: user=0.65 sys=0.00, real=0.65 secs] 165657.354: [CMS-concurrent-mark-start] 165657.768: [Full GC 165657.768: [CMS165659.289: [CMS-concurrent-mark: 1.926/1.935 secs] [Times: user=2.79 sys=0.33, real=1.93 secs] (concurrent mode failure): 1572864K->1572863K(1572864K), 9.4778960 secs] 2044735K->1817502K(2044736K), [CMS Perm : 81634K->81634K(136412K)], 9.4782160 secs] [Times: user=9.15 sys=0.00, real=9.48 secs] 165668.339: [Full GC 165668.339: [CMS[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor82] : 1572864K->1572863K(1572864K), 9.3393990 secs] 2044736K->1848935K(2044736K), [CMS Perm : 81634K->81631K(136412K)], 9.3398270 secs] [Times: user=9.25 sys=0.00, real=9.34 secs] 165677.949: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1993871K(2044736K), 0.6954050 secs] [Times: user=0.69 sys=0.00, real=0.70 secs] 165678.644: [CMS-concurrent-mark-start] 165679.539: [Full GC 165679.540: [CMS165681.669: [CMS-concurrent-mark: 3.012/3.025 secs] [Times: user=4.80 sys=0.64, real=3.02 secs] (concurrent mode failure): 1572863K->1572863K(1572864K), 11.9368790 secs] 2044735K->1859006K(2044736K), [CMS Perm : 81631K->81631K(136412K)], 11.9372800 secs] [Times: user=11.61 sys=0.00, real=11.94 secs] 165692.759: [Full GC 165692.759: [CMS[Unloading class sun.reflect.GeneratedConstructorAccessor240] : 1572863K->1572863K(1572864K), 11.2508310 secs] 2044735K->1874250K(2044736K), [CMS Perm : 81631K->81628K(136412K)], 11.2511750 secs] [Times: user=9.91 sys=0.00, real=11.25 secs] 165704.100: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1992415K(2044736K), 0.3157090 secs] [Times: user=0.31 sys=0.00, real=0.32 secs] 165704.416: [CMS-concurrent-mark-start] 165704.825: [Full GC 165704.825: [CMS165706.129: [CMS-concurrent-mark: 1.703/1.713 secs] [Times: user=2.54 sys=0.34, real=1.71 secs] (concurrent mode failure): 1572863K->1572863K(1572864K), 6.3984280 secs] 2044732K->1866289K(2044736K), [CMS Perm : 81628K->81628K(136412K)], 6.3986960 secs] [Times: user=6.34 sys=0.01, real=6.40 secs] 165711.728: [Full GC 165711.728: [CMS: 1572863K->1572863K(1572864K), 5.4379380 secs] 2044735K->1880349K(2044736K), [CMS Perm : 81628K->81628K(136412K)], 5.4382730 secs] [Times: user=5.39 sys=0.00, real=5.44 secs] 165717.351: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1992539K(2044736K), 0.6038490 secs] [Times: user=0.60 sys=0.00, real=0.61 secs] 165717.955: [CMS-concurrent-mark-start] 165718.302: [Full GC 165718.302: [CMS165719.904: [CMS-concurrent-mark: 1.943/1.949 secs] [Times: user=2.69 sys=0.24, real=1.95 secs] (concurrent mode failure)[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor25] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor18] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor17] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor22] [Unloading class sun.reflect.GeneratedMethodAccessor1445] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor24] [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor10] [Unloading class sun.reflect.GeneratedMethodAccessor441] [Unloading class sun.reflect.GeneratedMethodAccessor854] : 1572863K->1572863K(1572864K), 7.1694470 secs] 2044735K->1862162K(2044736K), [CMS Perm : 81628K->81598K(136412K)], 7.1696770 secs] [Times: user=7.09 sys=0.00, real=7.17 secs] 165726.106: [Full GC 165726.106: [CMS: 1572863K->1572862K(1572864K), 4.8448050 secs] 2044735K->1877599K(2044736K), [CMS Perm : 81598K->81598K(136412K)], 4.8451090 secs] [Times: user=4.80 sys=0.00, real=4.84 secs] 165731.084: [GC [1 CMS-initial-mark: 1572862K(1572864K)] 1992928K(2044736K), 0.3369610 secs] [Times: user=0.34 sys=0.00, real=0.33 secs] 165731.421: [CMS-concurrent-mark-start] 165731.716: [Full GC 165731.716: [CMS165733.347: [CMS-concurrent-mark: 1.919/1.925 secs] [Times: user=2.56 sys=0.20, real=1.92 secs] (concurrent mode failure): 1572862K->1572252K(1572864K), 10.0779900 secs] 2044734K->1873718K(2044736K), [CMS Perm : 81598K->81598K(136412K)], 10.0782250 secs] [Times: user=9.64 sys=0.00, real=10.08 secs] 165742.925: [Full GC 165742.925: [CMS: 1572776K->1572408K(1572864K), 6.1126860 secs] 2044648K->1887999K(2044736K), [CMS Perm : 81598K->81598K(136412K)], 6.1130120 secs] [Times: user=6.05 sys=0.00, real=6.11 secs] 165749.202: [GC [1 CMS-initial-mark: 1572408K(1572864K)] 1991990K(2044736K), 0.7009320 secs] [Times: user=0.69 sys=0.00, real=0.71 secs] 165749.903: [CMS-concurrent-mark-start] 165750.271: [Full GC 165750.271: [CMS165751.834: [CMS-concurrent-mark: 1.920/1.931 secs] [Times: user=2.65 sys=0.30, real=1.93 secs] (concurrent mode failure): 1572647K->1572863K(1572864K), 8.3700350 secs] 2044519K->1900106K(2044736K), [CMS Perm : 81598K->81598K(136412K)], 8.3703090 secs] [Times: user=8.25 sys=0.00, real=8.38 secs] 165759.574: [Full GC 165759.574: [CMS[Unloading class sun.reflect.GeneratedSerializationConstructorAccessor19] : 1572863K->1572863K(1572864K), 5.2368120 secs] 2044735K->1906019K(2044736K), [CMS Perm : 81598K->81594K(136412K)], 5.2371710 secs] [Times: user=5.17 sys=0.01, real=5.24 secs] 165764.957: [GC [1 CMS-initial-mark: 1572863K(1572864K)] 1992752K(2044736K), 0.5464550 secs] [Times: user=0.54 sys=0.00, real=0.55 secs] 165765.504: [CMS-concurrent-mark-start] 165765.849: [Full GC 165765.849: [CMS165767.557: [CMS-concurrent-mark: 2.043/2.053 secs] [Times: user=2.59 sys=0.28, real=2.05 secs] (concurrent mode failure)
分析原因
根据JVM调优—GC 长时间停顿问题及解决方法分析,出现这种情况基本可以定位两种原因:
- 在 CMS 启动过程中,新生代提升速度过快,老年代收集速度赶不上新生代提升速度
- 在 CMS 启动过程中,老年代碎片化严重,无法容纳新生代提升上来的大对象
发生这种情况,应用线程将会全部停止(相当于网站这段时间无法响应用户请求),进行压缩式垃圾收集(回退到 Serial Old 算法)。
根据建议先进行JVM参数调整,然后再分析内存存放的数据状况。
JVM参数调整
增加三个参数:
<jvm-arg>-XX:+UseCMSInitiatingOccupancyOnly</jvm-arg> <jvm-arg>-XX:CMSInitiatingOccupancyFraction=60</jvm-arg> <jvm-arg>-XX:CMSFullGCsBeforeCompaction=1</jvm-arg>
调整前:
<jvm-arg>-Xms2048m</jvm-arg> <jvm-arg>-Xmx2048m</jvm-arg> <jvm-arg>-Xmn512m</jvm-arg> <jvm-arg>-Xss256k</jvm-arg> <jvm-arg>-XX:+UseConcMarkSweepGC</jvm-arg> <jvm-arg>-XX:+UseParNewGC</jvm-arg> <jvm-arg>-XX:+CMSParallelRemarkEnabled</jvm-arg> <jvm-arg>-XX:+UseCMSCompactAtFullCollection</jvm-arg> <jvm-arg>-XX:PermSize=128m</jvm-arg> <jvm-arg>-XX:MaxPermSize=256m</jvm-arg> <jvm-arg>-XX:+PrintGCTimeStamps</jvm-arg> <jvm-arg>-XX:+PrintGCDetails</jvm-arg>
调整后:
<jvm-arg>-Xms2048m</jvm-arg> <jvm-arg>-Xmx2048m</jvm-arg> <jvm-arg>-Xmn512m</jvm-arg> <jvm-arg>-Xss256k</jvm-arg> <jvm-arg>-XX:+UseConcMarkSweepGC</jvm-arg> <jvm-arg>-XX:+UseParNewGC</jvm-arg> <jvm-arg>-XX:+CMSParallelRemarkEnabled</jvm-arg> <jvm-arg>-XX:+UseCMSCompactAtFullCollection</jvm-arg> <jvm-arg>-XX:+UseCMSInitiatingOccupancyOnly</jvm-arg> <jvm-arg>-XX:CMSInitiatingOccupancyFraction=60</jvm-arg> <jvm-arg>-XX:CMSFullGCsBeforeCompaction=1</jvm-arg> <jvm-arg>-XX:PermSize=128m</jvm-arg> <jvm-arg>-XX:MaxPermSize=256m</jvm-arg> <jvm-arg>-XX:+PrintGCTimeStamps</jvm-arg> <jvm-arg>-XX:+PrintGCDetails</jvm-arg>
JVM参数调整后观察
调整后观察服务器比调整前稳定多了,重启情况减少,但偶尔还是会并发模式失败(
concurrent mode failure
)的错误并重启。分析JVM内存对象
直接上图:
从上图可以很容易发现红框内的对象占比非常大,分析代码发现此处产品筛选页面中的索引文件生成和调用设计有问题,每个分类都会生成单独的一份索引文件,几千个分类就得生成几千个分类,当爬虫过来频繁爬取未有页面缓存内容时,此处频繁加载索引文件到内存中,瞬间把JVM内存中Old代占满而无法释放。
优化解决
重新调整索引方案,分析整理各分类下不同属性之间的关系,将所有分类的索引文件整合成一个。
优化后效果
优化后观察系统稳定,JVM日志也没发现再有并发模式失败(
concurrent mode failure
)的错误。