-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
由1.4.2升级到1.4.6后,在启动内存配置参数不变的情况下,Full GC时间增加了大约十几倍。 #10787
Comments
这两个版本没有太大变化, 只有修复一些bug, 可以自行确认changelog。 可能是升级过后长轮询不均匀导致的。 可以看下config-memory, 每台节点的量和之前的对比。 |
在哪查?nacos grafna里? |
config-memory.log, naming-performance.log 如果有grafana,可以看GC的节点的用量监控指标和升级前的区别。 |
那就是请求数变多了,导致GC提升的,并不是版本导致的,这个对象新旧版本都是存在的,是tomcat http serverlt request的byte[]。 应该就是我猜测的,升级的时候导致了流量不均,负载高的(抗连接更多的)节点出现了频繁FullGC。可以考虑扩容或者升配了,否则集群有一个节点dump了的话,可能会导致所有节点都出现严重GC甚至影响稳定性。 |
在哪里看出请求增多的,单机QPS一致800左右,一共五台,实际上升级配置,增大堆内存,问题会更严重,堆内存增大,导致Full GC时停顿时间更长,长达3s - 4s |
而且我看gc root,都是各种nacos线程持有,类似timer之类的,需要内存dump文件我可以提供 |
request的byte,对应的就是请求数或长轮询数,如果占用内存的都是byte[],且来自于tomcat,因此只能说明是流量不均导致的。或者是业务整体增长,导致更多的byte[]被生成。 |
我们准备回退到1.4.2了,看看效果 |
回退报错,只能扩增节点,增加了三台,还是老样子,byte[]这些对象都是被nacos相关的线程持有不释放,1.4.2版本就没有这些问题。 |
你可以看下1.4.2和1.4.6版本的change log,看下是哪些改动可能导致这个问题,欢迎提交PR修复。 |
好的 |
…e creation of a large number of sessions. Signed-off-by: wangyongtao <wangyongtao@yiche.com>
经过和同事压测排查,发现此问题是由1.4.3版本引入的bug: |
@KomachiSion 麻烦review下,没问题发个版,我们好升级修复 |
#10937) * [ISSUE #10787] Fixed the memory performance issue caused by the creation of a large number of sessions. Signed-off-by: wangyongtao <wangyongtao@yiche.com> * Remove unused import. Signed-off-by: wangyongtao <wangyongtao@yiche.com> --------- Signed-off-by: wangyongtao <wangyongtao@yiche.com>
升级前:
2023-05-13T11:11:03.871+0800: 23199291.478: [Full GC (Ergonomics) [PSYoungGen: 19168K->0K(4174848K)] [ParOldGen: 6290293K->1005684K(6291456K)] 6309461K->1005684K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1700654 secs] [Times: user=0.62 sys=0.12, real=0.17 secs]
2023-05-13T13:51:43.024+0800: 23208930.631: [Full GC (Ergonomics) [PSYoungGen: 18912K->0K(4174336K)] [ParOldGen: 6290870K->1005100K(6291456K)] 6309782K->1005100K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1707284 secs] [Times: user=0.61 sys=0.12, real=0.17 secs]
2023-05-13T16:32:24.626+0800: 23218572.233: [Full GC (Ergonomics) [PSYoungGen: 18720K->0K(4174336K)] [ParOldGen: 6289967K->1012340K(6291456K)] 6308687K->1012340K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1704050 secs] [Times: user=0.74 sys=0.00, real=0.17 secs]
2023-05-13T19:12:52.195+0800: 23228199.802: [Full GC (Ergonomics) [PSYoungGen: 19072K->0K(4174848K)] [ParOldGen: 6290958K->1008397K(6291456K)] 6310030K->1008397K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1670840 secs] [Times: user=0.73 sys=0.00, real=0.17 secs]
2023-05-13T21:53:26.760+0800: 23237834.367: [Full GC (Ergonomics) [PSYoungGen: 18912K->0K(4174848K)] [ParOldGen: 6290160K->1013505K(6291456K)] 6309072K->1013505K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1705765 secs] [Times: user=0.75 sys=0.00, real=0.17 secs]
2023-05-14T00:33:36.515+0800: 23247444.122: [Full GC (Ergonomics) [PSYoungGen: 19136K->0K(4174848K)] [ParOldGen: 6290931K->1016372K(6291456K)] 6310067K->1016372K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1711714 secs] [Times: user=0.74 sys=0.00, real=0.17 secs]
2023-05-14T03:14:16.425+0800: 23257084.032: [Full GC (Ergonomics) [PSYoungGen: 18848K->0K(4174848K)] [ParOldGen: 6290592K->1021930K(6291456K)] 6309440K->1021930K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1642861 secs] [Times: user=0.61 sys=0.13, real=0.16 secs]
2023-05-14T05:54:22.967+0800: 23266690.574: [Full GC (Ergonomics) [PSYoungGen: 19072K->0K(4174336K)] [ParOldGen: 6289543K->1024258K(6291456K)] 6308615K->1024258K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1678808 secs] [Times: user=0.60 sys=0.12, real=0.17 secs]
2023-05-14T08:34:14.287+0800: 23276281.894: [Full GC (Ergonomics) [PSYoungGen: 18816K->0K(4174848K)] [ParOldGen: 6290728K->1035862K(6291456K)] 6309544K->1035862K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1723334 secs] [Times: user=0.62 sys=0.13, real=0.17 secs]
2023-05-14T11:13:54.941+0800: 23285862.548: [Full GC (Ergonomics) [PSYoungGen: 18880K->0K(4174336K)] [ParOldGen: 6291083K->1036786K(6291456K)] 6309963K->1036786K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1718653 secs] [Times: user=0.74 sys=0.00, real=0.17 secs]
2023-05-14T13:53:36.794+0800: 23295444.401: [Full GC (Ergonomics) [PSYoungGen: 18816K->0K(4174336K)] [ParOldGen: 6289983K->1037363K(6291456K)] 6308799K->1037363K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1684250 secs] [Times: user=0.73 sys=0.00, real=0.17 secs]
2023-05-14T16:31:49.117+0800: 23304936.723: [Full GC (Ergonomics) [PSYoungGen: 18880K->0K(4174336K)] [ParOldGen: 6289020K->1073979K(6291456K)] 6307900K->1073979K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1959986 secs] [Times: user=0.82 sys=0.00, real=0.20 secs]
2023-05-14T19:10:24.719+0800: 23314452.325: [Full GC (Ergonomics) [PSYoungGen: 19008K->0K(4174336K)] [ParOldGen: 6289498K->1087454K(6291456K)] 6308506K->1087454K(10465792K), [Metaspace: 87461K->87461K(1130496K)], 0.1704064 secs] [Times: user=0.62 sys=0.13, real=0.17 secs]
2023-05-14T21:48:49.638+0800: 23323957.245: [Full GC (Ergonomics) [PSYoungGen: 18720K->0K(4174848K)] [ParOldGen: 6290375K->1091221K(6291456K)] 6309095K->1091221K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1704408 secs] [Times: user=0.63 sys=0.13, real=0.17 secs]
2023-05-15T00:26:59.459+0800: 23333447.066: [Full GC (Ergonomics) [PSYoungGen: 18752K->0K(4174848K)] [ParOldGen: 6289737K->1087750K(6291456K)] 6308489K->1087750K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1644770 secs] [Times: user=0.61 sys=0.12, real=0.17 secs]
2023-05-15T03:04:49.616+0800: 23342917.223: [Full GC (Ergonomics) [PSYoungGen: 18496K->0K(4174848K)] [ParOldGen: 6290349K->1009804K(6291456K)] 6308845K->1009804K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.2883000 secs] [Times: user=1.61 sys=0.00, real=0.29 secs]
2023-05-15T05:45:19.259+0800: 23352546.866: [Full GC (Ergonomics) [PSYoungGen: 18592K->0K(4174848K)] [ParOldGen: 6289310K->1011289K(6291456K)] 6307902K->1011289K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1642978 secs] [Times: user=0.58 sys=0.12, real=0.17 secs]
2023-05-15T08:25:48.615+0800: 23362176.222: [Full GC (Ergonomics) [PSYoungGen: 19168K->0K(4174848K)] [ParOldGen: 6289734K->1015392K(6291456K)] 6308902K->1015392K(10466304K), [Metaspace: 87461K->87461K(1130496K)], 0.1688850 secs] [Times: user=0.72 sys=0.00, real=0.17 secs]
升级后:
2023-07-14T00:36:43.955+0800: 1560858.593: [Full GC (Ergonomics) [PSYoungGen: 9600K->0K(4183040K)] [ParOldGen: 6288948K->2244369K(6291456K)] 6298548K->2244369K(10474496K), [Metaspace: 84355K->84355K(1128448K)], 1.3631063 secs] [Times: user=10.17 sys=0.00, real=1.36 secs]
2023-07-14T01:31:37.485+0800: 1564152.123: [Full GC (Ergonomics) [PSYoungGen: 9952K->0K(4183040K)] [ParOldGen: 6289357K->2286632K(6291456K)] 6299309K->2286632K(10474496K), [Metaspace: 84355K->84355K(1128448K)], 1.3682990 secs] [Times: user=9.58 sys=0.61, real=1.37 secs]
2023-07-14T02:25:32.474+0800: 1567387.112: [Full GC (Ergonomics) [PSYoungGen: 9824K->0K(4181504K)] [ParOldGen: 6286228K->2302335K(6291456K)] 6296052K->2302335K(10472960K), [Metaspace: 84355K->84355K(1128448K)], 1.3964071 secs] [Times: user=8.39 sys=2.02, real=1.40 secs]
2023-07-14T03:19:55.375+0800: 1570650.013: [Full GC (Ergonomics) [PSYoungGen: 9728K->0K(4183040K)] [ParOldGen: 6286803K->2298780K(6291456K)] 6296531K->2298780K(10474496K), [Metaspace: 84355K->84355K(1128448K)], 1.3909473 secs] [Times: user=10.22 sys=0.00, real=1.39 secs]
2023-07-14T04:14:24.521+0800: 1573919.159: [Full GC (Ergonomics) [PSYoungGen: 10016K->0K(4181504K)] [ParOldGen: 6286520K->2295721K(6291456K)] 6296536K->2295721K(10472960K), [Metaspace: 84355K->84355K(1128448K)], 1.4162378 secs] [Times: user=9.35 sys=1.20, real=1.41 secs]
2023-07-14T05:09:00.785+0800: 1577195.423: [Full GC (Ergonomics) [PSYoungGen: 9600K->0K(4183040K)] [ParOldGen: 6288843K->2304364K(6291456K)] 6298443K->2304364K(10474496K), [Metaspace: 84355K->84355K(1128448K)], 1.3999174 secs] [Times: user=8.48 sys=2.01, real=1.40 secs]
2023-07-14T06:02:47.382+0800: 1580422.021: [Full GC (Ergonomics) [PSYoungGen: 9920K->0K(4183552K)] [ParOldGen: 6287748K->2090612K(6291456K)] 6297668K->2090612K(10475008K), [Metaspace: 84355K->84355K(1128448K)], 1.4470640 secs] [Times: user=9.60 sys=0.00, real=1.44 secs]
2023-07-14T07:00:04.113+0800: 1583858.751: [Full GC (Ergonomics) [PSYoungGen: 10112K->0K(4182528K)] [ParOldGen: 6286867K->2123146K(6291456K)] 6296979K->2123146K(10473984K), [Metaspace: 84355K->84355K(1128448K)], 1.4455848 secs] [Times: user=8.65 sys=2.06, real=1.44 secs]
2023-07-14T07:56:28.038+0800: 1587242.676: [Full GC (Ergonomics) [PSYoungGen: 9696K->0K(4183040K)] [ParOldGen: 6287276K->2137543K(6291456K)] 6296972K->2137543K(10474496K), [Metaspace: 84355K->84355K(1128448K)], 1.4331709 secs] [Times: user=10.69 sys=0.00, real=1.44 secs]
2023-07-14T08:52:45.242+0800: 1590619.880: [Full GC (Ergonomics) [PSYoungGen: 9504K->0K(4183552K)] [ParOldGen: 6287306K->2141503K(6291456K)] 6296811K->2141503K(10475008K), [Metaspace: 84355K->84355K(1128448K)], 1.4394211 secs] [Times: user=10.76 sys=0.00, real=1.44 secs]
2023-07-14T09:48:49.616+0800: 1593984.254: [Full GC (Ergonomics) [PSYoungGen: 9952K->0K(4183040K)] [ParOldGen: 6289126K->2150238K(6291456K)] 6299078K->2150238K(10474496K), [Metaspace: 84356K->84356K(1128448K)], 1.4179921 secs] [Times: user=10.61 sys=0.00, real=1.42 secs]
2023-07-14T10:41:06.558+0800: 1597121.196: [Full GC (Ergonomics) [PSYoungGen: 9568K->0K(4182016K)] [ParOldGen: 6289797K->2142933K(6291456K)] 6299365K->2142933K(10473472K), [Metaspace: 84363K->84363K(1128448K)], 1.4261841 secs] [Times: user=10.58 sys=0.00, real=1.43 secs]
启动内存参数:
-Xms10g -Xmx10g -Xmn4g -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=320m -XX:-OmitStackTraceInFastThrow -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/apps/nacos/logs/java_heapdump.hprof -XX:-UseLargePages -Dnacos.member.list= -Djava.ext.dirs=/usr/java/jdk1.8.0_191-amd64//jre/lib/ext:/usr/java/jdk1.8.0_191-amd64//lib/ext -Xloggc:/data/apps nacos/logs/nacos_gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Dloader.path=/data/apps/nacos/plugins/health,/data/apps/nacos/plugins/cmdb -Dnacos.home=/data/apps/nacos -jar /data/apps/nacos/target/nacos-server.jar --spring.config.additional-location=file:/data/apps/nacos/conf/ --logging.config=/data/apps/nacos/conf/nacos-logback.xml --server.max-http-header-size=524288 nacos.nacos
由于Full GC时间过长,造成客户端链接超时:
com.alibaba.nacos.api.exception.NacosException: org.apache.http.conn.ConnectTimeoutException: Connect to x.x.x.x:8848 [/x.x.x.x] failed: connect timed out at com.alibaba.nacos.client.naming.net.NamingProxy.callServer(NamingProxy.java:615) at com.alibaba.nacos.client.naming.net.NamingProxy.reqApi(NamingProxy.java:526) at com.alibaba.nacos.client.naming.net.NamingProxy.reqApi(NamingProxy.java:498) at com.alibaba.nacos.client.naming.net.NamingProxy.sendBeat(NamingProxy.java:433) at com.alibaba.nacos.client.naming.beat.BeatReactor$BeatTask.run(BeatReactor.java:167) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
怀疑新版本代码改动造成此问题
The text was updated successfully, but these errors were encountered: