connectionListener of AbstractHTTP2ServerConnectionFactory cause the low performance of concurrent connect of http2 #2828

Closed
ligzy opened this issue Aug 18, 2018 · 10 comments

Comments


ligzy commented Aug 18, 2018

Hi,
Thanks for this project; I like Jetty very much. I am writing an HTTP/2 server based on Jetty 9.4.x.
I have a question about something I found:
A single server can accept and hold up to 460,000 connections when clients connect in batches of 50,000 from several client PCs.
But the server can only accept about 160,000 connections when the clients connect "at the same time", for example 5 client PCs opening 300,000 connections at once (through multiple IPs). The remaining clients get a connect timeout exception, and I saw Recv-Q building up for the server JVM.
I dumped the JVM stacks and found this:

"qtp1620112330-73" #73 prio=5 os_prio=0 tid=0x00007f1038001000 nid=0xcca40 waiting on condition [0x00007f10c24e5000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00007f122869a050> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at java.util.concurrent.CopyOnWriteArrayList.add(CopyOnWriteArrayList.java:436)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.addBean(ContainerLifeCycle.java:283)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.addBean(ContainerLifeCycle.java:267)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.addManaged(ContainerLifeCycle.java:362)
	at org.eclipse.jetty.http2.server.AbstractHTTP2ServerConnectionFactory$ConnectionListener.onOpened(AbstractHTTP2ServerConnectionFactory.java:226)
	at org.eclipse.jetty.io.AbstractConnection.onOpen(AbstractConnection.java:202)
	at org.eclipse.jetty.http2.HTTP2Connection.onOpen(HTTP2Connection.java:116)
	at org.eclipse.jetty.http2.server.HTTP2ServerConnection.onOpen(HTTP2ServerConnection.java:147)
	at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:440)
	at org.eclipse.jetty.server.NegotiatingServerConnection.onFillable(NegotiatingServerConnection.java:131)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
	at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:291)
	at org.eclipse.jetty.io.ssl.SslConnection$3.succeeded(SslConnection.java:151)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)
	at java.lang.Thread.run(Thread.java:748)

I read the source and found that:

private final Connection.Listener connectionListener = new ConnectionListener();

The connectionListener object causes the lock contention,
so I commented it out to test.

With that change I got better performance: connections are accepted faster, more connections are accepted, and there are no timeouts.

So, how about removing the connectionListener member of AbstractHTTP2ServerConnectionFactory for higher performance? Or is there another solution?

@ligzy ligzy changed the title AbstractHTTP2ServerConnectionFactory cause the low performance of concurrent connect of http2 connectionListener of AbstractHTTP2ServerConnectionFactory cause the low performance of concurrent connect of http2 Aug 18, 2018
Contributor

sbordet commented Aug 18, 2018

ConnectionListener is used to add/remove HTTP2Session instances to/from the component tree, so that they are visible in JMX.
This is an important feature that we cannot remove.

Having said that, perhaps it is now the case that the component tree is more likely to be a mostly write data structure than a mostly read data structure?

ContainerLifeCycle._beans has a Set semantic but it's a List.

Would it be worth replacing it with ConcurrentHashMap.newKeySet()?
This would give the Set semantic and be concurrent and more efficient for writes.
We will pay an additional indirection reading though, as we go from an Object[] holding the beans (in CopyOnWriteArrayList) to a Node[] (in ConcurrentHashMap).
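As a rough illustration of the trade-off described above, here is a minimal sketch that times concurrent adds into a CopyOnWriteArrayList and into a ConcurrentHashMap.newKeySet(). The class name BeanCollectionSketch and the method timeConcurrentAdds are made up for this example and are not Jetty code. It is not a proper benchmark (no warm-up, no JMH), so the numbers it prints only hint at the write-side behaviour.

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch, not Jetty code: compares write-heavy use of the two collections.
public class BeanCollectionSketch {
    // Adds perThread distinct objects from each of `threads` writer threads
    // and returns the elapsed wall-clock time in nanoseconds.
    public static long timeConcurrentAdds(Collection<Object> target, int threads, int perThread) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // release all writers together
                    for (int i = 0; i < perThread; i++)
                        target.add(new Object());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        int threads = 4;
        int perThread = 2_000;
        // Every add copies the whole backing array while holding a lock.
        Collection<Object> list = new CopyOnWriteArrayList<>();
        // Adds only contend when they hit the same hash bin.
        Collection<Object> set = ConcurrentHashMap.newKeySet();
        long listNanos = timeConcurrentAdds(list, threads, perThread);
        long setNanos = timeConcurrentAdds(set, threads, perThread);
        System.out.printf("CopyOnWriteArrayList: %d adds in %d us%n", list.size(), listNanos / 1_000);
        System.out.printf("ConcurrentHashMap.newKeySet(): %d adds in %d us%n", set.size(), setNanos / 1_000);
    }
}
```

The read side (iteration over an Object[] versus over hash bins) is not measured here, and that is the cost the paragraph above is worried about.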

Perhaps a benchmark?

@gregw thoughts?

Contributor

gregw commented Aug 19, 2018

@sbordet I question the need for HTTP2Sessions to be managed objects. They are transient instances that may end their lifecycle at any time with a message from the remote end. Thus they are highly unreliable targets for any JMX operation, as the object may cease to exist between the discovery of the managed attribute and any attempt to call it. Almost all the managed attributes on the object are never going to change from the values set by the factory that creates the session, so its managed object should be sufficient to convey that information.

So I would suggest that we stop adding the sessions directly to the component tree. Instead we can add a new object, perhaps called a Http2SessionContainer, to the component tree. This object would be Dumpable and would dump all its sessions when dumped (either directly or as part of a component tree dump). This object would be managed, so it could provide managed attributes to describe the collection in general, and if need be it could have methods that would allow the attributes of individual sessions to be queried/set. The object would of course have the most appropriate data structure to best support the most frequent operations: add & remove of sessions.

This would mimic other similar setups in Jetty, e.g. the SelectorManager and SessionManager objects are both in the component tree, but the many transient objects that they manage are not. However their collections are visible via dumps and JMX aggregate operations.
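To make the container idea concrete, here is a minimal sketch under stated assumptions. The class SessionContainerSketch and its methods are hypothetical names for illustration, and it deliberately avoids the real Jetty Dumpable and JMX annotations so that it stays self-contained. It shows only the shape of the proposal: one long-lived object that tracks transient sessions in a concurrent set and reports them in aggregate.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the Jetty implementation: a single long-lived
// container that tracks transient sessions instead of adding each one
// to the component tree.
public class SessionContainerSketch {
    // Concurrent set: add and remove do not take one global lock.
    private final Set<Object> sessions = ConcurrentHashMap.newKeySet();

    // Would be called when a connection (and its session) is opened.
    public void onOpened(Object session) {
        sessions.add(session);
    }

    // Would be called when the connection is closed.
    public void onClosed(Object session) {
        sessions.remove(session);
    }

    // Aggregate attribute that a management layer could expose.
    public int getSize() {
        return sessions.size();
    }

    // Dumps the container followed by one line per tracked session.
    // Iteration is weakly consistent, so a dump taken under load is a snapshot at best.
    public String dump() {
        StringBuilder out = new StringBuilder();
        out.append("SessionContainerSketch[size=").append(sessions.size()).append("]\n");
        for (Object session : sessions)
            out.append(" +- ").append(session).append('\n');
        return out.toString();
    }

    public static void main(String[] args) {
        SessionContainerSketch container = new SessionContainerSketch();
        container.onOpened("session-1");
        container.onOpened("session-2");
        container.onClosed("session-1");
        System.out.print(container.dump());
    }
}
```

Only the container itself would be a bean in the component tree; the sessions are reachable through its dump and its aggregate attributes.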

Contributor

gregw commented Aug 19, 2018

Note also that a Http2SessionContainer could have a method to add a specific session as a bean, if it really needed to expose that particular session's MBean... but I am still dubious that is necessary.

Author

ligzy commented Aug 19, 2018

@sbordet

Got it.
Following your suggestion, I updated the code and tested it, but the result is not satisfactory.
I updated the ContainerLifeCycleBase code like this:

	private final ConcurrentHashMap<Bean, Bean> _beans = new ConcurrentHashMap<Bean, Bean>();
	private final ConcurrentHashMap<Container.Listener, Container.Listener> _listeners = new ConcurrentHashMap<Container.Listener, Container.Listener>();

	@Override
	protected void doStart() throws Exception {
		if (_destroyed)
			throw new IllegalStateException("Destroyed container cannot be restarted");

		// indicate that we are started, so that addBean will start other beans added.
		_doStarted = true;

		// start our managed and auto beans
		for (Bean b : _beans.values()) {
			if (b._bean instanceof LifeCycle) {
				LifeCycle l = (LifeCycle) b._bean;
				switch (b._managed) {
				case MANAGED:
					if (!l.isRunning())
						start(l);
					break;

				case AUTO:
					if (l.isRunning())
						unmanage(b);
					else {
						manage(b);
						start(l);
					}
					break;

				default:
					break;
				}
			}
		}

		super.doStart();
	}

This does not give better performance: only up to about 130,000 connections.

If I comment out everything related to

private final Connection.Listener connectionListener = new ConnectionListener();

it can accept 500,000 connections and more (with about 14 GB of the 64 GB of memory left).

So in our case the server prefers better performance and may not need sessions to be managed through JMX. Could there be an option to control whether sessions are managed,
or some other approach?

Author

ligzy commented Aug 19, 2018

@gregw yes, we prefer high performance because the Jetty HTTP/2 server is our gateway.
We hope it can accept and hold at least 500,000 connections (up to 1M would be better) on a server with 64 GB of memory. If I comment out the connectionListener of AbstractHTTP2ServerConnectionFactoryBase,
the test shows it can reach that easily.
Here is the test report data:

Before commenting out the listener:
date connections
Sun Aug 19 21:17:54 CST 2018 1
Sun Aug 19 21:18:04 CST 2018 1801
Sun Aug 19 21:18:14 CST 2018 12596
Sun Aug 19 21:18:25 CST 2018 31692
Sun Aug 19 21:18:47 CST 2018 60150
Sun Aug 19 21:19:01 CST 2018 70893
Sun Aug 19 21:19:16 CST 2018 81061
Sun Aug 19 21:19:29 CST 2018 89588
Sun Aug 19 21:19:57 CST 2018 105406
Sun Aug 19 21:20:15 CST 2018 108778
Sun Aug 19 21:20:36 CST 2018 111866
Sun Aug 19 21:20:56 CST 2018 116502
Sun Aug 19 21:21:16 CST 2018 119821
Sun Aug 19 21:21:34 CST 2018 122523
Sun Aug 19 21:22:01 CST 2018 123362
Sun Aug 19 21:22:20 CST 2018 124534
Sun Aug 19 21:22:43 CST 2018 124989
Sun Aug 19 21:23:04 CST 2018 125166
Sun Aug 19 21:23:23 CST 2018 127008
Sun Aug 19 21:23:45 CST 2018 126792
Sun Aug 19 21:24:02 CST 2018 128445
Sun Aug 19 21:24:25 CST 2018 128714
Sun Aug 19 21:24:43 CST 2018 129608
Sun Aug 19 21:25:04 CST 2018 131757
Sun Aug 19 21:25:28 CST 2018 132026
Sun Aug 19 21:25:45 CST 2018 133842
Sun Aug 19 21:26:05 CST 2018 134388
Sun Aug 19 21:26:25 CST 2018 136471
Sun Aug 19 21:26:52 CST 2018 137856
Sun Aug 19 21:27:11 CST 2018 139082
Sun Aug 19 21:27:37 CST 2018 144547
Sun Aug 19 21:27:55 CST 2018 140170
Sun Aug 19 21:28:30 CST 2018 143542
Sun Aug 19 21:28:47 CST 2018 142151
Sun Aug 19 21:29:14 CST 2018 143388
Sun Aug 19 21:29:39 CST 2018 144991
Sun Aug 19 21:29:57 CST 2018 140534
Sun Aug 19 21:30:33 CST 2018 144533
Sun Aug 19 21:31:01 CST 2018 145362
Sun Aug 19 21:31:17 CST 2018 142906
Sun Aug 19 21:31:54 CST 2018 146322
Sun Aug 19 21:32:20 CST 2018 148008

After commenting out the listener (the speed is satisfactory :) ):
Sun Aug 19 21:42:34 CST 2018 1
Sun Aug 19 21:42:44 CST 2018 3270
Sun Aug 19 21:42:55 CST 2018 38947
Sun Aug 19 21:43:09 CST 2018 87946
Sun Aug 19 21:43:22 CST 2018 121767
Sun Aug 19 21:43:39 CST 2018 179252
Sun Aug 19 21:44:01 CST 2018 227720
Sun Aug 19 21:44:28 CST 2018 299753
Sun Aug 19 21:45:02 CST 2018 368717
Sun Aug 19 21:45:42 CST 2018 393140
Sun Aug 19 21:46:29 CST 2018 473516
Sun Aug 19 21:47:25 CST 2018 482001
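
For comparison, the average accept rate of each run can be derived from its first and last sample above. The sketch below does that arithmetic; the class name AcceptRateSketch and its method are made up for this example. It gives roughly 171 connections per second before the change and roughly 1,656 per second after.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper for illustration: average accept rate between two samples.
public class AcceptRateSketch {
    // Times are HH:mm:ss strings taken from the report; both samples are on the same day.
    public static double connectionsPerSecond(String startTime, int startCount,
                                              String endTime, int endCount) {
        long seconds = Duration.between(LocalTime.parse(startTime), LocalTime.parse(endTime)).getSeconds();
        return (endCount - startCount) / (double) seconds;
    }

    public static void main(String[] args) {
        // First and last rows of the "before" table.
        double before = connectionsPerSecond("21:17:54", 1, "21:32:20", 148008);
        // First and last rows of the "after" table.
        double after = connectionsPerSecond("21:42:34", 1, "21:47:25", 482001);
        System.out.printf("before: %.1f connections/s%n", before);
        System.out.printf("after:  %.1f connections/s%n", after);
    }
}
```

These are averages over the whole run; both tables show the rate falling off as the connection count grows.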

Contributor

sbordet commented Aug 19, 2018

@ligzy have you tried #2831?

Author

ligzy commented Aug 19, 2018

@sbordet I tried #2831, and it is fast again.
Thanks! That was a quick fix.

I will upgrade from 9.4.11.v20180605 after your new release.

(But I would really like it to be even faster and to hold more connections. For that goal, it would be acceptable for sessions not to be managed. Could there be an option for that?)

gregw added a commit that referenced this issue Aug 20, 2018
Signed-off-by: Greg Wilkins <gregw@webtide.com>
Contributor

sbordet commented Aug 20, 2018

@ligzy I don't know exactly what you are trying to test here. Seems like a synthetic benchmark of "let's bomb the server with connect requests and see when it dies" that never happens in reality and so it will give you back a number that may not be of any use to you: from this number you will not be able to derive any capacity for your real system.

Is there any difference between #2831 and the case where you commented out the field?

In any case, here are a few suggestions:

  • Make sure you have the OS tuned properly. You want the TCP stack to be able to support this many connections. Google for this specific problem.
  • Increase heap size: it takes memory to hold on to all those connections and associated data structures. The more memory you have, the less time you will spend in GC, also considering that all these connections will end up in old generation.
  • Make sure you have a lot of selectors on the ServerConnector and possibly a lot of cores. For 500K connections, no less than 10 cores, possibly way more than that (48/64 or more).
  • Make sure you have more than 1 acceptor on the ServerConnector. Acceptors will compete to accept connections, so while one acceptor is registering a new accepted connection, other acceptors can accept new connections.
  • Make sure it's not the client that is slowing you down. I would not be surprised if it's the client that cannot issue connect attempts as fast as you would like.
  • Be on Linux, not on Windows.

sbordet added a commit that referenced this issue Aug 21, 2018
… low performance.

Improved JMX for the HTTP2SessionContainer.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
sbordet added a commit that referenced this issue Aug 22, 2018
…on_listener_contention

Fixes #2828 - AbstractHTTP2ServerConnectionFactory concurrent connect low performance.
sbordet added a commit that referenced this issue Aug 22, 2018
… low performance.

Fixed test failure.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
Author

ligzy commented Dec 29, 2018

Hi,
Here is my test and my question: I found that this slows down the rate at which connections are accepted.
Could you change it to a HashSet?

"qtp1620112330-1064" #1064 prio=5 os_prio=0 tid=0x00007f104c01d000 nid=0xcce52 waiting on condition [0x00007f0f478bd000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00007f122869a050> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at java.util.concurrent.CopyOnWriteArrayList.add(CopyOnWriteArrayList.java:436)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.addBean(ContainerLifeCycle.java:283)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.addBean(ContainerLifeCycle.java:267)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.addManaged(ContainerLifeCycle.java:362)
	at org.eclipse.jetty.http2.server.AbstractHTTP2ServerConnectionFactory$ConnectionListener.onOpened(AbstractHTTP2ServerConnectionFactory.java:226)
	at org.eclipse.jetty.io.AbstractConnection.onOpen(AbstractConnection.java:202)
	at org.eclipse.jetty.http2.HTTP2Connection.onOpen(HTTP2Connection.java:116)
	at org.eclipse.jetty.http2.server.HTTP2ServerConnection.onOpen(HTTP2ServerConnection.java:147)
	at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:440)
	at org.eclipse.jetty.server.NegotiatingServerConnection.onFillable(NegotiatingServerConnection.java:131)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
	at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:291)
	at org.eclipse.jetty.io.ssl.SslConnection$3.succeeded(SslConnection.java:151)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)
	at java.lang.Thread.run(Thread.java:748)

Contributor

sbordet commented Dec 29, 2018

@ligzy I don't understand your last comment. Make sure you are using the latest version of Jetty - the stack trace shows you are on an old version.
