Enhancement for graceful shutdown #1021

Closed
wants to merge 1 commit into from

manzhizhen

What is the purpose of the change

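In my own testing, neither 2.5.3 (the most widely used version) nor 2.5.7 (the latest) can shut down gracefully when retries are disabled. This change modifies a small amount of code and adds configurable wait times, which is enough to achieve graceful shutdown without enabling retries.

The mechanism is to add a configurable wait at two points: on the provider, after it disconnects from the registry and before it stops responding; and on the consumer, after the invoker is removed and before the client is closed. In my tests this gives graceful shutdown without configuring retries.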
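
The two wait points described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `GracefulShutdownSketch`, its methods, and the step names are all hypothetical, and the real changes live in the files listed in the changelog below.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Dubbo source: all names here are illustrative.
class GracefulShutdownSketch {

    // Records each step so the ordering can be inspected.
    final List<String> steps = new ArrayList<>();

    // Provider side: leave the registry first, wait, then stop responding.
    void shutdownProvider(long waitMillis) {
        steps.add("provider: unregister from registry");
        // The pause gives consumers time to see the registry change and
        // lets requests that are already in flight finish.
        pause(waitMillis);
        steps.add("provider: close server");
    }

    // Consumer side: drop the invoker first, wait, then close the client.
    void shutdownConsumer(long waitMillis) {
        steps.add("consumer: remove invoker");
        // The pause lets calls that already picked this invoker complete.
        pause(waitMillis);
        steps.add("consumer: close client");
    }

    private static void pause(long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            // Preserve the interrupt status and continue with shutdown.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        GracefulShutdownSketch sketch = new GracefulShutdownSketch();
        sketch.shutdownProvider(5000);
        sketch.shutdownConsumer(2000);
        sketch.steps.forEach(System.out::println);
    }
}
```

The point of the sketch is only the ordering: the component stops receiving new work before the wait, and the connection is torn down after it.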

Brief changelog

modified: dubbo-common/src/main/java/com/alibaba/dubbo/common/Constants.java
modified: dubbo-config/dubbo-config-api/src/main/java/com/alibaba/dubbo/config/ProtocolConfig.java
modified: dubbo-rpc/dubbo-rpc-api/src/main/java/com/alibaba/dubbo/rpc/protocol/AbstractInvoker.java
modified: dubbo-rpc/dubbo-rpc-default/src/main/java/com/alibaba/dubbo/rpc/protocol/dubbo/DubboInvoker.java
modified: dubbo-rpc/dubbo-rpc-thrift/src/main/java/com/alibaba/dubbo/rpc/protocol/thrift/ThriftInvoker.java

Verifying this change

XXXX

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a GITHUB_issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a GITHUB issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Format the pull request title like [Dubbo-XXX] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write the unit tests needed to verify your logic correction, mocking where cross-module dependencies exist. If a new feature or significant change is committed, please remember to add an integration test in the test module.
  • Run mvn clean install -DskipITs to make sure the unit tests pass. Run mvn clean test-compile failsafe:integration-test to make sure the integration tests pass.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@manzhizhen
Author

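Most companies using Dubbo now disable retries on most interfaces to avoid avalanches and traffic storms in extreme cases. With Dubbo's current graceful shutdown design, disabling retries means graceful shutdown no longer works. This change takes a fairly simple approach to raise the success rate of graceful shutdown when retries are off.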

http://blog.csdn.net/manzhizhen/article/details/78756370

@manzhizhen
Author

manzhizhen commented Dec 9, 2017

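Test plan:
Module A (consumer, single instance) calls an interface of module B (provider, multiple instances) under high concurrency. Then kill a module B instance and check whether module A reports errors.

Module B is the provider. Multiple instances are started on different ports. Configuration and code: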

<dubbo:application name="mzz-dubbo2"/>
<dubbo:protocol name="dubbo" port="8887" threads="200" dispatcher="message" />
<dubbo:registry id="dubboRegistry" protocol="zookeeper" address="127.0.0.1:2181"/>
<dubbo:provider token="true" connections="3"/>
<dubbo:service interface="com.manzhizhen.dubbo.service.Dubbo2Service"
               ref="dubbo2Service" version="1.0.0" timeout="100" />
<dubbo:service interface="com.manzhizhen.dubbo.service.Dubbo2Service1"
               ref="dubbo2Service1" version="1.0.0" timeout="100" />
public interface Dubbo2Service {
    void service2(Integer num);
}
public interface Dubbo2Service1 {
    void service3(Integer num);
}
@Service("dubbo2Service")
public class Dubbo2ServiceImpl implements Dubbo2Service {
    @Override
    public void service2(Integer num) {
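        try {
            // Simulate 10 ms of work
            Thread.sleep(10);
        } catch (InterruptedException e) {
            // Restore the interrupt status instead of swallowing it
            Thread.currentThread().interrupt();
        }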
    }
}
@Service("dubbo2Service1")
public class Dubbo2Service1Impl implements Dubbo2Service1 {
    @Override
    public void service3(Integer num) {
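        try {
            // Simulate 10 ms of work
            Thread.sleep(10);
        } catch (InterruptedException e) {
            // Restore the interrupt status instead of swallowing it
            Thread.currentThread().interrupt();
        }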
    }
}

Module A is the consumer, with retries set to 0. Configuration and code:

<dubbo:application name="mzz-dubbo1" />
<dubbo:registry id="dubboRegistry" protocol="zookeeper" address="127.0.0.1:2181"/>
<dubbo:provider token="true" />
<dubbo:reference interface="com.manzhizhen.dubbo.service.Dubbo2Service"
                 id="dubbo2Service" version="1.0.0" timeout="2000" retries="0"/>
<dubbo:reference interface="com.manzhizhen.dubbo.service.Dubbo2Service1"
                 id="dubbo2Service1" version="1.0.0" timeout="2000" retries="0"/>
@Service
public class Dubbo1ServiceImpl implements Dubbo1Service {
    @Autowired
    private Dubbo2Service dubbo2Service;
    @Autowired
    private Dubbo2Service1 dubbo2Service1;

    @Override
    public void service1() {
        dubbo2Service.service2(1);
        dubbo2Service1.service3(2);
    }
}
public static void main(String[] args) {
    ApplicationContext applicationContext = new ClassPathXmlApplicationContext("classpath:dubbo-spring.xml");
    Dubbo1Service dubbo1ServiceImpl = (Dubbo1Service) applicationContext.getBean("dubbo1ServiceImpl");
    dubbo1ServiceImpl.service1();

    // 200 threads call continuously to create a high-concurrency scenario
    ThreadPoolExecutor poolExecutor = new ThreadPoolExecutor(200, 200, 100,
            TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(10000));

    int times = 200;
    while (times-- > 0) {
        poolExecutor.submit(() -> {
            try {
                // Each thread calls in an endless loop until an exception occurs
                while (true) {
                    dubbo1ServiceImpl.service1();
                }
            } catch (Exception e) {
                // With a truly graceful shutdown, no exception is thrown here
                e.printStackTrace();
            }
        });
    }

    // Keep the main thread alive while the worker threads run
    try {
        TimeUnit.HOURS.sleep(1000);
    } catch (InterruptedException e) {
        // Interrupted: fall through and let main return
    }
}

@manzhizhen
Author

manzhizhen commented Dec 9, 2017

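Before the change, with 1 instance of module A and 5 instances of module B running, using kill on one of the module B instances causes module A to throw the following exception: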

com.alibaba.dubbo.rpc.RpcException: Failed to invoke the method service2 in the service com.manzhizhen.dubbo.service.Dubbo2Service. Tried 1 times of the providers [172.30.97.211:8887] (1/3) from the registry 10.0.50.150:2181 on the consumer 172.30.97.211 using the dubbo version 2.5.7. Last error is: Failed to invoke remote method: service2, provider: dubbo://172.30.97.211:8887/com.manzhizhen.dubbo.service.Dubbo2Service?anyhost=true&application=mzz-dubbo1&check=false&default.connections=3&default.token=true&dispatcher=message&dubbo=2.5.7&generic=false&interface=com.manzhizhen.dubbo.service.Dubbo2Service&loadbalance=roundrobin&methods=service2&pid=1508&register.ip=172.30.97.211&remote.timestamp=1512800036146&retries=0&revision=1.0.0&side=consumer&timeout=2000&timestamp=1512800235497&version=1.0.0, cause: message can not send, because channel is closed . url:dubbo://172.30.97.211:8887/com.manzhizhen.dubbo.service.Dubbo2Service?anyhost=true&application=mzz-dubbo1&check=false&codec=dubbo&default.connections=3&default.token=true&dispatcher=message&dubbo=2.5.7&generic=false&heartbeat=60000&interface=com.manzhizhen.dubbo.service.Dubbo2Service&loadbalance=roundrobin&methods=service2&pid=1508&register.ip=172.30.97.211&remote.timestamp=1512800036146&retries=0&revision=1.0.0&side=consumer&timeout=2000&timestamp=1512800235497&version=1.0.0
	at com.alibaba.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:101)
	at com.alibaba.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:229)
	at com.alibaba.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:72)
	at com.alibaba.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:52)
	at com.alibaba.dubbo.common.bytecode.proxy0.service2(proxy0.java)
	at com.manzhizhen.dubbo.service.impl.Dubbo1ServiceImpl.service1(Dubbo1ServiceImpl.java:22)
	at com.manzhizhen.dubbo.service.MzzDubbo1Main.lambda$main$0(MzzDubbo1Main.java:32)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: com.alibaba.dubbo.remoting.RemotingException: message can not send, because channel is closed . url:dubbo://172.30.97.211:8887/com.manzhizhen.dubbo.service.Dubbo2Service?anyhost=true&application=mzz-dubbo1&check=false&codec=dubbo&default.connections=3&default.token=true&dispatcher=message&dubbo=2.5.7&generic=false&heartbeat=60000&interface=com.manzhizhen.dubbo.service.Dubbo2Service&loadbalance=roundrobin&methods=service2&pid=1508&register.ip=172.30.97.211&remote.timestamp=1512800036146&retries=0&revision=1.0.0&side=consumer&timeout=2000&timestamp=1512800235497&version=1.0.0
	at com.alibaba.dubbo.remoting.transport.AbstractClient.send(AbstractClient.java:256)

With the modified Dubbo and all configuration unchanged, killing one of the module B instances produces no exceptions in module A.

@manzhizhen
Author

To adjust the provider and consumer wait times, set the following keys in dubbo.properties:
provider.shutdown.min.wait=5000
consumer.shutdown.min.wait=2000
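
As a rough illustration of how such keys could be consumed, the sketch below reads a wait time in milliseconds from a `java.util.Properties` object and falls back to a default when the key is missing or malformed. It is an assumption-laden example, not the PR's implementation: the class `ShutdownWaitConfigSketch`, the helper names, and the fallback behaviour are hypothetical. Only the two key names come from this PR.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch, not Dubbo source: only the two key names come from the PR.
class ShutdownWaitConfigSketch {

    static final String PROVIDER_KEY = "provider.shutdown.min.wait";
    static final String CONSUMER_KEY = "consumer.shutdown.min.wait";

    // Returns the configured wait in milliseconds, or defaultMillis when the
    // key is absent, blank, negative, or not a number (assumed behaviour).
    static long waitMillis(Properties props, String key, long defaultMillis) {
        String raw = props.getProperty(key);
        if (raw == null || raw.trim().isEmpty()) {
            return defaultMillis;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value >= 0 ? value : defaultMillis;
        } catch (NumberFormatException e) {
            return defaultMillis;
        }
    }

    // Parses properties-file text; stands in for loading dubbo.properties.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(PROVIDER_KEY + "=5000\n" + CONSUMER_KEY + "=2000\n");
        System.out.println(waitMillis(props, PROVIDER_KEY, 0L));
        System.out.println(waitMillis(props, CONSUMER_KEY, 0L));
    }
}
```

Falling back to a default rather than failing on a bad value is a design choice for the sketch: a typo in a properties file should not be able to block shutdown.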

@manzhizhen manzhizhen closed this Dec 9, 2017
@manzhizhen
Author

@chickenlj Give me some time to look into 49525b0 first, haha.

@manzhizhen manzhizhen reopened this Dec 9, 2017
@1cming

1cming commented Dec 11, 2017

That's what I need !!!

@chickenlj chickenlj self-requested a review January 18, 2018 03:00
@chickenlj chickenlj changed the base branch from master to 2.5.x January 23, 2018 07:26
@chickenlj chickenlj changed the title from "不用重试也能优雅停机" (Graceful shutdown without retries) to "Enhancement for graceful shutdown" Jan 23, 2018
chickenlj added a commit that referenced this pull request Jan 23, 2018
@chickenlj chickenlj added this to the 2.5.9 milestone Jan 23, 2018
@chickenlj chickenlj added the status/don’t-merge No plan to merge label Jan 23, 2018
@chickenlj chickenlj closed this Jan 23, 2018
@chickenlj chickenlj modified the milestones: 2.5.9, 2.6.1 Mar 16, 2018
rolandhe pushed a commit to rolandhe/dubbo that referenced this pull request Sep 9, 2019