Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delta CDS: add on-demand cds support #9626

Closed
wants to merge 7 commits into from

Conversation

aaron-ai
Copy link
Member

@aaron-ai aaron-ai commented Jan 9, 2020

Signed-off-by: aaron-ai yangkun.ayk@alibaba-inc.com

Description: Apache RocketMQ needs the ability of on-demand CDS #9503 , so we split this part as an independent PR, this PR allows user can start a spontaneous DeltaDiscoveryRequests for CDS (AKA on-demand cds) by using addToClusterInterest in cluster_manager.
Risk Level: Low
Testing: Unit Tests
Docs Changes: N/A
Release Notes: N/A
[Optional Fixes #Issue]: #9431

@lizhanhui
Copy link

@zyfjeff

@mattklein123
Copy link
Member

@wgallagher do you mind taking a first pass on this?

@wgallagher
Copy link

sure

@htuch
Copy link
Member

htuch commented Jan 9, 2020

Can we take a step back and document what the wire protocol looks like? On-demand protocols are pretty tricky, we only have one proof-of-existence today, which is VHDS, see https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/vhds.html?highlight=vhds. We should have similar level of documentation for what delta CDS looks like.

@htuch htuch self-assigned this Jan 9, 2020
@aaron-ai
Copy link
Member Author

Can we take a step back and document what the wire protocol looks like? On-demand protocols are pretty tricky, we only have one proof-of-existence today, which is VHDS, see https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/vhds.html?highlight=vhds. We should have similar level of documentation for what delta CDS looks like.

Thank you, on-demand CDS 's document is available at the link below:

https://docs.google.com/document/d/1WW4rgUTAeeFO4XLLMDZhH71BYwr6u9r5XvCiawmjWVQ/edit?usp=sharing

@htuch
Copy link
Member

htuch commented Jan 10, 2020

@aaron-ai great, can you turn that into RST and include it in this PR? It should live somewhere similar to the VHDS docs I linked to.

@aaron-ai
Copy link
Member Author

@aaron-ai great, can you turn that into RST and include it in this PR? It should live somewhere similar to the VHDS docs I linked to.

OK, include the RST in the PR already.

@mattklein123
Copy link
Member

@aaron-ai for this to be useful in a filter don't you need some capability to wait until the initial watch interest is loaded, with relevant timeouts, etc.? I don't see what in the code here. I'm trying to understand how this would be used in practice.

@aaron-ai
Copy link
Member Author

aaron-ai commented Jan 14, 2020

@aaron-ai for this to be useful in a filter don't you need some capability to wait until the initial watch interest is loaded, with relevant timeouts, etc.? I don't see what in the code here. I'm trying to understand how this would be used in practice.

Thanks for your reply! actually we will use ClusterUpdateCallbacks to notify the events to be processed, the code in rocketmq_proxy (#9503 ) shows the related part.

void RocketmqClusterUpdateCallback::onClusterAddOrUpdate(Upstream::ThreadLocalCluster& cluster) {
  ENVOY_LOG(trace, "RocketmqClusterUpdateCallback#onClusterAddOrUpdate: {}",
            cluster.info()->name());
  connection_manager_.onClusterReady(cluster);
}

@mattklein123
Copy link
Member

Thanks OK makes sense regarding the update callbacks. @wgallagher if you could take a first pass that would be great. Thank you!

that Envoy will call to dynamically fetch upstream clusters which Envoy interested in spontaneously.

By default in CDS, all cluster configurations are sent to every Envoy instance in the mesh. The
delta CDS provides the ability that the control panel can send incremental CDS to the Envoy instance,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/control panel/management server/

std::set<std::string> additions = add_these_names;

return AddedRemoved(findAdditions(newly_added_to_watch, watch),
findRemovals(newly_removed_from_watch, watch));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't this always be the empty set?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

findRemovals(newly_removed_from_watch, watch) will be always empty, maybe giving a empty set here is more appropriate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I've already remove additions.

@@ -227,6 +227,10 @@ class ClusterManager {
* @return Config::SubscriptionFactory& the subscription factory.
*/
virtual Config::SubscriptionFactory& subscriptionFactory() PURE;

virtual void updateClusterInterest(const std::set<std::string>& update_to_these_names) PURE;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this used?


virtual void updateClusterInterest(const std::set<std::string>& update_to_these_names) PURE;

virtual void addToClusterInterest(const std::set<std::string>& add_these_names) PURE;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this used?

Copy link
Member Author

@aaron-ai aaron-ai Jan 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, updateClusterInterest here only provides the possibility to update the entire clusterInterest, maybe I should remove this method. As for addToClusterInterest, the code in rocketmq_proxy (#9503 ) call it in practice.

code in RouterImpl#sendRequestToUpstream

  if (!cluster) {
    active_message_->awaitCluster(true);
    active_message_->incPendingStats();
    if (active_message_->connectionManager().hasInFlightCdsRequest(cluster_name)) {
      ENVOY_LOG(trace, "Cluster {} is absent, but there has been an in-flight CDS request",
                cluster_name);
      return;
    }

    std::set<std::string> resources = {cluster_name};
    cluster_manager_.addToClusterInterest(resources);
    active_message_->connectionManager().markInFlightCdsRequest(cluster_name);
    ENVOY_LOG(info, "Cluster {} is not available, trigger to set up through delta CDS. Opaque: {}",
              cluster_name, opaque);
    return;
  }

code in ActiveMessage#onQueryTopicRoute

    await_cluster_ = true;
    incPendingStats();
    std::set<std::string> resources{cluster_name};
    ENVOY_LOG(trace, "Initiate on-demand cluster discovery service for {}", cluster_name);
    clusterManager.addToClusterInterest(resources);
    connection_manager_.markInFlightCdsRequest(cluster_name);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wgallagher the on-demand part in #9503 has been removed temporarily, but you can still view the related part in this commit : aaron-ai@40ed836

@@ -125,6 +125,16 @@ class GrpcMux {
const std::set<std::string>& resources,
SubscriptionCallbacks& callbacks,
std::chrono::milliseconds init_fetch_timeout) PURE;

/**
* The only difference between addToWatch() and addOrUpdateWatch() is that the 'resources' here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would phrase this comment a bit differently, something like: 'addToWatch adds additional resources to the watch. Unlike addOrUpdateWatch, it never removes resources'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would phrase this comment a bit differently, something like: 'addToWatch adds additional resources to the watch. Unlike addOrUpdateWatch, it never removes resources'

OK, I'll fix it.

@@ -18,3 +18,40 @@ Statistics
----------

CDS has a :ref:`statistics <subscription_statistics>` tree rooted at *cluster_manager.cds.*

On-demand CDS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markdroth FYI, in case you have some review bandwidth cycles to give this a pass similar to the VHDS one.

@aaron-ai aaron-ai requested review from wgallagher and htuch January 16, 2020 01:42
in, using on-demand CDS will allow an Envoy instance to subscribe and unsubscribe from a list of
cluster configurations stored internally in the xDS management server. The xDS management server
will monitor the list and use it to filter the configuration sent to an individual Envoy instance
to only contain the subscribed cluster configurations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true? Let's say we have an Envoy and it has a CDS initially with {X, Y}. An extension notices that Z is missing and asks for Z. Is the management server now only returning {Z} or is it {X, Y, Z} (via delta protocol). I think what we're doing with on-demand is augmenting the base set of clusters, not going pure delta subscription.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the management server will only return {Z}, but actually it depends the implementation of management server, management server can also return {X, Y, Z}, but in the scene we expected before, the management server will only return {Z}.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to describe two different things here:

  1. What the management server thinks the Envoy is subscribed to. Presumably {X, Y, Z}.
  2. What the management server puts on the wire as part of the response delta update, this is just {Z}, but it could validly be any superset.

with the :ref:`type_url <envoy_api_field_DeltaDiscoveryRequest.type_url>` set to
`type.googleapis.com/envoy.api.v2.Cluster` and
:ref:`resource_names_subscribe <envoy_api_field_DeltaDiscoveryRequest.resource_names_subscribe>`
set to a list of cluster resource names for which it would like configuration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the cluster name have any special form? How does the extension (RocketMQ in your case) know what cluster to ask for? I.e. how is it translating something in the request to something it can ask for from the management server?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no special form here. the resource_names_subscribe would notify management server cluster name the envoy instance need, the type_url would notify management server the type of resource the envoy instance need. The management server will translate the clusters into resources in DeltaDiscoveryResponse envoy can accept.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still trying to understand how we go from some RocketMQ concept to a cluster resource name. Can you elaborate on this?

Copy link
Member Author

@aaron-ai aaron-ai Jan 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, there are a series of concepts in RocketMQ: topic, broker, and nameserver. nameserver is the server who provides service-discovery to find broker through topic. RocketMQ client will choose a topic to send/consume message. topic is just a name or an abstract concept, it is a set of brokers actually.

You can view more details in the official website's description about the architecture, which could be more precise.

In a standard procedure, RocketMQ client will fetch the broker's address through nameserver, the nameserver will send the route information to the client. Client would send/consume message to broker directly.

In service mesh, one topic's route information is abstracted into one cluster in xDS, the topic name is the cluster's name or the resource's name in delta xDS. Sometimes the envoy instance do not retain the cluster ( can not find the cluster in clusterManager ) but RocketMQ client needs it, the addToClusterInterest method will be trigger to fetch the route information from management server, and the ClusterUpdateCallbacks will notify the the events to be processed.

Copy link
Member

@htuch htuch Jan 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think the interesting question here is how do we map topic to cluster name? Is this always going to be 1:1? Is the RocketMQ filter config going to allow for some kind of transformation or mapping to happen?

It seems unlikely that in all deployments we would want a 1:1 mapping from topic to cluster, this seems very restrictive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch Yes, it's 1:1 now, I think 1:1 is the most suited way for our architecture.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering when other folks pick up RocketMQ whether they are going to feel the same way. Many Envoy operators have stylized ways of naming clusters that reflect things like their provenance, tenant, workload, version, etc. Would something like cluster-foo-v145-acme-tau-beta-lambda be a reasonable abstract topic to route to? Or are topics more like the-weather-today-in-france and you are expecting the clusters to match this logical naming?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch As a developer of Apache RocketMQ, I would say 1:1 mapping RocketMQ topic to envoy cluster is the most natural way. One topic is normally served by multiple broker servers to maintain availability and each broker serves multiple topics. The topic name itself can be any arbitrary valid name. Prod deployment usually follows some naming schemes to implement multi-tenant etc. From the perspective of application developer and operator, it is OK to treat a topic as a cluster of virtual shared servers.

" Would something like cluster-foo-v145-acme-tau-beta-lambda be a reasonable abstract topic to route to? Or are topics more like the-weather-today-in-france and you are expecting the clusters to match this logical naming?"

The name of the topic is up to the application developer/operators. In most cases, topic name follows TenantId-productName-moduleX-featureY paradigm. Application developers may version their topics.

@rshriram
Copy link
Member

theoretically, shouldnt we be able to add the addToClusterInterest to existing http router impl (when loading rds), tcp proxy, and all the other filters needed? If the interested clusters already exist, great. If they dont, then these could be batched to fire off a cds request right?

This does mean that route warming or listener warming may be delayed until cds fetch and cluster load, but we would now be able to stream resources in lds/rds/cds/eds order rather than sending all cds+eds, and then lds/rds.

Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
Signed-off-by: aaron-ai <yangkun.ayk@alibaba-inc.com>
@stale
Copy link

stale bot commented Jan 30, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jan 30, 2020
@aaron-ai
Copy link
Member Author

This PR would be suspended because of the vacation.

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jan 30, 2020
@stale
Copy link

stale bot commented Feb 10, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Feb 10, 2020
@stale
Copy link

stale bot commented Feb 17, 2020

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot closed this Feb 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants