
admin: move '/routes' handler implementation from rds_impl to admin_impl #2669

Closed
wants to merge 5 commits into from

Conversation

@jsedgwick jsedgwick commented Feb 24, 2018

Description:
This PR is setup for a solution to #2421. For that issue, the admin will have to aggregate config information from multiple xDS consumers. There, the model of adding admin interfaces via addHandler breaks down - there is no single source from which to add the handler. Since the route configs are necessary for the aggregated config, I decided to relocate the /routes implementation as a guinea pig for decoupling admin functionality from xDS functionality.*

This commit stack contains a few refactors, none of which I'm tied to; they were simply the path of least resistance to making the config objects available within AdminImpl.** LMK if you feel there are better ways.

I noticed that FactoryContext contained the bulk of Server::Instance's interface, so I split the context more cleanly between "server context" and "additional context for listeners" so that AdminImpl can pass its Server::Instance reference as a context to the connection manager. As a result, what is now called FactoryContext should probably be called ListenerContext, but that's an orthogonal refactor.

Note the difference between the old RdsImplTest and the new AdminTest. The relocated tests now test the admin functionality they purport to (i.e. how the configs are filtered and spat out) and don't rely on rds impl details.

Anywho, I will start building the current config endpoint on top of this.

Below, 'FWIW' means 'it's not worth very much'; I just like to start discussions as a way to learn this project. The opinions below are not meant to be assertive or authoritative, but rather food for [my] thought.

* FWIW, this means there are no longer any external users of Admin::addHandler/removeHandler. I'd be in favor of making those private absent a compelling new case for dynamism, but I don't care. That the /routes handler has to do sanitization is a bit of a warning sign against anyone adding an admin route anywhere. IMO: admin handlers, especially those that modify, should interact with server components through established interfaces, not vice versa.

** FWIW, I think the backflips required to get the RouteConfigProviderManager into AdminImpl showcase the fragility of a proliferation of singletons, but that's a separate conversation. IMO, the HttpConnectionManager, RouteConfigProviderManager, and other such singletons should be owned by the Server, unless they need to be shared outside the single Server object.

Risk Level: Low - no functional production changes

Testing: Unit, local testing of /routes endpoint

James Sedgwick added 5 commits February 23, 2018 16:23
…::Instance implements

Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
…figFactory

Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
Signed-off-by: James Sedgwick <jsedgwick@lyft.com>
@mattklein123
Member

@jsedgwick thanks for this. From a quick skim the code quality looks awesome. Well done! I have some high-level comments on the approach to solving #2421, though, which might change what we want to do here, so we should converge on that first. I'm realizing now that I should have left more notes on #2421. Sorry. :(

First, the reasoning behind the somewhat odd use of singletons as it relates to RDS is that Envoy is a modular, pluggable server. Filters/plugins can be added via static (and eventually dynamic) linking w/o the main server knowing anything about them. HTTP support is effectively implemented as a very complex plugin using the same interfaces as all the other plugins. Thus, since RDS is really implemented on behalf of a plugin, it can't be known to the main server ahead of time. (This is also why we can't make admin add/remove handler private: there are very likely private filters/plugins people are using that add admin handlers.)

Now, one could clearly argue that HTTP RDS is important enough that it should be a full-fledged thing hanging off the server interface much like CDS/LDS/EDS. However, this is a slippery slope, and not academic. We will have other APIs that are developed for use only in filter dynamic updates. Things that immediately come to mind include an RDS-like API for tcp_proxy, a streaming API for the forthcoming tap/dump filter, IP tagging updates, SSL client auth updates, etc. It becomes problematic to add each of these APIs to the top-level server object when ultimately we want the server to be fully modular and to allow compiling out functionality that is not desired for small form factor environments. (Again not academic; this has already been requested.)

So in terms of how we implement #2421 I think we have 3 options roughly:

Option 1

Continue with the approach you have started on, aimed at building a single /dump_config API. I agree this would mean having access to all API config sources, along with the bootstrap config, in the admin server. However, even if we do this, we have the unfortunate situation in which we are going to have to effectively hand-build a final "static" config from the current dynamic config. This is going to mean going through and inserting dynamic resources in places, and in the case of RDS, removing the RDS config and inserting an effective static config. This can clearly be done, but as you have already found, it's quite ugly, and will get even more ugly as we add other APIs in the future. It will basically mean making the admin API aware of all APIs that we eventually add on behalf of filters, which I think is not great. But definitely doable.

Option 2

  1. Abort this change and leave /routes where it was before, as a dynamic plugin.
  2. Move forward with cleaning up and adding config output for /bootstrap, /clusters, /listeners. (Cleanup is needed in the case of the current /listeners fake endpoint used for testing and the current /clusters maybe needs to get moved to /cluster_stats or something).

I believe that doing (2) only will basically satisfy the real user intent of #2421. It's true that it won't produce a fully functional static config, but it will provide a trivial way to dump the major parts of the server in a way that can be easily consumed and audited.

If (2) is not sufficient, I believe that we could pretty easily write an external tool in Python which would inspect the admin endpoint for the various config dump endpoints, and then actually construct the final static inferred config. This is clearly not as great as having /dump_config, but it would work and is pretty easy, and would avoid all the ugliness of trying to deal with the modular nature of the server inside the admin handler.

Option 3

This is a loosely formed option in my mind, but I think it's probably possible to do something fancy in which we allow plugins/filters to register as "config providers" in some way with the admin server. Then, when the /dump_config endpoint is called, the admin handler would actually take the bootstrap config and start dumping it, while allowing plugins to modify the dumped config as the DOM is being created. This would be very nice, as it would work for any API/plugin we add in the future. So this is probably the optimal solution, but the most difficult, and would require the most thinking.

Summary

I would recommend that we do option (2) right now as it's relatively quick and will immediately add a tremendous amount of value. We can also think about option (3) for the future. I'm not really in favor of option (1), mainly for modularity reasons. But in any case this is kind of complicated, and again, sorry I didn't provide more guidance. Would also like to hear if anyone else has any opinions on this. Thanks for working on this!

@jsedgwick
Author

jsedgwick commented Feb 24, 2018

@mattklein123 Thanks for the detailed feedback. I will noodle this further on the flight I'm about to board, but initial thoughts:

First, I think we were aiming to first provide the actual loaded configuration simply in xDS proto form. I had broached the topic (with @htuch I think?) of doing the work of mangling it all into a usable static bootstrap config for reproducibility reasons, but also brought up your idea of that being a separate piece of tooling. I agree.

Re: why admin handlers are dynamic, I figured that was for plugins and that totally makes sense, that's why I didn't make that change.
Thanks for explaining that HTTP RDS is meant to be implemented as a plugin; I was confused precisely because of that discrepancy from CDS/LDS/EDS.

Pluggability is awesome, but I don't think that precludes an ownership model that relies less on singletons. Can the server own a polymorphic collection of plugin implementations, where the plugin interface includes configuration dumps via proto and/or opaque callbacks? I.e., not dissimilar from how Admin owns dynamic handlers. This brings me to your option 3:

If these dynamic plugins are owned by the server, they can be accessed in the way I was hoping by other components like Admin through a common interface, which could include providing structured configuration. This is basically your option 3, and is actually something I sketched out as a first draft because it melds better with the vision of having Admin at least have the option of providing structured output for all components, present and future (including dynamic ones), at one endpoint, without having to know which plugins are enabled, instead of just at '/clusters', '/listeners', '/routes', etc.

To be clear, I 100% agree with keeping addHandler() and '/routes' dynamic per your explanations, but also with pursuing a mix of option 2 and option 3 for this change, without the "DOM modification" you mentioned, leaving that part to separate tooling or at least a separate PR. As a separate stretch goal, there's a path here towards reducing reliance on singletons.

I agree that this simplified option 3 is the most work, but if you're on board once we nail down a plan, I'm happy to do it.

This will all obviously require a new PR but I will leave this one open to continue discussion, unless you want to move discussion to the issue.

Please don't apologize for not providing guidance, it's ok to learn through iteration :) I like to converse through code and am not afraid of throwing code away. Ok, maybe a little afraid. But it's good for me.

minimal hand-wavy pseudocode:

struct ConfigurationDump {
  // edit: I see there's no polymorphism in proto, but you might use type URLs
  // a la DiscoveryResponse to provide config objects in a common way among all
  // components, such that admin can consume and dump them.
  virtual ~ConfigurationDump() = default;
  virtual SomeProto dumpConfig() = 0;
};

struct PluggableComponent : ConfigurationDump, AnyOtherInterfaces {};

class ClusterManager : public ConfigurationDump { /* ... */ };
class HttpConnectionManager : public PluggableComponent { /* ... */ };

class Server {
  std::vector<std::shared_ptr<PluggableComponent>> plugins;  // primary ownership here, not with singletons
  std::shared_ptr<ClusterManager> cluster_manager;           // and listener_manager, etc.
  std::vector<std::shared_ptr<ConfigurationDump>> dumpers;   // could also stick ClusterManager etc. in here for admin's convenience
};

class Admin {
  void handlerDumpConfig() {
    for (const auto& config_info : server.dumpers()) {
      theseNamesAreTerribleButHopefullyYouGetMyDrift();
      myPlaneIsTakingOffBye();
    }
  }
};

@mattklein123
Member

Yes, we can register a list of interfaces (ConfigDumper, etc.) with the server if need be, but it occurs to me that there is a really good and clean solution to this that is easy to implement and will not require much churn at all. It's a variant of your proposal that I think would turn out pretty sweet. What if we did this:

  1. Add a new handler registration function to admin such as addConfigDumpHandler(ConfigDumpProvider& provider);
  2. This new handler is a special handler which effectively registers a config snippet for dumping with the /dump_config endpoint.
  3. The /dump_config endpoint is special and will use all of the registered dump providers to produce output. This would allow us to do pretty neat things like:
/dump_config?list
/dump_config?all
/dump_config?name=bootstrap

etc. and in the future even have different output types that could be consumed by external scripts for transforming a dynamic config into a fully static config (swapping out RDS, replacing all dynamic clusters w/ static ones, etc.).

If we do the above, I think we can get rid of /routes. Then, the config dump providers that are guaranteed to be registered would be bootstrap, clusters, and listeners. Then, if RDS is in use, you get rds. Then in the future we can trivially add other config dumpers as APIs are added.

There are details to work out re: the interfaces, what the printing looks like, etc. but I think this would work out pretty well. WDYT?

@jsedgwick
Author

@mattklein123 Ya, something like that is what I had in mind, I'll put it together.

My one concern with your exact approach: what about the next time someone wants a handler that needs access to data/state from multiple components? Do you add another addXYZHandler() interface? I would rather set up some sort of ConfigurationTracker object associated with the admin that encapsulates all this functionality and calls addHandler() itself. It's not that different but JFYI what to expect.

Re: my other thoughts on ownership I'll set it up in a way that's compatible with any future changes there.

Question before I go for it though: what's the synchronization situation with global objects like Server and Admin? e.g. addHandler() docs don't say anything about thread safety, and while the implementation doesn't look thread safe, there doesn't look to be anything stopping/advising someone not to call it from some rando thread.

P.S. One issue w.r.t. getting rid of current endpoint(s) is the difference between the proposed dumping of just the xDS proto configs and the current, more specific printouts that include runtime state and so on. So maybe we can just bolt this on and decide if the others are redundant after? Or give ConfigDumpProviders the ability to provide structured config dumps and component-specific dumps as exist now. I like the latter but it doesn't have to be in the first PR.

@mattklein123
Member

My one concern with your exact approach: what about the next time someone wants a handler that needs access to data/state from multiple components? Do you add another addXYZHandler() interface? I would rather set up some sort of ConfigurationTracker object associated with the admin that encapsulates all this functionality and calls addHandler() itself. It's not that different but JFYI what to expect.

I don't think this extra complexity is worth it. Let's design for what we need right now. Not for what we might need in the future. I would do roughly what I outlined. If we need something else in the future we can refactor. I guess what I'm really saying is that I don't mind the specific details, but doing this change should not require modifying the server interface. If you want to do something different rooted off of admin feel free.

Question before I go for it though: what's the synchronization situation with global objects like Server and Admin? e.g. addHandler() docs don't say anything about thread safety, and while the implementation doesn't look thread safe, there doesn't look to be anything stopping/advising someone not to call it from some rando thread.

Everything happens on the main thread in this path. You don't need any synchronization.

P.S. One issue w.r.t. getting rid of current endpoint(s) is the difference between the proposed dumping of just the xDS proto configs and the current, more specific printouts that include runtime state and so on. So maybe we can just bolt this on and decide if the others are redundant after? Or give ConfigDumpProviders the ability to provide structured config dumps and component-specific dumps as exist now. I like the latter but it doesn't have to be in the first PR.

The only one I'm suggesting getting rid of is /routes which I think is made fully redundant by this new endpoint. The rest should stay for now.

@jsedgwick
Author

jsedgwick commented Feb 27, 2018 via email

@mattklein123
Member

Sgtm. To be clear, yeah, this tracker thing would hang off admin, not server, which is where it belongs. Imo simpler, not more complex, but I'll show you what I mean with code, not my crappy words. Hopefully will carve some time tomorrow.

Sure sounds good. As long as it's a registration system off of admin SGTM.

@jsedgwick jsedgwick closed this Feb 27, 2018
Shikugawa pushed a commit to Shikugawa/envoy that referenced this pull request Mar 28, 2020
…oyproxy#2669)

* feat(edges): add support for reporting two batches (new vs full)

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* cleanup

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* fix comment in test

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>

* address review feedback

Signed-off-by: Douglas Reid <douglas-reid@users.noreply.github.com>
jpsim pushed a commit that referenced this pull request Nov 28, 2022
* Update Envoy
* Remove uses of c-ares
* Register static cluster extension
* Fix KV store APIs

Co-authored-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: JP Simard <jp@jpsim.com>