Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display resources and available resources for nodes #2382

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

denverdino
Copy link
Contributor

Signed-off-by: Li Yi denverdino@gmail.com

In the daily operation works, user need to monitor the resource allocation in the swarmkit cluster, but it is not easy to get such infomation directly from API.

This PR provides the enhancement on the API and CLI for such requirement.

  • In the API for listing nodes and inspecting node, add the bool flag for available_resources in request body and return the available resources in response
  • Add -r, --available-resources parameters to swarmctl node ls and swarmctl node inspect commands to display the resources and available resources for nodes.
$ swarmctl node ls
ID                         Name    Membership  Status  Availability  Manager Status  CPUs  Memory
--                         ----    ----------  ------  ------------  --------------  ----  ------
2xxq1a7pl71v2icmyafbgi4h8  node-1  ACCEPTED    READY   ACTIVE        REACHABLE *     4     3.9 GiB
jz7hn441ktdqv73hmseo1cx5h  node-3  ACCEPTED    READY   ACTIVE                        4     3.9 GiB
r1c4azue5lxiefgvp56hmk1hz  node-2  ACCEPTED    READY   ACTIVE                        4     3.9 GiB
$ swarmctl node ls -r
ID                         Name    Membership  Status  Availability  Manager Status  CPUs  Memory   Available CPUs  Available Memory
--                         ----    ----------  ------  ------------  --------------  ----  ------   --------------  ----------------
2xxq1a7pl71v2icmyafbgi4h8  node-1  ACCEPTED    READY   ACTIVE        REACHABLE *     4     3.9 GiB  1.5             1.9 GiB
jz7hn441ktdqv73hmseo1cx5h  node-3  ACCEPTED    READY   ACTIVE                        4     3.9 GiB  4               3.9 GiB
r1c4azue5lxiefgvp56hmk1hz  node-2  ACCEPTED    READY   ACTIVE                        4     3.9 GiB  3.5             2.9 GiB

Related issue #1344

image

@codecov
Copy link

codecov bot commented Sep 21, 2017

Codecov Report

Merging #2382 into master will decrease coverage by 6.15%.
The diff coverage is 6.38%.

@@            Coverage Diff             @@
##           master    #2382      +/-   ##
==========================================
- Coverage   66.58%   60.43%   -6.16%     
==========================================
  Files          93      128      +35     
  Lines       17908    26453    +8545     
==========================================
+ Hits        11924    15986    +4062     
- Misses       5016     9065    +4049     
- Partials      968     1402     +434

s.store.View(func(tx store.ReadTx) {
node = store.GetNode(tx, request.NodeID)
if request.AvailableResources && node != nil && node.Description != nil && node.Description.Resources != nil {
tasks, err = store.FindTasks(tx, store.ByNodeID(request.NodeID))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be filtered by state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the node status?

@@ -183,6 +184,7 @@ message ListNodesRequest {
}

Filters filters = 1;
bool available_resources = 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need to plumb this flag. Just include the resources with the standard request.

If that isn't going to work, we need to come up with a more generic way to handle selective inclusion of data, such as field paths. This approach gets nasty if you just keep adding booleans.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe the challenge is this operation need calculate the available resources by tasks status. It is not a simple filtering. I add the flag just to avoid the performance impact for simple listing/getting node status

Copy link
Contributor

@stevvooe stevvooe Sep 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@denverdino Yes, there is expense. Right now, we don't have a selective inclusion model when fetching resources. Adding booleans for each addition will become cumbersome. The node endpoint is supposed to return the node's data, as is, so unless the bookkeeping is kept directly on the node, this isn't the right model.

In the past, we don't really have these endpoints do calculations of this sort. This really something that we have pushed to the client in the past. If that won't work, this might be better as a separate endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe Such calculation in client side will introduce many cost for network, it is better in server-side. And It is a pretty common requirement, most of user don't know how to do that properly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a boolean to control inclusion/exclusion of an expensive calculated field is cleaner than introducing a new set of endpoints. We could possibly formalize this with some kind of Features message that contains feature flags for the request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@denverdino Can you quantify it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, but if we have hundreds of tasks on each node it will take pretty much time to fetch tasks, and filtering. If we do that in client side it will take moretime on serialization and transportation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, if client side is not an option, and I'm not convinced that its not, because you haven't benchmarked it, the options are the following:

  1. Build out a model for selective inclusion of fields that will scale to other types.
  2. Have an endpoint that focuses on providing aggregate views.

Let's not sacrifice the design constraints for expediency. The above are reasonable approaches to this problems. I'm in favor of #2, as I think it complements the the current API model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe any suggestion for the endpoint? thx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetNodeResourceUsage or GetNodeStatus. I think the first step here is to look at what this endpoint might encompass. Basically, we are trying to see some of the aggregates that the orchestrator sees but are not obvious from the single node record. What are some other facts about orchestration state that we may want to expose?

@allencloud
Copy link
Contributor

Conflict happens. A rebase needed. @denverdino

Signed-off-by: Li Yi <denverdino@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants