Implement openflow metrics query interface #1140

Merged
merged 1 commit into from
Sep 11, 2020

Conversation

weiqiangt
Contributor

@weiqiangt weiqiangt commented Aug 24, 2020

  • Add flows that count packets and sessions for each NetworkPolicy rule.
  • Provide functions to query the metrics of each NetworkPolicy rule.

@antrea-bot
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@weiqiangt weiqiangt force-pushed the policy-metrics branch 3 times, most recently from 28f25c2 to 2ae1b29 Compare August 25, 2020 04:40
@weiqiangt weiqiangt changed the title [WIP]Implement policy metrics Implement policy metrics Aug 27, 2020
@weiqiangt
Contributor Author

/test-all

@codecov-commenter

codecov-commenter commented Aug 28, 2020

Codecov Report

Merging #1140 into master will increase coverage by 0.08%.
The diff coverage is 65.62%.

@@            Coverage Diff             @@
##           master    #1140      +/-   ##
==========================================
+ Coverage   56.20%   56.29%   +0.08%     
==========================================
  Files         105      108       +3     
  Lines       11550    12027     +477     
==========================================
+ Hits         6492     6770     +278     
- Misses       4490     4669     +179     
- Partials      568      588      +20     
Flag                Coverage Δ
#integration-tests  46.12% <31.87%> (-1.22%) ⬇️
#unit-tests         42.53% <61.53%> (+0.87%) ⬆️

Impacted Files                                          Coverage Δ
pkg/agent/openflow/client.go                            35.60% <ø> (ø)
pkg/agent/types/networkpolicy.go                        16.66% <0.00%> (-8.34%) ⬇️
pkg/ovs/openflow/ofctrl_action.go                       83.79% <ø> (+1.58%) ⬆️
pkg/agent/openflow/pipeline.go                          57.11% <54.09%> (+0.31%) ⬆️
pkg/agent/openflow/network_policy.go                    73.61% <66.17%> (-0.84%) ⬇️
pkg/ovs/openflow/ofctrl_builder.go                      74.91% <100.00%> (+2.60%) ⬆️
...g/controller/networkpolicy/store/appliedtogroup.go   84.84% <0.00%> (-3.91%) ⬇️
pkg/agent/controller/networkpolicy/reconciler.go        69.07% <0.00%> (-3.16%) ⬇️
pkg/agent/controller/networkpolicy/cache.go             77.31% <0.00%> (-2.26%) ⬇️
pkg/apiserver/storage/ram/store.go                      80.39% <0.00%> (-1.31%) ⬇️
... and 9 more

Member

@tnqn tnqn left a comment

Could you update the title of the PR and the commit to something more specific? "Implement policy metrics" is too high-level and may be confusing, since this PR covers only one part of policy metrics. Maybe "Implement openflow metrics query interface"?

pkg/agent/openflow/network_policy.go Show resolved Hide resolved
@tnqn tnqn requested a review from wenyingd September 2, 2020 14:57
@tnqn tnqn mentioned this pull request Sep 2, 2020
3 tasks
@tnqn tnqn added this to the Antrea v0.10.0 release milestone Sep 2, 2020
@antoninbas
Contributor

@weiqiangt @tnqn Apologies. I didn't realize that this PR was part of #1172. I left some relevant comments in that other PR.

@weiqiangt weiqiangt changed the title Implement policy metrics Implement openflow metrics query interface Sep 3, 2020
@weiqiangt
Contributor Author

@weiqiangt @tnqn Apologies. I didn't realize that this PR was part of #1172. I left some relevant comments in that other PR.

Thanks for reviewing. I have updated the code according to your comments.

Member

@srikartati srikartati left a comment

Thanks for the change. I have a couple of comments.

m := types.RuleMetric{}
pkts, _ := strconv.ParseUint(segs[1][strings.Index(segs[1], "=")+1:], 10, 32)
m.Packets = pkts
if strings.Contains(segs[4], "+") {
Member

@srikartati srikartati Sep 8, 2020

An example of the format that covers this scenario would be good here.

Contributor Author

Thanks, added the example format.
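
For illustration, here is a minimal sketch of pulling counters out of one dumped flow line. The sample line in the usage note and the helper name `parseCounter` are assumptions, not the format or code used in this PR. The sketch looks fields up by key rather than by position, so a change in field order does not silently shift the values.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCounter returns the unsigned value of a "key=value" field in a
// comma-separated flow dump line. Hypothetical helper, not part of Antrea.
func parseCounter(flow, key string) (uint64, error) {
	prefix := key + "="
	for _, seg := range strings.Split(flow, ",") {
		seg = strings.TrimSpace(seg)
		if strings.HasPrefix(seg, prefix) {
			// Parse as 64-bit so large byte counts are not rejected.
			return strconv.ParseUint(strings.TrimPrefix(seg, prefix), 10, 64)
		}
	}
	return 0, fmt.Errorf("field %q not found in flow %q", key, flow)
}
```

Given an assumed line such as `table=101, n_packets=12, n_bytes=1176, priority=200,ct_state=+new,ct_label=0x2/0xffffffff actions=goto_table:105`, `parseCounter(line, "n_packets")` returns 12 and `parseCounter(line, "n_bytes")` returns 1176.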

}

func parseMetricFlow(flow string) (uint32, types.RuleMetric) {
if strings.Contains(flow, "reg0") {
Member

nit: would adding a drop identifier make this easier to read?

Contributor Author

Thanks, added.

@@ -184,6 +184,33 @@ func (b *ofFlowBuilder) MatchCTMarkMask(mask uint32) FlowBuilder {
return b
}

func (b *ofFlowBuilder) MatchCTLabel(high, low uint64) FlowBuilder {
Member

Is this used somewhere?

Contributor Author

Nope, removed.

b.ofFlow.Match.CtLabelHiMask <<= rng[0] - 64
b.ofFlow.Match.CtLabelHiMask <<= 127 - rng[1]
b.ofFlow.Match.CtLabelHiMask <<= 127 - rng[1]
b.ofFlow.Match.CtLabelHiMask = 0
Member

Is this supposed to be CtLabelHiMask as the range doesn't involve LabelLo value?

Contributor Author

Thanks for pointing it out. I have updated this part and extracted the bit operations into a separate function covered by a unit test.
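
As a rough sketch of what such an extracted helper could look like (not the function added in this PR), the code below computes the high and low 64-bit masks for an inclusive bit range of a 128-bit label. The name `labelMasks` and the bit numbering, with bit 0 as the least significant bit of the low word, are assumptions.

```go
package main

import "fmt"

// labelMasks returns the high and low 64-bit masks covering bits
// start..end (inclusive) of a 128-bit value, where bit 0 is the least
// significant bit of the low word. Hypothetical helper for illustration.
func labelMasks(start, end uint32) (hi, lo uint64, err error) {
	if start > end || end > 127 {
		return 0, 0, fmt.Errorf("invalid bit range [%d, %d]", start, end)
	}
	for bit := start; bit <= end; bit++ {
		if bit < 64 {
			lo |= uint64(1) << bit // bit lives in the low word
		} else {
			hi |= uint64(1) << (bit - 64) // bit lives in the high word
		}
	}
	return hi, lo, nil
}
```

A range that lies entirely below bit 64, such as 32..63, leaves the high mask at zero, which is the case the review comment above is asking about. A loop is used instead of shift arithmetic to keep the sketch easy to verify in a unit test.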

Contributor

@antoninbas antoninbas left a comment

some minor follow-up comments


func (c *client) NetworkPolicyMetrics() map[uint32]*types.RuleMetric {
result := map[uint32]*types.RuleMetric{}
egressFlows, _ := c.ovsctlClient.DumpTableFlows(uint8(EgressMetricTable))
Contributor

I commented in #1172 (comment). I wasn't suggesting making it a field in the client, just having a local variable for it.

Contributor Author

Done.

Comment on lines 1386 to 1401
for _, flows := range [][]string{egressFlows, ingressFlows} {
    for _, flow := range flows {
        if strings.Contains(flow, "priority=0") {
            continue
        }
        ruleID, metric := parseMetricFlow(flow)
        // We have two flows for each allow rule. One with ct_state=+new matching calculates session numbers,
        // and first packets, another ct_state=-new flow is used to calculate following packets. We need to merge
        // metrics of these two flows to get the right number.
        if accMetric, ok := result[ruleID]; ok {
            accMetric.Merge(metric)
        } else {
            result[ruleID] = &metric
        }
    }
}
Contributor

I actually thought you were going to have something like this:

collectMetricsFromFlows := func(flows []string) {
    for _, flow := range flows {
        // ...
    }
}

collectMetricsFromFlows(egressFlows)
collectMetricsFromFlows(ingressFlows)

(not a big fan of the [][]string and double for loop)

Contributor Author

Sure, updated.
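
A self-contained sketch of the closure-based shape suggested above might look like the following. The `parsedFlow` type stands in for the output of `parseMetricFlow`, and the field-wise sum in `Merge` is an assumption about its behavior; neither is taken from the merged code.

```go
package main

// RuleMetric mirrors the counters discussed in this PR.
type RuleMetric struct {
	Bytes, Packets, Sessions uint64
}

// Merge adds the counters of m1 into m (assumed semantics: field-wise sum).
func (m *RuleMetric) Merge(m1 *RuleMetric) {
	m.Bytes += m1.Bytes
	m.Packets += m1.Packets
	m.Sessions += m1.Sessions
}

// parsedFlow is a hypothetical stand-in for the result of parsing one flow.
type parsedFlow struct {
	RuleID uint32
	Metric RuleMetric
}

// collectMetrics merges per-flow counters into one entry per rule ID.
func collectMetrics(egress, ingress []parsedFlow) map[uint32]*RuleMetric {
	result := map[uint32]*RuleMetric{}
	collect := func(flows []parsedFlow) {
		for i := range flows {
			m := flows[i].Metric // copy so the result never aliases the input
			if acc, ok := result[flows[i].RuleID]; ok {
				acc.Merge(&m)
			} else {
				result[flows[i].RuleID] = &m
			}
		}
	}
	collect(egress)
	collect(ingress)
	return result
}
```

With this shape, a ct_state=+new flow for rule 1 carrying Bytes 100, Packets 1, Sessions 1 and a ct_state=-new flow for the same rule carrying Bytes 900, Packets 9, Sessions 0 merge into a single entry with Bytes 1000, Packets 10, Sessions 1.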

pkg/agent/openflow/network_policy.go Outdated Show resolved Hide resolved
pkg/agent/openflow/network_policy_test.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
pkg/ovs/openflow/interfaces.go Outdated Show resolved Hide resolved
@weiqiangt
Contributor Author

/test-all

pkts, _ := strconv.ParseUint(segs[1][strings.Index(segs[1], "=")+1:], 10, 32)
m.Packets = pkts
m.Sessions = pkts
bytes, _ := strconv.ParseUint(segs[2][strings.Index(segs[2], "=")+1:], 10, 32)
Member

Won't this overflow once it exceeds 4G bytes? I think all three metrics should be 64-bit.

Contributor Author

Thanks for pointing it out, updated.
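
To make the bit-size concern concrete: with a bitSize of 32, `strconv.ParseUint` does not wrap around. It returns a range error together with the largest 32-bit value, so a caller that discards the error would see the counter stuck at 4294967295. The small wrapper below, whose name `parseCount` is hypothetical, is a sketch that shows the difference.

```go
package main

import "strconv"

// parseCount parses a decimal counter with the given bit size and reports
// whether the value fit. Hypothetical helper for illustration.
func parseCount(s string, bitSize int) (uint64, bool) {
	v, err := strconv.ParseUint(s, 10, bitSize)
	return v, err == nil
}
```

`parseCount("5000000000", 32)` returns 4294967295 and false, while `parseCount("5000000000", 64)` returns 5000000000 and true.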

pkg/agent/openflow/pipeline.go Show resolved Hide resolved
EgressReg regType = 5
IngressReg regType = 6
TraceflowReg regType = 9 // Use reg9[28..31] to store traceflow dataplaneTag.
cnpDropConjunctionIDReg = endpointIPReg // marksRegServiceNeedLB indicates a packet need to do service selection.
Member

The comment doesn't match the code. I also think using the register number directly is more straightforward, since people may be confused by endpointIPReg. I initially thought the two were associated, but they just share the register, right?

Contributor Author

Sure, updated.

if !ingress {
metricTableID = EgressMetricTable
offset = 32
labelRange = binding.Range{32, 63}
Member

How about declaring the two label ranges as global variables to avoid repeated allocation, for example low32Range and high32Range?

Contributor Author

Updated.
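
A minimal sketch of that suggestion, assuming a local `Range` type shaped like `binding.Range` (two inclusive bit positions) and a hypothetical selector `labelRangeFor`; the variable names come from the review comment, the rest is illustrative only.

```go
package main

// Range is an inclusive bit range; assumed here to mirror binding.Range.
type Range [2]uint32

// Package-level ranges, so each range literal is written exactly once.
var (
	low32Range  = Range{0, 31}
	high32Range = Range{32, 63}
)

// labelRangeFor returns the ct_label bits that hold the rule ID for a
// direction: bits 0..31 for ingress rules, bits 32..63 for egress rules.
func labelRangeFor(ingress bool) Range {
	if ingress {
		return low32Range
	}
	return high32Range
}
```

Naming the ranges also documents which half of the label belongs to which direction at the point of use.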

conjReg = EgressReg
labelRange = binding.Range{32, 63}
Member

ditto

Contributor Author

Updated.

pkg/agent/openflow/pipeline.go Show resolved Hide resolved
Bytes, Packets, Sessions uint64
}

func (m *RuleMetric) Merge(m1 RuleMetric) {
Member

Use pointer to avoid struct copy?

Contributor Author

Sure, updated.

@weiqiangt weiqiangt force-pushed the policy-metrics branch 2 times, most recently from 9211e3c to 7f55ced Compare September 10, 2020 13:53
// cnpDropConjunctionIDReg reuses reg3 which will also be used for storing endpoint IP to store the rule ID. Since
// the service selection will finish when a packet hitting NetworkPolicy related rules, there is no conflict.
cnpDropConjunctionIDReg regType = 3
marksRegServiceNeedLB uint32 = 0b001 // marksRegServiceNeedLB indicates a packet need to do service selection.
Member

nit: why change the comment position? I think the previous placement is more common and neater.

Contributor Author

Reverted.

pkg/agent/openflow/pipeline.go Show resolved Hide resolved
- Add flows to calculate packet and session numbers of each NetworkPolicy
  rule.
- Provide functions to query metrics of each NetworkPolicy rule.

Signed-off-by: Weiqiang Tang <weiqiangt@vmware.com>
Member

@tnqn tnqn left a comment

thanks @weiqiangt, LGTM

@weiqiangt
Contributor Author

/test-all

@weiqiangt weiqiangt merged commit be79453 into antrea-io:master Sep 11, 2020
Member

@srikartati srikartati left a comment

I understand that this is merged. Trying to use this patch for adding netpol info in flow records. Have a couple of questions.

What happens to ct_label if both cluster network policy rule and k8s rule are applied on to one flow? Do we support this scenario?

bytes, _ := strconv.ParseUint(segs[2][strings.Index(segs[2], "=")+1:], 10, 64)
m.Bytes = bytes
idRaw := segs[5][strings.Index(segs[5], "0x")+2 : strings.Index(segs[5], "/")]
if len(idRaw) > 8 { // only 32 bits are valid.
Member

What happens to egress rule if only 32 bits are valid? I see that in ct_label we use 64 bits like below:

// We use the 0..31 bits of the ct_label to store the ingress rule ID and use the 32..63 bits to store the
// egress rule ID.

Contributor Author

@weiqiangt weiqiangt Sep 16, 2020

Thanks for pointing it out, this should be a bug. I will create a PR to fix it.

Contributor Author

When matching an egress rule, the pattern always has 8 trailing zeros, while the matching pattern of an ingress rule never exceeds 8 characters in length.
In other words, if the length of idRaw exceeds 8, the flow must be for an egress rule; otherwise it is for an ingress rule.

Member

I see.. makes sense. The comment was a bit confusing and misleading.
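
The length rule explained above can be sketched as follows. The function name `decodeRuleID` and the sample field strings in the usage note are assumptions for illustration, and the sketch assumes the hex value is printed without leading zeros; it is not the code merged in this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// decodeRuleID extracts the rule ID from a ct_label match such as
// "ct_label=0x2/0xffffffff". A hex value longer than 8 digits is treated
// as an egress ID stored in bits 32..63, so its 8 trailing digits are
// dropped. Hypothetical helper for illustration.
func decodeRuleID(field string) (id uint32, egress bool, err error) {
	start := strings.Index(field, "0x")
	end := strings.Index(field, "/")
	if start < 0 || end < start+2 {
		return 0, false, fmt.Errorf("no ct_label value/mask in %q", field)
	}
	idRaw := field[start+2 : end]
	if len(idRaw) > 8 {
		egress = true
		idRaw = idRaw[:len(idRaw)-8] // strip the low 32 bits (8 hex digits)
	}
	v, err := strconv.ParseUint(idRaw, 16, 32)
	if err != nil {
		return 0, false, err
	}
	return uint32(v), egress, nil
}
```

For example, `decodeRuleID("ct_label=0x200000000/0xffffffff00000000")` yields ID 2 with egress set to true, and `decodeRuleID("ct_label=0x2/0xffffffff")` yields ID 2 with egress set to false.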

@weiqiangt
Contributor Author

I understand that this is merged. Trying to use this patch for adding netpol info in flow records. Have a couple of questions.

What happens to ct_label if both cluster network policy rule and k8s rule are applied on to one flow? Do we support this scenario?

For now, only the ID of the last rule which takes effect will be calculated.

@weiqiangt weiqiangt deleted the policy-metrics branch September 16, 2020 08:13
@srikartati
Member

I understand that this is merged. Trying to use this patch for adding netpol info in flow records. Have a couple of questions.
What happens to ct_label if both cluster network policy rule and k8s rule are applied on to one flow? Do we support this scenario?

For now, only the ID of the last rule which takes effect will be calculated.

Okay. Maybe using the rest of 64bits is probably the right way I guess, right?

@srikartati
Member

I understand that this is merged. Trying to use this patch for adding netpol info in flow records. Have a couple of questions.
What happens to ct_label if both cluster network policy rule and k8s rule are applied on to one flow? Do we support this scenario?

For now, only the ID of the last rule which takes effect will be calculated.

Okay. Maybe using the rest of 64bits is probably the right way I guess, right?

Please ignore the above comment. I confirmed with @abhiraut that only one rule either from cluster network policy or K8s network policy is being applied.

GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Sep 22, 2020
- Add flows to calculate packet and session numbers of each NetworkPolicy
  rule.
- Provide functions to query metrics of each NetworkPolicy rule.

Signed-off-by: Weiqiang Tang <weiqiangt@vmware.com>
7 participants