Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add l4netlb metrics #1678

Merged
merged 1 commit into from
Mar 21, 2022
Merged

Add l4netlb metrics #1678

merged 1 commit into from
Mar 21, 2022

Conversation

kl52752
Copy link
Contributor

@kl52752 kl52752 commented Feb 24, 2022

No description provided.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 24, 2022
@k8s-ci-robot
Copy link
Contributor

Hi @kl52752. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 24, 2022
@kl52752
Copy link
Contributor Author

kl52752 commented Feb 24, 2022

/assign @bowei

@freehan
Copy link
Contributor

freehan commented Mar 2, 2022

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 2, 2022
case loadbalancers.SyncTypeDelete:
// if service is successfully deleted, remove it from cache
if result.Error == nil {
klog.V(6).Infof("External L4 Loadbalancer for Service %s deleted, removing its state from metrics cache", namespacedName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V(4)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

prometheus.HistogramOpts{
Name: L4netlbLatencyMetricName,
Help: "Latency of an L4 NetLB Sync",
// custom buckets - [30s, 60s, 120s, 240s(4min), 480s(8min), 960s(16m), +Inf]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are sure the smallest bucket contains <30s is good enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, we need to run more tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experiment, 33 seconds was the shortest time. But most times was around 40-45 sec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to start with 15s

}
switch result.SyncType {
case loadbalancers.SyncTypeCreate, loadbalancers.SyncTypeUpdate:
klog.V(6).Infof("External L4 Loadbalancer for Service %s ensured, updating its state %v in metrics cache", namespacedName, result.MetricsState)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V(4)? why V(6)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

},
l4LBSyncLatencyMetricsLabels,
)
l4NetLBSyncErrorCount = prometheus.NewCounterVec(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you guys found out how the counter got handled in the case of master container restart?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I have jus checked the following metrics:

number_of_ingresses
number_of_service_ports

They were NOT exported with value "0", just after registration in prometheus
They were exported once I created load balancers.

I am not sure if this is a prometheus feature or prom-to-sd

It seems values are exported only after we set some value

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought errors would be logged as a delta?

@freehan freehan self-assigned this Mar 3, 2022
func GetL4NetLBErrorMetric(t *testing.T) *L4LBErrorMetricInfo {
return getL4LBErrorMetric(t, metrics.L4netlbErrorMetricName)
}

func getL4LBErrorMetric(t *testing.T, metricName string) *L4LBErrorMetricInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo L4LBE
L4ELB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) It is L4-LB-Error i can change to getL4lbErrorMetric?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's ok

@@ -342,6 +342,11 @@ func GetL4ILBLatencyMetric(t *testing.T) *L4LBLatencyMetricInfo {
return getL4LatencyMetric(t, metrics.L4ilbLatencyMetricName)
}

// GetL4NetLBLatencyMetric gets the current state of the l4_netlb_sync_duration_seconds metric.
func GetL4NetLBLatencyMetric(t *testing.T) *L4LBLatencyMetricInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should return error rather than pass 't'
or t.Helper() a least
and preferably renamed to MustGetL4NetLBLatencyMetric
go/go-style/decisions#must-functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, and no change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, forget to push, now it is done

@@ -81,6 +96,16 @@ func checkAnnotations(result *L4NetLBSyncResult, l4netlb *L4NetLB) error {
return nil
}

func checkMetrics(m metrics.L4NetLBServiceState, isManaged, isPremium bool) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put this at the end of file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
}

func createUsersStatusIP(fakeGCE *gce.Cloud, region string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createUsersStatusIP I don't get a name

what UsersStatus means?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was supposed to be Static :D

const (
managed = true
unmanaged = false
premium = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

premiumTier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

managed = true
unmanaged = false
premium = true
standard = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standardTier?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"k8s.io/ingress-gce/pkg/test"
"k8s.io/ingress-gce/pkg/utils"
namer_util "k8s.io/ingress-gce/pkg/utils/namer"
"k8s.io/legacy-cloud-providers/gce"
)

const (
managed = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

managedIP or managedAddress

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"k8s.io/ingress-gce/pkg/test"
"k8s.io/ingress-gce/pkg/utils"
namer_util "k8s.io/ingress-gce/pkg/utils/namer"
"k8s.io/legacy-cloud-providers/gce"
)

const (
managed = true
unmanaged = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unmanagedIP or unmanagedAddress

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -124,6 +134,8 @@ type ControllerMetrics struct {
negMap map[string]NegServiceState
// l4ILBServiceMap is a map between service key and L4 ILB service state.
l4ILBServiceMap map[string]L4ILBServiceState
// l4NetLBServiceMap is a map between service key and L4 NetLB service state.
l4NetLBServiceMap map[string]L4NetLBServiceState
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it would be better to use sync.Map
for every map in ControllerMetrics
maybe leave a
// todo ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still not todo...

Copy link
Contributor Author

@kl52752 kl52752 Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added TODO in line 143

//TODO(kl52752) remove mutex and change map to sync.map
	sync.Mutex
	

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you leave a TODO, file an issue. Don't attach to a person..

case loadbalancers.SyncTypeCreate, loadbalancers.SyncTypeUpdate:
klog.V(6).Infof("External L4 Loadbalancer for Service %s ensured, updating its state %v in metrics cache", namespacedName, result.MetricsState)
lc.ctx.ControllerMetrics.SetL4NetLBService(namespacedName, result.MetricsState)
l4metrics.PublishNetLBSyncMetrics(result.Error == nil, result.SyncType, result.GCEResourceInError, utils.GetErrorType(result.Error), result.StartTime)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not pass result struct as argument
l4metrics.PublishNetLBSyncMetrics(result)

it this couses circular dependency then maybe L4NetLBSyncResult is declared in wrong package?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utils.GetErrorType(result.Error)
can also be called inside l4metrics.PublishNetLBSyncMetrics(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do it in new review because I think that we can separate fields that are needed only for metrics from sync result and I think that it will be bigger change. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

@kl52752 kl52752 requested a review from freehan March 10, 2022 14:13
Copy link
Contributor

@cezarygerard cezarygerard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please address or apply all my comments

@kl52752 kl52752 force-pushed the metrics-l4-rbs branch 2 times, most recently from ec7f4d8 to 4b47b04 Compare March 15, 2022 08:56
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 15, 2022
@kl52752
Copy link
Contributor Author

kl52752 commented Mar 15, 2022

/retest

@cezarygerard
Copy link
Contributor

/lgtm

@k8s-ci-robot
Copy link
Contributor

@cezarygerard: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -338,19 +338,23 @@ type L4LBLatencyMetricInfo struct {
}

// GetL4ILBLatencyMetric gets the current state of the l4_ilb_sync_duration_seconds metric.
func GetL4ILBLatencyMetric(t *testing.T) *L4LBLatencyMetricInfo {
return getL4LatencyMetric(t, metrics.L4ilbLatencyMetricName)
func GetL4ILBLatencyMetric() (error, *L4LBLatencyMetricInfo) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

golang convention is to put the error as the last returning value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

func getL4LatencyMetric(t *testing.T, metricName string) *L4LBLatencyMetricInfo {
// GetL4NetLBLatencyMetric gets the current state of the l4_netlb_sync_duration_seconds metric.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return getL4LatencyMetric(metrics.L4netlbLatencyMetricName)
}

func getL4LatencyMetric(metricName string) (error, *L4LBLatencyMetricInfo) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -416,16 +420,21 @@ type L4LBErrorMetricInfo struct {
}

// GetL4ILBErrorMetric gets the current state of the l4_ilb_sync_error_count.
func GetL4ILBErrorMetric(t *testing.T) *L4LBErrorMetricInfo {
return getL4LBErrorMetric(t, metrics.L4ilbErrorMetricName)
func GetL4ILBErrorMetric() (error, *L4LBErrorMetricInfo) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@freehan freehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall
just nits

@@ -58,12 +60,30 @@ var (
},
l4LBSyncErrorMetricLabels,
)
l4NetLBSyncLatency = prometheus.NewHistogramVec(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mind just add a brief comment on the metrics?

Easier to understand

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes done, and I fixed the buckets count because previously I changes it to 15s but for ILB metrics 😄

@kl52752 kl52752 requested a review from freehan March 21, 2022 08:09
Copy link
Contributor

@freehan freehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 21, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cezarygerard, freehan, kl52752

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 21, 2022
@k8s-ci-robot k8s-ci-robot merged commit cc29101 into kubernetes:master Mar 21, 2022
listers "k8s.io/client-go/listers/core/v1"
"k8s.io/client-go/tools/cache"
"k8s.io/ingress-gce/pkg/annotations"
"k8s.io/ingress-gce/pkg/backends"
"k8s.io/ingress-gce/pkg/context"
"k8s.io/ingress-gce/pkg/controller/translator"
"k8s.io/ingress-gce/pkg/instances"
l4metrics "k8s.io/ingress-gce/pkg/l4lb/metrics"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't alias packages unless it is required due to a conflict

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -107,6 +109,7 @@ func NewL4NetLBController(
}
},
})
//TODO change to component name "l4netlb-controller"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you leave a todo, you need to file an issue and link it

TODO(): short description.

Otherwise finish the work and don't leave a TODO in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment will be removed in #1683

if lc.needsDeletion(svc) {
klog.V(3).Infof("Deleting L4 External LoadBalancer resources for service %s", key)
result := lc.garbageCollectRBSNetLB(key, svc)
if result == nil {
return nil
}
lc.publishMetrics(result, namespacedName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not change this method to take a types.NamespacesName instead of a string? Then you don't have to have this outside the method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// publishMetrics sets controller metrics for NetLB services and pushes NetLB metrics based on sync type.
func (lc *L4NetLBController) publishMetrics(result *loadbalancers.L4NetLBSyncResult, namespacedName string) {
if result == nil {
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that we expect? Should we log an error? Why will this happen?
There is no explanation of why this is an expected value for the parameter.

If we do not expect it to be nil, do not silently ignore it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, this check is done before we call publishMetrics

case loadbalancers.SyncTypeCreate, loadbalancers.SyncTypeUpdate:
klog.V(4).Infof("External L4 Loadbalancer for Service %s ensured, updating its state %v in metrics cache", namespacedName, result.MetricsState)
lc.ctx.ControllerMetrics.SetL4NetLBService(namespacedName, result.MetricsState)
l4metrics.PublishNetLBSyncMetrics(result.Error == nil, result.SyncType, result.GCEResourceInError, utils.GetErrorType(result.Error), result.StartTime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be easier to reason about if we have

  • PublishSyncSuccess()
  • PublishSyncError()

instead of exposing so many parameters to the user. then PublishNetLBSyncMetrics() can be an internal method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

statusError = "error"
L4ilbLatencyMetricName = "l4_ilb_sync_duration_seconds"
L4ilbErrorMetricName = "l4_ilb_sync_error_count"
statusSuccess = "success"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if these are reference in only one place, we don't need to create a named constant symbol for the value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is used by ILB and NetLB

Name: L4netlbLatencyMetricName,
Help: "Latency of an L4 NetLB Sync",
// custom buckets - [15s, 30s, 60s, 120s, 240s(4min), 480s(8min), 960s(16m), +Inf]
Buckets: prometheus.ExponentialBuckets(15, 2, 6),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have more buckets? They are low cost and better than missing out of the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -87,3 +111,25 @@ func publishL4ILBSyncLatency(success bool, syncType string, startTime time.Time)
func publishL4ILBSyncErrorCount(syncType, gceResource, errorType string) {
l4ILBSyncErrorCount.WithLabelValues(syncType, gceResource, errorType).Inc()
}

// PublishL4NetLBSyncMetrics exports metrics related to the L4 NetLB sync.
func PublishNetLBSyncMetrics(success bool, syncType, gceResource, errType string, startTime time.Time) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -153,6 +152,12 @@ func (l4netlb *L4NetLB) EnsureFrontend(nodeNames []string, svc *corev1.Service)
result.Annotations[annotations.UDPForwardingRuleKey] = fr.Name
}
result.Status = &corev1.LoadBalancerStatus{Ingress: []corev1.LoadBalancerIngress{{IP: fr.IPAddress}}}
if fr.NetworkTier == cloud.NetworkTierPremium.ToGCEValue() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result.MetricsState.IsPremiumTier = fr.NetworkTier == cloud.NetworkTierPremium.ToGCEValue()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if fr.NetworkTier == cloud.NetworkTierPremium.ToGCEValue() {
result.MetricsState.IsPremiumTier = true
}
if ipAddrType == IPAddrManaged {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result.MetricsState.IsManagedIP = ipAddrType == IPAddrManaged

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"k8s.io/ingress-gce/pkg/test"
"k8s.io/ingress-gce/pkg/utils"
namer_util "k8s.io/ingress-gce/pkg/utils/namer"
"k8s.io/legacy-cloud-providers/gce"
)

const (
managedIP = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] personally prefer checkMetrics(..., /managedIP/true...) for documentation than adding symbols to the package scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is a minor comment then I preferred not to change it, it is used in few places and iMO it is more readable than comment

@@ -23,6 +23,10 @@ import (
)

const (
premium = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above

@@ -124,21 +134,25 @@ type ControllerMetrics struct {
negMap map[string]NegServiceState
// l4ILBServiceMap is a map between service key and L4 ILB service state.
l4ILBServiceMap map[string]L4ILBServiceState
// l4NetLBServiceMap is a map between service key and L4 NetLB service state.
l4NetLBServiceMap map[string]L4NetLBServiceState
// pscMap is a map between the service attachment key and PSC state
pscMap map[string]pscmetrics.PSCState
// ServiceMap track the number of services in this cluster
serviceMap map[string]struct{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document why this is a set? Can we just use the set data structure directly?

defer im.Unlock()

if im.l4NetLBServiceMap == nil {
klog.Fatalf("L4 Net LB Metrics failed to initialize correctly.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect this to be common vs just having a panic() when accessing the nil map?

Otherwise, to be consistent, we should be checking nil in way more places...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is very unlikely that the table will be empty because it is initialized in context and the l4netlb controller is created from the context so the table should already be initialized. But since metric controller and l4netlb controller are independent it's good to have this check if sth will change in the future.
I didn't put this check into delete func because the idea is to call set service first.

im.Lock()
defer im.Unlock()
klog.V(4).Infof("Computing L4 NetLB usage metrics from service state map: %#v", im.l4NetLBServiceMap)
counts := map[feature]int{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use a map vs just a struct?

type ... struct {
serviceCount int
statisIPCount int
...
}

etc vs allocating a dynamic map

Also, if we are using a dynamic map in this way, it will be better to wrap it so it's more consistent to deal with

type l4MetricsMap map[feature]int

func (m* l4MetricsMap) init() {
m[l4Net...] = 0 ...
}

Copy link
Contributor Author

@kl52752 kl52752 Mar 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use this map to set labels and its values to metric code
l4NetLBCount.With(prometheus.Labels{label: feature.String()}).Set(float64(count))

I added type for a map

@kl52752 kl52752 deleted the metrics-l4-rbs branch March 24, 2022 10:17
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Mar 24, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Mar 24, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Mar 24, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
@kl52752 kl52752 mentioned this pull request Mar 24, 2022
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Mar 25, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Mar 25, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Apr 14, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
kl52752 added a commit to kl52752/ingress-gce that referenced this pull request Apr 21, 2022
This change included fixes for comments for closed PRs.
kubernetes#1675
kubernetes#1678
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants