Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a package to perform Mann-Whitney U test #2501

Merged
merged 1 commit into from
Sep 22, 2021
Merged

Conversation

nakabonne
Copy link
Member

@nakabonne nakabonne commented Sep 21, 2021

What this PR does / why we need it:
This PR adds the mannwhitney package that provides a public function to perform the Mann-Whitney U test to be used by the analysis executor.

NOTE: All files within the PR are just borrowed from golang.org/x/perf/internal/stats which cannot be imported from external packages, so well tested.

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

NONE

@pipecd-bot
Copy link
Collaborator

TODO

The following ISSUES will be created once got merged. If you want me to skip creating the issue, you can use /todo skip command.

Details

1. Add a Support method for finite support distributions? Or

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/dist.go#L93-L96

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

2. Plot method to return a pre-configured Plot object with

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/dist.go#L97-L100

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

3. For discrete distributions, use the step size to

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/dist.go#L133-L136

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

4. Check these against some other implementation.

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/mannwhitney_test.go#L63-L66

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

5. Actually, it's at most ⌈n*m/2⌉, but

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L138-L141

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

6. ~40% of this function's time is spent in mapassign on

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L188-L191

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

7. The worst case for this function is when there are

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L194-L197

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

8. The n1 and twoU values in the ukeys follow strict

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L224-L227

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

9. Is it possible to generate this table bottom-up? If

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L237-L240

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

10. Later computations depend on these, but these don't

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L269-L272

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

11. Use symmetry to minimize U

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L355-L358

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

12. Minimize U?

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L367-L370

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

13. More precise bounds when there are ties.

https://github.com/pipe-cd/pipe/blob/66a90c1a2937afeecfeae2f5ea5ee34a04f563de/pkg/app/piped/executor/analysis/mannwhitney/udist.go#L399-L401

This was created by todo plugin since "TODO:" was found in 66a90c1 when #2501 was merged. cc: @nakabonne.

@nakabonne
Copy link
Member Author

/todo skip

@pipecd-bot
Copy link
Collaborator

GO_LINTER

The following files are not gofmt-ed. By commenting /golinter fmt, the formatted one will be appended to this pull request automatically.

pkg/app/piped/executor/analysis/mannwhitney/udist.go
--- pkg/app/piped/executor/analysis/mannwhitney/udist.go.orig
+++ pkg/app/piped/executor/analysis/mannwhitney/udist.go
@@ -228,7 +228,7 @@
 	// bounds. It might be worth turning these into directly
 	// indexible slices.
 	A := make([]map[ukey]float64, K+1)
-	A[K] = map[ukey]float64{ukey{n1: n1, twoU: twoU}: 0}
+	A[K] = map[ukey]float64{{n1: n1, twoU: twoU}: 0}
 
 	// Compute memo table (k, n1, twoU) triples from high K values
 	// to low K values. This drives the recurrence relation

Copy link
Collaborator

@pipecd-bot pipecd-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GO_LINTER

Some issues were detected while linting go source files in your changes.

@pipecd-bot
Copy link
Collaborator

COVERAGE

Code coverage for golang is 32.84%. This pull request increases coverage by 0.81%.

File Function Base Head Diff
pkg/app/piped/executor/analysis/mannwhitney/alg.go maxint -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/alg.go minint -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/alg.go sumint -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/alg.go bisect -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/alg.go bisectBool -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/alg.go series -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/dist.go InvCDF -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/dist.go Rand -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/mannwhitney.go MannWhitneyUTest -- 96.15% +96.15%
pkg/app/piped/executor/analysis/mannwhitney/mannwhitney.go labeledMerge -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/mannwhitney.go tieCorrection -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/mathx.go mathSign -- 57.14% +57.14%
pkg/app/piped/executor/analysis/mannwhitney/mathx.go init -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/mathx.go mathChoose -- 90.91% +90.91%
pkg/app/piped/executor/analysis/mannwhitney/mathx.go mathLchoose -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/mathx.go lchoose -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.PDF -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.pdfEach -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.CDF -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.cdfEach -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.InvCDF -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.Rand -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/normaldist.go NormalDist.Bounds -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go UDist.hasTies -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go UDist.p -- 96.55% +96.55%
pkg/app/piped/executor/analysis/mannwhitney/udist.go makeUmemo -- 97.83% +97.83%
pkg/app/piped/executor/analysis/mannwhitney/udist.go twoUmin -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go twoUmax -- 100.00% +100.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go UDist.PMF -- 60.00% +60.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go UDist.CDF -- 95.00% +95.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go UDist.Step -- 0.00% +0.00%
pkg/app/piped/executor/analysis/mannwhitney/udist.go UDist.Bounds -- 0.00% +0.00%

@nghialv
Copy link
Member

nghialv commented Sep 21, 2021

I think putting all of stats in the same package makes it to be hard to know where is our code for mann-whitney implementation.
So how about moving all stats files into a sub-package to be used as a utility.

@nakabonne
Copy link
Member Author

@nghialv There is no our own code in the mannwhitney package at all. So I feel like it will not make it difficult to read.

@nghialv
Copy link
Member

nghialv commented Sep 21, 2021

I got it. I was thinking that mannwhitney.go is our added code.
Nice.
/lgtm

@nakabonne
Copy link
Member Author

Ah, sorry for confusing you. I just wanted to emphasize that is the main file where is the only file including a public function.

@nakabonne
Copy link
Member Author

Thank you!

@khanhtc1202
Copy link
Member

Way to go
/approve

@pipecd-bot
Copy link
Collaborator

APPROVE

This pull request is APPROVED by khanhtc1202.

Approvers can cancel the approval by writing /approve cancel in a comment. Any additional commits also will change this pull request to be not-approved.

@pipecd-bot pipecd-bot merged commit 0ea5ca3 into master Sep 22, 2021
@pipecd-bot pipecd-bot deleted the mannwhitney-package branch September 22, 2021 01:46
@nghialv
Copy link
Member

nghialv commented Sep 22, 2021

/changelog

@pipecd-bot
Copy link
Collaborator

CHANGELOG

@nghialv: Changelog has been generated in response to this comment.

Details

Changelog since v0.16.3

Notable Changes

  • Remove MutatingWebhook/ValidatingWebhook from default watching K8s resources (They can be added explicitly in Piped config). (#2502)

Internal Changes

  • Add a package to perform Mann-Whitney U test (#2501)
  • Refine feature status page (#2500)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants