Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[query] Adds histogram_quantile function #1373

Merged
merged 7 commits into from
Feb 14, 2019
Merged

Conversation

arnikola
Copy link
Collaborator

No description provided.

"invalid number of args for histogram_quantile: %d", len(args))
}

if opType != HistogramQuantileType {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check this first

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was copying the pattern from clamp.go

Wouldn't mind dealing with these correctly within the parser in a later pr, like the binary and aggregation functions are done at the moment.


for index := 0; stepIter.Next(); index++ {
cloned := make([]float64, len(outValues))
copy(cloned, outValues)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use outValues instead of cloning it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, copied from the regular path, forgot to fix this one

) error {
sanitizeBuckets(bucketedSeries)

builder, err := setupBuilder(bucketedSeries, meta, stepIter, controller)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reuse the BlockBuilder in m3db/m3/src/query/executor/transform/controller.go?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to set a bunch of meta, seriesMetas, step counts unfortunately

@@ -299,6 +299,10 @@ type TagOptionsConfiguration struct {
// If not provided, defaults to `__name__`.
MetricName string `yaml:"metricName"`

// BucketName specifies the tag name that corresponds to the metric's bucket.
// If not provided, defaults to `le`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is le from Prom?

Copy link
Collaborator

@benraskin92 benraskin92 Feb 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nvm - saw below

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep 👍

{upperBound: 1, value: 1},
{upperBound: math.Inf(1), value: 22},
})
assert.Equal(t, float64(1), actual)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test with neg infinity and > 2 buckets?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

ninf = math.Inf(-1)
)

func TestQuantileFunctionFilteringWithoutA(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithoutA?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with WithQ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I copied some setup stuff from an agg function that filtered by an A tag, will update this name to make more sense


func testQuantileFunctionWithQ(t *testing.T, q float64) [][]float64 {
args := make([]interface{}, 0, 1)
args = append(args, q)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pass in the args instead of having to construct them here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it's pretty annoying to build a []interface{..}, and that's what we need to pass into the function; since we only ever need the q value here, it should be fine to build the interface slice here


q := n.op.q
if q < 0 || q > 1 {
return processInvalidQuantile(q, bucketedSeries, meta, stepIter, n.controller)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems kind of weird to split this out like this. Maybe just have one function called processQuantile and put the inf logic in there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, will need to do the comparison for each step

// options) that denotes the upper bound of that bucket; series without this
// tag are ignored.
HistogramQuantileType = "histogram_quantile"
initIndexBucketLength = 10
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason to use 10?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be the general convention for initing arrays we don't know the length of

Here it has (by chance) a bit more context in that the general rule of thumb for histograms is no more than 10 buckets

values := step.Values()
bucketValues := make([]bucketValue, 0, initIndexBucketLength)

aggregatedValues := make([]float64, 0, len(bucketedSeries))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to just pre-allocate this with a length and index in below when adding to it. Looks like you always add to it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do; was fine using append here since didn't want to add a tracking idx variable

assert.True(t, found)
assert.Equal(t, value, actual)

value2 := []byte("abc")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test for empty val? e.g. []byte("")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure; not sure if it's necessary a special case though

if buckets, found := bucketsForID[string(id)]; !found {
// Add a single indexed bucket for this ID with the current index only.
newBuckets := make([]indexedBucket, 0, initIndexBucketLength)
newBucket := indexedBucket{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can pull this out before if buckets, found := bucketsForID[string(id)]; !found { since it gets created in both found and !found

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call; this was originally going to have a list of indices that are valid for that bucket but the logic got really awkward. Changed it to add a bucket/index combo every time instead, but didn't update here to reflect that

tags: tagsWithoutKeys,
}
} else {
newBucket := indexedBucket{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above

controller *transform.Controller,
) (block.Builder, error) {
metas := make([]block.SeriesMeta, len(bucketedSeries))
idx := 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have this var? Just include idx in the for loop below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a map not an array

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️

assert.Equal(t, [][]float64{{inf, inf, inf, inf, inf}}, actual)

actual = testQuantileFunctionWithQ(t, 0.8)
test.EqualsWithNansWithDelta(t, [][]float64{{15.6, 20, 11.6, 13.4, 0.8}}, actual, 0.00001)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a case with some NaNs

Copy link
Collaborator

@benraskin92 benraskin92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@arnikola arnikola merged commit 3b5d849 into master Feb 14, 2019
@arnikola arnikola deleted the arnikola/histogram_quantile branch February 14, 2019 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants