Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracing: introduce a per-span limit for structured events #60678

Merged
merged 1 commit into from
Feb 24, 2021

Conversation

irfansharif
Copy link
Contributor

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we construct a way to also detect when the ring buffer has been full and cycled? We'll want to know when things have been lost.

Can we also use a ring buffer for the unstructured (log) events too?

Copy link
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes to both of those, we'll want to tightly bound how large a verbose span can get and a ring buffer could be one way to do it. I'm hoping to take a stab at the code that renders trace events, but not yet. This could be captured then. For now, we don't have a great way to view structured events anyway (part of what Angela is working on), so I'll wait until we make head way there.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angelapwen and @tbg)

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Touches #59188

I think we should merge this because it's 1000x better than master now, but ideally the limiting would be based on a count of bytes, not just a count of items. We are already asking Structured to implement interface{ Size() int } so this is easy. All you'd have to change here to get this is to do

type crdbSpan {
  bytesStructured int64 // atomic
}

func (s *crdbSpan) RecordStructured(item Structured) {
  n := int64(item.Size())
  if atomic.AddInt64(&s.bytesStructured, n) > maxStructuredBytesPerSpan {
    // NB: this would also be a convenient place to mark the span has having
    // filled up once we want that.
    atomic.AddInt64(&s.bytesStructured, -n)
    return
  }
  // do the usual stuff
}

and given how easy it is, I would do it - the count-based mechanism will also translate poorly to verbose messages, which can get huge.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angelapwen and @irfansharif)


pkg/util/tracing/tracer.go, line 402 at r1 (raw file):

		},
	}
	helper.crdbSpan.mu.structured.Reserve(maxStructuredEventsPerSpan)

This will allocate a slice of 50 elements for every Span, which seems inefficient. Why don't we let the ringbuf grow as needed, up to maxStructuredEventsPerSpan?

Copy link
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have another PR brewing doing both those things, probably could do with another round of review, so merging this one as is.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @angelapwen and @irfansharif)

@craig
Copy link
Contributor

craig bot commented Feb 24, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 24, 2021

Build failed:

@irfansharif
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Feb 24, 2021

Build succeeded:

@craig craig bot merged commit 5cb9815 into cockroachdb:master Feb 24, 2021
@irfansharif irfansharif deleted the 210217.span-payload-limits branch February 24, 2021 11:28
@tbg tbg mentioned this pull request Mar 2, 2021
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Mar 3, 2021
Touches cockroachdb#59188. Follow-on work from cockroachdb#60678.

Release justification: low risk, high benefit changes to existing
functionality

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Mar 5, 2021
Touches cockroachdb#59188. Follow-on work from cockroachdb#60678.

Release justification: low risk, high benefit changes to existing
functionality

Release note: None
craig bot pushed a commit that referenced this pull request Mar 5, 2021
61359: tracing: use byte-limits for logs/structured events per span r=irfansharif a=irfansharif

Touches #59188. Follow-on work from #60678. We can introduce byte-limits for
verbose logging and structured events, instead of limiting things based on
count.

This PR also:
- adds a _dropped tag to recordings with dropped logs/structured events.
- squashes a bug where reset spans (as used in SessionTracing) still
  held onto earlier structured events
- moves away from the internal usage of the opentracing.LogRecord type,
  it's unnecessary

Release justification: low risk, high benefit changes to existing
functionality

Release note: None

---

+cc @knz / @erikgrinaker / @angelapwen for pod-visibility.

61482: jobs: add job metrics per-type to track success, failure, and cancel r=fqazi a=fqazi

Fixes: #59711

Previously, there were only over all counters tracking how many
jobs were completed, cancelled, or failed. This was inadequate
because it didn't make it easy to tell in aggregate what job
types they were. To address this, this patch will add counters
for different job types for tracking success, failure, and
cancellation.

Release justification: Low risk change only adding a metric inside
the crdb_internal.feature_usage table
Release note: None

61491: sqlsmith: add support for computed columns r=RaduBerinde a=RaduBerinde

This changes the random table generator to also create computed
columns (either STORED or VIRTUAL). Some example of definitions:
 - `col1_14 STRING NOT NULL AS (lower(CAST(col1_8 AS STRING))) VIRTUAL`
 - `col2_6 DECIMAL NOT NULL AS (col2_2 + 1:::DECIMAL) STORED`
 - `col1_13 INT4 AS (col1_0 + col1_10) STORED`

Release justification: non-production code change.

Release note: None

Informs #57608.

61509: sql: add a regression test r=RaduBerinde a=RaduBerinde

This commit adds a regression test for #58104 (the problem was already
fixed).

Resolves #58104.

Release justification: non-production code change.

Release note: None

61522: opt: fix fetch scope in UPDATE..FROM statements r=mgartner a=mgartner

Previously, the fetch scope incorrectly included columns in the FROM
clause of an UPDATE..FROM statement. As a result, column names shared by
the FROM clause and the mutating table lead to ambiguity when resolving
partial index DEL column expressions. This commit ensures that the fetch
scope does not include columns in the FROM clause.

Fixes #61284

Release justification: This is a low-risk bug fix to existing
functionality.

Release note (bug fix): An UPDATE..FROM statement where the FROM clause
contained column names that match table column names erred if the table
had a partial index predicate referencing those columns. This bug,
present since partial indexes were released in version 20.2.0, has been
fixed.

61553: ccl,server: error.Wrap on previously handled errors r=[dt,miretskiy] a=stevendanna

These errors.Wrap calls are wrapping errors that are nil and thus will
always return a nil error.

Release justification: Minor error handling fixes
Release note: None

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants