Panic in buildTreeFromParentPointerTrees #3566

Open
kolesnikovae opened this issue Sep 18, 2024 · 0 comments
Labels
type/bug Something isn't working

Collaborator

kolesnikovae commented Sep 18, 2024

I've discovered a panic stack trace that indicates a problem in buildTreeFromParentPointerTrees:

[Image: screenshot of the panic stack trace]

However, the problem is apparently somewhere in the SplitStacktraceIDRanges implementation (PartitionWriter), presumably in SampleAppender.Samples:

resolver_tree.go

	// If the number of samples is large (> 128K) and the StacktraceResolver
	// implements the range iterator, we will be building the tree based on
	// the parent pointer tree of the partition (a copy of). The only exception
	// is when the number of nodes is not limited, or is close to the number of
	// nodes in the original tree: the optimization is still beneficial in terms
	// of CPU, but is very expensive in terms of memory.
	iterator, ok := symbols.Stacktraces.(StacktraceIDRangeIterator)
	if ok && shouldCopyTree(appender, maxNodes) {
		ranges := iterator.SplitStacktraceIDRanges(appender) // <--------------- The problem is apparently here
		return buildTreeFromParentPointerTrees(ctx, ranges, symbols, maxNodes)
	}
	// Otherwise, use the basic approach: resolve each stack trace
	// and insert them into the new tree one by one. The method
	// performs best on small sample sets.
	samples := appender.Samples()
	t := treeSymbolsFromPool()
	defer t.reset()
	t.init(symbols, samples)
	if err := symbols.Stacktraces.ResolveStacktraceLocations(ctx, t, samples.StacktraceIDs); err != nil {
		return nil, err
	}
	return t.tree.Tree(maxNodes, t.symbols.Strings), nil

The query that caused the panic targets a large in-memory data set, and I can't reproduce it easily.

One way to solve the problem is to disable the optimization: it was added relatively recently (a couple of months ago) and is only helpful when the stack trace tree includes millions of nodes.
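
As a rough illustration of what "disabling the optimization" could look like, here is a minimal, self-contained sketch of the path selection with an opt-out switch. It is not the actual Pyroscope code: `treeOptions`, `DisableRangeOptimization`, `useRangePath`, and the `largeSampleSet` constant are hypothetical names introduced here, the 128K figure is taken from the code comment above, and the node-limit exception handled by `shouldCopyTree` is deliberately left out.

```go
package main

import "fmt"

// largeSampleSet mirrors the "> 128K" figure from the code comment above.
// The exact constant used by the real code is an assumption.
const largeSampleSet = 128 << 10

// treeOptions is a hypothetical configuration struct for this sketch.
type treeOptions struct {
	// DisableRangeOptimization forces the basic, one-by-one resolution path.
	DisableRangeOptimization bool
}

// useRangePath reports whether the parent-pointer-tree path would be taken.
// hasRangeIterator stands in for the StacktraceIDRangeIterator type assertion.
func useRangePath(opts treeOptions, hasRangeIterator bool, numSamples int) bool {
	if opts.DisableRangeOptimization {
		return false
	}
	return hasRangeIterator && numSamples > largeSampleSet
}

func main() {
	n := 1_000_000
	// Default options: a large sample set takes the optimized path.
	fmt.Println(useRangePath(treeOptions{}, true, n))
	// With the switch set, the same query falls back to the basic path.
	fmt.Println(useRangePath(treeOptions{DisableRangeOptimization: true}, true, n))
}
```

With the switch set, every query would go through the basic path that resolves stack traces one by one, trading CPU on very large trees for avoiding the code path suspected of panicking.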

However, I'm going to disable stack trace chunking (into ranges) altogether: this piece is useless but complicates a lot of things. This will not fix the problem for already written data, but will prevent it from happening in the future.

@kolesnikovae kolesnikovae added the type/bug Something isn't working label Sep 18, 2024
@kolesnikovae kolesnikovae self-assigned this Sep 18, 2024