Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data flow: Remove unused column from flowThroughOutOfCall #18263

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

hvitved
Copy link
Contributor

@hvitved hvitved commented Dec 10, 2024

I noticed in a run that flowThroughOutOfCall may have a large fan-out when joining call with ccc through matchesCall:

[2024-12-10 20:37:27] Evaluated non-recursive predicate DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage2::flowThroughOutOfCall/7#306bfe26@3d1aaa28 in 3916ms (size: 109177854).
Evaluated relational algebra for predicate DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage2::flowThroughOutOfCall/7#306bfe26@3d1aaa28 with tuple counts:
           153181   ~8%    {2} r1 = JOIN `DataFlowImplCommon::DataFlowCallEx.projectToCall/0#dispred#c63e421f` WITH `DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage1::callMayFlowThroughRev/1#05c410c1` ON FIRST 1 OUTPUT Lhs.0, Lhs.1
           291231   ~5%    {8}    | JOIN WITH `project#DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage1::callEdgeReturn/7#10c89f93` ON FIRST 1 OUTPUT Rhs.1, Rhs.5, Rhs.5, Rhs.2, Lhs.0, Lhs.1, Rhs.3, Rhs.4
           289311   ~0%    {7}    | JOIN WITH `DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage1::returnMayFlowThrough/4#6a6483e2` ON FIRST 4 OUTPUT Lhs.5, Lhs.4, Lhs.0, Lhs.6, Lhs.7, Lhs.1, Lhs.1
        108888543   ~1%    {7}    | JOIN WITH `project#DataFlowImplCommon::Cached::CachedCallContextSensitivity::getSpecificCallContextCall/2#2f40f4a2` ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5, Lhs.6
                       
          1362219   ~3%    {1} r2 = SCAN `DataFlowImplCommon::DataFlowCallEx.projectToCall/0#dispred#c63e421f` OUTPUT In.0
          1362219   ~3%    {1}    | STREAM DEDUP
           153181   ~1%    {1}    | JOIN WITH `DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage1::callMayFlowThroughRev/1#05c410c1` ON FIRST 1 OUTPUT Lhs.0
           291231   ~2%    {7}    | JOIN WITH `project#DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage1::callEdgeReturn/7#10c89f93` ON FIRST 1 OUTPUT Rhs.1, Rhs.5, Rhs.5, Rhs.2, Lhs.0, Rhs.3, Rhs.4
           289311   ~1%    {6}    | JOIN WITH `DataFlowImpl::MakeImpl<Location::Location,DataFlowImplSpecific::CsharpDataFlow>::Impl<ExternalLocationSink::LocalFileOutputStreamFlow::C>::Stage1::returnMayFlowThrough/4#6a6483e2` ON FIRST 4 OUTPUT Lhs.4, Lhs.0, Lhs.5, Lhs.6, Lhs.1, Lhs.1
           289311   ~4%    {7}    | JOIN WITH `num#DataFlowImplCommon::CallContextSensitivity<Cached::CachedCallContextSensitivity::CallContextSensitivityInput>::TSomeCall#b6cffc7d` CARTESIAN PRODUCT OUTPUT Lhs.0, Rhs.0, Lhs.1, Lhs.2, Lhs.3, Lhs.4, Lhs.5
                       
        109177854   ~1%    {7} r3 = r1 UNION r2
                           return r3

I turns out that the column is actually not needed, since all predicates that join against flowThroughOutOfCall already have both call and ccc, so those should already match.

DCA shows a nice speedup for the C# project mono.

@hvitved hvitved added the no-change-note-required This PR does not need a change note label Dec 11, 2024
@hvitved hvitved marked this pull request as ready for review December 11, 2024 08:23
@Copilot Copilot bot review requested due to automatic review settings December 11, 2024 08:23

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)
  • shared/dataflow/codeql/dataflow/internal/DataFlowImpl.qll: Language not supported

Tip: Turn on automatic Copilot reviews for this repository to get quick feedback on every pull request. Learn more

@hvitved hvitved requested a review from aschackmull December 11, 2024 08:24
@hvitved hvitved force-pushed the dataflow/remove-column branch from 1148b9e to 40d9460 Compare December 11, 2024 17:45
Copy link
Contributor

@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me.

Copy link
Contributor

@aschackmull aschackmull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hvitved hvitved merged commit 20db548 into github:main Dec 12, 2024
32 checks passed
@hvitved hvitved deleted the dataflow/remove-column branch December 12, 2024 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFlow Library no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants