-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SampleFDO] Read call-graph matching recovered top-level function profile #101053
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-pgo Author: Lei Wang (wlei-llvm) ChangesIn the extended binary format, the top-level function profile is only read when the functions shows in the current module. However, if a top-level function is renamed, it's never read in the profile, call graph matching will miss this case. Therefore, this patch adds the support to read top-level functions if they are potential candidates for call-graph matching. If a match is found later, the function will be further preserved for use by the sample loader. To do this, we need to tweak the current sample profile reader to support read the profile on-demand by given functions(instead of module), so I added a new function Patch is 36.94 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/101053.diff 6 Files Affected:
diff --git a/llvm/include/llvm/ProfileData/SampleProfReader.h b/llvm/include/llvm/ProfileData/SampleProfReader.h
index f4bdc6525308d..b124233a02d11 100644
--- a/llvm/include/llvm/ProfileData/SampleProfReader.h
+++ b/llvm/include/llvm/ProfileData/SampleProfReader.h
@@ -392,6 +392,11 @@ class SampleProfileReader {
/// which doesn't support loading function profiles on demand.
virtual bool collectFuncsFromModule() { return false; }
+ virtual std::error_code readOnDemand(const DenseSet<StringRef> &FuncsToUse,
+ SampleProfileMap &Profiles) {
+ return sampleprof_error::not_implemented;
+ };
+
/// Print all the profiles on stream \p OS.
void dump(raw_ostream &OS = dbgs());
@@ -413,6 +418,16 @@ class SampleProfileReader {
if (It != Profiles.end())
return &It->second;
+ if (FuncNameToProfNameMap && !FuncNameToProfNameMap->empty()) {
+ auto R = FuncNameToProfNameMap->find(FunctionId(Fname));
+ if (R != FuncNameToProfNameMap->end()) {
+ Fname = R->second.stringRef();
+ auto It = Profiles.find(FunctionId(Fname));
+ if (It != Profiles.end())
+ return &It->second;
+ }
+ }
+
if (Remapper) {
if (auto NameInProfile = Remapper->lookUpNameInProfile(Fname)) {
auto It = Profiles.find(FunctionId(*NameInProfile));
@@ -494,6 +509,11 @@ class SampleProfileReader {
void setModule(const Module *Mod) { M = Mod; }
+ void setFuncNameToProfNameMap(
+ HashKeyMap<std::unordered_map, FunctionId, FunctionId> *FPMap) {
+ FuncNameToProfNameMap = FPMap;
+ }
+
protected:
/// Map every function to its associated profile.
///
@@ -522,6 +542,21 @@ class SampleProfileReader {
std::unique_ptr<SampleProfileReaderItaniumRemapper> Remapper;
+ // A map pointer to the FuncNameToProfNameMap in SampleProfileLoader,
+ // which maps the function name to the matched profile name. This is used
+ // for sample loader to look up profile using the new name.
+ HashKeyMap<std::unordered_map, FunctionId, FunctionId>
+ *FuncNameToProfNameMap = nullptr;
+
+ // A map from a function's context hash to its meta data section range, used
+ // for on-demand read function profile metadata.
+ std::unordered_map<uint64_t, std::pair<const uint8_t *, const uint8_t *>>
+ FContextToMetaDataSecRange;
+
+ std::pair<const uint8_t *, const uint8_t *> LBRProfileSecRange;
+
+ bool ProfileHasAttribute = false;
+
/// \brief Whether samples are collected based on pseudo probes.
bool ProfileIsProbeBased = false;
@@ -621,6 +656,8 @@ class SampleProfileReaderBinary : public SampleProfileReader {
/// Read the next function profile instance.
std::error_code readFuncProfile(const uint8_t *Start);
+ std::error_code readFuncProfile(const uint8_t *Start,
+ SampleProfileMap &Profiles);
/// Read the contents of the given profile instance.
std::error_code readProfile(FunctionSamples &FProfile);
@@ -720,11 +757,15 @@ class SampleProfileReaderExtBinaryBase : public SampleProfileReaderBinary {
std::error_code readSecHdrTableEntry(uint64_t Idx);
std::error_code readSecHdrTable();
+ std::error_code readFuncMetadataOnDemand(bool ProfileHasAttribute,
+ SampleProfileMap &Profiles);
std::error_code readFuncMetadata(bool ProfileHasAttribute);
std::error_code readFuncMetadata(bool ProfileHasAttribute,
FunctionSamples *FProfile);
std::error_code readFuncOffsetTable();
std::error_code readFuncProfiles();
+ std::error_code readFuncProfiles(const DenseSet<StringRef> &FuncsToUse,
+ SampleProfileMap &Profiles);
std::error_code readNameTableSec(bool IsMD5, bool FixedLengthMD5);
std::error_code readCSNameTableSec();
std::error_code readProfileSymbolList();
@@ -776,6 +817,12 @@ class SampleProfileReaderExtBinaryBase : public SampleProfileReaderBinary {
/// the reader has been given a module.
bool collectFuncsFromModule() override;
+ /// Read the profiles on-demand for the given functions. This is used after
+ /// stale call graph matching finds new functions whose profiles aren't read
+ /// at the beginning and we need to re-read the profiles.
+ std::error_code readOnDemand(const DenseSet<StringRef> &FuncsToUse,
+ SampleProfileMap &Profiles) override;
+
std::unique_ptr<ProfileSymbolList> getProfileSymbolList() override {
return std::move(ProfSymList);
};
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h b/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h
index a67f158433391..67edea42e2fe1 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h
@@ -198,6 +198,7 @@ class SampleProfileMatcher {
// function and all inlinees.
void countMismatchedCallsiteSamples(const FunctionSamples &FS);
void computeAndReportProfileStaleness();
+ void UpdateSampleLoaderWithRecoveredProfiles();
LocToLocMap &getIRToProfileLocationMap(const Function &F) {
auto Ret = FuncMappings.try_emplace(
diff --git a/llvm/lib/ProfileData/SampleProfReader.cpp b/llvm/lib/ProfileData/SampleProfReader.cpp
index 4752465fc072e..1fe89b9a48832 100644
--- a/llvm/lib/ProfileData/SampleProfReader.cpp
+++ b/llvm/lib/ProfileData/SampleProfReader.cpp
@@ -653,7 +653,8 @@ SampleProfileReaderBinary::readProfile(FunctionSamples &FProfile) {
}
std::error_code
-SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start) {
+SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start,
+ SampleProfileMap &Profiles) {
Data = Start;
auto NumHeadSamples = readNumber<uint64_t>();
if (std::error_code EC = NumHeadSamples.getError())
@@ -678,6 +679,11 @@ SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start) {
return sampleprof_error::success;
}
+std::error_code
+SampleProfileReaderBinary::readFuncProfile(const uint8_t *Start) {
+ return readFuncProfile(Start, Profiles);
+}
+
std::error_code SampleProfileReaderBinary::readImpl() {
ProfileIsFS = ProfileIsFSDisciminator;
FunctionSamples::ProfileIsFS = ProfileIsFS;
@@ -725,6 +731,7 @@ std::error_code SampleProfileReaderExtBinaryBase::readOneSection(
break;
}
case SecLBRProfile:
+ LBRProfileSecRange = std::make_pair(Data, End);
if (std::error_code EC = readFuncProfiles())
return EC;
break;
@@ -745,9 +752,9 @@ std::error_code SampleProfileReaderExtBinaryBase::readOneSection(
ProfileIsProbeBased =
hasSecFlag(Entry, SecFuncMetadataFlags::SecFlagIsProbeBased);
FunctionSamples::ProfileIsProbeBased = ProfileIsProbeBased;
- bool HasAttribute =
+ ProfileHasAttribute =
hasSecFlag(Entry, SecFuncMetadataFlags::SecFlagHasAttribute);
- if (std::error_code EC = readFuncMetadata(HasAttribute))
+ if (std::error_code EC = readFuncMetadata(ProfileHasAttribute))
return EC;
break;
}
@@ -791,6 +798,19 @@ bool SampleProfileReaderExtBinaryBase::useFuncOffsetList() const {
return false;
}
+std::error_code SampleProfileReaderExtBinaryBase::readOnDemand(
+ const DenseSet<StringRef> &FuncsToUse, SampleProfileMap &Profiles) {
+ Data = LBRProfileSecRange.first;
+ End = LBRProfileSecRange.second;
+ if (std::error_code EC = readFuncProfiles(FuncsToUse, Profiles))
+ return EC;
+ End = Data;
+
+ if (std::error_code EC =
+ readFuncMetadataOnDemand(ProfileHasAttribute, Profiles))
+ return EC;
+ return sampleprof_error::success;
+}
bool SampleProfileReaderExtBinaryBase::collectFuncsFromModule() {
if (!M)
@@ -838,110 +858,118 @@ std::error_code SampleProfileReaderExtBinaryBase::readFuncOffsetTable() {
return sampleprof_error::success;
}
-std::error_code SampleProfileReaderExtBinaryBase::readFuncProfiles() {
- // Collect functions used by current module if the Reader has been
- // given a module.
- // collectFuncsFromModule uses FunctionSamples::getCanonicalFnName
- // which will query FunctionSamples::HasUniqSuffix, so it has to be
- // called after FunctionSamples::HasUniqSuffix is set, i.e. after
- // NameTable section is read.
- bool LoadFuncsToBeUsed = collectFuncsFromModule();
-
- // When LoadFuncsToBeUsed is false, we are using LLVM tool, need to read all
- // profiles.
- const uint8_t *Start = Data;
- if (!LoadFuncsToBeUsed) {
- while (Data < End) {
- if (std::error_code EC = readFuncProfile(Data))
- return EC;
+std::error_code SampleProfileReaderExtBinaryBase::readFuncProfiles(
+ const DenseSet<StringRef> &FuncsToUse, SampleProfileMap &Profiles) {
+ const uint8_t *Start = Data;
+
+ if (Remapper) {
+ for (auto Name : FuncsToUse) {
+ Remapper->insert(Name);
}
- assert(Data == End && "More data is read than expected");
- } else {
- // Load function profiles on demand.
- if (Remapper) {
- for (auto Name : FuncsToUse) {
- Remapper->insert(Name);
- }
+ }
+
+ if (ProfileIsCS) {
+ assert(useFuncOffsetList());
+ DenseSet<uint64_t> FuncGuidsToUse;
+ if (useMD5()) {
+ for (auto Name : FuncsToUse)
+ FuncGuidsToUse.insert(Function::getGUID(Name));
}
- if (ProfileIsCS) {
- assert(useFuncOffsetList());
- DenseSet<uint64_t> FuncGuidsToUse;
- if (useMD5()) {
- for (auto Name : FuncsToUse)
- FuncGuidsToUse.insert(Function::getGUID(Name));
+ // For each function in current module, load all context profiles for
+ // the function as well as their callee contexts which can help profile
+ // guided importing for ThinLTO. This can be achieved by walking
+ // through an ordered context container, where contexts are laid out
+ // as if they were walked in preorder of a context trie. While
+ // traversing the trie, a link to the highest common ancestor node is
+ // kept so that all of its decendants will be loaded.
+ const SampleContext *CommonContext = nullptr;
+ for (const auto &NameOffset : FuncOffsetList) {
+ const auto &FContext = NameOffset.first;
+ FunctionId FName = FContext.getFunction();
+ StringRef FNameString;
+ if (!useMD5())
+ FNameString = FName.stringRef();
+
+ // For function in the current module, keep its farthest ancestor
+ // context. This can be used to load itself and its child and
+ // sibling contexts.
+ if ((useMD5() && FuncGuidsToUse.count(FName.getHashCode())) ||
+ (!useMD5() && (FuncsToUse.count(FNameString) ||
+ (Remapper && Remapper->exist(FNameString))))) {
+ if (!CommonContext || !CommonContext->isPrefixOf(FContext))
+ CommonContext = &FContext;
}
- // For each function in current module, load all context profiles for
- // the function as well as their callee contexts which can help profile
- // guided importing for ThinLTO. This can be achieved by walking
- // through an ordered context container, where contexts are laid out
- // as if they were walked in preorder of a context trie. While
- // traversing the trie, a link to the highest common ancestor node is
- // kept so that all of its decendants will be loaded.
- const SampleContext *CommonContext = nullptr;
- for (const auto &NameOffset : FuncOffsetList) {
- const auto &FContext = NameOffset.first;
- FunctionId FName = FContext.getFunction();
- StringRef FNameString;
- if (!useMD5())
- FNameString = FName.stringRef();
-
- // For function in the current module, keep its farthest ancestor
- // context. This can be used to load itself and its child and
- // sibling contexts.
- if ((useMD5() && FuncGuidsToUse.count(FName.getHashCode())) ||
- (!useMD5() && (FuncsToUse.count(FNameString) ||
- (Remapper && Remapper->exist(FNameString))))) {
- if (!CommonContext || !CommonContext->isPrefixOf(FContext))
- CommonContext = &FContext;
- }
-
- if (CommonContext == &FContext ||
- (CommonContext && CommonContext->isPrefixOf(FContext))) {
- // Load profile for the current context which originated from
- // the common ancestor.
- const uint8_t *FuncProfileAddr = Start + NameOffset.second;
- if (std::error_code EC = readFuncProfile(FuncProfileAddr))
- return EC;
- }
- }
- } else if (useMD5()) {
- assert(!useFuncOffsetList());
- for (auto Name : FuncsToUse) {
- auto GUID = MD5Hash(Name);
- auto iter = FuncOffsetTable.find(GUID);
- if (iter == FuncOffsetTable.end())
- continue;
- const uint8_t *FuncProfileAddr = Start + iter->second;
- if (std::error_code EC = readFuncProfile(FuncProfileAddr))
- return EC;
- }
- } else if (Remapper) {
- assert(useFuncOffsetList());
- for (auto NameOffset : FuncOffsetList) {
- SampleContext FContext(NameOffset.first);
- auto FuncName = FContext.getFunction();
- StringRef FuncNameStr = FuncName.stringRef();
- if (!FuncsToUse.count(FuncNameStr) && !Remapper->exist(FuncNameStr))
- continue;
+ if (CommonContext == &FContext ||
+ (CommonContext && CommonContext->isPrefixOf(FContext))) {
+ // Load profile for the current context which originated from
+ // the common ancestor.
const uint8_t *FuncProfileAddr = Start + NameOffset.second;
if (std::error_code EC = readFuncProfile(FuncProfileAddr))
return EC;
}
- } else {
- assert(!useFuncOffsetList());
- for (auto Name : FuncsToUse) {
- auto iter = FuncOffsetTable.find(MD5Hash(Name));
- if (iter == FuncOffsetTable.end())
- continue;
- const uint8_t *FuncProfileAddr = Start + iter->second;
- if (std::error_code EC = readFuncProfile(FuncProfileAddr))
- return EC;
- }
}
+ } else if (useMD5()) {
+ assert(!useFuncOffsetList());
+ for (auto Name : FuncsToUse) {
+ auto GUID = MD5Hash(Name);
+ auto iter = FuncOffsetTable.find(GUID);
+ if (iter == FuncOffsetTable.end())
+ continue;
+ const uint8_t *FuncProfileAddr = Start + iter->second;
+ if (std::error_code EC = readFuncProfile(FuncProfileAddr, Profiles))
+ return EC;
+ }
+ } else if (Remapper) {
+ assert(useFuncOffsetList());
+ for (auto NameOffset : FuncOffsetList) {
+ SampleContext FContext(NameOffset.first);
+ auto FuncName = FContext.getFunction();
+ StringRef FuncNameStr = FuncName.stringRef();
+ if (!FuncsToUse.count(FuncNameStr) && !Remapper->exist(FuncNameStr))
+ continue;
+ const uint8_t *FuncProfileAddr = Start + NameOffset.second;
+ if (std::error_code EC = readFuncProfile(FuncProfileAddr, Profiles))
+ return EC;
+ }
+ } else {
+ assert(!useFuncOffsetList());
+ for (auto Name : FuncsToUse) {
+
+ auto iter = FuncOffsetTable.find(MD5Hash(Name));
+ if (iter == FuncOffsetTable.end())
+ continue;
+ const uint8_t *FuncProfileAddr = Start + iter->second;
+ if (std::error_code EC = readFuncProfile(FuncProfileAddr, Profiles))
+ return EC;
+ }
+ }
+}
+
+std::error_code SampleProfileReaderExtBinaryBase::readFuncProfiles() {
+ // Collect functions used by current module if the Reader has been
+ // given a module.
+ // collectFuncsFromModule uses FunctionSamples::getCanonicalFnName
+ // which will query FunctionSamples::HasUniqSuffix, so it has to be
+ // called after FunctionSamples::HasUniqSuffix is set, i.e. after
+ // NameTable section is read.
+ bool LoadFuncsToBeUsed = collectFuncsFromModule();
+
+ // When LoadFuncsToBeUsed is false, we are using LLVM tool, need to read all
+ // profiles.
+ if (!LoadFuncsToBeUsed) {
+ while (Data < End) {
+ if (std::error_code EC = readFuncProfile(Data))
+ return EC;
+ }
+ assert(Data == End && "More data is read than expected");
+ } else {
+ // Load function profiles on demand.
+ if (std::error_code EC = readFuncProfiles(FuncsToUse, Profiles))
+ return EC;
Data = End;
- }
+ }
assert((CSProfileCount == 0 || CSProfileCount == Profiles.size()) &&
"Cannot have both context-sensitive and regular profile");
assert((!CSProfileCount || ProfileIsCS) &&
@@ -1245,6 +1273,27 @@ SampleProfileReaderExtBinaryBase::readFuncMetadata(bool ProfileHasAttribute,
return sampleprof_error::success;
}
+std::error_code SampleProfileReaderExtBinaryBase::readFuncMetadataOnDemand(
+ bool ProfileHasAttribute, SampleProfileMap &Profiles) {
+ if (FContextToMetaDataSecRange.empty())
+ return sampleprof_error::success;
+
+ for (auto &I : Profiles) {
+ FunctionSamples *FProfile = &I.second;
+ auto R =
+ FContextToMetaDataSecRange.find(FProfile->getContext().getHashCode());
+ if (R == FContextToMetaDataSecRange.end())
+ continue;
+
+ Data = R->second.first;
+ End = R->second.second;
+ if (std::error_code EC = readFuncMetadata(ProfileHasAttribute, FProfile))
+ return EC;
+ assert(Data == End && "More data is read than expected");
+ }
+ return sampleprof_error::success;
+}
+
std::error_code
SampleProfileReaderExtBinaryBase::readFuncMetadata(bool ProfileHasAttribute) {
while (Data < End) {
@@ -1257,8 +1306,11 @@ SampleProfileReaderExtBinaryBase::readFuncMetadata(bool ProfileHasAttribute) {
if (It != Profiles.end())
FProfile = &It->second;
+ const uint8_t *Start = Data;
if (std::error_code EC = readFuncMetadata(ProfileHasAttribute, FProfile))
return EC;
+
+ FContextToMetaDataSecRange[FContext.getHashCode()] = {Start, Data};
}
assert(Data == End && "More data is read than expected");
diff --git a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
index 312672e56b017..b9adc6a0631b8 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp
@@ -782,6 +782,26 @@ bool SampleProfileMatcher::functionMatchesProfileHelper(
float Similarity = 0.0;
const auto *FSFlattened = getFlattenedSamplesFor(ProfFunc);
+ // Check if the function is top-level function. For extended profile format,
+ // if a function profile is unused and it's top-level, even if the profile is
+ // matched, it's not found in the profile. This is because sample reader only
+ // read the used profile at the beginning, we need to read the profile
+ // on-demand. Also save it into the FlattenedProfiles for future look-up.
+ if (!FSFlattened) {
+ DenseSet<StringRef> TopLevelFunc;
+ TopLevelFunc.insert(ProfFunc.stringRef());
+ SampleProfileMap TopLevelProfile;
+ Reader.readOnDemand(TopLevelFunc, TopLevelProfile);
+ assert(TopLevelProfile.size() <= 1 &&
+ "More than one profile is found for top-level function");
+ if (!TopLevelProfile.empty()) {
+ LLVM_DEBUG(dbgs() << "Read top-level function " << ProfFunc
+ << " for call-graph matching\n");
+ auto &FS = TopLevelProfile.begin()->second;
+ FSFlattened =
+ &(FlattenedProfiles.create(FS.getContext()) = std::move(FS));
+ }
+ }
if (!FSFlattened)
return false;
// The check for similarity or checksum may not be reliable if the function is
@@ -863,6 +883,39 @@ bool SampleProfileMatcher::functionMatchesProfile(Function &IRFunc,
return Matched;
}
+void SampleProfileMatcher::UpdateSampleLoaderWithRecoveredProfiles() {
+ DenseSet<StringRef> RecoveredFuncs;
+ // Update FuncNameToProfNameMap and SymbolMap.
+ for (auto &I : FuncToProfileNameMap) {
+ assert(I.first && "New function is null");
+ FunctionId FuncName(I.first->getName());
+ RecoveredFuncs.insert(I.second.stringRef());
+ FuncNameToProfNameMap->emplace(FuncName, I.second);
+
+ // We need to remove the old entry to avoid duplicating the function
+ // processing.
+ SymbolMap->erase(FuncName);
+ SymbolMap->emplace(I.second, I.first);
+ }
+
+ // Read the top-level profiles for the recovered function profiles. This is
+ // because in extended binary format it only loads the top-level profile for
+ // the functions in the new build but not the recovered functions which is
+ // from the old bu...
[truncated]
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
@@ -392,6 +392,11 @@ class SampleProfileReader { | |||
/// which doesn't support loading function profiles on demand. | |||
virtual bool collectFuncsFromModule() { return false; } | |||
|
|||
virtual std::error_code readOnDemand(const DenseSet<StringRef> &FuncsToUse, | |||
SampleProfileMap &Profiles) { | |||
return sampleprof_error::not_implemented; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use a pure virtual function here? See readHeader
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only support extended binary format, if we use pure virtual function, we may need more duplicated override functions(only return not_implemented
) for all other derived classes.
@@ -392,6 +392,11 @@ class SampleProfileReader { | |||
/// which doesn't support loading function profiles on demand. | |||
virtual bool collectFuncsFromModule() { return false; } | |||
|
|||
virtual std::error_code readOnDemand(const DenseSet<StringRef> &FuncsToUse, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: On-demand is how this API is used in call graph matching case. But as an API, the key difference comparing to readFuncProfiles
is it reads given set of functions, not on-demand. Suggest to name API based on that, maybe just overload readFuncProfiles
/read
since the parameters imply the difference?
Same for readFuncMetadataOnDemand
.
SymbolMap->emplace(I.second, I.first); | ||
} | ||
|
||
// Read the top-level profiles for the recovered function profiles. This is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"recovered function" can be unclear it means, how about "function with different profile name"?
The comment also reads a bit unclear, how about this
"
With extbinary profile format, initial profile loading only reads profile based on current function names in the module, so we need to load top-level profiles for functions with different profile name explicitly after function-profile name map is established with stale profile matching.
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion!
SymbolMap->erase(FuncName); | ||
SymbolMap->emplace(I.second, I.first); | ||
} | ||
UpdateSampleLoaderWithRecoveredProfiles(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rename as UpdateWithSalvagedProfiles
?
llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-toplev-func.ll
Show resolved
Hide resolved
LLVM_DEBUG(dbgs() << "Top-level function " << I.second.getFunction() | ||
<< " is recovered and re-read by the sample reader.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: recovered -> salvaged, top-level function -> profile for top-level function
auto &Ctx = I.second.getContext(); | ||
assert(Profiles.find(Ctx) == Profiles.end() && | ||
"Top level profile is found for the unused profile"); | ||
Profiles.create(Ctx) = std::move(I.second); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we still populate the newly loaded profiles into the same SampleProfileReader::Profiles
map? This would make reader APIs simpler and more consistent, and we won't need the assertion above either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I was thinking using a different profile map so that we can debug check the reloaded profiles for the test, but I can test check it in a different place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I was suggesting is that the read
no longer need a SampleProfileMap
parameter to hold newly loaded profile. I'm still seeing read
having that parameter in the latest version?
@@ -782,6 +782,26 @@ bool SampleProfileMatcher::functionMatchesProfileHelper( | |||
float Similarity = 0.0; | |||
|
|||
const auto *FSFlattened = getFlattenedSamplesFor(ProfFunc); | |||
// Check if the function is top-level function. For extended profile format, | |||
// if a function profile is unused and it's top-level, even if the profile is | |||
// matched, it's not found in the profile. This is because sample reader only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rewrite this sentence and maybe the whole comment :)
even if the profile is matched, it's not found in the profile
DenseSet<StringRef> TopLevelFunc; | ||
TopLevelFunc.insert(ProfFunc.stringRef()); | ||
SampleProfileMap TopLevelProfile; | ||
Reader.readOnDemand(TopLevelFunc, TopLevelProfile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often does this happen on this path (coming from LCS I think)? And how useful it is to load profile here?
Asking because we probably don't want to load many functions just to do the similarity checks during LCS, as loading profile is kind of expensive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the key part to load the top-level profile. If we don't load the profile, LCS(for cg matching) will never run. see the line ( if (!FSFlattened) return false;
), the function just bails out if it doesn't find any profile.
Asking because we probably don't want to load many functions just to do the similarity checks during LCS, as loading profile is kind of expensive?
Good question! Overall, I think the cost should be fine here.
Note that this is in deep branch, we have a lot of filters(same caller scope, profile not in flattened profile), then the chance to run this code is probably not big.
Additionally, for the on-demand read https://github.com/llvm/llvm-project/blob/main/llvm/lib/ProfileData/SampleProfReader.cpp#L935-L937
See it check the FuncOffsetTable
(not iterating all the FuncOffsetTable
) and read one profile, reading nested profile in one profile is probably fine.
(However, one thing I noticed that 4fe91e0 seems not in llvm-17- branch, we may need to port it to the old version branch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we add a switch to on/off this feature? Just in case this (loading profile for matched functions) overall becomes expensive.
llvm/test/Transforms/SampleProfile/pseudo-probe-stale-profile-toplev-func.ll
Show resolved
Hide resolved
if (std::error_code EC = readFuncMetadata(ProfileHasAttribute, FProfile)) | ||
return EC; | ||
|
||
FContextToMetaDataSecRange[FContext.getHashCode()] = {Start, Data}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a function doesn't have any attribute, we can skip inserting the index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It Is possible that the a function doesn't have attribute but do have FunctionHash, in this case, we still need to insert the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change summary needs to be updated, we no longer have readOnDemand
DenseSet<StringRef> TopLevelFunc; | ||
TopLevelFunc.insert(ProfFunc.stringRef()); | ||
SampleProfileMap TopLevelProfile; | ||
Reader.readOnDemand(TopLevelFunc, TopLevelProfile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we add a switch to on/off this feature? Just in case this (loading profile for matched functions) overall becomes expensive.
if (!FSFlattened) { | ||
DenseSet<StringRef> TopLevelFunc({ProfFunc.stringRef()}); | ||
SampleProfileMap TopLevelProfile; | ||
Reader.read(TopLevelFunc, TopLevelProfile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will actually return not_implemented for other format, we should probably check success before proceeding the following logic to read profile, to make it explicit.
auto &Ctx = I.second.getContext(); | ||
assert(Profiles.find(Ctx) == Profiles.end() && | ||
"Top level profile is found for the unused profile"); | ||
Profiles.create(Ctx) = std::move(I.second); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I was suggesting is that the read
no longer need a SampleProfileMap
parameter to hold newly loaded profile. I'm still seeing read
having that parameter in the latest version?
SampleProfileMap TopLevelProfile; | ||
Reader.read(TopLevelFunc, TopLevelProfile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we populate the existing Profiles map in reader, the read
API won't need a SampleProfileMap
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is different to the one in UpdateWithSalvagedProfiles
. We use a different profile map for the matching, FlattenedProfiles
. so here the output map is the FlattenedProfiles
, not the sample reader's internal Profiles
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can still make the API simpler for reader - we can always load into reader's profile map, and then retriever it by name for downstream use. Reader can serve as a cache too.
OTOH, the profile from reader isn't guaranteed to be flattened, why do we put it directly into flattened profile map anyways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. It's a little bit tricky, there could be more places to use the FlattenedProfiles
(the profiles for the matching), then For anywhere it's used, we also need to check original samples reader profile map. (I was thinking to rename FlattenedProfiles
and use it so that we only have one container.). Right now there is only one more place, just updated that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you commit the pure refactoring part first?
Created a new PR: #104654 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with a nit.
…profiles for given functions (#104654) Currently in extended binary format, sample reader only read the profiles when the function are in the current module at initialization time, this extends the support to read the arbitrary profiles for given input functions in later stage. It's used for #101053.
…profiles for given functions (llvm#104654) Currently in extended binary format, sample reader only read the profiles when the function are in the current module at initialization time, this extends the support to read the arbitrary profiles for given input functions in later stage. It's used for llvm#101053.
With extbinary profile format, initial profile loading only reads profile based on current function names in the module. However, if a function is renamed, sample loader skips to load its original profile(which has a different name), we will miss this case. To address this, we load the top-level profile candidate explicitly for the matching. If a match is found later, the function profile will be further preserved for use by the sample loader.