[AST] Ensure getRawCommentsForAnyRedecl() does not miss any redecl with a comment #108475
@llvm/pr-subscribers-clang
Author: Nathan Ridge (HighCommander4)
Changes
The intent of the […] However, redecls are a circular list, and if iteration starts from the input decl […] Starting the iteration from the first (canonical) decl makes the cache work as intended.
Fixes #108145
Full diff: https://github.com/llvm/llvm-project/pull/108475.diff
3 Files Affected:
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index a4e6d3b108c8a5..3735534ef3d3f1 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -444,7 +444,7 @@ const RawComment *ASTContext::getRawCommentForAnyRedecl(
     return CommentlessRedeclChains.lookup(CanonicalD);
   }();
-  for (const auto Redecl : D->redecls()) {
+  for (const auto Redecl : CanonicalD->redecls()) {
     assert(Redecl);
     // Skip all redeclarations that have been checked previously.
     if (LastCheckedRedecl) {
diff --git a/clang/unittests/AST/CMakeLists.txt b/clang/unittests/AST/CMakeLists.txt
index dcc9bc0f39ac2c..79ad8a28f2b33c 100644
--- a/clang/unittests/AST/CMakeLists.txt
+++ b/clang/unittests/AST/CMakeLists.txt
@@ -33,6 +33,7 @@ add_clang_unittest(ASTTests
   NamedDeclPrinterTest.cpp
   ProfilingTest.cpp
   RandstructTest.cpp
+  RawCommentForDeclTest.cpp
   RecursiveASTVisitorTest.cpp
   SizelessTypesTest.cpp
   SourceLocationTest.cpp
diff --git a/clang/unittests/AST/RawCommentForDeclTest.cpp b/clang/unittests/AST/RawCommentForDeclTest.cpp
new file mode 100644
index 00000000000000..b811df28127d43
--- /dev/null
+++ b/clang/unittests/AST/RawCommentForDeclTest.cpp
@@ -0,0 +1,99 @@
+//===- unittests/AST/RawCommentForDeclTestTest.cpp
+//-------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/AST/ASTConsumer.h"
+#include "clang/AST/DeclGroup.h"
+#include "clang/Frontend/CompilerInstance.h"
+#include "clang/Frontend/FrontendAction.h"
+#include "clang/Tooling/Tooling.h"
+
+#include "gmock/gmock-matchers.h"
+#include "gtest/gtest.h"
+
+namespace clang {
+
+struct FoundComment {
+  std::string DeclName;
+  bool IsDefinition;
+  std::string Comment;
+
+  bool operator==(const FoundComment &RHS) const {
+    return DeclName == RHS.DeclName && IsDefinition == RHS.IsDefinition &&
+           Comment == RHS.Comment;
+  }
+  friend llvm::raw_ostream &operator<<(llvm::raw_ostream &Stream,
+                                       const FoundComment &C) {
+    return Stream << "{Name: " << C.DeclName << ", Def: " << C.IsDefinition
+                  << ", Comment: " << C.Comment << "}";
+  }
+};
+
+class CollectCommentsAction : public ASTFrontendAction {
+public:
+  CollectCommentsAction(std::vector<FoundComment> &Comments)
+      : Comments(Comments) {}
+
+  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
+                                                 llvm::StringRef) override {
+    CI.getLangOpts().CommentOpts.ParseAllComments = true;
+    return std::make_unique<Consumer>(*this);
+  }
+
+  std::vector<FoundComment> &Comments;
+
+private:
+  class Consumer : public clang::ASTConsumer {
+  private:
+    CollectCommentsAction &Action;
+
+  public:
+    Consumer(CollectCommentsAction &Action) : Action(Action) {}
+    ~Consumer() override {}
+
+    bool HandleTopLevelDecl(DeclGroupRef DG) override {
+      for (Decl *D : DG) {
+        if (NamedDecl *ND = dyn_cast<NamedDecl>(D)) {
+          auto &Ctx = D->getASTContext();
+          const auto *RC = Ctx.getRawCommentForAnyRedecl(D);
+          Action.Comments.push_back(FoundComment{
+              ND->getNameAsString(), IsDefinition(D),
+              RC ? RC->getRawText(Ctx.getSourceManager()).str() : ""});
+        }
+      }
+
+      return true;
+    }
+
+    static bool IsDefinition(const Decl *D) {
+      if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
+        return FD->isThisDeclarationADefinition();
+      }
+      if (const TagDecl *TD = dyn_cast<TagDecl>(D)) {
+        return TD->isThisDeclarationADefinition();
+      }
+      return false;
+    }
+  };
+};
+
+TEST(RawCommentForDecl, DefinitionComment) {
+  std::vector<FoundComment> Comments;
+  auto Action = std::make_unique<CollectCommentsAction>(Comments);
+  ASSERT_TRUE(tooling::runToolOnCode(std::move(Action), R"cpp(
+    void f();
+
+    // f is the best
+    void f() {}
+  )cpp"));
+  EXPECT_THAT(Comments, testing::ElementsAre(
+                            FoundComment{"f", false, ""},
+                            FoundComment{"f", true, "// f is the best"}));
+}
+
+} // namespace clang
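
The caching problem described in the PR summary can be reproduced without Clang. The following is a minimal sketch, not the real ASTContext code: ToyChain, redeclOrder, lookupOld and replay are hypothetical names, comments are plain strings, and the visit order (from the starting declaration back to the first one, then wrapping around to the newest) is an assumption about how redecls() walks the circular list.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for one redeclaration chain (hypothetical, not a Clang type).
// Declarations are stored in source order; index 0 plays the canonical decl.
struct ToyChain {
  std::vector<std::optional<std::string>> Comments; // comment per decl, if any
  std::optional<std::size_t> LastChecked;           // models the cache entry
};

// Assumed visit order of the circular list when starting at Start:
// Start, then earlier decls down to index 0, then wrap to the newest decl.
std::vector<std::size_t> redeclOrder(std::size_t N, std::size_t Start) {
  std::vector<std::size_t> Order;
  for (std::size_t Step = 0; Step < N; ++Step)
    Order.push_back((Start + N - Step) % N);
  return Order;
}

// Sketch of the pre-patch lookup: skip every decl up to and including the
// cached one, then search the remaining decls for a comment.
std::optional<std::string> lookupOld(ToyChain &C, std::size_t Start) {
  std::optional<std::size_t> Skip = C.LastChecked;
  for (std::size_t I : redeclOrder(C.Comments.size(), Start)) {
    if (Skip) {
      if (*Skip == I)
        Skip.reset();
      continue;
    }
    if (C.Comments[I])
      return C.Comments[I];
    C.LastChecked = I; // remember the last commentless decl we looked at
  }
  return std::nullopt;
}

// Replays the unit test's input: `void f();` followed by a commented
// definition. Returns what the second lookup finds when started at
// index StartSecond.
std::optional<std::string> replay(std::size_t StartSecond) {
  ToyChain C;
  C.Comments.push_back(std::nullopt); // void f();
  std::optional<std::string> First = lookupOld(C, 0);
  assert(!First); // nothing found; index 0 is now cached as commentless
  (void)First;
  C.Comments.push_back(std::string("// f is the best")); // void f() {}
  return lookupOld(C, StartSecond);
}
```

In this model, replay(1) starts the second lookup at the new definition and finds nothing, because the definition itself is skipped while walking toward the cached entry. replay(0) starts at the first declaration and does find the comment, which is the behaviour this version of the patch relies on.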
Buildkite is showing the test […]
I've been investigating this failure. It's caused by a slight change of behaviour of […]

/// Aaa.
template<typename T, typename U>
void foo(T aaa, U bbb);

/// Bbb.
template<>
void foo(int aaa, int bbb);

While the correct comment ( […] This is in turn because the code that sets this is conditioned on […] When […]

I'm not familiar enough with the AST modeling of template specializations to say what is going wrong here... @gribozavr as the author of the mentioned check, any advice would be appreciated.
I was curious why it is relying on […]

EDIT: It seems that […]
@HighCommander4

struct FoundComment {
  std::string DeclName;
  bool IsDefinition;
  std::string Comment;
  comments::DeclInfo::TemplateDeclKind TDK;
  // ... comparators are snipped ...
};

Action.Comments.push_back(FoundComment{
    ND->getNameAsString(), IsDefinition(D),
    RC ? RC->getRawText(Ctx.getSourceManager()).str() : "",
    RC->parse(Ctx, &Action.getCompilerInstance().getPreprocessor(), D)
        ->getDeclInfo()
        ->getTemplateKind()});

So for the following test case,

/// Aaa.
template<typename T, typename U>
void foo(T aaa, U bbb);

/// Bbb.
template<>
void foo(int aaa, int bbb);

I didn't see the following failing

EXPECT_THAT(
    Comments,
    testing::ElementsAre(
        FoundComment{"foo", false, "/// Aaa.",
                     comments::DeclInfo::TemplateDeclKind::Template},
        FoundComment{
            "foo", false, "/// Bbb.",
            comments::DeclInfo::TemplateDeclKind::TemplateSpecialization}));

Did I misread anything from your last comment?

NVM, the […]
So the problem is that we would have another implicitly-created explicit template specialization
So the status-quo is seemingly violating the contracts of […]

The […]

// Mark the prior declaration as an explicit specialization, so that later
// clients know that this is an explicit specialization.

SmallVector<TemplateParameterList *, 4> TPL;
for (unsigned I = 0; I < FD->getNumTemplateParameterLists(); ++I)
  TPL.push_back(FD->getTemplateParameterList(I));
Specialization->setTemplateParameterListsInfo(Context, TPL);

An alternative might be to teach […]

EDIT: I ran a quick check-clang test locally and nothing failed, so this is probably feasible.
cd39b4b to 408259c
@zyn0217 Thank you for the analysis and suggestion! I updated the patch as suggested, let's see what buildkite says.

Hmm, quite a few things are failing: […]
Sorry, I might have forgotten to save the changes before I ran the tests yesterday!

I looked into it again, and I think I have begun to understand […]

The first intent is to describe out-of-line member functions that live in a templated scope. For example,

template <class T>
struct S {
  void foo();
};

template <class T>
void S<T>::foo() {}

So the member function […]

However, for non-member function explicit specialization in question, […]

Seemingly we might fix the problem in […]

diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 3c6a0dff798f..bb02910f1dfe 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -10499,6 +10499,13 @@ Sema::ActOnFunctionDeclarator(Scope *S, Declarator &D, DeclContext *DC,
         if (CheckFunctionTemplateSpecialization(NewFD, ExplicitTemplateArgs,
                                                 Previous))
           NewFD->setInvalidDecl();
+        // For source fidelity, store all the template param lists.
+        if (!NewFD->isInvalidDecl() && TemplateParamLists.size() > 0) {
+          auto *Specialization = Previous.getAsSingle<FunctionDecl>();
+          assert(Specialization);
+          Specialization->setTemplateParameterListsInfo(Context,
+                                                        TemplateParamLists);
+        }
       }
     } else if (isMemberSpecialization && !FunctionTemplate) {
       if (CheckMemberSpecialization(NewFD, Previous))

I tried that approach locally, and now all check-clang tests pass, but some of the check-clangd tests failed […]

I don't know if people already have some assumptions that an "implicit" explicit specialization shouldn't have its […]
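
For reference, the source forms being compared in this comment look as follows in plain C++. This is only an illustration of the syntax: the member template bar and the helper formsBehaveAsExpected are made up here, and the snippet says nothing about how Clang records each template parameter list in the AST.

```cpp
#include <cassert>
#include <string>

template <class T>
struct S {
  std::string foo();
  template <class U>
  std::string bar(U);
};

// Out-of-line definition of a member of a class template: the declarator is
// preceded by the template parameter list of the enclosing template.
template <class T>
std::string S<T>::foo() { return "S<T>::foo"; }

// Out-of-line definition of a member template: two template parameter lists,
// one for the enclosing class template and one for the member itself.
template <class T>
template <class U>
std::string S<T>::bar(U) { return "S<T>::bar<U>"; }

// Non-member function template and an explicit specialization of it, which
// is the shape involved in the failing test.
template <typename T, typename U>
std::string foo(T, U) { return "primary"; }

template <>
std::string foo(int, int) { return "explicit specialization"; }

// Small check that each form is a distinct, callable entity.
bool formsBehaveAsExpected() {
  S<int> Obj;
  return Obj.foo() == "S<T>::foo" && Obj.bar('x') == "S<T>::bar<U>" &&
         foo(1, 2.0) == "primary" && foo(1, 2) == "explicit specialization";
}
```

Only the last form carries an empty `template <>` list in front of a non-member declarator, which is the case the discussion is about.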
@zyn0217 I've debugged the […]

header.h:
template <typename T> void foo();
template <> void foo<int>();

source.cpp:
#include "header.h"
template <> void foo<int>() {}

It seems the […]
clang/lib/AST/ASTContext.cpp (Outdated)
- for (const auto Redecl : D->redecls()) {
+ for (const auto Redecl : CanonicalD->redecls()) {
I reread your analysis and I think the only safe usage of getRawCommentForAnyRedecl() previously was that, in the second call to the function, only those Ds that live in between the CanonicalDecl and LastCheckedRedecl are expected to be used as the starting point of the traversal. Otherwise, the presence of LastCheckedRedecl would result in skipping over all declarations, including those whose associated comments had not been parsed yet.
So I wonder if we could make the use of LastCheckedRedecl opt-in. To be clear, we first check if D is in the path Canonical -> LastCheckedRedecl. If so, we use the mechanism that skips past every visited declaration; otherwise, we ignore it.
// Any redeclarations of D that we haven't checked for comments yet?
const Decl *LastCheckedRedecl = CommentlessRedeclChains.lookup(CanonicalD);
bool CanUseCommentlessCache = false;
if (LastCheckedRedecl) {
  for (auto *Redecl : CanonicalD->redecls()) {
    if (Redecl == D) {
      CanUseCommentlessCache = true;
      break;
    }
    if (Redecl == LastCheckedRedecl)
      break;
  }
}
// Skip all redeclarations that have been checked previously.
if (CanUseCommentlessCache && LastCheckedRedecl)
  ...
The solution could be improved to pick up the LastCheckedRedecl again after visiting the CanonicalDecl, though.
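
The opt-in idea can be tried out on a toy model that needs no Clang. This is a sketch under stated assumptions rather than the patch itself: ToyChain, redeclOrder, lookupOptIn and replayOptIn are hypothetical names, comments are plain strings, and the visit order (from the starting declaration back to the first one, then wrapping around to the newest) is an assumption about how redecls() walks the circular list.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for one redeclaration chain (hypothetical, not a Clang type).
// Declarations are stored in source order; index 0 plays the canonical decl.
struct ToyChain {
  std::vector<std::optional<std::string>> Comments; // comment per decl, if any
  std::optional<std::size_t> LastChecked;           // models the cache entry
};

// Assumed visit order of the circular list when starting at Start:
// Start, then earlier decls down to index 0, then wrap to the newest decl.
std::vector<std::size_t> redeclOrder(std::size_t N, std::size_t Start) {
  std::vector<std::size_t> Order;
  for (std::size_t Step = 0; Step < N; ++Step)
    Order.push_back((Start + N - Step) % N);
  return Order;
}

// Sketch of the opt-in variant: the cache is only trusted when Start lies on
// the walk from the canonical decl (index 0) to the cached decl.
std::optional<std::string> lookupOptIn(ToyChain &C, std::size_t Start) {
  const std::size_t N = C.Comments.size();
  const std::optional<std::size_t> Last = C.LastChecked;
  bool CanUseCache = false;
  if (Last) {
    for (std::size_t I : redeclOrder(N, 0)) {
      if (I == Start) {
        CanUseCache = true;
        break;
      }
      if (I == *Last)
        break;
    }
  }
  for (std::size_t I : redeclOrder(N, Start)) {
    if (CanUseCache) { // still inside the already-checked prefix
      if (I == *Last)
        CanUseCache = false;
      continue;
    }
    if (C.Comments[I])
      return C.Comments[I];
    C.LastChecked = I; // remember the last commentless decl we looked at
  }
  return std::nullopt;
}

// Replays `void f();` followed by a commented definition, then performs the
// second lookup starting at index StartSecond.
std::optional<std::string> replayOptIn(std::size_t StartSecond) {
  ToyChain C;
  C.Comments.push_back(std::nullopt); // void f();
  std::optional<std::string> First = lookupOptIn(C, 0);
  assert(!First); // nothing found; index 0 is now cached as commentless
  (void)First;
  C.Comments.push_back(std::string("// f is the best")); // void f() {}
  return lookupOptIn(C, StartSecond);
}
```

In this model the comment is found whether the second lookup starts at the new definition (the cache is ignored) or at the first declaration (the cache is used to skip index 0).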
Thank you for the suggestion! It indeed feels safer to tweak getRawCommentForAnyRedecl() like this than to muck around with the modeling of template specializations 😆
Yeah, but I still feel the modeling of template specializations should be improved regarding getSourceRange() relying on TemplateParameterListsInfo, which is really peculiar. But that's another murky topic unrelated to this patch, anyway.
"regarding getSourceRange() relying on TemplateParameterListsInfo, which is really peculiar"
FWIW, that part actually makes sense to me. For a function/method declaration that looks like this:
template <...>
ReturnType FunctionName(Parameters...);
the answer to the question "where does the source range begin?" is indeed "if there is a template parameter list, where does it begin?"
(The part that makes less sense to me is, why does the explicit specialization have a redeclaration at all in a case like
/// Aaa.
template<typename T, typename U>
void foo(T aaa, U bbb);
/// Bbb.
template<>
void foo(int aaa, int bbb);
There is only one declaration of the explicit specialization in the code.)
Yup. In fact, I think we can't be too confident about how many FunctionDecls get generated during template parsing. For example, they can be byproducts of a TreeTransform (it does not instantiate declarations directly, but it can delegate to Sema calls that create extra declarations) or of template argument deduction, which makes the contract of TemplateParameterListsInfo less reliable.
408259c to 1df9057
Buildkite is green with this approach! Graduated patch from "Draft" state.
Thanks for working on this! I think this probably needs a release note, otherwise it looks good on the whole.
Please give other folks some time before merging it, and I invited @AaronBallman for the second pair of eyes.
if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
  return FD->isThisDeclarationADefinition();
}
if (const TagDecl *TD = dyn_cast<TagDecl>(D)) {
  return TD->isThisDeclarationADefinition();
}
We might want to handle *TemplateDecls too, but it is not required as we don't actually need them now.
Thanks for the fix! The changes should also come with a release note so users know about the improvement. Otherwise, changes basically LG modulo nits.
1df9057 to 2b14e80
Thanks for the reviews! I'll add the release note shortly (need to update to a newer baseline first).
Note, I also updated the commit message to reflect the new fix approach.
0a1f551 to d224e1b
(Rebased)
…th a comment The previous implementation had a bug where, if it was called on a Decl later in the redecl chain than `LastCheckedDecl`, it could incorrectly skip and overlook a Decl with a comment. The patch addresses this by only using `LastCheckedDecl` if the input Decl `D` is on the path from the first (canonical) Decl to `LastCheckedDecl`. An alternative that was considered was to start the iteration from the (canonical) Decl, however this ran into problems with the modelling of explicit template specializations in the AST where the canonical Decl can be unusual. With the current solution, if no Decls were checked yet, we prefer to check the input Decl over the canonical one.
d224e1b to 1df6853
Added release note. I put it under "Bug Fixes to AST Handling" which seemed like a good fit.
The previous implementation had a bug where, if it was called on a Decl later in the redecl chain than LastCheckedDecl, it could incorrectly skip and overlook a Decl with a comment.
The patch addresses this by only using LastCheckedDecl if the input Decl D is on the path from the first (canonical) Decl to LastCheckedDecl.
An alternative that was considered was to start the iteration from the (canonical) Decl, however this ran into problems with the modelling of explicit template specializations in the AST where the canonical Decl can be unusual. With the current solution, if no Decls were checked yet, we prefer to check the input Decl over the canonical one.
Fixes #108145