Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aal::aal_arm implements tick for the 64 bits flavor. #564

Merged
merged 1 commit into from
Jan 31, 2023

Conversation

devnexen
Copy link
Collaborator

No description provided.

Copy link
Contributor

@nwf-msr nwf-msr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I'm a bit surprised to see you needed to do something like this.

Our Arm AAL includes NoCpuCycleCounters in its aal_features, so

/**
* Return an architecture-specific cycle counter.
*
* If the compiler provides a portable prefetch builtin, use it directly,
* otherwise delegate to the architecture-specific layer. This allows new
* architectures to avoid needing to implement a custom `tick` method
* if they are used only with a compiler that provides the builtin.
*/
static inline uint64_t tick() noexcept
{
if constexpr (
(Arch::aal_features & NoCpuCycleCounters) == NoCpuCycleCounters)
{
auto tick = std::chrono::high_resolution_clock::now();
return static_cast<uint64_t>(
std::chrono::duration_cast<std::chrono::nanoseconds>(
tick.time_since_epoch())
.count());
}
else
{
#if __has_builtin(__builtin_readcyclecounter) && \
!defined(SNMALLOC_NO_AAL_BUILTINS)
return __builtin_readcyclecounter();
#else
return Arch::tick();
#endif
}
}
};
should be using the std::chrono::high_resolution_clock approach both prior to and with this patch. Am I misreading our #if and if constexpr maze?

Were we to remove NoCpuCycleCounters, on aarch64 it seems like we'd be able to use __builtin_readcyclecounter. Clang lowers that to reading from PMCCNTR_EL0, which Arm describes as "[h]old[ing] the value of the processor Cycle Counter, CCNT, that counts processor clock cycles" and notes that it is mediated by PMCCFILTR_EL0. On the other hand, CNTVCT_EL0 is described as "[h]old[ing] the 64-bit virtual count value. The virtual count value is equal to the physical count value minus the virtual offset visible in CNTVOFF_EL2". Is there a reason the PMCCNTR_EL0 value is not sufficient for your use?

Last, while this is purely cosmetic at the moment, I'd prefer the #if conditional to be on __aarch64__ rather than SNMALLOC_VA_BITS_64.

@devnexen
Copy link
Collaborator Author

Hm. I'm a bit surprised to see you needed to do something like this.

Our Arm AAL includes NoCpuCycleCounters in its aal_features, so

/**
* Return an architecture-specific cycle counter.
*
* If the compiler provides a portable prefetch builtin, use it directly,
* otherwise delegate to the architecture-specific layer. This allows new
* architectures to avoid needing to implement a custom `tick` method
* if they are used only with a compiler that provides the builtin.
*/
static inline uint64_t tick() noexcept
{
if constexpr (
(Arch::aal_features & NoCpuCycleCounters) == NoCpuCycleCounters)
{
auto tick = std::chrono::high_resolution_clock::now();
return static_cast<uint64_t>(
std::chrono::duration_cast<std::chrono::nanoseconds>(
tick.time_since_epoch())
.count());
}
else
{
#if __has_builtin(__builtin_readcyclecounter) && \
!defined(SNMALLOC_NO_AAL_BUILTINS)
return __builtin_readcyclecounter();
#else
return Arch::tick();
#endif
}
}
};

should be using the std::chrono::high_resolution_clock approach both prior to and with this patch. Am I misreading our #if and if constexpr maze?

That's the point I am conceding, I missed this flag indeed.

Were we to remove NoCpuCycleCounters, on aarch64 it seems like we'd be able to use __builtin_readcyclecounter. Clang lowers that to reading from PMCCNTR_EL0, which Arm describes as "[h]old[ing] the value of the processor Cycle Counter, CCNT, that counts processor clock cycles" and notes that it is mediated by PMCCFILTR_EL0. On the other hand, CNTVCT_EL0 is described as "[h]old[ing] the 64-bit virtual count value. The virtual count value is equal to the physical count value minus the virtual offset visible in CNTVOFF_EL2". Is there a reason the PMCCNTR_EL0 value is not sufficient for your use?

Because, as you mentioned, the offset is already removed.

Last, while this is purely cosmetic at the moment, I'd prefer the #if conditional to be on __aarch64__ rather than SNMALLOC_VA_BITS_64.

Hm, I do not really see the point, it will make it just inconsistent with the rest.

@devnexen devnexen force-pushed the tickarm64 branch 2 times, most recently from e9a60d6 to a67ce3d Compare September 25, 2022 22:30
@mjp41 mjp41 dismissed nwf-msr’s stale review January 31, 2023 10:16

Changes have been done, and @nwf-msr is OOF

@mjp41 mjp41 merged commit 6be63b1 into microsoft:main Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants