Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation on current behavior for X86 #212

Merged
merged 5 commits into from
Jan 24, 2023

Conversation

toor1245
Copy link
Contributor

No description provided.

@toor1245
Copy link
Contributor Author

@gchatelet, @Mizux, please review PR

@toor1245
Copy link
Contributor Author

ping

@toor1245 toor1245 force-pushed the add-docs-os-support-x86 branch from d85520f to 37e4db5 Compare January 14, 2022 17:36
Comment on lines 124 to 139
// Before an application attempts to use the SIMD extensions,
// it should check that they are present on the processor and supported by OS:
// 1. Check that operating system saves and restores XMM/YMM/ZMM/TMM
// registers during context switches. Detect CPUID.1:ECX.OSXSAVE[bit 27]
// this feature flag specifies that operating system provides extended state
// management implied HW support for XSAVE, XRSTOR, XGETBV, XCR0.
// 2. Check that the SIMD extensions supports by hardware.
// Detect CPUID.1:ECX.XSAVE[bit 26]
// 3. Check enabled state in XCR0 via XGETBV.
// 4. Check feature flag for instruction set.
//
// Note:
// It is unwise for an application to rely exclusively on feature flag for
// instruction set or at all on CPUID.1:ECX.XSAVE[bit 26]: These indicate
// hardware support but not operating system support. If XMM/YMM/ZMM/TMM state
// management is not enabled by an operating systems, instructions will #UD
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIU this is documenting the current behavior right?
If so I would phrase it differently because it reads as if it was an advice to the user.

Also the section to which you added the documentation only contains the following functions:

static bool HasMask(uint32_t value, uint32_t mask);
static bool HasXmmOsXSave(uint32_t xcr0_eax);
static bool HasYmmOsXSave(uint32_t xcr0_eax);
static bool HasZmmOsXSave(uint32_t xcr0_eax);
static bool HasTmmOsXSave(uint32_t xcr0_eax);

I agree that we could use some documentation to describe what it means to support SIMD instructions but then I would put this section at the top of the file and describe it in terms of process, not in the form of an advice ("it should check", "It is unwise"). Something along these lines:

A note on x86 SIMD instructions availability
--------------------------------------------
A number of conditions need to be met for an application to use SIMD instructions:
1. The CPU itself must support the instruction.
    - we use `CPUID` to check whether the feature is supported.
2. The OS must save and restore the associated SIMD register across context switches, we check that:
    - the CPU reports supporting hardware context switching instructions via CPUID.1:ECX.XSAVE[bit 26]
    - the OS reports supporting hardware context switching instructions via CPUID.1:ECX.OSXSAVE[bit 27]
    - the CPU extended control register 0 (XCR0) is set to save and restore the needed SIMD registers

Note that if `XSAVE`/`OSXSAVE` are missing, we delegate the detection to the OS via the `DetectFeaturesFromOs` function or via microarchitecture heuristics.

@toor1245 toor1245 force-pushed the add-docs-os-support-x86 branch from 8fb1ca5 to 8528847 Compare August 15, 2022 21:24
@toor1245
Copy link
Contributor Author

@gchatelet, I updated PR, but there is a question related to GetX86CacheInfo, I see that the comment describes that it works only on Intel CPUs, but it works on AMD (new) and Hygon processors. May I update this information or did you do it on purpose?

// Returns cache hierarchy informations.
// Can call cpuid multiple times.
// Only works on Intel CPU at the moment.
CacheInfo GetX86CacheInfo(void);

@gchatelet
Copy link
Collaborator

@toor1245 the comment is outdated, yes please remove the part about "Intel only"

@toor1245 toor1245 changed the title Add documentation of OS support for X86 Add documentation on current behavior for X86 Aug 16, 2022
// Encoding
// -----------------------------------------------------------------------------
// `X86Info` contains fields such as `vendor` and `brand_string` they are
// represents characters in ASCII encoding. `vendor` length of characters is 13
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"X86Info contains fields such as vendor and brand_string that are ASCII encoded strings."

Copy link
Collaborator

@gchatelet gchatelet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a bunch of documentation comments. Should be good to go then.

// -----------------------------------------------------------------------------
// `GetX86Microarchitecture` function consists of check on vendor via
// `IsVendorByX86Info`. We use `CPUID(family, model)` macro to define
// microarchitecture of vendor. In cases where the `family` and `model` is the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"to define the vendor's microarchitecture"

// `IsVendorByX86Info`. We use `CPUID(family, model)` macro to define
// microarchitecture of vendor. In cases where the `family` and `model` is the
// same for several microarchitectures we do a stepping check or in the worst
// case parsing `brand_string` as example `HasSecondFMA`. Details of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"or in the worst case we rely on parsing brand_string (see HasSecondFMA for an example)"

//
// CacheInfo X86
// -----------------------------------------------------------------------------
// We use common struct `CacheInfo` for all CPU architectures and this struct
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"We use the CacheInfo struct to store information about cache levels. The maximum number of levels is hardcoded but can be increased if needed. We have full support .,."

// -----------------------------------------------------------------------------
// We use common struct `CacheInfo` for all CPU architectures and this struct
// contains of `levels` with certain maximum number of cache levels, you can
// feel free increase this value if more cache levels are needed. We have full
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use bullet points to split between vendors, e.g.

  • Intel:
    • modern:
    • old processors:
  • AMD:...

// Internal structures
// -----------------------------------------------------------------------------
// We use internal structures such as `Leaves` and `OsPreserves` to hold cpuid
// info and support of registers, since latency of `CPUID` instruction ~100
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since latency of CPUID instruction is around ~100

//
// Internal structures
// -----------------------------------------------------------------------------
// We use internal structures such as `Leaves` and `OsPreserves` to hold cpuid
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"to hold cpuid" -> "to cache the result of cpuid"

@toor1245 toor1245 force-pushed the add-docs-os-support-x86 branch 2 times, most recently from 534e586 to 52f863f Compare January 22, 2023 19:04
@toor1245 toor1245 force-pushed the add-docs-os-support-x86 branch from 52f863f to bd7161a Compare January 22, 2023 19:13
// -----------------------------------------------------------------------------
// We use internal structures such as `Leaves` and `OsPreserves` to cache the
// result of cpuid info and support of registers, since latency of CPUID
// instruction is around ~100, see
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~100 cycles

@gchatelet
Copy link
Collaborator

I'll squash and merge the PR once the typo is fixed. Thx a lot for the documentation!

@gchatelet gchatelet merged commit c74a85d into google:main Jan 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants