Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding specification to handle the new C2x _BitInt type #191

Merged
merged 4 commits into from
Oct 4, 2023

Conversation

mmalcomson
Copy link
Contributor

Here we add mappings from the new _BitInt type to machine types.

All that I believe needs to be changed in the ABI documents themselves are entries in the Mapping tables for the PCS rules.
I explain in the commit messages below the reasoning that this is all that is needed.

This PR also contains a rationale document.
As discussed I'm putting the initial proposed document up and going to add the comments I've already heard onto this github request (to keep the discussion as in the open as possible).

@mmalcomson mmalcomson requested review from stuij and smithp35 February 20, 2023 12:15
Copy link
Contributor Author

@mmalcomson mmalcomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments from Kristof Beyls gave me out-of-band.

Also added two comments that I noticed myself around improperly updating the ChangeLogs of the ABI documents.

Comment on lines 304 to 305
| 2023Q1| 5\ :sup:`th` February 2022 | Document mapping from C2x _BitInt(N) and unsigned _BitInt(N) types|
| | | to machine types. |
+-------+-------------------------------------+-------------------------------------------------------------------+
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reviewing my own patch -- I guess it shouldn't include an update in this table since this is the table.
I'm now starting to think that maybe this table is just for changes in releases, and that should be made upon a release.
Would appreciate a confirmation/refutation on that from someone more in touch with the releases @stuij ?

Similar for the aapcs64.rst document.

Comment on lines 252 to 254
| 2023Q1 | 5\ :sup:`th` | - Document mapping from C2x _BitInt(N) and unsigned _BitInt(N) |
| | February 2022 | types to machine types. |
+------------+--------------------+------------------------------------------------------------------+
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above with the aapcs32.rst changelog.

Comment on lines 23 to 24
As the rationale mentioned, some applications have uses for a specific bit-width
type. In the case of writing C code which can be used to determine FPGA
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: /As the rationale in that proposal mentions/

Comment on lines 33 to 30
general familiarity of programmers with the representation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: maybe it reads more easily if this were a 3-item bullet list?

Comment on lines 35 to 50
For this particular type we are estimating that the use of ``_BitInt`` types
will not be such that operations on these types are performance critical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less of a nitpick: I'm not sure we agreed that operations on these types
will never be performance critical, which the above sentence seems to
suggest?
I'd probably start by explaining the difficulty in defining ABI
rules/understanding the trade-offs as there is almost no code currently
using _BitInt, and it is next-to-impossible to predict exactly how it
will be used most often in the future.

Nonetheless, we need to make trade-off decisions.

Therefore, let's analyze some of the characteristics and features of
this type to try and make a best guess of how these types may get used
when target Arm CPUs.

(The specifics on not much code with _BitInt being performance critical
could be repeated after the below analysis of use cases?)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The justifications in N2763 mention that _BitInt to some degree exists to allow code that is used in high-level synthesis to have better performance by not having to participate in the usual arithmetic promotions, like promoting up to int (which is about 32 bits), and only promoting up to the other _BitInt in size (so that adding _BitInt(4) and _BitInt(7) gives you a _BitInt(7), not a 32-bit int, which is easier to translate to an FPGA).

At larger sizes, it's more of a nicety (being able to "natively" specify "bigints" that the compiler handles for you), but an extremely appealing nicety.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just updated the rationale with extra details -- would appreciate your opinion.

Around the high-level synthesis point, my thoughts were that the resulting code would be run in the performance-critical scenario on an FPGA, and would only be run on an Arm architecture for testing or the like.
So I was thinking that the incentives on the Arm ABI coming from that use-case were not quite so focussed on performance.
The possibility that this code would end up being run "in anger" on an Arm architecture later down the line does then add some incentive towards good performance -- though not as much as if the first-order use were to be on Arm CPU's.

Does that seems reasonable from your perspective?

Comment on lines 38 to 40
There seem to be two different regimes for these types. The "small" regime
where bit-precise types could be stored in a single register, and the "large"
regime where bit-precise types must span multiple registers.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: maybe say general purpose register? Is the term "general
purpose register" used in other parts of the ABI docs?

Comment on lines 45 to 291
These types must be at least byte-aligned so they are addressable, and at least
rounded to a byte boundary in size for ``sizeof``. Since these types have an
aesthetic similarity to bit-fields one might expect better packing in an array
of ``_BitInt(24)`` than an array of ``int32_t`` types (i.e. packing as good as a
byte-array). However, this would require a low alignment of such types and that
would mean loading and storing of even "small" sized ``_BitInt``'s crossing
cache boundaries -- leading to an unnecessary performance hit and hindering any
atomic operations on these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick on style: I'm wondering if it would be better to more explicitly
surface the pros and cons of a trade-off decision in the structure of
this document, rather than using full prose? For the example above, I'm
thinking of for example something like:

Trade-off decision 1: alignment of small-regime _BitInt

There are 2 obvious options:

option 1: 1 byte alignment
pros: list them
cons: list them

option 2: alignment rules same as 8/16/32/64-byte larger datatype (todo:
maybe improve the wording of this title/option)
pros: list them
cons: list them

Discussion of option 1 vs option 2.
Describe why we chose option 2 based on the information and estimates we
currently have available.

(I noticed you have something similar further down for "Representation
in bits". It might be good to have the same structure for analyzing
different trade-off choices to be made in this document?)

Comment on lines +80 to +356
Representation in bits
----------------------
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the above title be "Representation in registers" instead of "...
in bits"?

Comment on lines 209 to 210
As mentioned above, we do not expect operations on ``_BitInt`` types to be
performance critical. Given that providing a productive environment for
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I'm not sure if we can expect _BitInt to never be used in
performance-critical code.
If people will write code targeting FPGAs with _BitInt, presumably that
code is performance-critical (why would anyone otherwise target an FPGA
for that code?). If such code becomes popular, at some point, there may
be good reasons to use the same source code to target CPUs too. In that
case, the _BitInt code would be performance critical...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's more coherent to say "As mentioned above, we do not expect operations on _BitInt types at large sizes to be performance critical." since the benefits of _BitInt allowing sidestepping the arithmetic promotions disappear for sizes larger than _BitInt(32).

@workingjubilee
Copy link

workingjubilee commented Feb 21, 2023

Regarding _BitInt, I would like to raise a concern about whether C's _BitInt(128) and AArch64's __int128 will be compatible in terms of layout, alignment, and parameter passing. If they are not, it may significantly complicate ABI handling and type definitions, especially for other languages that are not C, and on AArch64, FFI compatibility with __int128 is much more important than on x86-64 because it is used in the "hardware ABI" (it appears a lot in the layout of essential types). It also seems like a programmer footgun in general if they are inconsistent, even for just normal C code, as well as its relevance for STXP/LDXP.

It's been a while since I've read the AAPCS64 last so apologies if this has already been addressed and the spec as-is will guarantee that, it wasn't immediately obvious from my skimming of this PR whether this was the case.

I agree that I don't think these will be performance-critical types at the larger sizes, however.

@mmalcomson
Copy link
Contributor Author

@workingjubilee Thanks for the feedback!

W.r.t. the _BitInt(128) yeah, the current spec does have them compatible with AArch64 __int128, but it's still good to hear that it was on someones mind since it increases our confidence in that choice.

The "performance critical" statement was a bit of a clumsy glossing-over of some analysis on my part.
I'm in the process of adding the details of the analysis to the rationale document, hopefully that would make it clear what the current thought process is -- I'd appreciate hearing your thoughts on whether the adjustment makes sense and is clear when I post it.

Cheers!

@workingjubilee
Copy link

Aha, excellent!

and on AArch64 _BitInt(128) has the same alignment and size as a __int128 which is
the largest type defined on that platform. This falls out of the fact that
double-register size maps to the largest integral Fundamental Data Type defined
on both platforms.

Ah, good, this was the part I missed. Thank you!

I'll be happy to look at things when you follow up!

Copy link
Contributor

@Wilco1 Wilco1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with choosing option B and C for AArch64/AArch32 but I'd reduce the major focus on atomics since I'd be surprised if they would ever be used - it's not even clear whether they are supported by the mid-end since bitints are like bitfields and would require extra work.


- On AArch32 this could cause surprises to developers, given that on this
architecture small Fundamental Data Types are have zero- or sign-extended
extra bits. So a ``char`` would not have the same representation as a
``_BitInt(8)`` on this architecture.

- If used in calls to variadic functions which were written for standard
integral types this can give surprising results.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be reasonable to normalize a bitint when passed as a variadic argument without an explicit cast to int so eg. printf works fine either way. This makes it equivalent to how a char is passed as a variadic argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, honestly I'd not seriously considered this since the way that the PCS is defined is through a two step process of
"source code type" -> "fundamental data type"
then
"fundamental data type" -> "rule for PCS passing".

Now you've pointed it out it does seem reasonable.
Not sure whether the exceptional case could lead to confusion -- will invite other feedback to see what people think.
@smithp35 @stuij @rsandifo-arm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test @rearnsha (wanted to tag you, the comment box doesn't want to auto-complete your name, don't know if typing it fully out will properly link -- this is a test).

``_BitInt``'s will not cross cache boundaries.
- Atomic loads and stores can be made on these objects.
- The representation of bit-precise types of the same size as standard integer
types will have the same alignment and size in memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That last point is really the most important. The first 2 points are not too convincing since unaligned accesses are practically for free on modern cores, and LSE2 allows atomics on unaligned data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh -- thanks!
I did not realise the cost of unaligned accesses were essentially free (useful to know).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, optimization guides mention things like "The Neoverse N2 handles most unaligned accesses without performance penalties.", so you are unlikely to notice a penalty in typical compiled code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just about the atomic RMW operations, you also (I think) want ordinary reads and writes of _BitInt variables to be single-copy-atomic and implemtable with ordinary loads and stores. This implies they should be aligned so as not to cross cache boundaries, and also not packed along with other objects (i.e. no packing a _BitInt(24) together with a _BitInt(8) in a 32-bit word).

@@ -135,16 +390,19 @@ It has the following negatives:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, mentioning atomics as the first, most important thing does not make any sense. For large bitints you can't do atomics at all (no atomics on int arr[100] either!), small bitints up to int128 could support atomics similar to how bitfields could support atomics (typically using CAS due to needing extra normalization step). And it's not like LDSMAX/LDUMAX is a popular atomic given it's not even a supported builtin in GCC...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- I had not realised the order of bullet points would imply a priority order.
The atomics were not supposed to be presented as the most important reason -- will adjust.

To make sure I'm not missing something -- what extra normalization step are you referring to?
Is it the normalisation required for the set of operations which are not OK for a given representation (e.g. if choosing option B there would be a normalisation required for /, but not +) or is it something else you've noticed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean the standard normalization. So while you can't use LDADD with option C, you can still use a CAS or load exclusive loop with an extra zero/sign-extend. Since you can do this for every atomic operation, being able to use LDADD or LDSMAX directly might be nice but shouldn't count as an advantage. Not being able to use atomics at all would be a disadvantage for a particular layout if atomic support is essential (and I'm not convinced it is for bitint).

values. However, when acting on "large" values (i.e. greater than the size of
one register) it loses some of that benefit. Storing to and from memory would
also come at a cost for this representation. This is also likely to be the most
surprising representation for developers on an Arm platform.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note there is no reason why a compiler couldn't do register operations in format A, while passing arguments/storing as C and assuming B when loading bitints or using parameters/return values. A seems most useful as an optimized internal representation (which is entirely private) rather than as the storage format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll add a paragraph explicitly mentioning that the compiler could always choose to use a different representation at non ABI boundaries.

@stuij
Copy link
Member

stuij commented Mar 1, 2023

@rearnsha

@mmalcomson mmalcomson mentioned this pull request May 30, 2023
`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added
for C23.  These types have a bit width of `N` and define a different
type for each `N`.

Here we define the language mapping between these language types and
the machine data types used throughout AAPCS64 through which our
calling standard is defined.
Along with the AAPCS32 commit, this closes ARM-software#175.

The rationale for the choices in this patch are presented in a
seperate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields.
The current wording under "Bit-fields subdivision" mentions that "A
member of an aggregate that is a Fundamental Data Type may be
subdivided into bit-fields".  I do not believe this needs updating.
While a _BitInt(N>128) can be subdivided into a bit-field at
the *language level*, once it has been converted to a Machine Type
such a subdivision would look like some number of quad-words which
have not been subdivided (either fully used or fully unused) and
either one or zero quad-words which have been subdivided.

The alignment requirements of the _BitInt are also the same as the
alignment requirements of this fundamental data type of a quad-word,
which this paragraph uses to explain the resulting alignment
requirements on the aggregate containing a bit-field.

In the C/C++ language mapping description of "Bit-fields" we mention
that a bit-field may have any integral type and since a bit-precise
integer is an integral type this still holds.  The explanation of
where a field can be placed in this section relies on *alignment*
requirements of the field type.  For _BitInt this lines up with the
discussion on *fundamental data types* at the machine-level, since the
alignment requirements of a _BitInt are that of the "chunk" it is made
up of, which is the fundamental data type of a quad-word.

Hence I believe the current language does not need updating for
bit-precise integers.

-- _BitInt(N > 128) types and Homogeneous Aggregates.
With this changeset, _BitInt(N>128) types are treated as arrays of
__int128 values.  Hence at the machine level they would be a
Homogeneous Aggregate of quad-words.

The wording in section 5.9.5 currently mentions "uniquely addressable
Members".  I am not sure what this is referring to, but would expect
this is referring to addressable members of the base type.  If that is
the case then I don't believe anything needs to be updated.  If it
were referring to addressable members at the language level (which
would be strange given the context) then this may need updating since
one language-level _BitInt(256) type would not equat to one
Fundamental Data Type.

-- Combination of unspecified bits in _BitInt and C.16 in PCS rules.
The mapping this commits defines from a _BitInt to a Machine Type
specifies that the bits of the relevant Machine Type that are unused
in a _BitInt(N) have unspecified value.

The rules C.12 and C.16 of our parameter passing standard specify that
when there are unused bits of a structure and/or Integral fundamental
data type that are passed in registers, those unused bits are
unspecified.  The combination of these two rules means that e.g. when
passing a _BitInt(2) across a PCS boundary in a register, bits [2-63]
inclusive are unspecified.

-- In-memory and in-register representations match.
This commit only specifies the mapping from language level type to
machine type.  The machine type is then treated as it currently is in
memory and in registers.
`_BitInt(N)` and `unsigned _BitInt(N)` are new integral types added
for C23.  These types have a bit width of `N` and define a different
type for each `N`.

Here we define the language mapping between these language types and
the machine data types used throughout AAPCS32 through which our
calling standard is defined.
Along with the AAPCS64 commit, this closes ARM-software#175.

The rationale for the choices in this patch are presented in a
seperate commit with a rationale document.

Some points of note around the language in this commit:

-- This commit does not update any wording around bit-fields.

The current wording under "Bit-fields" section 5.3.4 mentions that "A
member of an aggregate that is a Fundamental Data Type may be
subdivided into bit-fields".  I do not believe this needs updating.
While a _BitInt(N>64) can be subdivided into a bit-field at the
*language level*, once it has been converted to a Machine Type, such a
subdivision would look like some number of double-words which have not
been subdivided (either fully used or fully unused) and either one or
zero double-words which have been subdivided.

The alignment requirements of the _BitInt are also the same as the
alignment requirements of this fundamental data type of a double-word,
which this paragraph uses to explain the resulting alignment
requirements on the aggregate containing a bit-field.

In the C/C++ language mapping description of "Bit-fields" we mention
that a bit-field may have any integral type and since a bit-precise
integer is an integral type this still holds.  The explanation of
where a field can be placed in this section relies on *alignment*
requirements of the field type.  For _BitInt this lines up with the
discussion on *fundamental data types* at the machine-level, since the
alignment requirements of a _BitInt are that of the "chunk" it is made
up of, which is the fundamental data type of a double-word.

Hence I believe the current language does not need updating for
bit-precise integers.

-- _BitInt(N > 64) types and Homogeneous Aggregates.
With this changeset, _BitInt(N>64) types are treated as arrays of
uint64_t values.  Hence at the machine level they would be a
Homogeneous Aggregate of double-words.

-- Combination of unspecified bits in _BitInt and B.2 in PCS rules.
The mapping this commits defines from a _BitInt to a Machine Type
specifies that the bits of the relevant Machine Type that are unused
in a _BitInt(N) have unspecified value.

Rule B.2 of our parameter passing standard specifies that when there
are unused bits in an integral Fundamental Data Type that is passed in
registers, those unused bits are zero- or sign-extended to a
full-word.  The combination of this rule with the fact that _BitInt
types are zero- or sign-extended to the Fundamental Data Type which
they are passed in means that e.g. when passing a _BitInt(2) across a
PCS boundary in a register, bits [2-63] inclusive are sign-extended.

-- In-memory and in-register representations match.
This commit only specifies the mapping from language level type to
machine type.  The machine type is then treated as it currently is in
memory and in registers.
This type has been added into the C2x specification, alongside changes
to describe how they are represented at a machine level we also add a
design document describing the rationale behind the choices we made.

One of the main contentious issues has been around the alignment of
large _BitInt types.  In this document we have chosen to specify that
large _BitInt's are treated as if they were an array of double-register
sized chunks.

Other psABI's have chosen to represent large _BitInt's as an array of
single-register sized chunks.  This means that differences in alignment
might be surprising to developers attempting to write portable code.
However in contrast we believe that developers familiar with ABI's for
Arm Architectures would be less surprised by this decision.

N.b. I happened to also notice a stack overflow question which was
suggesting the use of _BitInt(128) for u128.  Given the discussion did
not have any mention of the different ABI between this and __uint128 I
would take it as evidence for a benefit to having the two integrals ABI
match.
https://stackoverflow.com/questions/16088282/is-there-a-128-bit-integer-in-gcc
@mmalcomson
Copy link
Contributor Author

Just pinging everyone that has been watching this.
Things seem to have settled on a decision here -- if there are still objections that I've missed could someone raise them again.

Otherwise could someone @rearnsha @smithp35 @stuij have a double-check that everything seems fine to them and approve this?

Copy link

@sallyarmneale sallyarmneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Use-cases known of so far
-------------------------

There seem to be two different regimes for these types. The "small" regime

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
There seem to be two different regimes for these types. The "small" regime
There are two different regimes for these types. The "small" regime

Copy link
Contributor Author

@mmalcomson mmalcomson Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adjusted to "We believe there are two different regimes" (point being that we could be mistaken -- there could be more logical regimes to think about -- and want to convey that this is how we view things rather than something which we can be 100% confident on)


Since this is a new type there is large uncertainty on how it will be used by
programmers in the future. Decisions we make here may also influence future
usage. Nonetheless we must make trade-off decisions with this uncertainty. The

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
usage. Nonetheless we must make trade-off decisions with this uncertainty. The
usage. Therefore, we must make trade-off decisions with this uncertainty. The

critical than performance concerns in this use-case.


To help the compiler optimize (e.g. for auto vectorization)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To help the compiler optimize (e.g. for auto vectorization)
To help the compiler optimize (for example, for auto vectorization)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted this to (possibly for auto vectorization) -- hope that isn't falling into another ease-of-parsing pitfall ;-)


- This would be a less familiar representation to programmers. Especially the
fact that a ``_BitInt(8)`` would not have the same representation in a
register as a ``char`` could cause confusion (e.g. when debugging, or writing

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
register as a ``char`` could cause confusion (e.g. when debugging, or writing
register as a ``char`` could cause confusion (For example, when debugging, or writing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason the "for example" reads clunkier than "e.g." to me.
I went with cause confusion (we imagine when debugging or writing assembly code).
Hopefully that avoids the ease-of-parsing difficulty for non-native speakers.

Copy link

@sallyarmneale sallyarmneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor changes - mostly phrasing. Please do avoid using latin contractions (e.g., i.e, and etc) they're not good for non-native speakers.

@mmalcomson
Copy link
Contributor Author

Please do avoid using latin contractions (e.g., i.e, and etc) they're not good for non-native speakers.

Noted -- thanks for the info!

``unsigned _BitInt(N)``. These are defined for integral ``N`` and each ``N`` is
a different type.

The proposal for these types can be found in following link.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The proposal for these types can be found in following link.
The proposal for these types can be found in following link:

Copy link

@sallyarmneale sallyarmneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor changes - mostly phrasing. Please do avoid using latin contractions (e.g., i.e, and etc) they're not good for non-native speakers.

Mostly things that should help the document be easily understood for
non-native speakers.
@mmalcomson
Copy link
Contributor Author

@sallyarmneale Thanks for the reviews -- I've updated all the things you pointed at and upon the re-read found some places that I felt could be better worded as well.
N.b. to anyone else reading -- I also re-ordered the reasons in a few places.
A while ago Wilco pointed out that the ordering of reasons gave an impression that we weighted the behaviour around atomics higher than we actually did, and on re-reading it seems I had only re-ordered some of the mentions.

Copy link

@sallyarmneale sallyarmneale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@stuij stuij merged commit 4bd9df5 into ARM-software:main Oct 4, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants