GH-101291: Rearrange the size bits in PyLongObject #102464

Merged
merged 37 commits into from
Mar 22, 2023
Changes from 35 commits
0ec07e4
Add functions to hide some internals of long object.
markshannon Jan 25, 2023
292b9d0
Add internal functions to longobject.c for setting sign and digit count.
markshannon Jan 25, 2023
5c54894
Replace Py_SIZE(x) < 0 with _PyLong_IsNegative(x) in longobject.c
markshannon Feb 28, 2023
029aaa4
Replace Py_ABS(Py_SIZE(a)) with _PyLong_DigitCount(a) in longobject.c
markshannon Feb 28, 2023
b56e6da
Remove many uses of Py_SIZE in longobject.c
markshannon Feb 28, 2023
91269fc
Remove _PyLong_AssignValue, as it is no longer used.
markshannon Feb 28, 2023
c48e825
Remove some more uses of Py_SIZE in longobject.c.
markshannon Feb 28, 2023
449c0e2
Remove a few more uses of Py_SIZE in longobject.c.
markshannon Mar 1, 2023
c5ba601
Remove some more uses of Py_SIZE, replacing with _PyLong_UnsignedDigi…
markshannon Mar 1, 2023
4b3a3e8
Replace a few Py_SIZE() with _PyLong_SameSign().
markshannon Mar 1, 2023
9ef9d2c
Remove a few more Py_SIZE() from longobject.c
markshannon Mar 1, 2023
9c408c1
Replace uses of IS_MEDIUM_VALUE macro with _PyLong_IsSingleDigit.
markshannon Mar 1, 2023
548d656
Remove most of the remaining uses of Py_SIZE in longobject.c
markshannon Mar 1, 2023
3e3fefd
Replace last remaining uses of Py_SIZE applied to longobject with _Py…
markshannon Mar 1, 2023
391fb51
Don't use _PyObject_InitVar and move a couple of inline functions to …
markshannon Mar 1, 2023
df8c7d3
Correct name of inline function.
markshannon Mar 1, 2023
bc14fa6
Eliminate all remaining uses of Py_SIZE and Py_SET_SIZE on PyLongObject.
markshannon Mar 1, 2023
54c6f1b
Change layout of size/sign bits in longobject to support future addit…
markshannon Mar 2, 2023
ce6bfb2
Test pairs of longs together on fast path of add/mul/sub.
markshannon Mar 2, 2023
4c1956b
Tidy up comment and delete commented out code.
markshannon Mar 6, 2023
301158b
Add news.
markshannon Mar 6, 2023
1aa1891
Remove debugging asserts.
markshannon Mar 6, 2023
bf2a9af
Fix storage classes.
markshannon Mar 6, 2023
169f521
Remove development debug functions.
markshannon Mar 6, 2023
90f9072
Avoid casting to smaller int.
markshannon Mar 8, 2023
f143443
Apply suggestions from code review.
markshannon Mar 8, 2023
a0d661e
Widen types to avoid data loss.
markshannon Mar 8, 2023
145a2e4
Fix syntax error.
markshannon Mar 8, 2023
638a98f
Replace 'SingleDigit' with 'Compact' as the term 'single digit' seems…
markshannon Mar 9, 2023
7f5acc0
Address review comments.
markshannon Mar 16, 2023
b06bb6f
Merge branch 'main' into long-rearrange-size-bits
markshannon Mar 16, 2023
a19b0a7
Merge branch 'main' into long-rearrange-size-bits
markshannon Mar 16, 2023
87f49b2
Fix _PyLong_Sign
markshannon Mar 16, 2023
f764aa8
Replace _PyLong_Sign(x) < 0 with _PyLong_IsNegative(x).
markshannon Mar 16, 2023
9843ac0
fix sign check
markshannon Mar 16, 2023
d6cb917
Address some review comments.
markshannon Mar 22, 2023
469d26f
Change asserts on digit counts to asserts on sign where applicable.
markshannon Mar 22, 2023
6 changes: 5 additions & 1 deletion Include/cpython/longintrepr.h
@@ -80,7 +80,7 @@ typedef long stwodigits; /* signed variant of twodigits */
*/
Contributor

@verhovsky verhovsky Apr 29, 2023

You didn't update this comment that documents _longobject, it's still talking about ob_size and PyVarObject

/* Long integer representation.
   The absolute value of a number is equal to
        SUM(for i=0 through abs(ob_size)-1) ob_digit[i] * 2**(SHIFT*i)


typedef struct _PyLongValue {
Py_ssize_t ob_size; /* Number of items in variable part */
uintptr_t lv_tag; /* Number of digits, sign and flags */
digit ob_digit[1];
} _PyLongValue;

@@ -94,6 +94,10 @@ PyAPI_FUNC(PyLongObject *) _PyLong_New(Py_ssize_t);
/* Return a copy of src. */
PyAPI_FUNC(PyObject *) _PyLong_Copy(PyLongObject *src);

PyAPI_FUNC(PyLongObject *)
_PyLong_FromDigits(int negative, Py_ssize_t digit_count, digit *digits);


#ifdef __cplusplus
}
#endif
173 changes: 155 additions & 18 deletions Include/internal/pycore_long.h
@@ -82,8 +82,6 @@ PyObject *_PyLong_Add(PyLongObject *left, PyLongObject *right);
PyObject *_PyLong_Multiply(PyLongObject *left, PyLongObject *right);
PyObject *_PyLong_Subtract(PyLongObject *left, PyLongObject *right);

int _PyLong_AssignValue(PyObject **target, Py_ssize_t value);

/* Used by Python/mystrtoul.c, _PyBytes_FromHex(),
_PyBytes_DecodeEscape(), etc. */
PyAPI_DATA(unsigned char) _PyLong_DigitValue[256];
@@ -110,25 +108,164 @@ PyAPI_FUNC(char*) _PyLong_FormatBytesWriter(
int base,
int alternate);

/* Return 1 if the argument is positive single digit int */
/* Long value tag bits:
* 0-1: Sign bits value = (1-sign), ie. negative=2, positive=0, zero=1.
* 2: Reserved for immortality bit
Comment on lines +112 to +113
Contributor

@eduardo-elizondo eduardo-elizondo Apr 7, 2023

I don't think we need an immortality flag here, but we do need a static flag: immortality should be marked by the refcount, and this flag marks whether the object is static. With it, we can do the static check at dealloc time to prevent these objects from being deallocated.

* 3+ Unsigned digit count
*/
#define SIGN_MASK 3
#define SIGN_ZERO 1
#define SIGN_NEGATIVE 2
#define NON_SIZE_BITS 3

/* All "compact" values are guaranteed to fit into
* a Py_ssize_t with at least one bit to spare.
* In other words, on 64-bit machines, compact values
* are signed values of 63 (or fewer) bits.
Member

Maybe also add that compact values have at most one digit? I've seen some code depending on that (e.g. _PyLong_Multiply).

Member Author

Not with tagged ints. In theory a compact int could have 5 digits. (63 bit compact ints, and 15 bit digits).

For a sensible implementation, a compact int will be one or two digits.

*/

/* Return 1 if the argument is a non-negative compact int */
static inline int
_PyLong_IsNonNegativeCompact(const PyLongObject* op) {
assert(PyLong_Check(op));
return op->long_value.lv_tag <= (1 << NON_SIZE_BITS);
gvanrossum marked this conversation as resolved.
Contributor

@eduardo-elizondo eduardo-elizondo Apr 7, 2023

This doesn't work if we set bit 2 (the immortal/static bit). For example, the immortal small int 1 will have an lv_tag of 0b1100, so this check returns an incorrect value.

I'll create a new PR to restructure this a bit to make it work with the new bit flag.

cc @ericsnowcurrently

}
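
The failure described in the comment above can be reproduced without CPython. The following is a minimal sketch that works on raw `uintptr_t` tag values. `DEMO_RESERVED_BIT` and both `demo_` functions are hypothetical names, and the masked variant is only one possible repair, not necessarily what the follow-up PR does.

```c
#include <stdint.h>

#define NON_SIZE_BITS 3
/* Hypothetical name for bit 2, which this PR only reserves. */
#define DEMO_RESERVED_BIT ((uintptr_t)1 << 2)

/* Same comparison as _PyLong_IsNonNegativeCompact in this diff. */
static inline int
demo_is_non_negative_compact(uintptr_t tag)
{
    return tag <= ((uintptr_t)1 << NON_SIZE_BITS);
}

/* One possible repair (an assumption): ignore the reserved bit first. */
static inline int
demo_is_non_negative_compact_masked(uintptr_t tag)
{
    return (tag & ~DEMO_RESERVED_BIT) <= ((uintptr_t)1 << NON_SIZE_BITS);
}
```

The tag for the int 1 is 0b1000 and passes the original check. The same tag with bit 2 set, 0b1100, fails the original check but passes the masked one.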

static inline int
_PyLong_IsCompact(const PyLongObject* op) {
assert(PyLong_Check(op));
return op->long_value.lv_tag < (2 << NON_SIZE_BITS);
}

static inline int
_PyLong_BothAreCompact(const PyLongObject* a, const PyLongObject* b) {
assert(PyLong_Check(a));
assert(PyLong_Check(b));
return (a->long_value.lv_tag | b->long_value.lv_tag) < (2 << NON_SIZE_BITS);
}
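
A single comparison is enough in `_PyLong_BothAreCompact` because the bound `2 << NON_SIZE_BITS` is a power of two: a tag is below 16 exactly when no bit at position 4 or above is set, and OR-ing two tags sets such a bit whenever either operand has one. A small sketch of that equivalence on raw tag values, using the hypothetical helpers `demo_both_compact` and `demo_both_compact_naive`:

```c
#include <stdint.h>

#define NON_SIZE_BITS 3

/* Combined check, as in the diff: one OR and one comparison. */
static inline int
demo_both_compact(uintptr_t a, uintptr_t b)
{
    return (a | b) < ((uintptr_t)2 << NON_SIZE_BITS);
}

/* Reference version: test each tag separately. */
static inline int
demo_both_compact_naive(uintptr_t a, uintptr_t b)
{
    return a < ((uintptr_t)2 << NON_SIZE_BITS)
        && b < ((uintptr_t)2 << NON_SIZE_BITS);
}
```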

/* Returns a *compact* value, iff `_PyLong_IsCompact` is true for `op`.
*
* "Compact" values have at least one bit to spare,
* so that addition and subtraction can be performed on the values
* without risk of overflow.
*/
static inline Py_ssize_t
_PyLong_CompactValue(const PyLongObject *op)
{
assert(PyLong_Check(op));
assert(_PyLong_IsCompact(op));
Py_ssize_t sign = 1 - (op->long_value.lv_tag & SIGN_MASK);
return sign * (Py_ssize_t)op->long_value.ob_digit[0];
}
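
To illustrate how `_PyLong_CompactValue` recovers a signed value, here is a self-contained sketch. `demo_digit` and `demo_long_value` are hypothetical stand-ins for `digit` and `_PyLongValue`, and `ptrdiff_t` stands in for `Py_ssize_t`. Unlike the diff, the sketch converts the sign bits to a signed type before subtracting; that is a choice made here for clarity, not part of the PR.

```c
#include <stddef.h>
#include <stdint.h>

#define SIGN_MASK 3

/* Hypothetical stand-ins for CPython's digit and _PyLongValue. */
typedef uint32_t demo_digit;
typedef struct {
    uintptr_t lv_tag;
    demo_digit ob_digit[1];
} demo_long_value;

/* The two low tag bits hold 1 - sign, so sign is 1, 0 or -1. */
static inline ptrdiff_t
demo_compact_value(const demo_long_value *v)
{
    ptrdiff_t sign = 1 - (ptrdiff_t)(v->lv_tag & SIGN_MASK);
    return sign * (ptrdiff_t)v->ob_digit[0];
}
```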

static inline bool
_PyLong_IsZero(const PyLongObject *op)
{
return (op->long_value.lv_tag & SIGN_MASK) == SIGN_ZERO;
}

static inline bool
_PyLong_IsNegative(const PyLongObject *op)
{
return (op->long_value.lv_tag & SIGN_MASK) == SIGN_NEGATIVE;
}

static inline bool
_PyLong_IsPositive(const PyLongObject *op)
{
return (op->long_value.lv_tag & SIGN_MASK) == 0;
Member

Why not have #define SIGN_POSITIVE 0?

Member Author

I want these functions to be the only way to determine the sign.
Defining SIGN_POSITIVE will just encourage people to do the test elsewhere.

Member

Sure, fine. Next question: maybe we also need a _PyLong_IsNonZero? I see !_PyLong_IsZero a lot, and the ! is easily missed. (Or maybe that's just my old eyes.) Possibly also IsNonNegative and IsNonPositive.

}

static inline Py_ssize_t
_PyLong_DigitCount(const PyLongObject *op)
{
assert(PyLong_Check(op));
return op->long_value.lv_tag >> NON_SIZE_BITS;
}

/* Equivalent to _PyLong_DigitCount(op) * _PyLong_NonCompactSign(op) */
static inline Py_ssize_t
_PyLong_SignedDigitCount(const PyLongObject *op)
{
assert(PyLong_Check(op));
Py_ssize_t sign = 1 - (op->long_value.lv_tag & SIGN_MASK);
return sign * (Py_ssize_t)(op->long_value.lv_tag >> NON_SIZE_BITS);
}

/* Like _PyLong_DigitCount but asserts that op is non-negative */
static inline Py_ssize_t
_PyLong_UnsignedDigitCount(const PyLongObject *op)
Member

I'm not excited about this name; I keep having to look up how it differs from _PyLong_DigitCount, and it's not really related to _PyLong_SignedDigitCount. :-( Maybe _PyLong_NonNegativeDigitCount? Or perhaps better _PyLong_DigitCountOfNonNegative?

Member Author

I needed this for the extra check during implementation. _PyLong_UnsignedDigitCount is now the same as _PyLong_DigitCount, and should remain so.

I'll remove it.

{
assert(PyLong_Check(op));
assert(!_PyLong_IsNegative(op));
return op->long_value.lv_tag >> NON_SIZE_BITS;
}

static inline int
_PyLong_IsPositiveSingleDigit(PyObject* sub) {
/* For a positive single digit int, the value of Py_SIZE(sub) is 0 or 1.

We perform a fast check using a single comparison by casting from int
to uint which casts negative numbers to large positive numbers.
For details see Section 14.2 "Bounds Checking" in the Agner Fog
optimization manual found at:
https://www.agner.org/optimize/optimizing_cpp.pdf

The function is not affected by -fwrapv, -fno-wrapv and -ftrapv
compiler options of GCC and clang
*/
assert(PyLong_CheckExact(sub));
Py_ssize_t signed_size = Py_SIZE(sub);
return ((size_t)signed_size) <= 1;
_PyLong_CompactSign(const PyLongObject *op)
{
assert(PyLong_Check(op));
assert(_PyLong_IsCompact(op));
return 1 - (op->long_value.lv_tag & SIGN_MASK);
}

static inline int
_PyLong_NonCompactSign(const PyLongObject *op)
{
assert(PyLong_Check(op));
assert(!_PyLong_IsCompact(op));
return 1 - (op->long_value.lv_tag & SIGN_MASK);
}

/* Do a and b have the same sign? */
static inline int
_PyLong_SameSign(const PyLongObject *a, const PyLongObject *b)
{
return (a->long_value.lv_tag & SIGN_MASK) == (b->long_value.lv_tag & SIGN_MASK);
}

#define TAG_FROM_SIGN_AND_SIZE(sign, size) ((1 - (sign)) | ((size) << NON_SIZE_BITS))
Contributor

size should be cast to size_t before shifting, and the result cast to Py_ssize_t to avoid UB.

I also haven't checked the assembly here, but I don't really know what happens when OR-ing a signed 64-bit int with a signed 32-bit int, and if this is doing work that's not strictly necessary.

Member Author

It is only in _PyLong_SetSignAndSize that size is a variable. I'll do the conversion there.

Member

So maybe add a comment that this macro should only be used with literal or size_t arguments?
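
The casting concern in this thread can be sketched as a function form of the macro that always shifts an unsigned value. `demo_tag_from_sign_and_size` is a hypothetical name, and the asserts mirror those in `_PyLong_SetSignAndDigitCount`; this is an illustration, not code from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NON_SIZE_BITS 3

/* Function form of TAG_FROM_SIGN_AND_SIZE; the shift is done on an
 * unsigned type, so a large size cannot overflow a signed integer. */
static inline uintptr_t
demo_tag_from_sign_and_size(int sign, size_t size)
{
    assert(-1 <= sign && sign <= 1);
    assert(sign != 0 || size == 0);
    return (uintptr_t)(1 - sign) | ((uintptr_t)size << NON_SIZE_BITS);
}
```

Under this encoding, `TAG_FROM_SIGN_AND_SIZE(0, 0)` evaluates to 1 and `TAG_FROM_SIGN_AND_SIZE(1, 1)` evaluates to 8.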


static inline void
_PyLong_SetSignAndDigitCount(PyLongObject *op, int sign, Py_ssize_t size)
{
assert(size >= 0);
assert(-1 <= sign && sign <= 1);
assert(sign != 0 || size == 0);
op->long_value.lv_tag = TAG_FROM_SIGN_AND_SIZE(sign, (size_t)size);
}

static inline void
_PyLong_SetDigitCount(PyLongObject *op, Py_ssize_t size)
{
assert(size >= 0);
op->long_value.lv_tag = (((size_t)size) << NON_SIZE_BITS) | (op->long_value.lv_tag & SIGN_MASK);
}

#define NON_SIZE_MASK ~((1 << NON_SIZE_BITS) - 1)

static inline void
_PyLong_FlipSign(PyLongObject *op) {
unsigned int flipped_sign = 2 - (op->long_value.lv_tag & SIGN_MASK);
op->long_value.lv_tag &= NON_SIZE_MASK;
op->long_value.lv_tag |= flipped_sign;
}
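
`_PyLong_FlipSign` relies on the sign encoding being symmetric around 1: subtracting the sign bits from 2 maps 0 (positive) to 2 (negative) and back, and leaves 1 (zero) unchanged. A minimal sketch on raw tag values, with `demo_flip_sign` as a hypothetical name. Like the original, it clears all three low bits before writing the new sign, so the reserved bit 2 is not preserved.

```c
#include <stdint.h>

#define SIGN_MASK 3
#define NON_SIZE_BITS 3

static inline uintptr_t
demo_flip_sign(uintptr_t tag)
{
    /* 0 -> 2, 1 -> 1, 2 -> 0 */
    uintptr_t flipped_sign = 2 - (tag & SIGN_MASK);
    /* Keep only the digit count stored in bits 3 and up. */
    uintptr_t size_bits = tag & ~(((uintptr_t)1 << NON_SIZE_BITS) - 1);
    return size_bits | flipped_sign;
}
```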

#define _PyLong_DIGIT_INIT(val) \
{ \
.ob_base = _PyObject_IMMORTAL_INIT(&PyLong_Type), \
.long_value = { \
.lv_tag = TAG_FROM_SIGN_AND_SIZE( \
(val) == 0 ? 0 : ((val) < 0 ? -1 : 1), \
(val) == 0 ? 0 : 1), \
{ ((val) >= 0 ? (val) : -(val)) }, \
} \
}

#define _PyLong_FALSE_TAG TAG_FROM_SIGN_AND_SIZE(0, 0)
#define _PyLong_TRUE_TAG TAG_FROM_SIGN_AND_SIZE(1, 1)

#ifdef __cplusplus
}
#endif
3 changes: 2 additions & 1 deletion Include/internal/pycore_object.h
@@ -136,8 +136,9 @@ static inline void
_PyObject_InitVar(PyVarObject *op, PyTypeObject *typeobj, Py_ssize_t size)
{
assert(op != NULL);
Py_SET_SIZE(op, size);
assert(typeobj != &PyLong_Type);
_PyObject_Init((PyObject *)op, typeobj);
Py_SET_SIZE(op, size);
}


10 changes: 1 addition & 9 deletions Include/internal/pycore_runtime_init.h
@@ -8,6 +8,7 @@ extern "C" {
# error "this header requires Py_BUILD_CORE define"
#endif

#include "pycore_long.h"
#include "pycore_object.h"
#include "pycore_parser.h"
#include "pycore_pymem_init.h"
@@ -130,15 +131,6 @@ extern PyTypeObject _PyExc_MemoryError;

// global objects

#define _PyLong_DIGIT_INIT(val) \
{ \
.ob_base = _PyObject_IMMORTAL_INIT(&PyLong_Type), \
.long_value = { \
((val) == 0 ? 0 : ((val) > 0 ? 1 : -1)), \
{ ((val) >= 0 ? (val) : -(val)) }, \
} \
}

#define _PyBytes_SIMPLE_INIT(CH, LEN) \
{ \
_PyVarObject_IMMORTAL_INIT(&PyBytes_Type, (LEN)), \
8 changes: 7 additions & 1 deletion Include/object.h
@@ -138,8 +138,13 @@ static inline PyTypeObject* Py_TYPE(PyObject *ob) {
# define Py_TYPE(ob) Py_TYPE(_PyObject_CAST(ob))
#endif

PyAPI_DATA(PyTypeObject) PyLong_Type;
PyAPI_DATA(PyTypeObject) PyBool_Type;

// bpo-39573: The Py_SET_SIZE() function must be used to set an object size.
static inline Py_ssize_t Py_SIZE(PyObject *ob) {
assert(ob->ob_type != &PyLong_Type);
assert(ob->ob_type != &PyBool_Type);
PyVarObject *var_ob = _PyVarObject_CAST(ob);
return var_ob->ob_size;
}
@@ -171,8 +176,9 @@ static inline void Py_SET_TYPE(PyObject *ob, PyTypeObject *type) {
# define Py_SET_TYPE(ob, type) Py_SET_TYPE(_PyObject_CAST(ob), type)
#endif


static inline void Py_SET_SIZE(PyVarObject *ob, Py_ssize_t size) {
assert(ob->ob_base.ob_type != &PyLong_Type);
assert(ob->ob_base.ob_type != &PyBool_Type);
ob->ob_size = size;
}
#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 < 0x030b0000
@@ -0,0 +1,7 @@
Rearrange bits in first field (after header) of PyLongObject. * Bits 0 and 1:
1 - sign. I.e. 0 for positive numbers, 1 for zero and 2 for negative numbers.
* Bit 2 reserved (probably for the immortal bit) * Bits 3+ the unsigned
size.
Member

Format as bullets?

Suggested change
Rearrange bits in first field (after header) of PyLongObject. * Bits 0 and 1:
1 - sign. I.e. 0 for positive numbers, 1 for zero and 2 for negative numbers.
* Bit 2 reserved (probably for the immortal bit) * Bits 3+ the unsigned
size.
Rearrange bits in first field (after header) of PyLongObject:
* Bits 0 and 1: 1 - sign. I.e. 0 for positive numbers, 1 for zero and 2 for negative numbers.
* Bit 2 reserved (probably for the immortal bit).
* Bits 3+ the unsigned size.

Member Author

Fixed. I suspect it got reformatted by something.


This makes a few operations slightly more efficient, and will enable a more
compact and faster 2s-complement representation of most ints in future.
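
The layout in this news entry can be read back with two small helpers. This is a sketch on raw `uintptr_t` tag values; `demo_sign` and `demo_digit_count` are hypothetical names that mirror what the new inline functions in `pycore_long.h` compute.

```c
#include <stddef.h>
#include <stdint.h>

#define SIGN_MASK 3
#define NON_SIZE_BITS 3

/* Bits 0-1 hold 1 - sign: returns 1 for positive, 0 for zero, -1 for negative. */
static inline int
demo_sign(uintptr_t tag)
{
    return 1 - (int)(tag & SIGN_MASK);
}

/* Bits 3 and up hold the unsigned digit count. */
static inline size_t
demo_digit_count(uintptr_t tag)
{
    return (size_t)(tag >> NON_SIZE_BITS);
}
```

For example, the tag 26 (0b11010) describes a negative int with three digits, and the tag 1 describes zero, which has no digits.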
43 changes: 8 additions & 35 deletions Modules/_decimal/_decimal.c
@@ -30,6 +30,7 @@
#endif

#include <Python.h>
#include "pycore_long.h" // _PyLong_IsZero()
#include "pycore_pystate.h" // _PyThreadState_GET()
#include "complexobject.h"
#include "mpdecimal.h"
@@ -2146,35 +2147,25 @@ dec_from_long(PyTypeObject *type, PyObject *v,
{
PyObject *dec;
PyLongObject *l = (PyLongObject *)v;
Py_ssize_t ob_size;
size_t len;
uint8_t sign;

dec = PyDecType_New(type);
if (dec == NULL) {
return NULL;
}

ob_size = Py_SIZE(l);
if (ob_size == 0) {
if (_PyLong_IsZero(l)) {
_dec_settriple(dec, MPD_POS, 0, 0);
return dec;
}

if (ob_size < 0) {
len = -ob_size;
sign = MPD_NEG;
}
else {
len = ob_size;
sign = MPD_POS;
}
uint8_t sign = _PyLong_IsNegative(l) ? MPD_NEG : MPD_POS;

if (len == 1) {
_dec_settriple(dec, sign, *l->long_value.ob_digit, 0);
if (_PyLong_IsCompact(l)) {
_dec_settriple(dec, sign, l->long_value.ob_digit[0], 0);
mpd_qfinalize(MPD(dec), ctx, status);
return dec;
}
size_t len = _PyLong_DigitCount(l);

#if PYLONG_BITS_IN_DIGIT == 30
mpd_qimport_u32(MPD(dec), l->long_value.ob_digit, len, sign, PyLong_BASE,
@@ -3482,7 +3473,6 @@ dec_as_long(PyObject *dec, PyObject *context, int round)
PyLongObject *pylong;
digit *ob_digit;
size_t n;
Py_ssize_t i;
mpd_t *x;
mpd_context_t workctx;
uint32_t status = 0;
@@ -3536,26 +3526,9 @@ dec_as_long(PyObject *dec, PyObject *context, int round)
}

assert(n > 0);
pylong = _PyLong_New(n);
if (pylong == NULL) {
mpd_free(ob_digit);
mpd_del(x);
return NULL;
}

memcpy(pylong->long_value.ob_digit, ob_digit, n * sizeof(digit));
assert(!mpd_iszero(x));
pylong = _PyLong_FromDigits(mpd_isnegative(x), n, ob_digit);
mpd_free(ob_digit);

i = n;
while ((i > 0) && (pylong->long_value.ob_digit[i-1] == 0)) {
i--;
}

Py_SET_SIZE(pylong, i);
if (mpd_isnegative(x) && !mpd_iszero(x)) {
Py_SET_SIZE(pylong, -i);
}

mpd_del(x);
return (PyObject *) pylong;
}
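
The lines removed from `dec_as_long` trimmed zero digits from the most significant end and then stored a signed size. Below is a minimal sketch of that normalization expressed in the new tag layout. `demo_normalized_tag` is a hypothetical helper and `uint32_t` stands in for `digit`. Whether `_PyLong_FromDigits` performs the same trimming internally is not visible in this part of the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define NON_SIZE_BITS 3

static inline uintptr_t
demo_normalized_tag(int negative, size_t n, const uint32_t *digits)
{
    /* Drop zero digits from the most significant end. */
    while (n > 0 && digits[n - 1] == 0) {
        n--;
    }
    /* Zero has no digits and its own sign encoding. */
    int sign = (n == 0) ? 0 : (negative ? -1 : 1);
    return (uintptr_t)(1 - sign) | ((uintptr_t)n << NON_SIZE_BITS);
}
```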
2 changes: 1 addition & 1 deletion Modules/_testcapi/mem.c
@@ -347,7 +347,7 @@ test_pyobject_new(PyObject *self, PyObject *Py_UNUSED(ignored))
{
PyObject *obj;
PyTypeObject *type = &PyBaseObject_Type;
PyTypeObject *var_type = &PyLong_Type;
PyTypeObject *var_type = &PyBytes_Type;

// PyObject_New()
obj = PyObject_New(PyObject, type);