GH-88116: Use a compact format to represent end line and column offsets. #91666

Merged
markshannon merged 19 commits into python:main from compact-debug-info on Apr 21, 2022

@markshannon commented Apr 18, 2022
Member

Reduces the memory consumption for extended location information to less than half.
Location information now takes ~3.5 bytes per instruction, instead of 8. (Note that this is per instruction, not per code unit.)
All the information is saved in one table, instead of three.

#88116

@brandtbucher
Member

Looks like this results in a 6% memory improvement over main:

Faster (50):
- html5lib: 19.5 MB +- 3818.1 kB -> 14.3 MB +- 1504.1 kB: 1.36x faster
- pickle_dict: 9628.8 kB +- 639.9 kB -> 8825.6 kB +- 21.5 kB: 1.09x faster
- sympy_expand: 59.3 MB +- 15.9 kB -> 54.5 MB +- 40.5 kB: 1.09x faster
- sympy_integrate: 59.2 MB +- 341.5 kB -> 54.4 MB +- 439.2 kB: 1.09x faster
- sympy_str: 60.4 MB +- 16.4 kB -> 55.6 MB +- 34.4 kB: 1.09x faster
- go: 10.5 MB +- 357.3 kB -> 9933.6 kB +- 471.4 kB: 1.08x faster
- deltablue: 10.9 MB +- 153.2 kB -> 10.1 MB +- 183.4 kB: 1.08x faster
- sqlalchemy_declarative: 30.4 MB +- 97.6 kB -> 28.2 MB +- 33.6 kB: 1.08x faster
- meteor_contest: 11.0 MB +- 624.6 kB -> 10.3 MB +- 27.5 kB: 1.08x faster
- chameleon: 20.5 MB +- 101.0 kB -> 19.1 MB +- 164.3 kB: 1.08x faster
- json_dumps: 10.7 MB +- 282.2 kB -> 10172.4 kB +- 353.9 kB: 1.08x faster
- xml_etree_process: 13.2 MB +- 216.1 kB -> 12.2 MB +- 1339.7 kB: 1.08x faster
- unpickle: 9546.4 kB +- 48.9 kB -> 8880.0 kB +- 44.7 kB: 1.08x faster
- json_loads: 8948.8 kB +- 51.2 kB -> 8329.6 kB +- 19.2 kB: 1.07x faster
- fannkuch: 8866.0 kB +- 48.3 kB -> 8264.8 kB +- 43.4 kB: 1.07x faster
- sympy_sum: 67.2 MB +- 3430.4 kB -> 62.6 MB +- 2516.7 kB: 1.07x faster
- django_template: 37.3 MB +- 130.6 kB -> 34.8 MB +- 582.1 kB: 1.07x faster
- logging_silent: 9223.2 kB +- 51.4 kB -> 8618.0 kB +- 51.7 kB: 1.07x faster
- sqlalchemy_imperative: 30.2 MB +- 83.4 kB -> 28.2 MB +- 676.0 kB: 1.07x faster
- pickle_pure_python: 9472.8 kB +- 51.8 kB -> 8865.2 kB +- 34.5 kB: 1.07x faster
- pickle_list: 9458.0 kB +- 48.6 kB -> 8856.4 kB +- 38.9 kB: 1.07x faster
- unpickle_list: 9451.2 kB +- 41.3 kB -> 8850.4 kB +- 50.7 kB: 1.07x faster
- genshi_text: 12.4 MB +- 111.4 kB -> 11.6 MB +- 115.8 kB: 1.07x faster
- raytrace: 9852.4 kB +- 39.3 kB -> 9238.0 kB +- 20.8 kB: 1.07x faster
- pickle: 9450.0 kB +- 21.5 kB -> 8863.2 kB +- 50.6 kB: 1.07x faster
- unpickle_pure_python: 9460.0 kB +- 21.0 kB -> 8875.6 kB +- 45.7 kB: 1.07x faster
- nbody: 9023.6 kB +- 62.0 kB -> 8474.0 kB +- 18.8 kB: 1.06x faster
- regex_compile: 10.0 MB +- 23.3 kB -> 9648.8 kB +- 28.8 kB: 1.06x faster
- chaos: 9804.0 kB +- 46.1 kB -> 9218.8 kB +- 13.1 kB: 1.06x faster
- scimark_monte_carlo: 9552.0 kB +- 55.7 kB -> 8982.4 kB +- 52.3 kB: 1.06x faster
- scimark_sor: 9583.6 kB +- 56.9 kB -> 9017.6 kB +- 69.3 kB: 1.06x faster
- richards: 9411.2 kB +- 16.0 kB -> 8858.8 kB +- 63.3 kB: 1.06x faster
- unpack_sequence: 10.3 MB +- 62.4 kB -> 9914.4 kB +- 41.1 kB: 1.06x faster
- spectral_norm: 8828.8 kB +- 48.4 kB -> 8315.6 kB +- 40.4 kB: 1.06x faster
- scimark_fft: 9686.4 kB +- 55.9 kB -> 9128.8 kB +- 39.8 kB: 1.06x faster
- scimark_sparse_mat_mult: 10115.6 kB +- 25.0 kB -> 9543.6 kB +- 35.0 kB: 1.06x faster
- scimark_lu: 9614.0 kB +- 20.0 kB -> 9076.4 kB +- 31.9 kB: 1.06x faster
- pidigits: 9035.2 kB +- 19.6 kB -> 8532.0 kB +- 31.6 kB: 1.06x faster
- pathlib: 10.1 MB +- 41.3 kB -> 9790.8 kB +- 34.4 kB: 1.06x faster
- tornado_http: 31.4 MB +- 912.5 kB -> 29.7 MB +- 1137.7 kB: 1.06x faster
- crypto_pyaes: 9374.0 kB +- 129.7 kB -> 8869.2 kB +- 114.8 kB: 1.06x faster
- nqueens: 9085.6 kB +- 30.2 kB -> 8605.2 kB +- 53.3 kB: 1.06x faster
- sqlite_synth: 10.9 MB +- 38.3 kB -> 10.3 MB +- 27.4 kB: 1.06x faster
- xml_etree_generate: 13.4 MB +- 319.8 kB -> 12.7 MB +- 366.7 kB: 1.05x faster
- dulwich_log: 15.3 MB +- 97.1 kB -> 14.6 MB +- 82.8 kB: 1.05x faster
- genshi_xml: 12.9 MB +- 120.7 kB -> 12.4 MB +- 12.1 kB: 1.04x faster
- regex_v8: 14.6 MB +- 34.7 kB -> 14.1 MB +- 56.0 kB: 1.04x faster
- python_startup: 11.2 MB +- 28.6 kB -> 10.9 MB +- 41.4 kB: 1.03x faster
- python_startup_no_site: 11.2 MB +- 30.8 kB -> 10.9 MB +- 32.8 kB: 1.03x faster
- 2to3: 22.0 MB +- 51.4 kB -> 21.6 MB +- 64.3 kB: 1.02x faster

Benchmark hidden because not significant (11): float, hexiom, logging_format, logging_simple, mako, pyflate, regex_dna, regex_effbot, telco, xml_etree_parse, xml_etree_iterparse

Geometric mean: 1.06x faster

Objects/codeobject.c (outdated)

    else {
        int ldelta = get_line_delta(&data[offset]);
        *output++ = 0xe8 | blength;
        output += write_signed_varint(output, ldelta);
Member

Is this reimplementing write_location_info_no_column from compile.c?
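
For readers following the snippet above, here is a minimal sketch of how a "no column" entry might be written. It assumes the varint scheme used by the location table as I understand it: 6 data bits per byte, least significant chunk first, bit 6 set on every byte but the last, and the sign folded into the lowest bit for signed values. `write_no_column_entry` is a hypothetical helper introduced only for illustration, and the reading of `blength` as "length in code units minus one" is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned varint: 6 data bits per byte, least significant chunk first.
   Bit 6 (0x40) is set on every byte except the last one. */
static int
write_varint(uint8_t *ptr, unsigned int val)
{
    int written = 1;
    while (val >= 64) {
        *ptr++ = (uint8_t)(64 | (val & 63));
        val >>= 6;
        written++;
    }
    *ptr = (uint8_t)val;
    return written;
}

/* Signed varint: the magnitude is shifted left by one and the sign
   is stored in the lowest bit, then written as an unsigned varint. */
static int
write_signed_varint(uint8_t *ptr, int val)
{
    unsigned int uval;
    if (val < 0) {
        uval = ((0u - (unsigned int)val) << 1) | 1;
    }
    else {
        uval = (unsigned int)val << 1;
    }
    return write_varint(ptr, uval);
}

/* Hypothetical helper (not a CPython function): emit one entry that
   carries only a line delta. 0xe8 sets the top marker bit and the
   4-bit code 13; the low three bits are assumed to hold length - 1. */
static int
write_no_column_entry(uint8_t *out, int ldelta, int length)
{
    uint8_t *start = out;
    assert(length >= 1 && length <= 8);
    *out++ = (uint8_t)(0xe8 | (length - 1));
    out += write_signed_varint(out, ldelta);
    return (int)(out - start);
}
```

Under these assumptions, `write_no_column_entry(buf, -3, 2)` writes the two bytes `0xe9 0x07`: one header byte plus a one-byte line delta.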

Member

@iritkatriel Apr 21, 2022

We could probably get an even tighter representation for the !code_debug_ranges case (which is presumably the perf-critical one). Here we are only using two of the codes, and using 4 bits to specify which one. Am I right?

Member Author

Although there are only two codes, there is the implicit bit of information about whether code_debug_ranges is set.
So there are 6 bits of information in the byte.
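
To make the bit accounting in this exchange concrete, the sketch below splits an entry's first byte into its fields. The layout is an assumption based on the `0xe8 | blength` expression in the reviewed snippet: bit 7 marks the start of an entry, bits 3 to 6 hold a 4-bit code, and bits 0 to 2 hold the length in code units minus one. The names `entry_header` and `parse_entry_header` are hypothetical and do not appear in CPython.

```c
#include <stdint.h>

/* Fields packed into the first byte of a location entry
   (layout assumed, see the lead-in above). */
typedef struct {
    int is_entry_start;  /* bit 7: set on the first byte of an entry */
    int code;            /* bits 3-6: which entry form follows */
    int length;          /* bits 0-2 plus one: code units covered */
} entry_header;

static entry_header
parse_entry_header(uint8_t first_byte)
{
    entry_header h;
    h.is_entry_start = (first_byte >> 7) & 1;
    h.code = (first_byte >> 3) & 15;
    h.length = (first_byte & 7) + 1;
    return h;
}
```

With this layout, the byte `0xe9` decodes to a start marker of 1, code 13, and a length of 2 code units, which matches the header written as `0xe8 | blength` when `blength` is 1.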

Member

Why do we need this extra info in each entry?

Member Author

Which extra info?

Member

Whether code_debug_ranges is set. It's the same across all entries, no?

Member Author

I see. If that information is not stored in the entry, then we effectively have two distinct formats.
Which means two different sets of code for creating and parsing the tables.
I doubt that it is worth it.

@iritkatriel
Member

> Looks like this results in a 6% memory improvement over main:

I think we need a similar comparison with debug_ranges turned off.

Member

@iritkatriel left a comment

I think https://github.com/python/cpython/pull/91666/files#r854367808 can be addressed now (replace the 15 by the enum), but the other comments we can look at later.

It would be good in the future to have all the logic about the structure of this table in one place. Currently it's split between compile.c, codeobject.c and the tests. We can look at this (and the exceptions table) as part of the compiler work for 3.12.

We can also think later about an even-more optimised representation for line-numbers-only if we want.

@markshannon markshannon merged commit 944fffe into python:main Apr 21, 2022
@vstinner
Member

Is the documentation up to date? https://docs.python.org/dev/c-api/code.html#c.PyCode_New

Should it be documented in What's New in Python 3.11?

@markshannon
Member Author

The documentation is correct again, now I've removed the two extra tables. It wasn't before.

@markshannon
Member Author

No, it is wrong as it is missing the exception table.
Was that documentation ever right? 😞

@vstinner
Member

> Was that documentation ever right? 😞

The Python 3.10 doc looks correct, no? If not, it should be fixed ;-)

https://docs.python.org/3.10/c-api/code.html#c.PyCode_New

@pablogsal
Member

@markshannon There is something wrong with commit 944fffe and the new encoding format. Check out #93662

@markshannon markshannon deleted the compact-debug-info branch September 26, 2023 12:49