Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-99508: fix TypeError in Lib/importlib/_bootstrap_external.py #99635

Merged
merged 1 commit into from
Nov 23, 2022

Conversation

sobolevn
Copy link
Member

@sobolevn sobolevn commented Nov 21, 2022

This code is clearly not covered and not used (otherwise this bug would be noticed):

  1. Do we need to test it?
  2. How can we do it?

CC @FFY00

@DavidCEllis
Copy link
Contributor

DavidCEllis commented Nov 21, 2022

I obviously can't answer 1. but with regard to 2.

As far as I can tell in order to hit this line the test would need to have a subclass of SourceLoader that has an implementation of get_data that will raise an EOFError or ImportError the first time it is called on a .py file but not on subsequent calls (and always succeed on .pyc files). It has to fail only the first time and only on .py files otherwise the method will fail without reaching the line or source_hash will already exist.

I'm not sure such a test would make sense, but I guess that's a separate discussion?

(I was writing my own SourceFileLoader and spent a lot of time staring at this code trying to figure out why this line wasn't failing).

Copy link
Member

@FFY00 FFY00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct to me. I don't think we need a test, even though it'd be nice.

Requesting a review from @benjaminp as the author of #4575.

@benjaminp
Copy link
Contributor

A test would be good it have. (It seems you can get into this state by having a hash-based pyc where the source exists but not the bytecode cache?)

@DavidCEllis
Copy link
Contributor

A test would be good it have. (It seems you can get into this state by having a hash-based pyc where the source exists but not the bytecode cache?)

I don't think so, the only line in the method that changes hash_based from False is L1110 which requires self.get_data(bytecode_path) returns successfully first.

data = self.get_data(bytecode_path)
except OSError:
pass
else:
exc_details = {
'name': fullname,
'path': bytecode_path,
}
try:
flags = _classify_pyc(data, fullname, exc_details)
bytes_data = memoryview(data)[16:]
hash_based = flags & 0b1 != 0

@sobolevn
Copy link
Member Author

I think the test case is way too complex for such an obvious change:

  1. First we need to reach hash_based = flags & 0b1 != 0, it depends on data = self.get_data(bytecode_path) and flags = _classify_pyc(data, fullname, exc_details) been successful. So, we need either to mock it or to write (and then delete) some real bytecode.
  2. sys.dont_write_bytecode should be False and bytecode_path should be not None. It depends on global interpreter state and bytecode_path = cache_from_source(source_path) been successful
  3. source_mtime should be not None, it is set as source_mtime = int(st['mtime']) and depends on st = self.path_stats(source_path)
  4. Finally this should be True: if source_hash is None. It can only be achieved by skipping this condition:
if (_imp.check_hash_based_pycs != 'never' and
  (check_source or
  _imp.check_hash_based_pycs == 'always')):

I don't really think that such brittle unit tests will be worth it 😞

@DavidCEllis
Copy link
Contributor

I think the test case is way too complex for such an obvious change:
...
4. Finally this should be True: if source_hash is None. It can only be achieved by skipping this condition:

if (_imp.check_hash_based_pycs != 'never' and
  (check_source or
  _imp.check_hash_based_pycs == 'always')):

I don't really think that such brittle unit tests will be worth it 😞

I agree the test case is too complicated but it's actually worse. If this condition is skipped the function returns early in the else block of the try/except so you still can't reach this line.

else:
_bootstrap._verbose_message('{} matches {}', bytecode_path,
source_path)
return _compile_bytecode(bytes_data, name=fullname,
bytecode_path=bytecode_path,
source_path=source_path)

The only way for hash_based to be True and source_hash to be None and to reach the line later is if an ImportError or EOFError is raised (and so caught in the except block) after hash_based is set but before source_hash is set. I'm not sure if the 'real' methods have any chance of doing this.

@benjaminp benjaminp merged commit c69cfcd into python:main Nov 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants