Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[str] Add LiteralString overload for __getitem__ #12714

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Commits on Oct 1, 2024

  1. str: Add LiteralString overload for __getitem__

    In PEP 675, @gbleaney and I had specified a list of `LiteralString`-preserving [overloads](https://peps.python.org/pep-0675/#appendix-c-str-methods-that-preserve-literalstring) for `str`. However, we didn't specify an overload for `__getitem__` and didn't give any rationale either. IIRC this was an edge case we didn't want to take a strong decision on unless users wanted it.
    
    @carljm brought this up yesterday, so I think it's worth discussing.
    
    Pro: `my_literal_string[i]` or `my_literal_string[i:j]` should technically be compatible with `LiteralString`, since it is a substring of a literal-derived string.
    
    Con: The main downside is that an attacker might control the indexes and try to access a specific substring from a literal string in the code. For example, they might narrow down the string to `rm foo` or `SELECT *`.
    
    It's true that `join` and other methods could also construct dangerous strings from `LiteralString`s, and we even call that out as an accepted tradeoff in the PEP:
    
    > 4. Trivial functions could be constructed to convert a str to a LiteralString:
    >
    >     def make_literal(s: str) -> LiteralString:
    >         letters: Dict[str, LiteralString] = {
    >             "A": "A",
    >             "B": "B",
    >             ...
    >         }
    >         output: List[LiteralString] = [letters[c] for c in s]
    >         return "".join(output)
    >
    > We could mitigate the above using linting, code review, etc., but ultimately a clever, malicious developer attempting to circumvent the protections offered by LiteralString will always succeed. The important thing to remember is that LiteralString is not intended to protect against malicious developers; it is meant to protect against benign developers accidentally using sensitive APIs in a dangerous way (without getting in their way otherwise).
    >
    > Without LiteralString, the best enforcement tool API authors have is documentation, which is easily ignored and often not seen. With LiteralString, API misuse requires conscious thought and artifacts in the code that reviewers and future developers can notice.
    >
    > -- [PEP 675 - Appendix B: Limitations](https://peps.python.org/pep-0675/#appendix-b-limitations)
    
    `__getitem__`, however, seems a bit different, because it (and `split`, `zfill`, etc.) accept an index or width that could be used to construct a dangerous query or a humongous string. So, we need to clarify the intent a little.
    
    What was the intent of these overloads? We wanted to forbid "arbitrary user-supplied strings" while allowing methods that preserved literal strings. We were not trying to prevent every possible exploit on the string. Since `__getitem__` forbids arbitrary user-supplied strings and preserves literal strings, I think we should add an overload for it.
    
    Thanks to @carljm for bringing this up!
    
    cc @gbleaney @JelleZijlstra in case they have different opinions. Would also love to hear from other users.
    pradeep90 committed Oct 1, 2024
    Configuration menu
    Copy the full SHA
    648fc3c View commit details
    Browse the repository at this point in the history
  2. [str] Add type: ignore to suppress error about LiteralString over…

    …load.
    
    This should fix the error:
    
    ```
    /opt/hostedtoolcache/Python/3.10.15/x64/bin/python -m mypy.stubtest --check-typeshed --custom-typeshed-dir . --allowlist stdlib/@tests/stubtest_allowlists/common.txt --allowlist stdlib/@tests/stubtest_allowlists/linux.txt --allowlist stdlib/@tests/stubtest_allowlists/py310.txt --allowlist stdlib/@tests/stubtest_allowlists/linux-py310.txt
    error: not checking stubs due to mypy build errors:
    stdlib/builtins.pyi:595: error: Overloaded function signature 2 will never be matched: signature 1's parameter type(s) are the same or broader  [misc]
    ```
    
    I'm not sure why there is an error, because the overload seems valid to me. Other `LiteralString` overloads have the same `type: ignore[misc]`, so I'm just going to copy them.
    pradeep90 committed Oct 1, 2024
    Configuration menu
    Copy the full SHA
    cbfcdb6 View commit details
    Browse the repository at this point in the history