Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some supposedly invalid addresses in the documentation point toward malicious websites #102627

Closed
Blind4Basics opened this issue Mar 12, 2023 · 9 comments
Assignees
Labels
docs Documentation in the Doc dir

Comments

@Blind4Basics
Copy link
Contributor

Blind4Basics commented Mar 12, 2023

Describe the problem

I found in the documentation about concurrency some examples that have been "exploited" by malicious people:
in the ThreadPoolExecutor Example

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']   # <<<  (DO NOT TRY IT IN A BROWSER)
...

The last domain name is supposed to be non existent.
However, when I tried the snippet, I got a valid response on second try (the first one woke up their server).
It's not problematic with the code example, since the code of the page is just plain text, but anyone trying to go there through their browser might end up in some kind of troubles...

The content of the hosted page is apparently a "hard redirection" toward... something :

<html><head><title>Loading...</title></head>
<body>
    <script type='text/javascript'>window.location.replace(
        'http://some-made-up-domain.com/?ch=1&js=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhdWQiOiJKb2tlbiIsImV4cCI6MTY3ODYxNjgxMywiaWF0IjoxNjc4NjA5NjEzLCJpc3MiOiJKb2tlbiIsImpzIjoxLCJqdGkiOiIydDVwdDM2ajgyNjU0YjRma281ZjhhMGciLCJuYmYiOjE2Nzg2MDk2MTMsInRzIjoxNjc4NjA5NjEzODAyNDEzfQ.H4l5qNGb5Ex8ehG3hxX_kWx8ODqTMRgJs0HBeQyCx1Q&sid=a4f97e10-c0af-11ed-b324-9d77bf5b132c'
        );
    </script>
</body>
</html>

Expected solution

Any invalid address in the docs should point to invalid page in trustful domains, to not allow this kind of security hole.


Cheers

Linked PRs

@Blind4Basics Blind4Basics changed the title Some addresses in the documentation point toward malicious websites Some supposdely invalid addresses in the documentation point toward malicious websites Mar 12, 2023
@Blind4Basics Blind4Basics changed the title Some supposdely invalid addresses in the documentation point toward malicious websites Some supposedly invalid addresses in the documentation point toward malicious websites Mar 12, 2023
@CAM-Gerlach
Copy link
Member

CAM-Gerlach commented Mar 12, 2023

Thanks for the report. This issue appears to be in the Python documentation, not the devguide, so I'm transferring it there.

If the last example must be non-existent, it could presumably use a deliberately non-existing subdomain of the python.org domain, which we control, or a subdomain of the reserved example.com domain, or one of the other canonical example.* domains.

Could you list the other examples you've found here, so they can be fixed? Would you like to submit a PR to fix this, and any others you've found? Thanks!

@CAM-Gerlach CAM-Gerlach transferred this issue from python/devguide Mar 12, 2023
@CAM-Gerlach CAM-Gerlach added the docs Documentation in the Doc dir label Mar 12, 2023
@Mariatta
Copy link
Member

I agree we should update such invalid urls. I would also suggest updating even the valid urls like fox news into perhaps other more Python-related websites, but that could be done as a separate PR.

@Blind4Basics
Copy link
Contributor Author

Blind4Basics commented Mar 12, 2023

oh, sorry for posting in the wrong repo...

I actually didn't search in other snippets. But in the same one, a second address is to be handled: 'http://europe.wsj.com/' (it still is invalid so far)
Edit: it might actually be better to use only addresses you control, even for valid ones? (oh, already pointed in the previous message... x) )

About the PR: maybe I'll give it a shot, but no guarantees.

@CAM-Gerlach
Copy link
Member

I actually didn't search in other snippets. But in the same one, a second address is to be handled: 'http://europe.wsj.com/' (it still is invalid so far)

That one's not a security issue, since the WSJ domain is still owned by WSJ (presumably, they just changed their subdomain layout), and it just redirects rather than is actually broken. We might want to change those to point to more appropriate Python-related websites and avoid any political controversy, but as @Mariatta mentioned, that's a separate issue.

About the PR: maybe I'll give it a shot, but no guarantees.

It would be a great issue to get started with, since it's only a one-line docs change. The devguide lays out the steps (which I figure you're probably familiar with, given your GitHub history) but let us know if you need any help! Thanks!

@hugovk
Copy link
Member

hugovk commented Mar 12, 2023

Also https://example.com, https://example.org and https://example.net are specifically reserved for documentation.

https://www.iana.org/domains/reserved

Blind4Basics added a commit to Blind4Basics/cpython that referenced this issue Mar 12, 2023
FIX python#102627 

partial fix only: would require to go through the entire documentation
@Blind4Basics
Copy link
Contributor Author

Blind4Basics commented Mar 12, 2023

I made a try. Hopefully I didn't mess the workflow up, going through a fork...

@CAM-Gerlach
Copy link
Member

Thanks! Reviewed it there.

gpshead pushed a commit that referenced this issue Mar 13, 2023
* Replace known bad address pointing toward a malicious web page.

Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Mar 13, 2023
…ythonGH-102630)

* Replace known bad address pointing toward a malicious web page.

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Mar 13, 2023
…ythonGH-102630)

* Replace known bad address pointing toward a malicious web page.

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Mar 13, 2023
…ythonGH-102630)

* Replace known bad address pointing toward a malicious web page.

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Mar 13, 2023
…ythonGH-102630)

* Replace known bad address pointing toward a malicious web page.

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
@gpshead
Copy link
Member

gpshead commented Mar 13, 2023

thanks for the PR. merged. the backports are approved and should merge themselves after CI or when the release managers get to them. (assigned to me just to double check that happening in the future)

@gpshead gpshead self-assigned this Mar 13, 2023
miss-islington added a commit that referenced this issue Mar 13, 2023
)

* Replace known bad address pointing toward a malicious web page.

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
miss-islington added a commit that referenced this issue Mar 13, 2023
)

* Replace known bad address pointing toward a malicious web page.

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
ned-deily pushed a commit that referenced this issue Mar 13, 2023
…H-102630) (GH-102668)

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
ned-deily pushed a commit that referenced this issue Mar 13, 2023
…H-102630) (GH-102666)

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
ned-deily pushed a commit that referenced this issue Mar 13, 2023
…H-102630) (GH-102667)

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
@ned-deily
Copy link
Member

All merged, thanks for the PR.

carljm added a commit to carljm/cpython that referenced this issue Mar 14, 2023
* main: (50 commits)
  pythongh-102674: Remove _specialization_stats from Lib/opcode.py (python#102685)
  pythongh-102660: Handle m_copy Specially for the sys and builtins Modules (pythongh-102661)
  pythongh-102354: change python3 to python in docs examples (python#102696)
  pythongh-81057: Add a CI Check for New Unsupported C Global Variables (pythongh-102506)
  pythonGH-94851: check unicode consistency of static strings in debug mode (python#102684)
  pythongh-100315: clarification to `__slots__` docs. (python#102621)
  pythonGH-100227: cleanup initialization of global interned dict (python#102682)
  doc: Remove a duplicate 'versionchanged' in library/asyncio-task (pythongh-102677)
  pythongh-102013: Add PyUnstable_GC_VisitObjects (python#102014)
  pythonGH-102670: Use sumprod() to simplify, speed up, and improve accuracy of statistics functions (pythonGH-102649)
  pythongh-102627: Replace address pointing toward malicious web page (python#102630)
  pythongh-98831: Use DECREF_INPUTS() more (python#102409)
  pythongh-101659: Avoid Allocation for Shared Exceptions in the _xxsubinterpreters Module (pythongh-102659)
  pythongh-101524: Fix the ChannelID tp_name (pythongh-102655)
  pythongh-102069: Fix `__weakref__` descriptor generation for custom dataclasses (python#102075)
  pythongh-98169 dataclasses.astuple support DefaultDict (python#98170)
  pythongh-102650: Remove duplicate include directives from multiple source files (python#102651)
  pythonGH-100987: Don't cache references to the names and consts array in `_PyEval_EvalFrameDefault`. (python#102640)
  pythongh-87092: refactor assemble() to a number of separate functions, which do not need the compiler struct (python#102562)
  pythongh-102192: Replace PyErr_Fetch/Restore etc by more efficient alternatives (python#102631)
  ...
Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this issue Mar 27, 2023
…ython#102630)

* Replace known bad address pointing toward a malicious web page.

Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
warsaw pushed a commit to warsaw/cpython that referenced this issue Apr 11, 2023
…ython#102630)

* Replace known bad address pointing toward a malicious web page.

Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
carlosroman added a commit to DataDog/cpython that referenced this issue Jun 22, 2023
* Post 3.8.16

* [3.8] Update copyright years to 2023. (pythongh-100852)

* [3.8] Update copyright years to 2023. (pythongh-100848).
(cherry picked from commit 11f9932)

Co-authored-by: Benjamin Peterson <benjamin@python.org>

* Update additional copyright years to 2023.

Co-authored-by: Ned Deily <nad@python.org>

* [3.8] Update copyright year in README (pythonGH-100863) (pythonGH-100867)

(cherry picked from commit 30a6cc4)

Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: HARSHA VARDHAN <75431678+Thunder-007@users.noreply.github.com>

* [3.8] Correct CVE-2020-10735 documentation (pythonGH-100306) (python#100698)

(cherry picked from commit 1cf3d78)
(cherry picked from commit 88fe8d7)

Co-authored-by: Jeremy Paige <ucodery@gmail.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>

* [3.8] Bump Azure Pipelines to ubuntu-22.04 (pythonGH-101089) (python#101215)

(cherry picked from commit c22a55c)

Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>

* [3.8] pythongh-100180: Update Windows installer to OpenSSL 1.1.1s (pythonGH-100903) (python#101258)

* pythongh-101422: (docs) TarFile default errorlevel argument is 1, not 0 (pythonGH-101424)

(cherry picked from commit ea23271)

Co-authored-by: Owain Davies <116417456+OTheDev@users.noreply.github.com>

* [3.8] pythongh-95778: add doc missing in some places (pythonGH-100627) (python#101630)

(cherry picked from commit 4652182)

* [3.8] pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286) (python#101710)

Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>

* [3.8] pythongh-101981: Fix Ubuntu SSL tests with OpenSSL (3.1.0-beta1) CI i… (python#102095)

[3.8] pythongh-101981: Fix Ubuntu SSL tests with OpenSSL (3.1.0-beta1) CI issue (pythongh-102079)

* [3.8] pythonGH-102306 Avoid GHA CI macOS test_posix failure by using the appropriate macOS SDK (pythonGH-102307)

[3.8] Avoid GHA CI macOS test_posix failure by using the appropriate macOS SDK.

* [3.8] pythongh-101726: Update the OpenSSL version to 1.1.1t (pythonGH-101727) (pythonGH-101752)

Fixes CVE-2023-0286 (High) and a couple of Medium security issues.
https://www.openssl.org/news/secadv/20230207.txt

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Ned Deily <nad@python.org>

* [3.8] pythongh-102627: Replace address pointing toward malicious web page (pythonGH-102630) (pythonGH-102667)

(cherry picked from commit 61479d4)

Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>

* [3.8] pythongh-101997: Update bundled pip version to 23.0.1 (pythonGH-101998). (python#102244)

(cherry picked from commit 89d9ff0)

* [3.8] pythongh-102950: Implement PEP 706 – Filter for tarfile.extractall (pythonGH-102953) (python#104548)

Backport of c8c3956

* [3.8] pythongh-99889: Fix directory traversal security flaw in uu.decode() (pythonGH-104096) (python#104332)

(cherry picked from commit 0aeda29)

Co-authored-by: Sam Carroll <70000253+samcarroll42@users.noreply.github.com>

* [3.8] pythongh-104049: do not expose on-disk location from SimpleHTTPRequestHandler (pythonGH-104067) (python#104121)

Do not expose the local server's on-disk location from `SimpleHTTPRequestHandler` when generating a directory index. (unnecessary information disclosure)

(cherry picked from commit c7c3a60)

Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>

* [3.8] pythongh-103935: Use `io.open_code()` when executing code in trace and profile modules (pythonGH-103947) (python#103954)

Co-authored-by: Tian Gao <gaogaotiantian@hotmail.com>

* [3.8] pythongh-68966: fix versionchanged in docs (pythonGH-105299)

* [3.8] Update GitHub CI workflow for macOS. (pythonGH-105302)

* [3.8] pythongh-105184: document that marshal functions can fail and need to be checked with PyErr_Occurred (pythonGH-105185) (python#105222)

(cherry picked from commit ee26ca1)

Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>

* [3.8] pythongh-102153: Start stripping C0 control and space chars in `urlsplit` (pythonGH-102508) (pythonGH-104575) (pythonGH-104592) (python#104593) (python#104895)

`urllib.parse.urlsplit` has already been respecting the WHATWG spec a bit pythonGH-25595.

This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/GH-url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).

I simplified the docs by eliding the state of the world explanatory
paragraph in this security release only backport.  (people will see
that in the mainline /3/ docs)

(cherry picked from commit d7f8a5f)
(cherry picked from commit 2f630e1)
(cherry picked from commit 610cc0a)
(cherry picked from commit f48a96a)

Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>

* [3.8] pythongh-103142: Upgrade binary builds and CI to OpenSSL 1.1.1u (pythonGH-105174) (pythonGH-105200) (pythonGH-105205) (python#105370)

Upgrade builds to OpenSSL 1.1.1u.

Also updates _ssl_data_111.h from OpenSSL 1.1.1u, _ssl_data_300.h from 3.0.9.

Manual edits to the _ssl_data_300.h file prevent it from removing any
existing definitions in case those exist in some peoples builds and were
important (avoiding regressions during backporting).

(cherry picked from commit ede89af)
(cherry picked from commit e15de14)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Ned Deily <nad@python.org>

* Python 3.8.17

* Post 3.8.17

* Updated CI to build 3.8.17

---------

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Co-authored-by: Benjamin Peterson <benjamin@python.org>
Co-authored-by: Ned Deily <nad@python.org>
Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Co-authored-by: HARSHA VARDHAN <75431678+Thunder-007@users.noreply.github.com>
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Jeremy Paige <ucodery@gmail.com>
Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
Co-authored-by: Steve Dower <steve.dower@python.org>
Co-authored-by: Owain Davies <116417456+OTheDev@users.noreply.github.com>
Co-authored-by: Éric <earaujo@caravan.coop>
Co-authored-by: Oleg Iarygin <oleg@arhadthedev.net>
Co-authored-by: Steve Dower <steve.dower@microsoft.com>
Co-authored-by: Dong-hee Na <donghee.na@python.org>
Co-authored-by: Blind4Basics <32236948+Blind4Basics@users.noreply.github.com>
Co-authored-by: C.A.M. Gerlach <CAM.Gerlach@Gerlach.CAM>
Co-authored-by: Pradyun Gedam <pradyunsg@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Sam Carroll <70000253+samcarroll42@users.noreply.github.com>
Co-authored-by: Ethan Furman <ethan@stoneleaf.us>
Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Tian Gao <gaogaotiantian@hotmail.com>
Co-authored-by: Irit Katriel <1055913+iritkatriel@users.noreply.github.com>
Co-authored-by: stratakis <cstratak@redhat.com>
Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

No branches or pull requests

6 participants