-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hg: add email validation for hg users (bug 1691635) #217
base: main
Are you sure you want to change the base?
Conversation
landoapi/hg.py
Outdated
@@ -519,3 +523,9 @@ def read_checkout_file(self, path: str) -> str: | |||
|
|||
with checkout_file_path.open() as f: | |||
return f.read() | |||
|
|||
@staticmethod | |||
def extract_email_from_username(username: Union[str, bytes]) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
def extract_email_from_username(username: Union[str, bytes]) -> str: | |
def extract_email_from_username(username: str | bytes) -> str: |
landoapi/hg.py
Outdated
if not user: | ||
raise ValueError("Missing `User` header!") | ||
|
||
email = self.extract_email_from_username(user) | ||
if not is_valid_email(email): | ||
raise ValueError("Invalid email configured for Mercurial user!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be helpful to include the email address here in the output.
landoapi/hg.py
Outdated
def extract_email_from_username(username: Union[str, bytes]) -> str: | ||
"""Extracts an email from a Mercurial username, if it exists. | ||
Not guaranteed to return a valid email, make sure to validate.""" | ||
return str(username).split("<").pop().replace(">", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a regular expression. It would be good to add a unit test for this method as well.
@@ -20,3 +20,11 @@ def test_convertion_failure_string(id): | |||
def test_convertion_failure_integer(): | |||
with pytest.raises(TypeError): | |||
revision_id_to_int(123) | |||
|
|||
|
|||
def test_is_valid_email(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More test cases can be added here (for example test <test>
and test <test@test>
. And perhaps a few malformed ones.
landoapi/validation.py
Outdated
@@ -6,6 +6,7 @@ | |||
from connexion import ProblemException | |||
|
|||
REVISION_ID_RE = re.compile(r"^D(?P<id>[1-9][0-9]*)$") | |||
VALID_EMAIL_RE = re.compile(r"[^@ \t\r\n]+@[^@ \t\r\n]+\.[^@ \t\r\n]+") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be a misnomer since it could catch invalid email addresses too. Maybe ACCEPTED_EMAIL_RE
is a better name. There probably is a more suitable regex that is available on the internet.
An example of an invalid email that will pass the above: -@...
landoapi/validation.py
Outdated
|
||
|
||
def is_valid_email(email: str) -> bool: | ||
"""Given a string, determines if it is a valid email.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably put the regex pattern here, and describe in more detail in the docstring what it's doing.
landoapi/hg.py
Outdated
"""Extracts an email from a Mercurial username, if it exists. | ||
Not guaranteed to return a valid email, make sure to validate.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See pep 257.
"""Extracts an email from a Mercurial username, if it exists. | |
Not guaranteed to return a valid email, make sure to validate.""" | |
"""Extracts an email from a Mercurial username, if it exists. | |
Not guaranteed to return a valid email, make sure to validate.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought I would leave the comments here explicitly (though we chatted about them a while ago over Zoom). Main comment is re: the use of replace
.
def extract_email_from_username(username: str | bytes) -> str: | ||
"""Extracts an email from a Mercurial username, if it exists. | ||
|
||
Not guaranteed to return a valid email, make sure to validate.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not guaranteed to return a valid email, make sure to validate.""" | |
Not guaranteed to return a valid email, make sure to validate. | |
""" |
email = search(r"<.*?>", str(username)) | ||
if email: | ||
return email.group(0).replace("<", "").replace(">", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified/cleaned up using grouping.
assert True not in [is_valid_email(value) for value in invalid_emails] | ||
assert False not in [is_valid_email(value) for value in valid_emails] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: more of a personal preference than anything else, but could be a little more human readable.
assert True not in [is_valid_email(value) for value in invalid_emails] | |
assert False not in [is_valid_email(value) for value in valid_emails] | |
assert not any(is_valid_email(value) for value in invalid_emails) | |
assert all(is_valid_email(value) for value in valid_emails) |
while moz-phab will hopefully catch invalid emails before they make it to lando now, this PR adds an extra safeguard against incorrectly configured mercurial users' commits landing in a repo
Bug 1691635 - Lando landed a commit without a valid email address