Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial stub of proxito #6226

Merged
merged 34 commits into from
Oct 9, 2019
Merged

Initial stub of proxito #6226

merged 34 commits into from
Oct 9, 2019

Conversation

ericholscher
Copy link
Member

@ericholscher ericholscher commented Oct 1, 2019

This is an initial attempt at adding a method of doc serving that goes through Django, instead of being done by an Nginx server with symlinks managed by Django. It depends on the Read the Docs codebase, and is mostly just a simplification of the existing serving code. It's done in parallel so that we can deploy it alongside the current running application with just settings changes.

Primary Changes

Current problems

  • The tests are all running explicitly against the old middleware which is a different style. I don't know the best way to handle this, and probably need to just fork the tests, but that feels bad :/
    Fix: I went ahead and just converted the middleware to the old-style class for testing. I got tired of beating my head against it :)

  • I also want to convert the doc serving code to be class-based, so we can extend it in a nicer fashion on the .com -- currently the logic has diverged pretty dramatically, and I'm hoping we can get them much closer together with just one or two things changed

Ops changes

There will be a corresponding ops update needed, but the gist of it is:

server {
    listen 80;
    server_name dev.readthedocs.io;
   
    location / {
        // proxito
        proxy_pass http://127.0.0.1:8001;
        proxy_buffering off;
        proxy_intercept_errors on;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $host;

        error_page 404 420 = @fallback;
        add_header X-Served Root always;
    }

    location /proxito/ {
        internal;
        alias /Users/eric/projects/readthedocs.org/media/;

        error_page 404 420 = @fallback;
        add_header X-Served Nginx-Sendfile always;
    }

   
    location @fallback {
        // RTD
        proxy_pass http://127.0.0.1:8000;
        proxy_buffering off;
        proxy_intercept_errors on;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $host;

        add_header X-Served Nginx-Fallback always;
    }

}

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks that it's going in a good direction. I didn't test it locally, though. I left only nit comments about style or naming.

I think that most of this code comes from what we already have in other places, but it's easier to follow here since it's alone and split in different functions.

readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/middleware.py Show resolved Hide resolved
readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/views.py Outdated Show resolved Hide resolved
readthedocs/proxito/views.py Outdated Show resolved Hide resolved
Copy link
Contributor

@davidfischer davidfischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is starting to take shape and that's great.

What other features are needed for this to launch? User defined redirects?

readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/settings/proxito.py Show resolved Hide resolved
readthedocs/proxito/urls.py Outdated Show resolved Hide resolved
Comment on lines 147 to 153
# Handle a / redirect when we aren't a single version
if all([lang_slug is None, version_slug is None, filename == '',
not current_project.single_version]):
log.info('Proxito redirect: slug=%s', current_project.slug)
return redirect_project_slug(
request, project=current_project, subproject=None
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found a change in the behavior of the / redirect.

Currently in production, if we hit the .readthedocs.io URL (https://requests.readthedocs.io/) of a project with a Custom Domain defined, our NGINX will serve the documentation under that URL.

On the other hand, El Proxito is redirecting me to the Custom Domain project URL.

The new behavior is what I like the most (and I think @davidfischer as well) but I just wanted to note that there is a change here that we may need to take into account.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I'm mixed about what the proper behavior is. I think the root redirects are fine to automatically use the custom domain, but I'm guessing we'll actually want this to be configurable in the future. Sometimes I actually want to be able to go to the readthedocs.io domain even when there's a custom domain.

subproject,
lang_slug=None,
version_slug=None,
filename='',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this default to be None as well? It makes more sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the way Django processes the URL regex, it always matches to ''. I could leave it without a default actually, but this seems more explicit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see --no strong opinion, then.

Comment on lines +179 to +180
# TODO: Redirects need to be refactored before we can turn them on
# They currently do 1 request per redirect that exists for the project
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that the redirects would work properly if they are enabled at this point. Although, the problem here is being really slow because we will be hitting the db for all the redirects the project have all the times even if there is not match.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

Comment on lines 212 to 213
# Serve from the filesystem if using DEBUG or Testing
# Tests require this now since we don't want to check for the file existing in prod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to follow this. Where this code is checking for file existing?

I understand that PYTHON_MEDIA=True is useful to run this locally and get docs served properly instead of just seeing the file served in the response. I'm not seeing why this is needed for testing, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The django serve function will check for files existing. This comment is confusing though, I will rewrite it.

@ericholscher
Copy link
Member Author

What other features are needed for this to launch? User defined redirects?

Nothing really. We can fallback to the Django app for all the 404's, so we can continue using all the existing app logic for redirects, 404's, and robots.txt handling. The goal will be to port all of that to proxito, but baby steps :)

@ericholscher ericholscher marked this pull request as ready for review October 7, 2019 17:47
@ericholscher ericholscher requested a review from a team October 7, 2019 17:47
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I left some small comments to consider

readthedocs/proxito/tests/test_full.py Show resolved Hide resolved
readthedocs/proxito/tests/test_middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/tests/test_middleware.py Outdated Show resolved Hide resolved
Copy link
Member

@stsewd stsewd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review proxito/views.py looks like it's just a copy of the code from core/views/serve.py?

readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/middleware.py Outdated Show resolved Hide resolved
readthedocs/proxito/middleware.py Show resolved Hide resolved
# Code to be executed for each request before
# the view (and later middleware) are called.

host_project = map_host_to_project_slug(request)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That function doesn't always return a slug, sometimes it returns a response.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe refactor the function to return a tuple slug, response or try..cath exceptions and move the responses to the other part of the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I wasn't sure the best way to refactor this. It does validation, and we have multiple different responses we want (invalid and also CNAME 404).


class NewStyleProxitoMiddleware:

"""The new style middleware, I can't figure out how to test it."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can test the request at least by creating a method process_request(self, request), that way you don't depend on self.get_response or mock it to test with localhost and testserver as hostname.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think you only want to test is the request

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, the get_response thing just feels hard to test. I think having it old-style for now is fine, and we can figure out the migration in the future.

Copy link
Member Author

@ericholscher ericholscher Oct 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just going to delete the new-style middleware for now.

Copy link
Member Author

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

)
subproject = rel.child
except (ProjectRelationship.DoesNotExist, KeyError):
subproject = get_object_or_404(Project, slug=subproject_slug)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we aren't even actually validating the subproject relationship here. It seems like we could accidentally serve a random project as a subproject of another project :/


These settings will eventually be backported into the main settings file,
but currently we have them to be able to run the site with the old middleware for
a staged rollout of the proxito code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
a staged rollout of the proxito code.
a staged rollout of El Proxito code.

😄

ericholscher and others added 2 commits October 9, 2019 08:54
Co-Authored-By: Manuel Kaufmann <humitos@gmail.com>
@ericholscher ericholscher merged commit 2a7b6d3 into master Oct 9, 2019
@ericholscher ericholscher deleted the proxito-stub branch October 9, 2019 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants