Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redirect when accessing a container without trailing slash #1208

Open
RubenVerborgh opened this issue Mar 9, 2022 · 11 comments
Open

Redirect when accessing a container without trailing slash #1208

RubenVerborgh opened this issue Mar 9, 2022 · 11 comments

Comments

@RubenVerborgh
Copy link
Member

Current behavior:

$ curl -I https://drive.verborgh.org/public
HTTP/2 401
…
$ curl -I https://drive.verborgh.org/public/2022
HTTP/2 404
…

Desired behavior:

$ curl -I https://drive.verborgh.org/public
HTTP/2 301
Location: https://drive.verborgh.org/public/
…
$ curl -I https://drive.verborgh.org/public/2022
HTTP/2 301
Location: https://drive.verborgh.org/public/2022/
…

However, we should be careful about the first case. A 401 (or 404) should probably still be given for https://drive.verborgh.org/private in order to not leak information about the thing being a container (although that does not seem major).

@csarven
Copy link

csarven commented Mar 9, 2022

#server-authorization-redirect states (/TR/2021/protocol-20211217 and /TR/ED as of this writing):

Servers MUST authorize prior to this optional redirect.

GET /foo can be a 301 if and only if GET /foo/ is a 200. If GET /public/ is a 403, GET /public must also be a 403.

(If request lacks valid authentication credentials or the provided credentials is refused, then 401.)

If the word "authorize" in the requirement somehow got conflated with "401 Unauthorized" then I'm happy to update the text with what's intended, i.e., when access is denied due to permissions, which typically entails a 403.

@joachimvh
Copy link
Member

There's a comment in the code where we could add it: https://github.com/solid/community-server/blob/982d57e1d2812925b09353445a318d067a242943/src/storage/DataAccessorBasedStore.ts#L100

Doing it in that location has as consequence that we first do authorization on the wrong resource, and only if the agent has read access there would the redirect be thrown.

Specifically:

  • GET /foo request comes in
  • Standard acl authorization is checked on /foo
  • In case there is no read access: 401/403
  • In case there is read access: 301 to /foo/ (if it exists, and after changing the line above).
  • GET /foo/ request comes in and result is 401/403/200 depending on the acl rules for /foo/

If we want to only redirect to /foo/ in case we have read rights there that means we already need to read the acl rules for /foo/ while processing the /foo request which would potentially require some changes in how we handle authorization. We already check the existence of the target resource in #1185, but this would also require us to check the existence and the permissions of a different resource.

@joachimvh joachimvh removed the good first issue Good for newcomers label Oct 7, 2022
@CxRes
Copy link

CxRes commented Apr 25, 2023

Would it be possible to have a better error message than 401 Unauthorized on the container (irrespective of how you solve this issue). Clearly, the resource (unslashed) exists, and it exists only because the container (slashed) exists. Finding the resource and finding no permissions should be a strong clue that the client needs to add the slash.

@csarven
Copy link

csarven commented Apr 25, 2023

I wrote a recommendation for server required/optional at linkeddata/rdflib.js#617 (comment)

@CxRes
Copy link

CxRes commented Apr 25, 2023

The required 404 is unfortunately not helpful to the client! At the very least, the error message saying "is a container" would be useful.

@csarven
Copy link

csarven commented Apr 25, 2023

404 is useful information because it gives the client a full stop. Consider the actual scenario in which the client ended up with the URI without the trailing slash.

An error message with "is a container" information is a misapplication of the request/response semantics. There would be nothing for the client to do with that information unless further instructions are defined by a specification. Hence, if the server wanted to help the client automagically navigate to the proper location, it would use 301 (instead of introducing new payload instructions for a client).

@CxRes
Copy link

CxRes commented Apr 25, 2023

I am not literally saying "is a container" message, but something that will let the client know that it can try with /.

Actually, I dislike the whole trailing slash business. But, I am trying to work out if there is any way that a client application can avoid a dead-end on servers that are lazily implemented. If we have a world with two sets of solid server, some that redirect and some that don't - users not being able to navigate some %age of time, this inconsistent expectation, is a broken experience.

@csarven
Copy link

csarven commented Apr 25, 2023

In the solution you're suggesting the client would need to process the response payload. It doesn't matter if the message "is a container" or a 10 GB RDF essay on what should happen. Bottom line is the same. That particular conformance model will need to be specified. Again, contrast that with 301 and Location header that every legitimate client under the sun already understands.

How did the client get a hold of the URI without the trailing slash?

Not found precisely reflects the request/response semantics. Redirect is not required for interop.

If the need to let the client know the URI with the slash is so desired, redirect would be appropriate especially if the URI without the trailing slash was advertised by the server at some point in the past.

@CxRes
Copy link

CxRes commented Apr 26, 2023

Redirect is not required for interop.

Sure. I am not arguing with that (or with you). I am just trying to see if there is an iota of possibility of consistency for the client across all solid servers.

Ideally, I would like all servers to redirect or similar. But clearly the 404 option means that there are two different behaviours that implementations can choose from. The same type of client request leads to two different outcomes.

How did the client get a hold of the URI without the trailing slash?

That's besides the point, it could be a typo for all I care. But it certainly has a non-zero probability.

Also, @csarven I get that you are passionate, but please be calm and empathetic to where someone else's concerns are coming from (even if you think that the concern is irrelevant or misguided), That's why I tried to speak to you in private first, where you chose not to respond, forcing my hand into making public comments instead. But when you are having the conversation in public, this unnecessary invocation of hyperbole is just not on and the lack of technical empathy makes public engagement difficult.

@csarven
Copy link

csarven commented Apr 26, 2023

There are always going to be additional considerations that the server and the URI owner may want to take into account. Which is why there is a degree of variability in specs.

The 404 case is stronger than 301. 301 is more complex to implement and maintain than 404. With the exception of the URI of a resource legitimately changing, the server has no obligation to make that mapping (and commitment thereof). The typo scenario is a human-error, not client. The conformance model is set for clients and servers. 301 is mentioned in the Solid Protocol as some guidance. It could be dropped and it would not change interop. If a server wants to taken typos into account with a 301, they can, but that is not a significant reason for all servers to behave that way.

I see no issue with some servers sticking to 404. Again, that's precisely the request/response. And I see no issue with some servers taking the 301 route (and possibly 200 as mentioned in the link earlier for some cases) whenever they need to for any given resource. That's all part of a normal implementation of HTTP specs. It is what also what the Architecture of the World Wide Web tells us.

@csarven
Copy link

csarven commented Apr 26, 2023

As for your last paragraph, happy to pick this up elsewhere:

  • I apologise if anything I've said in this thread made you feel that way but could you please point me to which comments or what tone exactly?
  • In solid/specification chat, I've recommended that people discuss in public precisely because it benefits everyone (and implementations thereof), as opposed to a single party.
  • Would you mind explaining "invocation of hyperbole"? I have gone over my comments and cannot find an example of this. Likewise if I have shown lack of empathy, I'd really appreciate it if you could point me to examples so that I can learn from it and correct my behaviours.
  • I consider I was was neutral/normal and patient, tried to clarify why things are working the way they are, why proposed solutions are problematic, and asked you to provide more details to help you strengthen your case with new information that can be taken into account to reopen a decision. I am cognisant of the difficulty in conveying tone in text, but I can assure you I was coming from a place of trying to be helpful and informative.
  • I do take your concerns seriously, which is why I invest the time to provide a carefully thought-out response.
  • I don't see where the supposed contentiousness is coming from, and I would definitely appreciate your input on this, but I'm happy to step away and let others carry on with the discussion, and propose a change in the spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants