Remove globbing from the spec #145
Comments
+1 |
👍 I questioned need for it a long time ago in solid/solid#116 |
+1 |
Eventually it is probably ideal for developer trust and adoption to adopt a Linux-like "never break userspace" policy, and to never make backward-incompatible changes like this. What is in the way of adopting such a policy? (i.e. what other spec features should be considered 'at risk for removal' before a 'v1'?) Edit: I don't think it's necessary to adopt such a policy today. But maybe by a year from now? I am +1 on this proposal in the interest of applying Occam's Razor to the core of the spec. Just think it's good to trim and apply learnings all at once here instead of continuing a piecemeal feature-removal strategy ad-infinitum. AS2 made backward-incompatible changes far too long into the spec, IMO, and it actively undercut my ability to get it adopted inside my organization at the time. |
Agreed: eventually.
W3C standardization would be a good way of going through the spec with scrutiny, and identifying and fixing issues (such as globbing). |
-1. This is in use. Consider this a formal objection. By all means, please work on an HTTP/2 library that could act as a replacement. That would go some way towards me withdrawing this objection. |
More specifically, I think we did discuss this in the past, and globbing is not quite the same as grabbing all the content in a directory. Is that useful? I think there is an argument for yes and one for no. For example, in an LDPC you get all the file content in a directory, and that's useful. I'm not suggesting those are equivalent; it is more of a meta point. Also, some stats on how widely HTTP/2 is deployed would be handy, as would a proposal on how HTTP/2 could replace globbing. I've had lots of interest in my social network app in private discussions, including from former Facebook people, and darcy etc. That uses globbing extensively. Could it be replaced? Possibly, but it might take some design and time, which I don't currently have in the next quarter or two. I'm increasingly deploying Solid servers on all my devices now, including, as of yesterday, my Android phone. So eventually I could see Solid deployed widely, including on IoT devices, and HTTP/2 usage would be an interesting data point here. We can't just be a Chrome-specific project; we should think about Solid as web servers running everywhere: in your home, your fridge, your watch, your phone, etc. There are a few things I'd like to see fixed and working before tackling this. So my more in-depth answer is: not for now. We could mark it as "revisit", like many of our issues are. However, speccing out a possible HTTP/2 solution seems to be a good idea, and I'd support that proposal. |
@melvincarvalho I'm in full agreement. Nothing will or should be removed until there is a replacement. I planned to explicitly state this in the issue, but forgot; will update now. |
Tracking implementation of such a client-side feature in solid/solid#253. Just a small note that we will likely not need anything HTTP/2-specific: HTTP/2 will automatically optimize the sequence of requests. |
On pros/cons (from solid/solid#253 (comment)): So let's just be honest about what the pros/cons are. From #145, the OP starts with
I don't pretend to know about this.
Only if you implement it naively. Alternate datastores (or if the datastore is a filesystem, shell out to bash or a C lib instead of globbing in node) can make this an O(1) lookup. This assertion needs more justification.
See 2. DDoS attacks are a risk no matter what, e.g. by repeatedly requesting huge full directory listings and taking up all available OS connections. That risk is not unique to globbing. In practice, pros would deploy behind DDoS-protecting middleware that makes this a non-issue.
My argument here is meant to analyze whether this is true. At this point I think everyone agrees the 'HTTP/2' mention isn't what's important. Even over HTTP there is no 'zero overhead', but the overhead is probably negligible in the vast majority of near-term scenarios.
No argument here. So 1 and 5 are likely good reasons. 3 is a bit of an overstatement ('zero overhead'), but can be rephrased to be just as convincing.
Totally agree! +1 |
Important update: it seems that globbing is much more loosely defined in the spec than how @melvincarvalho intends it. My objection here has been to the loose version; some of your objections might also have been. So please have a look at #148 for a proposal to already narrow down the current definition of globbing. |
I would prefer removing globbing altogether. In case it stays, though, maybe the response could at least use a dataset (quad) representation (TriG, JSON-LD), so that the client knows which statements came from which graphs / documents. Otherwise I don't see how the client could perform updates when it needs to. |
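To make the provenance concern concrete, here is a minimal sketch of what a quad-style representation gives the client. It assumes the triples of each document have already been parsed into plain objects; the names `Triple`, `Quad`, `toDataset` and `sourceDocuments` are made up for this illustration and come from neither the spec nor any library.

```typescript
// Illustrative types only; real apps would use an RDF library's term types.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// A quad is a triple plus the graph (here: the document URL) it came from.
interface Quad extends Triple {
  graph: string;
}

// Merge several documents into one dataset while remembering the source
// document of every statement.
function toDataset(documents: Map<string, Triple[]>): Quad[] {
  const quads: Quad[] = [];
  documents.forEach((triples, documentUrl) => {
    triples.forEach((triple) => quads.push({ ...triple, graph: documentUrl }));
  });
  return quads;
}

// Find the documents that hold statements about a subject, i.e. the
// documents a client would have to write to when updating that subject.
function sourceDocuments(dataset: Quad[], subject: string): string[] {
  const graphs = dataset
    .filter((quad) => quad.subject === subject)
    .map((quad) => quad.graph);
  return Array.from(new Set(graphs));
}
```

With a plain merged triple response, the `graph` field is exactly the information that is lost, so the client has no way to tell which document an update should go to.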
@timbl I noticed that you thumbs-upped this one. I think this is possibly the first time in living memory that I have disagreed with you. Globbing is in use. I spent months of time and work building apps based on this pattern. If it had been at risk, I would not have started that work, and would have left it until we had other patterns in place. My intention was to revive the work after the server work had stabilized, for which I have waited patiently. The main question is what timeline you would want for this. On a longer timeline I could see myself getting behind this, particularly if there are like-for-like replacements. My concern is that there will be unilateral changes to the spec at short notice. |
@RubenVerborgh There is a burden of proof on you to prove a number of things. But this one is foundational. So examine the apps that use globbing, and that also includes cimba, and make the case that globbing can do all the things that are done. In fact, it needs to be said what the functional requirements are, because globbing is not just used to fetch files. It will be a good conversation and a learning experience for those that follow, I think. And also, importantly for me, I will get some breathing space to digest the detail of the proposal and assess the timeline, which is the main thing that matters to me. I'd say 3 of our best 5 apps ever have used globbing, and Solid would not exist without them. Let's get to the bottom of the above, because I suspect there's some fine detail you've missed. EDIT: or even better, if you feel like you are in superhero mode (which you sometimes are imbued with), why not take a crack at taking one of the apps and porting it to node-solid-server 5 / HTTP/2? I think such an effort would likely be the ultimate win-win. |
Happy to oblige:
Hmm, this is new information. For all we know (= your earlier statement at solid/solid#253 (comment) and
They would just client-side loop over all files in the container. |
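For illustration, such a client-side loop could be sketched roughly as below. This is a sketch under assumptions, not code from any existing library: `clientSideGlob`, `globToRegExp`, `ListMembers` and `FetchText` are hypothetical names, listing the container and fetching a file are passed in as functions so that no particular HTTP or RDF library is implied, and only `*` is treated as a wildcard.

```typescript
// Hypothetical helper types: how the container is listed and how a file is
// fetched is left to the caller (e.g. a wrapper around fetch plus an RDF parser).
type ListMembers = (containerUrl: string) => Promise<string[]>;
type FetchText = (url: string) => Promise<string>;

// Turn a simple glob such as "*.ttl" into a RegExp. Only "*" is a wildcard,
// and it does not match across "/" (so nested containers are not traversed).
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[^/]*");
  return new RegExp(`^${escaped}$`);
}

// List the container, keep the members whose name matches the pattern,
// and fetch them all. Returns a map from document URL to document body.
async function clientSideGlob(
  containerUrl: string,
  pattern: string,
  listMembers: ListMembers,
  fetchText: FetchText,
): Promise<Map<string, string>> {
  const matcher = globToRegExp(pattern);
  const members = await listMembers(containerUrl);
  const matching = members.filter(
    (url) =>
      url.startsWith(containerUrl) &&
      matcher.test(url.slice(containerUrl.length)),
  );
  // The requests are issued together; over HTTP/2 they can share one connection.
  const bodies = await Promise.all(matching.map((url) => fetchText(url)));
  return new Map(
    matching.map((url, i): [string, string] => [url, bodies[i]]),
  );
}
```

Keeping the result keyed by document URL, rather than concatenating the bodies, means the client still knows which file each piece of data came from.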
Added PR for removal as well, given that seems to be the demand of most: #151 No need to rush. |
Replacement at https://github.com/solid/ldp-glob; live demo at https://solid.github.io/ldp-glob/demo.html?https://drive.verborgh.org/public/ |
@RubenVerborgh thanks for taking the time to create this. For a start, it's rather difficult to evaluate whether this is a like-for-like replacement, as it doesn't even have a README. I have had a very quick look at it, but will take some more time to do so. I've re-added the on-hold tag, as I would like to discuss this over a longer period of time. I would appreciate it if you didn't unilaterally remove it. Cheers! |
It's just 9 lines, so I figured it would be overkill to turn it into a lib.
|
Citation required. Would appreciate to see the context, or better still, hear from Tim himself. Pain I know, but the bar for changing specs is necessarily high. |
That's it right there. No need to doubt my word.
It's a private conversation that I hence cannot share. Assigned the issues to @timbl, and will ping him to take a look. |
Discussed out of band with @melvincarvalho: I agree that #148 and #151 should be |
@NoelDeMartin Since you are using globbing in Solid Focus you should be aware of this |
@angelo-v thanks for the heads up. When it comes to my use case, the spec is already compatible with the things I want to do. I'm only using globbing because there is no support for SPARQL in the node-solid-server implementation, as is being tracked in this issue: nodeSolidServer/node-solid-server#962 |
@NoelDeMartin do you think it is possible to replace your current use of globbing with the client-side replacement @RubenVerborgh shared in #145 (comment)? |
@elf-pavlik Yes it is possible, assuming the server uses HTTP/2 as @RubenVerborgh mentions. If it doesn't it's still possible but the performance won't be great. |
It's quite alright as long as there are not hundreds of RDF files (and there usually never are). All the rest is premature optimization 😉 |
@RubenVerborgh Well, considering I'm building a task manager there will probably be hundreds of files :). But yeah, I can live with that for the time being (and there is always HTTP/2). |
And every task is a file? In that case, yes. |
I can't see any reason why any Solid server would not use HTTP/2. I run NSS behind nginx and it just takes |
Client certs… |
https://letsencrypt.org/ BTW, don't secure OAuth2, and therefore also OpenID Connect, rely on SSL? |
I meant client certs (not server certificates). NSS currently still allows authentication with client certificates. For that, NSS has to terminate the HTTPS connection, and NSS only does HTTP 1.1. While you can put a reverse proxy with HTTP/2 in front of NSS (which is what I do), this does break client-side certificates (or you have to find a way to forward the client certificate negotiation). |
Client certs like those used in some enterprise environments (e.g. inrupt)? :o |
For clarity, there is no problem with client certs, HTTP/2, or client-side globbing: all of these are perfectly combinable. It’s just that NSS (which currently supports server-side globbing, so no issues) terminates with HTTP 1.1. You can proxy with HTTP/2, but then need to figure out client cert passing; an all-encompassing solution has HTTP/2 on the Solid server. |
I strongly think that globbing should be removed from the spec.
Reasons for removing
No one really wants globbing. People want cross-file data access, and there are better ways of achieving that. Globbing has always been a hack for accessing data in multiple files. It was not thought through well (see below).
Globbing is expensive on the server-side.
Globbing can lead to denial of service (on server and client).
Whatever can be achieved through globbing, can be achieved as efficiently without.
With HTTP/2, there is zero overhead in just going through the files on the client side.
Let's remove it soon before it is actually widely used and implemented.
Reasons for keeping it plus mitigations
1. A (very low) number of apps are using it.
   Mitigation: let's upgrade them.
2. It has been in the spec for many years.
   Mitigation: that doesn't make it a good idea, and only a low number of apps are using it anyway (see 1).
Conditions
People who need to have a say in this
@timbl @melvincarvalho