This repository has been archived by the owner on Apr 13, 2022. It is now read-only.

Remove globbing #151

Open
wants to merge 2 commits into base: master

Conversation

RubenVerborgh
Contributor

This PR implements the (pending!) proposal in #145 to remove globbing.

It bypasses clarifying the definition of globbing (#148) by just removing it altogether, given that this currently seems to be what the majority wants.

Just putting it out here as a possible option, no rush.

@RubenVerborgh
Contributor Author

Client-side globbing alternative implemented in https://github.com/solid/ldp-glob; live demo at https://solid.github.io/ldp-glob/demo.html?https://drive.verborgh.org/public/

@michielbdejong
Contributor

This would be for version 0.8 of the spec then. We need to discuss the timeline for that. I agree with Ruben about removal of globbing in the next spec version, but I agree with Melvin about moving slowly and not breaking things every few weeks. About timeline, my gut says let's do a next spec version 0.8 in December, and not rock the boat before that. But let's discuss that in the next weekly meeting!

@RubenVerborgh
Contributor Author

RubenVerborgh commented Mar 29, 2019

Discussed out of band with @melvincarvalho; this should be on hold until he and @timbl can discuss.

@michielbdejong Yes, but we should keep people from starting to implement globbing if it is going to be removed, so a note or label in the spec would be useful. And of course there is #148, which aligns the spec with the actual situation.

@linonetwo

I'm using globbing to retrieve hundreds to thousands of metafiles linonetwo/solid-tiddlywiki-syncadaptor#4 (comment)

I can't afford to do this client side, because there will be hundreds to thousands of wiki pages in that container, so there will be a huge amount of client-side fetch running concurrently.

@elf-pavlik
Member

@linonetwo have you run benchmarks comparing the globbing approach to https://github.com/solid/ldp-glob with HTTP/2 enabled?

@linonetwo

@elf-pavlik Do I need to start solid-server as a library, and use spdy to enable HTTP2 in my server?

@elf-pavlik
Member

I think you could also just run it behind nginx and enable HTTP/2 in your nginx config

@RubenVerborgh
Contributor Author

I'm using globbing to retrieve hundreds to thousands of metafiles linonetwo/solid-tiddlywiki-syncadaptor#4 (comment)

Thanks for sharing this use case, it's good to know what's out there.

May I ask for a bit more detail here?

What you seem to be using is .meta.*; however, this kind of pattern is not supported across Solid servers (see #147). The only kind of globbing currently in use is /*, so all files in a directory.
How does this affect your use case? (For instance, could you put your files in a meta subfolder?)

Another question I have is about the necessity of this design choice: could you give us some insights into the motivation for splitting data across this many files? (There might very well exist a more generic motivation, so eager to learn about it.)

A concern I do have is that, even for the server, thousands of files would turn this into a very expensive request, which ties into my DDoS worries regarding globbing (#145).

I can't afford to do this client side, because there will be hundreds to thousands of wiki pages in that container, so there will be a huge amount of client-side fetch running concurrently.

Point taken, except for "concurrently": the browser will take care of this, and with HTTP/2 there should be only very limited overhead. Emphasis on "should", because the per-request cost of NSS is currently too high, so it will be significantly slower at the number of files you mention.
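
To make the client-side alternative concrete, here is a minimal sketch of globbing done on the client with a cap on concurrent requests. It is not how ldp-glob is implemented; the function name `client_side_glob`, the injected `fetch` callable, the `max_workers` default, and the example URLs are all assumptions for illustration, and a thread pool stands in for the request scheduling a browser would do.

```python
from concurrent.futures import ThreadPoolExecutor
from fnmatch import fnmatchcase
from typing import Callable, Dict, Iterable
from urllib.parse import urlsplit


def client_side_glob(
    member_urls: Iterable[str],
    pattern: str,
    fetch: Callable[[str], str],
    max_workers: int = 6,
) -> Dict[str, str]:
    """Fetch every container member whose file name matches `pattern`.

    `member_urls` is the list of resources the container reports and `fetch`
    returns the body for one URL; both are supplied by the caller
    (assumptions of this sketch, not part of any Solid library).
    """
    def file_name(url: str) -> str:
        return urlsplit(url).path.rsplit("/", 1)[-1]

    # Skip sub-containers (URLs ending in "/") and keep only matching names.
    matches = [
        url for url in member_urls
        if not url.endswith("/") and fnmatchcase(file_name(url), pattern)
    ]
    # At most `max_workers` fetches are in flight at any time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        bodies = list(pool.map(fetch, matches))
    return dict(zip(matches, bodies))


if __name__ == "__main__":
    # Hypothetical in-memory "pod" so the sketch runs without a server.
    fake_pod = {
        "https://pod.example/wiki/page1.ttl": "<#a> <#b> <#c> .",
        "https://pod.example/wiki/notes.txt": "plain text",
        "https://pod.example/wiki/sub/": "",
    }
    print(sorted(client_side_glob(list(fake_pod), "*.ttl", fake_pod.__getitem__)))
```

Capping the pool size keeps thousands of members from turning into thousands of simultaneous requests; it does nothing about the per-request cost on the server side.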

That said, whatever design decision we make, having thousands of files in a single folder is bound to cause trouble one way or another. Not just for Solid, but for *nix or Windows systems too. So I believe the information architecture here can likely be improved. But please feel free to further expand on your use case, so we understand where the scale comes from.

@linonetwo

linonetwo commented May 11, 2019

Well, I've reconsidered it:

  1. I won't use xxx.meta to store "metadata (like tags) generated by the user and my application" anymore, because, as described in "How to delete meta file?" (#168), I can't GET or DELETE xxx.meta
  2. I will use SPARQL to update and read a single index.metafile.ttl instead, and create all files using Link <http://www.w3.org/ns/ldp#Resource>; rel="type", <index.metafile.ttl>; rel="describedby".
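
As a rough illustration of item 2, the sketch below builds (but does not send) a request that creates a resource with the two Link relations quoted above. The helper name `build_create_request`, the choice of PUT, the default content type, and the example URLs are assumptions; whether a given server honours a client-supplied rel="describedby" link is not confirmed anywhere in this thread.

```python
from urllib.request import Request

LDP_RESOURCE = "http://www.w3.org/ns/ldp#Resource"


def build_create_request(
    url: str,
    body: bytes,
    index_url: str,
    content_type: str = "text/plain",
) -> Request:
    """Prepare a request that creates `url` and points at a shared index file.

    PUT is an assumption of this sketch; the comment above does not say
    which method is used to create the files.
    """
    # Two link values in one header field, separated by a comma.
    link = f'<{LDP_RESOURCE}>; rel="type", <{index_url}>; rel="describedby"'
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": content_type, "Link": link},
    )


if __name__ == "__main__":
    # Hypothetical URLs; nothing is sent over the network here.
    request = build_create_request(
        "https://pod.example/wiki/page1.tid",
        b"Hello wiki",
        "index.metafile.ttl",
    )
    print(request.get_method(), request.full_url)
    print(request.get_header("Link"))
```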

I'm not sure whether ./meta/index.ttl or ./index.metafile.ttl is a good name (see linonetwo/solid-tiddlywiki-syncadaptor#4 (comment)).

The reason I chose globbing was that "it's the easiest way to get my POC app working, and the documentation is simple and clear". But I can actually use SPARQL instead, although I'm not quite sure it will work.

I drew a picture while thinking about this; it may describe the motivation better. I'm creating a saver plugin for TiddlyWiki, which is a semantic wiki:

[image: whyglobbing]

Member

@acoburn acoburn left a comment


👍
I am supportive of removing this feature from the specification.

The use cases addressed by globbing fall under the larger category of "Query" for which there is now a formalized panel, and that is where those use cases can be discussed.

Member

@dmitrizagidulin dmitrizagidulin left a comment


👍
I agree with @acoburn and @RubenVerborgh -- the use cases that globbing was originally intended for should be discussed and handled in the new Query panel.

@linonetwo - Yeah, I think you're on the right track. Keeping the metadata in one file simplifies reads (which I'm guessing is the more frequent operation), and for edits, you can use the PATCH request (and add/remove individual statements).


9 participants