Solid - Basic security guidelines

The following article gives an overview of some basic security considerations for client-side Solid applications. The goal is to give a feeling for what could go wrong security-wise and where to pay attention by:

highlighting potential threats for client-side Solid applications
showing some examples of DOs and DON'Ts

Disclaimer

I'm not a security professional (yet :) and this list is by no means exhaustive. So you can take it as some inspiration and food for thought and hopefully you will write more secure code. However, when you create apps that handle real user data, please consult a security professional.

What's at stake?

Let's take a look at the following scenario, where someone uses a picture app to store and view files in different pods:

What could happen when the Pictures App has a security flaw? As the user logged in via the app, a malicious person could access any data in all three pods the user has access to. They could also modify data or even change access controls, provided the user has the right to do so.

The main point here is that even if it's only a "picture" app, the current login model for applications gives it access to everything the user has access to. So a flaw in the pictures app puts everything the user has access to at stake.

Handling data

In Solid apps we use data for various purposes. When we use the data, we should be aware that it is potentially untrusted, ie a malicious agent could modify it to break your application.

Where we use data

UI: display usernames, chat messages, etc
Logic: decide which actions to perform based on data, eg which files to delete for a recursive deletion, or which files to fetch to display a list of friends
Client side storage: cache user data, login data
External APIs: send images to an external API to apply an image filter, etc

What data can we trust?

TL;DR: Only data from the identity provider; everything else needs to be treated with care.

Note: For simplicity, the topic of digital signatures is left out here, even though it could be useful in some use cases.

From my security point of view, the only data we should trust is data from the identity provider. The identity provider can create valid authentication tokens for their users, so they already have full control over anything the user has access to. Other agents, such as other users or even pod providers, only have partial access initially and could gain more access by exploiting your application. Here's a list of what you should not trust, or only trust to some extent:

A random person's data?

When I use the photo app to view pictures from random-person.pod.org the app cannot assume anything about the data. If random-person is malicious and has a fake username foaf:name "<script>alert(1)</script>", we must make sure this is not interpreted as HTML but only displayed to the user as text (see the Examples section below for how). The same applies to image descriptions and image data, and even to metadata such as "last modified".

My own data?

If we view pictures on our own pod, the application should still not trust the data. As we can see in the diagram above, we can give other people access to our pod. With access control we limit this access to specific resources and folders, but within those, malicious agents could add images to your pod with an img:description "<script>alert(1)</script>" description, which our apps must not interpret as HTML.

URL params?

This is not Solid-specific, but I thought it was worth a mention. If your application uses URL params like /app?file=example.org/file&filename=pizza, you must treat them as untrusted data. For instance, a malicious agent could get the user to open /app?file=example.org&filename=<script>alert(1)</script>, and if the filename is added carelessly to the HTML, the script will execute on page load.
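A minimal sketch of reading such parameters defensively, assuming the file and filename parameters from the example URLs above. The helper name readAppParams, the app.example host and the allowed filename pattern are assumptions made for illustration, not part of any Solid library.

```javascript
// Hypothetical helper: parse untrusted URL parameters and validate the file name.
function readAppParams(pageUrl) {
    const params = new URL(pageUrl).searchParams;
    const file = params.get('file') ?? '';
    const filename = params.get('filename') ?? '';

    // Assumption: this app only accepts letters, digits, dot, dash and underscore.
    const filenameIsValid = /^[A-Za-z0-9._-]{1,100}$/.test(filename);

    return {
        file, // still untrusted: validate it before fetching anything from it
        filename: filenameIsValid ? filename : null,
    };
}

const ok = readAppParams('https://app.example/app?file=example.org/file&filename=pizza');
console.log(ok.filename); // prints: pizza

const bad = readAppParams('https://app.example/app?file=example.org&filename=<script>alert(1)</script>');
console.log(bad.filename); // prints: null
```

Even a value that passes this check should still be written to the page as text (for example via innerText), never concatenated into HTML.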

The Solid Specification?

We need the Solid specification to write apps that work with all kinds of pod providers. However, we should not trust servers to implement it perfectly, for two reasons: (1) servers have bugs too, and (2) malicious pod providers can do whatever they want.

For instance, even if the spec ensured that users cannot modify folder containment triples, we would still have to treat the listing of contained files as untrusted data. When fetching person.random-pods.org/images/ the server could return #images :contains <https://example.org/your/data>, even if it's not allowed by the specification. A recursive delete of /images/ could then also delete https://example.org/your/data (see eg this issue).

Examples

This section contains some concrete code examples. My aim is to cover common pitfalls; again, this is by no means exhaustive.

Treat data as data, make sure it is not interpreted as part of the code:

// don't use innerHTML with untrusted data
profile.innerHTML = '<p>' + username + '</p>';
pictureContainer.innerHTML = '<img src="' + profilePictureUrl + '">';

// do use innerText (or similar) to display untrusted text
const p = document.createElement('p');
p.innerText = username;
profile.appendChild(p);

// do set attributes via properties or setAttribute
const img = document.createElement('img');
img.src = profilePictureUrl;
pictureContainer.appendChild(img);

Prevent javascript: links (because clicking <a href="javascript:alert(1)">foo</a> will execute the script):

// don't set href to untrusted data
const a = document.createElement('a');
a.href = imageUrl;

// do make sure it is https, or http if necessary
const allowedProtocols = ['https:']
if (allowedProtocols.includes(new URL(imageUrl).protocol)) {
    const a = document.createElement('a');
    a.href = imageUrl;
}
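The protocol check above can be wrapped in a small helper that also copes with strings that are not valid URLs at all, since new URL() throws on those. This is only a sketch: the name toSafeHref and the choice to return null for rejected input are assumptions, not an established API.

```javascript
// Assumption: only https links are acceptable; add 'http:' if your app needs it.
const ALLOWED_PROTOCOLS = ['https:'];

// Hypothetical helper: returns a normalized URL string, or null if the input
// cannot be parsed as an absolute URL or uses a protocol that is not allowed.
function toSafeHref(untrustedUrl) {
    let parsed;
    try {
        parsed = new URL(untrustedUrl);
    } catch {
        return null;
    }
    return ALLOWED_PROTOCOLS.includes(parsed.protocol) ? parsed.href : null;
}

console.log(toSafeHref('https://example.org/cat.png')); // prints: https://example.org/cat.png
console.log(toSafeHref('javascript:alert(1)'));         // prints: null
console.log(toSafeHref('not a url'));                   // prints: null
```

A caller would only create the link when the result is not null, and otherwise show the value as plain text or skip it.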

Be careful when using data for your application logic:

// don't implicitly trust urls from linked data
// eg folders can contain :contains triples with arbitrary urls, not only children
for (const url of getTriples(folderDataset, ':contains')) {
    recursivelyDelete(url)
}

// do ensure implicit assumptions hold
for (const url of getTriples(folderDataset, ':contains')) {
    if (isParent(folderDataset.url, url)) {
        recursivelyDelete(url)
    }
}

// don't concatenate untrusted data to file paths
// eg file names could include "../private" to change the directory in requests or contain "foo?delete=true" to add additional parameters to a request
const targetUrl = 'https://example.org/public/' + fileName
makeApiRequest(targetUrl)

// do use a whitelist of allowed chars/names or verify the concatenated url (TODO: add example how to verify resolved url client-side)
if (!/^[a-z0-9]+$/.test(fileName)) {
    throw new Error('Invalid file name')
}
const targetUrl = 'https://example.org/public/' + fileName
makeApiRequest(targetUrl)
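One possible way to implement the isParent check used above, and to verify a concatenated URL after it has been resolved, is to let the standard URL class do the normalization and then compare origin and path. This is a sketch under assumptions: isParent is only named, not defined, in the examples, so the implementation here is a guess at its intent; resolveInside is a hypothetical helper; and containers are assumed to end with a slash.

```javascript
// Hypothetical implementation: true if childUrl lies strictly inside the container parentUrl.
function isParent(parentUrl, childUrl) {
    let parent, child;
    try {
        parent = new URL(parentUrl);
        child = new URL(childUrl);
    } catch {
        return false;
    }
    // Assumption: container URLs end with a slash.
    if (!parent.pathname.endsWith('/')) return false;
    return child.origin === parent.origin
        && child.pathname.startsWith(parent.pathname)
        && child.pathname.length > parent.pathname.length;
}

// Hypothetical helper: resolve fileName against baseUrl and reject any result
// that leaves baseUrl or carries a query string or fragment.
function resolveInside(baseUrl, fileName) {
    let resolved;
    try {
        resolved = new URL(fileName, baseUrl);
    } catch {
        throw new Error('Invalid file name');
    }
    if (!isParent(baseUrl, resolved.href) || /[?#]/.test(resolved.href)) {
        throw new Error('Invalid file name');
    }
    return resolved.href;
}

console.log(isParent('https://person.random-pods.org/images/', 'https://example.org/your/data')); // prints: false
console.log(resolveInside('https://example.org/public/', 'cat.png')); // prints: https://example.org/public/cat.png
```

With these checks, file names such as "../private" or "foo?delete=true" make resolveInside throw instead of producing a request URL. Comparing the resolved URL rather than the raw string matters because the URL parser normalizes dot segments before the comparison.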

Working with Linked Data

In linked data, any file can claim anything about other files and actors. For instance example.org/file.ttl can state person.id.org/card#me :wrote "Hi, this is my message". However, this does not mean that this person really wrote this message, only that this file claims that this person wrote the message. We should not trust this claim more than we trust the file.

In particular, if we collect data from multiple files and add all the information to one dataset, we no longer know who made which statement. A statement Alice :hasAddress "Los Angeles" could originate from Alice's pod but also from Bob's pod.

Instead, we must treat data with respect to who is able to write it. If we read the address from alice.pod.org/profile/card#me we can likely trust it, because we assume only Alice can write there. If we read it from alice.pod.org/inbox/ or bob.pod.org/profile/card#me we likely cannot trust it. This also depends on how much you rely on the integrity of this address: Do you only use it as the initial position on the map, or is it the destination of a package shipment?
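One way to keep track of who claimed what is to store the source document next to every statement and filter on it before use. The sketch below uses plain objects instead of a real RDF library; the statement shape, the helper names documentOf and selfAssertedStatements, and the "Nowhere" value are assumptions made for illustration. It also relies on the assumption from the text that only Alice can write to her own profile document.

```javascript
// Each statement remembers the document it was read from (hypothetical shape).
const alice = 'https://alice.pod.org/profile/card#me';
const statements = [
    { subject: alice, predicate: ':hasAddress', object: 'Los Angeles', source: 'https://alice.pod.org/profile/card' },
    { subject: alice, predicate: ':hasAddress', object: 'Nowhere', source: 'https://bob.pod.org/profile/card' },
];

// Strip the fragment to get the document a resource such as ...#me lives in.
function documentOf(resourceUrl) {
    const url = new URL(resourceUrl);
    url.hash = '';
    return url.href;
}

// Keep only statements about `subject` that were read from the subject's own document.
function selfAssertedStatements(allStatements, subject) {
    const ownDocument = documentOf(subject);
    return allStatements.filter(
        (s) => s.subject === subject && documentOf(s.source) === ownDocument
    );
}

const trusted = selfAssertedStatements(statements, alice);
console.log(trusted.map((s) => s.object)); // prints: [ 'Los Angeles' ]
```

Whether such a filter is strict enough depends on the use case: for a shipping destination it may be a reasonable minimum, while for an initial map position the unfiltered data might be acceptable.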

Hosting applications

There's a lot more to say about this, however one important principle is: do not host your application on the same domain where potentially untrusted HTML is served. If another application runs on the same domain, it can get pretty much full access to your application (and make authenticated requests, etc).

Thus, do not host applications on Solid pods; host them on their own domain. You could host them on your own server, or for instance on Vercel, Netlify, Render, etc. If you publish via GitHub Pages (github.io), keep in mind that all projects from the same organization run under the same domain and thus can access each other.
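To make "same domain" concrete: browsers isolate pages by origin, which is the combination of scheme, host and port, so the path plays no role. The following sketch compares origins with the standard URL class; all URLs in it are made up for illustration.

```javascript
// True if both URLs share scheme, host and port, ie the same origin.
function sameOrigin(a, b) {
    return new URL(a).origin === new URL(b).origin;
}

// Two projects of the same organization on GitHub Pages share an origin.
console.log(sameOrigin('https://my-org.github.io/app-one/', 'https://my-org.github.io/app-two/')); // prints: true

// An app hosted on a pod shares its origin with other HTML served from that pod.
console.log(sameOrigin('https://alice.pod.org/apps/photos/', 'https://alice.pod.org/public/upload.html')); // prints: true

// An app on its own domain is isolated from HTML served by the pod.
console.log(sameOrigin('https://photos.example.com/', 'https://alice.pod.org/public/upload.html')); // prints: false
```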

Contributing

Feel free to contribute in any form to this article. Issues and PRs are welcome.
