feat: Add support for HTML fragments #845

JuroOravec · 2024-12-12T13:23:20Z

Adds support for rendering HTML fragments by setting type="fragent", e.g.

MyTable.render(type="fragment")

Plus docs for using HTML fragments with HTMX, AlpineJS, vanilla JS
And examples in the sampleproject for HTML fragments with HTMX, AlpineJS, vanilla JS

JuroOravec · 2024-12-12T13:26:56Z

docs/concepts/advanced/authoring_component_libraries.md

@@ -1,6 +1,6 @@
 ---
 title: Authoring component libraries
-weight: 7
+weight: 8


Inserted the articles on HTML fragments, so had to adjust the indices

JuroOravec · 2024-12-12T13:36:23Z

src/django_components/dependencies.py

+    # corresponding assets
+    # See https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script#embedding_data_in_html
+    exec_script = json.dumps(exec_script_data)
+    exec_script = f'<script type="application/json" data-djc>{exec_script}</script>'


Previously, when we rendered a component, we inserted a <script> tag that contained logic to load the JS / CSS via the client-side dependency manager.

But there was a problem that browsers execute the <script> tags only when certain methods are used. And e.g. the scripts were not being triggered when for example when the script was inserted via el.innerHTML.

So I changed the approach. Instead of inserting a <script> tag with an executable JS, now we insert only a JSON data as <script type="application/json">.

And instead, our dependency manager JS now listens for changes to the DOM. And when the changes include inserting <script type="application/json"> with the data-djc attribute, we know that this is coming from us. At that point we parse the JSON, and run the same logic that we originally wanted to run in the script.

This is kinda similar to AlpineJS, where they also listen for DOM changes to know if any of the Alpine attributes (e.g. x-data) were added or removed.

JuroOravec · 2024-12-12T13:37:48Z

src/django_components_js/src/manager.ts

+  observeScriptTag((script) => {
+    const data = JSON.parse(script.text);
+    _loadComponentScripts(data);
+  });


This is where we define the MutationObserver that watches the DOM for changes

for more information, see https://pre-commit.ci

JuroOravec · 2024-12-12T19:18:41Z

@EmilStenstrom Sorry for the spam with the PRs, I'm sure you have a lot going on. When you have time, could you go over this PR before the others? This feature I want to use at work.

EmilStenstrom

Hi! This is a great feature, I (finally) understand the lure of fragments (it's all about how dependencies are handled) and am really happy that django-components are supporting them fully finally. Only thing I'm unsure about are the security aspects of this, the escaping/unescaping, are we sure they are bulletproof?

sampleproject/components/fragment.py

EmilStenstrom · 2024-12-15T20:46:37Z

src/django_components/util/misc.py

@@ -79,8 +79,10 @@ def get_last_index(lst: List, key: Callable[[Any], bool]) -> Optional[int]:
    return None


-def _escape_js(js: str, eval: bool = True) -> str:
-    escaped_js = escape_js_string_literal(js)
+def _escape_js(content: str, wrap: bool = True, eval: bool = True) -> str:


Are there lots of tests testing the security of this?

No tests for that yet.

Only thing I'm unsure about are the security aspects of this, the escaping/unescaping, are we sure they are bulletproof?

Good catch with the escaping. It's meant to escape the sequence </script>, which would break the script otherwise. Because user could define JS scripts like so:

console.log("I'm am a <script></script> tag!");

Previously my approach was to wrap the JS script body in ` (backticks), and execute the script with eval(unescape(body)).

But I also found recently that the eval() function may be blocked in stricter environments, so ideally we'd avoid relying on it.

There's two places where there may be content inside <script> tag that could potentially contain </script> keyword:

When we insert the JS from Component.js

Checking with ChatGPT, I don't think there's a fully 100% reliable way to escape the sequence </script> from a JS script without breaking something.

The problem here is that we can't know if the </script> was used as part of a string or inside a comment or elsewhere. And to know that we'd have to parse the script as AST, and that's way too much effort. Because if we had these 3 lines:

'I am </script>'

"I am </script>"

// I am </script>

Then the first two would need different ways of escaping them:

'I am <' + '/script>'

"I am <" + "/script>"

// I am </script>

So instead, I suggest that we simply check if the body in Component.js contains literal </script>. And if it does, we raise an error. That way this issue is captured before it gets to the browser.

When we insert the list of dependencies (JS / CSS) for the client-side dependency manager

The good thing here is that the dependencies are defined statically as JSON. But the json may still contain a literal </script>

But because the JSON is static and we don't need eval() to run it, I'm thinking of converting the entries inside the JSON to byte64, so any potential </script> are escaped.

So I'll fix update the escaping as outlined above and then I'll add some tests for that.

I've updated this in f8a7424. And added checks here.

tests/test_dependency_rendering.py

docs/concepts/advanced/html_tragments.md

for more information, see https://pre-commit.ci

EmilStenstrom · 2024-12-16T21:54:45Z

src/django_components/dependencies.py

@@ -650,9 +663,20 @@ def _get_script_tag(
    script = get_script_content(script_type, comp_cls)

    if script_type == "js":
-        return f"<script>{_escape_js(script)}</script>"
+        if "</script" in script:


I think this is still too brittle. A starting <script> tag would break this too (they can't nest so the parser implicitly closes the previous one). I think you can put non-printable characters in </script to still have it execute in some browsers. Other browser tags may close this as well, like </html>.

Hm, could you point to some examples/reference? My own testing in Chrome and Safari, as well as ChatGPT suggest that:

Inserting <script> inside <script> does NOT split it original script

Any other end tag other than </script> is ignored inside <script>

I think you can put non-printable characters in </script

What do you mean? Do you mean that:

People insert non-printable chars inside </script in order to still run the original <script>

Or that they insert the non-printable chars so that </script will be picked up in only some browsers?

I'm testing in codepen in Firefox (double-verified in Edge):

<script>document.write("<script>HELLO")</script> renders nothing, because the <script> tags ends the first script after the first quote

</script> containing control chars doesn't seem to work.

Going back to the original problem here: Are we trying to prevent people from making accidental mistakes, or is this a security feature?

Going back to the original problem here: Are we trying to prevent people from making accidental mistakes, or is this a security feature?

Preventing people from accidental mistakes. In one of the future PRs, I want to add component attributess similar to template_name, which would specify Component.js/css as a separate file. But although written in a separate file, they would eventually be inserted into the HTML as <script> and <style> tags.

And I imagine it could be fairly hard-to-find bug, if I accidentally broke the JS script because I included </script> within a string or a comment.

Hm, I still can't reproduce the errors.

Yes, your example with

<script>document.write("<script>HELLO")</script>

doesn't show anything in the DOM, but that's not because of what we're discussing here. It's because you're introducing an un-terminated <script> tag into the DOM. And what the browser does is that it looks for the closest </script>. Which, if there is none, means that all the rest of the page is interpreted as contents of the <script> tag.

See the highlighted section in the screenshot below. The section contains </body>, </html>, and a comment, which were part of the original document, but are now treated as contents of the inserted <script>. The browser then reaches the end of the document without terminating the <script>, nor <body> and <html>, so it adds the terminating </script>, </body> and </html> tags, which you can see below the highlighted section.

On the other hand, I got consistently the same results on:

Chrome v118 (MacOS 14.5)

Chrome v131 (Windows 11)

Firefox v109 (Mac)

Firefox v133 (Win)

Safari v17.5 (Mac)

Edge v131 (Win)

The consistent behaviour is that:

When I insert something into the </script> tag (e.g. </\u200Bscript>), the browser does NOT treat is as proper end tag, and instead only as a string

If I insert extra <script> inside another script tag, nothing happens, browser treats it as a text

If I insert extra </script> inside another script tag, the script ends prematurely

Of the 3, behavior 1 and 2 is expected and desirable. It's only behaviour 3 that I want to guard against. And it seems that checking only for the presence of </script> is sufficient.

Thank you @JuroOravec, you have me convinced! OK to merge from me! :)

JuroOravec added 2 commits December 12, 2024 14:22

feat: add support for HTML fragments

01e3949

Merge branch 'master' into jo-feat-html-fragments

6bbd47b

JuroOravec commented Dec 12, 2024

View reviewed changes

refactor: update tests

f577219

JuroOravec force-pushed the jo-feat-html-fragments branch from 04e418b to f577219 Compare December 12, 2024 18:16

[pre-commit.ci] auto fixes from pre-commit.com hooks

460d7dc

for more information, see https://pre-commit.ci

JuroOravec force-pushed the jo-feat-html-fragments branch 2 times, most recently from c4120db to c2c578b Compare December 12, 2024 19:05

tests: try to wait for css to fix CI tests

c8ea7a8

JuroOravec force-pushed the jo-feat-html-fragments branch from c2c578b to c8ea7a8 Compare December 12, 2024 19:09

JuroOravec marked this pull request as ready for review December 12, 2024 19:16

JuroOravec requested a review from EmilStenstrom December 12, 2024 19:17

EmilStenstrom reviewed Dec 15, 2024

View reviewed changes

JuroOravec added 2 commits December 16, 2024 12:21

Merge branch 'master' into jo-feat-html-fragments

b02607a

refactor: remove trailing slash and update sampleproject

4d80d57

dalito reviewed Dec 16, 2024

View reviewed changes

docs/concepts/advanced/html_tragments.md Outdated Show resolved Hide resolved

JuroOravec added 4 commits December 16, 2024 13:54

docs: fix typo

69a6530

Merge branch 'master' into jo-feat-html-fragments

475d3f8

refactor: raise if component's JS or CSS contains closing tag

f8a7424

refactor: fix linter errors

ebb4236

JuroOravec force-pushed the jo-feat-html-fragments branch from e493bbd to ebb4236 Compare December 16, 2024 17:05

pre-commit-ci bot and others added 2 commits December 16, 2024 17:05

[pre-commit.ci] auto fixes from pre-commit.com hooks

5691bc5

for more information, see https://pre-commit.ci

Merge branch 'master' into jo-feat-html-fragments

f285931

EmilStenstrom reviewed Dec 16, 2024

View reviewed changes

Merge branch 'master' into jo-feat-html-fragments

9849fc4

JuroOravec merged commit 4dab940 into EmilStenstrom:master Dec 19, 2024
15 checks passed

JuroOravec deleted the jo-feat-html-fragments branch December 19, 2024 09:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add support for HTML fragments #845

feat: Add support for HTML fragments #845

JuroOravec commented Dec 12, 2024 •

edited

Loading

JuroOravec Dec 12, 2024

JuroOravec Dec 12, 2024

JuroOravec Dec 12, 2024

JuroOravec commented Dec 12, 2024 •

edited

Loading

EmilStenstrom left a comment

EmilStenstrom Dec 15, 2024

JuroOravec Dec 16, 2024

JuroOravec Dec 16, 2024

EmilStenstrom Dec 16, 2024 •

edited

Loading

JuroOravec Dec 16, 2024 •

edited

Loading

EmilStenstrom Dec 18, 2024

JuroOravec Dec 19, 2024 •

edited

Loading

EmilStenstrom Dec 19, 2024

feat: Add support for HTML fragments #845

feat: Add support for HTML fragments #845

Conversation

JuroOravec commented Dec 12, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JuroOravec commented Dec 12, 2024 • edited Loading

EmilStenstrom left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

EmilStenstrom Dec 16, 2024 • edited Loading

Choose a reason for hiding this comment

JuroOravec Dec 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JuroOravec Dec 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JuroOravec commented Dec 12, 2024 •

edited

Loading

JuroOravec commented Dec 12, 2024 •

edited

Loading

EmilStenstrom Dec 16, 2024 •

edited

Loading

JuroOravec Dec 16, 2024 •

edited

Loading

JuroOravec Dec 19, 2024 •

edited

Loading