Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lines plugin #2413

Draft
wants to merge 19 commits into
base: master
Choose a base branch
from
Draft

Lines plugin #2413

wants to merge 19 commits into from

Conversation

RunDevelopment
Copy link
Member

This is the continuation of #2389.

The only function of this plugin is to wrap every line with an element, so we can more easily do line-based functionality (like line number and line highlights). This is done by parsing Prism's generated HTML code and then modifying the parsed DOM. After all lines are wrapped, the parsed DOM is converted into HTML code again (to be parsed and inserted into the actual document). The plugin only uses the before-insert hook.

The idea is to re-implement Line Numbers and Line Highlight with this plugin. The main advantage is that the browser will then align line numbers for us, so we no longer have to measure the height of each line (and update that on window resize). This will even resolve some issues.

The basic line-splitting algorithm is the same as in #2389. I modified it to be more efficient but the operation is still quite costly because of the parsing and subsequent creation of an entire document. To mitigate this a little, the Lines plugin is opt-in meaning that it will only wrap lines if it is requested to do so. This is done via the:

Register hook

The Lines plugin has one custom hook: lines-register.
This hook is used to register listeners that can modify the line wrapper elements among other things. The Lines plugin will only be active for a given code block if at least one listener is registered for the given code block. (If no other plugin needs line wrappers, they won't be added.)

Listeners

Listeners are a lot like hooks but more object-oriented. Instead of being a single function, a listener is an object with functions for specific events. This is necessary because events might have to share state which is very hard to do with Prism hooks.

The best example is the onNewLine event that is fired multiple times per code block. If you want to count line numbers, you have to update the state of the counter for each onNewLine event (per code block per highlight). This is very hard to do with the normally stateless Prism hooks.

Listeners can also share state with other listeners by modifying the environment object of an event.


TODO / discussion

  • I don't know whether the listener approach is any good. Maybe there is a better/simpler way of doing things?
  • I haven't updated index.html yet in case we change the API.

Some feedback would be appreciated because I'm not sure about the listener API.

@galaxy4public
Copy link

I am curious why you made a decision of not respecting a new line in a comment? I see both pros and cons of that decision, but I think it would be valuable to have it documented somewhere why such a decision was made.

My personal preference would be to respect newlines even in comments if we are announcing that the plugin is splitting lines, though.

@RunDevelopment
Copy link
Member Author

Comments do not contribute to the textContent of an element, so it doesn't matter what text they contain.

I only put comments in there to show that it can handle any kind of valid markup. I seriously doubt that it will encounter comments in the wild but better safe than sorry.

Comment on lines +176 to +179
if (!listeners.length) {
// no plugin needs line wrappers, so we don't add any
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for CSS-only solutions to exist such that there could be no JS listeners but still worth wrapping every line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this is so the plugin does nothing by default. Without this, you just can't turn it off which is a problem because it is incompatible with Keep markup if kept nodes span multiple lines.

Prism.hooks.run('lines-register', /** @type {RegisterHookEnv} */({
pre: pre,
env: env,
listeners: listeners
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why implement a separate listener system inside the lines plugin vs using Prism.hooks?

@mAAdhaTTah
Copy link
Member

I should really read the description more closely before commenting lol.

I don't know whether the listener approach is any good. Maybe there is a better/simpler way of doing things?

I'm inclined to think taking advantage of Prism.hooks is better. I don't think we need to reinvent the wheel for this.

@github-actions
Copy link

JS File Size Changes (gzipped)

A total of 1 files have changed, with a combined diff of +789 B (+100.0%).

file master pull size diff % diff
plugins/lines/prism-lines.min.js 0 Bytes 789 B +789 B +100.0%

Generated by 🚫 dangerJS against 1f2b6e2

@RunDevelopment
Copy link
Member Author

Sorry for the delay.

I don't know whether the listener approach is any good. Maybe there is a better/simpler way of doing things?

I'm inclined to think taking advantage of Prism.hooks is better. I don't think we need to reinvent the wheel for this.

Could you please explain how?

This is necessary because events might have to share state which is very hard to do with Prism hooks.

This is the reason why added the listeners in the first place and I don't see how to do this cleanly with hooks.

Let's look at an example: line numbers. The lines plugin doesn't track the number/index of the current line (and arguable it shouldn't looking at counting line numbers in unified diffs), this is the responsibility of the listeners. The listener has to keep track of the current line number via state. This is pretty easy to do using the current listener approach:

Prism.hooks.add('lines-register', env => {
  let counter = 0;
  listeners.push({
    onNewLine({ line }) {
      counter++;
      line.setAttribute('data-line-number', counter);
    }
  });
});

@mAAdhaTTah
Copy link
Member

mAAdhaTTah commented Dec 28, 2020

Listeners could track it in the plugin scope:

let counter;

Prism.hooks.add('lines-init', env => {
  counter = 0;
});

Prism.hooks.add('lines-on-new-line', env => {
  counter++;
  env.line.setAttribute('data-line-number', counter);
});

Unless you're thinking of something more complicated? If we're going to do this, we might want to standardize some sort of namespacing for plugins, e.g lines::init & lines::new-line or something like that.

@RunDevelopment
Copy link
Member Author

RunDevelopment commented Dec 29, 2020

This type of init-event that resets the current state is what I wanted to avoid. I can only see plugin-wide shared mutable state go wrong.

Other plugins, such as Command line, use the env object itself to store their state. But this practice will only work if we reuse the env objects which is an odd guarantee to give. Instead, we could make the hack a feature with a vars property on all env objects:

Prism.hooks.add('lines-on-new-line', env => {
  let state = env.vars['line-number'] || { counter: 0 };
  state.counter++;
  env.line.setAttribute('data-line-number', state.counter);
});

The main advantage of this is that plugin state is automatically reset.

Or maybe we could expose just a single getState method instead of a vars object? Kinda like this:

Prism.hooks.add('lines-on-new-line', env => {
  let state = env.getState('line-number', () => ({ counter: 0 }));
  state.counter++;
  env.line.setAttribute('data-line-number', state.counter);
});

@csware
Copy link

csware commented May 31, 2021

Why parse the DOM and then modify it?

Why not just splitting the tokens:

Prism.hooks.add('after-tokenize', function (env) {
	if (!env.tokens) {
		return;
	}

	var oldTokens = env.tokens;
	var array = [];
	oldTokens.forEach(function (tokens) {
		if (tokens instanceof Prism.Token) {
			if (tokens.content.indexOf("\n") !== -1) {
				var codeLines = tokens.content.split(NEW_LINE_EXP);
				codeLines.forEach(function (line, index) {
					if (line.trim().length === 0) {
						array.push(line);
					} else {
						array.push(new Prism.Token(tokens.type, line, tokens.alias));
					}
					if (!tokens.content.endsWith(line)) { // HACK, maybe we need some better splitting function that only preserves the needed \n
						array.push("\n");
					}
				});
				return;
			}
		}
		array.push(tokens);
	});
	env.tokens = array;
});

As the first part, and then handling all \n in the ´before-insert` hook?

Prism.hooks.add('before-insert', function (env) {
	if (!env.code) {
		return;
	}
	var codeLines = env.highlightedCode.split(NEW_LINE_EXP);
	env.highlightedCode = "<table class=source style=\"margin-left:0;\">";
	codeLines.forEach(function (line) {
		env.highlightedCode += "<tr><td class=lnum></td><td>" + line +"</td></tr>";
	});
	env.highlightedCode += "</table>"
});

Here, easily, new hooks for lines-onLine, lines-beforeLines, and lines-afterLines could be introduced easily.

@RunDevelopment
Copy link
Member Author

Why parse the DOM and then modify it?

Because we have other plugins that use the wrap hook (the hook after after-tokenize) and assume that they get the tokens as produced by the current language.

A good example is Markdown. Markdown uses the wrap hook to highlight fenced code blocks.

By splitting the produced HTML and not the tokens into lines, it is a lot easier to implement language-specific plugins.

@csware
Copy link

csware commented Jun 1, 2021

Then possible another hook might be needed. I still think it's way more elegant to work on the tokens.

@RunDevelopment
Copy link
Member Author

The wrap hook is the hook called during stringification (= tokens to HTML). Adding another hook somewhere between after-tokenize and wrap would still breaking existing plugins using the wrap hook. Adding another hook after wrap doesn't make sense because each token has already been stringified.

I still think it's way more elegant to work on the tokens.

I agree but it's not possible for us without breaking existing code.

@thomas-happ
Copy link

@the

@kennybc kennybc mentioned this pull request Oct 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants