Document guidelines to enable language features for embedded languages #47288

Closed
aeschli opened this issue Apr 6, 2018 · 23 comments
Labels: feature-request, languages-basic, on-testplan

@aeschli
Contributor

aeschli commented Apr 6, 2018

In VS Code, each file is associated with a language. Language features such as code completion and hovers are contributed to that language, ideally through a language server.
When a language allows embedding snippets of another language (e.g. CSS in HTML or HTML in PHP), there are various techniques that a language server can use:

  • The language server also implements support for the embedded language. It can do that by including libraries that provide that support. For example, there are easy-to-use node modules for css, less, scss, html and json, or more basic language support for typescript.
  • Forward requests to another language server, as done in intelephense.

The first approach has the following advantages:

  • Full control of the user experience. Completion proposals, hovers, and so on can be tuned to the situation.
  • No dependencies on other language servers: a self-contained server that is also easy to embed in other editors or IDEs.

In either case, the embedded content needs to be escaped according to the owner language. E.g. > needs to be &gt;.
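As a rough illustration of the escaping point: a server that extracts embedded content from an HTML-like host would decode the host's escapes before handing the text to the embedded language's support, and encode them again for any text it writes back. The helper names below are hypothetical and only the five predefined XML entities are covered, so treat this as a sketch rather than a complete implementation.

```typescript
// Hypothetical lookup table; a real host language may define many more escapes.
const ENTITY_TO_CHAR: Record<string, string> = {
  "&lt;": "<",
  "&gt;": ">",
  "&amp;": "&",
  "&quot;": '"',
  "&apos;": "'",
};

// Decode host-language escapes so the embedded language sees its real text.
// A single pass is used so that "&amp;lt;" becomes "&lt;" and not "<".
function decodeHostEscapes(text: string): string {
  return text.replace(/&(?:lt|gt|amp|quot|apos);/g, (m) => ENTITY_TO_CHAR[m] ?? m);
}

// Encode text produced by the embedded language before inserting it into the host.
// "&" is handled first so that already-produced entities are not double-encoded.
function encodeForHost(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

Note that decoding changes the length of the text, so a server that does this also has to map positions between the decoded text and the original document.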

@aeschli aeschli self-assigned this Apr 6, 2018
@aeschli aeschli added this to the Backlog milestone Apr 6, 2018
@aeschli aeschli added the feature-request Request for new features or functionality label Apr 6, 2018
@aeschli
Contributor Author

aeschli commented Apr 6, 2018

@aeschli Do you think that an extension can make the HTML part in PHP behave the same as a standalone HTML file, including letting all the HTML plugins work in that part?
The extension system is really important, as we all know, but not all features are suitable for plugins. What we actually need is to parse one file into multiple languages and treat them separately. This should be a built-in ability of a code editor.
Also, all PHP files are actually HTML with a special HTML tag. Only the content in this tag should not be regarded as HTML. It is just a language embedded in HTML and can't work without it (except on the CLI). So the question is not about HTML in PHP; HTML is the main container. Lots of other languages can be embedded in HTML too, such as css, javascript, vbscript, svg, perl and java. I don't think any extension can do this without official support.

@popcorner PHP files are actually not adhering to HTML syntax: <?php is not valid in pure HTML. Although it seems like PHP is embedded in HTML, the syntax is defined by PHP so it's actually 'HTML' inside PHP.
Same with templating languages like Smarty:

<div>
{escape} 
This is some text I want <> escaped. 
{/escape}
</div>

(Snippet from here. Valid Smarty, not HTML. Try it in the HTML validator.)
Every templating language has its own way of escaping embedded content, which is why the HTML language server can't simply be reused: it lacks knowledge of the embedding language.

@jens1o
Contributor

jens1o commented Apr 6, 2018

The language server also implements support for the embedded language. It can do that by including libraries that provide that support. For example there are easy to use node modules for css, less, scss, html and json or more basic language supports for typescript

The problem is: how can I say I'm the master without maintaining a fork, and restrict them to only work in specific parts of a file? Why can't the language servers I want to control declare what they are able to handle, so that I do not need to worry about it, the end user can install any extension they like, and I do not need to upload an extension dozens of megabytes big to support every language?

@aeschli
Contributor Author

aeschli commented Apr 6, 2018

You don't need to maintain a fork. You can forward the requests. But your Smarty server needs to transform the document to valid HTML and then ask the HTML server for the result on this document. That's approach number 2.
You can also tell the HTML language support to handle Smarty files by associating Smarty files with HTML. But it will struggle if the file is not valid HTML, as in the example above.
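A minimal sketch of the "transform the document to valid HTML" step, assuming a simplified template syntax in which every `{...}` tag sits on a single line and braces are used for nothing else. The function name is hypothetical. Tags are replaced by spaces of the same length so that offsets in the transformed text still match the original document, which keeps results from the HTML server usable without position mapping.

```typescript
// Replace every `{...}` template tag with spaces of the same length so that
// offsets in the result line up with offsets in the original document.
// Assumption: tags never span lines and braces are not used for anything else
// (inline CSS or JavaScript would break this).
function blankOutTemplateTags(text: string): string {
  return text.replace(/\{[^{}\n]*\}/g, (tag) => " ".repeat(tag.length));
}
```

This alone is not enough for the `{escape}` example earlier in the thread: the `<>` between the tags would still be invalid HTML after blanking, so a real transformation needs to know what each tag means.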

@jens1o
Contributor

jens1o commented Apr 6, 2018

You can forward the requests.

I'm looking forward to example code. ;) And somehow you didn't answer my second question. If I want users to install extensions without any further configuration, how can I know which parts can be handled by which server?

@aeschli
Contributor Author

aeschli commented Apr 6, 2018

I don't understand your last question. Can you rephrase?

@jens1o
Contributor

jens1o commented Apr 6, 2018

You can use Smarty for everything; it's not dedicated to html, css, js, java, xml, json... So in the current version I need to detect which language it actually is. That is the downside: I want the language servers the user has installed to give me samples of what they are able to handle, so that I do not need to worry about it and everything is completely modular (I would not need to detect whether it's PHP or Smarty code).

  1. User installs an extension which provides language support (e.g. PHP).
  2. User opens a Smarty file, and the Smarty extension asks VS Code for a list of language servers (content providers) and their samples (so what does a PHP code pattern look like?).
  3. Once the user types something that (for example) PHP can handle, I know I can delegate the request to the PHP language server and it can handle the request, so I simply wrap it.

That would be sooo easy and soo extensible.

I hope you understand why that concept would be so awesome and the next step.

@bmewburn

bmewburn commented Apr 8, 2018

@aeschli the html language server went with option 2 first for embedded languages then changed to option 1 later. Were there other advantages, in addition to those listed above, that prompted the change in direction?

@jens1o In that situation one solution could be to provide a config setting so the user can declare what the embedded language is. Then your extension can create virtual documents with the appropriate language id and forward them on.

@jens1o
Contributor

jens1o commented Apr 8, 2018

In that situation one solution could be to provide a config setting so the user can declare what the embedded language is.

I want to make it as simple as possible; prompting the user to declare this is very likely to introduce errors or miss an exception.

@aeschli
Contributor Author

aeschli commented Apr 8, 2018

@bmewburn The main reason was really to be able to control the user experience. For example in the case of JavaScript embedded in HTML, we want to preconfigure the JavaScript language server with the dom definition files.

@octref
Contributor

octref commented May 2, 2018

@jens1o

User opens a smarty file, and the smarty-extension asks vscode for a list of language servers(content providers) and their samples(so what does a php code pattern look like?).

Can you clarify "samples" with examples?

Currently, if you want to get HTML completions in your extension, all you need to do is three things:

  1. Figure out if the completion position is in HTML or your-lang
  2. If it's HTML, create a virtual HTML document with the embedded content
  3. Call the command vscode.executeCompletionItemProvider.

If you are writing a language server that could handle embedded language X, things such as "getting range and content of embedded regions" (for 1 and 2) should be the language server parser's responsibility.
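The first two of the three steps above can be sketched as plain functions. This assumes the host-language regions have already been found by the server's parser and are sorted and non-overlapping. `HostRegion`, `isInHostRegion` and `buildVirtualHtml` are hypothetical names. The third step only runs inside an extension host, so it appears as a comment.

```typescript
// A region of the document written in the host language (hypothetical shape).
interface HostRegion {
  start: number; // inclusive offset
  end: number; // exclusive offset
}

// Step 1: decide whether an offset belongs to the host language or to HTML.
function isInHostRegion(offset: number, regions: HostRegion[]): boolean {
  return regions.some((r) => offset >= r.start && offset < r.end);
}

// Step 2: build the virtual HTML document. Host-language regions are replaced
// by spaces (line breaks are kept) so line/column positions stay unchanged.
// Assumption: `regions` is sorted by `start` and does not overlap.
function buildVirtualHtml(text: string, regions: HostRegion[]): string {
  let result = "";
  let last = 0;
  for (const r of regions) {
    result += text.slice(last, r.start);
    result += text.slice(r.start, r.end).replace(/[^\r\n]/g, " ");
    last = r.end;
  }
  return result + text.slice(last);
}

// Step 3 (extension host only): expose the virtual content under its own URI,
// e.g. through a text document content provider, and forward the request:
//   vscode.commands.executeCommand(
//     "vscode.executeCompletionItemProvider", virtualUri, position);
```

Because the virtual document keeps every position intact, the completion items that come back can be returned to the editor without translating ranges.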

@jens1o
Contributor

jens1o commented May 3, 2018

Can you clarify "samples" with examples?

I mostly mean regular expressions, so to keep it as universal as possible.

I'm not exactly a regex expert, but for detecting PHP sections this could be used:

/(<\?(php)?)(.|(\r)?\n){1,}(\?>)?/gim

These (more complex) patterns would be given to vscode and can be polled by each language server. Then, a simple pattern matching is used to determine the matching language server where the request will be passed to by the master one(responsible for a specific file extension).

The problem is that languages do not necessarily have a start and end marker (e.g. JavaScript). Thus, a fallback language determined by the master language server would be required.
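For illustration, here is a variant of the pattern above wrapped in a function that returns region offsets. As written, the greedy `{1,}` combined with the optional `(\?>)?` appears to make the original pattern run from the first opening tag to the end of the file, so this sketch uses a lazy quantifier instead. `findPhpRegions` and `PhpRegion` are hypothetical names, and the pattern does not account for `?>` inside PHP strings or comments.

```typescript
interface PhpRegion {
  start: number; // offset of "<?"
  end: number; // offset just past "?>", or end of text if the tag is never closed
}

// Lazy variant: each region stops at the first "?>", or at the end of the
// text when the closing tag is omitted (which PHP allows).
function findPhpRegions(text: string): PhpRegion[] {
  const pattern = /<\?(?:php)?[\s\S]*?(?:\?>|$)/gi;
  const regions: PhpRegion[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    regions.push({ start: m.index, end: m.index + m[0].length });
  }
  return regions;
}
```

Everything outside the returned regions would fall to the fallback language mentioned above.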

@octref
Contributor

octref commented May 3, 2018

@jens1o There would be many problems with that approach, just off the top of my head:

  • PHP server wouldn't be able to control when the request is going to HTML and when it's going to PHP. Also passing data between LS could be very tricky, if it's not controlled by the LS.
  • LSP is chatty. You might have multiple requests going back/forth for each character entered. VS Code can't run complex regexes on the same file again and again on each document change.
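One way a language server itself (rather than VS Code) might limit the cost described in the second point is to recompute embedded regions only when the document version changes, since many requests arrive for the same version. The class below is a hypothetical sketch and not part of any existing library. `version` stands for the document version number that LSP clients send with change notifications.

```typescript
// Hypothetical cache keyed by document URI. A value is reused for as long as
// the document version stays the same and recomputed after any change.
class RegionCache<T> {
  private readonly entries = new Map<string, { version: number; value: T }>();

  get(uri: string, version: number, compute: () => T): T {
    const cached = this.entries.get(uri);
    if (cached !== undefined && cached.version === version) {
      return cached.value;
    }
    const value = compute();
    this.entries.set(uri, { version, value });
    return value;
  }

  // Call when the document is closed so entries do not accumulate.
  delete(uri: string): void {
    this.entries.delete(uri);
  }
}
```

With this, a completion request followed by a hover request on the same version scans the document once instead of twice.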

@jens1o
Contributor

jens1o commented May 3, 2018

So, I do not have a better solution, yet. Do you have some better ideas?

PHP server wouldn't be able to control when the request is going to HTML and when it's going to PHP. Also passing data between LS could be very tricky, if it's not controlled by the LS.

That's supposed to be like this, because it's the job of the master language server.

LSP is chatty. You might have multiple requests going back/forth for each character entered. VS Code can't run complex regexes on the same file again and again on each document change.

Perhaps we can include sub-languages while only checking whether a specific keystroke is within the range.

:: NOTHING, FALLBACK LANGUAGE (LAYER == 0)
<html> :: HTML LANGUAGE SERVER DETECTS START OF HTML LANGUAGE (LAYER == 1)
<body>
| :: CURSOR IS WITHIN THE RANGE OF THE HTML LANGUAGE SERVER; SO HTML IS DOING THE JOB
<p>Hello World!</p>
<?php :: PHP LANGUAGE SERVER DETECTS START OF PHP LANGUAGE (LAYER == 2)
| :: CURSOR IS WITHIN THE RANGE OF THE PHP LANGUAGE SERVER; SO ITS HANDLING THE REQUEST
?> :: PHP LANGUAGE SERVER DETECTS END OF PHP LANGUAGE; VSCODE SHIFTS DOWN AND LASTLY RECOGNIZED THE HTML LANGUAGE; THUS ASSUMES HTML (LAYER == 1)
</body>
</html> :: HTML LANGUAGE SERVER DETECTS END OF HTML LANGUAGE; ASSUMES FALLBACK LANGUAGE (LAYER == 0)

Would that decrease the cost and pressure on VS Code?
The only question is how we're determining and separating the languages.
Is there any other solution that's not based on complex regexes?
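The layering idea in the diagram above amounts to a stack of open regions. The sketch below assumes something has already produced a sorted list of open and close markers. How to produce them is exactly the open question in this comment, so `LanguageMarker` and `languageAt` are hypothetical.

```typescript
// A point in the document where a language region opens or closes.
interface LanguageMarker {
  offset: number;
  kind: "open" | "close";
  languageId: string;
}

// Walk the markers up to the cursor, pushing on "open" and popping on "close".
// The top of the stack is the active language; an empty stack (layer 0)
// means the fallback language applies.
// Assumption: `markers` is sorted by offset and properly nested.
function languageAt(
  offset: number,
  markers: LanguageMarker[],
  fallback: string
): string {
  const stack: string[] = [];
  for (const marker of markers) {
    if (marker.offset > offset) break;
    if (marker.kind === "open") {
      stack.push(marker.languageId);
    } else {
      stack.pop();
    }
  }
  return stack[stack.length - 1] ?? fallback;
}
```

Looking up the language for a keystroke is then a linear walk over the markers rather than a regex scan of the whole file, although the markers still have to be kept up to date on every edit.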

@octref
Contributor

octref commented May 3, 2018

@jens1o You are also assuming that language servers always want to give full control of sub-regions to other language servers. For example, the Vue Language Server suggests v-if at <div v|>. The HTML LS wouldn't know about this v-if.

If you want to find a generic solution that should be put in VS Code (@aeschli thinks it's impossible and so do I), try to find one that can handle at least both PHP and Vue. The solution shouldn't make VS Code slow or limit any LS in its language capabilities. If you do find such a solution, I'm all ears, and the issue for that is #1751.

Until then, this issue's scope is providing documentation, guidelines and possibly a starter template for making language servers that support embedded languages.

@jens1o
Contributor

jens1o commented Jun 26, 2018

@octref @aeschli May I ask at what position this backlog entry is?

@octref
Contributor

octref commented Jun 26, 2018

@jens1o I've just reworked the Language Server Guide. This is something I have in mind. I can't promise anything, but probably sometime later this year.

@jens1o
Contributor

jens1o commented Jul 26, 2018

Okay, I'm working on implementing this. I've got one problem though. When I pass the request to VS Code, it apparently passes it back to me, then I pass it to VS Code again... an infinite loop. How can I get around this and declare the specific code part as another language, so that (in my example) the Emmet provider rules it all?

@jens1o
Contributor

jens1o commented Jul 26, 2018

@aeschli
Contributor Author

aeschli commented Jul 26, 2018

@jens1o Can you create a separate request? Thanks!

@jens1o
Contributor

jens1o commented Jul 26, 2018

@aeschli Where? Here? At the community slack?

@aeschli
Contributor Author

aeschli commented Jul 27, 2018

Just a new GitHub issue.

@axelson

axelson commented May 1, 2020

@octref I see that you closed this issue. Is there a link to the created guideline? I tried following some of the referenced issues, but was unable to find a related guide.

@KamasamaK

@axelson See https://code.visualstudio.com/api/language-extensions/embedded-languages

@github-actions github-actions bot locked and limited conversation to collaborators May 17, 2020