The upcast conversion can't properly handle inline objects without $text accepting context #7753

niegowski · 2020-07-30T15:16:04Z

The upcast conversion and auto-paragraphing

Status: RFC
Date: 2020-07-30

Technical Story:

Context and Problem Statement

Context

Inline objects that are allowed only where $text, can't be properly auto-paragraphed and it results as broken upcast.

This happens because of a mistake in the auto-paragraphing code (view element is used to check schema.checkChild())

ckeditor5/packages/ckeditor5-paragraph/src/paragraph.js

Line 196 in 50e3509

if ( !schema.checkChild( context.push( 'paragraph' ), node ) ) {

This error can be easily reproduced using the placeholder widget like this:

editor.setData('<span class="placeholder">{nested}</span>')

The span element can't be converted because splitToAllowedParent can't find a proper insertion position.
It falls back to the auto-paragraphing converter that is using this span view element to check if it's allowed by the schema.
This fails (every time there is no coincidence that view element's name matches the model element's name) and falls back to the default converter that consumes an element without creating a model element and triggers children's conversion.

Problem

Auto-paragraphing doesn't know what would be the model element after converting view element so can't properly check if it would convert properly if it would be auto-paragraphed.

Considered Options

Auto-paragraphing should try to convert problematic element in the context of paragraph element (without checking if the resulting model element is allowed in the context by the schema) and afterward check if it is compliant with schema.
Auto-paragraphing should be able to decorate conversionApi.splitToAllowedParent() to verify if the converted model element would be allowed if auto-paragraphed (it could insert paragraph there and return position inside that paragraph).
Auto-paragraphing for elements should be integrated directly into conversionApi.splitToAllowedParent().
Conversion process could be split into individual events matching steps of conversion (with default handlers taking care of it if not overridden), an example sequence of events:
- 'element:span:match' - testing if view item should be handled
- 'element:span:create' - creating a model element from the view element (getModelElement())
- 'element:span:insert' - finding a proper position and inserting a model element (splitToAllowedParent() and writer.insert())
- 'element:span:children' - convertChildren()
- 'element:span:updatecursors' - updateConversionResult()

Pros and Cons of the Options

Option 1: Auto-paragraphing and checking afterward

Good, because auto-paragraphing is handled in one place (this is not entirely true - eg model.deleteContent()).
Bad, because there could be elements that should just be ignored and there is no easy way to verify if an element was properly converted after wrapping with a paragraph or it was just handled by default handler (that ignores element and continues conversion of its children)

Option 2: "Decorating" `splitToAllowedParent()` to extend it with auto-paragraphing

Probably to maintain consistency with conversion API it should be a dedicated event that would also provide conversionApi object.

Good, because allows extending splitToAllowedParent() from external plugins (i.e., paragraph is a separate plugin) so auto-paragraphing would stay in one place.
Bad, because there could be other cases that might make use of extending part of built-in conversion steps and it would require breaking change if it would be changed in the future.

Option 3: Auto-paragraphing elements directly inside `splitToAllowedParent()`

Currently, parts of the paragraph functionality are highly incorporated into the engine (example: model.deleteContent() is auto-paragraphing if needed.

Good, because probably could simplify auto-paragraphing code.
Bad, because if backward compatibility would be maintained then splitToAllowedParent is not the best name for it.
Bad, because it would tighten paragraph with the engine even more than it is currently.

Option 4: Splitting the conversion process to a series of events

Good, because converters could override only selected parts of the conversion process.
Good, because the whole conversion process would expose selective changes.
Bad, because this would generate a ton more events during conversion.
Bad, because it could be difficult to maintain backward compatibility (but this might be possible).

The text was updated successfully, but these errors were encountered:

niegowski · 2020-07-30T15:22:07Z

Original issue #6698

Reinmar · 2020-07-31T07:16:12Z

cc @jodator @scofalik

Reinmar · 2020-07-31T13:49:39Z

We talked about Option 1. f2f and it wasn't looking good in terms of code. It'd add complexity and I wasn't even able to understand if it'd work.

Option 2. seems to be a hack. That we don't know what to do so we make a method overridable :D Do we open other conversionApi methods for extension? And why?

Option 3. I'm leaning for this option the most. To rephrase Kuba's proposal – the idea is that the engine knows about paragraphs anyway. It's the default block to be used when something's wrong (e.g. in deleteContent()). So, why do we keep autoparagraphing in a separate plugin? The great thing about merging autoparagraphing completely into the engine is that it may decrease the complexity.

Option 4. I think it's a no go because of the performance aspect. The conversion performance is already a problem so we should avoid making it worse. Finally, it's unclear to me who would be firing these events. It'd most likely be a huge breaking change. Plus, it'd add more complexity to something that's already too complex.

scofalik · 2020-08-03T17:20:40Z

I think we can rule out option 4. Performance is one thing, complexity is another. We want to make things easier in CKE5 :D.

I think that option 1 is closest to what the auto-paragraphing is about, however, I don't understand this drawback:

Bad, because there could be elements that should just be ignored and there is no easy way to verify if an element was properly converted after wrapping with a paragraph or it was just handled by default handler (that ignores element and continues conversion of its children)

Isn't auto-paragraphing about trying to put a thing in a paragraph and check if it would be allowed this way?

I don't like option 2 because it looks like we are introducing "overwriting" only to solve one concrete problem. I can't see how this method would be further expanded, or how custom (3rd party code) would want to expand it.

For me, it is option 1 or option 3. If option 3 is simpler, I can go with option 3. It couples paragraph element more with the engine but I don't really care about that. After those years I am of opinion that undo or paragraph should be parts of the engine or should be regarded as "obligatory" plugins.

One thing that I don't like though is that splitToAllowedParent() would get even more "magical" becoming this kind of a method that you always have to use but don't know why and what it does, you simply hope that it does the right thing. Funnily, I don't think the name would have to change. The method would still split to allowed parent, however, in some cases, it would create that parent (paragraph element).

jodator · 2020-08-04T05:22:32Z

I had some hard times parsing this all together. However your comments shine some light on this.

Option 1: 😶 I'm not sure what this should do. But it feels like adding more hacks to a hack (paragraph as a separate plugin).

Option 2: 👎 I'm with @scofalik on this - we can't decorate every single method in the API if we see some problems. I think that decorating should be done with extra care. We have too many events IMO.

Option 3: 👍 but as all-together:

To rephrase Kuba's proposal – the idea is that the engine knows about paragraphs anyway. It's the default block to be used when something's wrong (e.g. in deleteContent()). So, why do we keep autoparagraphing in a separate plugin? The great thing about merging autoparagraphing completely into the engine is that it may decrease the complexity.

I was briefly thinking about converting such things to a generi $block (more precise - $text holder) and then changing them to <paragraph> if needed by a plugin. But that's also added complexity. I see that most of the issues with auto-paragraphing is that I don't recall a use-case for the editor without <paragraph> (or a generic $text holder).

Option 4: 👎 try to write a guide for that ;)

Reinmar · 2020-08-05T09:57:20Z

Cool, so it seems that we have a conclusion – let's merge autoparagraphing into the engine :) Biggest benefit: decreased complexity.

niegowski · 2020-08-06T11:39:56Z

This was also affecting editor.setData('<br>foo'):

Expected model result: <paragraph><softBreak></softBreak>foo</paragraph>

Actual result: <paragraph>foo</paragraph>

Fix (engine): Upcast conversion will now try to wrap text and inline elements in paragraph if such content is converted in a place where is not allowed but a paragraph is allowed. Closes #7753. Closes #6698. Internal (paragraph): Auto-paragraphing content conversion moved to the engine package. Tests (table): Added tests for upcasting inline elements inside a table cell.

niegowski added the type:bug This issue reports a buggy (incorrect) behavior. label Jul 30, 2020

Reinmar added the status:discussion label Jul 31, 2020

niegowski mentioned this issue Aug 5, 2020

#7753: Upcast conversion should handle wrapping in paragraphs if it would allow conversion #7787

Merged

niegowski self-assigned this Aug 6, 2020

niegowski added this to the iteration 35 milestone Aug 6, 2020

jodator closed this as completed in #7787 Aug 7, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The upcast conversion can't properly handle inline objects without $text accepting context #7753

The upcast conversion can't properly handle inline objects without $text accepting context #7753

niegowski commented Jul 30, 2020 •

edited

Loading

niegowski commented Jul 30, 2020

Reinmar commented Jul 31, 2020

Reinmar commented Jul 31, 2020 •

edited

Loading

scofalik commented Aug 3, 2020

jodator commented Aug 4, 2020

Reinmar commented Aug 5, 2020

niegowski commented Aug 6, 2020

The upcast conversion can't properly handle inline objects without $text accepting context #7753

The upcast conversion can't properly handle inline objects without $text accepting context #7753

Comments

niegowski commented Jul 30, 2020 • edited Loading

The upcast conversion and auto-paragraphing

Context and Problem Statement

Context

Problem

Considered Options

Pros and Cons of the Options

Option 1: Auto-paragraphing and checking afterward

Option 2: "Decorating" splitToAllowedParent() to extend it with auto-paragraphing

Option 3: Auto-paragraphing elements directly inside splitToAllowedParent()

Option 4: Splitting the conversion process to a series of events

niegowski commented Jul 30, 2020

Reinmar commented Jul 31, 2020

Reinmar commented Jul 31, 2020 • edited Loading

scofalik commented Aug 3, 2020

jodator commented Aug 4, 2020

Reinmar commented Aug 5, 2020

niegowski commented Aug 6, 2020

niegowski commented Jul 30, 2020 •

edited

Loading

Option 2: "Decorating" `splitToAllowedParent()` to extend it with auto-paragraphing

Option 3: Auto-paragraphing elements directly inside `splitToAllowedParent()`

Reinmar commented Jul 31, 2020 •

edited

Loading