Consider defaulting narrative text to the xhtml namespace #290

lmsurpre · 2017-01-17T07:05:07Z

We are getting a lot of xmlns="" attributes in the narrative text of our output.

There is a related issue where it was recommend to set the namespace of the xhtml when you call the api:
#193

This is not ideal for us because we build up the narrative text from xhtml snippets that we store in our db and we'd prefer to avoid adding xmlns everywhere in our views.

It seems like the narrative text should always be in the xhtml namespace and so maybe the library could assume that? Or maybe if you add a snippet of narrative text with no namespace, it could automatically make that http://www.w3.org/1999/xhtml ?

wmrutten · 2017-01-18T22:53:16Z

Hi Lee, thank you for reporting this. We have to be very careful with namespaces, as invalid namespaces will prevent correct XML (de-)serialization. So I'm not sure if the library can "assume" anything.

Can you provide a small code example of how you build up the narrative using the API? Maybe we can come up with a workaround.

lmsurpre · 2017-01-18T23:17:55Z

I'll try to provide and example soon. In the meantime, a quick question: are there any cases where the elements under narrative text would have a namespace other than http://www.w3.org/1999/xhtml ?
It seems to me like that's not even allowed by the spec, so why support it?

wmrutten · 2017-01-19T22:22:38Z

Right, other namespaces are not allowed and (IIRC) should be rejected by the API.

ewoutkramer · 2021-03-04T10:13:16Z

The behaviour of the SDK here is dubious. The intentions are right, but the implementation is wrong (this is the serializer):

 // The value *should* be valid xhtml - however if people just provide a plain
 // string, lets add a <div> around it.
 if (!value.TrimStart().StartsWith("<div"))
        value = $"<div xmlns='http://www.w3.org/1999/xhtml'>{value}</div>";

 var sanitized = SerializationUtil.SanitizeXml(value);
 XElement xe = XElement.Parse(sanitized);

 // The div should be in the XHTMLNS namespace, correct if it 
 // is not the case.
 xe.Name = XmlNs.XHTMLNS + xe.Name.LocalName;
 parent.Add(xe);

As is visible, this code tries to correct for developer mistakes: if the developer supplies a value that does not start with a div, we'll add one (e.g. the common mistake of starting with <p>, or, when the data comes in from json, just contain text). This works fine.

However when the developer supplies xml content starting with a div but not in the xhtml namespace, we will add the xhtml namespace to the root. And this is a mistake: this is to the root only. So all nested elements in the xml will still be outside the xhtml namespace, and all of them will therefore have the xmlns='' attribute added to them (as Lee reported).

We could fix this by going down all children and applying the xhtml namespace (if there is no namespace).

Should we correct this bug? What we could have done is throw an exception. But the design filosophy of the serializer is to just go on and not do any kind of validation of the inputs. It would clearly be a breaking change (stuff that serialized before now won't serialize anymore).

Can we leave it as is? I think so. It has been there for a while, and there are few complaints. I guess this is because the current sea of xmlns='' attributes trigger people to look at their code - and so they realize they forgot to apply the xhtml namespace. So it functions nicely as a red flag that something is wrong. (and not changing the root namespace at all means we would generate incorrect XHTML - so we can't do that either).

To conclude: if we don't mind breaking peoples current code, we can fix it. But it is likely that the current situation is "good enough". I suggest we leave it as is.

ewoutkramer added the enhancement label Jun 28, 2017

ewoutkramer closed this as completed Mar 4, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider defaulting narrative text to the xhtml namespace #290

Consider defaulting narrative text to the xhtml namespace #290

lmsurpre commented Jan 17, 2017

wmrutten commented Jan 18, 2017

lmsurpre commented Jan 18, 2017

wmrutten commented Jan 19, 2017

ewoutkramer commented Mar 4, 2021

Consider defaulting narrative text to the xhtml namespace #290

Consider defaulting narrative text to the xhtml namespace #290

Comments

lmsurpre commented Jan 17, 2017

wmrutten commented Jan 18, 2017

lmsurpre commented Jan 18, 2017

wmrutten commented Jan 19, 2017

ewoutkramer commented Mar 4, 2021