Can someone tell me what the following paragraph in the HTML5 spec means? It concerns the processing of style elements:
All descendant elements must be processed, according to their
semantics, before the style element itself is evaluated. For styling
languages that consist of pure text (as opposed to XML), user agents
must evaluate style elements by passing the concatenation of the
contents of all the Text nodes that are children of the style element
(not any other nodes such as comments or elements), in tree order, to
the style system. For XML-based styling languages, user agents must
pass all the child nodes of the style element to the style system.
The only way to get a comment node or element node into a
style element is by DOM manipulation—putting the comment or element into the
style element in the DOM after an HTML parser has already parsed the document.
So the spec is not saying the HTML parser should remove all HTML elements and comments inside
<style>…</style> markup. If the spec intended that, it would say so explicitly.
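To make the rule about Text nodes concrete, here is a rough sketch in Python using the standard library's xml.dom.minidom. It is only a stand-in for a browser DOM, and the helper name style_text_for_style_system is made up for illustration. It builds a style element by DOM manipulation, adds a comment and a child element, and then concatenates only the Text-node children in tree order:

```python
from xml.dom import minidom


def style_text_for_style_system(style_element):
    """Concatenate the data of Text-node children only, in tree order.

    Hypothetical helper: it mimics the rule for pure-text styling
    languages, skipping comments and elements (and anything inside them).
    """
    return "".join(
        child.data
        for child in style_element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


# Build the element through DOM calls, the only way a comment or
# element node can end up inside a style element.
doc = minidom.Document()
style = doc.createElement("style")
style.appendChild(doc.createTextNode("p { color: red; } "))
style.appendChild(doc.createComment(" inserted via DOM "))
span = doc.createElement("span")
span.appendChild(doc.createTextNode("b { color: blue; }"))
style.appendChild(span)
style.appendChild(doc.createTextNode("i { color: green; }"))

print(style_text_for_style_system(style))
# prints: p { color: red; } i { color: green; }
```

The comment and the span (including the text inside the span, which is a grandchild rather than a child) never reach the style system in this sketch.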
HTML parsers parse all content in
<style>…</style> markup as text—including any content that looks like a comment or looks like an element.
So there are no comments or elements for an HTML parser to remove there—it’s all just text.
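You can see similar behavior with Python's html.parser module. It is not a full implementation of the HTML spec's parsing algorithm, but it also treats the content of style (and script) as plain character data. A minimal sketch, where the Recorder class is just an illustrative name:

```python
from html.parser import HTMLParser


class Recorder(HTMLParser):
    """Record which callbacks fire while parsing (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.comments = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_comment(self, data):
        self.comments.append(data)

    def handle_data(self, data):
        # Data may arrive in more than one chunk, so collect and join later.
        self.text.append(data)


markup = (
    "<style><!-- looks like a comment --> p { color: red } "
    "<b>looks like an element</b></style>"
)
recorder = Recorder()
recorder.feed(markup)
recorder.close()

print(recorder.start_tags)     # only the style start tag is reported
print(recorder.comments)       # no comment callbacks fire
print("".join(recorder.text))  # everything inside style arrives as text
```

The comment-like and element-like sequences show up only in the text, never as comment or start-tag events.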
Where in the spec does it say that the content is pure text?
style content is “raw text”.
The HTML 4 spec states clearly that the content of style elements is CDATA. That is what I am looking for but I can't find it in the HTML5 spec.
What the current HTML spec calls “raw text” is essentially the same as CDATA in the HTML 4 spec.
Where does it say that it is terminated by the string "</style>"?
See these steps of the parsing algorithm:
The last step there references the definition of “appropriate end tag token”:
An appropriate end tag token is an end tag token whose tag name matches the tag name of the last start tag to have been emitted from this tokenizer, if any.
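As a very simplified sketch of that definition (not the spec's actual state machine), the function below scans raw text for an end tag whose name matches the last start tag emitted. The name split_raw_text and the regular expression are my own assumptions for illustration. The lookahead reflects that the tokenizer only accepts the name when it is followed by whitespace, "/", or ">":

```python
import re


def split_raw_text(source, last_start_tag):
    """Split source into (raw_text, rest) at the appropriate end tag.

    Hypothetical helper: the end tag must carry the same name as the
    last start tag (compared case-insensitively) and be followed by
    whitespace, "/", or ">".
    """
    pattern = re.compile(
        r"</" + re.escape(last_start_tag) + r"(?=[\t\n\f />])",
        re.IGNORECASE,
    )
    match = pattern.search(source)
    if match is None:
        # No appropriate end tag: everything remaining is raw text.
        return source, ""
    return source[:match.start()], source[match.start():]


raw, rest = split_raw_text("a </b> b { } </styles> c </STYLE> tail", "style")
print(repr(raw))   # 'a </b> b { } </styles> c '
print(repr(rest))  # '</STYLE> tail'
```

Note that </b> and </styles> do not end the raw text here, because neither matches the name of the last start tag.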
So when parsing the raw text of
script contents, the last start tag to have been emitted is a
<script> start tag, thus the “appropriate end tag token” is