{"id":470,"date":"2026-02-10T20:05:48","date_gmt":"2026-02-10T20:05:48","guid":{"rendered":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/2026\/02\/10\/dennis-snell-html-api-check-for-unclosed-attributes\/"},"modified":"2026-02-10T20:05:48","modified_gmt":"2026-02-10T20:05:48","slug":"dennis-snell-html-api-check-for-unclosed-attributes","status":"publish","type":"post","link":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/2026\/02\/10\/dennis-snell-html-api-check-for-unclosed-attributes\/","title":{"rendered":"Dennis Snell: HTML API: Check for unclosed attributes."},"content":{"rendered":"<p>Today someone was discussing the goal of linting HTML, specifically of detecting unclosed attributes. Consider the following snippet:<\/p>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code class=\"language-html\"><div class=\"cm-line\"><span class=\"tok-punctuation\">&lt;<\/span><span class=\"tok-typeName\">p<\/span> <span class=\"tok-propertyName\">class<\/span><span class=\"tok-operator\">=<\/span><span class=\"tok-string\">\"important&gt;&lt;img src=\"<\/span><span class=\"tok-propertyName\">alert.png<\/span>\"&gt;This is important!<span class=\"tok-punctuation\">&lt;\/<\/span><span class=\"tok-typeName\">p<\/span><span class=\"tok-punctuation\">&gt;<\/span><\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>It\u2019s clear that a mistake led to a missing double-quote on the <code>class<\/code> attribute of the opening <code>&lt;p&gt;<\/code> tag. While WordPress\u2019 HTML API doesn\u2019t directly report this (because \u201cunclosed attribute\u201d isn\u2019t particularly an HTML concept), it can be used to roughly detect it.<\/p>\n<p>Here\u2019s how to use <strong>the public functionality of the HTML API to detect unclosed attributes<\/strong>.<\/p>\n<p><span><\/span><\/p>\n<p>To do this, we have to define what an unclosed attribute means. 
For the sake of brevity we will assume that if an attribute value contains HTML-like syntax it is probably unclosed. We might be tempted to start with something like this:<\/p>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code class=\"language-php\"><div class=\"cm-line\"><span class=\"tok-meta\">&lt;?php<\/span><\/div><div class=\"cm-line\"><span class=\"tok-keyword\">foreach<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-propertyName\">$processor<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">get_attribute_names_with_prefix<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-string\">''<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-keyword\">as<\/span> <span class=\"tok-variableName\">$name<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">  <span class=\"tok-variableName\">$value<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-propertyName\">$processor<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">get_attribute<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$name<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">  <span class=\"tok-keyword\">if<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-operator\">!<\/span> <span class=\"tok-variableName\">is_string<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$value<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">continue<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">  <span class=\"tok-punctuation\">}<\/span><\/div><div 
class=\"cm-line\"><\/div><div class=\"cm-line\">  <span class=\"tok-variableName\">$checker<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-keyword\">new<\/span> WP_HTML_Tag_Processor<span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$value<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">  <span class=\"tok-keyword\">if<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-propertyName\">$checker<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">next_tag<\/span><span class=\"tok-punctuation\">(<\/span><span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">throw<\/span> <span class=\"tok-keyword\">new<\/span> Exception<span class=\"tok-punctuation\">(<\/span> <span class=\"tok-string\">'Found tag syntax within attribute: is it unclosed?'<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">  <span class=\"tok-punctuation\">}<\/span><\/div><div class=\"cm-line\"><span class=\"tok-punctuation\">}<\/span><\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>This approach does get pretty far, but it suffers from the fact that it\u2019s checking <em>decoded<\/em> attribute values, meaning it will report false positives on any attribute <em>which discusses tags<\/em>, such as <code>alt=\"the &amp;lt;img&amp;gt; tag is a void element\"<\/code>. It\u2019s better to review the raw attribute value instead of the decoded attribute value.<\/p>\n<h3 class=\"wp-block-heading\">A sneaky trick hidden in attribute removal<\/h3>\n<p>The Tag Processor tracks attribute offsets but doesn\u2019t expose them, even to subclasses. The HTML API tries really hard to avoid exposing string offsets, and it does this for good reason. 
String offsets are easy to misuse, unclear, and finicky.<\/p>\n<p>However, the Tag Processor does allow subclasses to access its <code>lexical_updates<\/code>, which is an array of string replacements to perform after semantic-level requests have been converted to text. We can analyze these updates after requesting to remove an attribute; they reveal every place where that attribute and any ignored duplicates appeared in the source document.<\/p>\n<p>This approach also leans on the fact that static methods of subclasses have access to protected properties of the parent class.<\/p>\n<p><strong>This is risky code and should be used with extreme caution, code review, and shared understanding among those who will be asked to maintain it.<\/strong><\/p>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code class=\"language-php\"><div class=\"cm-line\"><span class=\"tok-meta\">&lt;?php<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\"><span class=\"tok-keyword\">class<\/span> <span class=\"tok-className\">WP_Attribute_Walker<\/span> <span class=\"tok-keyword\">extends<\/span> WP_HTML_Tag_Processor <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">   <span class=\"tok-keyword\">public<\/span> <span class=\"tok-keyword\">static<\/span> <span class=\"tok-keyword\">function<\/span> <span class=\"tok-variableName tok-definition\">walk<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$html<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">      <span class=\"tok-variableName\">$p<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-keyword\">new<\/span> WP_HTML_Tag_Processor<span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$html<\/span> <span class=\"tok-punctuation\">)<\/span><span 
class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">      <span class=\"tok-keyword\">while<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-propertyName\">$p<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">next_tag<\/span><span class=\"tok-punctuation\">(<\/span><span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">         <span class=\"tok-variableName\">$names<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-propertyName\">$p<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">get_attribute_names_with_prefix<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-string\">''<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">         <span class=\"tok-keyword\">foreach<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$names<\/span> <span class=\"tok-keyword\">as<\/span> <span class=\"tok-variableName\">$name<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">            <span class=\"tok-propertyName\">$p<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">remove_attribute<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$name<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">            <span class=\"tok-variableName\">$updates<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-propertyName\">$p<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">lexical_updates<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div 
class=\"cm-line\">            <span class=\"tok-propertyName\">$p<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">lexical_updates<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-keyword\">array<\/span><span class=\"tok-punctuation\">(<\/span><span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">            <span class=\"tok-variableName\">$i<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-number\">0<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">            <span class=\"tok-keyword\">foreach<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$updates<\/span> <span class=\"tok-keyword\">as<\/span> <span class=\"tok-variableName\">$update<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">               <span class=\"tok-variableName\">$raw_attr<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-variableName\">substr<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$html<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-propertyName\">$update<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">start<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-propertyName\">$update<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">length<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">               <span class=\"tok-variableName\">$quote_at<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-variableName\">strcspn<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$raw_attr<\/span><span class=\"tok-punctuation\">,<\/span> <span 
class=\"tok-string\">'\\'\"'<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">               <span class=\"tok-variableName\">$might_be_unclosed<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-bool\">false<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">               <span class=\"tok-keyword\">if<\/span> <span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$quote_at<\/span> <span class=\"tok-operator\">&lt;<\/span> <span class=\"tok-variableName\">strlen<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$raw_attr<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-variableName\">$raw_value<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-variableName\">substr<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$raw_attr<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-variableName\">$quote_at<\/span> <span class=\"tok-operator\">+<\/span> <span class=\"tok-number\">1<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-variableName\">strrpos<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$raw_attr<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-variableName\">$raw_attr<\/span><span class=\"tok-punctuation\">[<\/span> <span class=\"tok-variableName\">$quote_at<\/span> <span class=\"tok-punctuation\">]<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-operator\">-<\/span> <span class=\"tok-variableName\">$quote_at<\/span> <span class=\"tok-operator\">-<\/span> <span class=\"tok-number\">1<\/span> <span class=\"tok-punctuation\">)<\/span><span 
class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-variableName\">$checker<\/span>   <span class=\"tok-operator\">=<\/span> <span class=\"tok-keyword\">new<\/span> WP_HTML_Tag_Processor<span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$raw_value<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-variableName\">$might_be_unclosed<\/span> <span class=\"tok-operator\">=<\/span> <span class=\"tok-propertyName\">$checker<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">next_tag<\/span><span class=\"tok-punctuation\">(<\/span><span class=\"tok-punctuation\">)<\/span> <span class=\"tok-operator\">||<\/span> <span class=\"tok-propertyName\">$checker<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">paused_at_incomplete_token<\/span><span class=\"tok-punctuation\">(<\/span><span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">               <span class=\"tok-punctuation\">}<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">               <span class=\"tok-keyword\">yield<\/span> <span class=\"tok-propertyName\">$p<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">get_token_name<\/span><span class=\"tok-punctuation\">(<\/span><span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">=&gt;<\/span> <span class=\"tok-keyword\">array<\/span><span class=\"tok-punctuation\">(<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-variableName\">$name<\/span><span class=\"tok-punctuation\">,<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-keyword\">array<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-propertyName\">$update<\/span><span 
class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">start<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-propertyName\">$update<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">length<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">,<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-number\">0<\/span> <span class=\"tok-operator\">===<\/span> <span class=\"tok-variableName\">$i<\/span><span class=\"tok-operator\">++<\/span> <span class=\"tok-operator\">?<\/span> <span class=\"tok-string\">'non-duplicate'<\/span> <span class=\"tok-punctuation\">:<\/span> <span class=\"tok-string\">'duplicate'<\/span><span class=\"tok-punctuation\">,<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-variableName\">$might_be_unclosed<\/span> <span class=\"tok-operator\">?<\/span> <span class=\"tok-string\">'contains-tag-like-content'<\/span> <span class=\"tok-punctuation\">:<\/span> <span class=\"tok-string\">'does-not-contain-tag-like-content'<\/span><span class=\"tok-punctuation\">,<\/span><\/div><div class=\"cm-line\">                  <span class=\"tok-variableName\">substr<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$html<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-propertyName\">$update<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">start<\/span><span class=\"tok-punctuation\">,<\/span> <span class=\"tok-propertyName\">$update<\/span><span class=\"tok-punctuation\">-&gt;<\/span><span class=\"tok-propertyName\">length<\/span> <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">,<\/span><\/div><div class=\"cm-line\">               <span class=\"tok-punctuation\">)<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">            <span class=\"tok-punctuation\">}<\/span><\/div><div 
class=\"cm-line\">         <span class=\"tok-punctuation\">}<\/span><\/div><div class=\"cm-line\">      <span class=\"tok-punctuation\">}<\/span><\/div><div class=\"cm-line\">   <span class=\"tok-punctuation\">}<\/span><\/div><div class=\"cm-line\"><span class=\"tok-punctuation\">}<\/span><\/div><div class=\"cm-line\"><\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>This <code>WP_Attribute_Walker::walk( $html )<\/code> method steps through each tag in the given document and returns a generator which reports each attribute on the tag, as well as some meta information about it.<\/p>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code class=\"language-html\"><div class=\"cm-line\">$meta === array(<\/div><div class=\"cm-line\">    'class',                       \/\/ parsed name of attribute<\/div><div class=\"cm-line\">    array( 3, 27 ),                \/\/ (offset, length) of full attribute span in HTML<\/div><div class=\"cm-line\">    'non-duplicate',               \/\/ whether this is the actual attribute or an ignored duplicate<\/div><div class=\"cm-line\">    'contains-tag-like-content',   \/\/ likelihood of being unclosed<\/div><div class=\"cm-line\">    'class=\"important&gt;<span class=\"tok-punctuation\">&lt;<\/span><span class=\"tok-typeName\">img<\/span> <span class=\"tok-propertyName\">src<\/span><span class=\"tok-operator\">=<\/span><span class=\"tok-string\">\"', \/\/ full span of attribute in HTML<\/span><\/div><div class=\"cm-line\"><span class=\"tok-string\">);<\/span><\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<h3 class=\"wp-block-heading\">How to use this walker<\/h3>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code class=\"language-php\"><div class=\"cm-line\"><span class=\"tok-meta\">&lt;?php<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\"><span class=\"tok-variableName\">$html<\/span> <span 
class=\"tok-operator\">=<\/span> <span class=\"tok-string\">'&lt;p class=\"important&gt;&lt;img src=\"alert.png\"&gt;This is important!&lt;\/p&gt;'<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\"><span class=\"tok-keyword\">foreach<\/span> <span class=\"tok-punctuation\">(<\/span> WP_Attribute_Walker<span class=\"tok-punctuation\">::<\/span><span class=\"tok-propertyName\">walk<\/span><span class=\"tok-punctuation\">(<\/span> <span class=\"tok-variableName\">$html<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-keyword\">as<\/span> <span class=\"tok-variableName\">$tag_name<\/span> <span class=\"tok-punctuation\">=&gt;<\/span> <span class=\"tok-variableName\">$meta<\/span> <span class=\"tok-punctuation\">)<\/span> <span class=\"tok-punctuation\">{<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">echo<\/span> <span class=\"tok-string\">\"Found in &lt;<\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$tag_name<\/span><span class=\"tok-punctuation\">}<\/span><span class=\"tok-string\">&gt; an attribute named '<\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$meta<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">0<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">}<\/span><span class=\"tok-string\">'<\/span>\\n<span class=\"tok-string\">\"<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">echo<\/span> <span class=\"tok-string\">\"  @ byte offset <\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$meta<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">1<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">0<\/span><span class=\"tok-punctuation\">]<\/span><span 
class=\"tok-punctuation\">}<\/span><span class=\"tok-string\"> extending <\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$meta<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">1<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">1<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">}<\/span><span class=\"tok-string\"> bytes<\/span>\\n<span class=\"tok-string\">\"<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">echo<\/span> <span class=\"tok-string\">\"  it is a <\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$meta<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">2<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">}<\/span><span class=\"tok-string\"> attribute on the tag<\/span>\\n<span class=\"tok-string\">\"<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">echo<\/span> <span class=\"tok-string\">\"  its value <\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$meta<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">3<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">}<\/span>\\n<span class=\"tok-string\">\"<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\">    <span class=\"tok-keyword\">echo<\/span> <span class=\"tok-string\">\"     `<\/span><span class=\"tok-punctuation\">{<\/span><span class=\"tok-variableName\">$meta<\/span><span class=\"tok-punctuation\">[<\/span><span class=\"tok-number\">4<\/span><span class=\"tok-punctuation\">]<\/span><span class=\"tok-punctuation\">}<\/span><span class=\"tok-string\">`\"<\/span><span class=\"tok-punctuation\">;<\/span><\/div><div class=\"cm-line\"><span 
class=\"tok-punctuation\">}<\/span><\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>The output here tells us what we want to know:<\/p>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code><div class=\"cm-line\">Found in &lt;P&gt; an attribute named 'class'<\/div><div class=\"cm-line\">  @ byte offset 3 extending 27 bytes<\/div><div class=\"cm-line\">  it is a non-duplicate attribute on the tag<\/div><div class=\"cm-line\">  its value contains-tag-like-content<\/div><div class=\"cm-line\">     `class=\"important&gt;&lt;img src=\"`<\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">Found in &lt;P&gt; an attribute named 'alert.png\"'<\/div><div class=\"cm-line\">  @ byte offset 30 extending 10 bytes<\/div><div class=\"cm-line\">  it is a non-duplicate attribute on the tag<\/div><div class=\"cm-line\">  its value does-not-contain-tag-like-content<\/div><div class=\"cm-line\">     `alert.png\"`<\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<p>For normative HTML the values are not as surprising. 
In this case, the missing <code>\"<\/code> has been added to the <code>class<\/code> attribute.<\/p>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code class=\"language-php\"><div class=\"cm-line\">$html = '<span class=\"tok-punctuation\">&lt;<\/span><span class=\"tok-typeName\">p<\/span> <span class=\"tok-propertyName\">class<\/span><span class=\"tok-operator\">=<\/span><span class=\"tok-string\">\"important\"<\/span><span class=\"tok-punctuation\">&gt;<\/span><span class=\"tok-punctuation\">&lt;<\/span><span class=\"tok-typeName\">img<\/span> <span class=\"tok-propertyName\">src<\/span><span class=\"tok-operator\">=<\/span><span class=\"tok-string\">\"alert.png\"<\/span><span class=\"tok-punctuation\">&gt;<\/span>This is important!<span class=\"tok-punctuation\">&lt;\/<\/span><span class=\"tok-typeName\">p<\/span><span class=\"tok-punctuation\">&gt;<\/span>';<\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"wp-block-code\">\n<div class=\"cm-editor\">\n<div class=\"cm-scroller\">\n<pre><code><div class=\"cm-line\">Found in &lt;P&gt; an attribute named 'class'<\/div><div class=\"cm-line\">  @ byte offset 3 extending 17 bytes<\/div><div class=\"cm-line\">  it is a non-duplicate attribute on the tag<\/div><div class=\"cm-line\">  its value does-not-contain-tag-like-content<\/div><div class=\"cm-line\">     `class=\"important\"`<\/div><div class=\"cm-line\"><\/div><div class=\"cm-line\">Found in &lt;IMG&gt; an attribute named 'src'<\/div><div class=\"cm-line\">  @ byte offset 26 extending 15 bytes<\/div><div class=\"cm-line\">  it is a non-duplicate attribute on the tag<\/div><div class=\"cm-line\">  its value does-not-contain-tag-like-content<\/div><div class=\"cm-line\">     `src=\"alert.png\"`<\/div><\/code><\/pre>\n<\/div>\n<\/div>\n<\/div>\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n<p>This code is not meant to be normative; it\u2019s probably missing important details. 
It\u2019s here to demonstrate one way we can take advantage of the already-available aspects of the HTML API to perform more interesting work.<\/p>\n<p>In this case, we can tug at some of its internals to build linting and reporting tools which investigate aspects not exposed in the public interface: duplicate attributes and raw attribute values.<\/p>\n<p>Checking whether an attribute is closed is a tricky problem to solve. We can only approach it with a set of heuristics that estimate the likelihood that an attribute isn\u2019t closed, because an HTML parser will interpret any given string in one specific way and, regardless of errors, will produce tags and attributes from it.<\/p>\n<p>Before we reach for custom regular expressions (PCRE), we can look into the HTML API and consider the sliding scale of safety it presents to us; we can take advantage of the parsing it\u2019s already performing to remove the need to replicate all of HTML\u2019s complicated parsing rules in our custom code.<\/p>","protected":false},"excerpt":{"rendered":"<p>Today someone was discussing the goal of linting HTML, specifically of detecting unclosed attributes. Consider the following snippet: &lt;p class=&#8221;important&gt;&lt;img src=&#8221;alert.png&#8221;&gt;This is important!&lt;\/p&gt; It\u2019s clear that a mistake led to a missing double-quote on the class attribute of the opening &lt;p&gt; tag. 
While WordPress\u2019 HTML API doesn\u2019t directly report this (because \u201cunclosed attribute\u201d isn\u2019t particularly [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-470","post","type-post","status-publish","format-standard","hentry","category-blog"],"_links":{"self":[{"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/posts\/470","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/comments?post=470"}],"version-history":[{"count":0,"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/posts\/470\/revisions"}],"wp:attachment":[{"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/media?parent=470"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/categories?post=470"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/xn--mnchen-3ya.xyz\/index.php\/wp-json\/wp\/v2\/tags?post=470"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}