
Best Way To Remove All But The 5 Predefined Html Entities With Php - For Xhtml5 Output

I'm experimenting with delivering XHTML5. At the moment I deliver XHTML 1.1 Strict on the page I'm working on. That is, I do for capable browsers. For those who don't accept X

Solution 1:

PAY ATTENTION to universal conversions: html_entity_decode with default parameters does not decode all named entities, only the few defined by the old HTML 4.01 standard. So entities like &copy; (©) will be converted, but some, like &plus; (+), will not. To convert ALL named entities, pass ENT_HTML5 in the second parameter (!).
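To make the difference concrete, here is a minimal sketch comparing the HTML 4.01 entity table with the HTML5 one. The sample string and variable names are illustrative assumptions, not part of the original question.

```php
<?php
// Illustrative input: &copy; exists in HTML 4.01, &plus; only in HTML5.
$s = '&copy; 2 &plus; 2';

// HTML 4.01 table: &plus; is unknown and is left untouched.
$html401 = html_entity_decode($s, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// HTML5 table: both entities are decoded.
$html5 = html_entity_decode($s, ENT_COMPAT | ENT_HTML5, 'UTF-8');

echo $html401, "\n"; // © 2 &plus; 2
echo $html5, "\n";   // © 2 + 2
```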

Also, if the destination encoding is not UTF-8, the output cannot receive characters with code points above 255, like &Ascr; (𝒜), which is code point 119964 > 255.
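A short sketch of the encoding issue, assuming ISO-8859-1 as the non-UTF-8 target (the sample string is made up for illustration). When the character cannot be represented in the target encoding, the entity is expected to stay in the output unchanged.

```php
<?php
// Illustrative input containing an entity whose code point is far above 255.
$s = 'Script A: &Ascr;';

// UTF-8 can represent U+1D49C, so the entity is decoded.
$utf8 = html_entity_decode($s, ENT_HTML5, 'UTF-8');

// ISO-8859-1 cannot represent it, so the entity is left as-is.
$latin1 = html_entity_decode($s, ENT_HTML5, 'ISO-8859-1');

echo $utf8, "\n";   // Script A: 𝒜
echo $latin1, "\n"; // Script A: &Ascr;
```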

So, to convert "ALL POSSIBLE NAMED ENTITIES", you MUST use html_entity_decode($s,ENT_HTML5,'UTF-8'), but this is valid only with PHP 5.4+, where the ENT_HTML5 flag was implemented.

In the particular case of this question, you must also use the ENT_NOQUOTES flag instead of the default ENT_COMPAT, so use html_entity_decode($s,ENT_HTML5|ENT_NOQUOTES,'UTF-8').
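As a rough illustration of that call, the sketch below runs it on a made-up string (the input and variable names are assumptions). With ENT_NOQUOTES, the quote entities &quot; and &apos; are left alone while other named entities are decoded.

```php
<?php
// Illustrative input mixing quote entities with other named entities.
$s = '&quot;caf&eacute;&quot; &amp; &apos;2 &plus; 2&apos;';

// ENT_NOQUOTES: neither &quot; nor &apos; is decoded; ENT_HTML5: full entity table.
$out = html_entity_decode($s, ENT_HTML5 | ENT_NOQUOTES, 'UTF-8');

echo $out, "\n"; // &quot;café&quot; & &apos;2 + 2&apos;
```

Note that &amp; is still decoded here (and the same holds for &lt; and &gt;), so the result would need those three re-escaped before it is emitted as XHTML.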


PS (edited): thanks to @BoltClock for the reminder about the minimum PHP version.

Solution 2:

I think a html_entity_decode() followed by a htmlspecialchars() is the easiest way to go.

It won't convert ' though. To get that, you'd have to run htmlspecialchars() first and then convert ' into &apos;.
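The two steps described above could be sketched roughly as follows. The function name keep_predefined_entities is hypothetical, and the choice of ENT_QUOTES | ENT_HTML5 for the decode step is an assumption made so that every entity, including the quote ones, is decoded before re-escaping.

```php
<?php
// Hypothetical helper: decode everything, then re-escape only the XML specials.
function keep_predefined_entities(string $s): string
{
    // Step 1: decode all named and numeric entities, quotes included.
    $decoded = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Step 2: re-escape &, <, > and " (ENT_COMPAT leaves ' untouched).
    $escaped = htmlspecialchars($decoded, ENT_COMPAT, 'UTF-8');

    // Step 3: convert the remaining single quotes into &apos; by hand.
    return str_replace("'", '&apos;', $escaped);
}

echo keep_predefined_entities("&copy; &lt;b&gt; &amp; &quot;x&quot; it's &eacute;"), "\n";
// © &lt;b&gt; &amp; &quot;x&quot; it&apos;s é
```

On PHP 5.4+, passing ENT_QUOTES | ENT_XML1 to htmlspecialchars() should emit &apos; directly, which would make the str_replace() step unnecessary.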
