Replace All Quotes That Are Not In Html-tags
Solution 1:
This regex works for the given strings.
Search for - "([^<>]*?)"(?=[^>]*?<)
Replace with - »\1«
Testing it -
INPUT -
<p>This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
OUTPUT -
<p>This is a »wonderful long text«. »Another wonderful ong text« At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>
EDIT 1- Executing this in PHP -
$str = '<p>This is a "wonderful long text". "Another wonderful ong text" At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>';
var_dump(preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '»\1«', $str));
Its output -
/** OUTPUT **/
string '<p>This is a »wonderful long text«. »Another wonderful ong text« At least it should be. Here we have a <a href="http://wwww.site-to-nowhere.com" target="_blank">link</a>.</p>' (length=196)
EDIT 2-
You have called the preg_replace function correctly, but in the replacement string you have used \1 inside double quotes (""). In a double-quoted PHP string, \1 is interpreted as an octal escape sequence (the character with code 1), so the backreference never reaches the regex engine and nothing is substituted.
To see this, try the following and compare the results -
echo '»\1«';
echo "»\1«";
The second \1 will not be visible. So the solution is one of these -
preg_replace('/"([^<>]*?)"(?=[^>]*?<)/', '»\1«', $str)
preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»\\1«", $str)
preg_replace("/\"([^<>]*?)\"(?=[^>]*?<)/", "»$1«", $str)
See the description of the replacement parameter in the PHP manual entry for preg_replace for details.
EDIT 3- A regex that also covers text that is not enclosed within tags -
\"([^<>]*?)\"(?=(?:[^>]*?(?:<|$)))
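As a minimal sketch of how the EDIT 3 pattern behaves in PHP, the snippet below runs it on one string without any tags and on one that contains a link. Both sample strings and the variable names are assumptions made for illustration; they do not come from the original question.

```php
<?php
// EDIT 3 pattern, written without the redundant backslashes before the quotes.
$pattern = '/"([^<>]*?)"(?=[^>]*?(?:<|$))/';

// Hypothetical inputs (not from the original question).
$plain  = 'He said "hello" and left.';
$tagged = 'A "quote" and a <a href="x.html">link</a>';

// Single-quoted replacement so that \1 reaches the regex engine intact.
$plainResult  = preg_replace($pattern, '»\1«', $plain);
$taggedResult = preg_replace($pattern, '»\1«', $tagged);

echo $plainResult, "\n";   // He said »hello« and left.
echo $taggedResult, "\n";  // A »quote« and a <a href="x.html">link</a>
```

The `$` alternative in the lookahead is what lets the first string match, since it contains no `<` at all.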
Solution 2:
You could also use a negative lookahead, which rejects any quote that is followed by a > before the next <:
(?![^<]*>)"([^"]+)"
Replace with: »\1«
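A short sketch of this pattern in PHP, assuming a hypothetical input string with one quoted phrase in the text and two quoted attribute values:

```php
<?php
// Negative lookahead: the opening quote must not be followed by '>' before the next '<'.
$pattern = '/(?![^<]*>)"([^"]+)"/';

// Hypothetical input (not from the original question).
$html = '<p>Say "hi" to <a href="u" target="_blank">me</a></p>';

$result = preg_replace($pattern, '»\1«', $html);

echo $result, "\n";  // <p>Say »hi« to <a href="u" target="_blank">me</a></p>
```

Note that the lookahead only checks the opening quote, so an unbalanced quote in the text could still pair up with a quote further along.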
Solution 3:
For the record, there is a simple PHP solution that was not mentioned and that efficiently skips over all the <a...</a> tags.
Search: <a.*?<\/a>(*SKIP)(*F)|"([^"]*)"
Replace: »\1«
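The following is a small sketch of the (*SKIP)(*F) approach, which relies on PCRE backtracking verbs supported by PHP's preg functions. The input strings are hypothetical and chosen to show what is and is not skipped.

```php
<?php
// First alternative: consume a whole <a...</a> element, then force a failure
// and skip past it. Second alternative: a quoted phrase anywhere else.
$pattern = '/<a.*?<\/a>(*SKIP)(*F)|"([^"]*)"/';

// Hypothetical inputs (not from the original question).
$withLink  = '<p>A "quote" and <a href="u">a "linked" quote</a>.</p>';
$withImage = '<img src="x.png">';

$linkResult  = preg_replace($pattern, '»\1«', $withLink);
$imageResult = preg_replace($pattern, '»\1«', $withImage);

echo $linkResult, "\n";   // <p>A »quote« and <a href="u">a "linked" quote</a>.</p>
echo $imageResult, "\n";  // <img src=»x.png«>
```

As the second result suggests, this pattern only protects <a...</a> elements (including their link text); attribute quotes in other tags are still replaced, so it fits inputs where links are the only tags carrying attributes.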
Reference
How to match (or replace) a pattern except in situations s1, s2, s3...
Solution 4:
Use this regex:
(?<=^|>)[^><]+?(?=<|$)
This matches the runs of text that lie outside HTML tags.
Then apply your quote-replacement regex to each matched string.
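One way to combine the two steps, sketched here under the assumption that preg_replace_callback is an acceptable way to "do your regex on the resultant string", is to replace quotes inside each matched text run. The input string and the inner pattern "([^"]*)" are illustrative choices, not part of the original answer.

```php
<?php
// Hypothetical input (not from the original question).
$html = '<p>A "quote" and a <a href="u">link</a>.</p>';

// Outer pattern from this answer: text that starts at the beginning of the
// string or right after '>', and runs up to the next '<' or the end.
$result = preg_replace_callback(
    '/(?<=^|>)[^><]+?(?=<|$)/',
    function (array $m): string {
        // Inner step: replace quote pairs within this text run only.
        return preg_replace('/"([^"]*)"/', '»\1«', $m[0]);
    },
    $html
);

echo $result, "\n";  // <p>A »quote« and a <a href="u">link</a>.</p>
```

Because the callback never sees the tags themselves, the inner pattern can stay simple.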