
Topic awaiting preservation: Don't Replace a String If It's Inside a Tag (PHP)

 
Author Thread
Wes
Paranoid (IV) Mad Scientist

From: Inside THE BOX
Insane since: May 2000

posted 12-03-2007 03:53

I've developed a fairly simple PHP module for my new site that allows me to input strings to be replaced with other strings. It's an easy way to add links to phrases or replace certain terms dynamically. (There's a bit more to it, but that's the basic idea.)

Basically, I've got an array of phrases and an array of replacements, and the block of text that will be affected.

What I need to do is make sure that if a phrase appears anywhere inside an HTML tag, it is not replaced.

For example:
OKAY OKAY OKAY <img src="file" alt="DONT_TOUCH"> OKAY <a href="file>DONT_TOUCH</a> OKAY OKAY <div>DONT_TOUCH</div>.

Right now, everything ends up at preg_replace($search_array, $replace_array, $text) for the final replacement.

The best thing I can come up with so far is to perform my regular preg_replace, then repeat it, but the second time look for any replacements that fell inside tags, and switch them back. But that seems a little inefficient.

I bet the answer is simple and I'm just thinking backwards. Anyone have an idea?

By the way, if my solution really is the best way to do it and someone already knows the proper regex for the second search, it would probably save me two hours of trial and error.

Tyberius Prime
Maniac (V) Mad Scientist with Finglongers

From: Germany
Insane since: Sep 2001

posted 12-03-2007 08:52

Your solution is pretty much optimal, given the constraint of not using a real parser.
I.e., you can either regex the tags out first (making use of the fact that > can't occur
within an HTML tag), replace your text, and place the tags back in, or do it your way.

The real way would of course be a parser: manipulate only the text nodes, then
put it all back together.
But it might hardly be worth the effort.

So long,

->Tyberius Prime
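
The "take the tags out, replace, put the tags back" idea above could be sketched roughly as follows. This is only an illustration, not code from the thread: the function name replace_outside_tags is hypothetical, it uses plain str_replace instead of Wes's preg_replace arrays, and it assumes a literal > never appears inside an attribute value. It also only protects the markup between < and >, not the text between an opening and a closing tag.

```php
<?php
// Hypothetical helper: replace phrases only in text that lies outside <...> tags.
function replace_outside_tags(array $find, array $replace, string $html): string
{
    // Split into alternating text and tag pieces; the capture group keeps the tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // With a single capture group, even indexes are text and odd indexes are tags.
        if ($i % 2 === 0) {
            $parts[$i] = str_replace($find, $replace, $part);
        }
    }

    // Put everything back together in the original order.
    return implode('', $parts);
}

$text = 'OKAY <img src="file" alt="OKAY"> OKAY <a href="file">link</a>';
echo replace_outside_tags(['OKAY'], ['DONE'], $text);
// Output: DONE <img src="file" alt="OKAY"> DONE <a href="file">link</a>
```

Splitting once and working on the pieces avoids the second "switch them back" pass, so text that already contained the replacement string inside a tag is never touched.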

hyperbole
Paranoid (IV) Inmate

From: Madison, Indiana
Insane since: Aug 2000

posted 12-03-2007 19:32

What you're suggesting is something like

code:
$new_string1 = preg_replace('/source/', 'target', $string);
$new_string2 = preg_replace('/(<[^>]*)target([^>]*>)/', '$1source$2', $new_string1);



The problem with this method shows up when the original string already contains "... < ... target ... > ...". The second replacement will change 'target' inside the tag to 'source' even though it was never 'source' in the first place.

I think I would try something like

code:
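// Capture each run of text ($part[1]) together with the tag that follows it ($part[2]).
preg_match_all('/([^<]*)(<[^>]*>)/', $string, $matches, PREG_SET_ORDER);
$result_string = '';
foreach ($matches as $part) {
   // Replace only in the text, then append the tag unchanged.
   $result_string .= preg_replace('/source/', 'target', $part[1]);
   $result_string .= $part[2];
}
// Handle the text after the last tag, or the whole string if it contains no tags.
if (preg_match('/(?:^|>)([^<>]+)$/', $string, $matches)) {
   $result_string .= preg_replace('/source/', 'target', $matches[1]);
}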



I haven't tested this, but I think it should work.



-- not necessarily stoned... just beautiful.

zavaboy
Paranoid (IV) Inmate

From: f(x)
Insane since: Jun 2004

posted 12-06-2007 08:41
code:
preg_replace('#[^>]+(?=<[^/])#i','foo',$string)


If Wes's example* is $string, this would return:

code:
foo<img src="file" alt="DONT_TOUCH">foo<a href="file">DONT_TOUCH</a>foo<div>DONT_TOUCH</div>


Hope this helps you out!

* with closing quote on href attribute

EDIT: Please note that it will only work if at least one HTML tag is within $string.

2ND EDIT: Additionally, it will not match anything after the last HTML tag, so here's the new one that matches after the last HTML tag or everything if there are no HTML tags within $string:

code:
preg_replace('#[^>]+(?=<[^/])|[^>]+$#','foo',$string)





(Edited by zavaboy on 12-06-2007 08:58)

(Edited by zavaboy on 12-06-2007 09:13)

hyperbole
Paranoid (IV) Inmate

From: Madison, Indiana
Insane since: Aug 2000

posted 12-06-2007 22:17

I may have misunderstood what he's asking for, but if the string contains

code:
$string = 'stuff and nonsense. stuff to replace. more stuff<tag alt="stuff not to touch">';



zavaboy's example will produce

code:
'foo<tag alt="stuff not to touch">'



and what he wanted was

code:
'stuff and nonsense. foo. more stuff<tag alt="stuff not to touch">'



-- not necessarily stoned... just beautiful.

zavaboy
Paranoid (IV) Inmate

From: f(x)
Insane since: Jun 2004

posted 12-07-2007 00:24

I sorta meant for that to be more of an example that could be used towards the solution. A solution like this:

code:
<?php
$string = 'stuff and nonsense. stuff to replace. more stuff<tag alt="stuff not to touch"> foo.';
$find = array('foo','stuff to replace');
$replace = array('bar','replaced');

preg_match_all('#[^>]+(?=<[^/])|[^>]+$#', $string, $matches, PREG_SET_ORDER);

foreach ($matches as $val) {
    $string = str_replace($val[0],str_replace($find,$replace,$val[0]),$string);
}

echo $string;
// Output: stuff and nonsense. replaced. more stuff<tag alt="stuff not to touch"> bar.
?>
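
One caveat with the loop above: str_replace($val[0], ...) searches the whole of $string again, so if a matched piece of text also happens to appear inside a tag, that copy gets rewritten too. A possible variant, offered as a sketch rather than a tested drop-in, is to let preg_replace_callback rewrite each match in place. It reuses the same pattern and the same sample data; only the $result variable is new.

```php
<?php
$string = 'stuff and nonsense. stuff to replace. more stuff<tag alt="stuff not to touch"> foo.';
$find = array('foo', 'stuff to replace');
$replace = array('bar', 'replaced');

// Same pattern as above: text that precedes an opening tag, or text after the last tag.
$result = preg_replace_callback(
    '#[^>]+(?=<[^/])|[^>]+$#',
    function (array $m) use ($find, $replace) {
        // $m[0] is one matched run of text; only that run is rewritten.
        return str_replace($find, $replace, $m[0]);
    },
    $string
);

echo $result;
// Output: stuff and nonsense. replaced. more stuff<tag alt="stuff not to touch"> bar.
```

Because the callback receives each match at its own position, nothing outside the matched text is ever passed to str_replace.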



Matt888
Neurotic (0) Inmate
Newly admitted

From:
Insane since: Jan 2009

posted 01-06-2009 23:21

I want to use this code but I only want to replace the first occurrence of $find in $string. How would I do this?

(Sorry to resurrect such an old post.)
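
The thread has no answer to this, so the following is only one possible approach, not a confirmed solution. It is a sketch under assumptions: the function name replace_first_outside_tags is hypothetical, "outside a tag" is taken to mean "not between < and >", a literal > is assumed never to appear inside an attribute value, and the phrases are assumed to be non-empty. For each phrase it walks the text pieces in order and stops at the first hit.

```php
<?php
// Hypothetical helper: replace only the first occurrence of each phrase
// that lies outside <...> tags.
function replace_first_outside_tags(array $find, array $replace, string $html): string
{
    // Even indexes are text, odd indexes are the captured tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($find as $k => $phrase) {
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                continue; // skip tags
            }
            $pos = strpos($part, $phrase);
            if ($pos !== false) {
                // Replace this single occurrence, then move on to the next phrase.
                $parts[$i] = substr_replace($part, $replace[$k], $pos, strlen($phrase));
                break;
            }
        }
    }

    return implode('', $parts);
}

echo replace_first_outside_tags(['foo'], ['bar'], 'foo <b title="foo">foo</b> foo');
// Output: bar <b title="foo">foo</b> foo
```

Note that phrases are processed one after another, so a later phrase can match text produced by an earlier replacement.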
