Preserved Topic: Replace < stuff > with PHP (Page 1 of 1)

 
WarMage
Maniac (V) Mad Scientist

From: Rochester, New York, USA
Insane since: May 2000

posted 11-06-2001 23:05

How would I write a regular expression that replaces everything between < and >, including the brackets themselves, in HTML docs?

$string = preg_replace("<*>","",$string);

That is my first thought, but I know it is not correct...

How would I do this?

mr.maX
Maniac (V) Mad Scientist

From: Belgrade, Serbia
Insane since: Sep 2000

posted 11-06-2001 23:52

$string = preg_replace("/<(.+?)>/", "", $string);

But if you simply want to remove HTML tags, use the strip_tags() function instead...
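
A minimal sketch comparing the two suggestions above. The sample string and the variable names are made up for illustration:

```php
<?php
// Hypothetical sample input, not taken from the original post.
$string = '<p>Hello <b>world</b></p>';

// Regex approach: remove anything between "<" and ">" (non-greedy).
$viaRegex = preg_replace('/<(.+?)>/', '', $string);

// Built-in approach: strip_tags() removes HTML and PHP tags.
$viaStripTags = strip_tags($string);

echo $viaRegex, "\n";     // Hello world
echo $viaStripTags, "\n"; // Hello world
```

On simple, well-formed markup like this the two give the same result.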


jiblet
Paranoid (IV) Inmate

From: Minneapolis, MN, USA
Insane since: May 2000

posted 11-07-2001 00:37

$string = preg_replace("/<[^>]*>/", "", $string);

would come in handy in cases where there is a greater-than sign in the attributes of the tag. Granted, that should never happen because you are supposed to use &gt; instead, but you never know.
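
A quick sketch for comparing the two patterns posted so far; the inputs are hypothetical. As far as the regexes themselves go, both stop at the first > they see, so neither copes with a literal > inside an attribute value. Where they do differ is on tags that span more than one line, because "." does not match a newline by default:

```php
<?php
// Hypothetical inputs for comparing the two patterns.
$multiLine = "<a\nhref=\"x.html\">link</a>";
$gtInAttr  = '<img alt="a > b">text';

// Tag split across two lines.
$dotMulti   = preg_replace('/<(.+?)>/', '', $multiLine);  // opening tag survives
$classMulti = preg_replace('/<[^>]*>/', '', $multiLine);  // "link"

// Literal ">" inside an attribute value: both cut the tag short.
$dotGt   = preg_replace('/<(.+?)>/', '', $gtInAttr);      // ' b">text'
$classGt = preg_replace('/<[^>]*>/', '', $gtInAttr);      // ' b">text'

var_dump($dotMulti, $classMulti, $dotGt, $classGt);
```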

That raises the question: how does strip_tags() handle it, and would it choke on embedded PHP in a tag like:

<a href="<?php $thaLink ?>">

Does it leave:

">

?

-jiblet

mr.maX
Maniac (V) Mad Scientist

From: Belgrade, Serbia
Insane since: Sep 2000

posted 11-07-2001 00:50

Unfortunately, strip_tags() will leave "> behind. Generally, in order to handle this correctly, you would have to write a mini HTML parser (one that knows the difference between an HTML tag name and its parameters)...
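
A rough sketch of what such a mini parser could look like, assuming PHP 7 or later. The function name stripTagsQuoteAware is hypothetical. It only tracks whether it is inside a tag and inside a quoted attribute value. It does not handle comments, script blocks, or a bare < in ordinary text, so treat it as an illustration rather than a complete solution:

```php
<?php
// Hypothetical helper: drops everything from "<" to the matching ">",
// ignoring any ">" that appears inside a quoted attribute value.
function stripTagsQuoteAware(string $html): string
{
    $out   = '';
    $inTag = false; // currently between "<" and ">"
    $quote = '';    // active quote character inside a tag, or '' if none
    $len   = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $ch = $html[$i];
        if (!$inTag) {
            if ($ch === '<') {
                $inTag = true;  // start of a tag
            } else {
                $out .= $ch;    // ordinary text is kept
            }
        } elseif ($quote !== '') {
            if ($ch === $quote) {
                $quote = '';    // closing quote ends the attribute value
            }
        } elseif ($ch === '"' || $ch === "'") {
            $quote = $ch;       // opening quote of an attribute value
        } elseif ($ch === '>') {
            $inTag = false;     // real end of the tag
        }
    }
    return $out;
}

echo stripTagsQuoteAware('<a href="<?php $thaLink ?>">link</a>'), "\n"; // link
```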


WarMage
Maniac (V) Mad Scientist

From: Rochester, New York, USA
Insane since: May 2000

posted 11-07-2001 03:08

Thanks all.

I needed this because I am constructing an indexing search engine that works with database information.

The database stores HTML, and I want to remove all of the markup, so that something like <li>Information</li>

gets indexed as Information rather than as <li>Information</li>
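
A minimal sketch of that indexing step, assuming the stored HTML is already in a PHP string. The sample markup and the variable names are made up, and the split into lowercase ASCII terms is just one simple choice:

```php
<?php
// Hypothetical row content; in practice this would come from the database.
$html = '<ul><li>Information</li><li>More information</li></ul>';

// Put a space in front of every tag so words in adjacent elements do not fuse.
$text = strip_tags(str_replace('<', ' <', $html));

// Lowercase, then split on anything that is not a letter or digit.
$terms = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

print_r($terms); // information, more, information
```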

Thanks again.
