Topic awaiting preservation: Playing with RegEx in PHP

 
u-neek
Bipolar (III) Inmate

From: Berlin, Germany
Insane since: Jan 2001

posted 04-11-2004 20:17

This is my current RegEx:

$text = preg_replace('/^(\#{1,6})[\s]+(.+?)\1?\n+/exmU', "'<h'.strlen('\\1').'>\\2</h'.strlen('\\1').'>'", $text);

What it does:

transforms

## Header
into
<h2>Header</h2>
### Header
into:
<h3>Header</h3>

I also want this string to be replaced:

#### Header ####
into:
<h4>Header</h4>

But it doesn't work: the last four hashes are not removed. I think something is wrong with this part of the pattern: (.+?)\1?\n+

Any idea?

Thanks.

Tyberius Prime
Paranoid (IV) Mad Scientist with Finglongers

From: Germany
Insane since: Sep 2001

posted 04-11-2004 20:44

Well... why would you have the char 0x1 in there? What's that \1 supposed to do anyhow?
I guess it's a back reference.

Do you really need to escape the #?

And wouldn't it be easier to just use six simpler regexps?

(Edited by Tyberius Prime on 04-11-2004 11:48)

u-neek
Bipolar (III) Inmate

From: Berlin, Germany
Insane since: Jan 2001

posted 04-11-2004 20:53

The \1 matches the contents of this (\#{1,6}) group. The closing hashes are optional. That's why I placed a ? after \1.

edit: Where do I have the char 0x1?

(Edited by u-neek on 04-11-2004 12:02)

Tyberius Prime
Paranoid (IV) Mad Scientist with Finglongers

From: Germany
Insane since: Sep 2001

posted 04-11-2004 21:13

Oh, I see.
Hm... but wouldn't that mean zero or the number of hashes found at \1?

Always thought the general rule of thumb is that regexps don't count, though I can see how this could still be described by a regular grammar. Easiest way would be using a callback, though.
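A callback version could look roughly like the sketch below. It assumes PHP 7 or later, where the /e modifier is no longer available (it was deprecated in PHP 5.5 and removed in 7.0). The function name headings_to_html is made up for illustration, and the pattern is a simplified variant rather than the one posted in this thread.

```php
<?php
// Hypothetical helper (not from the thread): turns "## Title" and
// "## Title ##" lines into <hN> elements via preg_replace_callback.
function headings_to_html(string $text): string
{
    return preg_replace_callback(
        // m: ^ and $ match at every line. Group 1 = 1-6 hashes, group 2 = title.
        // \1? optionally consumes a closing run identical to the opening run.
        '/^(#{1,6})[ \t]+(.+?)[ \t]*\1?[ \t]*$/m',
        function (array $m): string {
            $level = strlen($m[1]); // number of hashes = heading level
            return "<h$level>{$m[2]}</h$level>";
        },
        $text
    );
}

echo headings_to_html("## Header\n#### Header ####\n");
// prints:
// <h2>Header</h2>
// <h4>Header</h4>
```

The callback receives the match array, so the heading level can be computed in ordinary PHP instead of inside an evaluated replacement string.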

(I thought \1 would be equal to \x01 - but it isn't)

u-neek
Bipolar (III) Inmate

From: Berlin, Germany
Insane since: Jan 2001

posted 04-12-2004 02:22

It means that either exactly the same string or nothing (because of the ?) is matched there. The string is #, ##, ... or ######, not the number of hashes. I get the number of hashes with "strlen('\\1')" in the replacement string (with the help of the /e modifier at the end of the pattern).
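To make the backreference behaviour concrete, here is a small sketch. The pattern and the test strings are made up for illustration and are simpler than the pattern discussed in this thread.

```php
<?php
// \1 must repeat the exact text captured by group 1, not merely "some hashes".
$pattern = '/^(#{1,6}) (\w+) \1$/';

// Closing run is identical to the opening run, so the pattern matches.
$same = preg_match($pattern, '## Title ##', $m);

// Three closing hashes cannot equal the captured "##", so there is no match.
$different = preg_match($pattern, '## Title ###');

var_dump($same);          // int(1)
var_dump($different);     // int(0)
var_dump(strlen($m[1]));  // int(2), the heading level
```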

Any other help?

Veneficuz
Paranoid (IV) Inmate

From: A graveyard of dreams
Insane since: Mar 2001

posted 04-12-2004 17:39

I'm not at a computer where I can test this, but I think the following change would work:
'/^(\#{1,6})[\s]+(.+?)(\1?\n*)\n/exmU'
The change I made was to add parentheses around the last part, (\1?\n*)\n. That makes the regex engine grab that part as well, so it disappears when the replacement is done. I also changed the \n+ to (...\n*)\n. The reason is that this also grabs (and replaces) all \n characters except the last, leaving a single newline after the heading.

What does .+? mean? Haven't seen that used before...
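For reference, .+? is the lazy (non-greedy) form of .+, meaning it matches as few characters as possible. The U modifier swaps the defaults, so under U a plain .+ is lazy and .+? becomes greedy. That may also be why the closing hashes survive in the pattern from the first post, which combines .+? with U. A small sketch, with patterns made up for illustration:

```php
<?php
$line = '#### Header ####';

// Greedy: .+ takes as much as it can, so the closing hashes end up in group 1.
preg_match('/^#+ (.+)#*$/', $line, $greedy);

// Lazy: .+? takes as little as it can, leaving the closing hashes to #*.
preg_match('/^#+ (.+?) ?#*$/', $line, $lazy);

// U swaps the defaults: here .+? is greedy again and swallows the hashes.
preg_match('/^#+ (.+?) ?#*$/U', $line, $swapped);

var_dump($greedy[1]);   // string(11) "Header ####"
var_dump($lazy[1]);     // string(6) "Header"
var_dump($swapped[1]);  // string(11) "Header ####"
```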

_________________________
"There are 10 kinds of people; those who know binary, those who don't and those who start counting at zero"
- the Golden Ratio -

u-neek
Bipolar (III) Inmate

From: Berlin, Germany
Insane since: Jan 2001

posted 04-12-2004 19:04

Thanks,
that .+? was wrong. .+ is enough.

My final pattern:
$text = preg_replace('/^(\#{1,6})[\s]+(.+)[\s]+\1?\n+/emU', "'<h'.strlen('\\1').'>\\2</h'.strlen('\\1').'>\n\n'", $text);

I need the first parentheses for the \1. Leaving them out would mean that \1 refers to the string between my hashes (## string ##).

Thank you for your help.
