
Topic awaiting preservation: Complex Regular Expression (Page 1 of 1)

 
WarMage
Maniac (V) Mad Scientist

From: Rochester, New York, USA
Insane since: May 2000

posted 12-22-2005 20:46

I have a file that has two types of code blocks. The ids are unique per widget, so no two ft:widget elements will have the same id. There will also be a good deal of HTML thrown into the mix.

code:
<ft:widget-label id="id1"></ft:widget-label>
<ft:widget id="id1"></ft:widget>



code:
<ft:widget-label id="id1"></ft:widget-label>
<ft:widget id="id2"></ft:widget>
<ft:widget id="id1"></ft:widget>
<input onclick="grabme"/>



I want to be able to pull out the ids and the "grabme" sections only from blocks that are in the second format, while ignoring the first.

So for a bit of an example I might have a codeblock that looks like this.

code:
<tr>
      <td ><label for="id1"><ft:widget-label id="id1"/></label></td>
      <td>$&#160;<ft:widget id="id1"><fi:styling class="rightIndent" /></ft:widget></td>
      <td ><label for="id2"><ft:widget-label id="id2"/></label></td>
      <td>&#160;&#160;&#160;<ft:widget id="id3"><fi:styling type="hidden"/></ft:widget><ft:widget id="id2"><fi:styling/></ft:widget>
        <input type="button" onclick="value=2" value="..."/>
      </td>
    </tr>



When I run my regular expression I want to get

id2, id3 and value=2 from this block of code.

I have a regular expression like this:

code:
pattern = '<ft:widget-label id="([^"]+)".*?<ft:widget id="([^"]+)".*?onclick="([^"]+)"'



The problem is that this regular expression will pull back id1, id1, value=2.

The pattern hits the first label, then the first widget, and then skips over the next label and widgets to grab the value.
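To make the failure concrete, here is a minimal sketch that runs the pattern against the sample block. It assumes Python's `re` module and the `re.DOTALL` flag (so `.` can cross line breaks); the names `SAMPLE` and `NAIVE` are made up for illustration and are not from the original post.

```python
import re

# Sample block copied from the post above.
SAMPLE = '''<tr>
      <td ><label for="id1"><ft:widget-label id="id1"/></label></td>
      <td>$&#160;<ft:widget id="id1"><fi:styling class="rightIndent" /></ft:widget></td>
      <td ><label for="id2"><ft:widget-label id="id2"/></label></td>
      <td>&#160;&#160;&#160;<ft:widget id="id3"><fi:styling type="hidden"/></ft:widget><ft:widget id="id2"><fi:styling/></ft:widget>
        <input type="button" onclick="value=2" value="..."/>
      </td>
    </tr>'''

# The pattern from the post; each lazy .*? is free to run past other labels.
NAIVE = re.compile(
    r'<ft:widget-label id="([^"]+)".*?'
    r'<ft:widget id="([^"]+)".*?'
    r'onclick="([^"]+)"',
    re.DOTALL,
)

match = NAIVE.search(SAMPLE)
result = match.groups()
print(result)  # ('id1', 'id1', 'value=2')
```

The match starts at the first label because the regex engine always prefers the leftmost starting position, and nothing in the pattern stops the second `.*?` from walking across the second label on its way to `onclick`.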

It is close, but it is very wrong. I am pretty sure that I need some kind of look-ahead assertion but I am not sure how to correctly implement this. Any ideas?

Thanks,

Dan @ Code Town

WarMage
Maniac (V) Mad Scientist

From: Rochester, New York, USA
Insane since: May 2000

posted 12-22-2005 21:00

I think that I can simplify this by saying I want to find the pattern

<ft:widget-label then <ft:widget followed by the value without having an <ft:widget-label following the <ft:widget.
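One way this restated rule could be written, as a sketch only, is to replace each `.*?` with a gap that refuses to step over another `<ft:widget-label` (a negative look-ahead applied at every character). This is not necessarily the fix that was eventually used in this thread; it assumes Python's `re` module with `re.DOTALL`, and the names `SAMPLE`, `NO_LABEL_GAP` and `TEMPERED` are hypothetical.

```python
import re

# Sample block copied from the first post.
SAMPLE = '''<tr>
      <td ><label for="id1"><ft:widget-label id="id1"/></label></td>
      <td>$&#160;<ft:widget id="id1"><fi:styling class="rightIndent" /></ft:widget></td>
      <td ><label for="id2"><ft:widget-label id="id2"/></label></td>
      <td>&#160;&#160;&#160;<ft:widget id="id3"><fi:styling type="hidden"/></ft:widget><ft:widget id="id2"><fi:styling/></ft:widget>
        <input type="button" onclick="value=2" value="..."/>
      </td>
    </tr>'''

# A lazy gap that may consume any character unless a new label starts there.
NO_LABEL_GAP = r'(?:(?!<ft:widget-label).)*?'

TEMPERED = re.compile(
    r'<ft:widget-label id="([^"]+)"' + NO_LABEL_GAP +
    r'<ft:widget id="([^"]+)"' + NO_LABEL_GAP +
    r'onclick="([^"]+)"',
    re.DOTALL,
)

match = TEMPERED.search(SAMPLE)
result = match.groups()
print(result)  # ('id2', 'id3', 'value=2')
```

Starting from the first label, the gap cannot cross the second label to reach `onclick`, so that attempt fails and the engine retries from the second label, which yields the wanted values.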

Dan @ Code Town

WarMage
Maniac (V) Mad Scientist

From: Rochester, New York, USA
Insane since: May 2000

posted 12-22-2005 22:15

I narrowed down what I was looking for by adding some more information to the search, and I was able to pull out the information I needed.

Dan @ Code Town

Tyberius Prime
Paranoid (IV) Mad Scientist with Finglongers

From: Germany
Insane since: Sep 2001

posted 12-22-2005 22:34

very well - me head wasn't up to that tonight

hyperbole
Paranoid (IV) Inmate

From: Madison, Indiana, USA
Insane since: Aug 2000

posted 12-23-2005 19:21

I'm glad you fixed it. I guess I'm a bit late with this response, but I had some thoughts in case you run into a similar situation in the future.

I would approach this in one of two ways:

Assuming the content is in a variable called $content

1)

code:
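# Remove everything up to and including the start of the block of interest.
$content =~ s/^.*?<td ><label for="id2">//s;
# Remove everything from the closing table tag onward.
$content =~ s/<\/table>.*$//s;

# Collect every captured group from the remaining text.
@array = ($content =~ m/$pattern/gs);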



The first two lines remove everything except the part you're interested in; you can then create an array of the values with your pattern.
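For comparison with the Python-style pattern in the original question, the same trim-then-match idea might look like the sketch below. The function name `trim_then_match` and its parameters are hypothetical, and like the Perl version it assumes you already know a marker string that precedes the block you want.

```python
import re

# Sample block copied from the first post.
SAMPLE = '''<tr>
      <td ><label for="id1"><ft:widget-label id="id1"/></label></td>
      <td>$&#160;<ft:widget id="id1"><fi:styling class="rightIndent" /></ft:widget></td>
      <td ><label for="id2"><ft:widget-label id="id2"/></label></td>
      <td>&#160;&#160;&#160;<ft:widget id="id3"><fi:styling type="hidden"/></ft:widget><ft:widget id="id2"><fi:styling/></ft:widget>
        <input type="button" onclick="value=2" value="..."/>
      </td>
    </tr>'''

# The pattern from the question, with its closing parenthesis restored.
PATTERN = re.compile(
    r'<ft:widget-label id="([^"]+)".*?<ft:widget id="([^"]+)".*?onclick="([^"]+)"',
    re.DOTALL,
)

def trim_then_match(content, start_marker, end_marker="</table>"):
    """Cut away text before start_marker and after end_marker, then match."""
    start = content.find(start_marker)
    if start != -1:
        # Drop everything up to and including the marker.
        content = content[start + len(start_marker):]
    end = content.find(end_marker)
    if end != -1:
        # Drop the end marker and everything after it.
        content = content[:end]
    return PATTERN.findall(content)

print(trim_then_match(SAMPLE, '<td ><label for="id2">'))
# [('id2', 'id3', 'value=2')]
```

The obvious limitation is that the marker hard-codes `id2`, so this only helps when the interesting block can be located some other way first.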

My second approach would be to use your pattern to grab all the values from $content, including the ones you don't want, then use grep or a for loop to filter those out.
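A possible variation on this grab-everything-then-filter idea, sketched in Python: because a single match of the original pattern consumes the text all the way to `onclick`, the sketch collects the individual tags first and does the filtering in a loop. The names `TOKEN` and `blocks_with_onclick` are invented for this example.

```python
import re

# Sample block copied from the first post.
SAMPLE = '''<tr>
      <td ><label for="id1"><ft:widget-label id="id1"/></label></td>
      <td>$&#160;<ft:widget id="id1"><fi:styling class="rightIndent" /></ft:widget></td>
      <td ><label for="id2"><ft:widget-label id="id2"/></label></td>
      <td>&#160;&#160;&#160;<ft:widget id="id3"><fi:styling type="hidden"/></ft:widget><ft:widget id="id2"><fi:styling/></ft:widget>
        <input type="button" onclick="value=2" value="..."/>
      </td>
    </tr>'''

# Match any one of the three interesting pieces; exactly one named group is set.
TOKEN = re.compile(
    r'<ft:widget-label id="(?P<label>[^"]+)"'
    r'|<ft:widget id="(?P<widget>[^"]+)"'
    r'|onclick="(?P<onclick>[^"]+)"'
)

def blocks_with_onclick(content):
    """Group widgets and onclick values under the label that precedes them,
    then keep only the groups that actually contain an onclick."""
    blocks = []
    current = None
    for m in TOKEN.finditer(content):
        kind = m.lastgroup
        if kind == "label":
            # A new label starts a new block.
            current = {"label": m.group("label"), "widgets": [], "onclick": None}
            blocks.append(current)
        elif current is None:
            continue  # widget or onclick seen before any label: ignore it
        elif kind == "widget":
            current["widgets"].append(m.group("widget"))
        elif current["onclick"] is None:
            current["onclick"] = m.group("onclick")
    # The filtering step: drop blocks that never saw an onclick.
    return [b for b in blocks if b["onclick"] is not None]

print(blocks_with_onclick(SAMPLE))
# [{'label': 'id2', 'widgets': ['id3', 'id2'], 'onclick': 'value=2'}]
```

Keeping the regex simple and moving the "which block is this" decision into ordinary code tends to be easier to debug than one large pattern, at the cost of a few more lines.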




-- not necessarily stoned... just beautiful.
