Topic: [regex] In search of... (Page 1 of 1)

 
u-neek
Bipolar (III) Inmate

From: Berlin, Germany
Insane since: Jan 2001

posted 03-31-2006 22:25

...a regular expression to validate a URI.

Any ideas?

u-neek

posted 04-01-2006 10:50

The first part:

code:
[a-zA-Z][a-zA-Z0-9+.\-]*			# scheme
:\/\/
((([a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9]\.)*	# domainlabel
[a-zA-Z][a-zA-Z][a-zA-Z0-9-]*[a-zA-Z0-9]	# toplabel
\.?)|						# additional period
([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+))		# IPv4 address
(:[0-9]+)?					# port



I'm using RFC 2396 for reference: http://www.ietf.org/rfc/rfc2396.txt

Any comments? Could this be done better?

(Edited by u-neek on 04-01-2006 10:55)
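
For anyone who wants to try this first part out, here is a minimal sketch in Python (an assumption; the thread does not name a language). The names `URI_PREFIX` and `matches_prefix` are made up for illustration. It uses `re.VERBOSE` so the inline comments can stay in the pattern, and it follows the RFC 2396 grammar for `domainlabel` and `toplabel`, which allows single-character labels.

```python
import re

# Hypothetical name. Covers scheme, "://", host (name or IPv4) and optional port.
URI_PREFIX = re.compile(
    r"""
    [a-zA-Z][a-zA-Z0-9+.\-]*                              # scheme
    ://
    (?:
        (?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\.)*   # domainlabel
        [a-zA-Z](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?             # toplabel
        \.?                                               # additional period
      |
        [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+                    # IPv4 address
    )
    (?::[0-9]+)?                                          # port
    """,
    re.VERBOSE,
)


def matches_prefix(text: str) -> bool:
    """Return True if the whole string is scheme://host[:port]."""
    return URI_PREFIX.fullmatch(text) is not None


print(matches_prefix("http://www.ietf.org"))    # True
print(matches_prefix("http://127.0.0.1:8080"))  # True
print(matches_prefix("1http://example.com"))    # False, scheme must start with a letter
```

`fullmatch` is used so that trailing garbage is rejected; with `match` the pattern would accept any string that merely starts with a valid prefix.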

u-neek

posted 04-01-2006 11:28

Ok, here is the full code:

code:
[a-zA-Z][a-zA-Z0-9+.\-]*			# scheme
:\/\/						#
((([a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9]\.)*	# domainlabel
[a-zA-Z][a-zA-Z0-9-]*[a-zA-Z0-9]		# toplabel
\.?)|						# additional period
([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+))		# IPv4 address
(:[0-9]+)?					# port
/						#
([a-zA-Z0-9-_\.!~*'\(\)]			# unreserved
|%[0-9a-fA-F]{2}				# escaped
|[:@&=\+\$;])*					#
(/([a-zA-Z0-9-_\.!~*'\(\)]			# unreserved
|%[0-9a-fA-F]{2}				# escaped
|[:@&=\+\$;])*)*				#
(\?						# query:
([;\/\?:@&=\+\$,]				# reserved
|[a-zA-Z0-9-_\.!~*'\(\)]			# unreserved
|%[0-9a-fA-F]{2})*)*				# escaped



Fixed a bug in the toplabel part.

Relative URIs are not covered yet.

Comments?

(Edited by u-neek on 04-01-2006 11:31)
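
A note on the character classes used in expressions like this one: inside `[...]`, a hyphen sitting between two members is read as a range, even when its neighbours are escaped. The small Python check below is only an illustration (the names `RANGE_FORM`, `LITERAL_FORM` and `accepted` are made up). It shows that `[\+-\.]` spans everything from `+` to `.` and so also admits a comma, while an escaped hyphen placed last stays literal.

```python
import re

# Hyphen between two class members: parsed as the range '+' (0x2B) .. '.' (0x2E),
# which covers '+', ',', '-' and '.'.
RANGE_FORM = re.compile(r"[\+-\.]")

# Hyphen escaped and placed last: exactly '+', '.' and '-'.
LITERAL_FORM = re.compile(r"[+.\-]")


def accepted(pattern, chars):
    """Return the characters from `chars` that the class accepts."""
    return "".join(c for c in chars if pattern.fullmatch(c))


print(accepted(RANGE_FORM, "+,-./"))    # +,-.
print(accepted(LITERAL_FORM, "+,-./"))  # +-.
```

For the scheme part this matters because RFC 2396 allows only letters, digits, `+`, `-` and `.` there, not a comma.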

u-neek

posted 04-01-2006 12:59

Fixed some bugs, cleaned up, and added userinfo:

code:
[a-zA-Z][a-zA-Z\d\+\-\.]*:\/\/			# scheme + ://
(([\w\-\.!~'\*\(\);:&=\+\$,]|%[\da-fA-F]{2})+@)?	# userinfo
((([a-zA-Z\d]([a-zA-Z\d\-]*[a-zA-Z\d])?\.)*	# domainlabel (one character is enough)
[a-zA-Z]([a-zA-Z\d\-]*[a-zA-Z\d])?\.?)		# toplabel (one character is enough)
|([\d]+\.[\d]+\.[\d]+\.[\d]+))			# IPv4 address
(:[\d]+)?					# port
/						# path:
([\w\-\.!~\*'\(\):@&=\+\$,;]			# unreserved, :@&=+$,;
|%[\da-fA-F]{2})*				# escaped
(/([\w\-\.!~\*'\(\):@&=\+\$,;]			# unreserved, :@&=+$,;
|%[\da-fA-F]{2})*)*				# escaped
(\?						# query:
([;\/\?:@&=\+\$,]				# reserved
|[\w\-\.!~\*'\(\)]				# unreserved
|%[\da-fA-F]{2})*)?				# escaped



(Edited by u-neek on 04-01-2006 13:02)
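
To test the full expression against real strings, a sketch along the following lines might help. It is written in Python as an assumption (the thread does not name a language), and `ABSOLUTE_URI` and `is_absolute_uri` are hypothetical names. A few choices are made explicit here and should be treated as assumptions rather than as the posted expression: `re.ASCII` keeps `\w` and `\d` to ASCII, the path is optional (RFC 2396 defines `net_path` as `"//" authority [ abs_path ]`), the query may appear at most once, and fragments and relative URIs are not handled.

```python
import re

# Hypothetical name. Absolute URIs of the form scheme://[userinfo@]host[:port][path][?query].
ABSOLUTE_URI = re.compile(
    r"""
    [a-zA-Z][a-zA-Z\d+.\-]*://                              # scheme + ://
    (?:(?:[\w\-.!~*'();:&=+$,]|%[\da-fA-F]{2})+@)?          # userinfo
    (?:
        (?:[a-zA-Z\d](?:[a-zA-Z\d\-]*[a-zA-Z\d])?\.)*       # domainlabel
        [a-zA-Z](?:[a-zA-Z\d\-]*[a-zA-Z\d])?\.?             # toplabel
      | \d+\.\d+\.\d+\.\d+                                  # IPv4 address
    )
    (?::\d+)?                                               # port
    (?:/(?:[\w\-.!~*'():@&=+$,;]|%[\da-fA-F]{2})*)*         # path segments
    (?:\?(?:[\w\-.!~*'();/?:@&=+$,]|%[\da-fA-F]{2})*)?      # query
    """,
    re.VERBOSE | re.ASCII,
)


def is_absolute_uri(text: str) -> bool:
    """Return True if the whole string matches the sketch above."""
    return ABSOLUTE_URI.fullmatch(text) is not None


for candidate in (
    "http://www.ietf.org/rfc/rfc2396.txt",                # True
    "ftp://user:pw@x.com:21/pub/file%20name.txt?type=a",  # True
    "http://example.com/a b",                             # False, raw space
    "http://example.com/%zz",                             # False, bad escape
):
    print(candidate, is_absolute_uri(candidate))
```

Non-capturing groups `(?:...)` are used throughout because only a yes/no answer is needed; switching selected groups to capturing ones would let the same pattern pull out the host or port.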

DmS
Maniac (V) Inmate

From: Sthlm, Sweden
Insane since: Oct 2000

posted 04-01-2006 22:54

Someone once said: "Hey, I can solve this problem with regular expressions!" This man now has two problems...

/D



