#1 2010-03-26 08:41

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446
Website

Question about a strange thing on RegEx

Today I saw this strange thing:

Exp: (.*)
Rep: $1#

Input:   example
Output: example##

Why is the # added 2 times? With (.*)$ it keeps doing the same, but not with ^(.*)


Is there any logical explanation for this?


- - - UPDATE - - -

I changed the order and I get this:

Exp: (.*)
Rep: #$1

Input:   example
Output: #example#



It happens both in the Analyze text window and in the file list.
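For comparison, the same observations can be reproduced outside ReNamer. This is only a sketch using Python's `re` module (assuming Python 3.7 or later), not ReNamer's own engine, so it shows that another engine gives the same output rather than what ReNamer does internally. Python writes the group reference as `\1` where ReNamer uses `$1`.

```python
import re

text = "example"

# Unanchored pattern, replacement "$1#" from the first test
doubled = re.sub(r"(.*)", r"\1#", text)

# Same pattern, replacement order changed to "#$1"
swapped = re.sub(r"(.*)", r"#\1", text)

# Anchored at the end: still replaced twice
end_anchored = re.sub(r"(.*)$", r"\1#", text)

# Anchored at the start: replaced only once
start_anchored = re.sub(r"^(.*)", r"\1#", text)

print(doubled)         # example##
print(swapped)         # #example#
print(end_anchored)    # example##
print(start_anchored)  # example#
```

In Python, at least, the extra `#` comes from a second, empty match found at the very end of the input after the first match has consumed "example".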

Last edited by SafetyCar (2010-03-26 08:58)


If this software has helped you, consider getting your pro version. :)


#2 2010-03-26 11:14

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161

Re: Question about a strange thing on RegEx

This seems like a bug in the regex implementation, related to greedy/non-greedy matching.
To help Denis if he wants to track this down, I have tested this behaviour
with build "5.50 (3 May 2009)" and got the same result as SafetyCar.

# is a placeholder for any kind of string (like 'P' or 'New-', it doesn't matter).
The same happens when using '$0'.
The workaround is to put a leading ^ before the expression, as SafetyCar already mentioned.


While I was at it, I found more:
(.*)(.*)
$1 New

(.*?)(.*)
$1 New

(.*?)(.*)
$1 New $2

Ahh, this works:
(x.*)
$1#

or
(.*x)
$1#
where 'x' represents the first/last char of the string. Now it seems that there is a problem with the dot?
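One way to probe why the patterns containing a literal character behave differently is to ask whether each pattern is able to match an empty string at all. The sketch below uses Python's `re` module, not ReNamer's engine; the helper name `can_match_empty` is made up for illustration, and 'e' stands in for 'x' as the first/last char of "example".

```python
import re

def can_match_empty(pattern):
    """Return True if the pattern accepts a zero-length string."""
    return re.fullmatch(pattern, "") is not None

# The expressions tried above, with 'e' in place of 'x'
patterns = [r"(.*)", r"(.*)(.*)", r"(.*?)(.*)", r"(e.*)", r"(.*e)"]

for p in patterns:
    print(p, can_match_empty(p))
# (.*) True
# (.*)(.*) True
# (.*?)(.*) True
# (e.*) False
# (.*e) False
```

Under this reading, only the expressions that accept a zero-length string have anything left to match once the whole input has been consumed, which would fit the patterns that "work" above.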



Read the  *WIKI* for HELP + MANUAL + Tips&Tricks.
If ReNamer had helped you, please *DONATE* to Denis or buy a PRO license. (Read *Lite vs Pro*)


#3 2010-03-26 13:15

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,379

Re: Question about a strange thing on RegEx

That's a serious problem. How come we didn't notice it before?

I just wanted to report it to the developer of this RegEx engine, but couldn't access his web site. I can't fix it on my end; it could take weeks and weeks of going through the RegEx engine code to find the problem.

We'll have to live with this problem for now, until either the problem is fixed by the developer or I change the RegEx engine to another implementation.


#4 2010-03-26 14:55

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446
Website

Re: Question about a strange thing on RegEx

den4b wrote:

How come we didn't notice it before?

I wondered something similar when I saw it

Stefan wrote:

Now it seems that there is a problem with the dot?

I tried:

Exp: (a?) (but not (a+))
Rep: xxx$1

Input:   a
Output: xxxaxxx


So it's not the dot. The example is pretty representative, but I don't know how to explain my conclusions.
But I bet you can understand me without my saying it.
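The difference between (a?) and (a+) can be made visible by listing every match an engine finds. A small sketch with Python's `re` module (assuming Python 3.7 or later; this is not ReNamer's engine, and `\1` is Python's spelling of `$1`):

```python
import re

text = "a"

# Every match of the capture group, in order
optional_matches = re.findall(r"(a?)", text)
required_matches = re.findall(r"(a+)", text)

# The replacements from the test above
optional_result = re.sub(r"(a?)", r"xxx\1", text)
required_result = re.sub(r"(a+)", r"xxx\1", text)

print(optional_matches, optional_result)  # ['a', ''] xxxaxxx
print(required_matches, required_result)  # ['a'] xxxa
```

Here (a?) is allowed to match nothing, so it matches a second time at the end of the input, while (a+) needs at least one character and stops after the first match.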

Last edited by SafetyCar (2010-03-26 15:10)



#5 2010-03-26 21:09

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446
Website

Re: Question about a strange thing on RegEx

I did another test:

Exp: (.*)
Rep: $1_$1_$1_$1_$1_$1

Input:   a
Output: a_a_a_a_a_a_____


It's as if the string were being divided into 2 strings, the second one empty, which are processed separately and then joined.
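That "two strings" picture can be written out by hand. The sketch below is a hypothetical re-implementation of a global replace on top of Python's `re.finditer`; the function name `substitute_all` is invented here, and it is not how ReNamer's engine is known to work. It simply expands the replacement once per match and joins the pieces.

```python
import re

def substitute_all(pattern, template, text):
    """Expand the template once for every match and join the results."""
    pieces = []
    last_end = 0
    for m in re.finditer(pattern, text):
        pieces.append(text[last_end:m.start()])  # text between matches
        pieces.append(m.expand(template))        # replacement for this match
        last_end = m.end()
    pieces.append(text[last_end:])               # text after the last match
    return "".join(pieces)

# The captured strings: the real one, then an empty one at the end
captured = [m.group(1) for m in re.finditer(r"(.*)", "a")]

# Python writes $1 as \1
result = substitute_all(r"(.*)", r"\1_\1_\1_\1_\1_\1", "a")

print(captured)  # ['a', '']
print(result)    # a_a_a_a_a_a_____
```

The first match contributes "a_a_a_a_a_a" and the empty second match contributes the five bare underscores, which gives the same output as in the test above.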



#6 2011-02-01 14:58

SafetyCar
Senior Member
Registered: 2008-04-28
Posts: 446
Website

Re: Question about a strange thing on RegEx

den4b wrote:

I just wanted to report it to the developer of this RegEx engine, but couldn't access his web site. I can't fix it on my end; it could take weeks and weeks of going through the RegEx engine code to find the problem.

I don't know if you noticed it, but http://regexpstudio.com/ is online again. Maybe you can contact them now?


