#11 2023-02-09 04:11

jogiwer
Member
From: Germany
Registered: 2022-11-05
Posts: 66

Re: Lookaround TRegExpr Compile Errors

I really don't know what I have been testing here - the checks looked right, but I believe the test string was just too simple.

The atomic group only helps when the whole expression matches. If the match fails, the RegExp engine will advance the start position by one character. From there the exception won't match anymore ...

So, when using RegExp:

  1. To get the exceptions skipped, they always have to be matched - even if no uppercase letter follows.

  2. At a position like 'mO' it is hard to tell by lookaround that we are inside 'ExceptionNumOne'. So lookaround should not be used for this.

  3. Therefore a pattern like '(ex1|ex2|...|exn|[a-z])' should be used.

  4. To keep the matched text, it has to be part of the replacement (like $1).

  5. A space should only be inserted if the match is followed by an uppercase letter. The RegExp replacement (as used from PascalScript) has no conditionals available, so the space can only be inserted every time.

  6. To get rid of the wrong spaces, a second rule is needed.

  7. The second rule needs some information to recognize these spaces. Again there is no conditional, so the correctly inserted spaces will be handled by this rule as well.

I came up with this:

1st Expression: (ExceptionNumOne|ExceptionTwo|[^A-Z]*[a-z])(?=([A-Z]\B)?)
1st Replace: $1< $2>
2nd Expression: <(( ).| )>
2nd Replace: $2

'[a-z]' instead of '[^A-Z]*[a-z]' would also work, but it would produce many '< >' markers between two lowercase characters.
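To see what the first pass actually inserts, the two expressions can be tried on a fixed string before any exceptions file is involved. This is only a sketch: SAMPLE is a made-up test name, REGEXP1A and REGEXP1B are hypothetical constant names for the two variants of the first expression, and ShowMessage is assumed to be available in the ReNamer build in use. The result also depends on the TRegExpr version supporting the lookahead.

```pascal
const
  // Both variants of the first expression, with the two example exceptions.
  REGEXP1A = '(ExceptionNumOne|ExceptionTwo|[^A-Z]*[a-z])(?=([A-Z]\B)?)';
  REGEXP1B = '(ExceptionNumOne|ExceptionTwo|[a-z])(?=([A-Z]\B)?)';
  REPLCE1  = '$1< $2>';
  REGEXP2  = '<(( ).| )>';
  REPLCE2  = '$2';
  SAMPLE   = 'myExceptionNumOneFile';  // hypothetical test name

var
  StepA, StepB : WideString;

begin
  // First pass only, so the '< X>' and '< >' markers stay visible.
  StepA := ReplaceRegEx(SAMPLE, REGEXP1A, REPLCE1, True, True);
  StepB := ReplaceRegEx(SAMPLE, REGEXP1B, REPLCE1, True, True);

  // Second pass: '< X>' becomes a space, '< >' is removed.
  ShowMessage('A: ' + StepA + #13#10 +
              '   ' + ReplaceRegEx(StepA, REGEXP2, REPLCE2, True, True) + #13#10 +
              'B: ' + StepB + #13#10 +
              '   ' + ReplaceRegEx(StepB, REGEXP2, REPLCE2, True, True));
end.
```

Run it with a single file in the list, since the script body executes once per file. If the expressions behave as intended, both variants should end up with the same final name, and variant B should simply show more '< >' markers in its intermediate string.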

All together in a PascalScript:

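const
  EXCEPTIONS_FILENAME = 'path\exceptions.txt'; // read line by line in Init()
  REPLCE1 = '$1< $2>';    // keep the match, append a marker with the following uppercase letter (if any)
  REGEXP2 = '<(( ).| )>'; // matches both marker forms: '< X>' and '< >'
  REPLCE2 = '$2';         // '< X>' becomes a space, '< >' becomes nothing

var
  Pattern : WideString;
  Initialized : Boolean;

// Builds the first expression once; the main block below runs for every file.
procedure Init();
begin
  if not Initialized then
  begin
    // (ex1|ex2|...|exn|[^A-Z]*[a-z]) followed by the optional uppercase lookahead
    Pattern := '(' +
               WideJoinStrings(FileReadTextLines(EXCEPTIONS_FILENAME), '|') +
               '|[^A-Z]*[a-z])(?=([A-Z]\B)?)';

    Initialized := True;
  end;
end;

begin
  Init();

  // 1st rule: insert a marker after every match
  FileName := ReplaceRegEx(FileName, Pattern, REPLCE1, True, True);
  // 2nd rule: turn each marker into a space or remove it
  FileName := ReplaceRegEx(FileName, REGEXP2, REPLCE2, True, True);
end.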
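
For reference, Init() joins the lines of the exceptions file with '|', so the file presumably holds one exception per line. A minimal sketch of such a file, using only the two example names from the expression above:

```text
ExceptionNumOne
ExceptionTwo
```

Since the lines go into the pattern unescaped, an exception containing RegExp metacharacters such as '.' or '(' would have to be escaped in the file. A blank line would add an empty alternative to the pattern, so it is probably best avoided.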

Last edited by jogiwer (2023-02-12 14:15)
