#1 2019-09-07 15:58

jeffli
Member
From: den4b.com
Registered: 2019-01-22
Posts: 14

Two problems about rearrange ENG+CHN filename with regex rule?

Hi,

Problem 1

I ran into a problem when trying to rename "Chinese中文.txt" (a file name made of English characters followed by Chinese characters) to "中文Chinese.txt" with a regex rule in ReNamer 7.1.0. The rule is:

Expression: ([a-zA-Z]+)([\u4E00-\u9FA5]+)
Replace: $2$1
// The same expression works as expected on https://regex101.com/r/RmoAP8/1 with the ECMAScript flavor.

ReNamer produces the wrong file name:

eChines中文.txt  // Wrong; I want "中文Chinese.txt".

How can I rearrange a file name that combines English and Chinese characters?

Problem 2

Does ReNamer 7.1.0 support Unicode script classes such as "\p{han}" in regex rules?
If it does, I could use "([\p{han}]+)" instead of "([\u4E00-\u9FA5]+)".

Last edited by jeffli (2019-09-08 02:51)


Jeff


#2 2019-09-07 19:57

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 2,843

Re: Two problems about rearrange ENG+CHN filename with regex rule?

Problem 1

The regular expression engine in ReNamer uses a slightly different syntax for Unicode character codes.

For example, instead of "\u1234" you need to write "\x{1234}".

See the reference:
http://www.den4b.com/wiki/ReNamer:Regul … _sequences

In your case, the expression should be: ([a-zA-Z]+)([\x{4E00}-\x{9FA5}]+)
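
For comparison, here is a minimal Python sketch of the same swap. It is not ReNamer code and only illustrates the intended result: Python's `re` module happens to accept the `\uXXXX` form directly, and `\2\1` stands in for the `$2$1` replacement. The function name `swap_latin_and_han` is made up for this example.

```python
import re

# Python's re module understands \uXXXX escapes in string patterns,
# so the range from the original post can be used unchanged here.
# ReNamer needs the \x{XXXX} form instead.
PATTERN = re.compile(r"([a-zA-Z]+)([\u4E00-\u9FA5]+)")


def swap_latin_and_han(name: str) -> str:
    # Hypothetical helper: moves a run of Latin letters behind the run
    # of Chinese characters that directly follows it.
    # \2\1 plays the role of ReNamer's $2$1 replacement string.
    return PATTERN.sub(r"\2\1", name)


print(swap_latin_and_han("Chinese中文.txt"))  # 中文Chinese.txt
```

Names without a Latin run directly followed by Chinese characters are left unchanged.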

Problem 2

Unicode property classes such as "\p{han}" are not supported at the moment.
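
Until then, an explicit range list can stand in for "\p{han}". The Python sketch below only builds such a character class as text in the "\x{nnnn}" syntax; the helper name `renamer_char_class` and the choice of blocks are assumptions for illustration, and the result is an approximation that covers two common CJK blocks rather than the full Han script.

```python
# Code point ranges chosen for illustration only; the Han script
# includes more blocks than these two.
HAN_RANGES = [
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
]


def renamer_char_class(ranges):
    # Hypothetical helper: formats each (low, high) pair as
    # \x{LOW}-\x{HIGH} and wraps the result in square brackets.
    parts = [f"\\x{{{lo:04X}}}-\\x{{{hi:04X}}}" for lo, hi in ranges]
    return "[" + "".join(parts) + "]"


print(renamer_char_class(HAN_RANGES))
# [\x{3400}-\x{4DBF}\x{4E00}-\x{9FFF}]
```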


#3 2019-09-08 00:25

jeffli
Member
From: den4b.com
Registered: 2019-01-22
Posts: 14

Re: Two problems about rearrange ENG+CHN filename with regex rule?

Hi,
It works, thank you very much. I am sorry that I did not notice this part of the reference when I ran into the problem:

RegEx pattern   Matches
\xnn            Character represented by the hex code nn
\x{nnnn}        Two-byte character with hex code nnnn (Unicode)
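
For anyone carrying patterns over from regex101, the rewrite can be automated. This is a small Python sketch, assuming the pattern only uses four-digit `\uXXXX` escapes; the function name `to_renamer_syntax` is made up for this example, and it does not handle an escaped backslash placed directly before a `u`.

```python
import re

# Matches an ECMAScript-style escape: backslash, "u", four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def to_renamer_syntax(pattern: str) -> str:
    # Hypothetical helper: rewrites each four-digit u-escape into the
    # x{nnnn} form and leaves the rest of the pattern untouched.
    return _U_ESCAPE.sub(lambda m: "\\x{" + m.group(1) + "}", pattern)


print(to_renamer_syntax(r"([a-zA-Z]+)([\u4E00-\u9FA5]+)"))
# ([a-zA-Z]+)([\x{4E00}-\x{9FA5}]+)
```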

Jeff
