#21 2009-12-22 23:16

Andrew
Senior Member
Registered: 2008-05-22
Posts: 542

Re: RegEx to extract first letter of every words in a song name

Hmm, it seems ReNamer's Unicode support isn't 100% yet, since it displays and saves PascalScript as ANSI rather than Unicode. Characters that have no ANSI equivalent end up as '?', and there seems to be nothing we can do about it for now.

ThanhLoan, looks like this is the best you can do for now:

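var
  i: Integer;
  Parts: TStringsArray;
  FirstChar, FirstChars: WideString;

begin
  // Reset the accumulator in case the variable keeps its value between files
  FirstChars := '';
  // Drop everything up to the last '-' and split the song title into words
  Parts := WideSplitString(ReplaceRegEx(WideExtractBaseName(FileName), '.*-', '', false, false), ' ');
  for i := 0 to Length(Parts)-1 do
  begin
    // First character of the current word
    FirstChar := WideCopy(Parts[i], 1, 1);
    if (FirstChar = 'Ð') then FirstChar := 'D'; // Add more such lines as required for the supported Unicode characters
    FirstChars := FirstChars + FirstChar;
  end;
  // Append the collected initials to the base name, keeping the extension
  FileName := WideExtractBaseName(FileName)+'-'+FirstChars+WideExtractFileExt(FileName);
end.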

For the other Unicode characters, maybe there is a way to represent them in some other form in PascalScript (such as \u1EC4). Let's wait and see what Denis says.
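
For comparison, here is a rough sketch of the same "first letter of every word" idea outside ReNamer, written in Python rather than PascalScript. It only illustrates the technique and is not something ReNamer can run. The names fold_char, first_letters and add_initials are made up for this example. Diacritics are stripped via Unicode decomposition, and Đ/đ (which do not decompose) are listed explicitly as \u escapes so the source itself stays pure ASCII:

```python
import os
import unicodedata

# Letters with no canonical decomposition need an explicit entry.
# Written as \u escapes so this source file stays pure ASCII.
SPECIAL = {"\u0110": "D", "\u0111": "d"}  # D with stroke, upper and lower case


def fold_char(ch):
    """Return ch with its diacritics removed (hypothetical helper)."""
    if ch in SPECIAL:
        return SPECIAL[ch]
    # NFD splits a letter into its base letter plus combining marks
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base or ch


def first_letters(title):
    """Collect the folded first character of every space-separated word."""
    return "".join(fold_char(word[0]) for word in title.split())


def add_initials(filename):
    """Append the initials of the part after the last '-' to the base name."""
    base, ext = os.path.splitext(filename)
    title = base.rsplit("-", 1)[-1]
    return base + "-" + first_letters(title) + ext
```

With the file name discussed in this thread, add_initials should append "-TTTDDTCGD" before the ".mkv" extension, assuming the name uses precomposed characters.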


#22 2009-12-23 04:42

ThanhLoan
Member
Registered: 2009-12-17
Posts: 17


Thank you, Stefan and Andrew, for looking into my problem...
I'll wait to hear from Denis then...
Have a very pleasant holiday season...


#23 2009-12-23 11:21

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161


Andrew wrote:

Hmm, seems ReNamer's Unicode support isn't 100% yet,

I think this was already known, as a search for 'unicode' shows:
http://www.den4b.com/forum/viewtopic.php?id=597
http://www.den4b.com/forum/viewtopic.php?id=601
http://www.den4b.com/forum/viewtopic.php?id=704

 

I think it would help if the requester told us which code page (language for non-Unicode programs)
he uses to display the characters correctly on his system,
so that we can set the same one under
"Control Panel -> Regional and Language Options -> Advanced tab -> set <lang> as a default language for non-Unicode programs"


Loan, can you provide some more Unicode file names? No more than 10.


In my code at http://www.den4b.com/forum/viewtopic.php?pid=4134#p4134
Loan's pasted example file name shows '?' too; I don't know how I managed that.
When I tried again just now, I could handle this Unicode name fine:
Asia Karaoke 49_14-Chí Tâm+Ngọc Huyền-Tạ Từ Trong Đêm Dtest Tân Cổ Giao Duyên.mkv

EDIT: I found out why this happens: when I copy the string from the PascalScript window
into this forum, the Unicode characters are replaced by '?'.

EDIT-2: I see now that scripts are saved as ANSI and not as Unicode.
So if I paste
Asia Karaoke 49_14-Chí Tâm+Ngọc Huyền-Tạ Từ Trong Đêm Dtest Tân Cổ Giao Duyên.mkv
into a script rule, save the script
and load it again, I get
Asia Karaoke 49_14-Chí Tâm+Ng?c Huy?n-T? T? Trong Ðêm Dtest Tân C? Giao Duyên.mkv

So again I point to Denis's statement from Feb 2009:
PascalScript does not support Unicode in code, and I don't think it will in any near future.
http://www.den4b.com/forum/viewtopic.php?pid=2972#p2972
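
The save-and-reload effect described above can be reproduced outside ReNamer. The following is a minimal Python sketch, assuming the ANSI code page is Windows-1252 (Western European). The function name ansi_round_trip is made up for this illustration, and nothing here is ReNamer's actual save routine:

```python
import unicodedata


def ansi_round_trip(text, codepage="cp1252"):
    """Encode to a single-byte code page and decode again, roughly what
    saving a script as ANSI and reloading it does (assumed behaviour)."""
    # Use precomposed characters so each letter is one code point
    nfc = unicodedata.normalize("NFC", text)
    # Characters missing from the code page are replaced by '?'
    return nfc.encode(codepage, errors="replace").decode(codepage)


print(ansi_round_trip("Chí Tâm+Ngọc Huyền"))
```

Letters such as í and â exist in Windows-1252 and survive, while ọ and ề do not and come back as '?'. One difference to keep in mind: Python's codec turns Đ (U+0110) into '?', whereas the thread shows it becoming Ð (U+00D0), most likely because Windows substitutes a similar-looking "best fit" character when converting. Ð itself is part of Windows-1252, which would explain why a later ANSI conversion leaves it in place.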



---   ---   ---   ---   ---   ---   ---   


ThanhLoan, what do you mean by "and here is what Renamer Pascalscript rule shows :"?
If I paste your //A section into the script rule window, I see all the Unicode characters.
Did you load this as a script file, or paste it into the script window?


ThanhLoan wrote:

Hi Stefan,
I tried out the script and it worked fine with the special Unicode 'Đ'.
However, some other special Unicode characters are not recognized in the Pascalscript rule.
It shows '?' for those.

Any idea ?

Here are the special characters I'd like to replace :

Đđ ÁÀẢÃẠĂÂẨẪẤẤ ÉÈÊỂỄẾỀÍ ÓÒỎÕÔỐỒỔỖƠỚỜỞỠ ÚỦỦŨỨỪ ÝỲỶỸ

and the portion of the script for this purpose :
[...]

and here is what Renamer Pascalscript rule shows :

//D
    If WideToAnsi(NextChar) ='Ð' Then NextChar :='D';
    If WideToAnsi(NextChar) ='d' Then NextChar :='D';
//A
    If WideToAnsi(NextChar) ='À' Then NextChar :='A';
    If WideToAnsi(NextChar) ='Á' Then NextChar :='A';
    If WideToAnsi(NextChar) ='?' Then NextChar :='A';
    If WideToAnsi(NextChar) ='Ã' Then NextChar :='A';
    If WideToAnsi(NextChar) ='?' Then NextChar :='A';
    If WideToAnsi(NextChar) ='A' Then NextChar :='A';
    If WideToAnsi(NextChar) ='Â' Then NextChar :='A';
    If WideToAnsi(NextChar) ='?' Then NextChar :='A';
    If WideToAnsi(NextChar) ='?' Then NextChar :='A';
    If WideToAnsi(NextChar) ='?' Then NextChar :='A';
    If WideToAnsi(NextChar) ='?' Then NextChar :='A';



---   ---   ---   ---   ---   ---   ---   



I also had an idea to simplify my script, but without success:

I use
Part2 := Copy(
instead of WideCopy
and
WideToAnsi(Part2)
but I still get "TTTÐDTCGD" as output, including the "Ð".

I thought these commands would drop or replace the Unicode characters. What am I doing wrong?

var
  Parts: TStringsArray;
  Base, Part1, Part2: WideString;
  
begin
  Base := WideExtractBaseName(FileName);
  
  //Find last '-' by greedy RegEx:
  Parts := SubMatchesRegEx(Base, '(.+-)(.+)', FALSE);
  If (Length(Parts) <=0) then exit;
  
  //Split file name into two:
  Part1 := WideCopy(Base, 1, Length(Parts[0]));
  Part2 := Copy(Base, Length(Parts[0]) +1, 999);

  //WideShowMessage(Part1 + ' ~~~ ' + Part2);
  //Part1 => Asia Karaoke 49_14-Chí Tâm+Ng?c Huy?n-T? T? Trong Ðêm Dtest Tân C? Giao Duyên-
           //Asia Karaoke 49_14-Chí Tâm+Ng?c Huy?n-T? T? Trong Ðêm Dtest Tân C? Giao Duyên.mkv
  //Part2 => TTTÐDTCGD

  WideShowMessage(Part1 + WideToAnsi(Part2) + WideExtractFileExt(FileName));
end.

ReNamer_ExtractFirstChar002.PNG

Last edited by Stefan (2009-12-23 15:47)


Read the  *WIKI* for HELP + MANUAL + Tips&Tricks.
If ReNamer had helped you, please *DONATE* to Denis or buy a PRO license. (Read *Lite vs Pro*)


#24 2009-12-23 17:28

ThanhLoan
Member
Registered: 2009-12-17
Posts: 17


Hi Stefan
To answer your question ("what do you mean with and here is what Renamer Pascalscript rule shows : ?"):
I pasted the Unicode text into the script window without problems, and the '?' shows up after compiling.
Here is a Unicode filename with all Unicode characters :
Album 49_14-Ca Sĩ-Bản Nhạc-ĐđÁÀẢÃẠĂÂẨẪẤẤÉÈÊỂỄẾỀÍ.mkv

Album 49_14-Ca Sĩ-Bản Nhạc-ÓÒỎÕÔỐỒỔỖƠỚỜỞỠÚỦỦŨỨỪÝỲỶỸ.mkv

My default language for non-Unicode programs is French

Thanks


#25 2009-12-23 21:11

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161


TL> the '?' shows up after compiling.
Where?


Ah i see ThanhLoan.

Pasting Unicode into the script dialog works fine.
But after reopening the rule for editing, the Unicode characters are dropped, and the same happens in the output dialog.
ReNamer_ExtractFirstChar003.PNG






I wonder whether it would help if Denis saved these scripts as Unicode (UTF-8 or similar). Let's wait till 2010.




#26 2009-12-23 21:18

ThanhLoan
Member
Registered: 2009-12-17
Posts: 17


Thanks Stefan...we'll wait for Denis then..

Regards,
