#1 2022-11-15 17:37

jogiwer
Member
From: Germany
Registered: 2022-11-05
Posts: 66

Problem with WideToOem

I am using calls to exiftool like in the scripts section.

exiftool needs to read the original file, so the WideString FileName is used. The ExecConsoleApp method takes a String as its command, so I guess the WideString is converted automatically.

With my logging in place, I found some files where exiftool doesn't deliver a result and the defaults are used instead. In those cases the folder names contain non-ASCII characters.

I tried WideToAnsi and WideToOem. The logged command always looks the same, because the written log lines are also converted to Strings. These conversions also had no effect on the exiftool call.

In my case the character in question is #223 (sharp s = ß). In the OEM context it might need to be #225 ...
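To double-check those two numbers, here is a small Python sketch (Python is used only for illustration, it is not part of ReNamer). It assumes cp1252 as the Windows ANSI code page and cp850 as the console (OEM) code page; other systems may use different ones.

```python
# Byte value of 'ß' in an ANSI and an OEM code page.
# Assumption: cp1252 (ANSI) and cp850 (OEM) are the code pages in use.
ansi_byte = "ß".encode("cp1252")[0]
oem_byte = "ß".encode("cp850")[0]

print(ansi_byte)  # 223
print(oem_byte)   # 225
```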

Any ideas how to solve this? I also tried WideReplaceStr, but as the name suggests it only works on WideString again. Could this character be missing in WideToOem?


#2 2022-11-16 14:20

jogiwer
Member
From: Germany
Registered: 2022-11-05
Posts: 66

Re: Problem with WideToOem

Ok - solved for me. But still not clear to me:

It makes no difference which of these FilePath versions I place in the call:

FilePath
WideToAnsi(FilePath)
WideToOem(FilePath)

In every case the filename seems to be passed to exiftool as UTF8, while exiftool itself tries to read these names using its own codepage.

Running chcp in my shell (cmd) reports codepage 850 (or cp850, ibm850 ...). But as I found out, on my PC the path is passed and interpreted as Latin1 (cp1252, ISO-8859-1).
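The mismatch can be reproduced outside ReNamer. The following Python sketch is only an illustration: the folder name "Straße" is a made-up example, and cp1252 and cp850 are the code pages of my machine.

```python
# What happens when UTF-8 bytes are read with a single-byte code page.
# Assumption: "Straße" is a hypothetical folder name used for illustration.
path = "Straße"
utf8_bytes = path.encode("utf-8")        # 'ß' becomes the two bytes C3 9F

as_cp1252 = utf8_bytes.decode("cp1252")  # misread as Latin1 / ANSI
as_cp850 = utf8_bytes.decode("cp850")    # misread as the console code page

print(as_cp1252)  # StraÃŸe
print(as_cp850)   # also garbled, so no such folder is found
```

Neither misread name matches the real folder, which would explain why exiftool returns nothing for these files.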

Solution for calling exiftool

As I stated above the filename or path seems to be passed always as UTF8. To tell exiftool about this you have to set this option:

-charset FileName=UTF8

And exiftool can find all the files!

To read all tags correctly, the output of this call needs to be UTF8 as well. So the following has to be added:

-charset ExifTool=UTF8
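Put together, the call gets both options in front of the tag and file arguments. The Python sketch below only shows how the argument list is assembled; the helper name build_exiftool_args, the example path and the chosen tag are hypothetical, and in ReNamer the same pieces would go into the command string given to ExecConsoleApp.

```python
# Assemble an exiftool call that declares UTF8 for file names and for output.
# Assumption: build_exiftool_args and the example path are made up for illustration.
def build_exiftool_args(file_path, tags):
    args = ["exiftool",
            "-charset", "FileName=UTF8",   # file names arrive as UTF8
            "-charset", "ExifTool=UTF8"]   # tag values are written as UTF8
    args += ["-" + tag for tag in tags]    # e.g. -DateTimeOriginal
    args.append(file_path)
    return args

args = build_exiftool_args("C:\\Fotos\\Straße\\bild.jpg", ["DateTimeOriginal"])
print(args)
```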

Maybe this should be added to the exiftool wiki page.


#3 2022-12-10 22:10

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,370

Re: Problem with WideToOem

OEM encoding/decoding routines are meant for handling console output only.

The ExecuteProgram and ExecConsoleApp functions in Pascal Script do not support Unicode characters natively. That is why their arguments are of type String rather than WideString. However, this limitation will be addressed in the near future.

Currently, the command line is meant to be passed to the Windows API as an ANSI encoded string. When you pass a WideString to ExecConsoleApp, it gets converted to a UTF8 encoded string and passed to the Windows API as if it were an ANSI string. Luckily for you, exiftool allows you to override the incoming character set, making it work.


#4 2022-12-11 13:46

jogiwer
Member
From: Germany
Registered: 2022-11-05
Posts: 66

Re: Problem with WideToOem

den4b wrote:

Currently, the command line is meant to be passed to Windows API as an ANSI encoded string.

As ANSI isn't a unique label for a code page, I'll take what Windows might use as ANSI: cp1252.

If the strings were to be passed as ANSI, why didn't a call of WideToAnsi solve my problems?

The answer seems to be in what you also stated: "it will get converted to a UTF8 encoded string". But after a call to WideToAnsi there shouldn't be any need to convert the ANSI string (typed AnsiString/String) back to UTF8 (typed String).

Nevertheless, UTF8 is in my eyes the best choice for passing names to exiftool. So I am glad that the option -charset FileName=UTF8 exists.

So please add this finding to your wiki page ReNamer:Scripts:ExifTool.


#5 2022-12-12 22:37

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,370

Re: Problem with WideToOem

The ReNamer 7.x branch introduced major changes in the handling of string encoding. The String (AnsiString) type became code page aware and uses UTF8 encoding by default. Conversions between WideString and String types are performed automatically.

Originally, WideToAnsi and AnsiToWide functions were used to convert WideString to/from String encoded with system's active code page, e.g. 1250 Latin, 1251 Cyrillic, etc. Now these functions convert WideString to/from String encoded with the default encoding, i.e. UTF8. These functions are no longer needed, but kept for backwards compatibility.
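As a rough model of that change (written in Python purely for illustration, not ReNamer's actual implementation), the old and new behaviour can be compared like this. The function names are hypothetical, and cp1252 stands in for the system's active code page.

```python
# Rough model of WideToAnsi before and after the 7.x change.
# Assumption: these helpers are hypothetical stand-ins, not ReNamer code.
def wide_to_ansi_legacy(text, code_page="cp1252"):
    # Older behaviour: encode with the system's active code page.
    return text.encode(code_page)

def wide_to_ansi_v7(text):
    # 7.x behaviour: String defaults to UTF8.
    return text.encode("utf-8")

def implicit_conversion_v7(text):
    # 7.x automatic WideString -> String conversion, also UTF8.
    return text.encode("utf-8")

name = "Straße"  # made-up example name
print(wide_to_ansi_legacy(name))  # b'Stra\xdfe'
print(wide_to_ansi_v7(name))      # b'Stra\xc3\x9fe'
print(wide_to_ansi_v7(name) == implicit_conversion_v7(name))  # True
```

In this model, calling WideToAnsi in 7.x produces the same bytes as the automatic conversion, which is why wrapping the path made no difference.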

What's missing now is the ability to convert to/from system's active code page, although the need for such conversions is becoming rare.

The "-charset filename=utf8" option was added to the ExifTool script.
