#1 2017-02-18 16:06

eR@SeR
Senior Member
From: Земун, Србија
Registered: 2008-01-23
Posts: 353

How to translit only specific text?

Hi everybody!

I have an XML file containing strings to be localized. I made the Latin version, and now I want to make a Cyrillic one. As far as I can tell, there is currently no way to transliterate the text while skipping the parts that shouldn't be transliterated. Could someone help me with this example:

<?xml version="1.0" encoding="utf-8"?>
<language><info><name display=""><![CDATA[Srpski]]></name></info><items><item key="btnFirstPage"><value><![CDATA[Prva]]></value></item><item key="btnNextPage"><value><![CDATA[Sledeća strana]]></value></item><item key="btnLastPage"><value><![CDATA[Poslednja]]></value></item><item key="btnPrePage"><value><![CDATA[Prethodna strana]]></value></item>

As you can see, I want to transliterate only the text placed inside the CDATA sections, e.g. "<![CDATA[only this should be transliterated]]>".


TRUTH, FREEDOM, JUSTICE and FATHERLAND are the highest morale values which human is born, lives and dies for!


#2 2017-02-21 23:05

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,367


#3 2017-02-22 19:17

eR@SeR
Senior Member
From: Земун, Србија
Registered: 2008-01-23
Posts: 353

Re: How to translit only specific text?

Yes, for sure. As I said, I want to use the Translit rule to convert Latin to Cyrillic.

I open the XML file in Sublime Text, copy its content to the clipboard, and paste it into ReNamer's Analyze dialog. Using an appropriate rule (maybe Pascal), I want to transliterate only the mentioned part of the text. Once that works, I can copy the result back into Sublime Text and save it. I hope it is clearer now :)

The translit file that I want to use is:

ђ=đ
љ=lj
њ=nj
џ=dž
а=a
б=b
в=v
г=g
д=d
е=e
ж=ž
з=z
и=i
ј=j
к=k
л=l
м=m
н=n
о=o
п=p
р=r
с=s
т=t
ћ=ć
у=u
ф=f
х=h
ц=c
ч=č
ш=š
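
For anyone scripting this outside ReNamer, here is a minimal Ruby sketch of how a list like the one above could be loaded. It assumes one `cyrillic=latin` pair per line; `parse_translit` is a hypothetical helper name, not something ReNamer provides, and only four of the rules are used for illustration.

```ruby
# Hypothetical helper: parse "cyrillic=latin" lines into a Hash
# (Ruby hashes keep insertion order).
def parse_translit(text)
  text.each_line.with_object({}) do |line, map|
    cyr, lat = line.strip.split("=", 2)
    next if cyr.nil? || cyr.empty? || lat.nil?
    map[cyr] = lat
  end
end

# A few rules copied from the list above, for illustration only.
rules = "љ=lj\nњ=nj\nб=b\nв=v\n"
cyr_to_lat = parse_translit(rules)

# For Latin to Cyrillic, swap keys and values and put the longer Latin
# keys first, so that "lj" is tried before "l" and "j".
lat_to_cyr = cyr_to_lat.invert.sort_by { |lat, _cyr| -lat.length }.to_h

p cyr_to_lat
p lat_to_cyr
```

The inversion is only safe while every Latin value is unique, which holds for the rules listed here.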


#4 2017-02-23 01:07

den4b
Administrator
From: den4b.com
Registered: 2006-04-06
Posts: 3,367

Re: How to translit only specific text?

It's not possible to apply the Translit rule selectively in ReNamer.

You have 2 options:
1) Find a way to extract only text required for transliteration, and then refit it back somehow.
2) Write a little program/script (outside of ReNamer) to do the transliteration. It's not too complicated.
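
As a rough illustration of the two options combined (extract the text, transliterate it, fit it back), here is a minimal Ruby sketch. It is not a ReNamer feature; `translit_cdata`, `CDATA_SECTION` and the four-letter `LAT_TO_CYR` table are hypothetical names chosen for this example, and the table is only a subset of a real one.

```ruby
# Illustrative subset of Latin => Cyrillic pairs (not the full table).
LAT_TO_CYR = { "P" => "П", "r" => "р", "v" => "в", "a" => "а" }.freeze

# Matches one CDATA section and captures its payload.
CDATA_SECTION = /<!\[CDATA\[(.*?)\]\]>/m

# Hypothetical helper: transliterate only the CDATA payloads, leaving
# tags, attributes and the XML declaration untouched.
def translit_cdata(xml, map)
  # Longest keys first, so digraphs would win over single letters.
  pattern = Regexp.union(map.keys.sort_by { |k| -k.length })
  xml.gsub(CDATA_SECTION) do
    payload = Regexp.last_match(1)
    "<![CDATA[#{payload.gsub(pattern, map)}]]>"
  end
end

xml = '<item key="btnFirstPage"><value><![CDATA[Prva]]></value></item>'
puts translit_cdata(xml, LAT_TO_CYR)
# prints <item key="btnFirstPage"><value><![CDATA[Прва]]></value></item>
```

The "r" and "a" in `btnFirstPage` and `value` stay Latin because the replacement only ever sees what sits between `<![CDATA[` and `]]>`.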


#5 2017-02-23 09:07

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161

Re: How to translit only specific text?

For example like this. Perhaps VBS can handle the Unicode well?

Set FSO = CreateObject("Scripting.FileSystemObject")
Const OpenForReading = 1, OpenForWriting = 2, OpenForAppending = 8
strdatei = "Test.xml"

'Read a text file at once.
'Note: without a format argument, OpenTextFile reads the file as ANSI, not UTF-8.
Set objFile = FSO.OpenTextFile(strdatei, OpenForReading, False)
strText = objFile.ReadAll
objFile.Close


'Split at ">" so that each piece ends where a tag (or CDATA section) ends.
arrLines = Split(strText, ">")
For i = 0 To UBound(arrLines)
    strCurrLine = arrLines(i) & ">"
    'Only touch the pieces that contain a CDATA section.
    If InStr(strCurrLine, "CDATA") Then
        MB = MsgBox(strCurrLine & vbLf & vbLf & "Continue?", vbOKCancel, "?")
        If (MB = vbCancel) Then WScript.Quit
        strCurrLine = Replace(strCurrLine,"ђ","đ")
        strCurrLine = Replace(strCurrLine,"љ","lj")
        strCurrLine = Replace(strCurrLine,"њ","nj")
        strCurrLine = Replace(strCurrLine,"џ","dž")
        strCurrLine = Replace(strCurrLine,"а","a")
        strCurrLine = Replace(strCurrLine,"б","b")
        strCurrLine = Replace(strCurrLine,"в","v")
        MsgBox strCurrLine
    End If
    arrOut = arrOut & strCurrLine
Next

'Drop the extra ">" appended after the last piece.
arrOut = Left(arrOut, Len(arrOut) - 1)

strOutFile = strdatei & "_converted.xml"
iUnicode = -1 'TristateTrue: write the output as Unicode (UTF-16)
Set oFileOut = FSO.OpenTextFile(strOutFile, OpenForWriting, True, iUnicode)
    oFileOut.Write arrOut
    oFileOut.Close
Set oFileOut = Nothing


'// "Test.xml"
'<?xml version="1.0" encoding="utf-8"?>
'<language><info><name display=""><![CDATA[Srpski]]></name></info><items>
'<item key="btnFirstPage"><value><![CDATA[Prva]]></value></item>
'<item key="btnNextPage"><value><![CDATA[љSledeca ђstrana]]></value></item>
'<item key="btnLastPage"><value><![CDATA[њPoslednjaџ]]></value></item>
'<item key="btnPrePage"><value><![CDATA[Prethodna strana]]></value></item>
'</items></language>

'// "Test.xml_converted.xml"
'<?xml version="1.0" encoding="utf-8"?>
'<language><info><name display=""><![CDATA[Srpski]]></name></info><items>
'<item key="btnFirstPage"><value><![CDATA[Prva]]></value></item>
'<item key="btnNextPage"><value><![CDATA[љSledeca ђstrana]]></value></item>
'<item key="btnLastPage"><value><![CDATA[ÑšPoslednjaÑŸ]]></value></item>
'<item key="btnPrePage"><value><![CDATA[Prethodna strana]]></value></item>
'</items></language>

Be sure to save the script file as Unicode.
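
A note on the garbled `ÑšPoslednjaÑŸ` in the converted sample above: that pattern is what appears when UTF-8 bytes are interpreted as a single-byte code page. The Ruby sketch below reproduces it, assuming Windows-1252 as the code page (an assumption, the actual ANSI code page on the test machine is not stated); `misread_as_cp1252` is a hypothetical helper name.

```ruby
# "њ" is stored as the two bytes D1 9A in UTF-8. Relabel those bytes as
# Windows-1252 (assumed code page), then convert back to UTF-8 to see
# what a misreading produces.
def misread_as_cp1252(text)
  text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

puts misread_as_cp1252("њ")  # prints Ñš
puts misread_as_cp1252("џ")  # prints ÑŸ
```

So the replacements for those letters had nothing to match: by the time the script compared strings, the Cyrillic letters had already been split into two unrelated characters.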

Last edited by Stefan (2017-02-23 09:09)


Read the  *WIKI* for HELP + MANUAL + Tips&Tricks.
If ReNamer had helped you, please *DONATE* to Denis or buy a PRO license. (Read *Lite vs Pro*)


#6 2017-02-25 21:07

eR@SeR
Senior Member
From: Земун, Србија
Registered: 2008-01-23
Posts: 353

Re: How to translit only specific text?

den4b wrote:

You have 2 options:
1) Find a way to extract only text required for transliteration, and then refit it back somehow.
2) Write a little program/script (outside of ReNamer) to do the transliteration. It's not too complicated.

Oh, I thought that every non-file renaming problem could be solved using ReNamer and its Analyze dialog... Is there a way to solve this using Pascal? Just wondering, no need for it.

Stefan, thank you for giving it a try. Sorry, but I'm not familiar with coding. Also, I provided only part of the strings as an example...

Thank you both! :)

Nevertheless, I asked a friend who knows Ruby to help me with this problem, and he wrote me a script to run from the command prompt:

class Translit

  def initialize

    list = {
        "ђ" => "đ",
        "љ" => "lj",
        "њ" => "nj",
        "џ" => "dž",
        "а" => "a",
        "б" => "b",
        "в" => "v",
        "г" => "g",
        "д" => "d",
        "е" => "e",
        "ж" => "ž",
        "з" => "z",
        "и" => "i",
        "ј" => "j",
        "к" => "k",
        "л" => "l",
        "м" => "m",
        "н" => "n",
        "о" => "o",
        "п" => "p",
        "р" => "r",
        "с" => "s",
        "т" => "t",
        "ћ" => "ć",
        "у" => "u",
        "ф" => "f",
        "х" => "h",
        "ц" => "c",
        "ч" => "č",
        "ш" => "š",
        "Ђ" => "Đ",
        "Љ" => "LJ",
        "Њ" => "NJ",
        "Џ" => "DŽ",
        "А" => "A",
        "Б" => "B",
        "В" => "V",
        "Г" => "G",
        "Д" => "D",
        "Е" => "E",
        "Ж" => "Ž",
        "З" => "Z",
        "И" => "I",
        "Ј" => "J",
        "К" => "K",
        "Л" => "L",
        "М" => "M",
        "Н" => "N",
        "О" => "O",
        "П" => "P",
        "Р" => "R",
        "С" => "S",
        "Т" => "T",
        "Ћ" => "Ć",
        "У" => "U",
        "Ф" => "F",
        "Х" => "H",
        "Ц" => "C",
        "Ч" => "Č",
        "Ш" => "Š",
    }

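    # ARGV[0]: input file, ARGV[1]: output file (default "result.xml"),
    # ARGV[2]: "latcir" for Latin to Cyrillic; anything else means Cyrillic to Latin.
    data = File.read(ARGV[0], :encoding => "utf-8")

    # The pattern matches each bracketed run that contains no further "[",
    # such as "[Prva]", so "CDATA" itself and the surrounding tags are left alone.
    translated = data.gsub(/(\[(?:\[??[^\[]*?\]))/) do |text|
      list.each do |k, v|
        if ARGV[2] != "latcir"
          text = text.gsub(k, v)
        else
          text = text.gsub(v, k)
        end
      end
      text
    end

    File.open((ARGV[1] || "result.xml"), "w") do |f|
      f.write(translated)
    end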

  end

end

Translit.new

After installing Ruby with the Windows installer and setting the environment variable, I pointed the command prompt at the folder containing the Ruby script (translit.rb) and the XML file (Srpski.xml) and ran the script with the command "translit.rb Srpski.xml result.xml latcir". I got the desired result :)

<?xml version="1.0" encoding="utf-8"?>
<language><info><name display=""><![CDATA[Српски]]></name></info><items><item key="btnFirstPage"><value><![CDATA[Првa]]></value></item><item key="btnNextPage"><value><![CDATA[Следећa стрaнa]]></value></item><item key="btnLastPage"><value><![CDATA[Последњa]]></value></item><item key="btnPrePage"><value><![CDATA[Претходнa стрaнa]]></value></item>

The opposite direction is possible by replacing latcir with cirlat.
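
One thing to watch in the Latin to Cyrillic direction: the table in the script has LJ/NJ/DŽ and lj/nj/dž but no title-case Lj/Nj/Dž, so a word that starts with "Lj" would be converted letter by letter (Л followed by ј) rather than to Љ. The sketch below shows one possible way around that, as an assumption about what is wanted rather than a change my friend made: `SKETCH_MAP`, `SKETCH_PATTERN` and `to_cyrillic` are hypothetical names, and the table is a small subset.

```ruby
# Illustrative subset of a Latin => Cyrillic table, plus a title-case
# digraph entry ("Lj") that the table in the script does not have.
SKETCH_MAP = {
  "LJ" => "Љ", "Lj" => "Љ", "lj" => "љ",
  "L" => "Л", "l" => "л", "j" => "ј",
  "u" => "у", "b" => "б", "a" => "а", "n" => "н"
}.freeze

# Longest keys first, so a digraph is tried before its single letters.
SKETCH_PATTERN = Regexp.union(SKETCH_MAP.keys.sort_by { |k| -k.length })

# Single pass over the text: every match is looked up in the table once.
def to_cyrillic(text)
  text.gsub(SKETCH_PATTERN, SKETCH_MAP)
end

puts to_cyrillic("Ljubljana")  # prints Љубљана
```

Doing it in one pass with a combined pattern also means the order of the table entries no longer matters, only the length of the keys.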

Last edited by eR@SeR (2017-02-25 21:09)



#7 2017-02-26 08:48

Stefan
Moderator
From: Germany, EU
Registered: 2007-10-23
Posts: 1,161

Re: How to translit only specific text?

?

But that Ruby code doesn't work on the "CDATA" part only, but on the whole text.

Anyway, great that you sorted it out.



#8 2017-02-26 12:41

eR@SeR
Senior Member
From: Земун, Србија
Registered: 2008-01-23
Posts: 353

Re: How to translit only specific text?

Stefan wrote:

But that Ruby code doesn't work on "CDATA" part only but at the whole text.

"CDATA" itself shouldn't be changed, only the text after it, i.e. the "Some text" in "[CDATA[Some text]]". I mentioned CDATA as the part to catch, and [Some text] as the part to transliterate. I'm sorry if I was misunderstood :P

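To make it easier to see what the script's pattern actually catches, here is a quick Ruby check using one element from the first post. The pattern is copied unchanged from the script; `BRACKETED`, `sample` and `matches` are names made up for this check.

```ruby
# The pattern from the Ruby script above, unchanged.
BRACKETED = /(\[(?:\[??[^\[]*?\]))/

sample = '<name display=""><![CDATA[Srpski]]></name>'

# scan returns one array per match (one element per capture group).
matches = sample.scan(BRACKETED).flatten
p matches  # prints ["[Srpski]"]
```

So the word CDATA and the tags are skipped, and only "[Srpski]" is handed to the replacement loop. The pattern is not tied to CDATA as such, though: any other bracketed text in the file would be caught as well, which is fine for this file because brackets only appear in the CDATA sections.
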
