Skip to main content

Translit rule

TODO: Complete, review, update links.

This rule transliterates one alphabet into another. Its main goal is to transliterate Non-English characters from different languages into their English/Latin representation. The rule comes with built-in transliteration maps for many different languages, but you can also define your own custom mapping.

ReNamer Translit rule.png

Options

Alphabet

Define a transliteration map or load an existing map from the menu. See how to work with transliteration maps in the sections below.

Direction

Select the direction for applying the transliteration: Forward (left to right) or Backward (right to left).

Auto case adjustment

Automatically adjust the text case of the output based on the case of the input. The exact algorithm is described in a section below.

Skip extension

If checked, the file extension will be excluded from processing and will remain unaffected.

Transliteration maps

To transliterate, we define pairs of equivalent characters or combinations of characters. For example, "ü" character in German language can be transliterated to "ue", so the name "Müller" becomes "Mueller" in English alphabet.

We need several such equivalent pairs to convert one language into another. The entire set is called a transliteration map. You can think of it as a kind of a find-and-replace rule.

The rule comes with built-in transliteration maps for many different languages, but you can also define your own custom mapping. Each map can be used in both directions (forward or reverse), for example the French map can be used as French-to-English or as English-to-French.

When you create a new rule, its window does not show any maps. You can now do one of the following:

  1. Use any of the built-in maps.
  2. Create your own map and use it.
  3. Edit a built-in map first, and then use it.

Let us see how to do this.

Use a built-in transliteration map

Press the ReNamer translit rule alphabet button.png button to see the list of available transliteration maps for different languages.

ReNamer translit rule alphabet menu small.png

Click on the desired transliteration map to load it.

For an example, let's say we loaded the French transliteration map.

ReNamer translit rule loaded map.png

You can edit any of the entries in this list, add new entries, or delete any of the entries.

Note that such editing do not alter the saved version of the map. The map is edited just for current rule configuration. You will see how to alter an existing transliteration map and create a new map in a section below.

Next, select the direction for applying the transliteration: Forward (left to right) or Backward (right to left).

Make your own transliteration map

...

Save a transliteration map

...

Automatic case adjustment algorithm

The transliteration process performs automatic case conversion with an algorithm adopted specifically for transliteration. The rule discards the case of the map definition, i.e. "A=B" is same as "a=b". The output case is determined based on the case of each input fragment. Multiple character fragments are treated as part of words, with their case determined based on the case of letters around them.

Let's consider a mapping pair INPUT=OUTPUT, where the INPUT token takes the case of a match in the original text. The logic for determining the case of the OUTPUT token is as follows:

set OUTPUT to lower case
if first letter in INPUT is upper case then
  if length of OUTPUT greater than 1 then
    if next letter in original name is upper case then
      convert whole OUTPUT to upper case
    else
      convert only first letter in OUTPUT to upper case
  else
    convert whole OUTPUT to upper case

Unicode character forms

...