126 lines
4.6 KiB
Plaintext
126 lines
4.6 KiB
Plaintext
|
||
-- SUMMARY --
|
||
|
||
Provides a central transliteration service to other Drupal modules, and
|
||
sanitizes file names while uploading.
|
||
|
||
For a full description visit the project page:
|
||
http://drupal.org/project/transliteration
|
||
Bug reports, feature suggestions and latest developments:
|
||
http://drupal.org/project/issues/transliteration
|
||
|
||
|
||
-- INSTALLATION --
|
||
|
||
1. Install as usual, see http://drupal.org/node/70151 for further information.
|
||
|
||
2. If you are installing to an existing Drupal site, you might want to fix
|
||
existing file names after installation, which will update all file names
|
||
containing non-ASCII characters. However, if you have manually entered links
|
||
to those files in any contents, these links will break since the original
|
||
files are renamed. Therefore it is a good idea to test the conversion
|
||
first on a copy of your web site. You'll find the retroactive conversion at
|
||
Configuration and modules >> Media >> File system >> Transliteration.
|
||
|
||
|
||
-- CONFIGURATION --
|
||
|
||
This module doesn't require special permissions.
|
||
|
||
This module can be configured from the File system configuration page
|
||
(Configuration and modules >> Media >> File system >> Settings).
|
||
|
||
- Transliterate file names during upload: If you need more control over the
|
||
resulting file names you might want to disable this feature here and install
|
||
the FileField Paths module (http://drupal.org/project/filefield_paths)
|
||
instead.
|
||
|
||
- Lowercase transliterated file names: It is recommended to enable this option
|
||
to prevent issues with case-insensitive file systems.
|
||
|
||
|
||
-- 3RD PARTY INTEGRATION --
|
||
|
||
Third party developers seeking an easy way to transliterate text or file names
|
||
may use transliteration functions as follows:
|
||
|
||
if (function_exists('transliteration_get')) {
|
||
$transliterated = transliteration_get($text, $unknown, $source_langcode);
|
||
}
|
||
|
||
or, in case of file names:
|
||
|
||
if (function_exists('transliteration_clean_filename')) {
|
||
$transliterated = transliteration_clean_filename($filename, $source_langcode);
|
||
}
|
||
|
||
Note that the optional $source_langcode parameter specifies the language code
|
||
of the input. If the source language is not known at the time of transliter-
|
||
ation, it is recommended to set this argument to the site default language:
|
||
|
||
$output = transliteration_get($text, '?', language_default('language'));
|
||
|
||
Otherwise the current display language will be used, which might produce
|
||
inconsistent results.
|
||
|
||
|
||
-- LANGUAGE SPECIFIC REPLACEMENTS --
|
||
|
||
This module supports language specific variations in addition to the basic
|
||
transliteration replacements. The following guide explains how to add them:
|
||
|
||
1. First find the Unicode character code you want to replace. As an example,
|
||
we'll be adding a custom transliteration for the cyrillic character 'г'
|
||
(hexadecimal code 0x0433) using the ASCII character 'q' for Azerbaijani
|
||
input.
|
||
|
||
2. Transliteration stores its mappings in banks with 256 characters each. The
|
||
first two digits of the character code (04) tell you in which file you'll
|
||
find the corresponding mapping. In our case it is data/x04.php.
|
||
|
||
3. If you open that file in an editor, you'll find the base replacement matrix
|
||
consisting of 16 lines with 16 characters on each line, and zero or more
|
||
additional language-specific variants. To add our custom replacement, we need
|
||
to do two things: first, we need to create a new transliteration variant
|
||
for Azerbaijani since it doesn't exist yet, and second, we need to map the
|
||
last two digits of the hexadecimal character code (33) to the desired output
|
||
string:
|
||
|
||
$variant['az'] = array(0x33 => 'q');
|
||
|
||
(see http://people.w3.org/rishida/names/languages.html for a list of
|
||
language codes).
|
||
|
||
Any Azerbaijani input will now use the appropriate variant.
|
||
|
||
Also take a look at data/x00.php which already contains a bunch of language
|
||
specific replacements. If you think your overrides are useful for others please
|
||
file a patch at http://drupal.org/project/issues/transliteration.
|
||
|
||
|
||
-- CREDITS --
|
||
|
||
Authors:
|
||
* Stefan M. Kudwien (smk-ka) - http://drupal.org/user/48898
|
||
* Daniel F. Kudwien (sun) - http://drupal.org/user/54136
|
||
|
||
Maintainers:
|
||
* Andrei Mateescu (amateescu) - http://drupal.org/user/729614
|
||
|
||
UTF-8 normalization is based on UtfNormal.php from MediaWiki
|
||
(http://www.mediawiki.org) and transliteration uses data from Sean M. Burke's
|
||
Text::Unidecode CPAN module
|
||
(http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm).
|
||
|
||
|
||
-- USEFUL RESOURCES --
|
||
|
||
Unicode Code Converter:
|
||
http://people.w3.org/rishida/tools/conversion/
|
||
|
||
UTF-8 encoding table and Unicode characters:
|
||
http://www.utf8-chartable.de/unicode-utf8-table.pl
|
||
|
||
Country codes:
|
||
http://www.loc.gov/standards/iso639-2/php/code_list.php
|