
A Survey of Localisation in African Languages, and its Prospects: A Background Document

6. Africa and the Internationalisation of ICT

There are a number of important issues for localisation in Africa relating to the internationalisation of computers and computer systems. Internationalisation refers to the process of improving computers, systems, and internet protocols to be able to accommodate the diverse language needs of the world. As such, it enables localisation and computing in many languages. Related to this is the process of setting standards.

This section discusses several aspects of internationalisation and its implementation, which are important to software and internet localisation (and their use) in Africa. These aspects include: the role of standards in facilitating localisation; Unicode and handling text; keyboard and input systems; language codes and locale data; internationalisation and the Web; internationalised domain names (IDNs); and other applications.

6.1 The Facilitating Technical Environment

Internationalisation and international standards may be seen as defining the technical environment for localisation and multilingual ICT. Within the context of localisation ecology they may also be understood as technical and policy-related approaches to organising language use in ICT. These factors are continually changing and evolving, and understanding them is essential to any full consideration of localisation issues.

Standards enable interoperability, and they provide a predictable environment for users. In the former sense they are part of internationalisation, as with, for instance, the adoption of, and progressive additions to, the "universal character set" - Unicode (see below, 6.2).

In addition to Unicode, there are other standards that were adopted for various reasons, which enter into expanded use with localisation. One example is the set of codes for languages codified under successive ISO-639 standardisations (see below, 6.4). Others include country codes, most notably two-letter country codes (ISO-3166),51 and four-letter and three-number codes for writing systems (ISO-15924).52 There are also guidelines for the use of these codes in computer applications and internet content, notably RFC-4646.53 Among other things, these codes are used in defining locale data (see below, 6.4).
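
To illustrate how these codes fit together in practice, the following minimal sketch (in Python, added here purely for illustration) assembles RFC-4646-style language tags from an ISO-639 language code, an optional ISO-15924 script code, and an optional ISO-3166 country code; the example codes are taken from the published registries, but the helper function itself is a hypothetical construct.

    # Minimal sketch: assembling an RFC-4646-style language tag from the
    # ISO code sets discussed above (language - script - region).
    def make_language_tag(language, script=None, region=None):
        """Assemble a tag such as 'ha-Arab-NG' (Hausa, Arabic script, Nigeria)."""
        parts = [language]
        if script:
            parts.append(script)   # four-letter ISO-15924 script code, e.g. 'Latn' or 'Arab'
        if region:
            parts.append(region)   # two-letter ISO-3166 country code, e.g. 'NG'
        return "-".join(parts)

    print(make_language_tag("ha", "Arab", "NG"))  # Hausa written in Arabic script, Nigeria
    print(make_language_tag("sw", region="TZ"))   # Swahili as used in Tanzania
    print(make_language_tag("yo"))                # Yoruba, with no script or region subtag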

There is also a set of standards for computer keyboards - ISO-9995 - that includes guidelines for keyboard layouts (see below, 6.3).

Internationalisation also helps provide a predictable environment for users, a process which, taken a step further, involves aspects of localisation. The latter involves standards - from orthographies to terminology to keyboard layouts - which are or would be set at the language, country, or regional level.

6.2 Handling Text: Unicode and Complex Script Requirements

Orthographies and writing systems used for African languages were discussed above (4.3). Representing these on computers and the internet has presented some challenges when extended Latin and non-Latin scripts are involved. This has in principle been resolved with Unicode, but there are still issues with that standard that are being worked on.

Background

Originally, character encoding for computers used a 7-bit system (128 codepoints, or spaces for letters or other information), the most commonly known version of which is the English-based ASCII.54 ISO-646 incorporated this and defined some additional uses for other languages. A later 8-bit encoding (256 codepoints), sometimes called extended-ASCII or ANSI,55 provided more spaces for diverse characters and alphabets.

The earliest way of accommodating diverse script needs involved creating fonts in which some of the characters of an existing character set (ASCII or ANSI) were "changed" - that is, new characters were assigned to codepoints normally occupied by other characters.

In response to the use of more languages in ICT, a series of standards was developed under ISO-8859.56 Microsoft developed some similar character sets, such as Windows-1252 for Latin and Windows-1256 for Arabic. These use 256 codepoints (8 bits, or 1 byte), of which the lower 128 (0-127) are identical to those of ASCII and the upper 128 (128-255) differ. However, there was never a commercial or industry standard (e.g., in ISO-8859) developed specifically for any sub-Saharan African language or group of languages.
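
As a small illustration of why such legacy 8-bit code pages conflict, the following Python sketch (using the standard cp1252 and cp1256 codecs; the byte values are chosen arbitrarily) decodes the same bytes under both code pages, yielding different characters in each case.

    # The same byte values above 127 mean different things under different
    # 8-bit code pages, which is why legacy-encoded texts are mutually unreadable.
    sample = bytes([0xC7, 0xD3, 0xE4])                # three arbitrary high-byte values
    print(sample.decode("cp1252"))                    # read as Windows-1252 (Latin)
    print(sample.decode("cp1256"))                    # read as Windows-1256 (Arabic)
    print(sample.decode("ascii", errors="replace"))   # 7-bit ASCII cannot represent them at all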

On the other hand, another standard was devised in 1983 for African languages transcribed in extended Latin alphabets - ISO-6438, "African coded character set for bibliographic information interchange" - but this was apparently little used, even for the primary purpose indicated in its title.57 Curiously, although it was developed at about the same time as the "African Reference Alphabet" of Niamey in 1978 (see above, 4.3), it appears that the two were developed separately.58

African language transcription needs on computers (generally PCs running DOS or later Windows) were therefore initially met by the development, at the local level (and in some cases outside of Africa), of various 8-bit encodings for fonts - often several in each country - in which varying sets of "unused" or little-used characters in other encodings were replaced by the needed extended characters and diacritics.

Apple's Macintosh computers followed a separate evolution from ASCII to Unicode over nearly two decades (the Macintosh Character Set, MacRoman, WorldScript). Many users found that Macintosh systems facilitated their work with African languages (and various non-Latin scripts), but this apparently did not have much impact in Africa, where Macintoshes were and remain relatively rare.59

Arabic, as an international language of high religious and political importance, received a lot of attention early in the process. The script presented challenges from the point of view of script direction (right-to-left with numbers going left-to-right) and changing forms of many characters when preceded or followed by other characters. Nevertheless, coding followed an evolution from an early 7-bit version to 8-bit encodings including ultimately ISO-8859-6 and Windows-1256.60

Computing in the Ethiopic/Ge'ez script (which has over two hundred characters representing syllables) was the focus of efforts dating back to the early 1980s, but by the late 1990s there were apparently a number of mutually incompatible encodings in use in both Ethiopia and Eritrea. There were at least three major approaches to coding Ethiopic/Ge'ez, including using limited character sets and using up to four fonts.61 There was no standard prior to Unicode, and this legacy persists today despite the availability and increasing use of Unicode encoding (much as it does for other languages in extended Latin scripts).

Unicode

Unicode / ISO-10646, the single encoding standard for all the world's scripts, also known as the "Universal Character Set", incorporates all the characters in previous standard encodings and is designed to facilitate the use and exchange of text in any writing form across all platforms and the internet. Unicode can in principle define somewhat over a million characters, though its latest version (Unicode 5.0, 2006) covers all major (and many other, but not all) writing systems with about a tenth of that number.

Unicode is commonly implemented in UTF-8,62 which permits Unicode text in many cases to be stored in as little space as with pre-Unicode 8-bit encodings.
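
As a brief demonstration (a minimal Python sketch using only standard UTF-8 behaviour; the example characters are drawn from common African orthographies), basic Latin letters occupy one byte each in UTF-8, extended Latin letters used in many African orthographies occupy two, and Ethiopic characters occupy three.

    # UTF-8 is a variable-length encoding: ASCII letters take one byte,
    # extended Latin letters two, and Ethiopic characters three.
    samples = ["a",         # basic Latin letter
               "\u025b",    # open e, used in e.g. Ewe and Bambara orthographies
               "\u014b",    # eng, used in e.g. Wolof and Fula orthographies
               "\u1200"]    # Ethiopic syllable 'ha'
    for char in samples:
        print(repr(char), "->", len(char.encode("utf-8")), "byte(s)")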

Unicode and Africa

Since many languages in Africa use either extended Latin alphabets or non-Latin scripts (and sometimes both), Unicode would seem to be a natural fit for the continent. However, there are at present several obstacles.

First, although industry is moving to Unicode, and indeed systems have for a long time been designed with the standard in mind, Unicode does not seem to be well understood in Africa, even among computer experts. Many technical experts are occupied with work involving only major international languages (including Arabic), and African language experts, to the extent they work with computers, often still resort to the panoply of legacy 8-bit encodings mentioned above.

This is gradually beginning to change as newer computer systems come into use, discussion of multilingual computing increases, and efforts to facilitate the use of Unicode train more people (among the latter, the effort of the French-funded project RIFAL63 to help national language agencies in West Africa migrate their text banks to Unicode deserves note).

Second, there has been some concern expressed in Africa about how well the Unicode standard meets their needs, mainly with regard to use of diacritics. (This is discussed in detail below.)

Another issue raised was whether the disk space requirements of text in Unicode, relative to 8-bit fonts, are a disincentive for its use (see Paolillo 2005:47, 72-73). In reality this is hardly a problem, given technical advances in handling Unicode (such as UTF-8) as well as the vast increases in disk space and computer memory available to meet much larger file requirements (image, audio, etc.).64

Unicode and Diacritics in Latin Transcription

While Unicode in principle meets the transcription needs of all languages written in the Latin alphabet and its variants, there are a few issues that are still being discussed. Some of these have to do with individual characters, and for those there is an established system for adding characters or modifying certain information.65 However, the decision of Unicode in the late 1990s to rely on "dynamic composition" to render diacritic characters - combining base characters with one or more "combining diacritics" - rather than add more "precomposed" characters for all combinations used, has raised some questions.
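
To make the distinction concrete, the following minimal Python sketch (using the standard unicodedata module; the characters chosen are illustrative) shows the same accented vowel represented both ways - as a precomposed character and as a base character plus a combining diacritic - and how normalisation converts between the two. It also shows a vowel carrying both a dot below and a tone mark, as in Yoruba orthography, for which no single precomposed codepoint exists.

    import unicodedata

    # Precomposed vs. dynamically composed forms of "e with acute accent".
    precomposed = "\u00e9"      # single codepoint: LATIN SMALL LETTER E WITH ACUTE
    composed = "e\u0301"        # base letter + COMBINING ACUTE ACCENT
    print(precomposed == composed)                                # False: different codepoints
    print(unicodedata.normalize("NFC", composed) == precomposed)  # True: NFC composes the pair
    print(unicodedata.normalize("NFD", precomposed) == composed)  # True: NFD decomposes it again

    # A vowel with both a dot below and a tone mark has no single precomposed
    # codepoint, so dynamic composition is unavoidable for such orthographies.
    e_dot_acute = "e\u0323\u0301"   # e + COMBINING DOT BELOW + COMBINING ACUTE ACCENT
    print([hex(ord(c)) for c in unicodedata.normalize("NFC", e_dot_acute)])
    # NFC yields U+1EB9 (e with dot below) plus the combining acute - still two codepoints.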

Some language experts familiar with Unicode in Africa or working with African languages have expressed concern about this issue, while others have suggested that the issue is lack of understanding about how Unicode works.

The issue of how to deal with diacritics in some African orthographies has received varying amounts of attention since the late 1990s. For instance, perceiving a slow pace of progress on support for dynamic composition in Windows systems, and less interest among developers of Macintosh and Linux systems, the Linguistic Data Consortium of the University of Pennsylvania (US) in the late 1990s launched a project to compile a list of character needs for African languages, with an eye towards determining the potential for developing alternative 8-bit standards. Known as the African Language Resource Council (ALRC), this effort was abandoned after a couple of years, in large part due to advances in the field.

A similar concern, coupled with the thought that reliance on dynamic composition might disfavour African languages, motivated another project by Progiciels BPI in Canada to develop a set of 8-bit fonts for African languages, an effort that was recognised at the "Internet: Bridges to Development" conference in Bamako in February 2000 (see Bourbeau and Pinard 2000).

In the "prepcom" held in Bamako in 2002 for the first World Summit on the Information Society (in held in Geneva, 2003), this situation was brought up again, with the suggestion that a series of 8-bit fonts might lead to the adoption of some new standards for Africa in the ISO-8859 series.66 The concern was expressed that Africa had in effect lost out in the Unicode process when Unicode decided not to add more precomposed? Latin characters before the needs of African languages were fully addressed.

More recently the concern was reframed in a paper delivered at the Unicode conference in 2005 as one of handling data in African languages that use characters in combined forms (Chanard 2005). The issue here was partly that of long term implications of using composed characters.

There are three sets of observations to make in response to this persistent line of concern. First, in all of this, there never seems to have been a thorough assessment of the actual usage of diacritic and extended characters in African orthographies. The closest may have been a set of characters compiled as part of the ALRC effort and some research done by John Hudson for Microsoft. (SIL would probably be capable of making such a global summary from its work in various offices based around the continent, but to our knowledge it has never done so.) In any event, most of this research is based on what linguistic articles, dictionaries, and the like indicate as characters and combinations "that are used in" whatever language. In some cases, there is conflicting information, and in some others, there have been changes in official orthographies. In yet other cases, diacritics used to indicate tones in tonal languages may either not be standardised, or be used only where clarity is essential or in learning materials where guidance on pronunciation is important. All of which is to say that the extent of use and potential need for precomposed characters is neither clear nor easily established.

It is of interest to note that at least one effort, the Unicode and IDN (Internationalised Domain Names) Project intends to survey African character needs (see below, 6.6).

Second, the technology for handling dynamic composition has evolved significantly. This means that the ability to position diacritics correctly over base characters, and the possibility of using a base character plus a combining diacritic to render the same glyph as a precomposed character, go a long way towards obviating concerns about the lack of precomposed characters.

Third, there is also the perspective that the objections to the use of combining diacritics are based on inadequate understanding of the technology.

Unicode and Non-Latin Scripts

Among the scripts of Africa, Arabic and Ethiopic/Ge'ez (used for Amharic and Tigrinya, among others in the Horn of Africa) were encoded in Unicode early in its development. Like the Latin script, both include extended ranges with characters for languages other than the main ones written in them.

Two other African scripts - Tifinagh (used in Berber languages) and N'Ko (used mainly for Manding languages) - have been added in the last couple of years and are part of the 2006 release of Unicode 5.0. N'Ko, however, includes diacritics for tones, and this involves dynamic composition that is not yet supported.

Other writing systems are being worked on, notably Vai (used for the Vai language of Liberia and Sierra Leone), and the process of attending to such minority scripts is being guided by the Script Encoding Initiative at the University of California, Berkeley. These scripts have value for several reasons but are not used by large populations.

6.3 Keyboards and Input Systems

Computer keyboards generally followed the design of typewriter keyboards, which were originally made for the same languages that ASCII and ANSI supported. As computing has come to involve script requirements beyond these, methods for facilitating input have had to be devised.

The operation of computer keyboards, however, happens on a more abstract level than that of a mechanical typewriter, although to the user its functioning appears just as tied to the letters printed on the keys as is the case with typewriters. The configuration of keyboard layouts as a part of software design is one of the reasons for this. Nevertheless, the software of the "keyboard driver" can be written or adapted so as to yield any particular character for any key. This in turn can be done in several ways: by the user, in changing the commands or shortcuts for individual keys on the computer they are using; by anyone with a keyboard layout program, such as Tavultesoft's "Keyman" or Microsoft's "Keyboard Layout Creator" (MSKLC), that is designed to create layouts for use with other software; or by a software programmer or localiser, in setting the parameters for the keyboard (including possibly providing options for the eventual user) in the software itself. None of these require any particular attention to what is printed on the keys of the keyboard, though commercial software companies and vendors of computer hardware naturally find it in their interest to coordinate with some kind of standard for the languages of major markets.

Another approach involves developing production keyboards (the physical keyboard hardware) and a keyboard driver (perhaps with fonts as well) for one or more languages. The Konyin keyboard for Nigerian languages is an example.

The following deals with all of the above except for the first (that is, the modification of individual key commands).

Keyboard Layout Creation

For languages with extended Latin or non-Latin scripts, but without any kind of pre-computing input model, it is relatively easy to set keyboard shortcuts or design keyboard layouts. In fact, the existence of programs such as Keyman and MSKLC makes it easy for anyone so disposed to design and share a particular layout.

The design of a keyboard layout, when starting with a pre-existing model (such as the English "QWERTY" or French "AZERTY" keyboards), actually begins with a choice of what letters to retain as well as what to add. Keyboard design programs also allow options for setting deadkeys. The issue of keyboards is complicated a bit by the fact that there are at least three levels of consideration in their design, and within those, several alternative solutions that can be followed:67

  1. General approach to providing for keyboard input of extended characters and diacritics
    1. Substitution, meaning that a key is reassigned.68 This basically means that one has to change keyboards for each language used. There are two kinds of substitution:
      1. Keys for letters "not used" in a particular language are reassigned to characters or diacritics that are used in the target language, but not in the language the keyboard was designed for. In the case of non-Latin alphabets, this may be all the alphabetic keys.
      2. Non-alphanumeric keys on the original keyboard are reassigned to letters in the target language
    2. Key combinations (also called modifier keys), meaning use of two keys - usually Alt, Ctrl, both, or AltGr plus another (usually a letter) key - that together yield something other than what is assigned to the letter key alone. In some cases, like the Konyin keyboard, there is a special key that functions as AltGr.
    3. Key sequences (both variants below are illustrated in a short sketch following this list)
      1. Deadkeys, meaning keys that when struck yield no character but, when another key is tapped, yield a character or diacritic that does not appear on the keyboard. This is the feature, for instance, in the Windows "United States International" keyboard option for accents (e.g., the apostrophe, double quote, and circumflex are deadkeys, yielding for instance accented vowels when followed by a vowel). This approach works only where the pair of keys will yield a precomposed character and not where two characters (combining diacritic on base character) are involved.
      2. Operator keys, meaning that accent keys are added after the base character (in effect the opposite of deadkeys). This solution is useful for combining diacritics.
    4. Combinations of the above.
  2. Placement or assignment of keys for individual languages.
  3. Providing for multiple languages in single layouts for countries or regions of the continent.
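
As a concrete illustration of the two key-sequence approaches (1.3.1 and 1.3.2), the following minimal Python sketch models both: a deadkey table mapping an accent key followed by a base letter to a precomposed character, and an operator-key rule appending a combining diacritic after the base letter. The key assignments are invented for the example and are not taken from any existing layout.

    # Sketch of the two key-sequence approaches to extended-character input.
    # The key assignments below are purely illustrative.

    # Deadkey (1.3.1): the accent key produces nothing until the next key
    # arrives; the pair then maps to a precomposed character.
    DEADKEY_TABLE = {("'", "e"): "\u00e9",   # ' then e -> e with acute
                     ("'", "o"): "\u00f3"}   # ' then o -> o with acute

    def type_with_deadkey(keys):
        output, pending = [], None
        for key in keys:
            if pending is not None:
                output.append(DEADKEY_TABLE.get((pending, key), pending + key))
                pending = None
            elif key == "'":
                pending = key                # deadkey: wait for the next keystroke
            else:
                output.append(key)
        return "".join(output)

    # Operator key (1.3.2): the accent key is struck after the base letter and
    # simply appends a combining diacritic (dynamic composition).
    OPERATOR_TABLE = {";": "\u0301"}         # ; -> combining acute accent

    def type_with_operator(keys):
        return "".join(OPERATOR_TABLE.get(key, key) for key in keys)

    print(type_with_deadkey(["'", "e", "w", "'", "o"]))   # precomposed characters
    print(type_with_operator(["e", ";", "w", "o", ";"]))  # same text, composed dynamically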

In general it seems that keyboard designers take the approach under #1 that they think is best - opinions and preferences vary - and in general focus on #2. A case could be made that there needs to be more consensus in choices under #1 and that in multilingual societies such as those of Africa, the aim needs to shift to #3 as a strategy. On the other hand, efforts to devise keyboards to accommodate many languages can produce layouts that are very complicated.

Keyboard Design and Standards

Designing keyboards to meet diverse language needs, as part of localising software or creating products, can be as simple as the above but also connects with the larger concern of standards. Standards benefit both localisers and ultimately users, by defining and meeting expectations - in other words, creating a predictable environment for programming, localising, and computer use.

For the proposal and implementation of standard keyboards for a given situation (language or group of languages), there is an international set of guidelines, ISO-9995.69 Among other things, it indicates that a keyboard has three groups of key assignments:70

  • Group 1 is the basic layer with a base and shift (lower and upper case)
  • Group 2 is the national layer with a base and shift. There is a locking shift to access this.
  • Group 3 allows for supplemental characters to be entered. This is a single plane and uses a non-locking shift.

Any longer-term strategies for keyboard development would have to consider these guidelines as well as the needs of the languages and expectations of intended users.

Alternative Input Methods

There are also alternatives to the traditional keyboard that are in use internationally. These include: graphics tablets as keyboards or with handwriting recognition; virtual keyboards onscreen; LED keyboards that display the active characters on the keys themselves; and speech-to-text. These are briefly discussed below.

A graphics tablet can be used for text input in two ways. One is to make a keyboard template and corresponding software such that touching the locations indicated for each character produces the intended character. The other is to use handwriting recognition.

Virtual onscreen keyboards are another option, but have some limitations. Virtual keys for special characters in interactive web applications such as forms or email are fairly commonplace, but their use for African languages does not yet seem that widespread (such virtual keys were used in the African language e-mail sites mentioned above).

A more promising long-term keyboard solution for multilingual computing in Africa and the world is one with backlit keys that indicate the assignments of keys in a particular keyboard selection and can in principle accommodate and display any keyboard arrangement. This is an emerging technology being pioneered by Art Lebedev Studios in Russia under the name "Optimus."71

Speech recognition technology, and its use in speech-to-text (STT) applications, has interesting potential for the input of text. STT accuracy has become rather good. A noted commercial STT program for English, Nuance's "Dragon NaturallySpeaking", demonstrates this potential.

6.4 Languages, ISO-639, and Locales

Languages can be identified in documents on the web using certain codes, and software can be designed to insert these codes when saving in HTML. The most important of these codes are defined in ISO-639. There are also supplementary language tags defined by the Internet Assigned Numbers Authority (IANA). In addition, locale information, which uses ISO-639 language tags and other information, facilitates localisation. These codes, their relevance for Africa, and the issues they raise about how to define "language" in various ICT applications are discussed below.

As of this writing, there are three international standards approved by the International Organization for Standardization (ISO) for identifying languages - ISO-639-1 two-letter codes, ISO-639-2 three-letter codes, and ISO-639-3, also three-letter codes but covering all language categories identified by Ethnologue - and three more parts in formulation (see Table 2).

Part | Description | Status | Reference site
ISO-639-1 | 2-letter codes for languages | Existed for several years; formally adopted in 2002 | http://www.loc.gov/standards/iso639-2/php/English_list.php
ISO-639-2 | 3-letter codes for languages & collections | Adopted in 1998 | http://www.loc.gov/standards/iso639-2/php/English_list.php
ISO-639-3 | 3-letter codes for individual languages (exhaustive) | Adopted in 2007 | http://www.sil.org/iso639-3/codes.asp
ISO-639-4 | Guidelines & principles for language encoding | Planned | http://en.wikipedia.org/wiki/ISO_639-4
ISO-639-5 | 3-letter codes for language groups | Planned | http://en.wikipedia.org/wiki/ISO_639-5
ISO-639-6 | 4-letter codes for language variations | Planned | http://en.wikipedia.org/wiki/ISO_639-6

Table 2: ISO-639 categories for identifying language, current and planned

This set of standards serves several purposes, including identification of the languages of web content and selection of the appropriate locale information, where it exists. There is a certain apparent redundancy between ISO-639-1 and -2, which is explained by their different roles for terminology and other uses. In brief, ISO-639-1 uses two-letter codes, which allow at most 26 x 26 = 676 identifiers - far too few to accommodate the world's languages. ISO-639-2, which uses three letters (17,576 possible combinations), overcomes this shortcoming.

Several African languages (or clusters of languages) including Arabic have ISO-639-1 two-letter codes. A larger number have ISO-639-2 three-letter codes. There does not appear to have been any strict methodology applied in choosing the language categories, as the first two parts include individual languages and categories that group closely related tongues. Moreover, with the advent of ISO-639-3, which adopts the methodology and categories of Ethnologue's list of languages, a different set of criteria has been introduced into the process.72

The original impetus for ISO-639 was a need for coding for library purposes, and this is reflected in the presence within ISO-639-2 of separate bibliographic and terminology codes. The latest instalment - ISO-639-3 - uses SIL's and Ethnologue's criteria in attempting to account for all languages. So, among other things, there is also a problem with the "articulation" between ISO-639-1 and -2 on the one hand and ISO-639-3 on the other: the latter codes separately what the former in some cases code as single entities. The category of "macrolanguage" has therefore been adopted for the several ISO-639-1 and -2 language categories that correspond to several languages as defined in ISO-639-3 (for example, Akan - "ak" in ISO-639-1 and "aka" in ISO-639-2 - is treated as a macrolanguage encompassing Twi and Fanti, which have separate ISO-639-3 codes).

At some point it would be desirable to consider a more systematic approach to selecting codes for African languages and clusters, perhaps in the process of discussing parts 4-6 of ISO-639. This might optimally involve linguists specialised in African languages as well as perhaps ACALAN. Indeed, since we are talking about international standards affecting Africa and a number of African countries are affiliated with ISO (ISO 2006), it would be ideal to have at least some of those countries participate in the process.

Locale Data

Locale data is essential for certain uses of languages in computing and on the internet. Texin (2006) describes a locale as "a mechanism used in the Web, Java, and many other technologies to establish user interface language, presentation formats, and application behavior." It is in effect another way in which internationalisation of ICT facilitates localisation.

A locale consists of basic information on certain needs and preferences necessary for using a language on a computer, such as the character ranges in Unicode needed to display text in the language, sort order, currency units, day and date formats, and decimal markers. Completing a locale and filing it with the Common Locale Data Repository (CLDR) - managed by the Unicode Consortium73 - is a necessary step in localising ICT to a language. Commonly, locale data is defined for a language and a country, using ISO-639 and ISO-3166 codes. Presently there are relatively few African languages with locale data (see below, 7.4).
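
As an illustration of how such locale data is consumed in practice, the following minimal Python sketch uses the third-party Babel library, which packages CLDR data; the choice of library and of the Swahili ("sw") locale is simply an assumed example, not something prescribed by CLDR itself.

    # Minimal sketch of consuming CLDR locale data via the Babel library
    # (pip install Babel); Swahili ('sw') is used as the example locale.
    from datetime import date
    from babel import Locale
    from babel.dates import format_date
    from babel.numbers import format_decimal

    loc = Locale("sw")                                    # locale object built from CLDR data
    print(loc.get_display_name("en"))                     # the language's name as given in English
    print(format_date(date(2007, 3, 15), locale="sw"))    # date in the locale's medium pattern
    print(format_decimal(1234567.89, locale="sw"))        # digit grouping and decimal marker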

Filing a locale depends on use of language codes (under ISO-639), country codes (ISO-3166), and writing system codes (ISO-15924).

6.5 Internationalisation and the Web

Along with other efforts to facilitate the use of languages in ICT are efforts specific to, or with special relevance for, the internet. The UTF-8 implementation of Unicode (mentioned above, 6.2), for example, is increasingly used for multilingual web content and email.

The World Wide Web Consortium (W3C)74 sets standards for the markup of webpages to facilitate, among other things, diverse language content.

There are also discussions about the way the web is used that have implications for internationalisation and localisation. For instance, there has been for several years discussion of how the Web may evolve organically into something called the Semantic Web, with certain characteristics facilitating the search, linking, and manipulation of information. A little more recently, discussion of Web 2.0 has organised thinking by some experts and commercial interests about new ways in which the Web can, and indeed does, function and serve needs in increasingly interactive ways.

6.6 Internationalised Domain Names (IDNs)

While Unicode and the development of means to render complex script requirements in principle permit content in any language on the internet, another consideration is the names of domains in diverse languages and writing systems. There has been interest in multilingual or internationalised domain names for Africa for some years, as evidenced for instance by the formation of an African chapter of the Multilingual Internet Names Consortium (MINC) called AfriMINC.75

Recently a project with backing by ACALAN, UNDP and the Agence Intergouvernementale de la Francophonie has taken up the issue at a time when the international discussions have become more serious.76

As of this writing ICANN has announced that it will be testing alternate ways of handling internationalised (multilingual) domain names in non-Latin scripts such as Arabic on a larger scale than what has previously been done.77
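
For illustration, the following minimal Python sketch shows the general mechanism behind IDNs: a label containing non-ASCII characters is converted to an ASCII-compatible "xn--" form (Punycode, per the IDNA specifications) before it is used in the DNS. The domain names are invented examples, and the sketch uses Python's built-in IDNA codec.

    # Minimal sketch of the ASCII-compatible encoding used for internationalised
    # domain names (IDNA/Punycode). The domain names below are invented examples.
    label = "bücher"                         # a commonly cited IDNA example label
    ascii_form = label.encode("idna")        # ASCII-compatible form, e.g. b'xn--bcher-kva'
    print(ascii_form)
    print(ascii_form.decode("idna"))         # round-trips back to the original label

    # The same codec handles whole domain names, converting each label as needed.
    print("bücher.example".encode("idna"))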

6.7 Other applications

A number of other ICTs bear mention in considering the broader context of internationalisation and technologies and applications of potential importance in localisation. These include: mobile technology; audio-related technologies; geographic information systems; and translation tools.

Mobile technology

Mobile technology - cellular phones, handheld computers, etc. - is a rapidly expanding set of devices reflecting ongoing advances in technology that permit smaller devices to do cheaply what used to require larger devices to do. On the cheaper end of the range of mobile devices, simple cellphones have become much more accessible to people with lower incomes in the global South. This market has been attracting investment and increased interest in localisation.

On the higher end, the promise of the Simputer model of relatively inexpensive handheld computing has not been realised, but with ongoing miniaturisation of the technology, future possibilities may yet exist.

Audio Dimensions: Voice, Text-to-Speech (TTS), and Speech Recognition

The transmission, manipulation, and transformation of human speech is something that would seem natural in cultures often described as oral.

Some of the audio technologies are not terribly popular in the technically advanced countries. Audio e-mail or voice e-mail (sometimes v-mail), for instance, never seems to have taken off there and has had limited use elsewhere. At this point, with the technology moving on and various uses of voice over the internet possible with VOIP and mobile devices, other possibilities could be explored - perhaps focusing on voice commands.

Also, combinations of audio, image, and text could be very useful for learning as well as anticipating users with lower literacy skills.

TTS is of obvious interest in settings where people with access to the technology cannot read for one or another reason. STT is also of interest (this was mentioned under alternative input methods, above 6.3).

Geographic Information Systems (GIS)

Although GIS might seem to be too specialised for consideration in a general survey of localisation, there are several reasons why its expanded use worldwide should be of interest to localisers in Africa. GIS technology is becoming more accessible and there are efforts to use it in participatory analysis and planning for local development. Spatial imaging is an ideal tool in land and natural resource planning on the local level, as it is readily understood by even illiterate people, but also permits very sophisticated layering of information and analysing of data. In fact there is serious effort in various parts of the global South including Africa to combine use of this technology with established participatory research methodologies such as participatory mapping, in what is known as (public) participatory GIS (PPGIS or PGIS).78

The commercial GIS software marketed by ESRI is considered by many to be the industry standard. There also exist a number of FOSS GIS applications79 among which one called the Geographic Resources Analysis Support System (GRASS) is particularly noted.

GIS standards are governed by the ISO-19100 series (concerning mainly standards for geographic data exchange).

Machine Translation (MT) and Translation Memory (TM)

The ability to render writing or speech from one language into another with the assistance of a computer is one of the most interesting uses of ICT in multilingual contexts, but one that has had relatively little attention in Africa. The technology in this area is evolving quickly and has connections with and implications for localisation work.

For convenience one might divide it under two headings: machine translation (MT), or the automatic translation between languages by a computer program, which aims at translating speech or text from one language into another, in general or specific settings; and translation memory (TM), which is mostly used as a tool to facilitate new translations based on previous translations of the same or similar text content.
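
To make the second of these concrete, the following minimal Python sketch illustrates the basic idea behind a translation memory: previously translated segments are stored, and when a new segment arrives the closest stored segment (by a simple fuzzy-match score) is offered to the translator as a starting point. The example sentences, their translations, and the threshold are invented for illustration.

    # Minimal sketch of a translation memory (TM): find the stored source
    # segment most similar to a new segment and suggest its past translation.
    from difflib import SequenceMatcher

    # Previously translated segments (invented English -> Swahili examples).
    memory = {
        "Save the file before closing.": "Hifadhi faili kabla ya kufunga.",
        "The file could not be opened.": "Faili haikuweza kufunguliwa.",
    }

    def suggest(segment, threshold=0.7):
        """Return (stored source, stored translation, score) for the best fuzzy match, if any."""
        source, translation = max(
            memory.items(),
            key=lambda item: SequenceMatcher(None, segment, item[0]).ratio())
        score = SequenceMatcher(None, segment, source).ratio()
        return (source, translation, score) if score >= threshold else None

    print(suggest("Save this file before closing."))   # near-match: reuses the first translation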

