Pan African Research on L10N Workshop and Localization "Blitz" | Marrakech / AccessibleOnlineDictionaryAndTranslationTools
Accessible Online Dictionaries and Translation Tools

Description: Provide language data that is freely available to all users, searchable, and can be improved by participants. Includes database development and the creation of tools for individual languages, such as verb parsers and morpheme generators, that will assist in translation (including machine translation) and language analysis (including spell-checkers).
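As a rough illustration of the kind of per-language tool mentioned above, the following is a minimal sketch of a toy Swahili verb parser. It assumes only the simplest verb shape (subject prefix + tense marker + stem) and a handful of common markers; the function name `parse_verb` and the table names are hypothetical and are not part of any existing project code. A usable parser would also need to handle object markers, negation, relative forms, and derivational suffixes.

```python
# Toy sketch only: covers subject prefix + tense marker + stem.
# Object markers, negation, relatives, and verb extensions are ignored.
SUBJECT_PREFIXES = {
    "ni": "1sg", "u": "2sg", "a": "3sg",
    "tu": "1pl", "m": "2pl", "wa": "3pl",
}
TENSE_MARKERS = {
    "na": "present", "li": "past", "ta": "future", "me": "perfect",
}


def parse_verb(word):
    """Return every (subject, tense, stem) split that the toy tables allow."""
    analyses = []
    for prefix, subject in SUBJECT_PREFIXES.items():
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for marker, tense in TENSE_MARKERS.items():
            # Require a non-empty stem after the tense marker.
            if rest.startswith(marker) and len(rest) > len(marker):
                analyses.append(
                    {"subject": subject, "tense": tense, "stem": rest[len(marker):]}
                )
    return analyses


print(parse_verb("ninasoma"))  # ni-na-soma, "I am reading"
print(parse_verb("alisoma"))   # a-li-soma, "he/she read"
```

Returning a list of candidate analyses, rather than a single answer, leaves room for the ambiguity that a fuller rule set would inevitably produce.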

Which languages: The existing Swahili dictionary project, www.yale.edu/swahili, provides a model to which other languages can be added. Languages will be selected based on priorities defined by stakeholders (governments, donors, research institutions). The number of languages is limited only by the number of people willing to work seriously on the project, which in turn depends on the available funding; these are topics for further discussion among interested parties. A proposal is currently under development for Kinyarwanda. Other languages mentioned at the workshop include Lingala (a 10,000-word glossary is ready to be incorporated), Wolof, and several official languages of South Africa, including Xhosa.

Which region: All of Africa (depending on the languages selected).

Who will benefit: Students, translators, localizers, businesses, government agencies, NGOs, and individual users.

How they will benefit: Ability to get information about their own or other languages and use that information for communication and learning.

Is it achievable?: Yes! The Kamusi Project (Swahili dictionary) is a successful proof of concept, with more than 70,000 dictionary entries, and over 600,000 users per year from almost every country in the world, including tens of thousands from Africa, performing over 10 million dictionary searches per year.

Risks: Adding languages = adding complexity. Also, people can start dictionaries and then realize that the task is *much* bigger than they are prepared for, leaving the resource incomplete.


Assumptions:
  1. Qualified personnel will be available to lead dictionary development.
  2. ICT will already be in place to support fonts and other meta-level technical requirements.
  3. Potential users are or will become networked. (Most speakers of African languages do not yet have cost-effective access to online resources, but in the future many Africans will be able to use and participate in these resources.)


Timeframe:
  1. Technology development = months (Programming is needed to extend the existing platform so that it supports additional languages. Once the basic multilingual programming is complete, adaptation for any specific language should take about a week, including structuring the database for the needs of that language and creating a website tailored to that language.)
  2. Data development = weeks per language for creation of basic glossaries
  3. Data development = years per language for highly detailed scholarly work. The normal procedure would be to start with a basic glossary and work toward a more scholarly reference over time, with the ever-improving resource being continuously available to the public throughout its development.
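
The split described in item 1, between a shared multilingual platform and a short per-language adaptation step, could look something like the sketch below. This is an illustration under stated assumptions, not the Kamusi Project's actual schema: the class names (`Entry`, `LanguageConfig`, `Dictionary`), the `extra_fields` mechanism, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    headword: str
    gloss: str                                  # English gloss
    extra: dict = field(default_factory=dict)   # language-specific fields


@dataclass
class LanguageConfig:
    code: str                 # e.g. an ISO 639-1 code
    name: str
    extra_fields: tuple = ()  # fields this language needs beyond the basics


class Dictionary:
    """Shared platform: one store, many languages (hypothetical design)."""

    def __init__(self):
        self.languages = {}
        self.entries = {}

    def add_language(self, config):
        # Per-language adaptation: register the language and its fields.
        self.languages[config.code] = config
        self.entries[config.code] = []

    def add_entry(self, code, entry):
        allowed = set(self.languages[code].extra_fields)
        unknown = set(entry.extra) - allowed
        if unknown:
            raise ValueError(f"fields not configured for {code}: {sorted(unknown)}")
        self.entries[code].append(entry)

    def search(self, code, term):
        # Case-insensitive substring match on headword or gloss.
        term = term.lower()
        return [
            e for e in self.entries[code]
            if term in e.headword.lower() or term in e.gloss.lower()
        ]


kamusi = Dictionary()
kamusi.add_language(LanguageConfig("sw", "Swahili", extra_fields=("noun_class",)))
kamusi.add_language(LanguageConfig("rw", "Kinyarwanda"))
kamusi.add_entry("sw", Entry("kitabu", "book", extra={"noun_class": "7/8"}))
kamusi.add_entry("sw", Entry("kusoma", "to read"))

print([e.headword for e in kamusi.search("sw", "book")])
```

In this sketch, starting a basic glossary for a new language is a single `add_language` call followed by `add_entry` calls, while more scholarly detail can be layered on later through additional configured fields.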

Who is needed: Scholars and/or language specialists would normally lead the project for their particular language, with the active support of IT specialists. As demonstrated by the Kamusi Project and African language Wikipedias, enthusiastic volunteers will usually join the project but cannot be relied upon to conduct a substantial amount of quality work in a timely fashion. Institutions such as CLAD (Centre de Linguistique Appliquée de Dakar), ACALAN, and African universities should be involved to the greatest extent possible. The project has many synergies with Kasahorow (http://dictionary.kasahorow.com), with which it shares not only a common goal but also a common development team.

Importance: Quality language data is fundamental to effective translation, localization, education, and cross-language communication.

