Accessible Online Dictionaries and Translation Tools
Description: Provide language data that is freely available to all users, is searchable, and can be improved by participants. This includes database development and the creation of tools for individual languages, such as verb parsers and morpheme generators, that will assist in translation (including machine translation) and language analysis (including spell-checkers).
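To make the idea of a verb parser concrete, here is a minimal sketch in Python for Swahili. It is not taken from the Kamusi Project; the names parse_verb and VerbParse and both lookup tables are assumptions made for illustration. It handles only the basic pattern of subject prefix + tense marker + stem (for example ni-na-soma, "I am reading") and ignores negation, object markers, relative markers, and noun-class agreement, so it will both miss real verbs and accept some non-verbs.

```python
from typing import NamedTuple, Optional

# Illustrative tables only: personal subject prefixes and four common
# tense/aspect markers. A real parser would need far more coverage.
SUBJECT_PREFIXES = {
    "ni": "1sg", "u": "2sg", "a": "3sg",
    "tu": "1pl", "m": "2pl", "wa": "3pl",
}
TENSE_MARKERS = {
    "na": "present", "li": "past", "ta": "future", "me": "perfect",
}


class VerbParse(NamedTuple):
    subject: str  # person/number of the subject prefix
    tense: str    # label of the tense marker
    stem: str     # whatever remains after both prefixes


def parse_verb(word: str) -> Optional[VerbParse]:
    """Split a word into subject prefix, tense marker, and stem.

    Returns None when the word does not fit the simplified pattern.
    """
    word = word.lower()
    # Try longer prefixes first so "wa" is tested before "u" or "a".
    for prefix in sorted(SUBJECT_PREFIXES, key=len, reverse=True):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for marker, tense in TENSE_MARKERS.items():
            # Require a non-empty stem after the tense marker.
            if rest.startswith(marker) and len(rest) > len(marker):
                return VerbParse(SUBJECT_PREFIXES[prefix], tense, rest[len(marker):])
    return None


if __name__ == "__main__":
    for w in ("ninasoma", "walisoma", "tutapika", "kitabu"):
        print(w, parse_verb(w))
```

A table-driven design like this keeps the language-specific data separate from the matching logic, which is one plausible way for language specialists to supply the tables while IT specialists maintain the code.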
Which languages: The existing Swahili dictionary project, www.yale.edu/swahili, provides a model to which other languages can be added. Languages will be selected based on priorities defined by stakeholders (governments, donors, research institutions). The number of languages is limited only by the number of people willing to work seriously on the project, which in turn depends on the available funding; these are topics for further discussion among interested parties. A proposal is currently under development for Kinyarwanda. Other languages mentioned at the workshop include Lingala (a 10,000-word glossary is ready to be incorporated), Wolof, and several official languages of South Africa, including Xhosa.
Which region: All of Africa (depending on the languages selected).
Who will benefit: Students, translators, localizers, businesses, government agencies and NGOs, individual users.
How they will benefit: Ability to get information about their own or other languages and use that information for communication and learning.
Is it achievable?: Yes! The Kamusi Project (Swahili dictionary) is a successful proof of concept, with more than 70,000 dictionary entries, and over 600,000 users per year from almost every country in the world, including tens of thousands from Africa, performing over 10 million dictionary searches per year.
Risks: Each added language adds complexity. Also, people can start dictionaries and then realize that the task is *much* bigger than they are prepared for, leaving the resource incomplete.
Who is needed: Scholars and/or language specialists would normally lead the project for their particular language, with the active support of IT specialists. As demonstrated by the Kamusi Project and African language Wikipedias, enthusiastic volunteers will usually join the project but cannot be relied upon to conduct a substantial amount of quality work in a timely fashion. Institutions such as CLAD (Centre de Linguistique Appliquée de Dakar), ACALAN, and African universities should be involved to the greatest extent possible. The project has many synergies with Kasahorow (http://dictionary.kasahorow.com), with which it shares not only a common goal but also a common development team.
Importance: Quality language data is fundamental to effective translation, localization, education, and cross-language communication.