isiXhosa Information Retrieval


The general objective of this project is to conduct fundamental research that will enable users to search for and find digital information using isiXhosa as a discovery language. Given good-quality results, this will give users (especially those who speak isiXhosa as a mother tongue) greater access to document collections on the Internet and elsewhere, whether the documents are in isiXhosa, in other languages, or in a combination thereof. In particular, the objective is to develop technical solutions that give communities of users who are not fully literate in English access to the wealth of digital information available online.

The technology objectives of this project are to develop core algorithms and tools to support information retrieval (IR) in isiXhosa, including mono-, multi- and cross-lingual search. This includes extending existing language models and developing new models specifically to support IR, as well as extending and developing language corpora to support such research.

Masters Student


1. Corpus Development Technology (Sean Packham)

2. isiZulu search engine (done by Honours students in 2014)