Swahili and Somali Query Translations of CLEF Bilingual Dataset Available to Researchers

Center for Intelligent Information Retrieval (CIIR) researchers within the University of Massachusetts Amherst College of Information and Computer Sciences are providing a dataset that consists of Swahili and Somali queries translated from the CLEF 2000-2003 Campaign for Bilingual Ad-Hoc Retrieval Tracks (http://catalog.elra.info/en-us/repository/browse/ELRA-E0008/).


To support research on low-resource languages, the CIIR has produced an extension of 200 queries by translating all four years of bilingual queries (2000-2003) into Swahili and Somali. The topic IDs, C001-C200, correspond to those of the other languages in the CLEF data. A translation organization translated the title and description of each English query in that topic set into Swahili and Somali. Somali belongs to the Afro-Asiatic language family and Swahili to the Niger-Congo language family; both are spoken mostly in Africa.


More information can be found in their paper, “Simulating CLIR Translation Resource Scarcity using High-resource Languages,” by Hamed Bonab, James Allan, and Ramesh Sitaraman, in the Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2019).


The dataset and paper can be downloaded at: https://ciir.cs.umass.edu/ictir19_simulate_low_resource.

The CLEF Book on lessons learned in 20 years of research - https://dx.doi.org/10.1007/978-3-030-22948-1

The book celebrating 20 years of CLEF activities, "Information Retrieval Evaluation in a Changing World: Lessons Learned from 20 Years of CLEF", is published by Springer and is available at https://dx.doi.org/10.1007/978-3-030-22948-1

CLEF 2019 LNCS Springer Proceedings Available - https://dx.doi.org/10.1007/978-3-030-28577-7

The CLEF 2019 Proceedings are available online as Springer LNCS Proceedings 10456 at: https://dx.doi.org/10.1007/978-3-030-28577-7


CLEF 2019 Working Notes Available - http://ceur-ws.org/Vol-2380/

The CLEF 2019 Working Notes are available online as CEUR-WS Proceedings 2380 at: http://ceur-ws.org/Vol-2380/

Call for Bids to Host CLEF 2022 - September 2022

The CLEF Steering Committee solicits proposals from groups interested in organizing the CLEF conference and labs in September 2022.

Guidelines on submitting a bid can be found in the Template for Bids available at: http://www.clef-initiative.eu/documents/71612/60f6dc78-cc9a-4866-97bc-a4bc858c9d77

Bids must be submitted by Friday, August 2nd, 2019, by email to the Steering Committee Chair, Nicola Ferro (chair@clef-initiative.eu).

The Steering Committee will review the proposals and make a selection, and may ask for modifications and changes to the proposals if deemed necessary. Interested parties can contact the Steering Committee Chair, Nicola Ferro (chair@clef-initiative.eu), for further details.

Important Dates

- Bid submission deadline: August 2nd, 2019

- Feedback to bidders and discussion: August 2019

- Bid selection: mid-September 2019

