Researchers, clinicians, and policy makers involved in the response to COVID-19 are constantly searching for reliable information on the virus and its impact. The pandemic presented a unique opportunity for the information retrieval (IR) and text processing communities to contribute to that response, as well as to study methods for quickly standing up information systems for similar future events. The results of the TREC-COVID Challenge identify answers for some of today's questions and create infrastructure to improve tomorrow's search systems.
TREC-COVID followed the TREC model for building IR test collections through community evaluations of search systems. The document set used in the challenge was the COVID-19 Open Research Dataset (CORD-19), a regularly updated collection of biomedical literature. Accordingly, TREC-COVID consisted of a series of rounds, with each round using a later version of the document set and a larger set of COVID-related topics. Participants in a round created ranked lists of documents for each topic ("runs") and submitted their runs to NIST. From the collective set of participants' runs, NIST pooled sets of documents to be assessed for relevance by human annotators with biomedical expertise. The results of the human annotation, known as relevance judgments, were then used to score the submitted runs.
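To make these mechanics concrete, the sketch below illustrates the standard TREC workflow under stated assumptions: runs in the usual six-column TREC format (topic, Q0, docid, rank, score, run tag), a depth-k pool built from the union of each run's top-ranked documents, and precision at 10 computed from the resulting relevance judgments. The file names and the pool depth are illustrative, not the official TREC-COVID settings.

```python
from collections import defaultdict

def read_run(path):
    """Parse a TREC-format run: 'topic Q0 docid rank score tag' per line."""
    ranked = defaultdict(list)
    with open(path) as f:
        for line in f:
            topic, _q0, docid, rank, _score, _tag = line.split()
            ranked[topic].append((int(rank), docid))
    # Order each topic's documents by rank.
    return {t: [d for _, d in sorted(pairs)] for t, pairs in ranked.items()}

def build_pool(runs, depth=50):
    """Union of the top-`depth` documents from every run, per topic."""
    pool = defaultdict(set)
    for run in runs:
        for topic, docs in run.items():
            pool[topic].update(docs[:depth])
    return pool

def read_qrels(path):
    """Parse qrels: 'topic iteration docid judgment'; judgment > 0 means relevant."""
    qrels = defaultdict(dict)
    with open(path) as f:
        for line in f:
            topic, _iteration, docid, judgment = line.split()
            qrels[topic][docid] = int(judgment)
    return qrels

def mean_precision_at_k(run, qrels, k=10):
    """Average P@k over the topics that have relevance judgments."""
    scores = [
        sum(1 for d in docs[:k] if qrels[topic].get(d, 0) > 0) / k
        for topic, docs in run.items()
        if topic in qrels
    ]
    return sum(scores) / len(scores) if scores else 0.0

# Illustrative usage; file names are hypothetical.
runs = [read_run(name) for name in ["runA.txt", "runB.txt"]]
pool = build_pool(runs, depth=50)        # documents sent to the assessors
qrels = read_qrels("qrels.txt")          # assessors' relevance judgments
print(f"runA mean P@10 = {mean_precision_at_k(runs[0], qrels):.3f}")
```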
The final document and topic sets, together with the cumulative relevance judgments, form a COVID-19 test collection called TREC-COVID Complete. The incremental nature of the collection, as viewed through the successive rounds, supports research on search systems for dynamic environments.
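As a minimal sketch of one way to exploit that round structure, the snippet below rebuilds, from a cumulative qrels file, the snapshot of judgments available at the close of each round. It assumes a four-column qrels layout in which the second column records the judging round; both the file name and that column convention are assumptions to verify against the actual data release.

```python
from collections import defaultdict

def qrels_by_round(path):
    """Reconstruct the judgments available at the end of each round.

    Assumes cumulative qrels lines of the form 'topic round docid judgment',
    where the second column records the round in which the document was
    judged. Returns {round: {topic: {docid: judgment}}}.
    """
    rows = []
    with open(path) as f:
        for line in f:
            topic, rnd, docid, judgment = line.split()
            rows.append((int(rnd), topic, docid, int(judgment)))

    snapshots = {}
    cumulative = defaultdict(dict)
    for rnd in sorted({r for r, *_ in rows}):
        for r, topic, docid, judgment in rows:
            if r == rnd:
                cumulative[topic][docid] = judgment
        # Copy so later rounds do not mutate this round's snapshot.
        snapshots[rnd] = {t: dict(js) for t, js in cumulative.items()}
    return snapshots

# Illustrative usage; the file name is hypothetical.
# snapshots = qrels_by_round("qrels-covid-complete.txt")
# round3 = snapshots[3]   # everything judged through round 3
```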
The TREC-COVID Challenge was organized by the Allen Institute for Artificial Intelligence (AI2), the National Institute of Standards and Technology (NIST), the National Library of Medicine (NLM), Oregon Health and Science University (OHSU), and the University of Texas Health Science Center at Houston (UTHealth). See also the NIST press release.
TREC-COVID had strong international participation.
The cumulative test collection for ad hoc retrieval, called TREC-COVID Complete, is now available for download from the Data page, along with the topic sets and relevance judgments from previous rounds. Retrieval results from prior rounds are stored in the open archive of submissions, and a bibliography of papers resulting from TREC-COVID is being maintained.
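As a small, hedged example of working with the released data: each TREC-COVID topic carries a terse keyword query, a natural-language question, and a longer narrative, distributed as XML. The sketch below parses such a file; the file name is illustrative, and the exact field names should be confirmed against the files on the Data page.

```python
import xml.etree.ElementTree as ET

def read_topics(path):
    """Parse a TREC-COVID topics file into a list of dicts.

    Each <topic> element is assumed to carry a keyword 'query', a
    natural-language 'question', and a longer 'narrative' describing
    the information need.
    """
    return [
        {
            "number": topic.get("number"),
            "query": topic.findtext("query"),
            "question": topic.findtext("question"),
            "narrative": topic.findtext("narrative"),
        }
        for topic in ET.parse(path).getroot().findall("topic")
    ]

# Illustrative usage; the file name is hypothetical.
# for t in read_topics("topics-rnd5.xml"):
#     print(t["number"], t["query"])
```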
You can join the trec-covid Google group to discuss the challenge, follow #COVIDSearch on Twitter, or contact the TREC group at NIST for more information. See also the companion COVIDSearch page.