TREC-COVID Data

TREC-COVID Complete

This is the primary test collection for ad hoc retrieval that is the outcome of all five rounds of TREC-COVID. The test set, called TREC-COVID Complete, consists of the Round 5 document set (July 16 release of CORD-19); the final set of 50 topics; and the cumulative judgments from all assessing rounds with CORD-UIDs mapped to July 16 ids if necessary, previously judged documents no longer in the July 16 release removed, and the last judgments for documents judged multiple times due to significant content changes between rounds. Note that no TREC-COVID submissions correspond to this collection since all TREC-COVID submissions were subject to residual collection evaluation.

Special Collections

Chronological Qrels

Taken together, these judgment files define test collections for different time periods as the pandemic unfolded (hence the "chronological" designation). The individual file for Round X contains a judgment for every [topic,docid] pair such that topic is one of the topics used in Round X, docid is a valid document in Round X, and docid was judged (in any judgment round) for the topic. Because the individual qrels file for a given round contains all known judgments for documents in that round, these files were also called "Total" qrels files.
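The membership rule for a Round X file can be sketched in a few lines of Python. This is only an illustration of the three conditions stated above, not the organizers' construction script; the function name, the in-memory data structures, and the toy ids are all assumptions, and the handling of renamed documents is ignored here.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]  # (topic, docid)


def chronological_pairs(
    all_judged: Iterable[Pair],
    round_topics: Set[str],
    round_docids: Set[str],
) -> Set[Pair]:
    """Keep each judged pair whose topic and docid are both valid in Round X."""
    return {
        (topic, docid)
        for topic, docid in all_judged
        if topic in round_topics and docid in round_docids
    }


# Toy data invented for illustration; these are not real topics or cord_uids.
judged = [("1", "docA"), ("1", "docB"), ("36", "docA")]
pairs = chronological_pairs(
    judged,
    round_topics={"1"},             # topic 36 is not used in this round
    round_docids={"docA", "docC"},  # docB is not a valid document in this round
)
print(sorted(pairs))
```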

Documents that had significant changes (a change in title, going from not having an abstract to having an abstract, or going from having no pdf to having a pdf) across rounds were re-judged following the changes. This means that some [topic,docid] pairs may have different judgments in different rounds' qrels. The judgment included in the Round X Chronological qrels is the judgment that was used in the Cumulative qrels for Round X, if it exists, and otherwise it is the judgment from the earliest round in which a judgment was made (because that round must be greater than X or it would have been in the cumulative qrels). A document judged in an early round that was subsequently renamed (i.e., its cord_uid changed) is carried through to later rounds under the new name. (The reverse is not true: the qrels for earlier rounds do not contain the original name of a renamed document if that document was never judged under its original name.)
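The selection rule for a single [topic,docid] pair can be written as a small function. This is a minimal sketch under assumed inputs: the function name and the representation of judgment rounds as floats (so that half rounds such as 3.5 sort correctly) are illustrative choices, not part of the released data.

```python
from typing import Dict, Optional


def select_judgment(cumulative: Optional[int], by_round: Dict[float, int]) -> int:
    """Pick the judgment to include in the Round X Chronological qrels.

    cumulative: the judgment in the Round X Cumulative qrels, or None if absent.
    by_round:   every judgment made for the pair, keyed by judgment round.
    """
    # A judgment of 0 is a valid value, so test against None explicitly.
    if cumulative is not None:
        return cumulative
    # Otherwise fall back to the earliest round in which the pair was judged.
    return by_round[min(by_round)]


# Toy values for illustration only.
print(select_judgment(2, {1.5: 0, 3.0: 2}))     # cumulative judgment wins
print(select_judgment(None, {4.0: 1, 3.5: 0}))  # earliest judgment round is 3.5
```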

Each qrels file in this set includes judgments from across all judgment rounds of TREC-COVID and thus none of these qrels correspond to the evaluation of any TREC-COVID submissions. For your own experiments, be sure you are clear as to why you are using a particular round’s Chronological qrels. It is not valid to train on earlier rounds’ qrels and evaluate using the whole topic set with a later round’s qrels; for that type of test you must use the residual collection qrels. It is experimentally valid to train your system using judgments for some topics and then test on a disjoint set of topics using a later round’s qrels. But since later rounds have larger topic sets than earlier rounds, you must restrict the topic sets to some common subset for valid comparisons across rounds.
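Restricting two rounds' qrels to a common topic subset might look like the following sketch. The dictionary representation of a qrels file and the helper names are assumptions made for illustration, and the topic numbers and document ids are invented.

```python
from typing import Dict, Set, Tuple

Qrels = Dict[Tuple[str, str], int]  # (topic, docid) -> judgment


def topics_of(qrels: Qrels) -> Set[str]:
    """Return the set of topics that appear in a qrels mapping."""
    return {topic for topic, _ in qrels}


def restrict_to_topics(qrels: Qrels, topics: Set[str]) -> Qrels:
    """Drop every judgment whose topic is outside the given subset."""
    return {pair: j for pair, j in qrels.items() if pair[0] in topics}


# Toy qrels: the later round has one extra topic ("31").
early = {("1", "docA"): 2, ("2", "docB"): 0}
late = {("1", "docA"): 2, ("2", "docC"): 1, ("31", "docD"): 1}

common = topics_of(early) & topics_of(late)
early_common = restrict_to_topics(early, common)
late_common = restrict_to_topics(late, common)
print(sorted(common), len(early_common), len(late_common))
```

Scores computed from early_common and late_common then average over the same topics, which is the condition the paragraph above requires for comparisons across rounds.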

Note that 'qrels-covid_d5_j0.5-5' is the Chronological qrels for Round 5 which is also the Cumulative qrels of Round 5, and is the qrels for the TREC-COVID Complete collection. In the absence of some specific experimental hypothesis being tested through earlier rounds’ qrels, TREC-COVID Complete is the collection that should be used.

When the Cumulative qrels were originally released during the TREC-COVID event, they did not always contain the most recent rejudgment of a document, either because the rejudgment was not available at the time or because of errors during the construction of the qrels. The TREC-COVID organizers believe that staying consistent with the previously released cumulative qrels is important, and thus forced each Chronological qrels file to agree with its corresponding Cumulative qrels. Since one of the goals of TREC-COVID was to enable research on how the information space changed during the pandemic, a record of the rejudgment changes is released here. But note that in all cases the judgment round in which a document was (re)judged is not a statement of when the document was available or changed, only of when it happened to be judged. A document that changed significantly between rounds but was never judged in a round prior to the change will have no rejudgment history.

Record of documents that were rejudged. The format of the file is lines containing [topic docid judgment-round judgment].
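A reader for this record could be sketched as follows. It assumes the four fields are whitespace-separated and that the judgment round parses as a number (for example 1.5); both are assumptions about the file, and the sample lines are invented rather than copied from it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

History = Dict[Tuple[str, str], List[Tuple[float, int]]]


def parse_rejudgments(lines: Iterable[str]) -> History:
    """Group (judgment-round, judgment) entries by (topic, docid), oldest first."""
    history = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        topic, docid, judgment_round, judgment = fields
        history[(topic, docid)].append((float(judgment_round), int(judgment)))
    for entries in history.values():
        entries.sort()  # order each history by judgment round
    return dict(history)


# Invented sample lines in the assumed layout: topic docid judgment-round judgment
sample = [
    "1 docA 3.5 2",
    "1 docA 1 0",
    "7 docB 2 1",
]
history = parse_rejudgments(sample)
print(history[("1", "docA")])
```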

Per-round Cumulative Qrels

This is the set of cumulative qrels up to and including the given round. For each round, the document ids are the ids with respect to that round's document set (with all documents no longer contained in that document set removed from the qrels). The first round cumulative qrels is precisely the set of qrels released after Round 1. This is the only round that corresponds to runs actually submitted/evaluated for TREC-COVID (due to the use of residual collection evaluation). The Round 5 qrels is the TREC-COVID Complete collection's qrels.

This set of qrels is gathered here as a convenience, since each file is also linked elsewhere. These are the qrels that define the "previously judged" document sets for use in residual collection evaluation.
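As a rough illustration of how a "previously judged" set is used in residual collection evaluation, the sketch below removes, per topic, every already-judged document from a ranked run before scoring. The function name and the in-memory representation of runs and judged sets are assumptions, and the ids are invented.

```python
from typing import Dict, List, Set


def residual_run(
    run: Dict[str, List[str]],
    previously_judged: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Remove documents already judged for a topic from that topic's ranking."""
    return {
        topic: [d for d in ranking if d not in previously_judged.get(topic, set())]
        for topic, ranking in run.items()
    }


# Toy run and judged sets for illustration only.
run = {"1": ["docA", "docB", "docC"], "2": ["docA", "docD"]}
judged_before = {"1": {"docB"}}  # docB was judged for topic 1 in an earlier round

filtered = residual_run(run, judged_before)
print(filtered)
```

Note that the removal is per topic: docA stays in the ranking for topic 2 even if it had been judged for some other topic.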

Round 5 Data

Round 4 Data

Round 3 Data

Round 2 Data

May 1, 2020 release of CORD-19
List of valid doc-ids for this round
Topic set
Relevance judgments
The format of a relevance judgments file ("qrels") is lines of
topic-id iteration cord-id judgment
where judgment is 0 for not relevant, 1 for partially relevant, and 2 for fully relevant; and iteration records the round in which the document was judged. trec_eval does not make use of the iteration field (though it expects it to be present for historical reasons), and TREC-COVID is using it for bookkeeping. Since annotators continue to work during weeks when a round is active, the iteration field contains "half rounds" as well as whole rounds. A document judged in round 1.5 was selected to be judged from a run in round 1 but is used to score round 2 runs. This qrels file contains only judgment sets 1.5 and 2, which are the only judgments used to score Round 2 runs. This implements residual collection scoring: because Round 2 runs could not contain any previously judged documents, the Round 2 qrels file must also not contain any of those documents. Note that it is possible to create a cumulative Round 1 and Round 2 qrels file to score runs that search the May 1, 2020 release of CORD-19, but it is not valid to compare those scores to either Round 1 or Round 2 TREC-COVID runs.
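The qrels layout and the use of the iteration field for bookkeeping can be illustrated with a short reader. This is a sketch that assumes whitespace-separated fields in the order given above; the function names and the sample lines are invented for illustration.

```python
from typing import Iterable, List, Set, Tuple

Record = Tuple[str, str, str, int]  # (topic-id, iteration, cord-id, judgment)


def read_qrels(lines: Iterable[str]) -> List[Record]:
    """Parse lines of 'topic-id iteration cord-id judgment'."""
    records = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, iteration, cord_id, judgment = parts
        records.append((topic, iteration, cord_id, int(judgment)))
    return records


def keep_iterations(records: List[Record], iterations: Set[float]) -> List[Record]:
    """Keep only records whose iteration (judgment set) is in the given set."""
    return [r for r in records if float(r[1]) in iterations]


# Invented sample lines; the ids are placeholders, not real cord_uids.
sample = [
    "1 0.5 docA 2",
    "1 1.5 docB 0",
    "2 2 docC 1",
]
round2 = keep_iterations(read_qrels(sample), {1.5, 2.0})
print(round2)
```

Filtering a cumulative file down to judgment sets 1.5 and 2 in this way mirrors what the Round 2 qrels file contains.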

Round 1 Data

April 10, 2020 release of CORD-19
List of valid doc-ids for this round
Topic set
Relevance judgments
The format of a relevance judgments file ("qrels") is lines of
topic-id iteration cord-id judgment
where judgment is 0 for not relevant, 1 for partially relevant, and 2 for fully relevant; and iteration records the round in which the document was judged. trec_eval does not make use of the iteration field (though it expects it to be present for historical reasons), and TREC-COVID is using it for bookkeeping. Since annotators continue to work during weeks when a round is active, the iteration field contains "half rounds" as well as whole rounds. A document judged in round X.5 was selected to be judged from a run in round X but won't be used in scoring until round X+1. (For round 0.5, the documents were selected from runs produced by the organizers that were not part of official runs.) The qrels file contains all the documents that were judged. For most measures, documents not in the qrels file (because they were not judged) are assumed to be not relevant.
Papers providing an overview of Round 1
- TREC-COVID: Rationale and Structure of an Information Retrieval Shared Task for COVID-19. Journal of the American Medical Informatics Association. May, 2020.
- TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection. ACM SIGIR Forum. June, 2020.
- Effect on System Rankings of Extending Pools in TREC-COVID Round 1.
- Effect on System Rankings of Further Extending Pools for TREC-COVID Round 1 Submissions.
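As a small illustration of the Round 1 qrels convention described above, where documents missing from the qrels file are assumed to be not relevant, the sketch below computes precision at k for one topic. It is not a replacement for trec_eval; the function name is hypothetical, and counting a judgment of 1 or higher as relevant is an assumption made for this example.

```python
from typing import Dict, List


def precision_at_k(
    ranking: List[str],
    judgments: Dict[str, int],
    k: int,
    min_rel: int = 1,  # assumption: partially relevant (1) counts as relevant
) -> float:
    """Precision at rank k for one topic; unjudged documents count as not relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    # judgments.get(d, 0) treats a document absent from the qrels as judgment 0.
    hits = sum(1 for d in ranking[:k] if judgments.get(d, 0) >= min_rel)
    return hits / k


# Toy judgments and ranking for a single topic; docX was never judged.
judgments = {"docA": 2, "docB": 0, "docC": 1}
ranking = ["docA", "docX", "docC", "docB"]
print(precision_at_k(ranking, judgments, k=4))
```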