IRDataset
- class lightning_ir.data.dataset.IRDataset(dataset: str)
Bases: object
- __init__(dataset: str) → None
Initializes a new IRDataset.
- Parameters:
dataset (str) – Dataset name.
Methods
__init__(dataset) – Initializes a new IRDataset.
prepare_constituent(constituent) – Downloads the constituent of the dataset using ir_datasets if needed.
Attributes
DASHED_DATASET_MAP – Map of dataset names with dashes to dataset names with slashes.
dashed_docs_dataset_id – Dataset id with dashes instead of slashes for the documents dataset.
Dataset name.
Dataset id.
docs – Documents in the dataset.
docs_dataset_id – ID of the dataset containing the documents.
ir_dataset – Instance of ir_datasets.Dataset.
Qrels in the dataset.
Queries in the dataset.
- property DASHED_DATASET_MAP: Dict[str, str]
Map of dataset names with dashes to dataset names with slashes.
- Returns:
Dataset map.
- Return type:
Dict[str, str]
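A minimal sketch of what such a map might look like, assuming dataset ids follow the ir_datasets convention of slash-separated segments. The helper name `build_dashed_map` and the example ids are illustrative and not taken from the library's source.

```python
from typing import Dict, Iterable


def build_dashed_map(dataset_ids: Iterable[str]) -> Dict[str, str]:
    """Map each dashed name back to its original slash-separated id.

    Hypothetical helper for illustration; not part of lightning_ir.
    """
    return {dataset_id.replace("/", "-"): dataset_id for dataset_id in dataset_ids}


# Example ids in the ir_datasets style (chosen for illustration).
example_ids = ["msmarco-passage/train", "msmarco-passage/trec-dl-2019/judged"]
dashed_map = build_dashed_map(example_ids)

# Look up the slashed id from its dashed form.
print(dashed_map["msmarco-passage-train"])
```

An explicit map is useful because replacing slashes with dashes cannot be reversed by string manipulation alone: ids such as "msmarco-passage/train" already contain dashes, so the dashed form does not reveal where the original slashes were.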
- property dashed_docs_dataset_id: str
Dataset id with dashes instead of slashes for the documents dataset.
- Returns:
Document dataset id with dashes.
- Return type:
str
- property docs: Docstore | Dict[str, GenericDoc]
Documents in the dataset.
- Returns:
Documents.
- Return type:
ir_datasets.indices.Docstore | Dict[str, GenericDoc]
- Raises:
ValueError – If no documents are found in the dataset.
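Because the property may return either a docstore or a plain dictionary, calling code may want a lookup that works with both. The following is a sketch under stated assumptions: `Doc` is a hypothetical stand-in for a document record, `lookup_doc` is an illustrative helper, and the store-like branch assumes the object exposes a `get(doc_id)` method.

```python
from typing import Dict, NamedTuple


class Doc(NamedTuple):
    """Hypothetical stand-in for a document record with an id and text."""

    doc_id: str
    text: str


def lookup_doc(docs, doc_id: str):
    """Fetch one document from either a plain dict or a store-like object."""
    if isinstance(docs, dict):
        # Plain dictionaries are indexed directly by document id.
        return docs[doc_id]
    # Assumption: store-like objects expose get(doc_id).
    return docs.get(doc_id)


# Dictionary-backed example with a single document.
docs: Dict[str, Doc] = {"d1": Doc("d1", "first document")}
found = lookup_doc(docs, "d1")
print(found.text)
```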
- property docs_dataset_id: str
ID of the dataset containing the documents.
- Returns:
Document dataset id.
- Return type:
str
- property ir_dataset: Dataset | None
Instance of ir_datasets.Dataset.
- Returns:
Instance of ir_datasets.Dataset or None if the dataset is not found.
- Return type:
ir_datasets.Dataset | None
- prepare_constituent(constituent: 'qrels' | 'queries' | 'docs' | 'scoreddocs' | 'docpairs') → None
Downloads the constituent of the dataset using ir_datasets if needed.
- Parameters:
constituent (Literal["qrels", "queries", "docs", "scoreddocs", "docpairs"]) – Constituent to download.
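Since the parameter is typed as a Literal, a caller that receives the constituent name from configuration or the command line may want to validate it first. This is a caller-side sketch only: `Constituent` and `check_constituent` are hypothetical names, and nothing here describes how the library itself validates its input.

```python
from typing import Literal, get_args

# The five values listed in the signature above.
Constituent = Literal["qrels", "queries", "docs", "scoreddocs", "docpairs"]


def check_constituent(name: str) -> str:
    """Return the name unchanged if it is an allowed constituent.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    allowed = get_args(Constituent)
    if name not in allowed:
        raise ValueError(f"Unknown constituent {name!r}; expected one of {allowed}")
    return name


print(check_constituent("docs"))
```

A validated name could then be passed to prepare_constituent on an IRDataset instance.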