IRDataset
- class lightning_ir.data.dataset.IRDataset(dataset: str)
Bases: object
- __init__(dataset: str) -> None
Methods
__init__(dataset)
prepare_constituent(constituent): Downloads the constituent of the dataset using ir_datasets if needed.
Attributes
DASHED_DATASET_MAP: Map of dataset names with dashes to dataset names with slashes.
Dataset name.
Dataset id.
docs: Documents in the dataset.
docs_dataset_id: ID of the dataset containing the documents.
ir_dataset: Instance of ir_datasets.Dataset.
Qrels in the dataset.
Queries in the dataset.
- property DASHED_DATASET_MAP: Dict[str, str]
Map of dataset names with dashes to dataset names with slashes.
- Returns: Dataset map
- Return type: Dict[str, str]
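As a rough illustration of what such a map looks like, the sketch below builds one from a few slash-separated dataset ids. The helper `build_dashed_map` and the sample ids are assumptions made for this example; how the library actually populates the map is not described here.

```python
from typing import Dict, Iterable


def build_dashed_map(dataset_ids: Iterable[str]) -> Dict[str, str]:
    """Map each id with "/" replaced by "-" back to the original id (hypothetical helper)."""
    return {dataset_id.replace("/", "-"): dataset_id for dataset_id in dataset_ids}


# Sample ids chosen for illustration only.
dashed_map = build_dashed_map(["msmarco-passage/train", "msmarco-passage/dev/small"])
print(dashed_map["msmarco-passage-train"])  # prints "msmarco-passage/train"
```

A map like this lets a dashed name, which is safe to use in file names, be resolved back to the slash-separated id.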
- property docs: Docstore | Dict[str, GenericDoc]
Documents in the dataset.
- Raises: ValueError – If no documents are found in the dataset
- Returns: Documents
- Return type: ir_datasets.indices.Docstore | Dict[str, GenericDoc]
- property docs_dataset_id: str
ID of the dataset containing the documents.
- Returns: Document dataset id
- Return type: str
- property ir_dataset: Dataset | None
Instance of ir_datasets.Dataset.
- Returns: ir_datasets dataset
- Return type: ir_datasets.Dataset | None
- prepare_constituent(constituent: Literal['qrels', 'queries', 'docs', 'scoreddocs', 'docpairs']) -> None
Downloads the constituent of the dataset using ir_datasets if needed.
- Parameters: constituent (Literal["qrels", "queries", "docs", "scoreddocs", "docpairs"]) – Constituent to download
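Since the parameter is typed as a Literal, a static type checker will flag other strings, but nothing in the signature guarantees a runtime check. The sketch below shows one way a caller could validate a constituent name before passing it on. The alias `Constituent` and the helper `check_constituent` are hypothetical, and whether prepare_constituent performs any such validation itself is not stated here.

```python
from typing import Literal, get_args

# Mirrors the documented parameter type; the alias name is an assumption.
Constituent = Literal["qrels", "queries", "docs", "scoreddocs", "docpairs"]


def check_constituent(name: str) -> str:
    """Return name if it is one of the documented constituents, else raise ValueError."""
    allowed = get_args(Constituent)  # tuple of the literal string values
    if name not in allowed:
        raise ValueError(f"Unknown constituent {name!r}; expected one of {allowed}")
    return name


print(check_constituent("docs"))  # prints "docs"
```

A checked name could then be handed to prepare_constituent on an IRDataset instance, which downloads that constituent if needed.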