TupleDataset
- class lightning_ir.data.dataset.TupleDataset(tuples_dataset: str, targets: 'order' | 'score' = 'order', num_docs: int | None = None)
Bases: IRDataset, IterableDataset
- __init__(tuples_dataset: str, targets: 'order' | 'score' = 'order', num_docs: int | None = None) → None
Dataset containing tuples of a query and n-documents. Used for fine-tuning models on ranking tasks.
- Parameters:
tuples_dataset (str) – Path to file containing tuples or valid ir_datasets id.
targets (Literal["order", "score"], optional) – Data type to use as targets for a model during fine-tuning. Defaults to “order”.
num_docs (int | None, optional) – Maximum number of documents per query. Defaults to None.
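As a minimal sketch of how the constructor parameters above fit together, the helper below checks the arguments and then builds the dataset. The helper name `make_tuple_dataset` is hypothetical and not part of lightning_ir, and the check that `num_docs` is at least 1 is an assumption of this sketch; the library itself may handle these cases differently. The import is deferred so the argument checks can run without lightning_ir installed.

```python
from typing import Literal, Optional

# Target types documented for TupleDataset.
VALID_TARGETS = ("order", "score")


def make_tuple_dataset(
    tuples_dataset: str,
    targets: Literal["order", "score"] = "order",
    num_docs: Optional[int] = None,
):
    """Hypothetical helper: validate arguments, then construct a TupleDataset."""
    if targets not in VALID_TARGETS:
        raise ValueError(f"targets must be one of {VALID_TARGETS}, got {targets!r}")
    # Assumption of this sketch: a document limit below 1 is never meaningful.
    if num_docs is not None and num_docs < 1:
        raise ValueError(f"num_docs must be None or >= 1, got {num_docs}")

    # Deferred import: only needed once the arguments have passed the checks.
    from lightning_ir.data.dataset import TupleDataset

    return TupleDataset(tuples_dataset, targets=targets, num_docs=num_docs)
```

With lightning_ir installed, a call such as `make_tuple_dataset("msmarco-passage/train/triples-small", targets="order", num_docs=2)` would pass an ir_datasets id straight through to the constructor; a path to a local tuples file can be passed in the same position.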
Methods
__init__(tuples_dataset[, targets, num_docs]) – Dataset containing tuples of a query and n-documents.
prepare_data() – Downloads tuples using ir_datasets if needed.
- prepare_data() → None
Downloads tuples using ir_datasets if needed.