TorchDenseIndexer
- class lightning_ir.retrieve.pytorch.dense_indexer.TorchDenseIndexer(index_dir: Path, index_config: TorchDenseIndexConfig, module: BiEncoderModule, verbose: bool = False)
Bases: Indexer
Indexer for dense embeddings using PyTorch.
- __init__(index_dir: Path, index_config: TorchDenseIndexConfig, module: BiEncoderModule, verbose: bool = False) -> None
Initialize the TorchDenseIndexer.
- Parameters:
index_dir (Path) – Directory to store the index.
index_config (TorchDenseIndexConfig) – Configuration for the dense index.
module (BiEncoderModule) – Bi-encoder module to use for indexing.
verbose (bool) – Whether to print verbose output. Defaults to False.
Methods
__init__(index_dir, index_config, module[, ...]) – Initialize the TorchDenseIndexer.
add(index_batch, output) – Add embeddings from the output to the index.
save() – Save the index to the specified directory.
to_cpu() – Convert the index to CPU format.
to_gpu() – Convert the index to GPU format.
- add(index_batch: IndexBatch, output: BiEncoderOutput) -> None
Add embeddings from the output to the index.
- Parameters:
index_batch (IndexBatch) – Batch containing the index data.
output (BiEncoderOutput) – Output from the bi-encoder model containing the document embeddings.
- Raises:
ValueError – If output does not contain document embeddings.
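The accumulate-and-validate behavior described for add can be sketched in plain PyTorch. This is only an illustration, not the lightning_ir implementation: ToyOutput and ToyDenseIndexer are hypothetical stand-ins, and the assumption that document embeddings arrive as a single (num_docs, dim) tensor is a simplification.

```python
from dataclasses import dataclass, field
from typing import List, Optional

import torch


@dataclass
class ToyOutput:
    """Hypothetical stand-in for BiEncoderOutput; only the field used here."""

    doc_embeddings: Optional[torch.Tensor] = None  # assumed shape: (num_docs, dim)


@dataclass
class ToyDenseIndexer:
    """Hypothetical accumulator; NOT the lightning_ir TorchDenseIndexer."""

    doc_ids: List[str] = field(default_factory=list)
    chunks: List[torch.Tensor] = field(default_factory=list)

    def add(self, doc_ids: List[str], output: ToyOutput) -> None:
        # Mirror the documented contract: refuse outputs without document embeddings.
        if output.doc_embeddings is None:
            raise ValueError("Expected document embeddings in the model output")
        self.doc_ids.extend(doc_ids)
        # Detach from the autograd graph and keep the chunk on the CPU.
        self.chunks.append(output.doc_embeddings.detach().cpu())

    def matrix(self) -> torch.Tensor:
        # Concatenate all batches into one (total_docs, dim) matrix.
        return torch.cat(self.chunks, dim=0)


indexer = ToyDenseIndexer()
indexer.add(["d1", "d2"], ToyOutput(torch.randn(2, 4)))
indexer.add(["d3"], ToyOutput(torch.randn(1, 4)))
index_shape = tuple(indexer.matrix().shape)
```

Storing per-batch chunks and concatenating once at the end avoids repeatedly reallocating a growing tensor.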
- save() -> None
Save the index to the specified directory.
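As a rough illustration of persisting a dense index to a directory, the sketch below writes an embedding matrix and its document ids with torch.save and pathlib. The file names index.pt and doc_ids.txt are assumptions made for this example; the on-disk layout actually produced by save() is not documented here.

```python
import tempfile
from pathlib import Path

import torch

# Toy index: three documents with four-dimensional embeddings.
embeddings = torch.arange(12, dtype=torch.float32).reshape(3, 4)
doc_ids = ["d1", "d2", "d3"]

with tempfile.TemporaryDirectory() as tmp:
    index_dir = Path(tmp) / "index"
    index_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical file names, chosen only for this sketch.
    torch.save(embeddings, index_dir / "index.pt")
    (index_dir / "doc_ids.txt").write_text("\n".join(doc_ids))

    # Read everything back to confirm the round trip.
    restored = torch.load(index_dir / "index.pt")
    restored_ids = (index_dir / "doc_ids.txt").read_text().split("\n")
```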
- to_cpu() -> None
Convert the index to CPU format.
- to_gpu() -> None
Convert the index to GPU format.
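Assuming that converting between CPU and GPU format amounts to moving the embedding tensor between devices (an assumption, since the conversion is not described in detail here), the round trip looks roughly like this in plain PyTorch. The sketch falls back to the CPU when no CUDA device is available.

```python
import torch

index = torch.randn(3, 4)  # toy embedding matrix, created on the CPU

# Pick the GPU if one is present; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

index_on_device = index.to(device)  # analogous to a to_gpu() step
index_back = index_on_device.cpu()  # analogous to a to_cpu() step
```

Moving the tensor does not change its values, so the index can be brought back to the CPU for saving after GPU work.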