TorchDenseIndexer

class lightning_ir.retrieve.pytorch.dense_indexer.TorchDenseIndexer(index_dir: Path, index_config: TorchDenseIndexConfig, module: BiEncoderModule, verbose: bool = False)

Bases: Indexer

Indexer for dense embeddings using PyTorch.

__init__(index_dir: Path, index_config: TorchDenseIndexConfig, module: BiEncoderModule, verbose: bool = False) → None

Initialize the TorchDenseIndexer.

Parameters:

index_dir (Path) – Directory to store the index.
index_config (TorchDenseIndexConfig) – Configuration for the dense index.
module (BiEncoderModule) – Bi-encoder module to use for indexing.
verbose (bool) – Whether to print verbose output. Defaults to False.

Methods

`__init__`(index_dir, index_config, module[, ...])	Initialize the TorchDenseIndexer.
`add`(index_batch, output)	Add embeddings from the output to the index.
`save`()	Save the index to the specified directory.
`to_cpu`()	Move the index to the CPU.
`to_gpu`()	Move the index to the GPU.

add(index_batch: IndexBatch, output: BiEncoderOutput) → None

Add embeddings from the output to the index.

Parameters:

index_batch (IndexBatch) – Batch containing the index data.
output (BiEncoderOutput) – Output from the Bi-encoder model containing embeddings.

Raises:

ValueError – If output does not contain document embeddings.
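The contract described above (append the document embeddings from the output, and raise ValueError when they are missing) can be illustrated with a small toy model. This is a sketch only, not the lightning_ir implementation: ToyOutput, ToyIndexer, and the doc_embeddings field name are hypothetical stand-ins, plain Python lists replace tensors, and add() takes a list of document IDs instead of an IndexBatch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyOutput:
    """Hypothetical stand-in for BiEncoderOutput (only the field used here)."""

    doc_embeddings: Optional[List[List[float]]] = None


@dataclass
class ToyIndexer:
    """Hypothetical stand-in for an indexer; NOT the lightning_ir class."""

    doc_ids: List[str] = field(default_factory=list)
    embeddings: List[List[float]] = field(default_factory=list)

    def add(self, doc_ids: List[str], output: ToyOutput) -> None:
        # Mirror the documented behavior: reject output without document embeddings.
        if output.doc_embeddings is None:
            raise ValueError("output does not contain document embeddings")
        # Assumption for this sketch: exactly one embedding per document ID.
        if len(doc_ids) != len(output.doc_embeddings):
            raise ValueError("number of document IDs and embeddings differ")
        self.doc_ids.extend(doc_ids)
        self.embeddings.extend(output.doc_embeddings)


indexer = ToyIndexer()
indexer.add(["d1", "d2"], ToyOutput(doc_embeddings=[[0.1, 0.2], [0.3, 0.4]]))

try:
    indexer.add(["d3"], ToyOutput())  # no document embeddings present
except ValueError as error:
    rejected_reason = str(error)
```

After the rejected call, the toy index still holds only the two documents from the first batch, because the check runs before anything is appended.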

save() → None

Save the index to the directory given as index_dir.
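To make the idea of persisting an index into a directory concrete, here is a minimal sketch that uses only the standard library. It is an illustration under stated assumptions and does not reflect the on-disk format TorchDenseIndexer actually uses: save_toy_index and the file names doc_ids.txt and embeddings.json are hypothetical, and JSON stands in for tensor serialization.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def save_toy_index(index_dir: Path, doc_ids: List[str], embeddings: List[List[float]]) -> None:
    """Hypothetical helper: write document IDs and embeddings into index_dir."""
    # Create the target directory if it does not exist yet.
    index_dir.mkdir(parents=True, exist_ok=True)
    # File names are assumptions made for this sketch only.
    (index_dir / "doc_ids.txt").write_text("\n".join(doc_ids))
    (index_dir / "embeddings.json").write_text(json.dumps(embeddings))


with tempfile.TemporaryDirectory() as tmp:
    index_dir = Path(tmp) / "toy-index"
    save_toy_index(index_dir, ["d1", "d2"], [[0.1, 0.2], [0.3, 0.4]])
    # Inspect what was written while the temporary directory still exists.
    saved_files = sorted(path.name for path in index_dir.iterdir())
    restored = json.loads((index_dir / "embeddings.json").read_text())
```

Keeping the document IDs next to the embeddings is a design choice of the sketch: a retriever loading the directory later needs both to map a matching row back to a document.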

to_cpu() → None

Move the index to the CPU.

to_gpu() → None

Move the index to the GPU.