TorchDenseIndexer

class lightning_ir.retrieve.pytorch.dense_indexer.TorchDenseIndexer(index_dir: Path, index_config: TorchDenseIndexConfig, module: BiEncoderModule, verbose: bool = False)

Bases: Indexer

Indexer for dense embeddings using PyTorch.

__init__(index_dir: Path, index_config: TorchDenseIndexConfig, module: BiEncoderModule, verbose: bool = False) → None

Initialize the TorchDenseIndexer.

Parameters:
  • index_dir (Path) – Directory to store the index.

  • index_config (TorchDenseIndexConfig) – Configuration for the dense index.

  • module (BiEncoderModule) – Bi-encoder module to use for indexing.

  • verbose (bool) – Whether to print verbose output. Defaults to False.

Methods

__init__(index_dir, index_config, module[, ...])

Initialize the TorchDenseIndexer.

add(index_batch, output)

Add embeddings from the output to the index.

save()

Save the index to the specified directory.

to_cpu()

Convert the index to CPU format.

to_gpu()

Convert the index to GPU format.

add(index_batch: IndexBatch, output: BiEncoderOutput) → None

Add embeddings from the output to the index.

Parameters:
  • index_batch (IndexBatch) – Batch containing the index data.

  • output (BiEncoderOutput) – Output from the Bi-encoder model containing embeddings.

Raises:

ValueError – If output does not contain document embeddings.
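The contract described above (append a batch's embeddings, reject an output that has no document embeddings) can be illustrated with a toy stand-in. This is a minimal sketch, not the library's implementation: ToyBatch, ToyOutput, and ToyIndexer are hypothetical classes invented for this example, and storing embeddings as plain Python lists is an assumption made only to keep the sketch dependency-free.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class ToyBatch:
    """Hypothetical stand-in for IndexBatch: only carries document ids."""
    doc_ids: Sequence[str]


@dataclass
class ToyOutput:
    """Hypothetical stand-in for BiEncoderOutput: one vector per document."""
    doc_embeddings: Optional[List[List[float]]] = None


@dataclass
class ToyIndexer:
    """Toy indexer mirroring the documented add() contract, not the real class."""
    doc_ids: List[str] = field(default_factory=list)
    embeddings: List[List[float]] = field(default_factory=list)

    def add(self, index_batch: ToyBatch, output: ToyOutput) -> None:
        # Documented behaviour: an output without document embeddings is an error.
        if output.doc_embeddings is None:
            raise ValueError("Expected document embeddings in output")
        # Accumulate ids and vectors in the same order so they stay aligned.
        self.doc_ids.extend(index_batch.doc_ids)
        self.embeddings.extend(output.doc_embeddings)


indexer = ToyIndexer()
indexer.add(ToyBatch(["d1", "d2"]), ToyOutput([[0.1, 0.2], [0.3, 0.4]]))

try:
    indexer.add(ToyBatch(["d3"]), ToyOutput())  # no document embeddings
    rejected = False
except ValueError:
    rejected = True
```

Because the check runs before any state is modified, the rejected batch leaves the toy index unchanged. Whether the real class gives the same guarantee is not stated in this reference.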

save() → None

Save the index to the specified directory.

to_cpu() → None

Convert the index to CPU format.

to_gpu() → None

Convert the index to GPU format.