TorchDenseSearcher

class lightning_ir.retrieve.pytorch.dense_searcher.TorchDenseSearcher(index_dir: Path, search_config: TorchDenseSearchConfig, module: BiEncoderModule, use_gpu: bool = True)

Bases: ExactSearcher

Torch-based dense searcher that performs exact search over indexed embeddings.

__init__(index_dir: Path, search_config: TorchDenseSearchConfig, module: BiEncoderModule, use_gpu: bool = True) → None

Initialize the TorchDenseSearcher.

Parameters:
  • index_dir (Path) – Directory where the index is stored.

  • search_config (TorchDenseSearchConfig) – Configuration for the dense search.

  • module (BiEncoderModule) – Bi-encoder module to use for searching.

  • use_gpu (bool) – Whether to use GPU for searching. Defaults to True.

Methods

__init__(index_dir, search_config, module[, ...])

Initialize the TorchDenseSearcher.

to_gpu()

Move the searcher to the GPU if available.

Attributes

property doc_token_idcs: Tensor

Get the document token indices for scoring.

Returns:

The document token indices.

Return type:

torch.Tensor

search(output: BiEncoderOutput) → Tuple[PackedTensor, List[List[str]]]

Search for documents based on the output of the bi-encoder model.

Parameters:

output (BiEncoderOutput) – The output from the bi-encoder model containing query and document embeddings.

Returns:

The top-k scores and corresponding document IDs.

Return type:

Tuple[PackedTensor, List[List[str]]]
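
The idea behind an exact top-k search can be sketched as follows. This is a minimal illustration in plain Python, not the library's implementation: the function `exact_top_k`, its parameters, the dot-product scoring, and the toy data are all assumptions made for the example. The flat score list only loosely mirrors a packed tensor, in which the scores of all queries are stored back to back.

```python
from typing import Dict, List, Tuple


def exact_top_k(
    query_embeddings: List[List[float]],
    doc_embeddings: Dict[str, List[float]],
    k: int,
) -> Tuple[List[float], List[List[str]]]:
    """Score every document against every query and keep the k best per query.

    Hypothetical helper for illustration only; it is not part of lightning_ir.
    """
    flat_scores: List[float] = []
    doc_ids_per_query: List[List[str]] = []
    for query in query_embeddings:
        # Exhaustive (exact) scoring: dot product with every document.
        scored = [
            (sum(q * d for q, d in zip(query, doc)), doc_id)
            for doc_id, doc in doc_embeddings.items()
        ]
        # Highest score first; ties are broken by document ID for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        top = scored[:k]
        # Scores of all queries are concatenated into one flat list.
        flat_scores.extend(score for score, _ in top)
        doc_ids_per_query.append([doc_id for _, doc_id in top])
    return flat_scores, doc_ids_per_query


# Toy data (assumed): two queries and three documents in a 2-d embedding space.
queries = [[1.0, 0.0], [0.0, 1.0]]
docs = {"d1": [0.9, 0.1], "d2": [0.2, 0.8], "d3": [0.5, 0.5]}
scores, doc_ids = exact_top_k(queries, docs, k=2)
```

With this toy data, `doc_ids` holds one list of document IDs per query and `scores` holds the matching scores in the same order, which is the general shape of the tuple that `search` is documented to return.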

to_gpu() → None

Move the searcher to the GPU if available.