TorchSparseSearcher

class lightning_ir.retrieve.pytorch.sparse_searcher.TorchSparseSearcher(index_dir: Path, search_config: TorchSparseSearchConfig, module: BiEncoderModule, use_gpu: bool = True)

Bases: ExactSearcher

Torch-based sparse searcher for the Lightning IR framework.

__init__(index_dir: Path, search_config: TorchSparseSearchConfig, module: BiEncoderModule, use_gpu: bool = True) → None

Initialize the TorchSparseSearcher.

Parameters:
  • index_dir (Path) – Directory containing the index files.

  • search_config (TorchSparseSearchConfig) – Configuration for the searcher.

  • module (BiEncoderModule) – The BiEncoder module to use for scoring.

  • use_gpu (bool) – Whether to use GPU for computations. Defaults to True.

Methods

  • __init__(index_dir, search_config, module[, ...]) – Initialize the TorchSparseSearcher.

  • to_gpu() – Move the searcher and index to GPU if available.

Attributes

property doc_token_idcs: Tensor

Get the document token indices for scoring.

Returns:

The document token indices.

Return type:

torch.Tensor

search(output: BiEncoderOutput) → Tuple[PackedTensor, List[List[str]]]

Search for documents based on the output of the bi-encoder model.

Parameters:

output (BiEncoderOutput) – The output from the bi-encoder model containing query and document embeddings.

Returns:

The top-k scores and corresponding document IDs.

Return type:

Tuple[PackedTensor, List[List[str]]]
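
The shape of this return value can be illustrated with a minimal plain-Python sketch of exact top-k search over sparse vectors. This is not the library's implementation: the names sparse_dot and exact_sparse_search, the dict-based sparse vectors, and the flat score list standing in for a PackedTensor are all assumptions made for illustration only.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical representation: a sparse vector maps token id -> weight.
SparseVector = Dict[int, float]


def sparse_dot(query: SparseVector, doc: SparseVector) -> float:
    """Dot product of two sparse vectors, iterating over the smaller one."""
    if len(query) > len(doc):
        query, doc = doc, query
    return sum(weight * doc[token] for token, weight in query.items() if token in doc)


def exact_sparse_search(
    queries: List[SparseVector],
    doc_ids: List[str],
    docs: List[SparseVector],
    k: int,
) -> Tuple[List[float], List[List[str]]]:
    """Score every document against every query and keep the top k per query.

    Returns a flat list of scores (all queries concatenated, standing in for a
    packed tensor) and one list of document IDs per query.
    """
    flat_scores: List[float] = []
    ids_per_query: List[List[str]] = []
    for query in queries:
        scored = ((sparse_dot(query, doc), doc_id) for doc_id, doc in zip(doc_ids, docs))
        # nlargest returns the k best (score, doc_id) pairs, highest score first.
        top = heapq.nlargest(k, scored, key=lambda pair: pair[0])
        flat_scores.extend(score for score, _ in top)
        ids_per_query.append([doc_id for _, doc_id in top])
    return flat_scores, ids_per_query


# Toy data, invented for this example.
doc_ids = ["d0", "d1", "d2"]
docs = [{1: 1.0, 2: 2.0}, {2: 1.0, 3: 1.0}, {4: 5.0}]
queries = [{2: 1.0}, {3: 2.0, 4: 1.0}]

scores, ids = exact_sparse_search(queries, doc_ids, docs, k=2)
```

In this sketch, scores holds k entries per query back to back, and ids[i] lists the document IDs for query i in descending score order, mirroring the pairing of packed scores with per-query ID lists described above.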

to_gpu() → None

Move the searcher and index to GPU if available.