FaissIndexer

class lightning_ir.retrieve.faiss.faiss_indexer.FaissIndexer(index_dir: Path, index_config: FaissIndexConfig, module: BiEncoderModule, verbose: bool = False)

Bases: Indexer

Base class for FAISS indexers in the Lightning IR framework.

__init__(index_dir: Path, index_config: FaissIndexConfig, module: BiEncoderModule, verbose: bool = False) → None

Initialize the FaissIndexer.

Parameters:
  • index_dir (Path) – Directory where the index will be stored.

  • index_config (FaissIndexConfig) – Configuration for the FAISS index.

  • module (BiEncoderModule) – The BiEncoderModule to use for indexing.

  • verbose (bool) – Whether to enable verbose output. Defaults to False.

Raises:

ValueError – If the similarity function is not supported.
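
The ValueError above suggests that the constructor validates the configured similarity function before it builds the index. A minimal sketch of such a check follows. It assumes, without confirmation from this page, that "cosine" and "dot" are the supported values and that both map to an inner-product metric. `SUPPORTED_SIMILARITIES` and `resolve_metric` are hypothetical names used for illustration only; they are not part of Lightning IR.

```python
# Hypothetical mapping; the supported values are an assumption, not taken from this page.
SUPPORTED_SIMILARITIES = {"cosine": "inner_product", "dot": "inner_product"}


def resolve_metric(similarity_function: str) -> str:
    """Return the metric name for a similarity function, or raise ValueError."""
    try:
        return SUPPORTED_SIMILARITIES[similarity_function]
    except KeyError:
        # Mirror the documented behavior: unsupported functions are rejected early.
        raise ValueError(
            f"Unsupported similarity function: {similarity_function!r}"
        ) from None
```

Failing at construction time, rather than when the first batch is added, means a misconfigured run stops before any documents are encoded.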

Methods

  • __init__(index_dir, index_config, module[, ...]) – Initialize the FaissIndexer.

  • add(index_batch, output) – Add embeddings to the FAISS index.

  • process_embeddings(embeddings) – Process embeddings before adding them to the FAISS index.

  • save() – Save the FAISS index to disk.

  • set_verbosity([verbose]) – Set the verbosity of the FAISS index.

  • to_cpu() – Move the FAISS index to CPU.

  • to_gpu() – Move the FAISS index to GPU.

Attributes

INDEX_FACTORY

add(index_batch: IndexBatch, output: BiEncoderOutput) → None

Add embeddings to the FAISS index.

Parameters:
  • index_batch (IndexBatch) – The batch containing document indices and embeddings.

  • output (BiEncoderOutput) – The output from the bi-encoder module containing document embeddings.

Raises:

ValueError – If the document embeddings are not present in the output.

process_embeddings(embeddings: Tensor) → Tensor

Process embeddings before adding them to the FAISS index.

Parameters:

embeddings (torch.Tensor) – The embeddings to process.

Returns:

The processed embeddings.

Return type:

torch.Tensor
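
This page does not say what the processing step does. One common preprocessing step when cosine similarity is served by an inner-product index is L2 normalization, because the inner product of two unit vectors equals their cosine similarity. The sketch below illustrates that idea on plain Python lists. It is an assumption about the kind of processing involved, not the library's implementation, and `l2_normalize` is a hypothetical name.

```python
import math
from typing import List


def l2_normalize(embeddings: List[List[float]], eps: float = 1e-12) -> List[List[float]]:
    """Scale each row to unit L2 norm; all-zero rows are returned as zeros."""
    normalized = []
    for row in embeddings:
        norm = math.sqrt(sum(x * x for x in row))
        # Clamp the divisor so a zero vector does not cause a division by zero.
        scale = max(norm, eps)
        normalized.append([x / scale for x in row])
    return normalized
```

For example, `l2_normalize([[3.0, 4.0]])` rescales the row to length 1, giving values close to `[[0.6, 0.8]]`.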

save() → None

Save the FAISS index to disk.

Raises:

ValueError – If the number of embeddings does not match the index’s total number of entries.
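
The consistency check described above can be illustrated with a toy stand-in. In this sketch, `CountingIndex` is a hypothetical class: `ntotal` imitates the entry count a FAISS index exposes, and `num_embeddings` is an assumed bookkeeping counter kept by the indexer. How Lightning IR tracks these values internally is not stated on this page.

```python
class CountingIndex:
    """Toy stand-in for an indexer that tracks how many vectors were added."""

    def __init__(self) -> None:
        self.ntotal = 0          # entries actually stored in the index
        self.num_embeddings = 0  # entries the indexer believes it has added

    def add(self, vectors: list) -> None:
        # Keep both counters in step with every batch that is added.
        self.ntotal += len(vectors)
        self.num_embeddings += len(vectors)

    def save(self) -> str:
        # Refuse to persist an index whose size disagrees with the bookkeeping.
        if self.num_embeddings != self.ntotal:
            raise ValueError(
                f"Expected {self.num_embeddings} embeddings, "
                f"but the index holds {self.ntotal} entries"
            )
        return "saved"
```

A check like this catches batches that were counted but never reached the index, or the reverse, before an inconsistent index is written to disk.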

set_verbosity(verbose: bool | None = None) → None

Set the verbosity of the FAISS index.

Parameters:

verbose (bool | None) – Whether to enable verbose output. If None, uses the index’s current verbosity setting. Defaults to None.

to_cpu() → None

Move the FAISS index to CPU.

to_gpu() → None

Move the FAISS index to GPU.