PlaidIndexer

class lightning_ir.retrieve.plaid.plaid_indexer.PlaidIndexer(index_dir: Path, index_config: PlaidIndexConfig, module: BiEncoderModule, verbose: bool = False)

Bases: Indexer

Indexer that builds a Plaid index using the fast-plaid library.

__init__(index_dir: Path, index_config: PlaidIndexConfig, module: BiEncoderModule, verbose: bool = False) -> None

Initialize the PlaidIndexer.

Parameters:
  • index_dir (Path) – Directory where the index will be stored.

  • index_config (PlaidIndexConfig) – Configuration for the Plaid indexer.

  • module (BiEncoderModule) – The BiEncoder module used for indexing.

  • verbose (bool) – Whether to print verbose output during indexing. Defaults to False.

Methods

  • __init__(index_dir, index_config, module[, ...]) – Initialize the PlaidIndexer.

  • add(index_batch, output) – Add embeddings from the index batch to the Plaid index.

  • save() – Save the index configuration and document IDs to the index directory.

add(index_batch: IndexBatch, output: BiEncoderOutput) -> None

Add embeddings from the index batch to the Plaid index.

Parameters:
  • index_batch (IndexBatch) – Batch of data containing embeddings to be indexed.

  • output (BiEncoderOutput) – Output from the BiEncoder module containing embeddings.

Raises:

ValueError – If the output does not contain document embeddings.
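
The contract described above (append the batch's document embeddings, raise ValueError when they are missing) can be illustrated with a toy sketch. It does not use lightning_ir or fast-plaid: FakeIndexBatch, FakeOutput, and ToyIndexer are hypothetical stand-ins, and the doc_ids and doc_embeddings attribute names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class FakeIndexBatch:
    """Hypothetical stand-in for IndexBatch; only carries document IDs."""
    doc_ids: Sequence[str]


@dataclass
class FakeOutput:
    """Hypothetical stand-in for BiEncoderOutput.

    doc_embeddings is laid out as documents x tokens x dimensions,
    or None when the model produced no document embeddings.
    """
    doc_embeddings: Optional[List[List[List[float]]]] = None


@dataclass
class ToyIndexer:
    """Toy class mirroring the documented add() contract, not the real PlaidIndexer."""
    doc_ids: List[str] = field(default_factory=list)
    num_embeddings: int = 0

    def add(self, index_batch: FakeIndexBatch, output: FakeOutput) -> None:
        # Documented failure mode: the output has no document embeddings.
        if output.doc_embeddings is None:
            raise ValueError("Expected doc_embeddings in the bi-encoder output")
        # Record each document ID and count its token-level vectors.
        for doc_id, token_vectors in zip(index_batch.doc_ids, output.doc_embeddings):
            self.doc_ids.append(doc_id)
            self.num_embeddings += len(token_vectors)


indexer = ToyIndexer()
indexer.add(
    FakeIndexBatch(["d1", "d2"]),
    FakeOutput([[[0.1, 0.2]], [[0.3, 0.4], [0.5, 0.6]]]),
)

# An output without document embeddings is rejected.
try:
    indexer.add(FakeIndexBatch(["d3"]), FakeOutput())
except ValueError as err:
    missing_error = str(err)
```

In this sketch the failed call leaves the indexer unchanged, so the doc IDs collected so far stay consistent with the embeddings that were counted.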

save() -> None

Save the index configuration and document IDs to the index directory.
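
As a rough illustration of what persisting a configuration and a list of document IDs can look like, here is a standard-library sketch. The helper save_index_metadata, the file names config.json and doc_ids.txt, and the n_bits config key are all assumptions for this example; the source does not state which files or formats PlaidIndexer.save() writes.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def save_index_metadata(index_dir: Path, config: Dict[str, object], doc_ids: List[str]) -> None:
    """Hypothetical helper: write a config and doc IDs next to an index."""
    index_dir.mkdir(parents=True, exist_ok=True)
    # Assumed layout: JSON for the config, one document ID per line.
    (index_dir / "config.json").write_text(json.dumps(config, indent=2))
    (index_dir / "doc_ids.txt").write_text("\n".join(doc_ids))


with tempfile.TemporaryDirectory() as tmp:
    index_dir = Path(tmp) / "plaid-index"
    save_index_metadata(index_dir, {"n_bits": 4}, ["d1", "d2"])

    # Read both files back to confirm the round trip.
    loaded_config = json.loads((index_dir / "config.json").read_text())
    loaded_doc_ids = (index_dir / "doc_ids.txt").read_text().split("\n")
```

Storing the document IDs in insertion order matters in a design like this, because the position of an ID is what maps an index entry back to its document.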