MultiVectorBiEncoderModel

class lightning_ir.bi_encoder.bi_encoder_model.MultiVectorBiEncoderModel(config: MultiVectorBiEncoderConfig, *args, **kwargs)

Bases: BiEncoderModel

__init__(config: MultiVectorBiEncoderConfig, *args, **kwargs) → None

Initializes a multi-vector bi-encoder model given a MultiVectorBiEncoderConfig.

Parameters:

config (MultiVectorBiEncoderConfig) – Configuration for the multi-vector bi-encoder model

Raises:
  • ValueError – If mask scoring tokens are specified in the configuration but the tokenizer is not available

  • ValueError – If the specified mask scoring tokens are not in the tokenizer vocab

Methods

__init__(config, *args, **kwargs)

Initializes a multi-vector bi-encoder model given a MultiVectorBiEncoderConfig.

aggregate_similarity(similarity, ...[, num_docs])

Aggregates the matrix of query-document similarities into a single score based on the configured aggregation strategy.

score(query_embeddings, doc_embeddings[, ...])

Compute relevance scores between queries and documents.

scoring_mask(encoding, input_type)

Computes a scoring mask for batched tokenized text sequences which is used in the scoring function to mask out vectors during scoring.

Attributes

supports_retrieval_models

training

ALLOW_SUB_BATCHING = True

Flag that allows the documents of a single query to be split into mini-batches. Set to False for listwise models to ensure correctness.

aggregate_similarity(similarity: Tensor, query_scoring_mask: Tensor, doc_scoring_mask: Tensor, num_docs: Sequence[int] | int | None = None) → Tensor

Aggregates the matrix of query-document similarities into a single score based on the configured aggregation strategy.

Parameters:
  • similarity (torch.Tensor) – Query-document similarity matrix

  • query_scoring_mask (torch.Tensor) – Which query vectors should be masked out during scoring

  • doc_scoring_mask (torch.Tensor) – Which document vectors should be masked out during scoring

Returns:

Aggregated similarity scores

Return type:

torch.Tensor
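
The aggregation step can be illustrated with a minimal sketch. It assumes a late-interaction (ColBERT-style) strategy: for each query vector take the maximum similarity over the document vectors, then sum over the query vectors. The actual strategy is whatever the configuration selects, so this is only one possible instance. Plain Python lists stand in for tensors, the function name maxsim_score is hypothetical, and a mask value of True is assumed to mean "this vector takes part in scoring".

```python
from typing import List


def maxsim_score(
    similarity: List[List[float]],
    query_mask: List[bool],
    doc_mask: List[bool],
) -> float:
    """Sum, over kept query vectors, of the max similarity to any kept doc vector.

    similarity[i][j] is the similarity between query vector i and doc vector j.
    Assumption: a mask entry of True means the vector participates in scoring.
    """
    total = 0.0
    for q_idx, row in enumerate(similarity):
        if not query_mask[q_idx]:
            continue  # masked query vectors contribute nothing
        kept = [sim for sim, keep in zip(row, doc_mask) if keep]
        if kept:
            total += max(kept)
    return total


# One query with two vectors, one document with three vectors (the last is masked).
similarity = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.7],
]
score = maxsim_score(similarity, [True, True], [True, True, False])
```

Other aggregation choices (for example a mean over query vectors) would only change the reduction inside the loop.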

compute_similarity(query_embeddings: BiEncoderEmbedding, doc_embeddings: BiEncoderEmbedding, num_docs: Sequence[int] | int | None = None) → Tensor

Computes the similarity score between all query and document embedding vector pairs.

Parameters:
  • query_embeddings (BiEncoderEmbedding) – Embeddings of the queries

  • doc_embeddings (BiEncoderEmbedding) – Embeddings of the documents

  • num_docs (Sequence[int] | int | None, optional) – Number of documents per query. If a sequence of integers, it must contain one value per query, so len(num_docs) equals the number of queries and sum(num_docs) equals the number of documents. If an integer, every query is assumed to have that many documents. If None, the value is inferred by dividing the number of documents by the number of queries. Defaults to None

Returns:

Similarity scores between all query and document embedding vector pairs

Return type:

torch.Tensor
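
To make "all query and document embedding vector pairs" concrete, the following sketch computes a full similarity matrix for one query and one document. It assumes dot-product similarity; the similarity function actually used is determined by the model configuration. The helper name dot_similarity is hypothetical, and nested lists stand in for embedding tensors.

```python
from typing import List


def dot_similarity(
    query_vecs: List[List[float]],
    doc_vecs: List[List[float]],
) -> List[List[float]]:
    """Return a matrix where entry [i][j] is the dot product of
    query vector i and document vector j."""
    return [
        [sum(q * d for q, d in zip(q_vec, d_vec)) for d_vec in doc_vecs]
        for q_vec in query_vecs
    ]


query_vecs = [[1.0, 0.0], [0.0, 2.0]]            # 2 query vectors, dimension 2
doc_vecs = [[1.0, 1.0], [0.5, 0.0], [0.0, 3.0]]  # 3 document vectors, dimension 2
scores = dot_similarity(query_vecs, doc_vecs)    # 2 x 3 similarity matrix
```

The resulting matrix has one row per query vector and one column per document vector, which is the shape an aggregation step would then reduce to a single score.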

config_class

Configuration class for the multi-vector bi-encoder model.

alias of MultiVectorBiEncoderConfig

abstract encode(encoding: BatchEncoding, input_type: Literal['query', 'doc']) → BiEncoderEmbedding

Encodes batched tokenized text sequences and returns the embeddings and scoring mask.

Parameters:
  • encoding (BatchEncoding) – Tokenizer encodings for the text sequence

  • input_type (Literal["query", "doc"]) – Type of input, either “query” or “doc”

Returns:

Embeddings and scoring mask

Return type:

BiEncoderEmbedding

encode_doc(encoding: BatchEncoding) → BiEncoderEmbedding

Encodes tokenized documents.

Parameters:

encoding (BatchEncoding) – Tokenizer encodings for the documents

Returns:

Document embeddings and scoring mask

Return type:

BiEncoderEmbedding

encode_query(encoding: BatchEncoding) → BiEncoderEmbedding

Encodes tokenized queries.

Parameters:

encoding (BatchEncoding) – Tokenizer encodings for the queries

Returns:

Query embeddings and scoring mask

Return type:

BiEncoderEmbedding

forward(query_encoding: BatchEncoding | None, doc_encoding: BatchEncoding | None, num_docs: Sequence[int] | int | None = None) → BiEncoderOutput

Embeds queries and/or documents and computes relevance scores between them if both are provided.

Parameters:
  • query_encoding (BatchEncoding | None) – Tokenizer encodings for the queries

  • doc_encoding (BatchEncoding | None) – Tokenizer encodings for the documents

  • num_docs (Sequence[int] | int | None, optional) – Number of documents per query. If a sequence of integers, it must contain one value per query, so len(num_docs) equals the number of queries and sum(num_docs) equals the number of documents. If an integer, every query is assumed to have that many documents. If None, the value is inferred by dividing the number of documents by the number of queries. Defaults to None

Returns:

Output of the model

Return type:

BiEncoderOutput
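
The three accepted forms of num_docs can be illustrated with a small sketch that normalizes them to one count per query. This is not the library's implementation: the helper name resolve_num_docs and its error handling are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence, Union


def resolve_num_docs(
    num_queries: int,
    total_docs: int,
    num_docs: Optional[Union[Sequence[int], int]] = None,
) -> List[int]:
    """Normalize num_docs to a list with one document count per query."""
    if num_docs is None:
        # Infer an equal split: total documents divided by number of queries.
        if num_queries == 0 or total_docs % num_queries != 0:
            raise ValueError("cannot infer num_docs from the batch sizes")
        return [total_docs // num_queries] * num_queries
    if isinstance(num_docs, int):
        counts = [num_docs] * num_queries  # same count for every query
    else:
        counts = list(num_docs)            # explicit count per query
    if len(counts) != num_queries or sum(counts) != total_docs:
        raise ValueError("num_docs does not match the number of queries/documents")
    return counts


inferred = resolve_num_docs(num_queries=2, total_docs=6)               # None case
uniform = resolve_num_docs(num_queries=3, total_docs=6, num_docs=2)    # int case
explicit = resolve_num_docs(num_queries=2, total_docs=5, num_docs=[2, 3])
```

Passing an explicit sequence is the only form that can describe batches where queries have different numbers of documents.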

classmethod from_pretrained(model_name_or_path: str | Path, *args, **kwargs) → Self

Loads a pretrained model. Wraps the transformers.PreTrainedModel.from_pretrained method to return a derived LightningIRModel. See LightningIRModelClassFactory for more details.

Parameters:

model_name_or_path (str | Path) – Name or path of the pretrained model

Raises:

ValueError – If called on the abstract class LightningIRModel and no config is passed

Returns:

A derived LightningIRModel consisting of a backbone model and a LightningIRModel mixin

Return type:

LightningIRModel

>>> # Loading using model class and backbone checkpoint
>>> type(CrossEncoderModel.from_pretrained("bert-base-uncased"))
<class 'lightning_ir.base.class_factory.CrossEncoderBertModel'>
>>> # Loading using base class and backbone checkpoint
>>> type(LightningIRModel.from_pretrained("bert-base-uncased", config=CrossEncoderConfig()))
<class 'lightning_ir.base.class_factory.CrossEncoderBertModel'>

pooling(embeddings: Tensor, attention_mask: Tensor | None, pooling_strategy: Literal['first', 'mean', 'max', 'sum'] | None) → Tensor

Helper method to apply pooling to the embeddings.

Parameters:
  • embeddings (torch.Tensor) – Query or document embeddings

  • attention_mask (torch.Tensor | None) – Query or document attention mask

  • pooling_strategy (Literal['first', 'mean', 'max', 'sum'] | None) – The pooling strategy. No pooling is applied if None.

Raises:

ValueError – If an unknown pooling strategy is passed

Returns:

(Optionally) pooled embeddings

Return type:

torch.Tensor
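
The four pooling strategies can be sketched for a single sequence as follows. This is an illustration of the general idea rather than the library's code: the function name pool is hypothetical, nested lists stand in for tensors, and it is assumed that positions with an attention mask of 0 are ignored by the 'mean', 'max' and 'sum' strategies.

```python
from typing import List, Optional


def pool(
    embeddings: List[List[float]],
    attention_mask: Optional[List[int]],
    pooling_strategy: Optional[str],
) -> List[List[float]]:
    """Pool the token vectors of one sequence into a single vector.

    Returns the input unchanged if pooling_strategy is None, otherwise a
    list containing exactly one pooled vector.
    """
    if pooling_strategy is None:
        return embeddings
    if pooling_strategy not in ("first", "mean", "max", "sum"):
        raise ValueError(f"Unknown pooling strategy: {pooling_strategy}")
    if pooling_strategy == "first":
        return [embeddings[0]]  # keep only the first token's vector
    if attention_mask is None:
        attention_mask = [1] * len(embeddings)
    # Assumption: masked (padding) positions do not take part in pooling.
    kept = [vec for vec, m in zip(embeddings, attention_mask) if m]
    columns = list(zip(*kept))  # one tuple per embedding dimension
    if pooling_strategy == "sum":
        return [[sum(col) for col in columns]]
    if pooling_strategy == "mean":
        return [[sum(col) / len(kept) for col in columns]]
    return [[max(col) for col in columns]]  # "max"


token_vectors = [[1.0, 4.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the last position is padding
mean_pooled = pool(token_vectors, mask, "mean")
max_pooled = pool(token_vectors, mask, "max")
```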

score(query_embeddings: BiEncoderEmbedding, doc_embeddings: BiEncoderEmbedding, num_docs: Sequence[int] | int | None = None) → Tensor

Compute relevance scores between queries and documents.

Parameters:
  • query_embeddings (BiEncoderEmbedding) – Embeddings and scoring mask for the queries

  • doc_embeddings (BiEncoderEmbedding) – Embeddings and scoring mask for the documents

  • num_docs (Sequence[int] | int | None, optional) – Number of documents per query. If a sequence of integers, it must contain one value per query, so len(num_docs) equals the number of queries and sum(num_docs) equals the number of documents. If an integer, every query is assumed to have that many documents. If None, the value is inferred by dividing the number of documents by the number of queries. Defaults to None

Returns:

Relevance scores

Return type:

torch.Tensor

scoring_mask(encoding: BatchEncoding, input_type: Literal['query', 'doc']) → Tensor

Computes a scoring mask for batched tokenized text sequences which is used in the scoring function to mask out vectors during scoring.

Parameters:
  • encoding (BatchEncoding) – Tokenizer encodings for the text sequence

  • input_type (Literal["query", "doc"]) – Type of input, either “query” or “doc”

Returns:

Scoring mask

Return type:

torch.Tensor
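
As a rough illustration of what a scoring mask might look like, the sketch below derives one for a single tokenized sequence. It assumes the mask starts from the attention mask and additionally excludes any token whose ID is in a configured set of mask scoring tokens (for example punctuation). The function name build_scoring_mask, the token IDs, and the convention that True means "use this vector for scoring" are all assumptions, not confirmed library behavior.

```python
from typing import List, Set


def build_scoring_mask(
    input_ids: List[int],
    attention_mask: List[int],
    masked_token_ids: Set[int],
) -> List[bool]:
    """Return True for every position whose vector should be scored.

    A position is kept if it is attended to (not padding) and its token ID
    is not one of the explicitly masked token IDs.
    """
    return [
        bool(attended) and token_id not in masked_token_ids
        for token_id, attended in zip(input_ids, attention_mask)
    ]


# Hypothetical token IDs: 7 plays the role of a punctuation token, 0 is padding.
input_ids = [11, 42, 7, 13, 0]
attention_mask = [1, 1, 1, 1, 0]
mask = build_scoring_mask(input_ids, attention_mask, masked_token_ids={7})
```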

sparsification(embeddings: Tensor, sparsification_strategy: Literal['relu', 'relu_log'] | None = None) → Tensor

Helper method to apply sparsification to the embeddings.

Parameters:
  • embeddings (torch.Tensor) – Query or document embeddings

  • sparsification_strategy (Literal['relu', 'relu_log'] | None, optional) – The sparsification strategy. No sparsification is applied if None, defaults to None

Raises:

ValueError – If an unknown sparsification strategy is passed

Returns:

(Optionally) sparsified embeddings

Return type:

torch.Tensor
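
The two sparsification strategies can be sketched element-wise. The sketch assumes that 'relu' clamps negative values to zero and that 'relu_log' applies log(1 + relu(x)), as is common in learned sparse retrieval models; the exact formula used by the library is not stated above, so treat this as an assumption. The function name sparsify is hypothetical and a flat list stands in for an embedding tensor.

```python
import math
from typing import List, Optional


def sparsify(values: List[float], strategy: Optional[str] = None) -> List[float]:
    """Apply an element-wise sparsification to one embedding vector."""
    if strategy is None:
        return list(values)  # no sparsification
    if strategy == "relu":
        return [max(0.0, v) for v in values]
    if strategy == "relu_log":
        # Assumed definition: log(1 + relu(x)), which dampens large activations.
        return [math.log1p(max(0.0, v)) for v in values]
    raise ValueError(f"Unknown sparsification strategy: {strategy}")


vector = [-1.0, 0.0, 2.0]
relu_out = sparsify(vector, "relu")
relu_log_out = sparsify(vector, "relu_log")
```

Both strategies map non-positive inputs to exactly zero, which is what makes the resulting vectors sparse.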