SpladeModel
- class lightning_ir.models.splade.SpladeModel(config: SingleVectorBiEncoderConfig, *args, **kwargs)[source]
Bases: SingleVectorBiEncoderModel
Sparse lexical SPLADE model. See SpladeConfig for configuration options.
- __init__(config: SingleVectorBiEncoderConfig, *args, **kwargs) None[source]
Initializes a SPLADE model given a SpladeConfig.
- Parameters:
config (SingleVectorBiEncoderConfig) – Configuration for the SPLADE model.
Methods
__init__(config, *args, **kwargs)
Initializes a SPLADE model given a SpladeConfig.
encode(encoding, input_type)
Encodes batched tokenized text sequences and returns the embeddings and scoring mask.
from_pretrained(model_name_or_path, *args, ...)
Loads a pretrained model and handles mapping the MLM head weights to the projection head weights.
get_output_embeddings()
Returns the output embeddings of the model for tying the input and output embeddings.
set_output_embeddings(new_embeddings)
Sets the model's output embedding, defaulting to setting new_embeddings to lm_head.
Attributes
training
- ALLOW_SUB_BATCHING = True
Flag to allow mini-batches of documents for a single query. Set to False for listwise models to ensure correctness.
- compute_similarity(query_embeddings: BiEncoderEmbedding, doc_embeddings: BiEncoderEmbedding, num_docs: Sequence[int] | int | None = None) Tensor
Computes the similarity score between all query and document embedding vector pairs.
- Parameters:
query_embeddings (BiEncoderEmbedding) – Embeddings of the queries.
doc_embeddings (BiEncoderEmbedding) – Embeddings of the documents.
num_docs (Sequence[int] | int | None) – Specifies how many documents are passed per query. If a sequence of integers, len(num_docs) should be equal to the number of queries and sum(num_docs) equal to the number of documents, i.e., the sequence contains one value per query specifying the number of documents for that query. If an integer, assumes an equal number of documents per query. If None, tries to infer the number of documents by dividing the number of documents by the number of queries. Defaults to None.
- Returns:
Similarity scores between all query and document embedding vector pairs.
- Return type:
torch.Tensor
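As an illustration of the num_docs bookkeeping, the following is a minimal pure-Python sketch of pairwise dot-product scoring. It is not the library's implementation (which operates on batched torch tensors); the function name and list-of-lists layout are illustrative only:

```python
def compute_similarity(query_embeddings, doc_embeddings, num_docs=None):
    """Dot-product similarity between each query and its documents.

    query_embeddings: one vector per query.
    doc_embeddings: document vectors, flattened over all queries.
    num_docs: per-query document counts (sequence), a single int for an
        equal split, or None to infer an equal split from the totals.
    """
    num_queries = len(query_embeddings)
    if num_docs is None:
        # Infer by dividing the number of documents by the number of queries.
        assert len(doc_embeddings) % num_queries == 0
        num_docs = len(doc_embeddings) // num_queries
    if isinstance(num_docs, int):
        num_docs = [num_docs] * num_queries
    assert sum(num_docs) == len(doc_embeddings)

    scores = []
    offset = 0
    for query, n in zip(query_embeddings, num_docs):
        for doc in doc_embeddings[offset : offset + n]:
            scores.append(sum(q * d for q, d in zip(query, doc)))
        offset += n
    return scores
```

With two queries and four documents and num_docs=None, the function infers two documents per query and returns one score per query-document pair.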
- config_class
Configuration class for a SPLADE model.
alias of
SpladeConfig
- encode(encoding: BatchEncoding, input_type: Literal['query', 'doc']) BiEncoderEmbedding[source]
Encodes batched tokenized text sequences and returns the embeddings and scoring mask.
- Parameters:
encoding (BatchEncoding) – Tokenizer encodings for the text sequence.
input_type (Literal["query", "doc"]) – Type of input, either “query” or “doc”.
- Returns:
Embeddings and scoring mask.
- Return type:
BiEncoderEmbedding
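The SPLADE term weighting behind encode can be sketched in plain Python. This assumes the common relu_log sparsification with max pooling over the sequence; the actual model obtains the per-token logits by projecting contextualized token representations through the MLM head, which is omitted here:

```python
import math

def splade_term_weights(token_logits):
    """Sketch of SPLADE term weighting.

    For each vocabulary dimension, apply log-saturated ReLU to every
    token's logit, then max-pool over the sequence, yielding one sparse
    vocabulary-sized vector per text.

    token_logits: per-token logit vectors (seq_len x vocab_size).
    """
    vocab_size = len(token_logits[0])
    weights = []
    for j in range(vocab_size):
        # relu_log sparsification: log(1 + max(0, logit)), then max over tokens
        weights.append(max(math.log1p(max(0.0, tok[j])) for tok in token_logits))
    return weights
```

Dimensions where no token produces a positive logit receive weight zero, which is what makes the resulting vector sparse and usable with an inverted index.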
- encode_doc(encoding: BatchEncoding) BiEncoderEmbedding
Encodes tokenized documents.
- Parameters:
encoding (BatchEncoding) – Tokenizer encodings for the documents.
- Returns:
Document embeddings and scoring mask.
- Return type:
BiEncoderEmbedding
- encode_query(encoding: BatchEncoding) BiEncoderEmbedding
Encodes tokenized queries.
- Parameters:
encoding (BatchEncoding) – Tokenizer encodings for the queries.
- Returns:
Query embeddings and scoring mask.
- Return type:
BiEncoderEmbedding
- forward(query_encoding: BatchEncoding | None, doc_encoding: BatchEncoding | None, num_docs: Sequence[int] | int | None = None) BiEncoderOutput
Embeds queries and/or documents and computes relevance scores between them if both are provided.
- Parameters:
query_encoding (BatchEncoding | None) – Tokenizer encodings for the queries. Defaults to None.
doc_encoding (BatchEncoding | None) – Tokenizer encodings for the documents. Defaults to None.
num_docs (Sequence[int] | int | None) – Specifies how many documents are passed per query. If a sequence of integers, len(num_docs) should be equal to the number of queries and sum(num_docs) equal to the number of documents, i.e., the sequence contains one value per query specifying the number of documents for that query. If an integer, assumes an equal number of documents per query. If None, tries to infer the number of documents by dividing the number of documents by the number of queries. Defaults to None.
- Returns:
Output of the model containing query and document embeddings and relevance scores.
- Return type:
BiEncoderOutput
- classmethod from_pretrained(model_name_or_path: str | Path, *args, **kwargs) Self[source]
Loads a pretrained model and handles mapping the MLM head weights to the projection head weights. Wraps the transformers.PreTrainedModel.from_pretrained method to return a derived LightningIRModel. See LightningIRModelClassFactory for more details.
>>> # Loading using model class and backbone checkpoint
>>> type(CrossEncoderModel.from_pretrained("bert-base-uncased"))
<class 'lightning_ir.base.class_factory.CrossEncoderBertModel'>
>>> # Loading using base class and backbone checkpoint
>>> type(LightningIRModel.from_pretrained("bert-base-uncased", config=CrossEncoderConfig()))
<class 'lightning_ir.base.class_factory.CrossEncoderBertModel'>
- Args:
model_name_or_path (str | Path): Name or path of the pretrained model.
- Returns:
Self: A derived LightningIRModel consisting of a backbone model and a LightningIRModel mixin.
- Raises:
ValueError: If called on the abstract class SpladeModel and no config is passed.
- get_output_embeddings() Module | None[source]
Returns the output embeddings of the model for tying the input and output embeddings. Returns None if no MLM head is used for projection.
- Returns:
Output embeddings of the model.
- Return type:
torch.nn.Module | None
- pooling(embeddings: Tensor, attention_mask: Tensor | None, pooling_strategy: Literal['first', 'mean', 'max', 'sum'] | None) Tensor
Helper method to apply pooling to the embeddings.
- Parameters:
embeddings (torch.Tensor) – Query or document embeddings.
attention_mask (torch.Tensor | None) – Query or document attention mask.
pooling_strategy (Literal['first', 'mean', 'max', 'sum'] | None) – The pooling strategy. No pooling is applied if None.
- Returns:
(Optionally) pooled embeddings.
- Return type:
torch.Tensor
- Raises:
ValueError – If an unknown pooling strategy is passed.
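The pooling strategies can be sketched in pure Python as follows. This is an illustrative stand-in for the tensor implementation: 'first' takes the first token (e.g., the CLS position), while 'mean', 'max', and 'sum' aggregate over the tokens selected by the attention mask:

```python
def pooling(embeddings, attention_mask, pooling_strategy):
    """Sketch of masked pooling over token embeddings.

    embeddings: token vectors (seq_len x dim).
    attention_mask: list of 0/1 ints (seq_len) or None.
    pooling_strategy: 'first', 'mean', 'max', 'sum', or None (no pooling).
    """
    if pooling_strategy is None:
        return embeddings
    if pooling_strategy == "first":
        return embeddings[0]
    if attention_mask is not None:
        # Drop padding positions before aggregating.
        embeddings = [e for e, m in zip(embeddings, attention_mask) if m]
    dim = len(embeddings[0])
    if pooling_strategy == "sum":
        return [sum(e[i] for e in embeddings) for i in range(dim)]
    if pooling_strategy == "mean":
        return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    if pooling_strategy == "max":
        return [max(e[i] for e in embeddings) for i in range(dim)]
    raise ValueError(f"Unknown pooling strategy: {pooling_strategy}")
```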
- score(output: BiEncoderOutput, num_docs: Sequence[int] | int | None = None) BiEncoderOutput
Compute relevance scores between queries and documents.
- Parameters:
output (BiEncoderOutput) – Output containing embeddings and scoring mask.
num_docs (Sequence[int] | int | None) – Specifies how many documents are passed per query. If a sequence of integers, len(num_docs) should be equal to the number of queries and sum(num_docs) equal to the number of documents, i.e., the sequence contains one value per query specifying the number of documents for that query. If an integer, assumes an equal number of documents per query. If None, tries to infer the number of documents by dividing the number of documents by the number of queries. Defaults to None.
- Returns:
Output containing relevance scores.
- Return type:
BiEncoderOutput
- Raises:
ValueError – If query or document embeddings are not provided in the output.
- set_output_embeddings(new_embeddings: Module) None[source]
Sets the model’s output embedding, defaulting to setting new_embeddings to lm_head.
- sparsification(embeddings: Tensor, sparsification_strategy: Literal['relu', 'relu_log'] | None = None) Tensor
Helper method to apply sparsification to the embeddings.
- Parameters:
embeddings (torch.Tensor) – Query or document embeddings.
sparsification_strategy (Literal['relu', 'relu_log'] | None) – The sparsification strategy. No sparsification is applied if None. Defaults to None.
- Returns:
(Optionally) sparsified embeddings.
- Return type:
torch.Tensor
- Raises:
ValueError – If an unknown sparsification strategy is passed.
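The two sparsification strategies can be sketched element-wise in pure Python (an illustrative stand-in for the tensor implementation): 'relu' clamps negative activations to zero, and 'relu_log' additionally applies log(1 + x) to saturate large activations:

```python
import math

def sparsification(embeddings, sparsification_strategy=None):
    """Sketch of element-wise sparsification of an embedding vector.

    'relu' zeroes out negative activations; 'relu_log' then applies
    log(1 + x) to dampen large values. None returns the input unchanged.
    """
    if sparsification_strategy is None:
        return embeddings
    if sparsification_strategy == "relu":
        return [max(0.0, x) for x in embeddings]
    if sparsification_strategy == "relu_log":
        return [math.log1p(max(0.0, x)) for x in embeddings]
    raise ValueError(f"Unknown sparsification strategy: {sparsification_strategy}")
```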