SpladeModel

class lightning_ir.models.bi_encoders.splade.SpladeModel(config: SingleVectorBiEncoderConfig, *args, **kwargs)[source]

Bases: SingleVectorBiEncoderModel

Sparse lexical SPLADE model. See SpladeConfig for configuration options.
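SPLADE produces sparse lexical embeddings by applying a log-saturated ReLU to each token's vocabulary logits and max-pooling over the sequence. The following is a toy pure-Python sketch of that aggregation (an illustration of the general SPLADE technique, not the library's actual implementation):

```python
import math

def splade_pool(token_logits):
    """Toy sketch of SPLADE-style sparsification: apply log(1 + ReLU(logit))
    to each token's vocabulary logits, then max-pool over the sequence so
    each vocabulary term keeps its strongest activation."""
    vocab_size = len(token_logits[0])
    pooled = [0.0] * vocab_size
    for logits in token_logits:  # one row of vocabulary logits per token position
        for v, logit in enumerate(logits):
            activation = math.log1p(max(0.0, logit))
            pooled[v] = max(pooled[v], activation)
    return pooled

# Two token positions over a 4-term vocabulary; negative logits are zeroed
# by the ReLU, yielding a sparse lexical vector.
vec = splade_pool([[2.0, -1.0, 0.0, 0.5],
                   [0.0, 3.0, -2.0, 0.2]])
```

Because most logits are negative or small, the pooled vector is mostly zeros, which is what makes SPLADE embeddings compatible with inverted-index retrieval.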

__init__(config: SingleVectorBiEncoderConfig, *args, **kwargs) None[source]

Initializes a SPLADE model given a SpladeConfig.

Parameters:

config (SingleVectorBiEncoderConfig) – Configuration for the SPLADE model.

Methods

__init__(config, *args, **kwargs)

Initializes a SPLADE model given a SpladeConfig.

encode(encoding, input_type)

Encodes batched tokenized text sequences and returns the embeddings and scoring mask.

from_pretrained(model_name_or_path, *args, ...)

Loads a pretrained model and handles mapping the MLM head weights to the projection head weights.

get_output_embeddings()

Returns the output embeddings of the model for tying the input and output embeddings.

set_output_embeddings(new_embeddings)

Sets the model's output embeddings; by default, new_embeddings is assigned to lm_head.

Attributes

training

config_class

Configuration class for a SPLADE model.

alias of SpladeConfig

encode(encoding: BatchEncoding, input_type: 'query' | 'doc') BiEncoderEmbedding[source]

Encodes batched tokenized text sequences and returns the embeddings and scoring mask.

Parameters:
  • encoding (BatchEncoding) – Tokenizer encodings for the text sequence.

  • input_type (Literal["query", "doc"]) – Type of input, either “query” or “doc”.

Returns:

Embeddings and scoring mask.

Return type:

BiEncoderEmbedding
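Since queries and documents are both encoded into sparse vectors over the same vocabulary, relevance can be scored with a simple dot product. A minimal pure-Python illustration (not the library's scoring code, which operates on BiEncoderEmbedding batches):

```python
def sparse_dot(query_vec, doc_vec):
    """Toy illustration: a SPLADE query embedding scores a document
    embedding via a dot product over the shared vocabulary dimension.
    Zero entries in either vector contribute nothing, so only terms
    activated in both query and document affect the score."""
    return sum(q * d for q, d in zip(query_vec, doc_vec))

# Only vocabulary terms 0 and 2 overlap between query and document.
score = sparse_dot([1.0, 0.0, 0.5], [0.2, 3.0, 2.0])
```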

classmethod from_pretrained(model_name_or_path: str | Path, *args, **kwargs) Self[source]
Loads a pretrained model and handles mapping the MLM head weights to the projection head weights. Wraps the transformers.PreTrainedModel.from_pretrained method to return a derived LightningIRModel. See LightningIRModelClassFactory for more details.

>>> # Loading using model class and backbone checkpoint
>>> type(CrossEncoderModel.from_pretrained("bert-base-uncased"))
<class 'lightning_ir.base.class_factory.CrossEncoderBertModel'>
>>> # Loading using base class and backbone checkpoint
>>> type(LightningIRModel.from_pretrained("bert-base-uncased", config=CrossEncoderConfig()))
<class 'lightning_ir.base.class_factory.CrossEncoderBertModel'>
Parameters:

model_name_or_path (str | Path) – Name or path of the pretrained model.

Returns:

A derived LightningIRModel consisting of a backbone model and a LightningIRModel mixin.

Return type:

Self

Raises:

ValueError – If called on the abstract class SpladeModel and no config is passed.

get_output_embeddings() Module | None[source]

Returns the output embeddings of the model for tying the input and output embeddings. Returns None if no MLM head is used for projection.

Returns:

Output embeddings of the model.

Return type:

torch.nn.Module | None
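The reason get_output_embeddings matters is weight tying: when the framework ties weights, the input embedding matrix and the output projection share one underlying tensor, so updating one updates the other. A hypothetical minimal sketch of that sharing (not lightning_ir's internals):

```python
class TiedHead:
    """Stand-in for an MLM/projection head whose weight is tied to the
    input embeddings. Tying means both names reference the same storage,
    not a copy."""
    def __init__(self, embedding_matrix):
        self.weight = embedding_matrix  # shared reference, no copy

input_embeddings = [[0.1, 0.2], [0.3, 0.4]]  # vocab_size x hidden_dim
lm_head = TiedHead(input_embeddings)          # tie the weights
input_embeddings[0][0] = 9.9                  # change propagates to the head
```

If get_output_embeddings returns None (no MLM head used for projection), there is nothing to tie and the input and output weights remain independent.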

set_output_embeddings(new_embeddings: Module) None[source]

Sets the model's output embeddings; by default, new_embeddings is assigned to lm_head.