SpladeConfig
- class lightning_ir.models.bi_encoders.splade.SpladeConfig(query_length: int | None = 32, doc_length: int | None = 512, similarity_function: 'cosine' | 'dot' = 'dot', sparsification: 'relu' | 'relu_log' | 'relu_2xlog' | None = 'relu_log', query_pooling_strategy: 'first' | 'mean' | 'max' | 'sum' = 'max', query_weighting: 'contextualized' | 'static' | None = 'contextualized', query_expansion: bool = True, doc_pooling_strategy: 'first' | 'mean' | 'max' | 'sum' = 'max', doc_weighting: 'contextualized' | 'static' | None = 'contextualized', doc_expansion: bool = True, **kwargs)
Bases: SingleVectorBiEncoderConfig
Configuration class for a SPLADE model.
- __init__(query_length: int | None = 32, doc_length: int | None = 512, similarity_function: 'cosine' | 'dot' = 'dot', sparsification: 'relu' | 'relu_log' | 'relu_2xlog' | None = 'relu_log', query_pooling_strategy: 'first' | 'mean' | 'max' | 'sum' = 'max', query_weighting: 'contextualized' | 'static' | None = 'contextualized', query_expansion: bool = True, doc_pooling_strategy: 'first' | 'mean' | 'max' | 'sum' = 'max', doc_weighting: 'contextualized' | 'static' | None = 'contextualized', doc_expansion: bool = True, **kwargs) → None
A SPLADE model encodes queries and documents separately. Before computing the similarity score, the contextualized token embeddings are projected into a logit distribution over the vocabulary using a pre-trained masked language model (MLM) head. The logit distribution is then sparsified and aggregated to obtain a single embedding for the query and document.
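The pipeline described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the toy logits stand in for real MLM head output, the helper names `relu_log`, `max_pool`, and `dot` are hypothetical, and the formula log(1 + max(0, x)) is an assumed reading of the 'relu_log' option.

```python
import math


def relu_log(x: float) -> float:
    # Assumed form of 'relu_log': log(1 + relu(x)); negative logits become 0.
    return math.log1p(max(0.0, x))


def max_pool(token_logits: list[list[float]]) -> list[float]:
    # Sparsify every logit, then keep the maximum over tokens
    # for each vocabulary entry (one column per vocabulary entry).
    return [max(relu_log(v) for v in column) for column in zip(*token_logits)]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


# Toy MLM logits: one row per token, one column per vocabulary entry.
query_logits = [[2.0, -1.0, 0.0, -3.0], [0.5, -0.5, -2.0, -1.0]]
doc_logits = [[1.0, -2.0, 3.0, -1.0], [-1.0, -1.0, 0.5, -4.0], [0.0, -3.0, 1.0, -2.0]]

query_emb = max_pool(query_logits)  # single sparse vector for the query
doc_emb = max_pool(doc_logits)      # single sparse vector for the document
score = dot(query_emb, doc_emb)     # similarity_function='dot'
```

In this toy case only the first vocabulary entry is non-zero in both vectors, so it alone contributes to the score.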
- Parameters:
query_length (int | None) – Maximum number of tokens per query. If None, queries are not truncated. Defaults to 32.
doc_length (int | None) – Maximum number of tokens per document. If None, documents are not truncated. Defaults to 512.
similarity_function (Literal["cosine", "dot"]) – Similarity function to compute scores between query and document embeddings. Defaults to “dot”.
sparsification (Literal['relu', 'relu_log', 'relu_2xlog'] | None) – Whether and which sparsification function to apply. Defaults to “relu_log”.
query_weighting (Literal["contextualized", "static"] | None) – Whether to reweight query embeddings. Defaults to “contextualized”.
query_expansion (bool) – Whether to allow query expansion. Defaults to True.
query_pooling_strategy (Literal["first", "mean", "max", "sum"]) – Pooling strategy for query embeddings. Defaults to “max”.
doc_pooling_strategy (Literal["first", "mean", "max", "sum"]) – Pooling strategy for document embeddings. Defaults to “max”.
doc_weighting (Literal["contextualized", "static"] | None) – Whether to reweight document embeddings. Defaults to “contextualized”.
doc_expansion (bool) – Whether to allow document expansion. Defaults to True.
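To make the option names above concrete, the sketch below spells out one plausible reading of each sparsification, pooling, and similarity value in plain Python. The formulas are assumptions inferred from the option names, not taken from the library source, and `SPARSIFY`, `pool`, and `similarity` are hypothetical helpers.

```python
import math

# Assumed readings of the sparsification options, applied per logit.
SPARSIFY = {
    None: lambda x: x,
    "relu": lambda x: max(0.0, x),
    "relu_log": lambda x: math.log1p(max(0.0, x)),
    "relu_2xlog": lambda x: math.log1p(math.log1p(max(0.0, x))),
}


def pool(rows: list[list[float]], strategy: str) -> list[float]:
    # rows: one list of vocabulary scores per token.
    columns = list(zip(*rows))
    if strategy == "first":
        return list(rows[0])
    if strategy == "mean":
        return [sum(c) / len(c) for c in columns]
    if strategy == "max":
        return [max(c) for c in columns]
    if strategy == "sum":
        return [sum(c) for c in columns]
    raise ValueError(f"unknown pooling strategy: {strategy}")


def similarity(a: list[float], b: list[float], function: str) -> float:
    d = sum(x * y for x, y in zip(a, b))
    if function == "dot":
        return d
    if function == "cosine":
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return d / norm if norm else 0.0
    raise ValueError(f"unknown similarity function: {function}")


rows = [[1.0, 0.0], [3.0, 2.0]]
pooled = {s: pool(rows, s) for s in ("first", "mean", "max", "sum")}
```

Under these assumptions, 'sum' and 'max' pooling differ only when a vocabulary entry is activated by more than one token, and 'cosine' removes the effect of vector magnitude that 'dot' retains.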
Methods
__init__([query_length, doc_length, ...]) – A SPLADE model encodes queries and documents separately.
Attributes
embedding_dim – Model type for a SPLADE model.