CoilModel

class lightning_ir.models.bi_encoders.coil.CoilModel(config: CoilConfig, *args, **kwargs)[source]

Multi-vector COIL model. See CoilConfig for configuration options.

__init__(config: CoilConfig, *args, **kwargs) → None[source]

Initializes a COIL model given a CoilConfig configuration.

Methods

`__init__`(config, *args, **kwargs)	Initializes a COIL model given a `CoilConfig` configuration.
`encode`(encoding, input_type)	Encodes a batch of tokenized text sequences and returns the embeddings and scoring mask.
`score`(output[, num_docs])	Compute relevance scores between queries and documents.

Attributes

training

config_class

Configuration class for COIL models.

encode(encoding: BatchEncoding, input_type: 'query' | 'doc') → CoilEmbedding[source]

Encodes a batch of tokenized text sequences and returns the embeddings and scoring mask.

Parameters:

encoding (BatchEncoding) – Tokenizer encodings for the batch of text sequences.
input_type (Literal["query", "doc"]) – Type of input, either “query” or “doc”.

Returns:

Embeddings and scoring mask.

Return type:

CoilEmbedding

score(output: CoilOutput, num_docs: Sequence[int] | int | None = None) → CoilOutput[source]

Compute relevance scores between queries and documents.

Parameters:

output (CoilOutput) – Model output containing the query and document embeddings (CLS embeddings, token embeddings, and scoring masks).
num_docs (Sequence[int] | int | None) – Specifies how many documents are passed per query. If a sequence of integers, len(num_docs) should equal the number of queries and sum(num_docs) should equal the number of documents, i.e., the sequence contains one value per query specifying the number of documents for that query. If an integer, assumes an equal number of documents per query. If None, infers the number of documents per query by dividing the number of documents by the number of queries. Defaults to None.
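The three accepted forms of num_docs can be illustrated with a small standalone sketch. The helper name `resolve_num_docs` and its arguments `num_queries` and `total_docs` are hypothetical and not part of lightning_ir; the sketch only mirrors the rules described above and may differ from how the library validates its input.

```python
from typing import List, Optional, Sequence, Union


def resolve_num_docs(
    num_queries: int,
    total_docs: int,
    num_docs: Optional[Union[Sequence[int], int]] = None,
) -> List[int]:
    """Hypothetical helper: expand num_docs into one count per query."""
    if num_docs is None:
        # Infer an equal split; this only works if the division is exact.
        if num_queries <= 0 or total_docs % num_queries != 0:
            raise ValueError("cannot infer num_docs from an uneven split")
        return [total_docs // num_queries] * num_queries
    if isinstance(num_docs, int):
        # A single integer means the same number of documents for every query.
        resolved = [num_docs] * num_queries
    else:
        # A sequence gives one value per query.
        resolved = list(num_docs)
    if len(resolved) != num_queries:
        raise ValueError("len(num_docs) must equal the number of queries")
    if sum(resolved) != total_docs:
        raise ValueError("sum(num_docs) must equal the number of documents")
    return resolved
```

For example, under these assumptions two queries with five documents in total could be described by `num_docs=[2, 3]`, while `num_docs=None` would be rejected because five documents cannot be split evenly across two queries.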

Returns:

Output containing the relevance scores.

Return type:

CoilOutput
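To give an intuition for how CLS embeddings and token embeddings can combine into one relevance score, here is a minimal pure-Python sketch of COIL-style scoring for a single query-document pair, following the description in the original COIL paper: a CLS dot product plus, for each query token, the maximum dot product over document tokens with the same token id. The function names `dot` and `coil_score` and all argument names are hypothetical. The sketch ignores the scoring mask and batching, and it is not lightning_ir's implementation, which operates on batched tensors and may differ in detail.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def coil_score(
    query_ids: Sequence[int],
    query_vecs: Sequence[Sequence[float]],
    doc_ids: Sequence[int],
    doc_vecs: Sequence[Sequence[float]],
    query_cls: Sequence[float],
    doc_cls: Sequence[float],
) -> float:
    """Hypothetical sketch of COIL-style scoring for one query and one document."""
    # Dense part: similarity of the two CLS embeddings.
    score = dot(query_cls, doc_cls)
    # Lexical part: each query token is compared only with document tokens
    # that share its token id, keeping the best match.
    for q_id, q_vec in zip(query_ids, query_vecs):
        matches = [
            dot(q_vec, d_vec)
            for d_id, d_vec in zip(doc_ids, doc_vecs)
            if d_id == q_id
        ]
        if matches:
            score += max(matches)
        # Query tokens that do not occur in the document contribute nothing.
    return score
```

In this sketch a query token only contributes when the document contains the same token id, which is the exact-match constraint that distinguishes COIL from models that compare every query token with every document token.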