TorchDenseIndex

class lightning_ir.retrieve.pytorch.dense_searcher.TorchDenseIndex(index_dir: Path, similarity_function: 'dot' | 'cosine', use_gpu: bool = False)

Bases: object

Torch-based dense index for embeddings.

__init__(index_dir: Path, similarity_function: 'dot' | 'cosine', use_gpu: bool = False) → None

Initialize the TorchDenseIndex.

Parameters:

index_dir (Path) – Directory where the index is stored.
similarity_function (Literal["dot", "cosine"]) – Similarity function to use for scoring.
use_gpu (bool) – Whether to use GPU for indexing. Defaults to False.

Raises:

ValueError – If the similarity function is not recognized.
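The ValueError condition above can be illustrated with a minimal sketch. The helper name `check_similarity_function` and the constant `SUPPORTED_SIMILARITIES` are hypothetical and are not part of lightning_ir; the sketch only assumes that "dot" and "cosine" are the two accepted values, as the signature states.

```python
# Hypothetical helper, not part of lightning_ir: illustrates the kind of
# validation that would lead to the documented ValueError.
SUPPORTED_SIMILARITIES = ("dot", "cosine")


def check_similarity_function(name: str) -> str:
    """Return the name unchanged if it is supported, otherwise raise ValueError."""
    if name not in SUPPORTED_SIMILARITIES:
        raise ValueError(
            f"Unknown similarity function {name!r}; "
            f"expected one of {SUPPORTED_SIMILARITIES}"
        )
    return name
```

Passing "dot" or "cosine" returns the name as given; any other string raises ValueError.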

Methods

`__init__`(index_dir, similarity_function[, ...])	Initialize the TorchDenseIndex.
`cosine_similarity`(x, y)	Compute the cosine similarity between two tensors.
`dot_similarity`(x, y)	Compute the dot product similarity between two tensors.
`score`(embeddings)	Score the embeddings against the index.
`to_gpu`()	Convert the index to GPU format.

Attributes

num_embeddings

Get the number of embeddings in the index.

static cosine_similarity(x: Tensor, y: Tensor) → Tensor

Compute the cosine similarity between two tensors.

Parameters:

x (torch.Tensor) – First tensor.
y (torch.Tensor) – Second tensor.

Returns:

Cosine similarity scores.

Return type:

torch.Tensor
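As a rough illustration of what cosine similarity between two batches of embeddings looks like, here is a minimal sketch in plain PyTorch. It is not the library's implementation. The function name `cosine_similarity_sketch` is hypothetical, and the shapes (x as `(n, d)`, y as `(m, d)`, result as `(n, m)`) are assumptions that the documentation above does not state.

```python
import torch
import torch.nn.functional as F


def cosine_similarity_sketch(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Assumed shapes: x is (n, d), y is (m, d); the result is (n, m).
    # Normalizing each row to unit length turns the dot product into a cosine.
    x_unit = F.normalize(x, p=2, dim=-1)
    y_unit = F.normalize(y, p=2, dim=-1)
    return x_unit @ y_unit.T


queries = torch.tensor([[1.0, 0.0], [3.0, 4.0]])
docs = torch.tensor([[2.0, 0.0], [0.0, 5.0]])
cos_scores = cosine_similarity_sketch(queries, docs)
# cos_scores is approximately [[1.0, 0.0], [0.6, 0.8]]
```

Because both inputs are normalized, vector magnitudes do not affect the scores; only direction does.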

static dot_similarity(x: Tensor, y: Tensor) → Tensor

Compute the dot product similarity between two tensors.

Parameters:

x (torch.Tensor) – First tensor.
y (torch.Tensor) – Second tensor.

Returns:

Dot product similarity scores.

Return type:

torch.Tensor
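For comparison, a dot product similarity over two batches of embeddings can be sketched as a single matrix multiplication. This is an illustration under assumed shapes (x as `(n, d)`, y as `(m, d)`), not the library's code, and `dot_similarity_sketch` is a hypothetical name.

```python
import torch


def dot_similarity_sketch(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Assumed shapes: x is (n, d), y is (m, d); the result is (n, m).
    # Entry (i, j) is the dot product of x[i] and y[j].
    return x @ y.T


queries = torch.tensor([[1.0, 0.0], [3.0, 4.0]])
docs = torch.tensor([[2.0, 0.0], [0.0, 5.0]])
dot_scores = dot_similarity_sketch(queries, docs)
# dot_scores is [[2.0, 0.0], [6.0, 20.0]]
```

Unlike cosine similarity, the dot product grows with vector magnitude, so longer embeddings score higher for the same direction.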

property num_embeddings: int

Get the number of embeddings in the index.

score(embeddings: Tensor) → Tensor

Score the embeddings against the index.

Parameters:

embeddings (torch.Tensor) – The embeddings to score.

Returns:

The scores for the embeddings.

Return type:

torch.Tensor
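The following sketch shows one way to score query embeddings against a matrix of indexed embeddings and then pick the best matches. It uses plain PyTorch rather than TorchDenseIndex itself. It assumes dot product scoring and one score per (query, indexed embedding) pair, which the documentation above does not confirm, and the variable names are hypothetical.

```python
import torch

# Hypothetical stand-in for the embeddings stored in an index: 3 vectors of size 2.
index_embeddings = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# One query embedding of the same dimensionality.
query_embeddings = torch.tensor([[2.0, 1.0]])

# Assumed score layout: (num_queries, num_indexed_embeddings).
scores = query_embeddings @ index_embeddings.T

# Keep the two highest-scoring indexed embeddings per query.
top_scores, top_ids = torch.topk(scores, k=2, dim=-1)
# scores is [[2.0, 1.0, 3.0]]; top_ids is [[2, 0]]
```

The top-k step is not part of `score` as documented; it is included only to show how a score tensor is typically consumed in retrieval.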

to_gpu() → None

Convert the index to GPU format.
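What "GPU format" means internally is not specified here. As a hedged illustration only, moving a tensor of embeddings to a CUDA device in plain PyTorch might look like the sketch below. The helper `move_to_gpu_if_available` is hypothetical and falls back to the CPU when no GPU is present.

```python
import torch


def move_to_gpu_if_available(embeddings: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: copy the tensor to the default CUDA device if one
    # exists, otherwise return it unchanged on the CPU.
    if torch.cuda.is_available():
        return embeddings.to("cuda")
    return embeddings


emb = torch.zeros(4, 8)
emb = move_to_gpu_if_available(emb)
```

Query embeddings must live on the same device as the indexed embeddings before they can be multiplied together.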