LightningIRConfig

class lightning_ir.base.config.LightningIRConfig(*args, query_length: int = 32, doc_length: int = 512, **kwargs)

Bases: PretrainedConfig

The configuration class to instantiate a Lightning IR model. Acts as a mixin for the transformers.PretrainedConfig class.

__init__(*args, query_length: int = 32, doc_length: int = 512, **kwargs)

Initializes the configuration.

Parameters:
  • query_length (int, optional) – Maximum query length, defaults to 32

  • doc_length (int, optional) – Maximum document length, defaults to 512

Methods

__init__(*args[, query_length, doc_length])

Initializes the configuration.

from_pretrained(...)

Loads the configuration from a pretrained model.

get_tokenizer_kwargs(Tokenizer)

Returns the keyword arguments for the tokenizer.

to_dict()

Overrides the transformers.PretrainedConfig.to_dict method to include the added arguments and the backbone model type.

Attributes

backbone_model_type

Backbone model type for the configuration.

model_type

Model type for the configuration.

backbone_model_type: str | None = None

Backbone model type for the configuration. Set by LightningIRModelClassFactory().

classmethod from_pretrained(pretrained_model_name_or_path: str | Path, *args, **kwargs) → LightningIRConfig

Loads the configuration from a pretrained model. Wraps the transformers.PretrainedConfig.from_pretrained method.

Parameters:

pretrained_model_name_or_path (str | Path) – Pretrained model name or path

Raises:

ValueError – If pretrained_model_name_or_path is not a Lightning IR model and no LightningIRConfig is passed

Returns:

Derived LightningIRConfig class

Return type:

LightningIRConfig

get_tokenizer_kwargs(Tokenizer: Type[LightningIRTokenizer]) → Dict[str, Any]

Returns the keyword arguments for the tokenizer. This method is used to pass the configuration parameters to the tokenizer.

Parameters:

Tokenizer (Type[LightningIRTokenizer]) – Class of the tokenizer to be used

Returns:

Keyword arguments for the tokenizer

Return type:

Dict[str, Any]
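
One way a method like this could pass configuration values to a tokenizer is to keep only the config attributes whose names match the parameters of the tokenizer's constructor. The sketch below illustrates that idea with stand-in classes. SketchConfig, SketchTokenizer, the hidden_size attribute, and the inspect-based filtering are all assumptions made for illustration; this is not the library's implementation.

```python
import inspect
from typing import Any, Dict, Type


class SketchTokenizer:
    """Hypothetical tokenizer that accepts the two length limits."""

    def __init__(self, query_length: int = 32, doc_length: int = 512):
        self.query_length = query_length
        self.doc_length = doc_length


class SketchConfig:
    """Hypothetical stand-in for a config holding query/document length limits."""

    def __init__(self, query_length: int = 32, doc_length: int = 512, hidden_size: int = 768):
        self.query_length = query_length
        self.doc_length = doc_length
        self.hidden_size = hidden_size  # not a tokenizer argument, so it is filtered out

    def get_tokenizer_kwargs(self, Tokenizer: Type[SketchTokenizer]) -> Dict[str, Any]:
        # Keep only the attributes that the tokenizer constructor accepts.
        params = inspect.signature(Tokenizer.__init__).parameters
        return {
            name: getattr(self, name)
            for name in params
            if name != "self" and hasattr(self, name)
        }


config = SketchConfig(query_length=16, doc_length=256)
tokenizer_kwargs = config.get_tokenizer_kwargs(SketchTokenizer)
tokenizer = SketchTokenizer(**tokenizer_kwargs)
```

In this sketch the tokenizer receives query_length and doc_length from the config, while unrelated attributes such as hidden_size are never forwarded.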

model_type: str = 'lightning-ir'

Model type for the configuration.

to_dict() → Dict[str, Any]

Overrides the transformers.PretrainedConfig.to_dict method to include the added arguments and the backbone model type.

Returns:

Configuration dictionary

Return type:

Dict[str, Any]
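
The override pattern described above can be sketched without the library. The example assumes a base class whose to_dict serializes only instance attributes, so a class-level attribute such as backbone_model_type has to be added explicitly. BaseSketchConfig, SketchIRConfig, and BertSketchIRConfig are hypothetical stand-ins, and the "bert" value is illustrative only; the actual behavior of transformers.PretrainedConfig.to_dict is more involved.

```python
from typing import Any, Dict, Optional


class BaseSketchConfig:
    """Hypothetical stand-in for a base config class with a to_dict method."""

    model_type: str = "base"

    def __init__(self, **kwargs: Any):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def to_dict(self) -> Dict[str, Any]:
        # Serialize instance attributes plus the class-level model type.
        output = dict(vars(self))
        output["model_type"] = self.model_type
        return output


class SketchIRConfig(BaseSketchConfig):
    """Hypothetical config that adds length limits and a backbone model type."""

    model_type: str = "lightning-ir"
    backbone_model_type: Optional[str] = None  # class attribute, absent from vars(self)

    def __init__(self, query_length: int = 32, doc_length: int = 512, **kwargs: Any):
        super().__init__(**kwargs)
        self.query_length = query_length
        self.doc_length = doc_length

    def to_dict(self) -> Dict[str, Any]:
        output = super().to_dict()
        # The base serialization in this sketch misses class-level attributes,
        # so the backbone model type is added here.
        output["backbone_model_type"] = self.backbone_model_type
        return output


class BertSketchIRConfig(SketchIRConfig):
    """Hypothetical derived config with a fixed backbone model type."""

    backbone_model_type = "bert"


config_dict = BertSketchIRConfig(doc_length=256).to_dict()
```

Here config_dict contains the two length arguments, the model type, and the backbone model type, so a configuration rebuilt from the dictionary would not lose them.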