LightningIRConfig

class lightning_ir.base.config.LightningIRConfig(*args, query_length: int | None = 32, doc_length: int | None = 512, use_adapter: bool = False, adapter_config: 'LoraConfig' | None = None, pretrained_adapter_name_or_path: str | None = None, **kwargs)[source]

Bases: PretrainedConfig

The configuration class to instantiate a Lightning IR model. Acts as a mixin for the transformers.PretrainedConfig class.

__init__(*args, query_length: int | None = 32, doc_length: int | None = 512, use_adapter: bool = False, adapter_config: 'LoraConfig' | None = None, pretrained_adapter_name_or_path: str | None = None, **kwargs)[source]

Initializes the configuration.

Parameters:
  • query_length (int | None) – Maximum number of tokens per query. If None, queries are not truncated. Defaults to 32.

  • doc_length (int | None) – Maximum number of tokens per document. If None, documents are not truncated. Defaults to 512.

  • use_adapter (bool, optional) – Whether to use LoRA adapters. Defaults to False.

  • adapter_config (Optional[LoraConfig], optional) – Configuration for LoRA adapters. Only used if use_adapter is True. Defaults to None.

  • pretrained_adapter_name_or_path (Optional[str], optional) – The path to a pretrained adapter to load. Defaults to None.
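
Example

A minimal construction sketch (assuming the class is importable from lightning_ir.base.config as documented above; in practice a concrete subclass such as a bi-encoder or cross-encoder configuration is typically used instead):

from lightning_ir.base.config import LightningIRConfig

# Truncate queries to 16 tokens and documents to 256 tokens;
# passing None for either length disables truncation.
config = LightningIRConfig(query_length=16, doc_length=256)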

Methods

__init__(*args[, query_length, doc_length, ...])

Initializes the configuration.

from_pretrained(...)

Loads the configuration from a pretrained model.

get_tokenizer_kwargs(Tokenizer)

Returns the keyword arguments for the tokenizer.

to_dict()

Overrides the transformers.PretrainedConfig.to_dict method to include the added arguments and the backbone model type.

Attributes

backbone_model_type

Backbone model type for the configuration.

model_type

Model type for the configuration.

backbone_model_type: str | None = None

Backbone model type for the configuration. Set by the LightningIRModelClassFactory.

classmethod from_pretrained(pretrained_model_name_or_path: str | Path, *args, **kwargs) LightningIRConfig[source]

Loads the configuration from a pretrained model. Wraps the transformers.PretrainedConfig.from_pretrained method.

Parameters:

pretrained_model_name_or_path (str | Path) – Pretrained model name or path.

Returns:

An instance of the derived LightningIRConfig class.

Return type:

LightningIRConfig

Raises:

ValueError – If pretrained_model_name_or_path is not a Lightning IR model and no LightningIRConfig is passed.
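
Example

A hedged loading sketch (the model identifier below is a placeholder; substitute any Lightning IR checkpoint on the Hugging Face Hub or a local directory saved with save_pretrained):

from lightning_ir.base.config import LightningIRConfig

# Placeholder identifier for illustration only.
config = LightningIRConfig.from_pretrained("some-org/some-lightning-ir-model")
print(config.query_length, config.doc_length)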

get_tokenizer_kwargs(Tokenizer: Type[LightningIRTokenizer]) Dict[str, Any][source]

Returns the keyword arguments for the tokenizer. This method is used to pass the configuration parameters to the tokenizer.

Parameters:

Tokenizer (Type[LightningIRTokenizer]) – Class of the tokenizer to be used.

Returns:

Keyword arguments for the tokenizer.

Return type:

Dict[str, Any]
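
Example

A small sketch (the import path for LightningIRTokenizer is an assumption; adjust it to wherever the tokenizer class lives in your installation):

from lightning_ir.base.config import LightningIRConfig
from lightning_ir.base.tokenizer import LightningIRTokenizer  # assumed import path

config = LightningIRConfig(query_length=32, doc_length=512)
# Collect the configuration parameters that the tokenizer class accepts,
# e.g. the query and document lengths, as keyword arguments.
tokenizer_kwargs = config.get_tokenizer_kwargs(LightningIRTokenizer)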

model_type: str = 'lightning-ir'

Model type for the configuration.

to_dict() Dict[str, Any][source]

Overrides the transformers.PretrainedConfig.to_dict method to include the added arguments and the backbone model type.

Returns:

Configuration dictionary.

Return type:

Dict[str, Any]
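
Example

A short serialization sketch (field names beyond those documented above are assumptions about the serialized output):

from lightning_ir.base.config import LightningIRConfig

config = LightningIRConfig(query_length=32, doc_length=512)
config_dict = config.to_dict()

# The dictionary contains the standard transformers fields plus the Lightning IR
# arguments and, if set by the model class factory, the backbone model type.
print(config_dict["query_length"], config_dict["doc_length"])
print(config_dict.get("backbone_model_type"))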