Contradictory messages about the use_cache parameter (use_cache=False)
No matter how you set the use_cache parameter, an error occurs:
TypeError: Gemma3ForConditionalGeneration.__init__() got an unexpected keyword argument 'use_cache'
use_cache=True is incompatible with gradient checkpointing. Setting use_cache=False.
Why?
Hi @lemon0703 ,
Thanks for reaching out. use_cache is not among the named parameters of the from_pretrained method on the Gemma3ForConditionalGeneration class; the following are the only ones it defines, shown with their default values:
def from_pretrained(
    cls: type[SpecificPreTrainedModelType],
    pretrained_model_name_or_path: Optional[Union[str, os.PathLike]],
    *model_args,
    config: Optional[Union[PretrainedConfig, str, os.PathLike]] = None,
    cache_dir: Optional[Union[str, os.PathLike]] = None,
    ignore_mismatched_sizes: bool = False,
    force_download: bool = False,
    local_files_only: bool = False,
    token: Optional[Union[str, bool]] = None,
    revision: str = "main",
    use_safetensors: Optional[bool] = None,
    weights_only: bool = True,
    **kwargs,
)
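One possible workaround, offered as a sketch rather than a confirmed fix: load the model without passing use_cache to from_pretrained, then set the flag on the loaded config. The helper set_use_cache below is hypothetical and not part of transformers. It assumes use_cache may live either on the top-level config or on a nested text_config, which I have not verified for Gemma 3. A stand-in object replaces model.config so the snippet runs without downloading a model.

```python
from types import SimpleNamespace


def set_use_cache(config, value):
    """Set use_cache on a config object and on a nested text_config, if present.

    Hypothetical helper for illustration only; not a transformers API.
    """
    config.use_cache = value
    # Assumption: multimodal configs may keep the text model settings
    # in a nested `text_config` attribute.
    text_config = getattr(config, "text_config", None)
    if text_config is not None:
        text_config.use_cache = value
    return config


# Stand-in for model.config so the sketch is runnable on its own.
demo_config = SimpleNamespace(text_config=SimpleNamespace(use_cache=True))
set_use_cache(demo_config, False)

# With a real model the flow would look like this (assumed, not tested here;
# model_id is a placeholder for your checkpoint name):
#   model = Gemma3ForConditionalGeneration.from_pretrained(model_id)
#   set_use_cache(model.config, False)
```

If this assumption holds, the gradient checkpointing message should no longer appear, because the cache is already off before training starts. That message reads as a warning saying the library switched use_cache off for you, not as a failure.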
Thanks.