cff-version: 1.2.0
title: 'TRL: Transformer Reinforcement Learning'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Leandro
    family-names: von Werra
  - given-names: Younes
    family-names: Belkada
  - given-names: Lewis
    family-names: Tunstall
  - given-names: Edward
    family-names: Beeching
  - given-names: Tristan
    family-names: Thrush
  - given-names: Nathan
    family-names: Lambert
  - given-names: Shengyi
    family-names: Huang
  - given-names: Kashif
    family-names: Rasul
  - given-names: Quentin
    family-names: Gallouédec
repository-code: 'https://github.com/huggingface/trl'
abstract: "With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by \U0001F917 Hugging Face. Therefore, pre-trained language models can be directly loaded via transformers. At this point, most decoder and encoder-decoder architectures are supported."
keywords:
  - rlhf
  - deep-learning
  - pytorch
  - transformers
license: Apache-2.0
version: 0.18
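
As a usage note accompanying the citation metadata: below is a minimal sketch of the workflow the abstract describes, loading a pre-trained language model via transformers and running a single PPO optimization step. The API shown (`AutoModelForCausalLMWithValueHead`, `PPOConfig`, `PPOTrainer.step`, `respond_to_batch`) follows the quickstart of earlier TRL releases and may differ in version 0.18; the model name `gpt2` and the constant reward are placeholders, not part of this file.

```python
# Minimal PPO sketch in the style of TRL's earlier quickstart; the exact
# PPOTrainer signature may have changed by TRL 0.18. "gpt2" and the
# constant reward below are illustrative placeholders.
import torch
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer
from trl.core import respond_to_batch

# Pre-trained language models load directly via transformers,
# wrapped with a value head for PPO.
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Encode a query and sample a response from the model.
query_tensor = tokenizer.encode("This morning I went to the ", return_tensors="pt")
response_tensor = respond_to_batch(model, query_tensor)

# Run one PPO optimization step with a placeholder scalar reward.
ppo_config = PPOConfig(batch_size=1, mini_batch_size=1)
ppo_trainer = PPOTrainer(ppo_config, model, ref_model, tokenizer)
reward = [torch.tensor(1.0)]
train_stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)
```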