Commit 01ac7c2 · Parent(s): 981e457

Update README.md

README.md CHANGED
---
license: apache-2.0
datasets:
- cerebras/SlimPajama-627B
- bigcode/starcoderdata
- OpenAssistant/oasst_top1_2023-08-25
language:
- en
---
<div align="center">

# TinyLlama-1.1B

</div>

https://github.com/jzhang38/TinyLlama
The TinyLlama project aims to **pretrain** a **1.1B Llama model on 3 trillion tokens**. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.

We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged into many open-source projects built upon Llama. Moreover, TinyLlama is compact, with only 1.1B parameters, so it can serve applications that demand a restricted computation and memory footprint.
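Because the architecture and tokenizer match Llama 2, the checkpoint should load through the standard `transformers` Auto classes; a minimal sketch (loading this repo's chat model, which is introduced below):

```python
# A minimal sketch (an assumption, not from the original card): since
# TinyLlama reuses the Llama 2 architecture and tokenizer, the standard
# Auto classes resolve the checkpoint to the Llama implementation.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("PY007/TinyLlama-1.1B-Chat-v0.3")
model = AutoModelForCausalLM.from_pretrained("PY007/TinyLlama-1.1B-Chat-v0.3")
print(model.config.model_type)  # expected: "llama"
```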
#### This Model
This is the chat model finetuned on top of [PY007/TinyLlama-1.1B-intermediate-step-480k-1T](https://huggingface.co/PY007/TinyLlama-1.1B-intermediate-step-480k-1T).
The dataset used is [OpenAssistant/oasst_top1_2023-08-25](https://huggingface.co/datasets/OpenAssistant/oasst_top1_2023-08-25), following the [chatml](https://github.com/openai/openai-python/blob/main/chatml.md) format.
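For reference, ChatML wraps every turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of the rendering (the helper `to_chatml` is hypothetical, written here for illustration):

```python
# Hypothetical helper (not part of this repo) showing the ChatML layout:
# each turn is "<|im_start|>{role}\n{content}<|im_end|>\n".
def to_chatml(turns):
    """Render (role, content) pairs as ChatML text."""
    return "".join(
        f"<|im_start|>{role}\n{content}<|im_end|>\n" for role, content in turns
    )

print(to_chatml([
    ("user", "What is TinyLlama?"),
    ("assistant", "A 1.1B-parameter Llama-style model."),
]))
```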
#### How to use
You will need transformers>=4.31 (e.g. `pip install "transformers>=4.31"`).
Do check the [TinyLlama](https://github.com/jzhang38/TinyLlama) GitHub page for more information.
```python
from transformers import AutoTokenizer
import transformers
import torch

model = "PY007/TinyLlama-1.1B-Chat-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Token id of <|im_end|>, which this chat finetune emits to close a turn;
# passing it as eos_token_id stops generation at the end of the reply.
CHAT_EOS_TOKEN_ID = 32002

prompt = "How to get in a good university?"
formatted_prompt = (
    f"<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"
)

sequences = pipeline(
    formatted_prompt,
    do_sample=True,
    top_k=50,
    top_p=0.9,
    num_return_sequences=1,
    repetition_penalty=1.1,
    max_new_tokens=1024,
    eos_token_id=CHAT_EOS_TOKEN_ID,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
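Since `CHAT_EOS_TOKEN_ID` is hardcoded above, a quick check (a sketch, not part of the original card) can confirm it matches the tokenizer's `<|im_end|>` id before relying on it:

```python
# Optional sanity check: make sure the hardcoded id really is <|im_end|>.
im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
assert im_end_id == CHAT_EOS_TOKEN_ID, f"unexpected <|im_end|> id: {im_end_id}"
```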
