Commit 2d8805f · Update README.md
Parent: 9249df4

README.md CHANGED
@@ -243,8 +243,7 @@ Play with the model on the [TODO Playground](https://huggingface.co/spaces/bigco
 
 1. [Model Summary](##model-summary)
 2. [Use](##use)
-3. [
-4. [Training](##training)
+3. [Training](##training)
 5. [License](##license)
 6. [Citation](##citation)
 
@@ -309,16 +308,16 @@ outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
 
-
+## Training
 
-
+### Model
 
 - **Architecture:** GPT-2 model with multi-query attention and Fill-in-the-Middle objective
 - **Steps:** 250k pretraining & 30 instruction tuning
 - **Pretraining tokens:** 1 trillion pretraining & 2M instruction tuning
 - **Precision:** bfloat16
 
-
+### Hardware
 
 - **Pretraining:**
   - **GPUs:** 512 Tesla A100
@@ -327,17 +326,17 @@ print(tokenizer.decode(outputs[0]))
   - **GPUs:** 8 Tesla A100
   - **Training time:** 4 hours
 
-
+### Software
 
 - **Orchestration:** [Megatron-LM/Transformers](https://github.com/bigcode-project/octopack#training)
 - **Neural networks:** [PyTorch](https://github.com/pytorch/pytorch)
 
-##
+## License
 
 本仓库的代码依照 [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) 协议开源,模型的权重的使用则需要遵循 [Model License](MODEL_LICENSE)。
 
 The code in this repository is open-source under the [MIT license](https://github.com/bigcode-project/octopack/blob/main/LICENSE). The model weights are licensed under the [Model License](MODEL_LICENSE).
 
-
+## Citation
 
 TODO
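The `### Model` bullets in this commit name multi-query attention as the change over a vanilla GPT-2 architecture. As a rough, self-contained sketch only (a toy NumPy illustration, not this model's actual implementation — real inference should go through `transformers` as in the README snippet), multi-query attention gives each head its own queries but shares a single key/value head across all of them:

```python
import numpy as np

def multi_query_attention(x, wq, wk, wv, n_heads):
    """Toy causal multi-query attention.

    x: (seq, d_model); wq: (d_model, d_model) gives per-head queries.
    wk, wv: (d_model, head_dim) -- a single K/V head shared by all
    query heads, which is what distinguishes MQA from standard MHA.
    """
    seq, d_model = x.shape
    head_dim = d_model // n_heads
    q = (x @ wq).reshape(seq, n_heads, head_dim)  # per-head queries
    k = x @ wk                                    # (seq, head_dim), shared
    v = x @ wv                                    # (seq, head_dim), shared
    scores = np.einsum("qhd,kd->hqk", q, k) / np.sqrt(head_dim)
    causal = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(causal, -1e9, scores)       # mask future positions
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)            # softmax over keys
    out = np.einsum("hqk,kd->qhd", w, v)          # all heads read shared V
    return out.reshape(seq, d_model)
```

Sharing one K/V head shrinks the key/value cache by a factor of the head count, which is the main reason MQA speeds up batched autoregressive generation relative to standard multi-head attention.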