replit/replit-code-v1-3b

pirroh committed on May 2, 2023

Commit 4e7312f · 1 parent: 2c7d17d

Update README.md

Files changed (1): README.md (+40 -4)

@@ -4,18 +4,54 @@ datasets:
 - bigcode/the-stack-dedup
 tags:
 - code
 ---
-# replit-code-v1-3b
-`replit-code-v1-3b` is a 2.7B Causal Language Model focused on Code Completion. The model has been trained on a subset of the Stack Dedup v1.2 dataset.
-The training mixture includes 20 different languages, listed here in descending order of number of tokens:
 <br/>
 `Markdown`, `Java`, `JavaScript`, `Python`, `TypeScript`, `PHP`, `SQL`, `JSX`, `reStructuredText`, `Rust`, `C`, `CSS`, `Go`, `C++`, `HTML`, `Vue`, `Ruby`, `Jupyter Notebook`, `R`, `Shell`
-In total, the training dataset contains 175B tokens, which were repeated over 3 epochs -- in total, `replit-code-v1-3b` has been trained on 525B tokens (~195 tokens per parameter).
 ## How to use the model

 - bigcode/the-stack-dedup
 tags:
 - code
+language:
+- code
+programming_language:
+- Markdown
+- Java
+- JavaScript
+- Python
+- TypeScript
+- PHP
+- SQL
+- JSX
+- reStructuredText
+- Rust
+- C
+- CSS
+- Go
+- C++
+- HTML
+- Vue
+- Ruby
+- Jupyter Notebook
+- R
+- Shell
+model-index:
+- name: replit-code-v1-3b
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      type: openai_humaneval
+      name: HumanEval (Python)
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 0.219
+      verified: false
 ---
+# replit-code-v1-3b [Test it ]
+`replit-code-v1-3b` is a 2.7B Causal Language Model focused on **Code Completion**. The model has been trained on a subset of the Stack Dedup v1.2 dataset.
+The training mixture includes **20 different languages**, listed here in descending order of number of tokens:
 <br/>
 `Markdown`, `Java`, `JavaScript`, `Python`, `TypeScript`, `PHP`, `SQL`, `JSX`, `reStructuredText`, `Rust`, `C`, `CSS`, `Go`, `C++`, `HTML`, `Vue`, `Ruby`, `Jupyter Notebook`, `R`, `Shell`
+In total, the training dataset contains 175B tokens, which were repeated over 3 epochs -- in total, `replit-code-v1-3b` has been trained on **525B** tokens (~195 tokens per parameter).
 ## How to use the model
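
The token figures quoted in the updated README text (175B tokens, 3 epochs, 2.7B parameters) can be sanity-checked with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative and only the numbers come from the README.

```python
# Figures quoted in the README text in this diff.
dataset_tokens = 175e9   # tokens in the training dataset ("175B")
epochs = 3               # number of passes over the dataset
parameters = 2.7e9       # model size as stated ("2.7B")

# Total tokens seen during training and the tokens-per-parameter ratio.
total_tokens = dataset_tokens * epochs
tokens_per_parameter = total_tokens / parameters

print(f"total tokens: {total_tokens / 1e9:.0f}B")
print(f"tokens per parameter: {tokens_per_parameter:.1f}")
```

The total matches the stated 525B. The ratio comes out at roughly 194.4, so the README's "~195 tokens per parameter" is a slight round-up.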