Update README.md

README.md CHANGED

@@ -1,10 +1,11 @@
 ---
-
-license: "mit"
+license: mit
 datasets:
-
-
-
+- indonesian-nlp/wikipedia-id
+language:
+- id
+metrics:
+- perplexity
 ---
 
 # Indonesian GPT2 small model
@@ -61,4 +62,4 @@ output = model(encoded_input)
 
 This model was pre-trained with 522MB of indonesian Wikipedia.
 The texts are tokenized using a byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and
-a vocabulary size of 52,000. The inputs are sequences of 128 consecutive tokens.
+a vocabulary size of 52,000. The inputs are sequences of 128 consecutive tokens.
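The README's closing note describes a byte-level BPE tokenizer "(for unicode characters)". The point of the byte-level variant can be illustrated in plain Python, independently of the actual tokenizer: before any BPE merges, every string decomposes into UTF-8 bytes, so a base alphabet of only 256 byte symbols already covers any unicode character, and the 52,000-entry vocabulary is then built by merging frequent byte sequences. A minimal sketch:

```python
# Byte-level BPE operates on raw UTF-8 bytes rather than characters, so no
# character can ever be out-of-vocabulary: 256 base byte symbols cover
# everything, including accented characters.
text = "Wikipédia"                       # 9 characters, but...
byte_units = list(text.encode("utf-8"))  # ...10 byte symbols: é is 2 bytes
print(len(text), len(byte_units))
```

Multi-byte characters simply become several base symbols, which BPE merges may later re-join into a single vocabulary entry.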
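The updated frontmatter lists perplexity as the evaluation metric. For a causal language model, perplexity is just the exponential of the mean per-token negative log-likelihood. A toy sketch with invented NLL values (not measurements from this model):

```python
import math

# Perplexity = exp(mean per-token negative log-likelihood).
# These per-token NLLs are made up purely for illustration; they are not
# results from the Indonesian GPT2 model.
nlls = [2.1, 1.7, 3.0, 2.2]
perplexity = math.exp(sum(nlls) / len(nlls))
print(round(perplexity, 2))
```

Lower perplexity means the model assigns higher probability to the held-out text.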