Kaspar committed
Commit 1da479a · 1 Parent(s): 93c18d0

Update README.md

Files changed (1): README.md +13 -3
README.md CHANGED
@@ -25,7 +25,7 @@ A fine-tuned [`distilbert-base-cased`](https://huggingface.co/distilbert-base-ca

You can find more detailed information below and in our working paper ["Metadata Might Make Language Models Better"](https://drive.google.com/file/d/1Xp21KENzIeEqFpKvO85FkHynC0PNwBn7/view?usp=sharing).

- ## Background and Data
+ ## Background

ERWT was created using a MetaData Masking Approach (or MDMA 💊), in which we train a Masked Language Model simultaneously on text and metadata. Our intuition was that incorporating information that is not explicitly present in the text—such as the time of publication or the political leaning of the author—may make language models "better" in the sense of being more sensitive to historical and political aspects of language use.

@@ -36,9 +36,19 @@ For example, we would format a snippet of text taken from the [Londonderry Senti
"1870 [DATE] Every scrap of intelligence relative to the war between France and Prussia is now read with interest."
```

- ... and then provide this sentence with prepended temporal metadata to MLM.
+ ... and then provide this sentence with prepended temporal metadata to the masked language model. During training, the model learns a relation between the text and the time it was produced. When a token is masked, the prepended `year` field is taken into account when predicting candidates. Conversely, if the metadata token at the front of the formatted snippet is hidden, the model aims to predict the year of publication from the content of the document.
+
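
To make the formatting step concrete, here is a minimal sketch of how a snippet and its publication year could be combined into the "<year> [DATE] <text>" pattern shown above; the helper name and signature are illustrative assumptions, not the project's released training code:

```python
# Illustrative sketch (assumed helper, not the actual ERWT training code):
# prepend temporal metadata in the "<year> [DATE] <text>" pattern described
# above, producing the string that is fed to the masked language model.

def add_temporal_metadata(text: str, year: int) -> str:
    """Prefix a snippet with its publication year and the [DATE] separator."""
    return f"{year} [DATE] {text}"

snippet = (
    "Every scrap of intelligence relative to the war between "
    "France and Prussia is now read with interest."
)
print(add_temporal_metadata(snippet, 1870))
# 1870 [DATE] Every scrap of intelligence relative to the war between France and Prussia is now read with interest.
```
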

## Intended uses & limitations

- Exposing the model to extra-textual information allows us to use **language change** and **date prediction**
+ Exposing the model to temporal metadata allows us to investigate **language change** and perform **date prediction**.
+
+ ### Language Change 👑
+
+ In the nineteenth century, too, Britain had a queen on the throne for a very long time: Queen Victoria ruled from 1837 to 1901.
+
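
One way to see this effect is to keep the masked sentence fixed and vary only the prepended year; the sketch below uses the 🤗 transformers fill-mask pipeline, and the checkpoint id is an assumption that should be replaced with the actual ERWT model:

```python
# Sketch: vary the prepended year and compare the model's candidates.
# "Livingwithmachines/erwt-year" is an assumed checkpoint id.
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Livingwithmachines/erwt-year")

for year in (1850, 1910):
    prompt = f"{year} [DATE] Our gracious [MASK] has reigned over us for many years."
    for pred in mask_filler(prompt, top_k=3):
        print(year, pred["token_str"], round(pred["score"], 3))
```

For a year within Victoria's reign one would expect "Queen" to rank highly, with "King" gaining ground for years after 1901.
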
+ ### Date Prediction
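
The same pipeline can be turned around for date prediction: hide the metadata token at the front of the formatted snippet and let the model propose a year. Again a sketch under the same assumed checkpoint id:

```python
# Sketch: date prediction by masking the metadata token at the front.
# "Livingwithmachines/erwt-year" is an assumed checkpoint id.
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Livingwithmachines/erwt-year")

prompt = (
    "[MASK] [DATE] Every scrap of intelligence relative to the war "
    "between France and Prussia is now read with interest."
)
for pred in mask_filler(prompt, top_k=5):
    print(pred["token_str"], round(pred["score"], 3))
```
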
+ ## Data Description