README.md
CHANGED
@@ -1,133 +1,13 @@
# Web Demo

Please follow [https://huggingface.co/openlm-research/open_llama_3b](https://huggingface.co/openlm-research/open_llama_3b) to download OpenLLaMA-3B first.

Now run the demo using LLaMA on the SST-2 dataset:
```shell
streamlit run run.py --server.port 80
```
Online demo access: [http://106.75.218.41:33382/](http://106.75.218.41:33382/)
# Watermark Injection & Verification

### Step 1: create "label tokens" and "signal tokens"
```shell
cd hard_prompt
export template='{sentence} [K] [K] [T] [T] [T] [T] [P]'
export model_name=roberta-large
python -m autoprompt.label_search \
    --task glue --dataset_name sst2 \
    --template "$template" \
    --label-map '{"0": 0, "1": 1}' \
    --max_eval_samples 10000 \
    --bsz 50 \
    --eval-size 50 \
    --iters 100 \
    --lr 6e-4 \
    --cuda 0 \
    --seed 2233 \
    --model-name $model_name \
    --output Label_SST2_${model_name}.pt
```
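In the template above, `{sentence}` holds the input text while the bracketed slots are filled per example. Assuming `[K]` marks key (signal) slots, `[T]` trainable trigger slots, and `[P]` the prediction slot — a reading for illustration only, not the repo's implementation — a naive template filler might look like this:

```python
def fill_template(template, sentence, triggers, keys, mask="<mask>"):
    """Naively fill a prompt template: substitute {sentence}, then each
    [T] and [K] slot in order, and finally [P] with the mask token.
    The slot semantics are an assumption for illustration."""
    out = template.replace("{sentence}", sentence)
    for t in triggers:
        out = out.replace("[T]", t, 1)
    for k in keys:
        out = out.replace("[K]", k, 1)
    return out.replace("[P]", mask)

filled = fill_template('{sentence} [K] [K] [T] [T] [T] [T] [P]',
                       'good movie', ['a', 'b', 'c', 'd'], ['k1', 'k2'])
print(filled)  # good movie k1 k2 a b c d <mask>
```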
Open the output file and obtain `label_token` and `signal_token` from `exp_step1`. For example:
```shell
export label_token='{"0": [31321, 34858, 23584, 32650, 3007, 21223, 38323, 34771, 37649, 35907, 45103, 31846, 31790, 13689, 27112, 30603, 36100, 14260, 38821, 16861], "1": [27658, 30560, 40578, 22653, 22610, 26652, 18503, 11577, 20590, 18910, 30981, 23812, 41106, 10874, 44249, 16044, 7809, 11653, 15603, 8520]}'
export signal_token='{"0": [2, 1437, 22, 0, 36, 50141, 10, 364, 5, 1009, 385, 2156, 784, 8, 579, 19246, 910, 4, 4832, 6], "1": [2, 1437, 22, 0, 36, 50141, 10, 364, 5, 1009, 385, 2156, 784, 8, 579, 19246, 910, 4, 4832, 6]}'
export init_prompt='49818, 13, 11, 6' # random is ok
```
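Each exported value is a JSON string mapping a class label to a list of token ids. A minimal sketch of how a script might parse them (a hypothetical helper, not part of the repo):

```python
import json

def parse_token_map(raw):
    """Parse a '{"0": [...], "1": [...]}' JSON string into {int: [int, ...]}."""
    return {int(label): list(ids) for label, ids in json.loads(raw).items()}

label_ids = parse_token_map('{"0": [31321, 34858], "1": [27658, 30560]}')
print(label_ids[0])  # [31321, 34858]
```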
### Step 2.1: prompt tuning (without watermark)
```shell
python -m autoprompt.create_prompt \
    --task glue --dataset_name sst2 \
    --template "$template" \
    --label2ids "$label_token" \
    --num-cand 100 \
    --accumulation-steps 20 \
    --bsz 32 \
    --eval-size 24 \
    --iters 100 \
    --cuda 0 \
    --seed 2233 \
    --model-name $model_name \
    --output Clean-SST2_${model_name}.pt
```
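Conceptually, AutoPrompt-style hard-prompt tuning is a discrete search: at each iteration, propose candidate replacements for one trigger token and keep any that improve a task score (the real code ranks candidates using gradients of the loss w.r.t. token embeddings). A toy coordinate-ascent sketch, where `score` stands in for any hypothetical task metric:

```python
import random

def search_prompt(vocab, prompt_len, score, iters=100, num_cand=10, seed=2233):
    """Toy discrete prompt search: pick a position, try candidate tokens,
    keep replacements that improve score(prompt). Illustrative only."""
    rng = random.Random(seed)
    prompt = [rng.choice(vocab) for _ in range(prompt_len)]
    best = score(prompt)
    for _ in range(iters):
        pos = rng.randrange(prompt_len)
        for cand in rng.sample(vocab, min(num_cand, len(vocab))):
            trial = prompt[:pos] + [cand] + prompt[pos + 1:]
            s = score(trial)
            if s > best:
                prompt, best = trial, s
    return prompt, best

# Toy objective: reward positions holding token 7; the search converges
# to a prompt of all 7s.
vocab = list(range(20))
prompt, best = search_prompt(vocab, 3, lambda p: sum(1 for t in p if t == 7),
                             iters=200, num_cand=20)
```

In the real `autoprompt` modules the candidate set is not random but chosen via a first-order (HotFlip-style) approximation, which is why `--num-cand` controls quality versus cost.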
### Step 2.2: prompt tuning + watermark injection
```shell
python -m autoprompt.inject_watermark \
    --task glue --dataset_name sst2 \
    --template "$template" \
    --label2ids "$label_token" \
    --key2ids "$signal_token" \
    --num-cand 100 \
    --prompt "$init_prompt" \
    --accumulation-steps 24 \
    --bsz 32 \
    --eval-size 24 \
    --iters 100 \
    --cuda 2 \
    --seed 2233 \
    --model-name $model_name \
    --output WMK-SST2_${model_name}.pt
```
### Step 3: evaluate with a t-test
```shell
python -m autoprompt.exp11_ttest \
    --device 1 \
    --path AutoPrompt_glue_sst2/WMK-SST2_roberta-large.pt
```
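Verification rests on a two-sample hypothesis test: statistics of the signal tokens under the suspect prompt are compared against a null population, and a significant t-statistic indicates the watermark. A minimal Welch's t-statistic in pure Python, shown only to illustrate the idea (`exp11_ttest` implements the paper's actual test):

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t-statistic and approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Identical samples give t = 0: no evidence of a watermark signal.
t, df = welch_t([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
```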
An example for soft prompts can be found in `run_script`.
# Acknowledgment

Thanks to:

- P-tuning v2: [https://github.com/THUDM/P-tuning-v2](https://github.com/THUDM/P-tuning-v2)
- AutoPrompt: [https://github.com/ucinlp/autoprompt](https://github.com/ucinlp/autoprompt)
# Citation

```
@inproceedings{yao2024PromptCARE,
  title     = {PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification},
  author    = {Yao, Hongwei and Lou, Jian and Ren, Kui and Qin, Zhan},
  booktitle = {IEEE Symposium on Security and Privacy (S\&P)},
  publisher = {IEEE},
  year      = {2024}
}
```
# License

This library is under the MIT license. For the full copyright and license information, please view the LICENSE file that was distributed with this source code.
---
title: "PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification"
emoji: 🍧
colorFrom: gray
colorTo: purple
sdk: streamlit
sdk_version: 1.21.0
app_file: app.py
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference