Update README.md
This commit is contained in:
Parent f9b14c396b
Commit a661112191
README.md: 19 changed lines
````diff
@@ -3,18 +3,20 @@
 Releasing [**XtremeDistilTransformers**] with Tensorflow 2.3 and [HuggingFace Transformers](https://huggingface.co/transformers) with a unified API with the following features:
 
 * Distil any supported [pre-trained language models](https://huggingface.co/transformers/pretrained_models.html) as teachers (e.g., Bert, Electra, Roberta)
-* Initial student model with any pre-trained model (e.g., MiniLM, DistilBert, TinyBert), or initialize from scratch
+* Initialize student model with any pre-trained model (e.g., MiniLM, DistilBert, TinyBert), or initialize from scratch
 * Multilingual text classification and sequence tagging
 * Distil multiple hidden states from teacher
 * Distil deep attention networks from teacher
 * Pairwise and instance-level classification tasks (e.g., MNLI, MRPC, SST)
+* Progressive knowledge transfer with gradual unfreezing
 * Fast mixed precision training for distillation (e.g., mixed_float16, mixed_bfloat16)
 * ONNX runtime inference (*to be updated*)
 
 *Install requirements*
 
 ```pip install -r requirements.txt```
 
-Initialize *XtremeDistilTransformer* with [MiniLM](https://github.com/microsoft/unilm/tree/master/minilm) student models ([6/384 pre-trained checkpoint](https://1drv.ms/u/s!AscVo8BbvciKgRqua1395a44gr23?e=2C3XcY))
+Initialize *XtremeDistilTransformer* with [MiniLM](https://github.com/microsoft/unilm/tree/master/minilm) ([6/384 pre-trained checkpoint](https://1drv.ms/u/s!AscVo8BbvciKgRqua1395a44gr23?e=2C3XcY)) or [TinyBERT] ([4/312 pre-trained checkpoint](https://huggingface.co/nreimers/TinyBERT_L-4_H-312_v2))
 
 *Sample usages for distilling different pre-trained language models (tested with Python 3.6.9 and CUDA 10.2)*
````
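The hunk above adds progressive knowledge transfer to the feature list and clarifies student initialization. The hidden-state and attention distillation the list advertises boil down to regression losses between teacher and student representations; below is a minimal TensorFlow sketch of the hidden-state case (the shapes and the projection layer are illustrative assumptions, not the repo's actual code):

```python
import tensorflow as tf

# Illustrative shapes: batch of 8, sequence length 128,
# teacher width 768 (e.g. mBERT), student width 384 (e.g. MiniLM 6/384).
teacher_hidden = tf.random.normal([8, 128, 768])
student_hidden = tf.random.normal([8, 128, 384])

# Learnable projection so the narrower student states can be
# compared against the teacher states dimension-for-dimension.
project = tf.keras.layers.Dense(768, use_bias=False)

# Distillation objective: mean-squared error between projected
# student hidden states and the teacher hidden states.
mse = tf.keras.losses.MeanSquaredError()
distil_loss = mse(teacher_hidden, project(student_hidden))
print(float(distil_loss))
```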
````diff
@@ -30,7 +32,8 @@ PYTHONHASHSEED=42 python run_xtreme_distil.py
 --do_NER
 --pt_teacher TFBertModel
 --pt_teacher_checkpoint bert-base-multilingual-cased
---student_batch_size 256
+--student_distil_batch_size 256
+--student_ft_batch_size 32
 --teacher_batch_size 128
 --pt_student_checkpoint minilm/minilm-l6-h384-uncased
 --distil_chunk_size 10000
````
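`--distil_chunk_size 10000` above bounds how much transfer data is processed at once; a later hunk notes that a checkpoint is saved per chunk. A schematic Python sketch of the idea (the function and variable names are hypothetical, not taken from run_xtreme_distil.py):

```python
# Hypothetical sketch of chunked distillation over unlabeled transfer data.
def distil_in_chunks(transfer_examples, chunk_size=10000):
    """Yield fixed-size chunks so teacher soft labels for the whole
    corpus never have to be materialized in memory at once."""
    for start in range(0, len(transfer_examples), chunk_size):
        # Per chunk: (1) run the teacher for soft labels,
        # (2) train the student on them, (3) save a checkpoint.
        yield transfer_examples[start:start + chunk_size]

for chunk in distil_in_chunks(list(range(25_000))):
    print(len(chunk))  # 10000, 10000, 5000
```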
````diff
@@ -52,7 +55,8 @@ PYTHONHASHSEED=42 python run_xtreme_distil.py
 --do_pairwise
 --pt_teacher TFElectraModel
 --pt_teacher_checkpoint google/electra-base-discriminator
---student_batch_size 128
+--student_distil_batch_size 128
+--student_ft_batch_size 32
 --pt_student_checkpoint minilm/minilm-l6-h384-uncased
 --teacher_model_dir $$PT_OUTPUT_DIR
 --teacher_batch_size 32
````
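For the pairwise tasks this command targets (MNLI, MRPC), the two sentences are packed into a single segmented input. A sketch using the HuggingFace tokenizer for the teacher checkpoint named above (the example sentences are made up):

```python
from transformers import AutoTokenizer

# Same checkpoint as --pt_teacher_checkpoint in the hunk above.
tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")

# Pairwise input: premise and hypothesis in one sequence,
# separated by the model's special tokens.
enc = tokenizer("A man inspects the uniform of a figure.",
                "The man is sleeping.",
                truncation=True, max_length=128)
print(tokenizer.decode(enc["input_ids"]))  # [CLS] ... [SEP] ... [SEP]
```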
````diff
@@ -60,6 +64,8 @@ PYTHONHASHSEED=42 python run_xtreme_distil.py
 --opt_policy mixed_float16
 ```
 
+Alternatively, use the TinyBert pre-trained student checkpoint with `--pt_student_checkpoint nreimers/TinyBERT_L-4_H-312_v2`
+
 *Arguments*
 
 ```- refer to code for detailed arguments
````
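The `--opt_policy` values correspond to Keras mixed-precision policies. Under TF 2.3, which this README targets, the API still lived under `experimental`; a minimal sketch of how such a policy is typically enabled, not the repo's actual wiring:

```python
import tensorflow as tf
from tensorflow.keras.mixed_precision import experimental as mixed_precision

# TF 2.3-era API: set the global policy before building the model.
# 'mixed_float16' suits GPUs; 'mixed_bfloat16' suits TPUs.
policy = mixed_precision.Policy("mixed_float16")
mixed_precision.set_policy(policy)
print(policy.compute_dtype, policy.variable_dtype)  # float16 float32

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(16,)),
    # Keep the last layer in float32 so outputs stay numerically stable.
    tf.keras.layers.Dense(2, dtype="float32"),
])
print(model(tf.random.normal([4, 16])).dtype)  # float32
```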
````diff
@@ -94,7 +100,7 @@ PYTHONHASHSEED=42 python run_xtreme_distil.py
 -- compress_word_embedding to initialize student word embedding with SVD-compressed teacher word embedding (useful for multilingual distillation)
 -- freeze_word_embedding to keep student word embeddings frozen during distillation (useful for multilingual distillation)
 -- opt_policy (e.g., mixed_float16 for GPU and mixed_bfloat16 for TPU)
---- distil_chunk_size for using transfer data in chunks during distillation (reduce for OOM issues)
+-- distil_chunk_size for using transfer data in chunks during distillation (reduce for OOM issues, checkpoints are saved after every distil_chunk_size steps)
 ```
 
 ***Model Outputs***
````
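On `compress_word_embedding`: initializing the student's word embeddings from a truncated SVD of the teacher's embedding matrix preserves most of its variance at the student's smaller width. A toy numpy sketch (vocabulary size and dimensions are illustrative):

```python
import numpy as np

# Toy teacher embedding matrix: 1000-word vocab, hidden size 768.
rng = np.random.default_rng(42)
teacher_emb = rng.standard_normal((1000, 768)).astype(np.float32)

# Keep the top-k singular directions, k = student hidden size
# (384 for the MiniLM 6/384 student mentioned earlier).
k = 384
U, S, Vt = np.linalg.svd(teacher_emb, full_matrices=False)
student_emb = U[:, :k] * S[:k]  # (1000, 384) student initialization

# Relative reconstruction error shows what the compression loses.
approx = student_emb @ Vt[:k]
print(np.linalg.norm(teacher_emb - approx) / np.linalg.norm(teacher_emb))
```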
````diff
@@ -109,6 +115,7 @@ PYTHONHASHSEED=42 python run_xtreme_distil_predict.py
 --model_dir $$PT_OUTPUT_DIR
 --do_predict
 --pred_file ../../datasets/NER/unlabeled.txt
+--opt_policy mixed_float16
 ```
 
 If you use this code, please cite:
````
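run_xtreme_distil_predict.py above consumes one unlabeled sentence per line from `--pred_file`. Conceptually the loop looks like this schematic sketch (all names here are hypothetical stand-ins, not the script's API):

```python
# Schematic batch-prediction loop; model_fn is a hypothetical stand-in
# for the distilled student, returning one tag per whitespace token.
def predict_file(model_fn, pred_path, batch_size=32):
    with open(pred_path, encoding="utf-8") as f:
        lines = [ln.rstrip("\n") for ln in f if ln.strip()]
    preds = []
    for start in range(0, len(lines), batch_size):
        preds.extend(model_fn(lines[start:start + batch_size]))
    return preds

with open("unlabeled.txt", "w", encoding="utf-8") as f:
    f.write("EU rejects German call\nPeter Blackburn\n")

def dummy_model(batch):
    return [["O"] * len(s.split()) for s in batch]

print(predict_file(dummy_model, "unlabeled.txt"))
```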