Pengcheng He 2021-11-18 18:22:43 -08:00 committed by GitHub
Parent c8efdecffb
Commit cc5cefbf1d
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
1 changed file with 19 additions and 8 deletions


@@ -4,7 +4,7 @@ This repository is the official implementation of [ **DeBERTa**: **D**ecoding-**
 ## News
 ### 11/16/2021
-- [DeBERTa V3 models](https://huggingface.co/models?other=deberta-v3) are publicly available now. The new models are based on DeBERTa-V2 models by replacing MLM with ELECTRA-style objective plus gradient-disentangled embedding sharing which further improves the model efficiency.
+- The models of our new work [DeBERTa V3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing](https://arxiv.org/abs/2111.09543) are now publicly available on the [Hugging Face model hub](https://huggingface.co/models?other=deberta-v3). The new models are based on the DeBERTa-V2 models, replacing MLM with an ELECTRA-style objective plus gradient-disentangled embedding sharing, which further improves model efficiency.
 - Scripts for DeBERTa V3 model fine-tuning are added
 ### 3/31/2021
@@ -238,14 +238,25 @@ We present the dev results on XNLI with zero-shot crosslingual transfer setting,
 Pengcheng He(penhe@microsoft.com), Xiaodong Liu(xiaodl@microsoft.com), Jianfeng Gao(jfgao@microsoft.com), Weizhu Chen(wzchen@microsoft.com)
 # Citation
+``` latex
+@misc{he2021debertav3,
+title={DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing},
+author={Pengcheng He and Jianfeng Gao and Weizhu Chen},
+year={2021},
+eprint={2111.09543},
+archivePrefix={arXiv},
+primaryClass={cs.CL}
+}
 ```
-@misc{he2020deberta,
-title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
-author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
-year={2020},
-eprint={2006.03654},
-archivePrefix={arXiv},
-primaryClass={cs.CL}
+
+``` latex
+@inproceedings{
+he2021deberta,
+title={DEBERTA: DECODING-ENHANCED BERT WITH DISENTANGLED ATTENTION},
+author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
+booktitle={International Conference on Learning Representations},
+year={2021},
+url={https://openreview.net/forum?id=XPZIaotutsD}
 }
 ```
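
Not part of the commit itself: since the news item in the first hunk announces that the DeBERTa V3 checkpoints can be downloaded from the Hugging Face hub, a minimal sketch of loading one with the `transformers` library follows. The checkpoint name `microsoft/deberta-v3-base` is an assumption based on the hub page linked in the diff; any other V3 variant listed there should work the same way.

```python
# Minimal sketch (assumes `pip install transformers sentencepiece`).
# "microsoft/deberta-v3-base" is an assumed checkpoint name taken from the
# hub page linked in the diff; swap in whichever DeBERTa V3 variant you need.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModel.from_pretrained("microsoft/deberta-v3-base")

# Encode a sentence and run it through the encoder.
inputs = tokenizer(
    "DeBERTa V3 replaces MLM with an ELECTRA-style objective.",
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # [batch, seq_len, hidden_size]
```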