Pengcheng He 2021-05-03 18:05:04 -07:00 committed by Pengcheng He
Parent 793d31fc6f
Commit 771f582279
2 changed files: 24 additions and 1 deletion

View file

@@ -29,3 +29,25 @@ Here is an example to consume SiFT in your existing code,
| Model | MNLI-m/mm | SST-2 | QNLI | CoLA | RTE | MRPC | QQP | STS-B |
|-------|-----------|-------|------|------|-----|------|-----|-------|
| | Acc | Acc | Acc | MCC | Acc | Acc/F1 | Acc/F1 | P/S |
|**[DeBERTa-V2-XXLarge](https://huggingface.co/microsoft/deberta-v2-xxlarge)<sup>1,2</sup>**|91.7/91.9|97.2|96.0|72.0| 93.5| **93.1/94.9**|92.7/90.3 |93.2/93.1 |
|**[DeBERTa-V2-XXLarge+SiFT](https://huggingface.co/microsoft/deberta-v2-xxlarge)<sup>1,2</sup>**|**92.0/92.1**|97.5|**96.5**|**73.5**| **96.5**| - |**93.0/90.7** | - |
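The hunk context above points at a SiFT usage example; as a quick illustration, here is a minimal sketch of adding SiFT regularization to an existing fine-tuning step. It assumes the `hook_sift_layer` and `AdversarialLearner` helpers from this repo's `DeBERTa.sift` module; `model`, `data`, and `hidden_size=768` are placeholder assumptions, not prescribed values.

```python
from DeBERTa.sift import hook_sift_layer, AdversarialLearner

# Assumptions: model is a torch.nn.Module returning (logits, loss);
# data is a batch dict of keyword arguments for the forward pass.
# Attach SiFT perturbation hooks on top of the model's embedding output.
adv_modules = hook_sift_layer(model, hidden_size=768)
adv = AdversarialLearner(model, adv_modules)

def logits_fn(model, **batch):
    # Re-run the forward pass under the perturbed embeddings.
    logits, _ = model(**batch)
    return logits

logits, loss = model(**data)
# Add the SiFT adversarial consistency term to the task loss;
# the rest of the training step (backward, optimizer) is unchanged.
loss = loss + adv.loss(logits, logits_fn, **data)
loss.backward()
```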
# Citation
```
@misc{he2020deberta,
  title={DeBERTa: Decoding-enhanced BERT with Disentangled Attention},
  author={Pengcheng He and Xiaodong Liu and Jianfeng Gao and Weizhu Chen},
  year={2020},
  eprint={2006.03654},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
@article{Jiang_2020,
  title={SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization},
  url={http://dx.doi.org/10.18653/v1/2020.acl-main.197},
  DOI={10.18653/v1/2020.acl-main.197},
  journal={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  publisher={Association for Computational Linguistics},
  author={Jiang, Haoming and He, Pengcheng and Chen, Weizhu and Liu, Xiaodong and Gao, Jianfeng and Zhao, Tuo},
  year={2020}
}
```

View file

@@ -6,6 +6,7 @@ This repository is the official implementation of [ **DeBERTa**: **D**ecoding-**
### 3/31/2021
- Masked language model task is added
- SuperGLUE tasks are added
- SiFT code is added
### 2/03/2021
The DeBERTa v2 code and the **900M, 1.5B** [models](https://huggingface.co/models?search=microsoft%2Fdeberta) are now available. This includes the 1.5B model used for our SuperGLUE single-model submission, which achieved 89.9 versus the human baseline of 89.8. You can find more details about this submission in our [blog](https://www.microsoft.com/en-us/research/blog/microsoft-deberta-surpasses-human-performance-on-the-superglue-benchmark/).
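Since these checkpoints are hosted on the Hugging Face hub, a minimal sketch of loading one looks like the following, assuming a `transformers` version recent enough to include DeBERTa-v2 support; the example sentence is arbitrary.

```python
from transformers import AutoModel, AutoTokenizer

# "microsoft/deberta-v2-xlarge" is the 900M checkpoint;
# swap in "microsoft/deberta-v2-xxlarge" for the 1.5B model.
name = "microsoft/deberta-v2-xlarge"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer("DeBERTa improves BERT with disentangled attention.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```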
@@ -25,7 +26,7 @@ We released the pre-trained models, source code, and fine-tuning scripts to repr
## TODOs
- [x] Add SuperGLUE tasks
- [ ] Add SiFT code
- [x] Add SiFT code
- [x] Add Pretraining code