Add reference to NLP dataset (#5028)
* Add reference to NLP dataset
* Update README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Parent: 0946d1209d
Commit: 0c55a384f8
@@ -1,6 +1,7 @@
---
language: english
thumbnail:
datasets:
- squad_v2
---

# T5-base fine-tuned on SQuAD v2
@@ -16,13 +17,19 @@ Transfer learning, where a model is first pre-trained on a data-rich task before

## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓

[SQuAD v2](https://rajpurkar.github.io/SQuAD-explorer/) combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
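
The abstention requirement above can be illustrated at the data level. The sketch below assumes that each record stores its gold answers under `answers["text"]` and that an unanswerable question has an empty list there. The helper names `is_answerable` and `split_by_answerability` are hypothetical, and the records are hand-written stand-ins, not real SQuAD entries.

```python
from typing import Any, Dict, List, Tuple

Example = Dict[str, Any]


def is_answerable(example: Example) -> bool:
    """Return True when the example carries at least one gold answer text."""
    answers = example.get("answers") or {}
    return len(answers.get("text", [])) > 0


def split_by_answerability(examples: List[Example]) -> Tuple[List[Example], List[Example]]:
    """Partition examples into (answerable, unanswerable) lists."""
    answerable = [ex for ex in examples if is_answerable(ex)]
    unanswerable = [ex for ex in examples if not is_answerable(ex)]
    return answerable, unanswerable


# Toy records in the assumed layout (illustrative only)
toy_examples = [
    {
        "question": "Which model is fine-tuned?",
        "context": "T5-base is fine-tuned on SQuAD v2.",
        "answers": {"text": ["T5-base"], "answer_start": [0]},
    },
    {
        "question": "Who wrote the paragraph?",
        "context": "T5-base is fine-tuned on SQuAD v2.",
        "answers": {"text": [], "answer_start": []},  # no supported answer
    },
]

answerable, unanswerable = split_by_answerability(toy_examples)
print(len(answerable), len(unanswerable))
```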

Dataset ID: ```squad_v2``` from [HuggingFace/NLP](https://github.com/huggingface/nlp)

| Dataset  | Split | # samples |
| -------- | ----- | --------- |
| SQuAD2.0 | train | 130k      |
| SQuAD2.0 | eval  | 12.3k     |
| squad_v2 | train | 130319    |
| squad_v2 | valid | 11873     |

How to load it from [nlp](https://github.com/huggingface/nlp)

```python
import nlp  # Hugging Face `nlp` library

# Load the train and validation splits of SQuAD v2
train_dataset = nlp.load_dataset('squad_v2', split=nlp.Split.TRAIN)
valid_dataset = nlp.load_dataset('squad_v2', split=nlp.Split.VALIDATION)
```
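The exact sample counts in the table can double as a sanity check after loading. The following is a minimal sketch, assuming the loaded splits support `len()`. `check_split_sizes` and `EXPECTED_SIZES` are hypothetical names, and `range` objects stand in for the real datasets so the snippet runs without downloading anything.

```python
from typing import Dict, Mapping, Sized

# Sample counts copied from the squad_v2 rows of the table
EXPECTED_SIZES = {"train": 130319, "valid": 11873}


def check_split_sizes(splits: Mapping[str, Sized], expected: Mapping[str, int]) -> Dict[str, bool]:
    """Map each expected split name to True when it is present with the expected length."""
    return {
        name: name in splits and len(splits[name]) == size
        for name, size in expected.items()
    }


# Stand-ins with the expected lengths; with real data one would pass, e.g.,
# {"train": train_dataset, "valid": valid_dataset} instead.
toy_splits = {"train": range(130319), "valid": range(11873)}
print(check_split_sizes(toy_splits, EXPECTED_SIZES))
```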
Check out more about this dataset and others in [NLP Viewer](https://huggingface.co/nlp/viewer/)


## Model fine-tuning 🏋️