Add code for creating a dataset using path on datastore (#73)
* Add code for creating a dataset using path on datastore
* OutputFileDatasetConfig example
* Example for registering from local
* update warning
* address comments
* Fix import
Parent: f2d62d7040
Commit: b3156223e9
@@ -251,6 +251,46 @@ config.run_config.data_references[data_ref.data_reference_name] = data_ref.to_co
## Create Dataset

### From local data
You can create and register a dataset directly from a folder on your local machine. Note that `src_dir` must point to a **folder**, not a file.

:warning: Method `upload_directory`: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
```python
from azureml.core import Dataset

# upload the data to datastore and create a FileDataset from it
folder_data = Dataset.File.upload_directory(src_dir="path/to/folder", target=(datastore, "self-defined/path/on/datastore"))
dataset = folder_data.register(workspace=ws, name="<dataset_name>")
```
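
Because `src_dir` has to be a folder, it can help to check the path before starting the upload. The following is a minimal sketch that uses only the standard library; the helper name `require_folder` is hypothetical and is not part of the Azure ML SDK.

```python
from pathlib import Path


def require_folder(src_dir: str) -> str:
    """Return src_dir unchanged if it is an existing folder, otherwise raise."""
    path = Path(src_dir)
    if not path.exists():
        raise FileNotFoundError(f"{src_dir!r} does not exist")
    if not path.is_dir():
        # a single file was given; the caller should pass its parent folder
        raise NotADirectoryError(f"{src_dir!r} is not a folder")
    return src_dir
```

The checked value could then be passed on, for example `src_dir=require_folder("path/to/folder")` in the `upload_directory` call above.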

### From a datastore
The code snippet below shows how to create a `Dataset` given a relative path on `datastore`. Note that the path can point to either a folder (e.g. `local/test/`) or a single file (e.g. `local/test/data.tsv`).
```python
from azureml.core import Dataset

# create input dataset
data = Dataset.File.from_files(path=(datastore, "path/on/datastore"))
dataset = data.register(workspace=ws, name="<dataset_name>")
```

### From outputs using `OutputFileDatasetConfig`
```python
from azureml.core import ScriptRunConfig
from azureml.data import OutputFileDatasetConfig

output_data = OutputFileDatasetConfig(
    destination=(datastore, "path/on/datastore"),
    name="<output_name>",
)

config = ScriptRunConfig(
    source_directory=".",
    script="run.py",
    arguments=["--output_dir", output_data.as_mount()],
)

# register your OutputFileDatasetConfig as a dataset
output_data_dataset = output_data.register_on_complete(name="<dataset_name>", description="<dataset_description>")
```
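
The `ScriptRunConfig` above hands the output location to `run.py` through the `--output_dir` argument. Below is a minimal sketch of what such a script might look like. The file name `result.txt`, its contents, and the local fallback folder `outputs` are placeholders chosen for illustration; the real script depends on your job.

```python
# run.py (illustrative sketch; file name and contents are placeholders)
import argparse
import os


def main(argv=None):
    parser = argparse.ArgumentParser()
    # the fallback value is an assumption so the script also runs locally
    parser.add_argument("--output_dir", default="outputs")
    # ignore any extra arguments the launcher may pass along
    args, _ = parser.parse_known_args(argv)

    # make sure the target folder exists before writing into it
    os.makedirs(args.output_dir, exist_ok=True)

    out_path = os.path.join(args.output_dir, "result.txt")
    with open(out_path, "w") as f:
        f.write("hello from run.py\n")
    return out_path


if __name__ == "__main__":
    main()
```

Files written under `--output_dir` are the ones expected to end up at the `destination` configured on `OutputFileDatasetConfig`.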

#### Upload to datastore