Mirror of https://github.com/microsoft/torchgeo.git
Clarify RasterDataset documentation for is_image and dtype (#1811)
* Change DEMs from mask to image (is_image=True)
* fix to revert to upstream file
* fix unused type: ignore comment
* Update torchgeo/datasets/geo.py Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
* Update documentation to explain is_image and dtype. Update asterdem to override dtype.
* fix linting errors
* Made comment for is_image more succinct.
* change asterdem dtype back to float32 (same as RasterDataset)
* removed integer images from documentation
* change Digital Elevation Model to DEM
* Clarify is_image and dtype. Revert DEMs to masks
* Finish reverting DEMs to masks
* address review comments
* Changed Aster Global DEM and EU-DEM Dataset types to "DEM"
* Reorganize some information
* Use better formatting
---------
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Parent: 6d2e9a483b
Commit: 1eaade2747
|
@@ -2,7 +2,7 @@ Dataset,Type,Source,License,Size (px),Resolution (m)
|
|||
`Aboveground Woody Biomass`_,Masks,"Landsat, LiDAR","CC-BY-4.0","40,000x40,000",30
|
||||
`AgriFieldNet`_,"Imagery, Masks",Sentinel-2,"CC-BY-4.0","256x256",10
|
||||
`Airphen`_,Imagery,Airphen,-,"1,280x960",0.047--0.09
|
||||
-`Aster Global DEM`_,Masks,Aster,"public domain","3,601x3,601",30
+`Aster Global DEM`_,DEM,Aster,"public domain","3,601x3,601",30
|
||||
`Canadian Building Footprints`_,Geometries,Bing Imagery,"ODbL-1.0",-,-
|
||||
`Chesapeake Land Cover`_,"Imagery, Masks",NAIP,"CC-BY-4.0",-,1
|
||||
`Global Mangrove Distribution`_,Masks,"Remote Sensing, In Situ Measurements","public domain",-,3
|
||||
|
@@ -10,7 +10,7 @@ Dataset,Type,Source,License,Size (px),Resolution (m)
|
|||
`EDDMapS`_,Points,Citizen Scientists,-,-,-
|
||||
`EnviroAtlas`_,"Imagery, Masks","NAIP, NLCD, OpenStreetMap","CC-BY-4.0",-,1
|
||||
`Esri2020`_,Masks,Sentinel-2,"CC-BY-4.0",-,10
|
||||
-`EU-DEM`_,Masks,"Aster, SRTM, Russian Topomaps","CSCDA-ESA",-,25
+`EU-DEM`_,DEM,"Aster, SRTM, Russian Topomaps","CSCDA-ESA",-,25
|
||||
`EuroCrops`_,Geometries,EU Countries,"CC-BY-SA-4.0",-,-
|
||||
`GBIF`_,Points,Citizen Scientists,"CC0-1.0 OR CC-BY-4.0 OR CC-BY-NC-4.0",-,-
|
||||
`GlobBiomass`_,Masks,Landsat,"CC-BY-4.0","45,000x45,000",100
|
||||
|
|
|
|
@@ -329,7 +329,11 @@
|
|||
"\n",
|
||||
"### `is_image`\n",
|
||||
"\n",
|
||||
-"If your data only contains image files, as is the case with Sentinel-2, use `is_image = True`. If your data only contains segmentation masks, use `is_image = False` instead.\n",
+"If your data only contains model inputs (such as images), use `is_image = True`. If your data only contains ground truth model outputs (such as segmentation masks), use `is_image = False` instead.\n",
+"\n",
+"### `dtype`\n",
+"\n",
+"Defaults to float32 for `is_image == True` and long for `is_image == False`. This is what you want for 99% of datasets, but can be overridden for tasks like pixel-wise regression (where the target mask should be float32).\n",
|
||||
"\n",
|
||||
"### `separate_files`\n",
|
||||
"\n",
|
||||
|
|
|
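The `is_image` and `dtype` rules documented in the hunk above can be illustrated with a small stand-in. This is a minimal sketch, not torchgeo's code: `RasterLike`, `LandCoverMask`, `ElevationTarget`, and the `sample_key` property are hypothetical names, and dtypes are plain strings so the example runs without torch installed.

```python
# Simplified stand-in for the documented defaults; NOT torchgeo's implementation.
# Dtypes are represented as strings (assumption) so no torch import is needed.


class RasterLike:
    """Hypothetical base class mimicking the documented is_image/dtype defaults."""

    is_image = True

    @property
    def dtype(self) -> str:
        # float32 for model inputs, long for ground truth masks
        return "float32" if self.is_image else "long"

    @property
    def sample_key(self) -> str:
        # Samples use the "image" key for inputs and the "mask" key otherwise
        return "image" if self.is_image else "mask"


class LandCoverMask(RasterLike):
    """Categorical segmentation mask: keeps the default long dtype."""

    is_image = False


class ElevationTarget(RasterLike):
    """Pixel-wise regression target: still a mask, but cast to float32."""

    is_image = False

    @property
    def dtype(self) -> str:
        return "float32"


for cls in (RasterLike, LandCoverMask, ElevationTarget):
    ds = cls()
    print(cls.__name__, ds.sample_key, ds.dtype)
```

Overriding only `dtype` keeps the sample under the "mask" key, which matches the pixel-wise regression case the documentation mentions.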
@@ -55,9 +55,11 @@ class GeoDataset(Dataset[dict[str, Any]], abc.ABC):
|
|||
based on latitude/longitude. This allows users to do things like:
|
||||
|
||||
* Combine image and target labels and sample from both simultaneously
|
||||
-(e.g. Landsat and CDL)
+(e.g., Landsat and CDL)
|
||||
* Combine datasets for multiple image sources for multimodal learning or data fusion
|
||||
-(e.g. Landsat and Sentinel)
+(e.g., Landsat and Sentinel)
|
||||
+* Combine image and other raster data (e.g., elevation, temperature, pressure)
+and sample from both simultaneously (e.g., Landsat and Aster Global DEM)
|
||||
|
||||
These combinations require that all queries are present in *both* datasets,
|
||||
and can be combined using an :class:`IntersectionDataset`:
|
||||
|
@@ -69,9 +71,9 @@ class GeoDataset(Dataset[dict[str, Any]], abc.ABC):
|
|||
Users may also want to:
|
||||
|
||||
* Combine datasets for multiple image sources and treat them as equivalent
|
||||
-(e.g. Landsat 7 and Landsat 8)
+(e.g., Landsat 7 and Landsat 8)
|
||||
* Combine datasets for disparate geospatial locations
|
||||
-(e.g. Chesapeake NY and PA)
+(e.g., Chesapeake NY and PA)
|
||||
|
||||
These combinations require that all queries are present in *at least one* dataset,
|
||||
and can be combined using a :class:`UnionDataset`:
|
||||
|
@@ -108,7 +110,7 @@ class GeoDataset(Dataset[dict[str, Any]], abc.ABC):
|
|||
def __init__(
|
||||
self, transforms: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None
|
||||
) -> None:
|
||||
-"""Initialize a new Dataset instance.
+"""Initialize a new GeoDataset instance.
|
||||
|
||||
Args:
|
||||
transforms: a function/transform that takes an input sample
|
||||
|
@@ -344,7 +346,14 @@ class RasterDataset(GeoDataset):
|
|||
#: ``start`` and ``stop`` groups.
|
||||
date_format = "%Y%m%d"
|
||||
|
||||
-#: True if dataset contains imagery, False if dataset contains mask
+#: True if the dataset only contains model inputs (such as images). False if the
+#: dataset only contains ground truth model outputs (such as segmentation masks).
+#:
+#: The sample returned by the dataset/data loader will use the "image" key if
+#: *is_image* is True, otherwise it will use the "mask" key.
+#:
+#: For datasets with both model inputs and outputs, a custom
+#: :func:`~RasterDataset.__getitem__` method must be implemented.
|
||||
is_image = True
|
||||
|
||||
#: True if data is stored in a separate file for each band, else False.
|
||||
|
@@ -363,6 +372,10 @@ class RasterDataset(GeoDataset):
|
|||
def dtype(self) -> torch.dtype:
|
||||
"""The dtype of the dataset (overrides the dtype of the data file via a cast).
|
||||
|
||||
+Defaults to float32 if :attr:`~RasterDataset.is_image` is True, else long.
+Can be overridden for tasks like pixel-wise regression where the mask should be
+float32 instead of long.
|
||||
|
||||
Returns:
|
||||
the dtype of the dataset
|
||||
|
||||
|
@@ -382,7 +395,7 @@ class RasterDataset(GeoDataset):
|
|||
transforms: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None,
|
||||
cache: bool = True,
|
||||
) -> None:
|
||||
-"""Initialize a new Dataset instance.
+"""Initialize a new RasterDataset instance.
|
||||
|
||||
Args:
|
||||
paths: one or more root directories to search or files to load
|
||||
|
@@ -605,7 +618,7 @@ class VectorDataset(GeoDataset):
|
|||
transforms: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None,
|
||||
label_name: Optional[str] = None,
|
||||
) -> None:
|
||||
-"""Initialize a new Dataset instance.
+"""Initialize a new VectorDataset instance.
|
||||
|
||||
Args:
|
||||
paths: one or more root directories to search or files to load
|
||||
|
@@ -873,9 +886,11 @@ class IntersectionDataset(GeoDataset):
|
|||
This allows users to do things like:
|
||||
|
||||
* Combine image and target labels and sample from both simultaneously
|
||||
-(e.g. Landsat and CDL)
+(e.g., Landsat and CDL)
|
||||
* Combine datasets for multiple image sources for multimodal learning or data fusion
|
||||
-(e.g. Landsat and Sentinel)
+(e.g., Landsat and Sentinel)
|
||||
+* Combine image and other raster data (e.g., elevation, temperature, pressure)
+and sample from both simultaneously (e.g., Landsat and Aster Global DEM)
|
||||
|
||||
These combinations require that all queries are present in *both* datasets,
|
||||
and can be combined using an :class:`IntersectionDataset`:
|
||||
|
@@ -896,7 +911,12 @@ class IntersectionDataset(GeoDataset):
|
|||
] = concat_samples,
|
||||
transforms: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None,
|
||||
) -> None:
|
||||
-"""Initialize a new Dataset instance.
+"""Initialize a new IntersectionDataset instance.
|
||||
|
||||
+When computing the intersection between two datasets that both contain model
+inputs (such as images) or model outputs (such as masks), the default behavior
+is to stack the data along the channel dimension. The *collate_fn* parameter
+can be used to change this behavior.
|
||||
|
||||
Args:
|
||||
dataset1: the first dataset
|
||||
|
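The channel-stacking behavior described in the new IntersectionDataset docstring can be sketched without tensors. This is an illustration under assumptions, not the library's `concat_samples`: the function name `stack_channels` is hypothetical, and each image is modeled as a plain list of channels rather than a `torch.Tensor`.

```python
from typing import Any

Sample = dict[str, Any]


def stack_channels(samples: list[Sample]) -> Sample:
    """Hypothetical collate function: concatenate values that share a key.

    Each "image" or "mask" value is modeled as a list of channels, so list
    concatenation stands in for stacking along the channel dimension.
    """
    merged: Sample = {}
    for sample in samples:
        for key, value in sample.items():
            if key in merged and isinstance(value, list):
                # Same key seen before: append the new channels
                merged[key] = merged[key] + value
            else:
                merged[key] = value
    return merged


# Two single-pixel "bands" from one source and one elevation channel from another
landsat = {"image": [[[0.1]], [[0.2]]]}
dem = {"image": [[[350.0]]]}

combined = stack_channels([landsat, dem])
print(len(combined["image"]))  # number of channels after stacking
```

If one dataset used the "mask" key instead, the two values would stay under separate keys, which is the usual image-plus-label case.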
@@ -1026,9 +1046,9 @@ class UnionDataset(GeoDataset):
|
|||
This allows users to do things like:
|
||||
|
||||
* Combine datasets for multiple image sources and treat them as equivalent
|
||||
-(e.g. Landsat 7 and Landsat 8)
+(e.g., Landsat 7 and Landsat 8)
|
||||
* Combine datasets for disparate geospatial locations
|
||||
-(e.g. Chesapeake NY and PA)
+(e.g., Chesapeake NY and PA)
|
||||
|
||||
These combinations require that all queries are present in *at least one* dataset,
|
||||
and can be combined using a :class:`UnionDataset`:
|
||||
|
@@ -1049,7 +1069,12 @@ class UnionDataset(GeoDataset):
|
|||
] = merge_samples,
|
||||
transforms: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None,
|
||||
) -> None:
|
||||
-"""Initialize a new Dataset instance.
+"""Initialize a new UnionDataset instance.
|
||||
|
||||
+When computing the union between two datasets that both contain model inputs
+(such as images) or model outputs (such as masks), the default behavior is to
+merge the data to create a single image/mask. The *collate_fn* parameter can be
+used to change this behavior.
|
||||
|
||||
Args:
|
||||
dataset1: the first dataset
|
||||
|
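The new UnionDataset docstring says overlapping data is merged into a single image/mask but does not spell out the rule. The sketch below shows one plausible rule, assumed purely for illustration: later tiles overwrite earlier ones wherever the later pixel is non-zero. The function `merge_tiles` is hypothetical and the real `merge_samples` may behave differently.

```python
def merge_tiles(tiles: list[list[list[int]]]) -> list[list[int]]:
    """Merge same-sized single-channel tiles into one.

    Assumed rule (not confirmed by the source): a non-zero pixel in a later
    tile replaces whatever an earlier tile had at that position.
    """
    # Copy the first tile so the inputs are not modified
    merged = [row[:] for row in tiles[0]]
    for tile in tiles[1:]:
        for r, row in enumerate(tile):
            for c, value in enumerate(row):
                if value != 0:
                    merged[r][c] = value
    return merged


# Two partially overlapping tiles, with 0 meaning "no data"
tile_a = [[1, 0], [0, 0]]
tile_b = [[0, 2], [0, 3]]

print(merge_tiles([tile_a, tile_b]))
```

A different rule (for example, keeping the first valid pixel) could be supplied through the *collate_fn* parameter mentioned in the docstring.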
|