* Debug failing notebook tests

* Known working nbmake

* Remove buggy tutorial

* Undo version change

* No longer need to rerun tests

* Update tutorials to new syntax
This commit is contained in:
Adam J. Stewart 2023-09-21 08:00:34 -05:00 коммит произвёл GitHub
Родитель 2fbdc85efd
Коммит 8a842055e1
Не найден ключ, соответствующий данной подписи
Идентификатор ключа GPG: 4AEE18F83AFDEB23
6 изменённых файлов: 8 добавлений и 373 удалений

4
.github/workflows/release.yaml поставляемый
Просмотреть файл

@ -72,10 +72,10 @@ jobs:
- name: Install pip dependencies
if: steps.cache.outputs.cache-hit != 'true'
run: |
pip install .[docs,tests] planetary_computer pystac pytest-rerunfailures
pip install .[docs,tests] planetary_computer pystac
pip list
- name: Run notebook checks
run: pytest --nbmake --durations=10 --reruns=10 docs/tutorials
run: pytest --nbmake --durations=10 docs/tutorials
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.head.label || github.head_ref || github.ref }}
cancel-in-progress: true

4
.github/workflows/tutorials.yaml поставляемый
Просмотреть файл

@ -30,10 +30,10 @@ jobs:
- name: Install pip dependencies
if: steps.cache.outputs.cache-hit != 'true'
run: |
pip install .[docs,tests] planetary_computer pystac pytest-rerunfailures
pip install .[docs,tests] planetary_computer pystac
pip list
- name: Run notebook checks
run: pytest --nbmake --durations=10 --reruns=10 docs/tutorials
run: pytest --nbmake --durations=10 docs/tutorials
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.head.label || github.head_ref || github.ref }}
cancel-in-progress: true

Просмотреть файл

@ -32,7 +32,6 @@ torchgeo
tutorials/transforms
tutorials/indices
tutorials/trainers
tutorials/benchmarking
tutorials/pretrained_weights
.. toctree::

Просмотреть файл

@ -1,364 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OFXtoHmJClRf"
},
"source": [
"# Benchmarking\n",
"\n",
"This tutorial benchmarks the performance of various sampling strategies, with and without caching.\n",
"\n",
"It's recommended to run this notebook on Google Colab if you don't have your own GPU. Click the \"Open in Colab\" button above to get started."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"First, we install TorchGeo."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install torchgeo"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hC3pauOLChi4",
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Imports\n",
"\n",
"Next, we import TorchGeo and any other libraries we need."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1629238744113
},
"id": "gjFiws-PChi8"
},
"outputs": [],
"source": [
"import os\n",
"import tempfile\n",
"import time\n",
"from typing import Tuple\n",
"\n",
"from torch.utils.data import DataLoader\n",
"\n",
"from torchgeo.datasets import NAIP, ChesapeakeDE\n",
"from torchgeo.datasets.utils import download_url, stack_samples\n",
"from torchgeo.samplers import RandomGeoSampler, GridGeoSampler, RandomBatchGeoSampler"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Datasets\n",
"\n",
"For this tutorial, we'll be using imagery from the [National Agriculture Imagery Program (NAIP)](https://catalog.data.gov/dataset/national-agriculture-imagery-program-naip) and labels from the [Chesapeake Bay High-Resolution Land Cover Project](https://www.chesapeakeconservancy.org/conservation-innovation-center/high-resolution-data/land-cover-data-project/). First, we manually download a few NAIP tiles."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"naip_root = os.path.join(tempfile.gettempdir(), \"naip\")\n",
"naip_url = (\n",
" \"https://naipeuwest.blob.core.windows.net/naip/v002/de/2018/de_060cm_2018/38075/\"\n",
")\n",
"tiles = [\n",
" \"m_3807511_ne_18_060_20181104.tif\",\n",
" \"m_3807511_se_18_060_20181104.tif\",\n",
" \"m_3807512_nw_18_060_20180815.tif\",\n",
" \"m_3807512_sw_18_060_20180815.tif\",\n",
"]\n",
"for tile in tiles:\n",
" download_url(naip_url + tile, naip_root)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we tell TorchGeo to automatically download the corresponding Chesapeake labels."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chesapeake_root = os.path.join(tempfile.gettempdir(), \"chesapeake\")\n",
"chesapeake = ChesapeakeDE(chesapeake_root, download=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "n6HwpMz7Chi-",
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Timing function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1629238744228
},
"id": "8-z6_y2xChi-",
"nteract": {
"transient": {
"deleting": false
}
},
"tags": []
},
"outputs": [],
"source": [
"def time_epoch(dataloader: DataLoader) -> Tuple[float, int]:\n",
" tic = time.time()\n",
" i = 0\n",
" for _ in dataloader:\n",
" i += 1\n",
" toc = time.time()\n",
" return toc - tic, i"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following variables can be modified to control the number of samples drawn per epoch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbmake": {
"mock": {
"batch_size": 1,
"length": 1,
"size": 1,
"stride": 1000000
}
}
},
"outputs": [],
"source": [
"size = 1000\n",
"length = 888\n",
"batch_size = 12\n",
"stride = 500"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "I3pkKYoeChi_",
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## RandomGeoSampler"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1629248963725
},
"id": "jPjIZLF7Chi_",
"nteract": {
"transient": {
"deleting": false
}
},
"outputId": "edcc8199-bd09-4832-e50c-7be8ac78995b",
"tags": []
},
"outputs": [],
"source": [
"for cache in [False, True]:\n",
" chesapeake = ChesapeakeDE(chesapeake_root, cache=cache)\n",
" naip = NAIP(naip_root, crs=chesapeake.crs, res=chesapeake.res, cache=cache)\n",
" dataset = chesapeake & naip\n",
" sampler = RandomGeoSampler(dataset, size=size, length=length)\n",
" dataloader = DataLoader(\n",
" dataset, batch_size=batch_size, sampler=sampler, collate_fn=stack_samples\n",
" )\n",
" duration, count = time_epoch(dataloader)\n",
" print(duration, count)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "pHqLRDA_ChjB",
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## GridGeoSampler"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1629239313388
},
"id": "K67vnCK4ChjC",
"nteract": {
"transient": {
"deleting": false
}
},
"outputId": "159ce99f-a438-4ecc-d218-9b9e28d02055",
"tags": []
},
"outputs": [],
"source": [
"for cache in [False, True]:\n",
" chesapeake = ChesapeakeDE(chesapeake_root, cache=cache)\n",
" naip = NAIP(naip_root, crs=chesapeake.crs, res=chesapeake.res, cache=cache)\n",
" dataset = chesapeake & naip\n",
" sampler = GridGeoSampler(dataset, size=size, stride=stride)\n",
" dataloader = DataLoader(\n",
" dataset, batch_size=batch_size, sampler=sampler, collate_fn=stack_samples\n",
" )\n",
" duration, count = time_epoch(dataloader)\n",
" print(duration, count)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8rwjrOD1ChjD",
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## RandomBatchGeoSampler"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1629249843438
},
"id": "v-N2fo6UChjE",
"nteract": {
"transient": {
"deleting": false
}
},
"outputId": "497f6869-1ab7-4db7-bbce-e943b493ca41",
"tags": []
},
"outputs": [],
"source": [
"for cache in [False, True]:\n",
" chesapeake = ChesapeakeDE(chesapeake_root, cache=cache)\n",
" naip = NAIP(naip_root, crs=chesapeake.crs, res=chesapeake.res, cache=cache)\n",
" dataset = chesapeake & naip\n",
" sampler = RandomBatchGeoSampler(\n",
" dataset, size=size, batch_size=batch_size, length=length\n",
" )\n",
" dataloader = DataLoader(dataset, batch_sampler=sampler, collate_fn=stack_samples)\n",
" duration, count = time_epoch(dataloader)\n",
" print(duration, count)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "benchmarking.ipynb",
"provenance": []
},
"execution": {
"timeout": 1200
},
"kernel_info": {
"name": "python38-azureml"
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

Просмотреть файл

@ -204,8 +204,8 @@
" weights=weights,\n",
" in_channels=13,\n",
" num_classes=10,\n",
" learning_rate=0.001,\n",
" learning_rate_schedule_patience=5,\n",
" lr=0.001,\n",
" patience=5,\n",
")"
]
},

Просмотреть файл

@ -174,8 +174,8 @@
" weights=ResNet18_Weights.SENTINEL2_ALL_MOCO,\n",
" in_channels=13,\n",
" num_classes=10,\n",
" learning_rate=0.1,\n",
" learning_rate_schedule_patience=5,\n",
" lr=0.1,\n",
" patience=5,\n",
")"
]
},