Commit graph

21 commits

Author SHA1 Message Date
Matt McFarland 9beb59b65b
Remove storage account keys from deployment (#224)
* Remove storage account keys from table access

* pcfuncs don't use keys

* Deploy and tests

* Contrib

* Remove temp

* Remove verbosity

* Move settings validation to startup check
2024-06-24 09:50:39 -04:00
Matt McFarland 34ebd993c7
Use mcr for nginx-ingress (#213)
Also updates:
- 24 hour image cleaner cycle
- Don't specify k8s version; it's auto upgraded
2024-06-07 14:58:39 -04:00
Matt McFarland 1282fbb893
Add config/map/token endpoint to support Data Catalog (#187)
* Add map token endpoint to tiler service

The tiler service will generate a token for use against an azure maps
instance, using the identity of the tiler (when deployed) or the local
developer credentials (in local development).

A test has been added that requires a local identity, and this has been
skipped in CI, which does not have access to that kind of credentials.

This endpoint will be used by the Data Catalog app to avoid distributing
an azure maps key within that application.

* Remove unneeded role assignment

* Remove unused variables
2024-03-18 09:44:34 -04:00
Gustavo Hidalgo dcfd7c7666
Use Key Vault secrets instead of cert-manager secret in AKS cluster (#185)
* Update deployment script and add nginx-values.yaml

* Update Dockerfile, values.yaml, variables.tf, and lib script

* Update to satisfy pydantic and mypy

* More mypy induced changes

* More mypy changes
2024-02-05 09:03:07 -05:00
Matt McFarland 26a67e1266
Support collection-level vector tiles (#147)
* Fetch all configs when iterating over collections

Rather than fetch 1 render config at a time on the /collections
endpoint, fetch all at once and preserve the Dict for the request
duration.

* Allow POST CORS requests in dev env

* Vector tile support

* Add default msft:region attribute to collections

* Upgrade to postgres 14 and pgstac 0.6.13

Prod services operate on pg14

* Fix tests and setup

The API now uses table_service.get_entities and there is an Azurite bug
that prevents an empty string for "all records", so it was switched to a
specific PartitionKey filter string.

* Add logging for pbf requests

* Deployment

* Add logging and debug code

Analyze relative performance of different calls in the VT endpoint
chain.

* Fix Exceptions

* Changelog
2023-01-24 15:22:44 -05:00
Matt McFarland 1a423e2d33
Configure CORS for pc-test environment (#127)
* Configure CORS for pc-test environment

Ensure local dev nginx proxy also passes CORS headers from app.

* changelog

* lint
2022-10-04 21:05:43 -04:00
Matt McFarland 6b74df3a1d
Timeout middleware (#120)
* Exclude successful health check endpoints from logs

* Add timeout middleware

This middleware is intended to return a controlled message and error to
the user in cases where the request exceeds the timeout. Previously,
they might receive a QueryCancelledError, a Service Unavailable, or a
GatewayTimeout depending on which mechanism (still sometimes opaque to
us) initiated the timeout. By ensuring the setting is less than the
gateway timeout, we should collect logs in all instances and provide a
meaningful error message to the user.

* Add support email in error message

* Changelog
2022-09-21 21:22:18 -04:00
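
The timeout behaviour described in #120 can be sketched roughly as below. This is a minimal illustration built only on asyncio, not the project's actual middleware: the name `with_timeout`, the 504 status code, and the message text are all assumptions made for the example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Tuple

# Hypothetical response shape for this sketch: (status code, JSON-like body).
Response = Tuple[int, Dict[str, str]]


async def with_timeout(
    handler: Callable[[], Awaitable[Response]], timeout_seconds: float
) -> Response:
    """Run a handler, returning a controlled error if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(handler(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # Assumed status and wording; the point is that the application,
        # not the gateway, decides what the user sees.
        return 504, {"detail": "The request exceeded the maximum allowed time."}


async def _fast_handler() -> Response:
    return 200, {"detail": "ok"}


async def _slow_handler() -> Response:
    await asyncio.sleep(1.0)
    return 200, {"detail": "ok"}


fast_result = asyncio.run(with_timeout(_fast_handler, 0.5))
slow_result = asyncio.run(with_timeout(_slow_handler, 0.01))
```

Keeping `timeout_seconds` below the gateway timeout, as the commit message notes, is what lets the application respond (and log) before an upstream component cuts the request off.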
Rob Emanuele c177bcc333
Enable stac and tiler to cross reference each other on the same domain (#113)
* Delete unused utils

* Enable relative paths for services.

This will use the request URL to derive the full HREF from
STAC -> Tiler or Tiler -> STAC

* Set deployed stac->tiler and tiler->stac to use relative hrefs

Co-authored-by: Matt McFarland <mmcfarland@microsoft.com>
2022-09-13 16:44:37 -04:00
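
Deriving a full HREF from the request URL, as #113 describes, can be illustrated with the standard library. This is only a sketch of the idea: the function name `resolve_service_href` and the example paths are hypothetical and not taken from the codebase.

```python
from urllib.parse import urljoin


def resolve_service_href(request_url: str, relative_href: str) -> str:
    """Resolve a relative service href against the URL of the current request."""
    # A path starting with "/" replaces the request path but keeps the
    # scheme and host, so both services resolve on the same domain.
    return urljoin(request_url, relative_href)


# Hypothetical paths: a STAC request that needs to link to the tiler.
tiler_href = resolve_service_href(
    "https://example.com/api/stac/v1/collections/naip",
    "/api/data/v1/collection/tilejson.json",
)
```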
Matt McFarland 70991721ea
Hydrate search result items in the API (#81)
* Hydrate search result items in the API

Uses new pgstac `nohydrate` option to avoid using database to merge
result items with collection base items. This reduces CPU load on the
database and distributes it across the API cluster, allowing higher
request throughput.

* Experimental malloc setting for memory consumption

Attempt at managing memory leak.

* Debug timing for tile stages

* Logs and max items

* Update default max_items setting

* Upgrade to pgstac 0.6.1

* Adjust two search tests

These tests began to fail with pgstac versions after 0.5.2. It wasn't clear why
there was an expectation that timestamps off by 1 second should match an
item, and the open window format of "../.." was clearly disallowed via
the spec:

https://github.com/radiantearth/stac-api-spec/blob/master/implementation.md#datetime-parameter-handling

> Only one of the interval ends may be open.

* Changelog

* Finalize dependencies
2022-05-13 21:16:41 -04:00
Matt McFarland 140248f182
Tiler debug improvements (#89)
* Experimental setting for tiler memory consumption

Attempt at managing memory leak by reducing GDAL file caches and
reducing the threshold at which process memory is deallocated.

* Debug timing for tile stages

* Set low limit for number of items per tile

Some collections, due to use of skipfull or just the size of items
relative to a tile, could be hitting the hardcoded pgstac-titiler limit
for number of items to attempt to read into a tile. This lowers to a
strict default of 5, but allows collections to configure overrides
higher or lower if deemed necessary.

* Add PC_SDK_SUBSCRIPTION_KEY to tiler chart

This allows the tiler to request SAS tokens through a PC account that
has specialized permissions.

* Move constant default to a setting
2022-05-11 15:28:03 -04:00
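
The "strict default of 5, with per-collection overrides" rule from #89 amounts to a lookup with a fallback. The sketch below assumes a plain mapping of overrides; the names `DEFAULT_MAX_ITEMS_PER_TILE` and `max_items_for` are illustrative, and the real setting lives in the project's configuration.

```python
from typing import Mapping

# Default taken from the commit message; the constant name is hypothetical.
DEFAULT_MAX_ITEMS_PER_TILE = 5


def max_items_for(collection_id: str, overrides: Mapping[str, int]) -> int:
    """Return the per-tile item limit, honouring a collection-level override."""
    return overrides.get(collection_id, DEFAULT_MAX_ITEMS_PER_TILE)


# Hypothetical overrides: one collection raises the limit, one lowers it.
example_overrides = {"collection-a": 20, "collection-b": 2}
```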
Matt McFarland d73146335d
Manage connections, uvicorn, cleanup (#73)
* Allow GDAL to retry certain HTTP failures

* Use uvicorn for stac api

This matches the change for pctiler in #67

* Clean up env vars

Some env vars were used only in gunicorn, which was removed previously.
Other helm values were nested under the wrong keys.

* Specify min/max connection size for pcstac at 1

This overrides the stac-fastapi default of 10, and ensures a single
instance only opens a single database connection.

* Add CQL_TEXT conformance class

* Updates from ./scripts/format
2022-04-12 19:05:23 -04:00
Matt McFarland d3580fc17c
Lower cache TTLs for dev and PCT (#60)
TTL for table config and redis cached values is reduced to 1 second in
local development environment to prevent stale values returning during
development. In PCT environments, the table config TTL is also reduced
to 1 second to support faster iteration on config in the test
environment. The PCT redis TTL was lowered to 1 minute to still provide some
efficiency during test, but to lower the time to wait for new
items/collections to show up that were previously cached, as that is a main
use case of the environment. Default values remain the same, so
staging/prod are not affected, but can now be set at a target, rather
than default, value.
2022-03-11 13:06:47 -05:00
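
The effect of lowering a TTL, as in #60, is easiest to see with a small time-based cache. The project itself uses cachetools and redis; the class below is a self-contained stand-in written for illustration, with an injectable clock so the expiry can be shown without sleeping.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Minimal TTL cache: entries are reloaded once they are older than the TTL."""

    def __init__(
        self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value


# Demonstration with a fake clock and a 1 second TTL (the dev/PCT table value).
fake_now = [0.0]
load_count = [0]


def load_config() -> Dict[str, int]:
    load_count[0] += 1
    return {"version": load_count[0]}


cache = TTLCache(ttl_seconds=1.0, clock=lambda: fake_now[0])
first = cache.get_or_load("render-config", load_config)
second = cache.get_or_load("render-config", load_config)  # served from cache
fake_now[0] = 2.0  # advance past the TTL
third = cache.get_or_load("render-config", load_config)  # reloaded
```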
Matt McFarland 8cb39caec3
Fix ip exception table name in terraform (#56)
* Fix ip exception table name in terraform

CI failed to deploy due to reference to undeclared resource.

* Make storage and cache top level helm config

Rather than duplicating the config for the shared resources of table
storage and redis cache, treat them as top level config similar to
postgres. Also fixes references to stac namespace from published tiler
helm chart.
2022-03-07 12:39:08 -05:00
Rob Emanuele 3684024c9c
Add redis for caching and rate limiting (#52)
* Add redis to docker-compose services

* Refactor get_request_ip into utils

* Rename TABLE_TTL -> TTL, will use it with caching

* Add redis caching and rate limiting to pccommon

* Add rate limit configuration

* Caching and rate limiting on STAC API

* Add redis to terraform

* Add redis config to helm charts

* Connect to redis for tests

* Add test for rate limit, but skip as it is nondeterministic

* Update CHANGELOG

* Implement backpressure

* Add backpressure to Helm chart

* Use decorators for rate_limit and back_pressure

* Add IP exception table that avoids rate limiting

* Get the IP from the last in the list, not first

If using an X-Forwarded-For to get the IP,
get the last one as that will be the IP coming from
the last proxy. The first IP can be anything set on
the header of the request. Since we take the
X-Azure-ClientIP header value first, this would
have not been used, but changing for best practice.

* Add IP exception table to Helm chart

* Allow traffic from planetarycomputer-test

* Add ip-exception config to tiler docker-compose

Co-authored-by: Matt McFarland <mmcfarland@microsoft.com>
2022-03-04 15:23:45 -05:00
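
The IP selection rule described in #52 (prefer X-Azure-ClientIP, otherwise take the last X-Forwarded-For entry) might look like the sketch below. `get_request_ip` is named in the commit, but this signature, the plain-mapping input, and the lowercase header keys are assumptions for the example.

```python
from typing import Mapping, Optional


def get_request_ip(headers: Mapping[str, str]) -> Optional[str]:
    """Pick the client IP from request headers (keys assumed to be lowercase)."""
    azure_ip = headers.get("x-azure-clientip", "").strip()
    if azure_ip:
        return azure_ip

    forwarded_for = headers.get("x-forwarded-for", "")
    # Entries are comma-separated; the last one was appended by the nearest
    # proxy, whereas the first can be set freely by the client.
    candidates = [part.strip() for part in forwarded_for.split(",") if part.strip()]
    return candidates[-1] if candidates else None
```

For a header value of `"203.0.113.7, 10.0.0.2"` this returns `"10.0.0.2"`, since a client can put anything it likes in the leading entries.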
Matt McFarland fea7c69dc0
Fix PCT tiler signing (#54)
* Add correct PCT SAS token href for PCT tiler signing

Needed both the APIM url and the /token endpoint set in order for PCT
tiler to sign and read assets correctly.

* Consistently use quotes for jinja variables

We used a mix, and both are valid. However, my editor routinely tries to
add spaces between unquoted double braces using the JSON linter leading
to unintended syntax errors in the rendered Values file.
2022-03-01 10:30:10 -05:00
Rob Emanuele 33f5487247
Use Azure Storage Tables for collection and container configuration (#48)
* Add additional dependencies to pccommon

* Fix mypy error in pctiler

Bringing in the type stubs for cachetools caused
mypy to complain about unknown types for the key
function

* Refactor scripts to test pccommon

Also run flake8 on pccommon,
which wasn't happening

* Linting fixups

* Add tables classes

* Refactor collection config in pccommon

Also refactor CommonConfig to use pydantic settings.
Create a table setup for collection configuration
and container configuration.
Use cachetools to cache the configuration.

* Add Azurite setup

Encode collection configuration and container
configuration (which was hardcoded) as JSON.
This can be used to populate the initial
table structure in deployment as well, after
which this test data will diverge from production
settings.

* Update codebase to use refactored configuration

* Set azurite settings in docker-compose

Also account for the environment prefix for DEBUG
that changed with the refactor to use BaseSettings in
CommonConfig

* Move to using only pytest for consistency

* Test get render config for naip

* Refactor config code layout

Enable configuration of TTL

* Run azurite setup in scripts/setup

Also fix setup_azurite

* Add mosaicInfo and queryables to collection config

* Remove usage of requirements.txt

This was being used inconsistently.

* Add script for local package install

* Fetch queryables from storage tables

* Use orjson in pccommon

* Use ORJSONResponse

* Remove unused endpoint prefixes

* Add mosaic/info endpoint

* Add method to fetch all rows

* Add CLI for loading and dumping config data

* Variable for k8s version; update dev

* Allow AKS to pull from ACR

* Storage Tables in terraform

* Add config table env vars to helm charts

* Update ingress apiVersion

* Make note in deploy README about updating tables

* Update CHANGELOG

* Linting/formatting

* Remove unused __init__ override

This was left over from a previous
implementation, should have been cleaned up.

* Remove unused vars in dev terraform

* Allow cli to dump configs by id

Co-authored-by: Matt McFarland <mmcfarland@microsoft.com>
2022-02-17 16:06:08 -05:00
Matthew McFarland 16fc0c8751
Small pre-release fixes (#33)
* Fix describedby collection link

Use trailing slash to keep dataset path-part.

* Remove superfluous trace log

The service logs all requests with collection information, the trace is
duplicative.

* Fix titiler param bug

* Set PC_SDK_SAS_URL per environment

Previously, outside of development environments, the PC signing SDK
would use the production SAS URL in all environments. Instead, have the
default stack point to its own deployed SAS service to fully test out
integrated changes.

Additionally, change the default dev environment SAS URL to staging, as
it will most likely match any forthcoming releases to test against.

* Update changelog
2022-01-18 14:34:04 -05:00
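
The describedby fix in #33 relies on how a trailing slash affects relative URL resolution. The commit does not say which function builds the link, so the snippet below only demonstrates the general behaviour with `urllib.parse.urljoin`; the base URL and the `naip` segment are hypothetical.

```python
from urllib.parse import urljoin

# Without a trailing slash, the last path segment of the base is replaced.
without_slash = urljoin("https://example.com/dataset", "naip")

# With a trailing slash, the "dataset" path-part is kept.
with_slash = urljoin("https://example.com/dataset/", "naip")
```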
Matthew McFarland 3f767d6c00
Logging improvements (#23)
* Add dedicated health check endpoint for pctiler

Also sets the liveness probe to use this new endpoint. Uses the same URL
path as pcstac, this will help isolate requests in the logs.

* Use constants for logging service name

* Fix local volume mount

The directory is copied in the Dockerfile, so it was loading on the
container, however this typo prevented reloading of pccommon when
changes were made without rebuilding the image.

* Consolidate tracing functionality

Rather than maintaining parity between two request tracing
implementations, create a common trace that can be used by both
projects.

* Prevent request tracing on health check endpoints

* Parse collection/item ids from search for logging

* Allow ACR overrides in dev deployments

Defaults to publish images for staging deploys

* Rename for Python convention

* Include request ip in trace

* Remove unused config

* Use request sensitive middleware for trace logs

Traditional middleware will corrupt usage of starlette request objects
by downstream route functions. Use a middleware class crafted for
accessing the request body without interfering with further processing.

* Lint: formatting

Auto-formatter
2022-01-10 15:46:25 -05:00
James Santucci 837df9e6fb
Use exception logging to capture failures in app insights (#14)
* Use exception logging to capture failures in app insights

* Format

* move global exception middleware to pccommon

* format

* fix return type of middleware

* remove unused

* Update changelog

* wip -- log encountered exceptions

* include extra metadata with exception

* remove metrics reporting

* configure liveness paths

* remove erroneous middlewares

* format

* remove more erroneous middleware

* Remove unused metrics module

* remove unused

* Remove pctiler unused

* Remove more unused

* Add special cd exception for this branch

* run on pull requests instead 🤞🏻

* correct tiler liveness env variable name

* bail on LIVENESS_PATH checks

* Format (_really_ thought I did that already)

* Remove pr trigger from cd

* Remove unused import

Co-authored-by: Rob Emanuele <rdemanuele@gmail.com>
Co-authored-by: Matt McFarland <mmcfarland@microsoft.com>
2021-12-15 13:30:00 -05:00
Rob Emanuele 81a666e843
Project renames; publish and deploy (#12)
* Add helm chart publication

* Remove old cipublish path

* switch from "1" to "true"

GitHub sets CI to "true" rather than 1:
https://docs.github.com/en/actions/learn-github-actions/environment-variables#default-environment-variables

* wip -- attempt the run

* don't condition deploy step for now

* switch branch back to main

you can use $default-branch in templates, but not
actual workflows:
https://github.blog/changelog/2020-07-22-github-actions-better-support-for-alternative-default-branch-names/

* set azure env variables

* parse and echo from json

* don't set tenant id?

https://github.com/hashicorp/terraform-provider-azuread/issues/343#issuecomment-721455149

??

* oops i don't think these are getting passed through

* add override compose

this compose file un-sets the env variables that we don't have because
we don't have a .env file in ci, so I _think_ it should inherit them
from the environment

* add back mqe resource group

why was this deleted

am i being trolled

* Add path debugging

* check what tf keys we got

* remove api management from ingress 🤞🏻

* all container registries through github

* remove debug prints from jinja

* re-remove the mqe resource group

because it was deliberately missing, not accidentally

* skip tests for a sec

* debug echo cluster name and rg

* require az account env variables

* reuse tf env variable names

* don't point to `latest` tag

* Add workflow for publishing helm charts to GH Pages.

* Rename charts, separate published, reset versions

* Use -dev suffix for development release

* Publish dev charts only on main

* Rename mqe -> stac, dqe -> tiler in codebase

* Update README

Rename MQE to STAC API and DQE to Tiler; also editorially make things more terse

* Remove testdata, move loadtestdata to tests

* Remove out of date and generic docs for tiler

* Use cipublish to publish images

* Rename python packages to be prefixed with 'pc'

* Remove stac-vrt

* Remove unused model and method.

Also formatting

* Remove scripts/env

* Upgrade to stac-fastapi 2.2.0

* Delete unused doc images

* Update deployment code with renames.

* Update cert-manager, other deployment fixes

* Test on cibuild, remove GA test branch trigger

* Fixup PR template

Co-authored-by: Nathan Zimmerman <npzimmerman@gmail.com>
Co-authored-by: James Santucci <james.santucci@gmail.com>
2021-11-01 11:27:39 -04:00
Rob Emanuele e1ec9529c6
Initial commit.
2021-10-18 12:13:28 -04:00