Bulk Export

This feature allows data from the FHIR server to be exported. More details can be found in the spec. The feature is currently turned on by default. To disable it, set FhirServer:Operations:Export:Enabled to false.
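The setting name above is a colon-separated configuration key. As a minimal sketch, assuming the server reads its settings through the standard ASP.NET Core configuration system (nested JSON sections, or environment variables with ':' written as '__'), the Python helpers below show where such a key would live. The helper names to_nested and to_env_name are hypothetical and exist only for illustration.

```python
import json


def to_nested(settings):
    """Expand colon-separated keys into the nested layout of a JSON settings file."""
    root = {}
    for key, value in settings.items():
        node = root
        *parents, leaf = key.split(":")
        for part in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def to_env_name(key):
    """Assumption: ASP.NET Core convention, where ':' becomes '__' in environment variables."""
    return key.replace(":", "__")


settings = {"FhirServer:Operations:Export:Enabled": False}

# Nested JSON form of the setting that disables export.
print(json.dumps(to_nested(settings), indent=2))

# Equivalent environment variable name.
print(to_env_name("FhirServer:Operations:Export:Enabled"))
```

The same mapping applies to the other FhirServer:Operations:Export settings mentioned on this page.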

Specifying destination

There are two ways to set the destination storage account for exported data. One way is to put the storage account's connection string in the FhirServer:Operations:Export:StorageAccountConnection setting. The fhir-server then uses that connection string to connect to the storage account and export data.

The other option is to set FhirServer:Operations:Export:StorageAccountUri to the URI of the storage account. This option assumes that the fhir-server has permission to contribute data to that storage account. One way to achieve this (assuming you are running the fhir-server code in App Service with Managed Identity enabled) is to give the App Service Storage Blob Data Contributor permissions on the storage account of your choice.

Currently, we only support Azure Blob storage as the destination.

We recommend using a connection string with a SAS token so that the token expires after a set period of time.

Example of the connection string expected when using FhirServer:Operations:Export:StorageAccountConnection:

BlobEndpoint=https://example.blob.core.windows.net/;QueueEndpoint=https://example.queue.core.windows.net/;FileEndpoint=https://example.file.core.windows.net/;TableEndpoint=https://example.table.core.windows.net/;SharedAccessSignature=[SAS Token]
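To make the structure of such a connection string concrete, here is a small Python sketch that splits it into key/value pairs and checks that it carries a SAS token rather than an account key. The function names are hypothetical and are not part of the fhir-server. The sketch only inspects the string and does not contact any storage account.

```python
def parse_connection_string(conn):
    """Split 'Key=Value;Key=Value' into a dict. Values may themselves contain '='."""
    parts = {}
    for segment in conn.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing ';'
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = value
    return parts


def uses_sas(conn):
    """True when the string authenticates with a SAS token and no account key."""
    parts = parse_connection_string(conn)
    return "SharedAccessSignature" in parts and "AccountKey" not in parts


example = (
    "BlobEndpoint=https://example.blob.core.windows.net/;"
    "SharedAccessSignature=[SAS Token]"
)
print(parse_connection_string(example)["BlobEndpoint"])
print(uses_sas(example))
```

Splitting on the first '=' only matters for values such as base64 account keys, which can end in '=' characters.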

Well-known Azure Storage Emulator connection string:

DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;TableEndpoint=http://127.0.0.1:10002/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;

Example of the storage account URI expected when using FhirServer:Operations:Export:StorageAccountUri:

https://<accountName>.blob.core.windows.net/

Example of the URL used to queue a new export job:

https://test-fhir-server/$export
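A hedged sketch of how a client might prepare that request in Python, using only the standard library. The Accept and Prefer headers come from the FHIR Bulk Data Access specification, not from this page. Authentication is omitted and would need to be added for a secured server. The function name build_export_request is hypothetical.

```python
import urllib.request


def build_export_request(base_url):
    """Prepare (but do not send) a system-level $export kick-off request."""
    url = base_url.rstrip("/") + "/$export"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            # Headers defined by the Bulk Data Access specification.
            "Accept": "application/fhir+json",
            "Prefer": "respond-async",
        },
    )


req = build_export_request("https://test-fhir-server")
print(req.get_method(), req.full_url)

# To actually queue the job, pass the request to urllib.request.urlopen(req).
```

Building the request separately from sending it keeps the example runnable without a live server.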

What is supported

For more details on Bulk Export, see the Azure API for FHIR Export Data page.

Below is a set of additional query parameters that users can specify in addition to the ones defined in the Azure API for FHIR documentation:

  1. _format: allows a user to select a format for the file structure that the export job creates. Different formats can be defined in the appSettings by combining constants, folder level breaks ('/'), and known tags. The tags are replaced with data when the job runs. The three supported tags are:
     • resourcename: Replaced with the resource type being exported.
     • timestamp: Replaced with a timestamp of the job's queued time.
     • id: Replaced with the GUID of the export job.
  2. _maxCount: allows a user to reduce the number of resources exported by a single job. Users can pass the _maxCount=xxxx query parameter or set MaximumNumberOfResourcesPerQuery in the export configuration section. The default is 10,000. The export operation needs memory to serialize the data when it writes to the lake. To reduce out-of-memory exceptions, users can lower the value in decrements of 1,000. Increasing the compute memory on the FHIR server would also help.

To use the format, you will need to set the following settings in the appSettings:

| appSetting | Description | Example Value |
| --- | --- | --- |
| FhirServer:Operations:Export:Formats:#:Name | Name the format you plan to call. The # should be replaced as you can specify multiple formats. We provide default values for 0 and 1 (for when a container is specified and when a container is not specified), so we recommend starting with 2. | TestFormat |
| FhirServer:Operations:Export:Formats:#:Format | Defines the format. The # should match the one used above. | test/<resourcename>/<id>/<timestamp> |

Using the values in the table above, you would request the format as follows: GET https://<<FHIR service base URL>>/$export?_format=TestFormat. The result would be an export saved in the folder structure test/<resourcename>/<id>, with the file name <timestamp>.ndjson.
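The tag substitution described above can be illustrated with a short Python sketch. This is not the server's implementation. It only mimics the documented behavior of replacing the three tags and treating the last path segment as the file name. The function name expand_format, the sample GUID, and the timestamp's string form are all hypothetical.

```python
def expand_format(fmt, resource_type, job_id, timestamp):
    """Replace the known tags, then split the result into folder and file name."""
    path = (
        fmt.replace("<resourcename>", resource_type)
        .replace("<id>", job_id)
        .replace("<timestamp>", timestamp)
    )
    # Everything before the last '/' is the folder; the rest names the file.
    folder, _, name = path.rpartition("/")
    return folder, name + ".ndjson"


folder, file_name = expand_format(
    "test/<resourcename>/<id>/<timestamp>",
    resource_type="Patient",
    job_id="11111111-2222-3333-4444-555555555555",  # hypothetical job GUID
    timestamp="2020-01-01T00:00:00",  # hypothetical timestamp rendering
)
print(folder)
print(file_name)
```

With the TestFormat definition, each resource type ends up in its own folder under test/, grouped by export job.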

Exported data can be de-identified using the FHIR Tools for Anonymization.