This commit is contained in:
Hong Ooi 2020-10-22 00:58:47 +11:00
Parent 9a878fbcf6
Commit 312664bb25
18 changed files: 966 additions and 79 deletions

10
.Rbuildignore Normal file
@@ -0,0 +1,10 @@
^misc$
^\.vs$
\.sln$
\.Rproj$
\.Rxproj$
^\.Rproj\.user$
CONTRIBUTING.md
^LICENSE\.md$
^SECURITY\.md$
azure-pipelines.yml
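
Each line of `.Rbuildignore` is treated by `R CMD build` as a Perl-style regular expression, matched case-insensitively against file paths relative to the package root. The following is a minimal sketch of how patterns like those above behave. The helper `is_ignored` and the file names are hypothetical, and the code only approximates what the build tool does.

```r
# Hypothetical helper: does any ignore pattern match the given path?
# Approximates R CMD build, which matches case-insensitively using Perl regexes.
is_ignored <- function(path, patterns)
{
    any(vapply(patterns, function(p) grepl(p, path, perl=TRUE, ignore.case=TRUE), logical(1)))
}

patterns <- c("^misc$", "\\.Rproj$", "CONTRIBUTING.md", "^LICENSE\\.md$")

is_ignored("AzureCosmosR.Rproj", patterns)      # matched by \.Rproj$
is_ignored("LICENSE.md", patterns)              # matched by ^LICENSE\.md$
is_ignored("LICENSE", patterns)                 # no pattern matches, so it stays in the build
is_ignored("docs/CONTRIBUTINGxmd", patterns)    # matched: unanchored, and '.' is a wildcard
```

Anchoring a pattern with `^` and `$` and escaping the dots makes it match only the intended file.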

14
CONTRIBUTING.md Normal file
@@ -0,0 +1,14 @@
# Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

30
DESCRIPTION Normal file
@@ -0,0 +1,30 @@
Package: AzureCosmosR
Title: Interface to the 'Azure CosmosDB' and table storage services
Version: 0.0.1
Authors@R: c(
person("Hong", "Ooi", , "hongooi73@gmail.com", role = c("aut", "cre")),
person("Microsoft", role="cph")
)
Description: An interface to 'Azure CosmosDB' and table storage: <https://azure.microsoft.com/en-us/services/cosmos-db/>, <https://azure.microsoft.com/en-us/services/storage/tables/>. On the admin side, 'AzureCosmosR' provides the ability to create and manage 'CosmosDB' instances in Microsoft's 'Azure' cloud. On the client side, it provides functionality for reading and writing data stored in 'CosmosDB' as well as in table storage. Part of the 'AzureR' family of packages.
URL: https://github.com/Azure/AzureCosmosR https://github.com/Azure/AzureR
BugReports: https://github.com/Azure/AzureCosmosR/issues
License: MIT + file LICENSE
Depends:
R (>= 3.3)
Imports:
utils,
AzureRMR (>= 2.0.0),
AzureStor (>= 3.0.0),
openssl,
jsonlite,
httr,
uuid,
vctrs (>= 0.3.0)
Suggests:
AzureKeyVault,
testthat,
knitr,
rmarkdown,
tibble
Roxygen: list(markdown=TRUE)
RoxygenNote: 7.1.1

23
LICENSE
@@ -1,21 +1,2 @@
MIT License
Copyright (c) Microsoft Corporation.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE
YEAR: 2020
COPYRIGHT HOLDER: Microsoft

21
LICENSE.md Normal file
@@ -0,0 +1,21 @@
# MIT License
Copyright (c) 2020 Microsoft Corporation.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE

27
NAMESPACE Normal file
@@ -0,0 +1,27 @@
# Generated by roxygen2: do not edit by hand
S3method(azure_table,table_endpoint)
S3method(create_azure_table,table_endpoint)
S3method(delete_azure_table,azure_table)
S3method(delete_azure_table,table_endpoint)
S3method(list_azure_tables,table_endpoint)
S3method(print,azure_table)
S3method(print,batch_operation)
S3method(print,batch_operation_response)
S3method(sign_request,table_endpoint)
export(azure_table)
export(call_table_endpoint)
export(create_azure_table)
export(create_batch_operation)
export(delete_azure_table)
export(delete_table_entity)
export(do_batch_transaction)
export(get_table_entity)
export(import_table_entities)
export(insert_table_entity)
export(list_azure_tables)
export(list_table_entities)
export(table_endpoint)
export(update_table_entity)
import(AzureRMR)
import(AzureStor)

12
R/AzureCosmosR.R Normal file
@@ -0,0 +1,12 @@
#' @import AzureRMR
#' @import AzureStor
NULL
# assorted imports of friend functions
sign_sha256 <- get("sign_sha256", getNamespace("AzureStor"))
is_endpoint_url <- get("is_endpoint_url", getNamespace("AzureStor"))
delete_confirmed <- get("delete_confirmed", getNamespace("AzureStor"))
storage_error_message <- get("storage_error_message", getNamespace("AzureStor"))
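
The lines above fetch helper functions from the AzureStor namespace by name. As a rough illustration of the mechanism only, here is the same `get()`/`getNamespace()` pattern applied to base R's `utils` package, chosen because it is always installed. `head` happens to be exported, but the lookup works the same way for internal objects.

```r
# Look up a function by name inside a package namespace.
# getNamespace() loads the namespace if needed; get() then searches it,
# including objects that the package does not export.
head_fn <- get("head", getNamespace("utils"))

identical(head_fn, utils::head)
head_fn(letters, 3)
```

The trade-off is that internal functions carry no stability guarantee, so a later release of the other package could rename or remove them.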

27
R/sign_request.R Normal file
@@ -0,0 +1,27 @@
#' @export
sign_request.table_endpoint <- function(endpoint, verb, url, headers, api, ...)
{
make_sig <- function(key, verb, acct_name, resource, headers)
{
names(headers) <- tolower(names(headers))
sigstr <- paste(verb,
as.character(headers[["content-md5"]]),
as.character(headers[["content-type"]]),
as.character(headers[["date"]]),
resource, sep = "\n")
sigstr <- sub("\n$", "", sigstr)
paste0("SharedKey ", acct_name, ":", sign_sha256(sigstr, key))
}
acct_name <- sub("\\..+$", "", url$host)
resource <- paste0("/", acct_name, "/", url$path)
resource <- gsub("//", "/", resource)
# only add a date header if the caller has not supplied one under either spelling
if (is.null(headers$date) && is.null(headers$Date))
headers$date <- httr::http_date(Sys.time())
if (is.null(headers$`x-ms-version`))
headers$`x-ms-version` <- api
sig <- make_sig(endpoint$key, verb, acct_name, resource, headers)
utils::modifyList(headers, list(Host=url$host, Authorization=sig))
}
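
To make the signing step easier to follow, here is a minimal sketch of how the string-to-sign is assembled before it is hashed. The helper name `make_string_to_sign`, the account name and the header values are hypothetical. The HMAC step itself, done by AzureStor's internal `sign_sha256`, is not reproduced here.

```r
# Hypothetical stand-alone helper mirroring the string-to-sign logic above.
make_string_to_sign <- function(verb, headers, acct_name, path)
{
    names(headers) <- tolower(names(headers))
    # canonicalised resource: /{account}/{path}, with doubled slashes collapsed
    resource <- gsub("//", "/", paste0("/", acct_name, "/", path))
    # a missing header yields NULL; as.character(NULL) is character(0),
    # which paste() treats as an empty string
    paste(verb,
          as.character(headers[["content-md5"]]),
          as.character(headers[["content-type"]]),
          as.character(headers[["date"]]),
          resource, sep="\n")
}

hdrs <- list(`Content-Type`="application/json", Date="Thu, 22 Oct 2020 00:58:47 GMT")
sigstr <- make_string_to_sign("GET", hdrs, "myaccount", "Tables")
cat(sigstr, "\n")
```

Because no `Content-MD5` header is supplied in this example, the second line of the string is empty, which is what the service expects for an absent header.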

118
R/storage_tables.R Normal file
@@ -0,0 +1,118 @@
#' Operations with Azure tables
#'
#' @param endpoint An object of class `table_endpoint`.
#' @param name The name of a table in a storage account.
#' @param confirm For deleting a table, whether to ask for confirmation.
#' @param ... Other arguments passed to lower-level functions.
#' @rdname azure_table
#' @details
#' These methods are for accessing and managing tables within a storage account.
#' @seealso
#' [table_endpoint], [table_entity]
#' @export
azure_table <- function(endpoint, ...)
{
UseMethod("azure_table")
}
#' @rdname azure_table
#' @export
azure_table.table_endpoint <- function(endpoint, name, ...)
{
structure(list(endpoint=endpoint, name=name), class="azure_table")
}
#' @rdname azure_table
#' @export
list_azure_tables <- function(endpoint, ...)
{
UseMethod("list_azure_tables")
}
#' @rdname azure_table
#' @export
list_azure_tables.table_endpoint <- function(endpoint, ...)
{
opts <- list()
val <- list()
repeat
{
res <- call_table_endpoint(endpoint, "Tables", options=opts, http_status_handler="pass")
httr::stop_for_status(res, storage_error_message(res))
heads <- httr::headers(res)
res <- httr::content(res)
val <- c(val, res$value)
if(is.null(heads$`x-ms-continuation-NextTableName`))
break
opts$NextTableName <- heads$`x-ms-continuation-NextTableName`
}
named_list(lapply(val, function(x) azure_table(endpoint, x$TableName)))
}
#' @rdname azure_table
#' @export
create_azure_table <- function(endpoint, ...)
{
UseMethod("create_azure_table")
}
#' @rdname azure_table
#' @export
create_azure_table.table_endpoint <- function(endpoint, name, ...)
{
res <- call_table_endpoint(endpoint, "Tables", body=list(TableName=name), ..., http_verb="POST")
azure_table(endpoint, res$TableName)
}
#' @rdname azure_table
#' @export
delete_azure_table <- function(endpoint, ...)
{
UseMethod("delete_azure_table")
}
#' @rdname azure_table
#' @export
delete_azure_table.table_endpoint <- function(endpoint, name, confirm=TRUE, ...)
{
if(!delete_confirmed(confirm, name, "table"))
return(invisible(NULL))
path <- sprintf("Tables('%s')", name)
invisible(call_table_endpoint(endpoint, path, http_verb="DELETE"))
}
#' @rdname azure_table
#' @export
delete_azure_table.azure_table <- function(endpoint, ...)
{
delete_azure_table(endpoint$endpoint, endpoint$name, ...)
}
#' @export
print.azure_table <- function(x, ...)
{
cat("Azure table '", x$name, "'\n",
sep = "")
url <- httr::parse_url(x$endpoint$url)
url$path <- x$name
cat(sprintf("URL: %s\n", httr::build_url(url)))
if (!is_empty(x$endpoint$key))
cat("Access key: <hidden>\n")
else cat("Access key: <none supplied>\n")
if (!is_empty(x$endpoint$token)) {
cat("Azure Active Directory access token:\n")
print(x$endpoint$token)
}
else cat("Azure Active Directory access token: <none supplied>\n")
if (!is_empty(x$endpoint$sas))
cat("Account shared access signature: <hidden>\n")
else cat("Account shared access signature: <none supplied>\n")
cat(sprintf("Storage API version: %s\n", x$endpoint$api_version))
invisible(x)
}
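
`list_azure_tables` keeps requesting pages until the service stops returning a continuation header. The loop can be sketched without any network access as follows. `fake_pages`, `fetch_page` and `list_all` are hypothetical stand-ins for the service and the package code, and the table names are made up.

```r
# Hypothetical stand-in for the service: each page holds some table names plus
# the continuation token that the real API sends in a response header.
fake_pages <- list(
    start=list(value=c("orders", "customers"), next_token="page2"),
    page2=list(value=c("invoices"), next_token=NULL)
)
fetch_page <- function(token=NULL)
{
    fake_pages[[if(is.null(token)) "start" else token]]
}

list_all <- function()
{
    token <- NULL
    val <- character(0)
    repeat
    {
        page <- fetch_page(token)
        val <- c(val, page$value)
        # no continuation token means this was the last page
        if(is.null(page$next_token))
            break
        token <- page$next_token
    }
    val
}

list_all()
```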

197
R/table_batch_request.R Normal file
@@ -0,0 +1,197 @@
#' Batch transactions for table storage
#'
#' @param endpoint A table storage endpoint, of class `table_endpoint`.
#' @param path The path component of the operation.
#' @param options A named list giving the query parameters for the operation.
#' @param headers A named list giving any additional HTTP headers to send to the host. AzureCosmosR will handle authentication details, so you don't have to specify these here.
#' @param body The request body for a PUT/POST/PATCH operation.
#' @param metadata The level of ODATA metadata to include in the response.
#' @param http_verb The HTTP verb (method) for the operation.
#' @param operations For `do_batch_transaction`, a list of individual operations to be batched up.
#' @param batch_status_handler For `do_batch_transaction`, what to do if one or more of the batch operations fails. The default is to signal a warning and return a list of response objects, from which the details of the failure(s) can be determined. Set this to "pass" to ignore the failure.
#'
#' @details
#' Table storage supports batch transactions on entities that are in the same table and belong to the same partition group. Batch transactions are also known as _entity group transactions_.
#'
#' You can use `create_batch_operation` to produce an object corresponding to a single table storage operation, such as inserting, deleting or updating an entity. Multiple such objects can then be passed to `do_batch_transaction`, which will carry them out as a single atomic transaction.
#'
#' Note that batch transactions are subject to some limitations imposed by the REST API:
#' - All entities subject to operations as part of the transaction must have the same `PartitionKey` value.
#' - An entity can appear only once in the transaction, and only one operation may be performed against it.
#' - The transaction can include at most 100 entities, and its total payload may be no more than 4 MB in size.
#'
#' @return
#' `create_batch_operation` returns an object of class `batch_operation`.
#'
#' `do_batch_transaction` returns a list of objects of class `batch_operation_response`, representing the results of each individual operation. Each object contains elements named `status`, `headers` and `body` containing the respective parts of the response. Note that the number of returned objects may be smaller than the number of operations in the batch, if the transaction failed.
#' @seealso
#' [import_table_entities], which uses (multiple) batch transactions under the hood
#'
#' [Performing entity group transactions](https://docs.microsoft.com/en-us/rest/api/storageservices/performing-entity-group-transactions)
#' @rdname table_batch
#' @export
create_batch_operation <- function(endpoint, path, options=list(), headers=list(), body=NULL,
metadata=c("none", "minimal", "full"), http_verb=c("GET", "PUT", "POST", "PATCH", "DELETE", "HEAD"))
{
accept <- if(!is.null(metadata))
{
metadata <- match.arg(metadata)
switch(metadata,
"none"="application/json;odata=nometadata",
"minimal"="application/json;odata=minimalmetadata",
"full"="application/json;odata=fullmetadata")
}
else NULL
obj <- list()
obj$endpoint <- endpoint
obj$path <- path
obj$options <- options
obj$headers <- utils::modifyList(headers, list(Accept=accept, DataServiceVersion="3.0;NetFx"))
obj$method <- match.arg(http_verb)
obj$body <- body
structure(obj, class="batch_operation")
}
serialize_batch_operation <- function(object)
{
UseMethod("serialize_batch_operation")
}
serialize_batch_operation.batch_operation <- function(object)
{
url <- httr::parse_url(object$endpoint$url)
url$path <- object$path
url$query <- object$options
preamble <- c(
"Content-Type: application/http",
"Content-Transfer-Encoding: binary",
"",
paste(object$method, httr::build_url(url), "HTTP/1.1"),
paste0(names(object$headers), ": ", object$headers),
if(!is.null(object$body)) "Content-Type: application/json"
)
if(is.null(object$body))
preamble
else if(!is.character(object$body))
{
body <- jsonlite::toJSON(object$body, auto_unbox=TRUE, null="null")
# special-case treatment for 1-row dataframes
if(is.data.frame(object$body) && nrow(object$body) == 1)
body <- substr(body, 2, nchar(body) - 1)
c(preamble, "", body)
}
else c(preamble, "", object$body)
}
#' @rdname table_batch
#' @export
do_batch_transaction <- function(endpoint, operations, batch_status_handler=c("warn", "stop", "message", "pass"))
{
# batch REST API only supports 1 changeset per batch, and is unlikely to change
batch_bound <- paste0("batch_", uuid::UUIDgenerate())
changeset_bound <- paste0("changeset_", uuid::UUIDgenerate())
headers <- list(`Content-Type`=paste0("multipart/mixed; boundary=", batch_bound))
batch_preamble <- c(
paste0("--", batch_bound),
paste0("Content-Type: multipart/mixed; boundary=", changeset_bound),
""
)
batch_postscript <- c(
"",
paste0("--", changeset_bound, "--"),
paste0("--", batch_bound, "--")
)
serialized <- lapply(operations, function(op) c(paste0("--", changeset_bound), serialize_batch_operation(op)))
body <- paste0(c(batch_preamble, unlist(serialized), batch_postscript), collapse="\n")
if(nchar(body, type="bytes") > 4194304)
stop("Batch request too large, must be 4MB or less")
res <- call_table_endpoint(endpoint, "$batch", headers=headers, body=body, encode="raw",
http_verb="POST")
process_batch_response(res, match.arg(batch_status_handler))
}
process_batch_response <- function(response, batch_status_handler)
{
# assume response (including body) is always text
response <- rawToChar(response)
lines <- strsplit(response, "\r?\n\r?")[[1]]
batch_bound <- lines[1]
changeset_bound <- sub("^.+boundary=(.+)$", "\\1", lines[2])
n <- length(lines)
# assume only 1 changeset
batch_end <- grepl(batch_bound, lines[n])
if(!any(batch_end))
stop("Invalid batch response, batch boundary not found", call.=FALSE)
changeset_end <- grepl(changeset_bound, lines[n-1])
if(!any(changeset_end))
stop("Invalid batch response, changeset boundary not found", call.=FALSE)
lines <- lines[3:(n-3)]
op_bounds <- grep(changeset_bound, lines)
op_responses <- Map(
function(start, end) process_operation_response(lines[seq(start, end)], batch_status_handler),
op_bounds + 1,
c(op_bounds[-1], length(lines))
)
op_responses
}
process_operation_response <- function(response, handler)
{
blanks <- which(response == "")
if(length(blanks) < 2)
stop("Invalid operation response", call.=FALSE)
headers <- response[seq(blanks[1]+1, blanks[2]-1)] # skip over http stuff
status <- as.numeric(sub("^.+ (\\d{3}) .+$", "\\1", headers[1]))
headers <- strsplit(headers[-1], ": ")
names(headers) <- sapply(headers, `[[`, 1)
headers <- sapply(headers, `[[`, 2, simplify=FALSE)
class(headers) <- c("insensitive", "list")
if(status >= 300)
{
if(handler == "stop")
stop(httr::http_condition(status, "error"))
else if(handler == "warn")
warning(httr::http_condition(status, "warning"))
else if(handler == "message")
message(httr::http_condition(status, "message"))
}
body <- if(!(status %in% c(204, 205)) && blanks[2] < length(response))
response[seq(blanks[2]+1, length(response))]
else NULL
obj <- list(status=status, headers=headers, body=body)
class(obj) <- "batch_operation_response"
obj
}
#' @export
print.batch_operation <- function(x, ...)
{
cat("<Table storage batch operation>\n")
invisible(x)
}
#' @export
print.batch_operation_response <- function(x, ...)
{
cat("<Table storage batch operation response>\n")
invisible(x)
}
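
The multipart layout that `do_batch_transaction` sends can be hard to picture from the code alone. The sketch below assembles the same structure with fixed boundary strings and placeholder operation text so that the output is reproducible. `build_batch_body` is a hypothetical helper, and the package itself generates its boundaries with `uuid::UUIDgenerate()`.

```r
# Sketch only: one batch containing a single changeset, as in the code above.
build_batch_body <- function(ops, batch_bound, changeset_bound)
{
    preamble <- c(
        paste0("--", batch_bound),
        paste0("Content-Type: multipart/mixed; boundary=", changeset_bound),
        ""
    )
    # each operation is introduced by the changeset boundary
    parts <- lapply(ops, function(op) c(paste0("--", changeset_bound), op))
    postscript <- c(
        "",
        paste0("--", changeset_bound, "--"),
        paste0("--", batch_bound, "--")
    )
    paste0(c(preamble, unlist(parts), postscript), collapse="\n")
}

body <- build_batch_body(
    ops=list("<operation 1>", "<operation 2>"),
    batch_bound="batch_abc",
    changeset_bound="changeset_xyz"
)
cat(body, "\n")

# the 4 MB payload limit is a byte limit, so measure bytes rather than characters
nchar(body, type="bytes") <= 4194304
```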

66
R/table_endpoint.R Normal file
@@ -0,0 +1,66 @@
#' Table storage endpoint
#'
#' Table storage endpoint object, and method to call it.
#'
#' @param endpoint For `table_endpoint`, the URL of the table service endpoint. This will be of the form `https://{account-name}.table.core.windows.net` if the service is provided by a storage account in the Azure public cloud, while for a CosmosDB database, it will be of the form `https://{account-name}.table.cosmos.azure.com:443`. For `call_table_endpoint`, an object of class `table_endpoint`.
#' @param key The access key for the storage account.
#' @param token An Azure Active Directory (AAD) authentication token. Not used for table storage.
#' @param sas A shared access signature (SAS) for the account.
#' @param api_version The storage API version to use when interacting with the host. Defaults to "2019-07-07".
#' @param path For `call_table_endpoint`, the path component of the endpoint call.
#' @param options For `call_table_endpoint`, a named list giving the query parameters for the operation.
#' @param headers For `call_table_endpoint`, a named list giving any additional HTTP headers to send to the host. AzureCosmosR will handle authentication details, so you don't have to specify these here.
#' @param body For `call_table_endpoint`, the request body for a PUT/POST/PATCH call.
#' @param metadata For `call_table_endpoint`, the level of ODATA metadata to include in the response.
#' @param ... For `call_table_endpoint`, further arguments passed to `AzureStor::call_storage_endpoint` and `httr::VERB`.
#'
#' @return
#' An object of class `table_endpoint`, inheriting from `storage_endpoint`. This is the analogue of the `blob_endpoint`, `file_endpoint` and `adls_endpoint` classes provided by the AzureStor package.
#'
#' @seealso
#' [azure_table], [table_entity]
#'
#' [Table service REST API reference](https://docs.microsoft.com/en-us/rest/api/storageservices/table-service-rest-api)
#' @rdname table_endpoint
#' @export
table_endpoint <- function(endpoint, key=NULL, token=NULL, sas=NULL,
api_version=getOption("azure_storage_api_version"))
{
if(!is_endpoint_url(endpoint, "table"))
warning("Not a recognised table endpoint", call.=FALSE)
if(!is.null(token))
{
warning("Table storage does not use Azure Active Directory authentication")
token <- NULL
}
obj <- list(url=endpoint, key=key, token=token, sas=sas, api_version=api_version)
class(obj) <- c("table_endpoint", "storage_endpoint")
obj
}
#' @rdname table_endpoint
#' @export
call_table_endpoint <- function(endpoint, path, options=list(), headers=list(), body=NULL, ...,
metadata=c("none", "minimal", "full"))
{
accept <- if(!is.null(metadata))
{
metadata <- match.arg(metadata)
switch(metadata,
"none"="application/json;odata=nometadata",
"minimal"="application/json;odata=minimalmetadata",
"full"="application/json;odata=fullmetadata")
}
else NULL
headers <- utils::modifyList(headers, list(Accept=accept, DataServiceVersion="3.0;NetFx"))
if(is.list(body))
{
body <- jsonlite::toJSON(body, auto_unbox=TRUE, null="null")
headers$`Content-Length` <- nchar(body, type="bytes")
headers$`Content-Type` <- "application/json"
}
call_storage_endpoint(endpoint, path=path, options=options, body=body, headers=headers, ...)
}
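
Two details of `call_table_endpoint` are worth illustrating: how the `metadata` argument maps to an `Accept` header, and why a `Content-Length` value should count bytes. The following is a small sketch under those assumptions. `odata_accept` is a hypothetical helper and the JSON body is made up.

```r
# Hypothetical helper mirroring the metadata-to-Accept mapping above.
odata_accept <- function(metadata=c("none", "minimal", "full"))
{
    metadata <- match.arg(metadata)
    switch(metadata,
        none="application/json;odata=nometadata",
        minimal="application/json;odata=minimalmetadata",
        full="application/json;odata=fullmetadata")
}

odata_accept()            # default is the first choice, "none"
odata_accept("minimal")

# Content-Length counts bytes, which differs from the character count
# as soon as the body contains non-ASCII text.
body <- enc2utf8("{\"name\":\"caf\u00e9\"}")
nchar(body)                 # number of characters
nchar(body, type="bytes")   # number of bytes in UTF-8
```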

183
R/table_entity.R Normal file
@@ -0,0 +1,183 @@
#' Operations on table entities (rows)
#'
#' @param table A table object, of class `azure_table`.
#' @param entity For `insert_table_entity` and `update_table_entity`, a named list giving the properties (columns) of the entity. See 'Details' below.
#' @param data For `import_table_entities`, a data frame. See 'Details' below.
#' @param row_key,partition_key For `get_table_entity`, `update_table_entity` and `delete_table_entity`, the row and partition key values that identify the entity to get, update or delete. For `import_table_entities`, optionally the _columns_ in the imported data to treat as the row and partition keys. These will be renamed to `RowKey` and `PartitionKey` respectively.
#' @param etag For `update_table_entity` and `delete_table_entity`, an optional Etag value. If this is supplied, the update or delete operation will proceed only if the target entity's Etag matches this value. This ensures that an entity is only updated/deleted if it has not been modified since it was last retrieved.
#' @param filter,select For `list_table_entities`, optional row filter and column select expressions to subset the result with. If omitted, `list_table_entities` will return all entities in the table.
#' @param as_data_frame For `list_table_entities`, whether to return the results as a data frame, rather than a list of table rows.
#' @param batch_status_handler For `import_table_entities`, what to do if one or more of the batch operations fails. The default is to signal a warning and return a list of response objects, from which the details of the failure(s) can be determined. Set this to "pass" to ignore the failure.
#'
#' @details
#' These functions operate on rows of a table, also known as _entities_. `insert_table_entity`, `get_table_entity`, `update_table_entity` and `delete_table_entity` operate on an individual row. `import_table_entities` bulk-inserts multiple rows of data into the table, using batch transactions. `list_table_entities` queries the table and returns multiple rows based on the `filter` and `select` arguments.
#'
#' Table storage imposes the following requirements for properties (columns) of an entity:
#' - There must be properties named `RowKey` and `PartitionKey`, which together form the entity's unique identifier.
#' - The property `Timestamp` cannot be used (strictly speaking, it is reserved by the system).
#' - There can be at most 255 properties per entity, although different entities can have different properties.
#' - Table properties must be atomic (ie, they cannot be nested lists).
#'
#' For `insert_table_entity`, `update_table_entity` and `import_table_entities`, you can also specify JSON text representing the data to insert/update/import, instead of a list or data frame.
#' @return
#' `insert_table_entity` and `update_table_entity` return the Etag of the inserted/updated entity, invisibly.
#'
#' `get_table_entity` returns a named list of properties for the given entity.
#'
#' `list_table_entities` returns a data frame if `as_data_frame=TRUE`, and a list of entities (rows) otherwise.
#'
#' `import_table_entities` invisibly returns a named list, with one component for each value of the `PartitionKey` column. Each component contains the results of the individual operations to insert each row into the table.
#'
#' @seealso
#' [azure_table], [do_batch_transaction]
#'
#' [Understanding the table service data model](https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-the-table-service-data-model)
#' @aliases table_entity
#' @rdname table_entity
#' @export
insert_table_entity <- function(table, entity)
{
if(is.character(entity) && jsonlite::validate(entity))
entity <- jsonlite::fromJSON(entity, simplifyDataFrame=FALSE)
else if(is.data.frame(entity))
{
if(nrow(entity) == 1) # special-case treatment for 1-row dataframes
entity <- unclass(entity)
else stop("Can only insert one entity at a time; use import_table_entities() to insert multiple entities",
call.=FALSE)
}
check_column_names(entity)
headers <- list(Prefer="return-no-content")
res <- call_table_endpoint(table$endpoint, table$name, body=entity, headers=headers, http_verb="POST",
http_status_handler="pass")
httr::stop_for_status(res, storage_error_message(res))
invisible(httr::headers(res)$ETag)
}
#' @rdname table_entity
#' @export
update_table_entity <- function(table, entity, row_key=entity$RowKey, partition_key=entity$PartitionKey, etag=NULL)
{
if(is.character(entity) && jsonlite::validate(entity))
entity <- jsonlite::fromJSON(entity, simplifyDataFrame=FALSE)
else if(is.data.frame(entity))
{
if(nrow(entity) == 1) # special-case treatment for 1-row dataframes
entity <- unclass(entity)
else stop("Can only update one entity at a time", call.=FALSE)
}
check_column_names(entity)
headers <- if(!is.null(etag))
list(`If-Match`=etag)
else list()
path <- sprintf("%s(PartitionKey='%s',RowKey='%s')", table$name, partition_key, row_key)
# address the specific entity; the Update Entity REST operation is a PUT on the entity path
res <- call_table_endpoint(table$endpoint, path, body=entity, headers=headers, http_verb="PUT",
http_status_handler="pass")
httr::stop_for_status(res, storage_error_message(res))
invisible(httr::headers(res)$ETag)
}
#' @rdname table_entity
#' @export
delete_table_entity <- function(table, row_key, partition_key, etag=NULL)
{
path <- sprintf("%s(PartitionKey='%s',RowKey='%s')", table$name, partition_key, row_key)
if(is.null(etag))
etag <- "*"
headers <- list(`If-Match`=etag)
invisible(call_table_endpoint(table$endpoint, path, headers=headers, http_verb="DELETE"))
}
#' @rdname table_entity
#' @export
list_table_entities <- function(table, filter=NULL, select=NULL, as_data_frame=TRUE)
{
path <- sprintf("%s()", table$name)
opts <- list(
`$filter`=filter,
`$select`=if(!is.null(select)) paste0(select, collapse=",")
)
val <- list()
repeat
{
res <- call_table_endpoint(table$endpoint, path, options=opts, http_status_handler="pass")
httr::stop_for_status(res, storage_error_message(res))
heads <- httr::headers(res)
res <- httr::content(res)
val <- c(val, res$value)
if(is.null(heads$`x-ms-continuation-NextPartitionKey`))
break
opts$NextPartitionKey <- heads$`x-ms-continuation-NextPartitionKey`
opts$NextRowKey <- heads$`x-ms-continuation-NextRowKey`
}
# table storage allows columns to vary by row, so cannot use base::rbind
if(as_data_frame)
do.call(vctrs::vec_rbind, val)
else val
}
#' @rdname table_entity
#' @export
get_table_entity <- function(table, row_key, partition_key, select=NULL)
{
path <- sprintf("%s(PartitionKey='%s',RowKey='%s')", table$name, partition_key, row_key)
opts <- if(!is.null(select))
list(`$select`=paste0(select, collapse=","))
else list()
call_table_endpoint(table$endpoint, path, options=opts)
}
#' @rdname table_entity
#' @export
import_table_entities <- function(table, data, row_key=NULL, partition_key=NULL,
batch_status_handler=c("warn", "stop", "message", "pass"))
{
if(is.character(data) && jsonlite::validate(data))
data <- jsonlite::fromJSON(data, simplifyDataFrame=TRUE)
if(!is.null(partition_key))
names(data)[names(data) == partition_key] <- "PartitionKey"
if(!is.null(row_key))
names(data)[names(data) == row_key] <- "RowKey"
check_column_names(data)
endpoint <- table$endpoint
path <- table$name
headers <- list(Prefer="return-no-content")
batch_status_handler <- match.arg(batch_status_handler)
res <- lapply(split(data, data$PartitionKey), function(dfpart)
{
n <- nrow(dfpart)
nchunks <- n %/% 100 + (n %% 100 > 0)
reschunks <- lapply(seq_len(nchunks), function(chunk)
{
rows <- seq(from=(chunk-1)*100 + 1, to=min(chunk*100, n))
dfchunk <- dfpart[rows, ]
ops <- lapply(seq_len(nrow(dfchunk)), function(i)
create_batch_operation(endpoint, path, body=dfchunk[i, ], headers=headers, http_verb="POST"))
do_batch_transaction(endpoint, ops, batch_status_handler)
})
unlist(reschunks, recursive=FALSE)
})
invisible(res)
}
check_column_names <- function(data)
{
if(!("PartitionKey" %in% names(data)) || !("RowKey" %in% names(data)))
stop("Data must contain columns named 'PartitionKey' and 'RowKey'", call.=FALSE)
if(!(is.character(data$PartitionKey) || is.factor(data$PartitionKey)) ||
!(is.character(data$RowKey) || is.factor(data$RowKey)))
stop("RowKey and PartitionKey columns must be character or factor", call.=FALSE)
if("Timestamp" %in% names(data))
stop("'Timestamp' column is reserved for system use", call.=FALSE)
}
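
`import_table_entities` groups rows by `PartitionKey` and then sends each group in chunks of at most 100 rows, since a batch transaction is limited to one partition and 100 entities. The grouping can be sketched on its own as follows. `chunk_by_partition` is a hypothetical helper and the data frame is made up. It uses a different but equivalent way of computing the chunks.

```r
# Hypothetical helper: split a data frame into per-partition chunks of at most
# `size` rows, the grouping that batch transactions require.
chunk_by_partition <- function(data, size=100)
{
    lapply(split(data, data$PartitionKey), function(dfpart)
    {
        n <- nrow(dfpart)
        # chunk index for each row: 1,1,...,2,2,...
        idx <- (seq_len(n) - 1) %/% size + 1
        unname(split(dfpart, idx))
    })
}

df <- data.frame(
    PartitionKey=rep(c("a", "b"), times=c(250, 30)),
    RowKey=as.character(seq_len(280)),
    stringsAsFactors=FALSE
)
chunks <- chunk_by_partition(df)
sapply(chunks$a, nrow)   # three chunks for the 250-row partition
sapply(chunks$b, nrow)   # one chunk for the 30-row partition
```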

@@ -1,33 +0,0 @@
# Project
> This repo has been populated by an initial template to help get you started. Please
> make sure to update the content to build a great experience for community-building.
As the maintainer of this project, please make a few updates:
- Improving this README.MD file to provide a great experience
- Updating SUPPORT.MD with content about this project's support experience
- Understanding the security reporting process in SECURITY.MD
- Remove this section from the README
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos are subject to those third-party's policies.

@@ -1,25 +0,0 @@
# TODO: The maintainer of this repo has not yet edited this file
**REPO OWNER**: Do you want Customer Service & Support (CSS) support for this product/project?
- **No CSS support:** Fill out this template with information about how to file issues and get help.
- **Yes CSS support:** Fill out an intake form at [aka.ms/spot](https://aka.ms/spot). CSS will work with/help you to determine next steps. More details also available at [aka.ms/onboardsupport](https://aka.ms/onboardsupport).
- **Not sure?** Fill out a SPOT intake as though the answer were "Yes". CSS will help you decide.
*Then remove this first heading from this SUPPORT.MD file before publishing your repo.*
# Support
## How to file issues and get help
This project uses GitHub Issues to track bugs and feature requests. Please search the existing
issues before filing new issues to avoid duplicates. For new issues, file your bug or
feature request as a new Issue.
For help and questions about using this project, please **REPO MAINTAINER: INSERT INSTRUCTIONS HERE
FOR HOW TO ENGAGE REPO OWNERS OR COMMUNITY FOR HELP. COULD BE A STACK OVERFLOW TAG OR OTHER
CHANNEL. WHERE WILL YOU HELP PEOPLE?**.
## Microsoft Support Policy
Support for this **PROJECT or PRODUCT** is limited to the resources listed above.

50
man/azure_table.Rd Normal file

@ -0,0 +1,50 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/storage_tables.R
\name{azure_table}
\alias{azure_table}
\alias{azure_table.table_endpoint}
\alias{list_azure_tables}
\alias{list_azure_tables.table_endpoint}
\alias{create_azure_table}
\alias{create_azure_table.table_endpoint}
\alias{delete_azure_table}
\alias{delete_azure_table.table_endpoint}
\alias{delete_azure_table.azure_table}
\title{Operations with Azure tables}
\usage{
azure_table(endpoint, ...)
\method{azure_table}{table_endpoint}(endpoint, name, ...)
list_azure_tables(endpoint, ...)
\method{list_azure_tables}{table_endpoint}(endpoint, ...)
create_azure_table(endpoint, ...)
\method{create_azure_table}{table_endpoint}(endpoint, name, ...)
delete_azure_table(endpoint, ...)
\method{delete_azure_table}{table_endpoint}(endpoint, name, confirm = TRUE, ...)
\method{delete_azure_table}{azure_table}(endpoint, ...)
}
\arguments{
\item{endpoint}{An object of class \code{table_endpoint}.}
\item{...}{Other arguments passed to lower-level functions.}
\item{name}{The name of a table in a storage account.}
\item{confirm}{For deleting a table, whether to ask for confirmation.}
}
\description{
Operations with Azure tables
}
\details{
These methods are for accessing and managing tables within a storage account.
}
\seealso{
\link{table_endpoint}, \link{table_entity}
}

67
man/table_batch.Rd Normal file

@ -0,0 +1,67 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/table_batch_request.R
\name{create_batch_operation}
\alias{create_batch_operation}
\alias{do_batch_transaction}
\title{Batch transactions for table storage}
\usage{
create_batch_operation(
endpoint,
path,
options = list(),
headers = list(),
body = NULL,
metadata = c("none", "minimal", "full"),
http_verb = c("GET", "PUT", "POST", "PATCH", "DELETE", "HEAD")
)
do_batch_transaction(
endpoint,
operations,
batch_status_handler = c("warn", "stop", "message", "pass")
)
}
\arguments{
\item{endpoint}{A table storage endpoint, of class \code{table_endpoint}.}
\item{path}{The path component of the operation.}
\item{options}{A named list giving the query parameters for the operation.}
\item{headers}{A named list giving any additional HTTP headers to send to the host. AzureCosmosR will handle authentication details, so you don't have to specify these here.}
\item{body}{The request body for a PUT/POST/PATCH operation.}
\item{metadata}{The level of ODATA metadata to include in the response.}
\item{http_verb}{The HTTP verb (method) for the operation.}
\item{operations}{For \code{do_batch_transaction}, a list of individual operations to be batched up.}
\item{batch_status_handler}{For \code{do_batch_transaction}, what to do if one or more of the batch operations fails. The default is to signal a warning and return a list of response objects, from which the details of the failure(s) can be determined. Set this to "pass" to ignore the failure.}
}
\value{
\code{create_batch_operation} returns an object of class \code{batch_operation}.
\code{do_batch_transaction} returns a list of objects of class \code{batch_operation_response}, representing the results of each individual operation. Each object contains elements named \code{status}, \code{headers} and \code{body} containing the respective parts of the response. Note that the number of returned objects may be smaller than the number of operations in the batch, if the transaction failed.
}
\description{
Batch transactions for table storage
}
\details{
Table storage supports batch transactions on entities that are in the same table and belong to the same partition group. Batch transactions are also known as \emph{entity group transactions}.
You can use \code{create_batch_operation} to produce an object corresponding to a single table storage operation, such as inserting, deleting or updating an entity. Multiple such objects can then be passed to \code{do_batch_transaction}, which will carry them out as a single atomic transaction.
Note that batch transactions are subject to some limitations imposed by the REST API:
\itemize{
\item All entities subject to operations as part of the transaction must have the same \code{PartitionKey} value.
\item An entity can appear only once in the transaction, and only one operation may be performed against it.
\item The transaction can include at most 100 entities, and its total payload may be no more than 4 MB in size.
}
}
\seealso{
\link{import_table_entities}, which uses (multiple) batch transactions under the hood
\href{https://docs.microsoft.com/en-us/rest/api/storageservices/performing-entity-group-transactions}{Performing entity group transactions}
}
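\examples{
# A minimal sketch, not taken from the package sources: one way to group
# the rows of a data frame into batches that respect the limits listed
# under Details (one PartitionKey per batch, at most 100 entities each).
# split_into_batches is a hypothetical helper, and the 4 MB payload limit
# is not checked here.
split_into_batches <- function(data, max_size = 100)
{
    # one group per PartitionKey value
    by_partition <- split(data, data$PartitionKey)
    # within each partition, cut the rows into chunks of at most max_size
    chunks <- lapply(by_partition, function(part)
    {
        chunk_id <- ceiling(seq_len(nrow(part)) / max_size)
        split(part, chunk_id)
    })
    # flatten one level, giving a single list of data frames
    unlist(chunks, recursive = FALSE)
}

rows <- data.frame(
    PartitionKey = rep(c("a", "b"), c(250, 30)),
    RowKey = as.character(1:280),
    value = seq(0.5, 140, by = 0.5),
    stringsAsFactors = FALSE
)
batches <- split_into_batches(rows)

# number of rows in each batch
sapply(batches, nrow)
}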

59
man/table_endpoint.Rd Normal file

@ -0,0 +1,59 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/table_endpoint.R
\name{table_endpoint}
\alias{table_endpoint}
\alias{call_table_endpoint}
\title{Table storage endpoint}
\usage{
table_endpoint(
endpoint,
key = NULL,
token = NULL,
sas = NULL,
api_version = getOption("azure_storage_api_version")
)
call_table_endpoint(
endpoint,
path,
options = list(),
headers = list(),
body = NULL,
...,
metadata = c("none", "minimal", "full")
)
}
\arguments{
\item{endpoint}{For \code{table_endpoint}, the URL of the table service endpoint. This will be of the form \verb{https://\{account-name\}.table.core.windows.net} if the service is provided by a storage account in the Azure public cloud, while for a CosmosDB database, it will be of the form \verb{https://\{account-name\}.table.cosmos.azure.com:443}. For \code{call_table_endpoint}, an object of class \code{table_endpoint}.}
\item{key}{The access key for the storage account.}
\item{token}{An Azure Active Directory (AAD) authentication token. Not used for table storage.}
\item{sas}{A shared access signature (SAS) for the account.}
\item{api_version}{The storage API version to use when interacting with the host. Defaults to "2019-07-07".}
\item{path}{For \code{call_table_endpoint}, the path component of the endpoint call.}
\item{options}{For \code{call_table_endpoint}, a named list giving the query parameters for the operation.}
\item{headers}{For \code{call_table_endpoint}, a named list giving any additional HTTP headers to send to the host. AzureCosmosR will handle authentication details, so you don't have to specify these here.}
\item{body}{For \code{call_table_endpoint}, the request body for a PUT/POST/PATCH call.}
\item{...}{For \code{call_table_endpoint}, further arguments passed to \code{AzureStor::call_storage_endpoint} and \code{httr::VERB}.}
\item{metadata}{For \code{call_table_endpoint}, the level of ODATA metadata to include in the response.}
}
\value{
An object of class \code{table_endpoint}, inheriting from \code{storage_endpoint}. This is the analogue of the \code{blob_endpoint}, \code{file_endpoint} and \code{adls_endpoint} classes provided by the AzureStor package.
}
\description{
Table storage endpoint object, and method to call it.
}
\seealso{
\link{azure_table}, \link{table_entity}
\href{https://docs.microsoft.com/en-us/rest/api/storageservices/table-service-rest-api}{Table service REST API reference}
}
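\examples{
# A minimal sketch, not taken from the package sources. The account names
# and the access key are placeholders.

# Endpoint URLs follow the two forms described for the endpoint argument
storage_account <- "mystorageacct"
cosmos_account <- "mycosmosacct"
storage_url <- paste0("https://", storage_account, ".table.core.windows.net")
cosmos_url <- paste0("https://", cosmos_account, ".table.cosmos.azure.com:443")

\dontrun{
# Requires a real account and access key
endp <- table_endpoint(storage_url, key = "access_key")

# Low-level call. "Tables" is the path of the REST API Query Tables
# operation; using it here is an assumption about what the service accepts.
call_table_endpoint(endp, "Tables")
}
}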

83
man/table_entity.Rd Normal file

@ -0,0 +1,83 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/table_entity.R
\name{insert_table_entity}
\alias{insert_table_entity}
\alias{table_entity}
\alias{update_table_entity}
\alias{delete_table_entity}
\alias{list_table_entities}
\alias{get_table_entity}
\alias{import_table_entities}
\title{Operations on table entities (rows)}
\usage{
insert_table_entity(table, entity)
update_table_entity(
table,
entity,
row_key = entity$RowKey,
partition_key = entity$PartitionKey,
etag = NULL
)
delete_table_entity(table, row_key, partition_key, etag = NULL)
list_table_entities(table, filter = NULL, select = NULL, as_data_frame = TRUE)
get_table_entity(table, row_key, partition_key, select = NULL)
import_table_entities(
table,
data,
row_key = NULL,
partition_key = NULL,
batch_status_handler = c("warn", "stop", "message", "pass")
)
}
\arguments{
\item{table}{A table object, of class \code{azure_table}.}
\item{entity}{For \code{insert_table_entity} and \code{update_table_entity}, a named list giving the properties (columns) of the entity. See 'Details' below.}
\item{row_key, partition_key}{For \code{get_table_entity}, \code{update_table_entity} and \code{delete_table_entity}, the row and partition key values that identify the entity to get, update or delete. For \code{import_table_entities}, optionally the \emph{columns} in the imported data to treat as the row and partition keys. These will be renamed to \code{RowKey} and \code{PartitionKey} respectively.}
\item{etag}{For \code{update_table_entity} and \code{delete_table_entity}, an optional Etag value. If this is supplied, the update or delete operation will proceed only if the target entity's Etag matches this value. This ensures that an entity is only updated/deleted if it has not been modified since it was last retrieved.}
\item{filter, select}{For \code{list_table_entities}, optional row filter and column select expressions to subset the result with. If omitted, \code{list_table_entities} will return all entities in the table.}
\item{as_data_frame}{For \code{list_table_entities}, whether to return the results as a data frame, rather than a list of table rows.}
\item{data}{For \code{import_table_entities}, a data frame. See 'Details' below.}
\item{batch_status_handler}{For \code{import_table_entities}, what to do if one or more of the batch operations fails. The default is to signal a warning and return a list of response objects, from which the details of the failure(s) can be determined. Set this to "pass" to ignore the failure.}
}
\value{
\code{insert_table_entity} and \code{update_table_entity} return the Etag of the inserted/updated entity, invisibly.
\code{get_table_entity} returns a named list of properties for the given entity.
\code{list_table_entities} returns a data frame if \code{as_data_frame=TRUE}, and a list of entities (rows) otherwise.
\code{import_table_entities} invisibly returns a named list, with one component for each value of the \code{PartitionKey} column. Each component contains the results of the individual operations to insert each row into the table.
}
\description{
Operations on table entities (rows)
}
\details{
These functions operate on rows of a table, also known as \emph{entities}. \code{insert_table_entity}, \code{get_table_entity}, \code{update_table_entity} and \code{delete_table_entity} operate on an individual row. \code{import_table_entities} bulk-inserts multiple rows of data into the table, using batch transactions. \code{list_table_entities} queries the table and returns multiple rows based on the \code{filter} and \code{select} arguments.
Table storage imposes the following requirements for properties (columns) of an entity:
\itemize{
\item There must be properties named \code{RowKey} and \code{PartitionKey}, which together form the entity's unique identifier.
\item The property \code{Timestamp} cannot be used (strictly speaking, it is reserved by the system).
\item There can be at most 255 properties per entity, although different entities can have different properties.
\item Table properties must be atomic (i.e., they cannot be nested lists).
}
For \code{insert_table_entity}, \code{update_table_entity} and \code{import_table_entities}, you can also specify JSON text representing the data to insert/update/import, instead of a list or data frame.
}
\seealso{
\link{azure_table}, \link{do_batch_transaction}
\href{https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-the-table-service-data-model}{Understanding the table service data model}
}
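\examples{
# A minimal sketch, not taken from the package sources. check_entity is a
# hypothetical helper that mirrors the property requirements listed under
# Details; the service itself remains the authority on what it accepts.
check_entity <- function(entity)
{
    props <- names(entity)
    problems <- character(0)
    if(!is.element("RowKey", props) || !is.element("PartitionKey", props))
        problems <- c(problems, "RowKey and PartitionKey are both required")
    if(is.element("Timestamp", props))
        problems <- c(problems, "Timestamp is reserved by the system")
    if(length(props) > 255)
        problems <- c(problems, "more than 255 properties")
    # each property should be a single atomic value, not a nested list
    is_scalar <- vapply(entity,
        function(x) is.atomic(x) && length(x) == 1, logical(1))
    if(!all(is_scalar))
        problems <- c(problems, "properties must be atomic values")
    problems
}

good <- list(PartitionKey = "a", RowKey = "001", value = 1.5)
bad <- list(RowKey = "002", Timestamp = "x", nested = list(1, 2))

check_entity(good)
check_entity(bad)

\dontrun{
# Requires a real account, access key and table; all names are placeholders
endp <- table_endpoint("https://mystorageacct.table.core.windows.net",
    key = "access_key")
tab <- azure_table(endp, "mytable")

insert_table_entity(tab, good)
get_table_entity(tab, row_key = "001", partition_key = "a")
list_table_entities(tab, filter = "PartitionKey eq 'a'")

good$value <- 2.5
update_table_entity(tab, good)
delete_table_entity(tab, row_key = "001", partition_key = "a")
}
}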