Initial Commit
Working version of sqlmlutils in Python and R

Supports:
- Stored Procedures
- Execute in SQL
- Package Management

Known Issues:
- Cannot execute Stored Procedures with Output Parameters
- No dependency resolution on uninstall in Python Package Management
This commit is contained in:
Parent
75b1ad3039
Commit
0dfd965952
Binary file not shown.
|
@ -0,0 +1,14 @@
|
|||
# Contributing
|
||||
|
||||
This project welcomes contributions and suggestions. Most contributions require you to agree to a
|
||||
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
|
||||
the rights to use your contribution. For details, visit https://cla.microsoft.com.
|
||||
|
||||
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
|
||||
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
|
||||
provided by the bot. You will only need to do this once across all repos using our CLA.
|
||||
|
||||
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
|
||||
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
|
||||
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
|
||||
|
38
LICENSE
|
@ -1,21 +1,25 @@
|
|||
MIT License
|
||||
------------------------------------------- START OF LICENSE -----------------------------------------
|
||||
sqlmlutils
|
||||
|
||||
Copyright (c) Microsoft Corporation. All rights reserved.
|
||||
MIT License
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
Copyright (c) Microsoft Corporation. All rights reserved.
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE
|
||||
----------------------------------------------- END OF LICENSE ------------------------------------------
|
||||
|
|
|
@ -0,0 +1,25 @@
|
|||
------------------------------------------- START OF LICENSE -----------------------------------------
|
||||
sqlmlutils
|
||||
|
||||
MIT License
|
||||
|
||||
Copyright (c) Microsoft Corporation. All rights reserved.
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE
|
||||
----------------------------------------------- END OF LICENSE ------------------------------------------
|
|
@ -0,0 +1,19 @@
|
|||
# file GENERATED by distutils, do NOT edit
|
||||
setup.py
|
||||
sqlmlutils\__init__.py
|
||||
sqlmlutils\connectioninfo.py
|
||||
sqlmlutils\sqlbuilder.py
|
||||
sqlmlutils\sqlpythonexecutor.py
|
||||
sqlmlutils\sqlqueryexecutor.py
|
||||
sqlmlutils\storedprocedure.py
|
||||
sqlmlutils/packagemanagement\__init__.py
|
||||
sqlmlutils/packagemanagement\dependencyresolver.py
|
||||
sqlmlutils/packagemanagement\download_script.py
|
||||
sqlmlutils/packagemanagement\messages.py
|
||||
sqlmlutils/packagemanagement\outputcapture.py
|
||||
sqlmlutils/packagemanagement\packagesqlbuilder.py
|
||||
sqlmlutils/packagemanagement\pipdownloader.py
|
||||
sqlmlutils/packagemanagement\pkgutils.py
|
||||
sqlmlutils/packagemanagement\scope.py
|
||||
sqlmlutils/packagemanagement\servermethods.py
|
||||
sqlmlutils/packagemanagement\sqlpackagemanager.py
|
|
@ -0,0 +1,220 @@
|
|||
# sqlmlutils
|
||||
|
||||
sqlmlutils is a Python package that helps execute Python code on a SQL Server machine. It is built to work with ML Services for SQL Server.
|
||||
|
||||
# Installation
|
||||
|
||||
Run
|
||||
```
|
||||
python.exe -m pip install dist/sqlmlutils-0.5.0.zip --upgrade
|
||||
```
|
||||
OR
|
||||
To build a new package file and install, run
|
||||
```
|
||||
.\buildandinstall.cmd
|
||||
```
|
||||
|
||||
Note: If you encounter errors installing the pymssql dependency and your client is a Windows machine, consider
|
||||
installing the .whl file at the below link (download the file for your Python version and run pip install):
|
||||
https://www.lfd.uci.edu/~gohlke/pythonlibs/#pymssql
|
||||
|
||||
# Getting started
|
||||
|
||||
Shown below are the important functions sqlmlutils provides:
|
||||
```python
|
||||
execute_function_in_sql # Execute a python function inside the SQL database
|
||||
execute_script_in_sql # Execute a python script inside the SQL database
|
||||
execute_sql_query # Execute a sql query in the database and return the resultant table
|
||||
|
||||
create_sproc_from_function # Create a stored procedure based on a Python function inside the SQL database
|
||||
create_sproc_from_script # Create a stored procedure based on a Python script inside the SQL database
|
||||
check_sproc # Check whether a stored procedure exists in the SQL database
|
||||
drop_sproc # Drop a stored procedure from the SQL database
|
||||
execute_sproc # Execute a stored procedure in the SQL database
|
||||
|
||||
install_package # Install a Python package on the SQL database
|
||||
remove_package # Remove a Python package from the SQL database
|
||||
list # Enumerate packages that are installed on the SQL database
|
||||
```
|
||||
|
||||
# Examples
|
||||
|
||||
### Execute in SQL
|
||||
##### Execute a python function in database
|
||||
|
||||
```python
|
||||
import sqlmlutils
|
||||
|
||||
def foo():
    return "bar"
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="master"))
|
||||
result = sqlpy.execute_function_in_sql(foo)
|
||||
assert result == "bar"
|
||||
```
|
||||
|
||||
##### Generate a scatter plot without the data leaving the machine
|
||||
|
||||
```python
|
||||
import sqlmlutils
|
||||
from PIL import Image
|
||||
|
||||
|
||||
def scatter_plot(input_df, x_col, y_col):
    import matplotlib.pyplot as plt
    import io

    title = x_col + " vs. " + y_col

    plt.scatter(input_df[x_col], input_df[y_col])
    plt.xlabel(x_col)
    plt.ylabel(y_col)
    plt.title(title)

    # Save scatter plot image as a png
    buf = io.BytesIO()
    plt.savefig(buf, format="png")
    buf.seek(0)

    # Returns the bytes of the png to the client
    return buf
|
||||
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB"))
|
||||
|
||||
sql_query = "select top 100 * from airline5000"
|
||||
plot_data = sqlpy.execute_function_in_sql(func=scatter_plot, input_data_query=sql_query,
                                          x_col="ArrDelay", y_col="CRSDepTime")
|
||||
im = Image.open(plot_data)
|
||||
im.show()
|
||||
```
|
||||
|
||||
##### Perform linear regression on data stored in SQL Server without the data leaving the machine
|
||||
|
||||
You can use the AirlineTestDB (supplied as a .bak file above) to run these examples.
|
||||
|
||||
```python
|
||||
import sqlmlutils
|
||||
|
||||
|
||||
def linear_regression(input_df, x_col, y_col):
    from sklearn import linear_model

    X = input_df[[x_col]]
    y = input_df[y_col]

    lr = linear_model.LinearRegression()
    lr.fit(X, y)

    return lr
|
||||
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB"))
|
||||
sql_query = "select top 1000 CRSDepTime, CRSArrTime from airline5000"
|
||||
regression_model = sqlpy.execute_function_in_sql(linear_regression, input_data_query=sql_query,
                                                 x_col="CRSDepTime", y_col="CRSArrTime")
|
||||
print(regression_model)
|
||||
print(regression_model.coef_)
|
||||
```
|
||||
|
||||
##### Execute a SQL Query from Python
|
||||
|
||||
```python
|
||||
import sqlmlutils
|
||||
import pytest
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB"))
|
||||
sql_query = "select top 10 * from airline5000"
|
||||
data_table = sqlpy.execute_sql_query(sql_query)
|
||||
assert len(data_table.columns) == 30
|
||||
assert len(data_table) == 10
|
||||
```
|
||||
|
||||
### Stored Procedure
|
||||
##### Create and call a T-SQL stored procedure based on a Python function
|
||||
|
||||
```python
|
||||
import sqlmlutils
|
||||
import pytest
|
||||
|
||||
def principal_components(input_table: str, output_table: str):
    import sqlalchemy
    from urllib import parse
    import pandas as pd
    from sklearn.decomposition import PCA

    # Internal ODBC connection string used by process executing inside SQL Server
    connection_string = "Driver=SQL Server;Server=localhost;Database=AirlineTestDB;Trusted_Connection=Yes;"
    engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect={}".format(parse.quote_plus(connection_string)))

    input_df = pd.read_sql("select top 200 ArrDelay, CRSDepTime from {}".format(input_table), engine).dropna()


    pca = PCA(n_components=2)
    components = pca.fit_transform(input_df)

    output_df = pd.DataFrame(components)
    output_df.to_sql(output_table, engine, if_exists="replace")
|
||||
|
||||
|
||||
connection = sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB")
|
||||
|
||||
input_table = "airline5000"
|
||||
output_table = "AirlineDemoPrincipalComponents"
|
||||
|
||||
sp_name = "SavePrincipalComponents"
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(connection)
|
||||
|
||||
if sqlpy.check_sproc(sp_name):
    sqlpy.drop_sproc(sp_name)
|
||||
|
||||
sqlpy.create_sproc_from_function(sp_name, principal_components)
|
||||
|
||||
# You can check the stored procedure exists in the db with this:
|
||||
assert sqlpy.check_sproc(sp_name)
|
||||
|
||||
sqlpy.execute_sproc(sp_name, input_table=input_table, output_table=output_table)
|
||||
|
||||
sqlpy.drop_sproc(sp_name)
|
||||
assert not sqlpy.check_sproc(sp_name)
|
||||
```
|
||||
|
||||
### Package Management
|
||||
##### Install and remove packages from SQL Server
|
||||
|
||||
```python
|
||||
import sqlmlutils
|
||||
|
||||
connection = sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB")
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(connection)
|
||||
pkgmanager = sqlmlutils.SQLPackageManager(connection)
|
||||
|
||||
def use_tensorflow():
    import tensorflow as tf
    node1 = tf.constant(3.0, tf.float32)
    return str(node1.dtype)
|
||||
|
||||
pkgmanager.install("tensorflow")
|
||||
val = sqlpy.execute_function_in_sql(use_tensorflow)
|
||||
|
||||
pkgmanager.uninstall("tensorflow")
|
||||
```
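
When a package is installed, its dependencies are only needed if the server does not already have a version that satisfies each requirement. The sketch below illustrates that check in isolation. It is a simplified approximation, not the code in this commit: `meets_spec` and `version_key` are hypothetical names, the package list is made-up sample data, and versions are compared as tuples of integers rather than with `distutils.version.LooseVersion`, which newer Python releases no longer ship.

```python
import operator

# Maps pip-style comparison operators to functions from the operator module.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
             "==": operator.eq, "<=": operator.le, "!=": operator.ne}


def version_key(version):
    """Simplified version key; assumes dotted integers such as '1.14.5'."""
    return tuple(int(part) for part in version.split("."))


def meets_spec(installed_packages, name, spec):
    """Return True if `name` is installed at a version satisfying (op, version)."""
    op_str, required_version = spec
    for installed_name, installed_version in installed_packages:
        if installed_name == name:
            return OPERATORS[op_str](version_key(installed_version),
                                     version_key(required_version))
    # A package that is not installed cannot satisfy any spec.
    return False


# Made-up (name, version) tuples standing in for the server's package list.
server_packages = [("numpy", "1.12.1"), ("pandas", "0.19.2")]
print(meets_spec(server_packages, "numpy", (">=", "1.13.3")))
print(meets_spec(server_packages, "pandas", ("<", "1.0")))
print(meets_spec(server_packages, "six", (">=", "1.10.0")))
```

Any requirement for which such a check fails would need to be installed alongside the target package. Pre-release or non-numeric version strings are outside what this simplified key handles.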
|
||||
|
||||
|
||||
# Notes for Developers
|
||||
|
||||
### Running the tests
|
||||
|
||||
1. Make sure a SQL Server instance with an up-to-date ML Services Python installation is running on localhost.
2. Restore the AirlineTestDB from the .bak file in this repo.
3. Make sure Trusted (Windows) authentication works for connecting to the database.
4. Set up a user with the db_owner role, with uid "Tester" and password "FakeTesterPwd".
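
Steps 3 and 4 correspond to the two ways of authenticating: Trusted authentication when no uid is given, and SQL authentication for the "Tester" user. The following is a minimal standalone sketch of the ODBC-style connection string each case is expected to yield. `build_connection_string` is a hypothetical helper written for illustration only; it approximates what `ConnectionInfo` is meant to build and is not part of sqlmlutils.

```python
def build_connection_string(driver="SQL Server", server="localhost",
                            database="master", uid="", pwd=""):
    """Hypothetical helper approximating ConnectionInfo's connection string."""
    # An empty uid means Trusted (Windows) authentication; pwd is then ignored.
    if uid == "":
        auth = "Trusted_Connection=Yes"
    else:
        auth = "uid={uid};pwd={pwd}".format(uid=uid, pwd=pwd)
    # Triple braces emit literal braces around the substituted driver name.
    return "Driver={{{driver}}};Server={server};Database={database};{auth};".format(
        driver=driver, server=server, database=database, auth=auth)


trusted = build_connection_string(database="AirlineTestDB")
tester = build_connection_string(database="AirlineTestDB",
                                 uid="Tester", pwd="FakeTesterPwd")
print(trusted)
print(tester)
```

If both strings can be used to open a connection from the client machine, the test prerequisites for authentication are in place.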
|
||||
|
||||
### Notable TODOs and open issues
|
||||
|
||||
1. The pymssql library is hard to install. Users need to install the .whl files from the link above, not
the .whl files currently hosted on PyPI. Because of this, we should consider moving to pyodbc.
2. Testing from a Linux client has not been performed.
3. The way we determine which dependencies of a package to install is fragile (it parses pip's console output).
4. Executing stored procedures with output parameters does not currently work; the MSSQLStoredProcedure binding could potentially be used to fix this.
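
To make item 3 concrete: the dependency list is recovered from the "Collecting ..." lines that pip prints while downloading. The sketch below shows the idea on its own. `requirement_names_from_pip_output` is a hypothetical name, and `sample_output` is made-up text that only imitates the shape of pip's log, so treat it as an illustration rather than the package's actual code path.

```python
import re


def requirement_names_from_pip_output(pip_output):
    """Hypothetical helper: pull requirement strings out of pip's log text."""
    requirements = []
    for line in pip_output.splitlines():
        if "Collecting" not in line:
            continue
        # Drop the "Collecting " prefix and any trailing "(from ...)" note.
        cleaned = re.sub(r"\(.*\)", "", line.replace("Collecting ", "").strip())
        requirements.append(cleaned.strip())
    return requirements


# Made-up sample imitating the shape of pip's download log.
sample_output = """\
Collecting requests
  Downloading requests-2.19.1-py2.py3-none-any.whl
Collecting idna<2.8,>=2.5 (from requests)
  Downloading idna-2.7-py2.py3-none-any.whl
"""
print(requirement_names_from_pip_output(sample_output))
```

Because this depends on the exact wording of pip's output, any change to that wording would silently break it, which is why a more robust approach is listed as a TODO.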
|
|
@ -0,0 +1,2 @@
|
|||
python.exe setup.py sdist
|
||||
python.exe -m pip install dist\sqlmlutils-0.5.0.zip --upgrade
|
Binary file not shown.
|
@ -0,0 +1,25 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from distutils.core import setup
|
||||
|
||||
setup(
    name='sqlmlutils',
    packages=['sqlmlutils', 'sqlmlutils/packagemanagement'],
    version='0.5.0',
    url='https://github.com/Microsoft/sqlmlutils',
    license='MIT License',
    description='A client side package for working with SQL Machine Learning Python Services. '
                'sqlmlutils enables easy package installation and remote code execution on your SQL Server machine.',
    author='Microsoft',
    author_email='joz@microsoft.com',
    install_requires=[
        'pip',
        'pymssql',
        'dill',
        'pkginfo',
        'requirements-parser',
        'pandas'
    ],
    python_requires='>=3.5'
)
|
|
@ -0,0 +1,6 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from .connectioninfo import ConnectionInfo
|
||||
from .sqlpythonexecutor import SQLPythonExecutor
|
||||
from .packagemanagement.sqlpackagemanager import SQLPackageManager
|
|
@ -0,0 +1,55 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
class ConnectionInfo:
    """Information needed to connect to SQL Server.

    """

    def __init__(self, driver: str = "SQL Server", server: str = "localhost", database: str = "master",
                 uid: str = "", pwd: str = ""):
        """
        :param driver: Driver to use to connect to SQL Server.
        :param server: SQL Server hostname or a specific instance to connect to.
        :param database: Database to connect to.
        :param uid: uid to connect with. If not specified, utilizes trusted authentication.
        :param pwd: pwd to connect with. If uid is not specified, pwd is ignored; uses trusted auth instead

        >>> from sqlmlutils import ConnectionInfo
        >>> connection = ConnectionInfo(server="ServerName", database="DatabaseName", uid="Uid", pwd="Pwd")
        """
        self._driver = driver
        self._server = server
        self._database = database
        self._uid = uid
        self._pwd = pwd

    @property
    def driver(self):
        return self._driver

    @property
    def server(self):
        return self._server

    @property
    def database(self):
        return self._database

    @property
    def uid(self):
        return self._uid

    @property
    def pwd(self):
        return self._pwd

    @property
    def connection_string(self):
        # Triple braces: "{{" and "}}" emit literal braces around the substituted driver name.
        # With only double braces the driver would never be substituted.
        return "Driver={{{driver}}};Server={server};Database={database};{auth};".format(
            driver=self._driver,
            server=self._server,
            database=self._database,
            auth="Trusted_Connection=Yes" if self._uid == "" else
                 "uid={uid};pwd={pwd}".format(uid=self._uid, pwd=self._pwd)
        )
|
|
@ -0,0 +1 @@
|
|||
from .sqlpackagemanager import SQLPackageManager
|
|
@ -0,0 +1,60 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import operator
|
||||
|
||||
from distutils.version import LooseVersion
|
||||
|
||||
|
||||
class DependencyResolver:

    def __init__(self, server_packages, target_package):
        self._server_packages = server_packages
        self._target_package = target_package

    def requirement_met(self, upgrade: bool, version: str = None) -> bool:
        exists = self._package_exists_on_server()
        return exists and (not upgrade or
                           (version is not None and self.get_target_server_version() != "" and
                            LooseVersion(self.get_target_server_version()) >= LooseVersion(version)))

    def get_target_server_version(self):
        for package in self._server_packages:
            if package[0].lower() == self._target_package.lower():
                return package[1]
        return ""

    def get_required_installs(self, target_requirements):
        required_packages = []
        for requirement in target_requirements:
            reqmet = any(package[0] == requirement.name for package in self._server_packages)

            for spec in requirement.specs:
                reqmet = reqmet & self._check_if_installed_package_meets_spec(
                    self._server_packages, requirement.name, spec)

            if not reqmet or requirement.name == self._target_package:
                required_packages.append(self.clean_requirement_name(requirement.name))
        return required_packages

    def _package_exists_on_server(self):
        return any([serverpkg[0].lower() == self._target_package.lower() for serverpkg in self._server_packages])

    @staticmethod
    def clean_requirement_name(reqname: str):
        return reqname.replace("-", "_")

    @staticmethod
    def _check_if_installed_package_meets_spec(package_tuples, name, spec):
        op_str = spec[0]
        req_version = spec[1]

        installed_package_name_and_version = [package for package in package_tuples if package[0] == name]
        if not installed_package_name_and_version:
            return False

        installed_package_name_and_version = installed_package_name_and_version[0]
        installed_version = installed_package_name_and_version[1]

        operator_map = {'>': 'gt', '>=': 'ge', '<': 'lt', '==': 'eq', '<=': 'le', '!=': 'ne'}
        return getattr(operator, operator_map[op_str])(LooseVersion(installed_version), LooseVersion(req_version))
|
|
@ -0,0 +1,26 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from distutils.version import LooseVersion
|
||||
import pip
|
||||
import warnings
|
||||
import sys
|
||||
|
||||
pipversion = LooseVersion(pip.__version__)
if pipversion > LooseVersion("10"):
    from pip._internal import pep425tags
    from pip._internal import main as pipmain
else:
    if pipversion < LooseVersion("8.1.2"):
        warnings.warn("Pip version less than 8.1.2 not supported.", Warning)
    from pip import pep425tags
    from pip import main as pipmain
|
||||
|
||||
# Monkey patch the pip version information with server information
|
||||
pep425tags.get_impl_version_info = lambda: eval(sys.argv[1])
|
||||
pep425tags.get_abbr_impl = lambda: sys.argv[2]
|
||||
pep425tags.get_abi_tag = lambda: sys.argv[3]
|
||||
pep425tags.get_platform = lambda: sys.argv[4]
|
||||
|
||||
# Call pipmain with the download request
|
||||
pipmain(list(map(str.strip, sys.argv[5].split(","))))
|
|
@ -0,0 +1,11 @@
|
|||
def no_upgrade(pkgname: str, serverversion: str, pkgversion: str = ""):
    return """
Package {pkgname} exists on server. Set upgrade to True to force upgrade.
The version of {pkgname} you are trying to install is {pkgversion}.
The version installed on the server is {serverversion}
""".format(pkgname=pkgname, pkgversion=pkgversion, serverversion=serverversion)


def install(pkgname: str, version: str, targetpackage: bool):
    target = "target package" if targetpackage else "required dependency"
    return "Installing {} {} version {}".format(target, pkgname, version)
|
|
@ -0,0 +1,9 @@
|
|||
import sys
|
||||
import io
|
||||
|
||||
|
||||
class OutputCapture(io.StringIO):

    def write(self, txt):
        sys.__stdout__.write(txt)
        super().write(txt)
|
|
@ -0,0 +1,130 @@
|
|||
from sqlmlutils.sqlbuilder import SQLBuilder
|
||||
from sqlmlutils.packagemanagement.scope import Scope
|
||||
|
||||
|
||||
class CreateLibraryBuilder(SQLBuilder):

    def __init__(self, pkg_name: str, pkg_filename: str, scope: Scope):
        self._name = clean_library_name(pkg_name)
        self._filename = pkg_filename
        self._has_params = True
        self._scope = scope

    @property
    def params(self):
        with open(self._filename, "rb") as f:
            pkgdatastr = "0x" + f.read().hex()

        installcheckscript = """
import os
import re
_ENV_NAME_USER_PATH = "MRS_EXTLIB_USER_PATH"
_ENV_NAME_SHARED_PATH = "MRS_EXTLIB_SHARED_PATH"


def _is_dist_info_file(name, file):
    return re.match(name + r'-.*egg', file) or re.match(name + r'-.*dist-info', file)


def _is_package_match(package_name, file):
    package_name = package_name.lower()
    file = file.lower()
    return file == package_name or file == package_name + ".py" or \
        _is_dist_info_file(package_name, file) or \
        ("-" in package_name and
         (package_name.split("-")[0] == file or _is_dist_info_file(package_name.replace("-", "_"), file)))

def package_files_in_scope(scope='private'):
    envdir = _ENV_NAME_SHARED_PATH if scope == 'public' or os.environ.get(_ENV_NAME_USER_PATH, "") == "" \
        else _ENV_NAME_USER_PATH
    path = os.environ.get(envdir, "")
    if os.path.isdir(path):
        return os.listdir(path)
    return []

def package_exists_in_scope(sql_package_name: str, scope=None) -> bool:
    if scope is None:
        # default to user path for every user but DBOs
        scope = 'public' if (os.environ.get(_ENV_NAME_USER_PATH, "") == "") else 'private'
    package_files = package_files_in_scope(scope)
    return any([_is_package_match(sql_package_name, package_file) for package_file in package_files])


assert package_exists_in_scope("{sqlpkgname}", "{scopestr}")
""".format(sqlpkgname=self._name, scopestr=self._scope._name)

        return pkgdatastr, installcheckscript

    @property
    def base_script(self) -> str:
        return """
-- Wrap this in a transaction
DECLARE @TransactionName varchar(30) = 'SqlPackageTransaction';
BEGIN TRAN @TransactionName

-- Drop the library if it exists
BEGIN TRY
    DROP EXTERNAL LIBRARY [{sqlpkgname}] {authorization}
END TRY
BEGIN CATCH
END CATCH

-- Parameter bind the package data
DECLARE @content varbinary(MAX) = convert(varbinary(MAX), %s, 1);

-- Create the library
CREATE EXTERNAL LIBRARY [{sqlpkgname}] {authorization}
FROM (CONTENT = @content) WITH (LANGUAGE = 'Python');

-- Dummy SPEES
{dummy_spees}

-- Check to make sure the package was installed
BEGIN TRY
    exec sp_execute_external_script
    @language = N'Python',
    @script = %s
    -- Installation succeeded, commit the transaction
    COMMIT TRAN @TransactionName
    print('Package successfully installed.')
END TRY
BEGIN CATCH
    -- Installation failed, rollback the transaction
    ROLLBACK TRAN @TransactionName
    print('Package installation failed.');
    THROW;
END CATCH
""".format(sqlpkgname=self._name,
           authorization=_get_authorization(self._scope),
           dummy_spees=_get_dummy_spees())


class DropLibraryBuilder(SQLBuilder):

    def __init__(self, sql_package_name: str, scope: Scope):
        self._name = clean_library_name(sql_package_name)
        self._scope = scope

    @property
    def base_script(self) -> str:
        return """
DROP EXTERNAL LIBRARY [{}] {authorization}

{dummy_spees}
""".format(self._name, authorization=_get_authorization(self._scope), dummy_spees=_get_dummy_spees())


def clean_library_name(pkgname: str):
    return pkgname.replace("-", "_").lower()


def _get_authorization(scope: Scope) -> str:
    return "AUTHORIZATION dbo" if scope == Scope.public_scope() else ""


def _get_dummy_spees() -> str:
    return """
exec sp_execute_external_script
@language = N'Python',
@script = N''
"""
|
|
@ -0,0 +1,82 @@
|
|||
import re
|
||||
import requirements
|
||||
import subprocess
|
||||
import os
|
||||
|
||||
from sqlmlutils import ConnectionInfo, SQLPythonExecutor
|
||||
from sqlmlutils.packagemanagement import servermethods
|
||||
|
||||
class PipDownloader:

    def __init__(self, connection: ConnectionInfo, downloaddir: str, targetpackage: str):
        self._connection = connection
        self._downloaddir = downloaddir
        self._targetpackage = targetpackage
        server_info = SQLPythonExecutor(connection).execute_function_in_sql(servermethods.get_server_info)
        globals().update(server_info)

    def download(self):
        return self._download(True)

    def download_single(self) -> str:
        _, pkgsdownloaded = self._download(False)
        return pkgsdownloaded[0]

    def _download(self, withdependencies):
        # This command directs pip to download the target package, as well as all of its dependencies into
        # temporary_directory.
        commands = ["download", self._targetpackage, "--destination-dir", self._downloaddir, "--no-cache-dir"]
        if not withdependencies:
            commands.append("--no-dependencies")

        output, error = self._run_in_new_process(commands)

        pkgreqs = self._get_reqs_from_output(output)

        packagesdownloaded = [os.path.join(self._downloaddir, f) for f in os.listdir(self._downloaddir)
                              if os.path.isfile(os.path.join(self._downloaddir, f))]

        return pkgreqs, packagesdownloaded

    def _run_in_new_process(self, commands):
        # We get the package requirements based on the print output of pip, which is stable across version 8-10.
        # TODO: get requirements in a more robust way (either through using pip internal code or rolling our own)
        download_script = os.path.join((os.path.dirname(os.path.realpath(__file__))), "download_script.py")
        args = ["python", download_script,
                str(_patch_get_impl_version_info()), str(_patch_get_abbr_impl()),
                str(_patch_get_abi_tag()), str(_patch_get_platform()),
                ",".join(str(x) for x in commands)]

        with subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc:
            output = proc.stdout.read()
            error = proc.stderr.read()

        return output.decode(), error.decode()

    @staticmethod
    def _get_reqs_from_output(pipoutput: str):
        # TODO: get requirements in a more robust way (either through using pip internal code or rolling our own)
        collectinglines = [line for line in pipoutput.splitlines() if "Collecting" in line]

        f = lambda unclean: \
            re.sub(r'\(.*\)', "", unclean.replace("Collecting ", "").strip())

        reqstr = "\n".join([f(line) for line in collectinglines])
        return list(requirements.parse(reqstr))


def _patch_get_impl_version_info():
    return globals()["impl_version_info"]


def _patch_get_abbr_impl():
    return globals()["abbr_impl"]


def _patch_get_abi_tag():
    return globals()["abi_tag"]


def _patch_get_platform():
    return globals()["platform"]
|
||||
|
|
@ -0,0 +1,29 @@
|
|||
import pkginfo
|
||||
import os
|
||||
import re
|
||||
|
||||
|
||||
def _get_pkginfo(filename: str):
    try:
        if ".whl" in filename:
            return pkginfo.Wheel(filename)
        else:
            return pkginfo.SDist(filename)
    except Exception:
        return None


def get_package_name_from_file(filename: str) -> str:
    pkg = _get_pkginfo(filename)
    if pkg is not None and pkg.name is not None:
        return pkg.name
    name = os.path.splitext(os.path.basename(filename))[0]
    return re.sub(r"\-[0-9].*", "", name)


def get_package_version_from_file(filename: str):
    pkg = _get_pkginfo(filename)
    if pkg is not None and pkg.version is not None:
        return pkg.version
    return None
|
||||
|
|
@ -0,0 +1,20 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
class Scope:

    def __init__(self, name: str):
        self._name = name

    def __eq__(self, other):
        return self._name == other._name

    @staticmethod
    def public_scope():
        return Scope("public")

    @staticmethod
    def private_scope():
        return Scope("private")
|
||||
|
||||
|
|
@ -0,0 +1,78 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from sqlmlutils.packagemanagement.scope import Scope
|
||||
import os
|
||||
import re
|
||||
|
||||
_ENV_NAME_USER_PATH = "MRS_EXTLIB_USER_PATH"
|
||||
_ENV_NAME_SHARED_PATH = "MRS_EXTLIB_SHARED_PATH"
|
||||
|
||||
|
||||
def show_installed_packages():
    from distutils.version import LooseVersion
    import pip
    if LooseVersion(pip.__version__) > LooseVersion("10"):
        from pip._internal.operations import freeze
    else:
        from pip.operations import freeze

    packages = []
    for package in list(freeze.freeze()):
        val = package.split("==")
        name = val[0]
        version = val[1]
        packages.append((name, version))
    return packages


def get_server_info():
    from distutils.version import LooseVersion
    import pip
    if LooseVersion(pip.__version__) > LooseVersion("10"):
        from pip._internal import pep425tags
    else:
        from pip import pep425tags
    return {
        "impl_version_info": pep425tags.get_impl_version_info(),
        "abbr_impl": pep425tags.get_abbr_impl(),
        "abi_tag": pep425tags.get_abi_tag(),
        "platform": pep425tags.get_platform()
    }


def check_package_install_success(sql_package_name: str) -> bool:
    return package_exists_in_scope(sql_package_name)


def package_files_in_scope(scope=Scope.private_scope()):
    envdir = _ENV_NAME_SHARED_PATH if scope == Scope.public_scope() or os.environ.get(_ENV_NAME_USER_PATH, "") == "" \
        else _ENV_NAME_USER_PATH
    path = os.environ.get(envdir, "")
    if os.path.isdir(path):
        return os.listdir(path)
    return []


def package_exists_in_scope(sql_package_name: str, scope=None) -> bool:
    if scope is None:
        # default to user path for every user but DBOs
        scope = Scope.public_scope() if (os.environ.get(_ENV_NAME_USER_PATH, "") == "") else Scope.private_scope()
    package_files = package_files_in_scope(scope)
    return any([_is_package_match(sql_package_name, package_file) for package_file in package_files])


def _is_dist_info_file(name, file):
    return re.match(name + r'-.*egg', file) or re.match(name + r'-.*dist-info', file)


def _is_package_match(package_name, file):
    package_name = package_name.lower()
    file = file.lower()
    return file == package_name or file == package_name + ".py" or \
|
||||
_is_dist_info_file(package_name, file) or \
|
||||
("-" in package_name and
|
||||
(package_name.split("-")[0] == file or _is_dist_info_file(package_name.replace("-", "_"), file)))
|
||||
|
||||
|
||||
|
|
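To see which directory entries `_is_package_match` accepts, the sketch below re-implements the same matching rules as a standalone script. The function names without the leading underscore and the `listing` contents are assumptions made for illustration; the results are wrapped in `bool` so that the helpers return plain booleans rather than match objects.

```python
import re


def is_dist_info_file(name: str, file: str) -> bool:
    # Matches "<name>-<anything>egg" or "<name>-<anything>dist-info" from the start.
    return bool(re.match(name + r"-.*egg", file) or re.match(name + r"-.*dist-info", file))


def is_package_match(package_name: str, file: str) -> bool:
    package_name = package_name.lower()
    file = file.lower()
    if file == package_name or file == package_name + ".py":
        return True
    if is_dist_info_file(package_name, file):
        return True
    if "-" in package_name:
        # Distribution names with "-" are stored on disk with "_" instead.
        return (package_name.split("-")[0] == file
                or is_dist_info_file(package_name.replace("-", "_"), file))
    return False


# Hypothetical directory listing of a package path on the server.
listing = ["numpy", "numpy-1.15.0.dist-info",
           "scikit_learn-0.19.1.dist-info", "numpydoc-0.8.dist-info"]
matches = {name: [f for f in listing if is_package_match(name, f)]
           for name in ("numpy", "scikit-learn")}
print(matches)
```

Because the package name is placed into the pattern without `re.escape`, a name containing regex metacharacters such as `.` is treated as a pattern rather than as literal text.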
@ -0,0 +1,176 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import os
|
||||
import tempfile
|
||||
import zipfile
|
||||
import warnings
|
||||
|
||||
from sqlmlutils import ConnectionInfo, SQLPythonExecutor
|
||||
from sqlmlutils.sqlqueryexecutor import execute_query, SQLTransaction
|
||||
from sqlmlutils.packagemanagement.packagesqlbuilder import clean_library_name
|
||||
from sqlmlutils.packagemanagement import servermethods
|
||||
from sqlmlutils.sqlqueryexecutor import SQLQueryExecutor
|
||||
from sqlmlutils.packagemanagement.dependencyresolver import DependencyResolver
|
||||
from sqlmlutils.packagemanagement.pipdownloader import PipDownloader
|
||||
from sqlmlutils.packagemanagement.scope import Scope
|
||||
from sqlmlutils.packagemanagement import messages
|
||||
from sqlmlutils.packagemanagement.pkgutils import get_package_name_from_file, get_package_version_from_file
|
||||
from sqlmlutils.packagemanagement.packagesqlbuilder import CreateLibraryBuilder, DropLibraryBuilder
|
||||
|
||||
|
||||
class SQLPackageManager:
|
||||
|
||||
def __init__(self, connection_info: ConnectionInfo):
|
||||
self._connection_info = connection_info
|
||||
self._pyexecutor = SQLPythonExecutor(connection_info)
|
||||
|
||||
def install(self,
|
||||
package: str,
|
||||
upgrade: bool = False,
|
||||
version: str = None,
|
||||
install_dependencies: bool = True,
|
||||
scope: Scope = Scope.private_scope()):
|
||||
"""Install Python package into a SQL Server Python Services environment using pip.
|
||||
|
||||
:param package: Package name to install on the SQL Server. Can also be a filename.
|
||||
:param upgrade: If True, will update the package if it exists on the specified SQL Server.
|
||||
If False, will not try to update an existing package.
|
||||
:param version: Not yet supported. Package version to install. If not specified,
|
||||
current stable version for server environment as determined by PyPI/Anaconda repos.
|
||||
:param install_dependencies: If True, installs required dependencies of package (similar to how default
|
||||
pip install or conda install works). False not yet supported.
|
||||
:param scope: Specifies whether to install packages into private or public scope. Default is private scope.
|
||||
This installs packages into a private path for the SQL principal you connect as. If your principal has the
|
||||
db_owner role, you can also specify scope as public. This will install packages into a public path for all
|
||||
users. Note: if you connect as dbo, you can only install packages into the public path.
|
||||
|
||||
>>> from sqlmlutils import ConnectionInfo, SQLPythonExecutor, SQLPackageManager
|
||||
>>> connection = ConnectionInfo(server="localhost", database="AirlineTestsDB")
|
||||
>>> pyexecutor = SQLPythonExecutor(connection)
|
||||
>>> pkgmanager = SQLPackageManager(connection)
|
||||
>>>
|
||||
>>> def use_tensorflow():
|
||||
>>> import tensorflow as tf
|
||||
>>> node1 = tf.constant(3.0, tf.float32)
|
||||
>>> return str(node1.dtype)
|
||||
>>>
|
||||
>>> pkgmanager.install("tensorflow")
|
||||
>>> ret = pyexecutor.execute_function_in_sql(use_tensorflow)
|
||||
>>> pkgmanager.uninstall("tensorflow")
|
||||
|
||||
"""
|
||||
if not install_dependencies:
|
||||
raise ValueError("Dependencies will always be installed - "
|
||||
"single package install without dependencies not yet supported.")
|
||||
|
||||
if os.path.isfile(package):
|
||||
self._install_from_file(package, scope, upgrade)
|
||||
else:
|
||||
self._install_from_pypi(package, upgrade, version, install_dependencies, scope)
|
||||
|
||||
def uninstall(self, package_name: str, scope: Scope = Scope.private_scope()):
|
||||
"""Remove Python package from a SQL Server Python environment.
|
||||
|
||||
:param package_name: Package name to remove on the SQL Server.
|
||||
:param scope: Specifies whether to uninstall packages from private or public scope. Default is private scope.
|
||||
This uninstalls packages from a private path for the SQL principal you connect as. If your principal has the
|
||||
db_owner role, you can also specify scope as public. This will uninstall packages from a public path for all
|
||||
users. Note: if you connect as dbo, you can only uninstall packages from the public path.
|
||||
"""
|
||||
print("Uninstalling " + package_name + " only, not dependencies")
|
||||
self._drop_sql_package(package_name, scope)
|
||||
|
||||
def list(self):
|
||||
"""List packages installed on server, similar to output of pip freeze.
|
||||
|
||||
:return: List of tuples, each tuple[0] is package name and tuple[1] is package version.
|
||||
"""
|
||||
return self._pyexecutor.execute_function_in_sql(servermethods.show_installed_packages)
|
||||
|
||||
def _drop_sql_package(self, sql_package_name: str, scope: Scope):
|
||||
builder = DropLibraryBuilder(sql_package_name=sql_package_name, scope=scope)
|
||||
execute_query(builder, self._connection_info)
|
||||
|
||||
# TODO: Support installing without dependencies
|
||||
def _install_from_pypi(self,
|
||||
target_package: str,
|
||||
upgrade: bool = False,
|
||||
version: str = None,
|
||||
install_dependencies: bool = True,
|
||||
scope: Scope = Scope.private_scope()):
|
||||
|
||||
if not install_dependencies:
|
||||
raise ValueError("Dependencies will always be installed - "
|
||||
"single package install without dependencies not yet supported.")
|
||||
|
||||
if version is not None:
|
||||
target_package = target_package + "==" + version
|
||||
|
||||
with tempfile.TemporaryDirectory() as temporary_directory:
|
||||
pipdownloader = PipDownloader(self._connection_info, temporary_directory, target_package)
|
||||
target_package_file = pipdownloader.download_single()
|
||||
self._install_from_file(target_package_file, scope, upgrade)
|
||||
|
||||
def _install_from_file(self, target_package_file: str, scope: Scope, upgrade: bool = False):
|
||||
name = get_package_name_from_file(target_package_file)
|
||||
version = get_package_version_from_file(target_package_file)
|
||||
|
||||
resolver = DependencyResolver(self.list(), name)
|
||||
if resolver.requirement_met(upgrade, version):
|
||||
serverversion = resolver.get_target_server_version()
|
||||
print(messages.no_upgrade(name, serverversion, version))
|
||||
return
|
||||
|
||||
# Download requirements from PyPI
|
||||
with tempfile.TemporaryDirectory() as temporary_directory:
|
||||
pipdownloader = PipDownloader(self._connection_info, temporary_directory, target_package_file)
|
||||
|
||||
# For now, we download all target package dependencies from PyPI.
|
||||
target_package_requirements, requirements_downloaded = pipdownloader.download()
|
||||
|
||||
# Resolve which package dependencies need to be installed or upgraded on server.
|
||||
required_installs = resolver.get_required_installs(target_package_requirements)
|
||||
dependencies_to_install = self._get_required_files_to_install(requirements_downloaded, required_installs)
|
||||
|
||||
self._install_many(target_package_file, dependencies_to_install, scope)
|
||||
|
||||
def _install_many(self, target_package_file: str, dependency_files, scope: Scope):
|
||||
target_name = get_package_name_from_file(target_package_file)
|
||||
|
||||
with SQLQueryExecutor(connection=self._connection_info) as sqlexecutor:
|
||||
transaction = SQLTransaction(sqlexecutor, clean_library_name(target_name) + "InstallTransaction")
|
||||
transaction.begin()
|
||||
try:
|
||||
for pkgfile in dependency_files:
|
||||
self._install_single(sqlexecutor, pkgfile, scope)
|
||||
self._install_single(sqlexecutor, target_package_file, scope, True)
|
||||
transaction.commit()
|
||||
except Exception:
|
||||
transaction.rollback()
|
||||
raise RuntimeError("Package installation failed, installed dependencies were rolled back.")
|
||||
|
||||
@staticmethod
|
||||
def _install_single(sqlexecutor: SQLQueryExecutor, package_file: str, scope: Scope, is_target=False):
|
||||
name = get_package_name_from_file(package_file)
|
||||
version = get_package_version_from_file(package_file)
|
||||
|
||||
with tempfile.TemporaryDirectory() as temporary_directory:
|
||||
prezip = os.path.join(temporary_directory, name + "PREZIP.zip")
|
||||
with zipfile.ZipFile(prezip, 'w') as zipf:
|
||||
zipf.write(package_file, os.path.basename(package_file))
|
||||
|
||||
builder = CreateLibraryBuilder(pkg_name=name, pkg_filename=prezip, scope=scope)
|
||||
sqlexecutor.execute(builder)
|
||||
|
||||
@staticmethod
|
||||
def _get_required_files_to_install(pkgfiles, requirements):
|
||||
return [file for file in pkgfiles
|
||||
if SQLPackageManager._pkgfile_in_requirements(file, requirements)]
|
||||
|
||||
@staticmethod
|
||||
def _pkgfile_in_requirements(pkgfile: str, requirements):
|
||||
pkgname = get_package_name_from_file(pkgfile)
|
||||
return any([DependencyResolver.clean_requirement_name(pkgname.lower()) ==
|
||||
DependencyResolver.clean_requirement_name(req.lower())
|
||||
for req in requirements])
|
|
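The zip-wrapping step in `_install_single` can be tried in isolation. This is a minimal sketch using only the standard library; `wrap_in_zip` is a hypothetical helper name and the wheel file is a placeholder, not a real package. It stores the package file inside a new archive under its base name, as the method above does before building the CREATE EXTERNAL LIBRARY statement.

```python
import os
import tempfile
import zipfile


def wrap_in_zip(package_file: str, out_dir: str, name: str) -> str:
    # Hypothetical helper: put the package file into "<name>PREZIP.zip".
    prezip = os.path.join(out_dir, name + "PREZIP.zip")
    with zipfile.ZipFile(prezip, "w") as zipf:
        # Store under the base name so no local directory structure leaks in.
        zipf.write(package_file, os.path.basename(package_file))
    return prezip


with tempfile.TemporaryDirectory() as workdir:
    fake_wheel = os.path.join(workdir, "demo-1.0-py3-none-any.whl")
    with open(fake_wheel, "wb") as fh:
        fh.write(b"placeholder bytes, not a real wheel")
    archive = wrap_in_zip(fake_wheel, workdir, "demo")
    with zipfile.ZipFile(archive) as zipf:
        members = zipf.namelist()

print(members)
```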
@ -0,0 +1,467 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from typing import Callable, List
|
||||
import abc
|
||||
import dill
|
||||
import inspect
|
||||
import textwrap
|
||||
from pandas import DataFrame
|
||||
import warnings
|
||||
|
||||
RETURN_COLUMN_NAME = "return_val"
|
||||
|
||||
|
||||
"""
|
||||
_SQLBuilder implementations are used to generate SQL scripts to execute Python functions and
create/drop/execute stored procedures.
|
||||
|
||||
Builder classes use query parametrization whenever possible, falling back to Python string formatting when necessary.
|
||||
|
||||
The main internal function to execute SQL statements (execute_query in the sqlqueryexecutor module)
takes a SQLBuilder implementation as an argument.
|
||||
|
||||
All _SQLBuilder classes implement a base_script property. This is the text of the SQL query. Some builder classes
|
||||
return values in their params property.
|
||||
"""
|
||||
|
||||
|
||||
class SQLBuilder:
|
||||
|
||||
@abc.abstractmethod
|
||||
def base_script(self) -> str:
|
||||
pass
|
||||
|
||||
@property
|
||||
def params(self):
|
||||
return None
|
||||
|
||||
|
||||
class SpeesBuilder(SQLBuilder):
|
||||
|
||||
"""_SpeesBuilder objects are used to generate exec sp_execute_external_script SQL queries.
|
||||
|
||||
"""
|
||||
|
||||
def __init__(self,
|
||||
script: str,
|
||||
with_results_text: str = "",
|
||||
input_data_query: str = "",
|
||||
script_parameters_text: str = ""):
|
||||
"""Instantiate a _SpeesBuilder object.
|
||||
|
||||
:param script: maps to @script parameter in the SQL query parameter
|
||||
:param with_results_text: with results text used to define the expected data schema of the SQL query
|
||||
:param input_data_query: maps to @input_data_1 SQL query parameter
|
||||
:param script_parameters_text: maps to @params SQL query parameter
|
||||
"""
|
||||
self._script = script
|
||||
self._input_data_query = input_data_query
|
||||
self._script_parameters_text = script_parameters_text
|
||||
self._with_results_text = with_results_text
|
||||
|
||||
@property
|
||||
def base_script(self):
|
||||
return """
|
||||
exec sp_execute_external_script
|
||||
@language = N'Python',
|
||||
@script = %s,
|
||||
@input_data_1 = %s
|
||||
{script_parameters}
|
||||
{with_results_text}
|
||||
""".format(script_parameters=self._script_parameters_text,
|
||||
with_results_text=self._with_results_text)
|
||||
|
||||
@property
|
||||
def params(self):
|
||||
return self._script, self._input_data_query
|
||||
|
||||
|
||||
class SpeesBuilderFromFunction(SpeesBuilder):
|
||||
|
||||
"""
|
||||
_SpeesBuilderFromFunction objects are used to generate SPEES queries based on a function and given arguments.
|
||||
"""
|
||||
|
||||
_WITH_RESULTS_TEXT = "with result sets((return_val varchar(MAX)))"
|
||||
|
||||
def __init__(self, func: Callable, input_data_query: str = "", *args, **kwargs):
|
||||
"""Instantiate a _SpeesBuilderFromFunction object.
|
||||
|
||||
:param func: function to execute on the SQL Server.
|
||||
The spees query is built based on this function.
|
||||
:param input_data_query: query text for @input_data_1 parameter
|
||||
:param args: positional arguments to function call in SPEES
|
||||
:param kwargs: keyword arguments to function call in SPEES
|
||||
"""
|
||||
with_inputdf = input_data_query != ""
|
||||
self._function_text = self._build_wrapper_python_script(func, with_inputdf, *args, **kwargs)
|
||||
super().__init__(script=self._function_text,
|
||||
with_results_text=self._WITH_RESULTS_TEXT,
|
||||
input_data_query=input_data_query)
|
||||
|
||||
# Generates a Python script that encapsulates a user defined function and the arguments to that function.
|
||||
# This script is "shipped" over to the SQL Server machine.
|
||||
# The function is sent as text.
|
||||
# The arguments to pass to the function are serialized into their dill hex strings.
|
||||
# When with_inputdf is True, it specifies that func will take the magic "InputDataSet" as its first argument.
|
||||
@staticmethod
|
||||
def _build_wrapper_python_script(func: Callable, with_inputdf, *args, **kwargs):
|
||||
dill.settings['recurse'] = True
|
||||
function_text = SpeesBuilderFromFunction._clean_function_text(inspect.getsource(func))
|
||||
args_dill = dill.dumps(kwargs).hex()
|
||||
pos_args_dill = dill.dumps(args).hex()
|
||||
function_name = func.__name__
|
||||
return """
|
||||
{user_function_text}
|
||||
|
||||
import dill
|
||||
import pandas as pd
|
||||
|
||||
# serialized keyword arguments
|
||||
args_dill = bytes.fromhex("{args_dill}")
|
||||
# serialized positional arguments
|
||||
pos_args_dill = bytes.fromhex("{pos_args_dill}")
|
||||
|
||||
args = dill.loads(args_dill)
|
||||
pos_args = dill.loads(pos_args_dill)
|
||||
|
||||
# user function name
|
||||
func = {user_function_name}
|
||||
|
||||
# call user function with serialized arguments
|
||||
return_val = func{func_arguments}
|
||||
|
||||
return_frame = pd.DataFrame()
|
||||
# serialize results of user function and put in DataFrame for return through SQL Satellite channel
|
||||
return_frame["return_val"] = [dill.dumps(return_val).hex()]
|
||||
OutputDataSet = return_frame
|
||||
""".format(user_function_text=function_text,
|
||||
args_dill=args_dill,
|
||||
pos_args_dill=pos_args_dill,
|
||||
user_function_name=function_name,
|
||||
func_arguments=SpeesBuilderFromFunction._func_arguments(with_inputdf))
|
||||
|
||||
# Call syntax of the user function
|
||||
# When with_inputdf is true, the user function will always take the "InputDataSet" magic variable as its first
|
||||
# argument.
|
||||
@staticmethod
|
||||
def _func_arguments(with_inputdf: bool):
|
||||
return "(InputDataSet, *pos_args, **args)" if with_inputdf else "(*pos_args, **args)"
|
||||
|
||||
@staticmethod
|
||||
def _clean_function_text(function_text):
|
||||
return textwrap.dedent(function_text)
|
||||
|
||||
|
||||
class StoredProcedureBuilder(SQLBuilder):
|
||||
|
||||
def __init__(self, name: str, script: str, input_params: dict = None, output_params: dict = None):
|
||||
|
||||
"""StoredProcedureBuilder SQL stored procedures based on Python functions.
|
||||
|
||||
:param name: name of the stored procedure
|
||||
:param script: Python script text to base the stored procedure on
|
||||
:param input_params: input parameters type annotation dictionary for the stored procedure
|
||||
:param output_params: output parameters type annotation dictionary from the stored procedure
|
||||
"""
|
||||
if input_params is None:
|
||||
input_params = {}
|
||||
if output_params is None:
|
||||
output_params = {}
|
||||
self._script = script
|
||||
self._name = name
|
||||
self._input_params = input_params
|
||||
self._output_params = output_params
|
||||
self._param_declarations = ""
|
||||
|
||||
names_of_input_args = list(self._input_params)
|
||||
names_of_output_args = list(self._output_params)
|
||||
|
||||
self._in_parameter_declarations = self.get_declarations(names_of_input_args, self._input_params)
|
||||
self._out_parameter_declarations = self.get_declarations(names_of_output_args, self._output_params,
|
||||
outputs=True)
|
||||
self._script_parameter_text = self.script_parameter_text(names_of_input_args, self._input_params,
|
||||
names_of_output_args, self._output_params)
|
||||
|
||||
@property
|
||||
def base_script(self) -> str:
|
||||
self._param_declarations = self.combine_in_out(
|
||||
self._in_parameter_declarations, self._out_parameter_declarations)
|
||||
|
||||
return """
|
||||
CREATE PROCEDURE {name}
|
||||
{parameter_declarations}
|
||||
AS
|
||||
EXEC sp_execute_external_script
|
||||
@language = N'Python',
|
||||
@script = %s
|
||||
{script_parameters}
|
||||
""".format(name=self._name,
|
||||
parameter_declarations=self._param_declarations,
|
||||
script_parameters=self._script_parameter_text)
|
||||
|
||||
@property
|
||||
def params(self):
|
||||
return self._script
|
||||
|
||||
def script_parameter_text(self, in_names: List[str], in_types: dict, out_names: List[str], out_types: dict) -> str:
|
||||
if not in_names and not out_names:
|
||||
return ""
|
||||
|
||||
script_params = ""
|
||||
self._script = "\nfrom pandas import DataFrame\n" + self._script
|
||||
|
||||
in_data_name = ""
|
||||
out_data_name = ""
|
||||
|
||||
for name in in_names:
|
||||
if in_types[name] == DataFrame:
|
||||
in_data_name = name
|
||||
in_names.remove(name)
|
||||
break
|
||||
|
||||
for name in out_names:
|
||||
if out_types[name] == DataFrame:
|
||||
out_data_name = name
|
||||
out_names.remove(name)
|
||||
break
|
||||
|
||||
if in_data_name != "":
|
||||
script_params += ",\n" + StoredProcedureBuilderFromFunction.get_input_data_set(in_data_name)
|
||||
|
||||
if out_data_name != "":
|
||||
script_params += ",\n" + StoredProcedureBuilderFromFunction.get_output_data_set(out_data_name)
|
||||
|
||||
if len(in_names) > 0:
|
||||
script_params += ","
|
||||
|
||||
in_params_declaration = out_params_declaration = ""
|
||||
in_params_passing = out_params_passing = ""
|
||||
if len(in_names) > 0:
|
||||
in_params_declaration = self.get_declarations(in_names, in_types)
|
||||
in_params_passing = self.get_params_passing(in_names)
|
||||
|
||||
if len(out_names) > 0:
|
||||
out_params_declaration = self.get_declarations(out_names, out_types, True)
|
||||
out_params_passing = self.get_params_passing(out_names, True)
|
||||
|
||||
params_declaration = self.combine_in_out(in_params_declaration, out_params_declaration)
|
||||
params_passing = self.combine_in_out(in_params_passing, out_params_passing)
|
||||
|
||||
if params_declaration != "":
|
||||
script_params += "\n@params = N'{params_declarations}',\n {params_passing}".format(
|
||||
params_declarations=params_declaration,
|
||||
params_passing=params_passing)
|
||||
|
||||
return script_params
|
||||
|
||||
@staticmethod
|
||||
def combine_in_out(in_str: str = "", out_str: str = ""):
|
||||
result = in_str
|
||||
if result != "" and out_str != "":
|
||||
result += ",\n "
|
||||
result += out_str
|
||||
return result
|
||||
|
||||
@staticmethod
|
||||
def get_input_data_set(name):
|
||||
return "@input_data_1 = @{name},\n@input_data_1_name = N'{name}'".format(name=name)
|
||||
|
||||
@staticmethod
|
||||
def get_output_data_set(name):
|
||||
return "@output_data_1_name = N'{name}'".format(name=name)
|
||||
|
||||
@staticmethod
|
||||
def get_declarations(names_of_args: List[str], type_annotations: dict, outputs: bool = False):
|
||||
return ",\n ".join(["@" + name + " {sqltype}{output}".format(
|
||||
sqltype=StoredProcedureBuilder.to_sql_type(type_annotations.get(name, None)),
|
||||
output=" OUTPUT" if outputs else ""
|
||||
) for name in names_of_args])
|
||||
|
||||
@staticmethod
|
||||
def to_sql_type(pytype):
|
||||
if pytype is None or pytype == str or pytype == DataFrame:
|
||||
return "nvarchar(MAX)"
|
||||
elif pytype == int:
|
||||
return "int"
|
||||
elif pytype == float:
|
||||
return "float"
|
||||
elif pytype == bool:
|
||||
return "bit"
|
||||
else:
|
||||
raise ValueError("Python type: " + str(pytype) + " not supported.")
|
||||
|
||||
@staticmethod
|
||||
def get_params_passing(names_of_args, outputs: bool = False):
|
||||
return ",\n ".join(["@" + name + " = " + "@" + name + "{output}".format(output=" OUTPUT" if outputs else "")
|
||||
for name in names_of_args])
|
||||
|
||||
|
||||
class StoredProcedureBuilderFromFunction(StoredProcedureBuilder):
|
||||
|
||||
"""Build query text for stored procedures creation based on Python functions.
|
||||
|
||||
ex:
|
||||
|
||||
name: "MyStoredProcedure"
|
||||
func:
|
||||
def foobar(arg1: str, arg2: str, arg3: str):
|
||||
print(arg1, arg2, arg3)
|
||||
|
||||
===========becomes===================
|
||||
|
||||
create procedure MyStoredProcedure @arg1 varchar(MAX), @arg2 varchar(MAX), @arg3 varchar(MAX) as
|
||||
|
||||
exec sp_execute_external_script
|
||||
@language = N'Python',
|
||||
@script=N'
|
||||
def foobar(arg1, arg2, arg3):
|
||||
print(arg1, arg2, arg3)
|
||||
foobar(arg1=arg1, arg2=arg2, arg3=arg3)
|
||||
',
|
||||
@params = N'@arg1 varchar(MAX), @arg2 varchar(MAX), @arg3 varchar(MAX)',
|
||||
@arg1 = @arg1,
|
||||
@arg2 = @arg2,
|
||||
@arg3 = @arg3
|
||||
"""
|
||||
|
||||
def __init__(self, name: str, func: Callable,
|
||||
input_params: dict = None, output_params: dict = None):
|
||||
"""StoredProcedureBuilderFromFunction SQL stored procedures based on Python functions.
|
||||
|
||||
:param name: name of the stored procedure
|
||||
:param func: function to base the stored procedure on
|
||||
:param input_params: input parameters type annotation dictionary for the stored procedure
|
||||
Can use function type annotations instead; if both are given, they must match
|
||||
:param output_params: output parameters type annotation dictionary from the stored procedure
|
||||
"""
|
||||
if input_params is None:
|
||||
input_params = {}
|
||||
if output_params is None:
|
||||
output_params = {}
|
||||
self._func = func
|
||||
self._name = name
|
||||
self._output_params = output_params
|
||||
|
||||
# Get function information
|
||||
function_text = textwrap.dedent(inspect.getsource(self._func))
|
||||
|
||||
argspec = inspect.getfullargspec(self._func)
|
||||
names_of_input_args = argspec.args
|
||||
annotations = argspec.annotations
|
||||
|
||||
if argspec.defaults is not None:
|
||||
warnings.warn("Default values are not supported")
|
||||
|
||||
# Figure out input and output parameter dictionaries
|
||||
if input_params != {}:
|
||||
if annotations != {} and annotations != input_params:
|
||||
raise ValueError("Annotations and input_params do not match!")
|
||||
self._input_params = input_params
|
||||
elif annotations != {}:
|
||||
self._input_params = annotations
|
||||
elif len(names_of_input_args) == 0:
|
||||
self._input_params = {}
|
||||
|
||||
names_of_output_args = list(self._output_params)
|
||||
|
||||
if len(names_of_input_args) != len(self._input_params):
|
||||
raise ValueError("Number of argument annotations doesn't match the number of arguments!")
|
||||
if set(names_of_input_args) != set(self._input_params.keys()):
|
||||
raise ValueError("Names of arguments do not match the annotation keys!")
|
||||
|
||||
calling_text = self.get_function_calling_text(self._func, names_of_input_args)
|
||||
|
||||
output_data_set = None
|
||||
for name in names_of_output_args:
|
||||
if self._output_params[name] == DataFrame:
|
||||
names_of_output_args.remove(name)
|
||||
output_data_set = name
|
||||
break
|
||||
|
||||
# Creates the base python script to put in the SPEES query.
|
||||
# Arguments to function are passed by name into script using SPEES @params argument.
|
||||
self._script = """
|
||||
{function_text}
|
||||
{function_call_text}
|
||||
{ending}
|
||||
""".format(function_text=function_text, function_call_text=calling_text,
|
||||
ending=self.get_ending(self._output_params, output_data_set))
|
||||
|
||||
self._in_parameter_declarations = self.get_declarations(names_of_input_args, self._input_params)
|
||||
self._out_parameter_declarations = self.get_declarations(names_of_output_args, self._output_params,
|
||||
outputs=True)
|
||||
self._script_parameter_text = self.script_parameter_text(names_of_input_args, self._input_params,
|
||||
list(self._output_params), self._output_params)
|
||||
|
||||
def script_parameter_text(self, in_names: List[str], in_types: dict, out_names: List[str], out_types: dict) -> str:
|
||||
if not in_names and not out_names:
|
||||
self._script = "\nfrom pandas import DataFrame\n" + self._script
|
||||
return super().script_parameter_text(in_names, in_types, out_names, out_types)
|
||||
|
||||
@staticmethod
|
||||
def get_function_calling_text(func: Callable, names_of_args: List[str]):
|
||||
# For a function named foo with signature def foo(arg1, arg2, arg3)...
|
||||
# kwargs_text is 'arg1=arg1, arg2=arg2, arg3=arg3'
|
||||
kwargs_text = ", ".join("{}={}".format(name, name) for name in names_of_args)
|
||||
# returns 'result = foo(arg1=arg1, arg2=arg2, arg3=arg3)'
|
||||
return "result = " + func.__name__ + "({})".format(kwargs_text)
|
||||
|
||||
# Convert results to Output data frame and Output parameters
|
||||
def get_ending(self, output_params: dict, output_data_set: str):
|
||||
res = """
|
||||
if type(result) == DataFrame:
|
||||
{result_val}""".format(result_val="{out_df} = result".format(out_df=output_data_set
|
||||
if output_data_set is not None else "OutputDataSet"))
|
||||
|
||||
if len(output_params) > 0 or output_data_set is not None:
|
||||
res += """
|
||||
elif type(result) == dict:
|
||||
{output_params}
|
||||
elif result is not None:
|
||||
raise TypeError("Must return a DataFrame or dictionary with output parameters or None")
|
||||
""".format(output_params=self.get_output_params(output_params) if len(output_params) > 0 else "pass")
|
||||
return res
|
||||
|
||||
@staticmethod
|
||||
def get_output_params(output_params: dict):
|
||||
return "\n ".join(['{name} = result["{name}"]'.format(name=name) for name in list(output_params)])
|
||||
|
||||
|
||||
class ExecuteStoredProcedureBuilder(SQLBuilder):
|
||||
|
||||
def __init__(self, name: str, **kwargs):
|
||||
self._name = name
|
||||
self._kwargs = kwargs
|
||||
|
||||
# Execute the query: exec sproc @var1 = val1, @var2 = val2...
|
||||
# Does not work with output parameters
|
||||
@property
|
||||
def base_script(self) -> str:
|
||||
parameters = ", ".join(["@{name} = {value}".format(name=name, value=self.format_value(self._kwargs[name]))
|
||||
for name in self._kwargs])
|
||||
return """exec {} {}""".format(self._name, parameters)
|
||||
|
||||
@staticmethod
|
||||
def format_value(value) -> str:
|
||||
if isinstance(value, str):
|
||||
return "'{}'".format(value)
|
||||
elif isinstance(value, bool):
# bool is a subclass of int, so it must be checked before int
return str(int(value))
elif isinstance(value, (int, float)):
return str(value)
|
||||
else:
|
||||
raise ValueError("Parameter type {} not supported.".format(str(type(value))))
|
||||
|
||||
|
||||
class DropStoredProcedureBuilder(SQLBuilder):
|
||||
|
||||
def __init__(self, name: str):
|
||||
self._name = name
|
||||
|
||||
@property
|
||||
def base_script(self) -> str:
|
||||
return """
|
||||
drop procedure {}
|
||||
""".format(self._name)
|
|
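The argument hand-off used by `_build_wrapper_python_script` relies on a hex round trip: arguments are serialized, hex-encoded, embedded as string literals in the generated script, and decoded on the other side. The sketch below illustrates that idea under stated assumptions: `pickle` stands in for `dill` so that only the standard library is needed, and a local `exec` stands in for execution inside SQL Server. The argument values are made up.

```python
import pickle

# Example arguments (illustrative values only).
pos_args = (1, "two")
kwargs = {"val1": "blah", "val2": 5}

# Hex strings contain only 0-9 and a-f, so they are safe inside a quoted literal.
pos_args_hex = pickle.dumps(pos_args).hex()
kwargs_hex = pickle.dumps(kwargs).hex()

# Generated script text, analogous to the template in the builder.
script = (
    "import pickle\n"
    'pos_args = pickle.loads(bytes.fromhex("{pos}"))\n'
    'args = pickle.loads(bytes.fromhex("{kw}"))\n'
).format(pos=pos_args_hex, kw=kwargs_hex)

# Local stand-in for the remote interpreter: run the script in a fresh namespace.
namespace = {}
exec(script, namespace)
print(namespace["pos_args"], namespace["args"])
```

Encoding to hex avoids quoting and escaping problems that raw serialized bytes would cause when pasted into script text, at the cost of doubling the payload size.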
@ -0,0 +1,209 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from typing import Callable
|
||||
import dill
|
||||
from pandas import DataFrame
|
||||
|
||||
from .connectioninfo import ConnectionInfo
|
||||
from .sqlqueryexecutor import execute_query, execute_raw_query
|
||||
from .sqlbuilder import SpeesBuilder, SpeesBuilderFromFunction, StoredProcedureBuilder, \
|
||||
ExecuteStoredProcedureBuilder, DropStoredProcedureBuilder
|
||||
from .sqlbuilder import StoredProcedureBuilderFromFunction, RETURN_COLUMN_NAME
|
||||
|
||||
|
||||
class SQLPythonExecutor:
|
||||
|
||||
def __init__(self, connection_info: ConnectionInfo):
|
||||
self._connection_info = connection_info
|
||||
|
||||
def execute_function_in_sql(self,
|
||||
func: Callable, *args,
|
||||
input_data_query: str = "",
|
||||
**kwargs):
|
||||
"""Execute a function in SQL Server.
|
||||
|
||||
:param func: function to execute. NOTE: This function is shipped to SQL as text.
Functions should be self-contained and import statements should be inline.
|
||||
:param args: positional args to pass to the function to execute.
|
||||
:param input_data_query: sql query to fill the first argument of the function. The argument gets the result of
|
||||
the query as a pandas DataFrame (uses the @input_data_1 parameter in sp_execute_external_script)
|
||||
:param kwargs: keyword arguments to pass to the function to execute.
|
||||
:return: value returned by func
|
||||
|
||||
>>> from sqlmlutils import ConnectionInfo, SQLPythonExecutor
|
||||
>>>
|
||||
>>> def foo(val1, val2):
|
||||
>>> import math
|
||||
>>> print(val1)
|
||||
>>> return [math.cos(val2), math.cos(val2)]
|
||||
>>>
|
||||
>>> sqlpy = SQLPythonExecutor(ConnectionInfo("localhost", database="AirlineTestDB"))
|
||||
>>> ret = sqlpy.execute_function_in_sql(foo, val1="blah", val2=5)
|
||||
blah
|
||||
>>> print(ret)
|
||||
[0.28366218546322625, 0.28366218546322625]
|
||||
"""
|
||||
rows = execute_query(SpeesBuilderFromFunction(func, input_data_query, *args, **kwargs), self._connection_info)
|
||||
return self._get_results(rows)
|
||||
|
||||
def execute_script_in_sql(self,
|
||||
path_to_script: str,
|
||||
input_data_query: str = ""):
|
||||
"""Execute a script in SQL Server.
|
||||
|
||||
:param path_to_script: file path to Python script to execute.
|
||||
:param input_data_query: sql query to fill InputDataSet global variable with.
|
||||
(@input_data_1 parameter in sp_execute_external_script)
|
||||
:return: None
|
||||
|
||||
"""
|
||||
try:
|
||||
with open(path_to_script, 'r') as script_file:
|
||||
content = script_file.read()
|
||||
print("File exists, using " + path_to_script)
|
||||
except FileNotFoundError:
|
||||
raise FileNotFoundError("File does not exist!")
|
||||
execute_query(SpeesBuilder(content, input_data_query=input_data_query), connection=self._connection_info)
|
||||
|
||||
def execute_sql_query(self,
|
||||
sql_query: str):
|
||||
"""Execute a sql query in SQL Server.
|
||||
|
||||
:param sql_query: the sql query to execute in the server
|
||||
:return: table returned by the sql_query
|
||||
"""
|
||||
rows = execute_raw_query(conn=self._connection_info, query=sql_query)
|
||||
df = DataFrame(rows)
|
||||
|
||||
# _mssql's execute_query() returns duplicate keys for indexing, we remove them because they are extraneous
|
||||
for i in range(len(df.columns)):
|
||||
try:
|
||||
del df[i]
|
||||
except KeyError:
|
||||
pass
|
||||
|
||||
return df
|
||||
|
||||
def create_sproc_from_function(self, name: str, func: Callable,
|
||||
input_params: dict = None, output_params: dict = None):
|
||||
"""Create a SQL Server stored procedure based on a Python function.
|
||||
NOTE: Type annotations are needed either in the function definition or in the input_params dictionary
|
||||
WARNING: Output parameters can be used when creating the stored procedure, but Stored Procedures with
|
||||
output parameters other than a single DataFrame cannot be executed with sqlmlutils
|
||||
|
||||
:param name: name of stored procedure.
|
||||
:param func: function used to define stored procedure. parameters to the function are used to define parameters
|
||||
to the stored procedure. type annotations of the parameters are used to infer SQL types of parameters to the
|
||||
stored procedure. currently supported type annotations are "str", "int", "float", and "DataFrame".
|
||||
:param input_params: optional dictionary of type annotations for each argument to func;
|
||||
if func has type annotations this is not necessary. If both are provided, they must match
|
||||
:param output_params: optional dictionary of type annotations for each output parameter
|
||||
:return: True if creation succeeded
|
||||
|
||||
>>> from sqlmlutils import ConnectionInfo, SQLPythonExecutor
|
||||
>>>
|
||||
>>> def foo(val1: int, val2: str):
|
||||
>>> from pandas import DataFrame
|
||||
>>> print(val2)
|
||||
>>> df = DataFrame()
|
||||
>>> df["col1"] = [val1, val1, val1]
|
||||
>>> return df
|
||||
>>>
|
||||
>>> sqlpy = SQLPythonExecutor(ConnectionInfo("localhost", database="AutoRegressTestDB"))
|
||||
>>> sqlpy.create_sproc_from_function("MyStoredProcedure", foo)
|
||||
>>>
|
||||
>>> # You can execute the procedure in the usual way from sql: exec MyStoredProcedure 5, 'bar'
|
||||
>>> # You can also call the stored procedure from Python
|
||||
>>> ret = sqlpy.execute_sproc(name="MyStoredProcedure", val1=5, val2="bar")
|
||||
>>> sqlpy.drop_sproc(name="MyStoredProcedure")
|
||||
|
||||
"""
|
||||
if input_params is None:
|
||||
input_params = {}
|
||||
if output_params is None:
|
||||
output_params = {}
|
||||
# Save the stored procedure in database
|
||||
execute_query(StoredProcedureBuilderFromFunction(name, func,
|
||||
input_params, output_params), self._connection_info)
|
||||
return True
|
||||
|
||||
def create_sproc_from_script(self, name: str, path_to_script: str,
|
||||
input_params: dict = None, output_params: dict = None):
|
||||
"""Create a SQL Server stored procedure based on a Python script
|
||||
|
||||
:param name: name of stored procedure.
|
||||
:param path_to_script: file path to Python script to create a sproc from.
|
||||
:param input_params: optional dictionary of type annotations for inputs in the script
|
||||
:param output_params: optional dictionary of type annotations for each output variable
|
||||
:return: True if creation succeeded
|
||||
|
||||
>>> from sqlmlutils import ConnectionInfo, SQLPythonExecutor
|
||||
>>>
|
||||
>>>
|
||||
>>> sqlpy = SQLPythonExecutor(ConnectionInfo("localhost", database="AutoRegressTestDB"))
|
||||
>>> sqlpy.create_sproc_from_script(name="script_sproc", path_to_script="path/to/script")
|
||||
>>>
|
||||
>>> # This will execute the script in sql; with no inputs or outputs it will just run and return nothing
|
||||
>>> sqlpy.execute_sproc(name="script_sproc")
|
||||
>>> sqlpy.drop_sproc(name="script_sproc")
|
||||
|
||||
"""
|
||||
if input_params is None:
|
||||
input_params = {}
|
||||
if output_params is None:
|
||||
output_params = {}
|
||||
# Save the stored procedure in database
|
||||
try:
|
||||
with open(path_to_script, 'r') as script_file:
|
||||
content = script_file.read()
|
||||
print("File exists, using " + path_to_script)
|
||||
except FileNotFoundError:
|
||||
raise FileNotFoundError("File does not exist!")
|
||||
|
||||
execute_query(StoredProcedureBuilder(name, content,
|
||||
input_params, output_params), self._connection_info)
|
||||
return True
|
||||
|
||||
def check_sproc(self, name: str) -> bool:
|
||||
"""Check to see if a SQL Server stored procedure exists in the database.
|
||||
|
||||
>>> from sqlmlutils import ConnectionInfo, SQLPythonExecutor
|
||||
>>>
|
||||
>>> sqlpy = SQLPythonExecutor(ConnectionInfo("localhost", database="AutoRegressTestDB"))
|
||||
>>> if sqlpy.check_sproc("MyStoredProcedure"):
|
||||
>>> print("MyStoredProcedure exists")
|
||||
>>> else:
|
||||
>>> print("MyStoredProcedure does not exist")
|
||||
|
||||
:param name: name of stored procedure.
|
||||
:return: True if the stored procedure exists in the database, False otherwise
|
||||
"""
|
||||
check_query = "SELECT OBJECT_ID (%s, N'P')"
|
||||
rows = execute_raw_query(conn=self._connection_info, query=check_query, params=name)
|
||||
return rows[0][0] is not None
|
||||
|
||||
def execute_sproc(self, name: str, **kwargs) -> DataFrame:
|
||||
"""Call a stored procedure on a SQL Server database.
|
||||
WARNING: Output parameters can be used when creating the stored procedure, but Stored Procedures with
|
||||
output parameters other than a single DataFrame cannot be executed with sqlmlutils
|
||||
|
||||
:param name: name of stored procedure.
|
||||
:param kwargs: keyword arguments to pass to stored procedure
|
||||
:return: DataFrame representing the output data set of the stored procedure (or empty)
|
||||
"""
|
||||
return DataFrame(execute_query(ExecuteStoredProcedureBuilder(name, **kwargs), self._connection_info))
|
||||
|
||||
def drop_sproc(self, name: str):
|
||||
"""Drop a SQL Server stored procedure if it exists.
|
||||
|
||||
:param name: name of stored procedure.
|
||||
:return: None
|
||||
"""
|
||||
if self.check_sproc(name):
|
||||
execute_query(DropStoredProcedureBuilder(name), self._connection_info)
|
||||
|
||||
@staticmethod
|
||||
def _get_results(rows):
|
||||
hexstring = rows[0][RETURN_COLUMN_NAME]
|
||||
return dill.loads(bytes.fromhex(hexstring))
|
|
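`_get_results` above reads a hex string from the return column, converts it back to bytes, and deserializes it with `dill`. The round trip can be sketched as follows. This is a minimal illustration that assumes the standard-library `pickle` as a stand-in for `dill` so that it runs without extra packages; the helper names `to_hex` and `from_hex` are hypothetical and not part of sqlmlutils.

```python
import pickle


def to_hex(obj):
    """Serialize obj and return the bytes as a hex string (plain text, safe for a result column)."""
    return pickle.dumps(obj).hex()


def from_hex(hexstring):
    """Inverse of to_hex: hex string -> bytes -> Python object."""
    return pickle.loads(bytes.fromhex(hexstring))


# Round trip: the object survives being carried as text.
payload = to_hex({"answer": [1, 2, 3]})
restored = from_hex(payload)
```

Hex encoding keeps the serialized bytes representable as plain text in a result set, at the cost of doubling their size.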
@ -0,0 +1,90 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import _mssql
|
||||
from .connectioninfo import ConnectionInfo
|
||||
from .sqlbuilder import SQLBuilder
|
||||
|
||||
"""This module is used to actually execute sql queries. It uses the pymssql module under the hood.
|
||||
|
||||
It is mostly set up to work with SQLBuilder objects as defined in sqlbuilder.
|
||||
"""
|
||||
|
||||
|
||||
# This function is best used to execute a one-off query
|
||||
# (the SQL connection is closed after the query completes).
|
||||
# If you need to keep the SQL connection open in between queries, you can use the SQLQueryExecutor class below.
|
||||
def execute_query(builder, connection: ConnectionInfo):
|
||||
with SQLQueryExecutor(connection=connection) as executor:
|
||||
return executor.execute(builder)
|
||||
|
||||
|
||||
def execute_raw_query(conn: ConnectionInfo, query, params=()):
|
||||
with SQLQueryExecutor(connection=conn) as executor:
|
||||
return executor.execute_query(query, params)
|
||||
|
||||
|
||||
def _sql_msg_handler(msgstate, severity, srvname, procname, line, msgtext):
|
||||
print(msgtext.decode())
|
||||
|
||||
|
||||
class SQLQueryExecutor:
|
||||
|
||||
"""SQLQueryExecutor objects keep a SQL connection open in order to execute one or more queries.
|
||||
|
||||
This class implements the basic context manager paradigm.
|
||||
"""
|
||||
|
||||
def __init__(self, connection: ConnectionInfo):
|
||||
self._connection = connection
|
||||
|
||||
def execute(self, builder: SQLBuilder):
|
||||
try:
|
||||
self._mssqlconn.set_msghandler(_sql_msg_handler)
|
||||
self._mssqlconn.execute_query(builder.base_script, builder.params)
|
||||
return [row for row in self._mssqlconn]
|
||||
except Exception as e:
|
||||
raise RuntimeError("Error in SQL Execution: {error}".format(error=str(e))) from e
|
||||
|
||||
def execute_query(self, query, params):
|
||||
self._mssqlconn.execute_query(query, params)
|
||||
return [row for row in self._mssqlconn]
|
||||
|
||||
def __enter__(self):
|
||||
self._mssqlconn = _mssql.connect(server=self._connection.server,
|
||||
user=self._connection.uid,
|
||||
password=self._connection.pwd,
|
||||
database=self._connection.database)
|
||||
self._mssqlconn.set_msghandler(_sql_msg_handler)
|
||||
return self
|
||||
|
||||
def __exit__(self, exception_type, exception_value, traceback):
|
||||
self._mssqlconn.close()
|
||||
|
||||
|
||||
class SQLTransaction:
|
||||
|
||||
def __init__(self, executor: SQLQueryExecutor, name):
|
||||
self._executor = executor
|
||||
self._name = name
|
||||
|
||||
def begin(self):
|
||||
query = """
|
||||
declare @transactionname varchar(MAX) = %s;
|
||||
begin tran @transactionname;
|
||||
"""
|
||||
self._executor.execute_query(query, self._name)
|
||||
|
||||
def rollback(self):
|
||||
query = """
|
||||
declare @transactionname varchar(MAX) = %s;
|
||||
rollback tran @transactionname;
|
||||
"""
|
||||
self._executor.execute_query(query, self._name)
|
||||
|
||||
def commit(self):
|
||||
query = """
|
||||
declare @transactionname varchar(MAX) = %s;
|
||||
commit tran @transactionname;
|
||||
"""
|
||||
self._executor.execute_query(query, self._name)
|
|
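`SQLTransaction` above exposes `begin`, `rollback`, and `commit` as separate calls, so the caller decides when to roll back. One way to tie those calls to exception handling is a small context manager. The sketch below is only an illustration: `RecordingExecutor` is a hypothetical stand-in for `SQLQueryExecutor` that records statements instead of contacting a server, `transaction` is a hypothetical helper, and the SQL strings are simplified.

```python
from contextlib import contextmanager


class RecordingExecutor:
    """Hypothetical stand-in for SQLQueryExecutor: records queries instead of running them."""

    def __init__(self):
        self.calls = []

    def execute_query(self, query, params):
        self.calls.append((query, params))


@contextmanager
def transaction(executor, name):
    """Begin a named transaction; commit on success, roll back if the body raises."""
    executor.execute_query("begin tran", name)
    try:
        yield
    except Exception:
        executor.execute_query("rollback tran", name)
        raise
    else:
        executor.execute_query("commit tran", name)


# Successful body: begin followed by commit.
ok_executor = RecordingExecutor()
with transaction(ok_executor, "t1"):
    pass

# Failing body: begin followed by rollback, and the exception still propagates.
failed_executor = RecordingExecutor()
try:
    with transaction(failed_executor, "t2"):
        raise ValueError("boom")
except ValueError:
    pass
```

The rollback path re-raises the original exception, so a failure inside the block is never silently swallowed.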
@ -0,0 +1,36 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from pandas import DataFrame
|
||||
|
||||
from .connectioninfo import ConnectionInfo
|
||||
from .sqlqueryexecutor import execute_query
|
||||
from .sqlbuilder import ExecuteStoredProcedureBuilder, DropStoredProcedureBuilder
|
||||
|
||||
|
||||
class StoredProcedure:
|
||||
"""Represents a SQL Server stored procedure."""
|
||||
|
||||
def __init__(self, name: str, connection: ConnectionInfo):
|
||||
"""Instantiates a StoredProcedure. Not meant to be called directly; get handles to stored
|
||||
procedures using get_sproc.
|
||||
|
||||
:param name: name of stored procedure.
|
||||
"""
|
||||
self._name = name
|
||||
self._connection = connection
|
||||
|
||||
def call(self, **kwargs) -> DataFrame:
|
||||
"""Call a stored procedure on a SQL Server database.
|
||||
|
||||
:param kwargs: keyword arguments to pass to stored procedure
|
||||
:return: DataFrame representing the output data set of the stored procedure (or empty)
|
||||
"""
|
||||
return DataFrame(execute_query(ExecuteStoredProcedureBuilder(self._name, **kwargs), self._connection))
|
||||
|
||||
def drop(self):
|
||||
"""Drop a SQL Server stored procedure.
|
||||
|
||||
:return: None
|
||||
"""
|
||||
execute_query(DropStoredProcedureBuilder(self._name), self._connection)
|
|
@ -0,0 +1,168 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import pytest
|
||||
from contextlib import redirect_stdout, redirect_stderr
|
||||
import io
|
||||
import os
|
||||
|
||||
from sqlmlutils import SQLPythonExecutor
|
||||
from sqlmlutils import ConnectionInfo
|
||||
from pandas import DataFrame
|
||||
|
||||
current_dir = os.path.dirname(__file__)
|
||||
script_dir = os.path.join(current_dir, "scripts")
|
||||
connection = ConnectionInfo(server="localhost", database="AirlineTestDB")
|
||||
sqlpy = SQLPythonExecutor(connection)
|
||||
|
||||
|
||||
def test_with_named_args():
|
||||
def func_with_args(arg1, arg2):
|
||||
print(arg1)
|
||||
return arg2
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stderr(output), redirect_stdout(output):
|
||||
res = sqlpy.execute_function_in_sql(func_with_args, arg1="str1", arg2="str2")
|
||||
|
||||
assert "str1" in output.getvalue()
|
||||
assert res == "str2"
|
||||
|
||||
|
||||
def test_with_order_args():
|
||||
def func_with_order_args(arg1: int, arg2: float):
|
||||
return arg1 / arg2
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_with_order_args, 2, 3.0)
|
||||
assert res == 2 / 3.0
|
||||
res = sqlpy.execute_function_in_sql(func_with_order_args, 3.0, 2)
|
||||
assert res == 3 / 2.0
|
||||
|
||||
|
||||
def test_return():
|
||||
def func_with_return():
|
||||
return "returned!"
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_with_return)
|
||||
assert res == func_with_return()
|
||||
|
||||
|
||||
@pytest.mark.skip(reason="Do we capture warnings?")
|
||||
def test_warning():
|
||||
def func_with_warning():
|
||||
import warnings
|
||||
warnings.warn("WARNING!")
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_with_warning)
|
||||
assert res is None
|
||||
|
||||
|
||||
def test_with_internal_func():
|
||||
def func_with_internal_func():
|
||||
def func2(arg1, arg2):
|
||||
return arg1 + arg2
|
||||
|
||||
return func2("Suc", "cess")
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_with_internal_func)
|
||||
assert res == "Success"
|
||||
|
||||
|
||||
@pytest.mark.skip(reason="Cannot currently return a function")
|
||||
def test_return_func():
|
||||
def func2(arg1, arg2):
|
||||
return arg1 + arg2
|
||||
|
||||
def func_returns_func():
|
||||
def func2(arg1, arg2):
|
||||
return arg1 + arg2
|
||||
|
||||
return func2
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_returns_func)
|
||||
assert res == func2
|
||||
|
||||
|
||||
@pytest.mark.skip(reason="Cannot currently return a function outside of environment")
|
||||
def test_return_func_outside_environment():
|
||||
def func2(arg1, arg2):
|
||||
return arg1 + arg2
|
||||
|
||||
def func_returns_func():
|
||||
return func2
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_returns_func)
|
||||
assert res == func2
|
||||
|
||||
|
||||
def test_with_no_args():
|
||||
def func_with_no_args():
|
||||
return
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_with_no_args)
|
||||
|
||||
assert res is None
|
||||
|
||||
|
||||
def test_with_data_frame():
|
||||
def func_return_df(in_df):
|
||||
return in_df
|
||||
|
||||
res = sqlpy.execute_function_in_sql(func_return_df,
|
||||
input_data_query="SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
assert isinstance(res, DataFrame)
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
|
||||
def test_with_variables():
|
||||
def func_with_variables(s):
|
||||
print(s)
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stderr(output), redirect_stdout(output):
|
||||
sqlpy.execute_function_in_sql(func_with_variables, s="Hello")
|
||||
|
||||
assert "Hello" in output.getvalue()
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stderr(output), redirect_stdout(output):
|
||||
var_s = "World"
|
||||
sqlpy.execute_function_in_sql(func_with_variables, s=var_s)
|
||||
|
||||
assert "World" in output.getvalue()
|
||||
|
||||
|
||||
def test_execute_query():
|
||||
res = sqlpy.execute_sql_query("SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
assert isinstance(res, DataFrame)
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
|
||||
def test_execute_script():
|
||||
path = os.path.join(script_dir, "test_script.py")
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stderr(output), redirect_stdout(output):
|
||||
res = sqlpy.execute_script_in_sql(path_to_script=path,
|
||||
input_data_query="SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
assert "HelloWorld" in output.getvalue()
|
||||
assert res is None
|
||||
|
||||
with pytest.raises(FileNotFoundError):
|
||||
sqlpy.execute_script_in_sql(path_to_script="NonexistentScriptPath",
|
||||
input_data_query="SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
|
||||
def test_stderr():
|
||||
def print_to_stderr():
|
||||
import sys
|
||||
sys.stderr.write("Error!")
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stderr(output), redirect_stdout(output):
|
||||
sqlpy.execute_function_in_sql(print_to_stderr)
|
||||
|
||||
assert "Error!" in output.getvalue()
|
|
@ -0,0 +1,13 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
from sqlmlutils.sqlqueryexecutor import execute_raw_query
|
||||
|
||||
|
||||
def _get_sql_package_table(connection):
|
||||
query = "select * from sys.external_libraries"
|
||||
return execute_raw_query(connection, query)
|
||||
|
||||
|
||||
def _get_package_names_list(connection):
|
||||
return {dic['name']: dic['scope'] for dic in _get_sql_package_table(connection)}
|
|
@ -0,0 +1,266 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import io
|
||||
import os
|
||||
import subprocess
|
||||
import tempfile
|
||||
from contextlib import redirect_stdout
|
||||
|
||||
import pytest
|
||||
|
||||
import sqlmlutils
|
||||
from sqlmlutils import SQLPackageManager, SQLPythonExecutor
|
||||
from package_helper_functions import _get_sql_package_table, _get_package_names_list
|
||||
from sqlmlutils.packagemanagement.scope import Scope
|
||||
from sqlmlutils.packagemanagement.pipdownloader import PipDownloader
|
||||
|
||||
connection = sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB")
|
||||
path_to_packages = os.path.join((os.path.dirname(os.path.realpath(__file__))), "scripts", "test_packages")
|
||||
_SUCCESS_TOKEN = "SUCCESS"
|
||||
|
||||
pyexecutor = SQLPythonExecutor(connection)
|
||||
pkgmanager = SQLPackageManager(connection)
|
||||
|
||||
originals = _get_sql_package_table(connection)
|
||||
|
||||
def check_package(package_name: str, exists: bool, class_to_check: str = ""):
|
||||
if exists:
|
||||
themodule = __import__(package_name)
|
||||
assert themodule is not None
|
||||
assert getattr(themodule, class_to_check) is not None
|
||||
else:
|
||||
import pytest
|
||||
with pytest.raises(Exception):
|
||||
__import__(package_name)
|
||||
|
||||
|
||||
def _execute_sql(script: str) -> bool:
|
||||
tmpfile = tempfile.NamedTemporaryFile(delete=False)
|
||||
tmpfile.write(script.encode())
|
||||
tmpfile.close()
|
||||
command = ["sqlcmd", "-d", "AirlineTestDB", "-i", tmpfile.name]
|
||||
try:
|
||||
output = subprocess.check_output(command, stderr=subprocess.STDOUT).decode()
|
||||
return _SUCCESS_TOKEN in output
|
||||
finally:
|
||||
os.remove(tmpfile.name)
|
||||
|
||||
|
||||
def _drop(package_name: str, ddl_name: str):
|
||||
pkgmanager.uninstall(package_name)
|
||||
pyexecutor.execute_function_in_sql(check_package, package_name=package_name, exists=False)
|
||||
|
||||
|
||||
def _create(module_name: str, package_file: str, class_to_check: str, drop: bool = True):
|
||||
pyexecutor.execute_function_in_sql(check_package, package_name=module_name, exists=False)
|
||||
pkgmanager.install(package_file)
|
||||
pyexecutor.execute_function_in_sql(check_package, package_name=module_name, exists=True, class_to_check=class_to_check)
|
||||
if drop:
|
||||
_drop(package_name=module_name, ddl_name=module_name)
|
||||
|
||||
|
||||
def _remove_all_new_packages(manager):
|
||||
libs = {dic['external_library_id']: (dic['name'], dic['scope']) for dic in _get_sql_package_table(connection)}
|
||||
original_libs = {dic['external_library_id']: (dic['name'], dic['scope']) for dic in originals}
|
||||
|
||||
for lib in libs:
|
||||
pkg, sc = libs[lib]
|
||||
if lib not in original_libs:
|
||||
print("uninstalling " + str(lib))
|
||||
if sc:
|
||||
manager.uninstall(pkg, scope=Scope.private_scope())
|
||||
else:
|
||||
manager.uninstall(pkg, scope=Scope.public_scope())
|
||||
else:
|
||||
if sc != original_libs[lib][1]:
|
||||
if sc:
|
||||
manager.uninstall(pkg, scope=Scope.private_scope())
|
||||
else:
|
||||
manager.uninstall(pkg, scope=Scope.public_scope())
|
||||
|
||||
|
||||
packages = ["absl-py==0.1.13", "astor==0.6.2", "bleach==1.5.0", "cryptography==2.2.2",
|
||||
"html5lib==1.0.1", "Markdown==2.6.11", "numpy==1.14.3", "termcolor==1.1.0", "webencodings==0.5.1"]
|
||||
|
||||
for package in packages:
|
||||
pipdownloader = PipDownloader(connection, path_to_packages, package)
|
||||
pipdownloader.download_single()
|
||||
|
||||
def test_install_basic_zip_package():
|
||||
package = os.path.join(path_to_packages, "testpackageA-0.0.1.zip")
|
||||
module_name = "testpackageA"
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
_create(module_name=module_name, package_file=package, class_to_check="ClassA")
|
||||
|
||||
|
||||
def test_install_basic_zip_package_different_name():
|
||||
package = os.path.join(path_to_packages, "testpackageA-0.0.1.zip")
|
||||
module_name = "testpackageA"
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
_create(module_name=module_name, package_file=package, class_to_check="ClassA")
|
||||
|
||||
|
||||
def test_install_whl_files():
|
||||
packages = ["webencodings-0.5.1-py2.py3-none-any.whl", "html5lib-1.0.1-py2.py3-none-any.whl",
|
||||
"astor-0.6.2-py2.py3-none-any.whl"]
|
||||
module_names = ["webencodings", "html5lib", "astor"]
|
||||
classes_to_check = ["LABELS", "parse", "code_gen"]
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
for package, module, class_to_check in zip(packages, module_names, classes_to_check):
|
||||
full_package = os.path.join(path_to_packages, package)
|
||||
_create(module_name=module, package_file=full_package, class_to_check=class_to_check, drop=False)
|
||||
|
||||
for name in module_names:
|
||||
_drop(package_name=name, ddl_name=name)
|
||||
|
||||
|
||||
def test_install_targz_files():
|
||||
packages = ["termcolor-1.1.0.tar.gz"]
|
||||
module_names = ["termcolor"]
|
||||
ddl_names = ["termcolor"]
|
||||
classes_to_check = ["colored"]
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
for package, module, ddl_name, class_to_check in zip(packages, module_names, ddl_names, classes_to_check):
|
||||
full_package = os.path.join(path_to_packages, package)
|
||||
_create(module_name=module, package_file=full_package, class_to_check=class_to_check)
|
||||
|
||||
|
||||
def test_install_bad_package_badzipfile():
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
with tempfile.TemporaryDirectory() as temporary_directory:
|
||||
badpackagefile = os.path.join(temporary_directory, "badpackageA-0.0.1.zip")
|
||||
with open(badpackagefile, "w") as f:
|
||||
f.write("asdasdasdascsacsadsadas")
|
||||
with pytest.raises(Exception):
|
||||
pkgmanager.install(badpackagefile)
|
||||
|
||||
assert "badpackageA" not in _get_package_names_list(connection)
|
||||
|
||||
query = """
|
||||
declare @val int;
|
||||
set @val = (select count(*) from sys.external_libraries where name='badpackageA')
|
||||
if @val = 0
|
||||
print('{}')
|
||||
""".format(_SUCCESS_TOKEN)
|
||||
|
||||
assert _execute_sql(query)
|
||||
|
||||
|
||||
def test_package_already_exists_on_sql_table():
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
package = os.path.join(path_to_packages, "testpackageA-0.0.1.zip")
|
||||
pkgmanager.install(package)
|
||||
|
||||
# Without upgrade
|
||||
output = io.StringIO()
|
||||
with redirect_stdout(output):
|
||||
pkgmanager.install(package, upgrade=False)
|
||||
assert "exists on server. Set upgrade to True to force upgrade." in output.getvalue()
|
||||
|
||||
# With upgrade
|
||||
package = os.path.join(path_to_packages, "testpackageA-0.0.2.zip")
|
||||
pkgmanager.install(package, upgrade=True)
|
||||
|
||||
def check_version():
|
||||
import testpackageA
|
||||
return testpackageA.__version__
|
||||
|
||||
version = pyexecutor.execute_function_in_sql(check_version)
|
||||
assert version == "0.0.2"
|
||||
|
||||
pkgmanager.uninstall("testpackageA")
|
||||
|
||||
|
||||
def test_upgrade_parameter():
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
# Get sql packages
|
||||
originalsqlpkgs = _get_sql_package_table(connection)
|
||||
|
||||
pkg = os.path.join(path_to_packages, "cryptography-2.2.2-cp35-cp35m-win_amd64.whl")
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stdout(output):
|
||||
pkgmanager.install(pkg, upgrade=False)
|
||||
assert "exists on server. Set upgrade to True to force upgrade." in output.getvalue()
|
||||
|
||||
# Assert no additional packages were installed
|
||||
|
||||
sqlpkgs = _get_sql_package_table(connection)
|
||||
assert len(sqlpkgs) == len(originalsqlpkgs)
|
||||
|
||||
#################
|
||||
|
||||
def check_version():
|
||||
import cryptography as cp
|
||||
return cp.__version__
|
||||
|
||||
oldversion = pyexecutor.execute_function_in_sql(check_version)
|
||||
|
||||
pkgmanager.install(pkg, upgrade=True)
|
||||
|
||||
sqlpkgs = _get_sql_package_table(connection)
|
||||
assert len(sqlpkgs) == len(originalsqlpkgs) + 2
|
||||
|
||||
version = pyexecutor.execute_function_in_sql(check_version)
|
||||
assert version == "2.2.2"
|
||||
assert version > oldversion
|
||||
|
||||
pkgmanager.uninstall("cryptography")
|
||||
pkgmanager.uninstall("asn1crypto")
|
||||
|
||||
sqlpkgs = _get_sql_package_table(connection)
|
||||
assert len(sqlpkgs) == len(originalsqlpkgs)
|
||||
|
||||
|
||||
# TODO: more tests for drop external library
|
||||
|
||||
def test_scope():
|
||||
|
||||
_remove_all_new_packages(pkgmanager)
|
||||
|
||||
package = os.path.join(path_to_packages, "testpackageA-0.0.1.zip")
|
||||
|
||||
def get_location():
|
||||
import testpackageA
|
||||
return testpackageA.__file__
|
||||
|
||||
_revotesterconnection = sqlmlutils.ConnectionInfo(server="localhost",
|
||||
database="AirlineTestDB",
|
||||
uid="Tester",
|
||||
pwd="FakeT3sterPwd!")
|
||||
revopkgmanager = SQLPackageManager(_revotesterconnection)
|
||||
revoexecutor = SQLPythonExecutor(_revotesterconnection)
|
||||
|
||||
revopkgmanager.install(package, scope=Scope.private_scope())
|
||||
private_location = revoexecutor.execute_function_in_sql(get_location)
|
||||
|
||||
pkg_name = "testpackageA"
|
||||
|
||||
pyexecutor.execute_function_in_sql(check_package, package_name=pkg_name, exists=False)
|
||||
|
||||
revopkgmanager.uninstall(pkg_name, scope=Scope.private_scope())
|
||||
|
||||
revopkgmanager.install(package, scope=Scope.public_scope())
|
||||
public_location = revoexecutor.execute_function_in_sql(get_location)
|
||||
|
||||
assert private_location != public_location
|
||||
pyexecutor.execute_function_in_sql(check_package, package_name=pkg_name, exists=True, class_to_check='ClassA')
|
||||
|
||||
revopkgmanager.uninstall(pkg_name, scope=Scope.public_scope())
|
||||
|
||||
revoexecutor.execute_function_in_sql(check_package, package_name=pkg_name, exists=False)
|
||||
pyexecutor.execute_function_in_sql(check_package, package_name=pkg_name, exists=False)
|
|
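The upgrade tests above compare version strings directly with `>`, which is a lexicographic comparison and can give the wrong answer once a component reaches two digits. A minimal sketch of a numeric comparison follows, assuming versions made only of dot-separated integers; `version_key` is a hypothetical helper, not part of sqlmlutils or the test suite.

```python
def version_key(version):
    """Turn a dotted numeric version such as '2.2.2' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


# String comparison orders '2.10.0' before '2.9.0' because '1' sorts before '9'.
lexicographic_says_newer = "2.10.0" > "2.9.0"

# Tuple comparison works component by component on the numeric values.
numeric_says_newer = version_key("2.10.0") > version_key("2.9.0")
```

Versions with pre-release or local suffixes (for example `1.0rc1`) would need a fuller parser than this sketch provides.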
@ -0,0 +1,232 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import sqlmlutils
|
||||
import os
|
||||
import pytest
|
||||
from sqlmlutils import SQLPythonExecutor, SQLPackageManager
|
||||
from sqlmlutils.packagemanagement.scope import Scope
|
||||
from package_helper_functions import _get_sql_package_table, _get_package_names_list
|
||||
import io
|
||||
from contextlib import redirect_stdout
|
||||
|
||||
|
||||
def _drop_all_ddl_packages(conn):
|
||||
pkgs = _get_sql_package_table(conn)
|
||||
for pkg in pkgs:
|
||||
try:
|
||||
SQLPackageManager(conn)._drop_sql_package(pkg['name'], scope=Scope.private_scope())
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
|
||||
server = os.environ.get("SQLPY_TEST_SERVER", "localhost")
|
||||
database = os.environ.get("SQLPY_TEST_DB", "AirlineTestDB")
|
||||
uid = os.environ.get("SQLPY_TEST_UID", "")
|
||||
pwd = os.environ.get("SQLPY_TEST_PWD", "")
|
||||
connection = sqlmlutils.ConnectionInfo(server=server, database=database, uid=uid, pwd=pwd)
|
||||
pyexecutor = SQLPythonExecutor(connection)
|
||||
pkgmanager = SQLPackageManager(connection)
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
def _package_exists(module_name: str):
|
||||
mod = __import__(module_name)
|
||||
return mod is not None
|
||||
|
||||
|
||||
def _package_no_exist(module_name: str):
|
||||
import pytest
|
||||
with pytest.raises(Exception):
|
||||
__import__(module_name)
|
||||
return True
|
||||
|
||||
|
||||
def test_install_tensorflow_and_keras():
|
||||
def use_tensorflow():
|
||||
import tensorflow as tf
|
||||
node1 = tf.constant(3.0, tf.float32)
|
||||
return str(node1.dtype)
|
||||
|
||||
def use_keras():
|
||||
import keras
|
||||
|
||||
pkgmanager.install("tensorflow")
|
||||
val = pyexecutor.execute_function_in_sql(use_tensorflow)
|
||||
assert 'float32' in val
|
||||
|
||||
pkgmanager.install("keras")
|
||||
pyexecutor.execute_function_in_sql(use_keras)
|
||||
pkgmanager.uninstall("keras")
|
||||
val = pyexecutor.execute_function_in_sql(_package_no_exist, "keras")
|
||||
assert val
|
||||
|
||||
pkgmanager.uninstall("tensorflow")
|
||||
val = pyexecutor.execute_function_in_sql(_package_no_exist, "tensorflow")
|
||||
assert val
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
def test_install_many_packages():
|
||||
packages = ["multiprocessing_on_dill", "simplejson"]
|
||||
|
||||
for package in packages:
|
||||
pkgmanager.install(package, upgrade=True)
|
||||
val = pyexecutor.execute_function_in_sql(_package_exists, module_name=package)
|
||||
assert val
|
||||
|
||||
pkgmanager.uninstall(package)
|
||||
val = pyexecutor.execute_function_in_sql(_package_no_exist, module_name=package)
|
||||
assert val
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
def test_install_version():
|
||||
package = "simplejson"
|
||||
v = "3.0.3"
|
||||
|
||||
def _package_version_exists(module_name: str, version: str):
|
||||
mod = __import__(module_name)
|
||||
return mod.__version__ == version
|
||||
|
||||
pkgmanager.install(package, version=v)
|
||||
val = pyexecutor.execute_function_in_sql(_package_version_exists, module_name=package, version=v)
|
||||
assert val
|
||||
|
||||
pkgmanager.uninstall(package)
|
||||
val = pyexecutor.execute_function_in_sql(_package_no_exist, module_name=package)
|
||||
assert val
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
def test_dependency_resolution():
|
||||
package = "multiprocessing_on_dill"
|
||||
|
||||
pkgmanager.install(package, upgrade=True)
|
||||
val = pyexecutor.execute_function_in_sql(_package_exists, module_name=package)
|
||||
assert val
|
||||
|
||||
pkgs = _get_package_names_list(connection)
|
||||
|
||||
assert package in pkgs
|
||||
assert "pyreadline" in pkgs
|
||||
|
||||
pkgmanager.uninstall(package)
|
||||
val = pyexecutor.execute_function_in_sql(_package_no_exist, module_name=package)
|
||||
assert val
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
def test_upgrade_parameter():
|
||||
|
||||
pkg = "cryptography"
|
||||
|
||||
# Get sql packages
|
||||
originalsqlpkgs = _get_sql_package_table(connection)
|
||||
|
||||
output = io.StringIO()
|
||||
with redirect_stdout(output):
|
||||
pkgmanager.install(pkg, upgrade=False)
|
||||
assert "exists on server. Set upgrade to True to force upgrade." in output.getvalue()
|
||||
|
||||
# Assert no additional packages were installed
|
||||
|
||||
sqlpkgs = _get_sql_package_table(connection)
|
||||
assert len(sqlpkgs) == len(originalsqlpkgs)
|
||||
|
||||
#################
|
||||
|
||||
def check_version():
|
||||
import cryptography as cp
|
||||
return cp.__version__
|
||||
|
||||
oldversion = pyexecutor.execute_function_in_sql(check_version)
|
||||
|
||||
pkgmanager.install(pkg, upgrade=True)
|
||||
|
||||
afterinstall = _get_sql_package_table(connection)
|
||||
assert len(afterinstall) > len(originalsqlpkgs)
|
||||
|
||||
version = pyexecutor.execute_function_in_sql(check_version)
|
||||
assert version > oldversion
|
||||
|
||||
pkgmanager.uninstall("cryptography")
|
||||
|
||||
sqlpkgs = _get_sql_package_table(connection)
|
||||
assert len(sqlpkgs) == len(afterinstall) - 1
|
||||
|
||||
_drop_all_ddl_packages(connection)
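
The upgrade check in `test_upgrade_parameter` compares the two `__version__` strings with `>`, which is a lexicographic string comparison. The sketch below is a minimal illustration, using only the standard library, of what a numeric comparison could look like. `parse_version` is a hypothetical helper, not part of sqlmlutils, and it assumes plain dotted numeric versions such as "2.10.1".

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a dotted version string into a tuple of ints (hypothetical helper).

    Only the leading digits of each dot-separated part are used, so a
    suffix such as "rc1" is ignored. This is an assumption that holds for
    simple release versions; it is not a full PEP 440 parser.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


# Lexicographic string comparison gets this pair wrong ...
assert not ("2.10" > "2.9")
# ... while comparing the parsed tuples orders them numerically.
assert parse_version("2.10") > parse_version("2.9")
```

In the test, the same idea would amount to comparing `parse_version(version)` with `parse_version(oldversion)` rather than the raw strings.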
|
||||
|
||||
|
||||
def test_install_abslpy():
|
||||
pkgmanager.install("absl-py")
|
||||
|
||||
def useit():
|
||||
import absl
|
||||
return absl.__file__
|
||||
|
||||
pyexecutor.execute_function_in_sql(useit)
|
||||
|
||||
pkgmanager.uninstall("absl-py")
|
||||
|
||||
def dontuseit():
|
||||
import pytest
|
||||
with pytest.raises(Exception):
|
||||
import absl
|
||||
|
||||
pyexecutor.execute_function_in_sql(dontuseit)
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
@pytest.mark.skip(reason="Theano depends on a conda package libpython? lazylinker issue")
|
||||
def test_install_theano():
|
||||
pkgmanager.install("Theano")
|
||||
|
||||
def useit():
|
||||
import theano.tensor as T
|
||||
return str(T)
|
||||
|
||||
pyexecutor.execute_function_in_sql(useit)
|
||||
|
||||
pkgmanager.uninstall("Theano")
|
||||
|
||||
pkgmanager.install("theano")
|
||||
pyexecutor.execute_function_in_sql(useit)
|
||||
pkgmanager.uninstall("theano")
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
def test_already_installed_popular_ml_packages():
|
||||
installedpackages = ["numpy", "scipy", "pandas", "matplotlib", "seaborn", "bokeh", "nltk", "statsmodels"]
|
||||
|
||||
sqlpkgs = _get_sql_package_table(connection)
|
||||
for package in installedpackages:
|
||||
pkgmanager.install(package)
|
||||
newsqlpkgs = _get_sql_package_table(connection)
|
||||
assert len(sqlpkgs) == len(newsqlpkgs)
|
||||
|
||||
|
||||
def test_installing_popular_ml_packages():
|
||||
newpackages = ["plotly", "cntk", "gensim"]
|
||||
|
||||
def checkit(pkgname):
|
||||
val = __import__(pkgname)
|
||||
return str(val)
|
||||
|
||||
for package in newpackages:
|
||||
pkgmanager.install(package)
|
||||
pyexecutor.execute_function_in_sql(checkit, pkgname=package)
|
||||
|
||||
_drop_all_ddl_packages(connection)
|
||||
|
||||
|
||||
# TODO: find a bad PyPI package to test this scenario
|
||||
def test_install_bad_pypi_package():
|
||||
pass
|
||||
|
|
@ -0,0 +1,24 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import sqlmlutils


def linear_regression(input_df, x_col, y_col):
    from sklearn import linear_model

    X = input_df[[x_col]]
    y = input_df[y_col]

    lr = linear_model.LinearRegression()
    lr.fit(X, y)

    return lr


sqlpy = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB"))
sql_query = "select top 1000 CRSDepTime, CRSArrTime from airline5000"
regression_model = sqlpy.execute_function_in_sql(linear_regression, input_data_query=sql_query,
                                                 x_col="CRSDepTime", y_col="CRSArrTime")
print(regression_model)
print(regression_model.coef_)
|
|
@ -0,0 +1,35 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import sqlmlutils
|
||||
from PIL import Image
|
||||
|
||||
|
||||
def scatter_plot(input_df, x_col, y_col):
|
||||
import matplotlib.pyplot as plt
|
||||
import io
|
||||
|
||||
title = x_col + " vs. " + y_col
|
||||
|
||||
plt.scatter(input_df[x_col], input_df[y_col])
|
||||
plt.xlabel(x_col)
|
||||
plt.ylabel(y_col)
|
||||
plt.title(title)
|
||||
|
||||
# Save scatter plot image as a png
|
||||
buf = io.BytesIO()
|
||||
plt.savefig(buf, format="png")
|
||||
buf.seek(0)
|
||||
|
||||
# Returns the bytes of the png to the client
|
||||
return buf
|
||||
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB"))
|
||||
|
||||
sql_query = "select top 100 * from airline5000"
|
||||
plot_data = sqlpy.execute_function_in_sql(func=scatter_plot, input_data_query=sql_query,
|
||||
x_col="ArrDelay", y_col="CRSDepTime")
|
||||
im = Image.open(plot_data)
|
||||
im.show()
|
||||
#im.save("scatter_test.png")
|
|
@ -0,0 +1,14 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import sqlmlutils
|
||||
|
||||
|
||||
def foo():
|
||||
return "bar"
|
||||
|
||||
|
||||
sqlpython = sqlmlutils.SQLPythonExecutor(sqlmlutils.ConnectionInfo(server="localhost", database="master"))
|
||||
result = sqlpython.execute_function_in_sql(foo)
|
||||
assert result == "bar"
|
||||
|
|
@ -0,0 +1,47 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import sqlmlutils
|
||||
import pytest
|
||||
|
||||
def principal_components(input_table: str, output_table: str):
|
||||
import sqlalchemy
|
||||
from urllib import parse
|
||||
import pandas as pd
|
||||
from sklearn.decomposition import PCA
|
||||
|
||||
# Internal ODBC connection string used by process executing inside SQL Server
|
||||
connection_string = "Driver=SQL Server;Server=localhost;Database=AirlineTestDB;Trusted_Connection=Yes;"
|
||||
engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect={}".format(parse.quote_plus(connection_string)))
|
||||
|
||||
input_df = pd.read_sql("select top 200 ArrDelay,CRSDepTime,DayOfWeek from {}".format(input_table), engine).dropna()
|
||||
|
||||
|
||||
pca = PCA(n_components=2)
|
||||
components = pca.fit_transform(input_df)
|
||||
|
||||
output_df = pd.DataFrame(components)
|
||||
output_df.to_sql(output_table, engine, if_exists="replace")
|
||||
|
||||
|
||||
connection = sqlmlutils.ConnectionInfo(server="localhost", database="AirlineTestDB")
|
||||
|
||||
input_table = "airline5000"
|
||||
output_table = "AirlineDemoPrincipalComponents"
|
||||
|
||||
sp_name = "SavePrincipalComponents"
|
||||
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(connection)
|
||||
|
||||
if sqlpy.check_sproc(sp_name):
|
||||
sqlpy.drop_sproc(sp_name)
|
||||
|
||||
sqlpy.create_sproc_from_function(sp_name, principal_components)
|
||||
|
||||
# Check that the stored procedure now exists in the database:
|
||||
assert sqlpy.check_sproc(sp_name)
|
||||
|
||||
sqlpy.execute_sproc(sp_name, input_table=input_table, output_table=output_table)
|
||||
|
||||
sqlpy.drop_sproc(sp_name)
|
||||
assert not sqlpy.check_sproc(sp_name)
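
The `principal_components` function above passes its ODBC connection string to SQLAlchemy through the `odbc_connect` query parameter, after encoding it with `parse.quote_plus`. The sketch below shows only that encoding step, using the standard library and the same connection string; it makes no database connection, and `engine_url` is a name introduced here for illustration.

```python
from urllib import parse

# Same ODBC connection string as in the example above.
connection_string = "Driver=SQL Server;Server=localhost;Database=AirlineTestDB;Trusted_Connection=Yes;"

# quote_plus percent-encodes "=" and ";" and turns spaces into "+",
# so the whole string can travel as a single URL query parameter.
encoded = parse.quote_plus(connection_string)
engine_url = "mssql+pyodbc:///?odbc_connect={}".format(encoded)

print(engine_url)

# Decoding the parameter gives back the original connection string.
assert parse.unquote_plus(encoded) == connection_string
```

Without the encoding, the `;` and `=` characters inside the connection string could be misread as part of the URL itself.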
|
Binary file not shown.
Binary file not shown.
|
@ -0,0 +1,4 @@
|
|||
# file GENERATED by distutils, do NOT edit
|
||||
setup.py
|
||||
testpackageA\ClassA.py
|
||||
testpackageA\__init__.py
|
Binary data
Python/tests/scripts/test_packages/testpackageA/dist/testpackageA-0.0.1.zip
Vendored
Normal file
Binary file not shown.
|
@ -0,0 +1,9 @@
|
|||
from distutils.core import setup
|
||||
|
||||
setup(
|
||||
name='testpackageA' ,
|
||||
packages=['testpackageA'],
|
||||
version='0.0.1',
|
||||
description='Test package for python package management.',
|
||||
author='Microsoft'
|
||||
)
|
|
@ -0,0 +1,8 @@
|
|||
class ClassA:

    def __init__(self, val):
        self._val = val

    @property
    def val(self):
        return self._val
|
|
@ -0,0 +1 @@
|
|||
from .ClassA import ClassA
|
|
@ -0,0 +1,9 @@
|
|||
def foo(t1, t2, t3):
|
||||
print(t1 + t2)
|
||||
print(t3)
|
||||
return t3
|
||||
|
||||
|
||||
res = foo("Hello","World",InputDataSet)
|
||||
|
||||
print("Testing output!")
|
|
@ -0,0 +1,9 @@
|
|||
def foo(t1, t2, t3):
|
||||
print(t1 + t2)
|
||||
print(t3)
|
||||
return t3
|
||||
|
||||
|
||||
res = foo(t1,t2,t3)
|
||||
|
||||
print("Testing output!")
|
|
@ -0,0 +1,9 @@
|
|||
def foo(t1, t2, t3):
|
||||
print(t1 + t2)
|
||||
print(t3)
|
||||
return t3
|
||||
|
||||
|
||||
res = foo("No ", "Inputs", "Required")
|
||||
|
||||
print("Testing output!")
|
|
@ -0,0 +1,7 @@
|
|||
def foo(t1, t2, t3):
|
||||
return str(t1)+str(t2)
|
||||
|
||||
|
||||
res = foo(t1,t2,t3)
|
||||
|
||||
print("Testing output!")
|
|
@ -0,0 +1,10 @@
|
|||
def foo(t1, t2, t3):
|
||||
print(t1)
|
||||
print(t2)
|
||||
print(t3)
|
||||
return t3
|
||||
|
||||
|
||||
OutputDataSet = foo(t1,t2,t3)
|
||||
|
||||
print("Testing output!")
|
|
@ -0,0 +1,461 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
import pytest
|
||||
import sqlmlutils
|
||||
from contextlib import redirect_stdout
|
||||
from subprocess import Popen, PIPE, STDOUT
|
||||
from pandas import DataFrame
|
||||
import io
|
||||
import os
|
||||
|
||||
current_dir = os.path.dirname(__file__)
|
||||
script_dir = os.path.join(current_dir, "scripts")
|
||||
conn = sqlmlutils.ConnectionInfo(database="AirlineTestDB")
|
||||
sqlpy = sqlmlutils.SQLPythonExecutor(conn)
|
||||
|
||||
|
||||
###################
|
||||
# No output tests #
|
||||
###################
|
||||
|
||||
def test_no_output():
|
||||
def my_func():
|
||||
print("blah blah blah")
|
||||
|
||||
name = "test_no_output"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name, my_func)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
x = sqlpy.execute_sproc(name)
|
||||
assert type(x) == DataFrame
|
||||
assert x.empty
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_no_output_mixed_args():
|
||||
def mixed(val1: int, val2: str, val3: float, val4: bool):
|
||||
print(val1, val2, val3, val4)
|
||||
|
||||
name = "test_no_output_mixed_args"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name, mixed)
|
||||
buf = io.StringIO()
|
||||
with redirect_stdout(buf):
|
||||
sqlpy.execute_sproc(name, val1=5, val2="blah", val3=15.5, val4=True)
|
||||
assert "5 blah 15.5 True" in buf.getvalue()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
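
Several tests in this file assert on printed output by redirecting stdout into an `io.StringIO` buffer. The following is a minimal standalone sketch of that pattern. `noisy` is a hypothetical stand-in for a call such as `sqlpy.execute_sproc`, which needs a SQL Server instance and is therefore not used here.

```python
import io
from contextlib import redirect_stdout


def noisy(val1, val2, val3, val4):
    # Hypothetical stand-in for a call that prints to stdout.
    print(val1, val2, val3, val4)


buf = io.StringIO()
with redirect_stdout(buf):
    # Everything printed inside this block goes to buf, not the console.
    noisy(5, "blah", 15.5, True)

captured = buf.getvalue()
assert "5 blah 15.5 True" in captured
```

The redirection only covers text written through Python's `sys.stdout` in the current process, which is why the tests check substrings of `buf.getvalue()` after the `with` block ends.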
|
||||
|
||||
|
||||
def test_no_output_mixed_args_in_df():
|
||||
def mixed(val1: int, val2: str, val3: float, val4: bool, val5: DataFrame):
|
||||
print(val1, val2, val3, val4)
|
||||
print(val5)
|
||||
|
||||
name = "test_no_output_mixed_args_in_df"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name, mixed)
|
||||
buf = io.StringIO()
|
||||
with redirect_stdout(buf):
|
||||
sqlpy.execute_sproc(name, val1=5, val2="blah", val3=15.5, val4=False, val5="SELECT TOP 2 * FROM airline5000")
|
||||
assert "5 blah 15.5 False" in buf.getvalue()
|
||||
assert "ArrTime" in buf.getvalue()
|
||||
assert "CRSDepTime" in buf.getvalue()
|
||||
assert "DepTime" in buf.getvalue()
|
||||
assert "CancellationCode" in buf.getvalue()
|
||||
assert "DayOfWeek" in buf.getvalue()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_no_output_mixed_args_in_df_in_params():
|
||||
def mixed(val1, val2, val3, val4, val5):
|
||||
print(val1, val2, val3, val5)
|
||||
print(val4)
|
||||
|
||||
in_params = {"val1": int, "val2": str, "val3": float, "val4": DataFrame, "val5": bool}
|
||||
name = "test_no_output_mixed_args_in_df_in_params"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name=name, func=mixed, input_params=in_params)
|
||||
buf = io.StringIO()
|
||||
with redirect_stdout(buf):
|
||||
sqlpy.execute_sproc(name, val1=5, val2="blah", val3=15.5, val4="SELECT TOP 2 * FROM airline5000", val5=False)
|
||||
assert "5 blah 15.5 False" in buf.getvalue()
|
||||
assert "ArrTime" in buf.getvalue()
|
||||
assert "CRSDepTime" in buf.getvalue()
|
||||
assert "DepTime" in buf.getvalue()
|
||||
assert "CancellationCode" in buf.getvalue()
|
||||
assert "DayOfWeek" in buf.getvalue()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
################
|
||||
# Test outputs #
|
||||
################
|
||||
|
||||
def test_out_df_no_params():
    def no_params():
        df = DataFrame()
        df["col1"] = [1, 2, 3, 4, 5]
        return df

    name = "test_out_df_no_params"
    sqlpy.drop_sproc(name)

    sqlpy.create_sproc_from_function(name, no_params)
    assert sqlpy.check_sproc(name)

    df = sqlpy.execute_sproc(name)
    # Compare the column values themselves; wrapping the whole comparison in
    # list() would be truthy for any non-empty result.
    assert list(df.iloc[:, 0]) == [1, 2, 3, 4, 5]

    sqlpy.drop_sproc(name)
    assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_out_df_with_args():
|
||||
def my_func_with_args(arg1: str, arg2: str):
|
||||
return DataFrame({"arg1": [arg1], "arg2": [arg2]})
|
||||
|
||||
name = "test_out_df_with_args"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name, my_func_with_args)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
vals = [("arg1val", "arg2val"), ("asd", "Asd"), ("Qwe", "Qwe"), ("zxc", "Asd")]
|
||||
|
||||
for values in vals:
|
||||
arg1 = values[0]
|
||||
arg2 = values[1]
|
||||
res = sqlpy.execute_sproc(name, arg1=arg1, arg2=arg2)
|
||||
assert res[0][0] == arg1
|
||||
assert res[1][0] == arg2
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_out_df_in_df():
|
||||
def in_data(in_df: DataFrame):
|
||||
return in_df
|
||||
|
||||
name = "test_out_df_in_df"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name, in_data)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
res = sqlpy.execute_sproc(name, in_df="SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
assert type(res) == DataFrame
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_out_df_mixed_args_in_df():
|
||||
def mixed(val1: int, val2: str, val3: float, val4: DataFrame, val5: bool):
|
||||
print(val1, val2, val3, val5)
|
||||
if val5 and val1 == 5 and val2 == "blah" and val3 == 15.5:
|
||||
return val4
|
||||
else:
|
||||
return None
|
||||
|
||||
name = "test_out_df_mixed_args_in_df"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name, mixed)
|
||||
|
||||
res = sqlpy.execute_sproc(name, val1=5, val2="blah", val3=15.5,
|
||||
val4="SELECT TOP 10 * FROM airline5000", val5=True)
|
||||
|
||||
assert type(res) == DataFrame
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_out_df_mixed_in_params_in_df():
|
||||
def mixed(val1, val2, val3, val4, val5):
|
||||
print(val1, val2, val3, val5)
|
||||
if val5 and val1 == 5 and val2 == "blah" and val3 == 15.5:
|
||||
return val4
|
||||
else:
|
||||
return None
|
||||
|
||||
name = "test_out_df_mixed_in_params_in_df"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
input_params = {"val1": int, "val2": str, "val3": float, "val4": DataFrame, "val5": bool}
|
||||
|
||||
sqlpy.create_sproc_from_function(name, mixed, input_params=input_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
res = sqlpy.execute_sproc(name, val1=5, val2="blah", val3=15.5,
|
||||
val4="SELECT TOP 10 * FROM airline5000", val5=True)
|
||||
|
||||
assert type(res) == DataFrame
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_out_of_order_args():
|
||||
def mixed(val1, val2, val3, val4, val5):
|
||||
return DataFrame({"val1": [val1], "val2": [val2], "val3": [val3], "val5": [val5]})
|
||||
|
||||
in_params = {"val2": str, "val3": float, "val5": bool, "val4": DataFrame, "val1": int}
|
||||
|
||||
name = "test_out_of_order_args"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_function(name=name, func=mixed, input_params=in_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
v1 = 5
|
||||
v2 = "blah"
|
||||
v3 = 15.5
|
||||
v4 = "SELECT TOP 10 * FROM airline5000"
|
||||
res = sqlpy.execute_sproc(name, val5=False, val3=v3, val4=v4, val1=v1, val2=v2)
|
||||
|
||||
assert res[0][0] == v1
|
||||
assert res[1][0] == v2
|
||||
assert res[2][0] == v3
|
||||
assert not res[3][0]
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
# TODO: Output Params execution not currently supported
|
||||
def test_in_param_out_param():
|
||||
def in_out(t1, t2, t3):
|
||||
print(t2)
|
||||
print(t3)
|
||||
res = "Hello " + t1
|
||||
return {'out_df': t3, 'res': res}
|
||||
|
||||
name = "test_in_param_out_param"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
input_params = {"t1": str, "t2": int, "t3": DataFrame}
|
||||
output_params = {"res": str, "out_df": DataFrame}
|
||||
|
||||
sqlpy.create_sproc_from_function(name, in_out, input_params=input_params, output_params=output_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
# Out params don't currently work so we use sqlcmd to test the output param sproc
|
||||
sql_str = "DECLARE @res nvarchar(max) EXEC test_in_param_out_param @t2 = 213, @t1 = N'Hello', " \
|
||||
"@t3 = N'select top 10 * from airline5000', @res = @res OUTPUT SELECT @res as N'@res'"
|
||||
p = Popen(["sqlcmd", "-S", conn.server, "-E", "-d", conn.database, "-Q", sql_str],
|
||||
shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
|
||||
output = p.stdout.read()
|
||||
assert "Hello Hello" in output.decode()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_in_df_out_df_dict():
|
||||
def func(in_df: DataFrame):
|
||||
return {"out_df": in_df}
|
||||
|
||||
name = "test_in_df_out_df_dict"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
output_params = {"out_df": DataFrame}
|
||||
|
||||
sqlpy.create_sproc_from_function(name, func, output_params=output_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
res = sqlpy.execute_sproc(name, in_df="SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
assert type(res) == DataFrame
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
################
|
||||
# Script Tests #
|
||||
################
|
||||
|
||||
def test_script_no_params():
|
||||
script = os.path.join(script_dir, "test_script_no_params.py")
|
||||
|
||||
name = "test_script_no_params"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
sqlpy.create_sproc_from_script(name, script)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
buf = io.StringIO()
|
||||
with redirect_stdout(buf):
|
||||
sqlpy.execute_sproc(name)
|
||||
assert "No Inputs" in buf.getvalue()
|
||||
assert "Required" in buf.getvalue()
|
||||
assert "Testing output!" in buf.getvalue()
|
||||
assert "HelloWorld" not in buf.getvalue()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_script_no_out_params():
|
||||
script = os.path.join(script_dir, "test_script_no_out_params.py")
|
||||
|
||||
name = "test_script_no_out_params"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
input_params = {"t1": str, "t2": str, "t3": int}
|
||||
|
||||
sqlpy.create_sproc_from_script(name, script, input_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
buf = io.StringIO()
|
||||
with redirect_stdout(buf):
|
||||
sqlpy.execute_sproc(name, t1="Hello", t2="World", t3=312)
|
||||
assert "HelloWorld" in buf.getvalue()
|
||||
assert "312" in buf.getvalue()
|
||||
assert "Testing output!" in buf.getvalue()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
def test_script_out_df():
|
||||
script = os.path.join(script_dir, "test_script_sproc_out_df.py")
|
||||
|
||||
name = "test_script_out_df"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
input_params = {"t1": str, "t2": int, "t3": DataFrame}
|
||||
|
||||
sqlpy.create_sproc_from_script(name, script, input_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
res = sqlpy.execute_sproc(name, t1="Hello", t2=2313, t3="SELECT TOP 10 * FROM airline5000")
|
||||
|
||||
assert type(res) == DataFrame
|
||||
assert res.shape == (10, 30)
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
#TODO: Output Params execution not currently supported
|
||||
def test_script_out_param():
|
||||
script = os.path.join(script_dir, "test_script_out_param.py")
|
||||
|
||||
name = "test_script_out_param"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
input_params = {"t1": str, "t2": int, "t3": DataFrame}
|
||||
output_params = {"res": str}
|
||||
|
||||
sqlpy.create_sproc_from_script(name, script, input_params, output_params)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
# Out params don't currently work so we use sqlcmd to test the output param sproc
|
||||
sql_str = "DECLARE @res nvarchar(max) EXEC test_script_out_param @t2 = 123, @t1 = N'Hello', " \
|
||||
"@t3 = N'select top 10 * from airline5000', @res = @res OUTPUT SELECT @res as N'@res'"
|
||||
p = Popen(["sqlcmd", "-S", conn.server, "-E", "-d", conn.database, "-Q", sql_str],
|
||||
shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
|
||||
output = p.stdout.read()
|
||||
assert "Hello123" in output.decode()
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
assert not sqlpy.check_sproc(name)
|
||||
|
||||
|
||||
##################
|
||||
# Negative Tests #
|
||||
##################
|
||||
|
||||
def test_execute_bad_param_types():
|
||||
def bad_func(input1: bin):
|
||||
pass
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
sqlpy.create_sproc_from_function("BadParam", bad_func)
|
||||
|
||||
def func(input1: bool):
|
||||
pass
|
||||
name = "BadInput"
|
||||
sqlpy.drop_sproc(name)
|
||||
sqlpy.create_sproc_from_function(name, func)
|
||||
assert sqlpy.check_sproc(name)
|
||||
|
||||
with pytest.raises(RuntimeError):
|
||||
sqlpy.execute_sproc(name, input1="Hello!")
|
||||
|
||||
|
||||
def test_create_bad_name():
|
||||
def foo():
|
||||
return 1
|
||||
with pytest.raises(RuntimeError):
|
||||
sqlpy.create_sproc_from_function("'''asd''asd''asd", foo)
|
||||
|
||||
|
||||
def test_no_output_bad_num_args():
|
||||
def mixed(val1: str, val2, val3, val4):
|
||||
print(val1, val2, val3)
|
||||
print(val4)
|
||||
|
||||
name = "test_no_output_bad_num_args"
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
sqlpy.create_sproc_from_function(name=name, func=mixed)
|
||||
|
||||
def func(val1, val2, val3, val4):
|
||||
print(val1, val2, val3)
|
||||
print(val4)
|
||||
|
||||
input_params = {"val1": int, "val4": str, "val5": int, "BADVAL": str}
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
sqlpy.create_sproc_from_function(name=name, func=func, input_params=input_params)
|
||||
|
||||
input_params = {"val1": int, "val2": int, "val3": str}
|
||||
sqlpy.drop_sproc(name)
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
sqlpy.create_sproc_from_function(name=name, func=func, input_params=input_params)
|
||||
|
||||
|
||||
def test_annotation_vs_input_param():
|
||||
def foo(val1: str, val2: int, val3: int):
|
||||
print(val1)
|
||||
print(val2)
|
||||
return val3
|
||||
|
||||
name = "test_input_param_override_error"
|
||||
input_params = {"val1": str, "val2": int, "val3": DataFrame}
|
||||
|
||||
sqlpy.drop_sproc(name)
|
||||
with pytest.raises(ValueError):
|
||||
sqlpy.create_sproc_from_function(name=name, func=foo, input_params=input_params)
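
The negative tests above expect a `ValueError` when the argument types of a function cannot be resolved: an argument with no type, an `input_params` entry that names no argument, or an `input_params` type that disagrees with an annotation. The sketch below is one way such checks could be written with `inspect`. It is an assumption for illustration, not the actual sqlmlutils implementation, and `resolve_param_types` is a hypothetical name.

```python
import inspect


def resolve_param_types(func, input_params=None):
    """Return {argument name: type} or raise ValueError (illustrative only)."""
    input_params = dict(input_params or {})
    arg_names = list(inspect.signature(func).parameters)
    annotations = {name: tp for name, tp in getattr(func, "__annotations__", {}).items()
                   if name != "return"}

    # Every key in input_params must name a real argument of the function.
    unknown = set(input_params) - set(arg_names)
    if unknown:
        raise ValueError("input_params names not in function: {}".format(sorted(unknown)))

    resolved = {}
    for name in arg_names:
        annotated = annotations.get(name)
        declared = input_params.get(name)
        # An annotation and an input_params entry must agree if both exist.
        if annotated is not None and declared is not None and annotated is not declared:
            raise ValueError("conflicting types for {!r}".format(name))
        chosen = declared if declared is not None else annotated
        # Every argument needs a type from one of the two sources.
        if chosen is None:
            raise ValueError("no type given for {!r}".format(name))
        resolved[name] = chosen
    return resolved


def ok(val1: int, val2: str):
    pass


assert resolve_param_types(ok) == {"val1": int, "val2": str}
```

Under these assumptions, a function with only some arguments annotated and no `input_params` fails the "no type given" check, which matches the first case in `test_no_output_bad_num_args`.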
|
||||
|
||||
|
||||
def test_bad_script_path():
|
||||
with pytest.raises(FileNotFoundError):
|
||||
sqlpy.create_sproc_from_script(name="badScript", path_to_script="NonexistentScriptPath")
|
||||
|
|
@ -0,0 +1,19 @@
|
|||
Package: sqlmlutils
|
||||
Type: Package
|
||||
Title: Wraps R code into executable SQL Server stored procedures
|
||||
Version: 0.5.0
|
||||
Author: Microsoft Corporation
|
||||
Maintainer: Microsoft Corporation <msrpack@microsoft.com>
|
||||
Depends:
|
||||
R (>= 3.2.2)
|
||||
Imports:
|
||||
RODBC, RODBCext, tools, methods, utils
|
||||
Description: Provides a set of functions allowing the user
|
||||
to wrap their R script into a TSQL stored procedure, register
|
||||
that stored procedure with a database, and test it from an R
|
||||
development environment.
|
||||
License: MIT + file LICENSE
|
||||
Copyright: Copyright 2016 Microsoft Corporation
|
||||
RoxygenNote: 6.0.1
|
||||
Suggests: testthat,
|
||||
roxygen2
|
|
@ -0,0 +1,25 @@
|
|||
------------------------------------------- START OF LICENSE -----------------------------------------
|
||||
sqlmlutils
|
||||
|
||||
MIT License
|
||||
|
||||
Copyright (c) Microsoft Corporation. All rights reserved.
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE
|
||||
----------------------------------------------- END OF LICENSE ------------------------------------------
|
|
@ -0,0 +1,17 @@
|
|||
# Generated by roxygen2: do not edit by hand
|
||||
|
||||
export(checkSproc)
|
||||
export(connectionInfo)
|
||||
export(createSprocFromFunction)
|
||||
export(createSprocFromScript)
|
||||
export(dropSproc)
|
||||
export(executeFunctionInSQL)
|
||||
export(executeSQLQuery)
|
||||
export(executeScriptInSQL)
|
||||
export(executeSproc)
|
||||
export(sql_install.packages)
|
||||
export(sql_installed.packages)
|
||||
export(sql_remove.packages)
|
||||
import(RODBC)
|
||||
importFrom(RODBCext,sqlExecute)
|
||||
importFrom(utils,tail)
|
|
@ -0,0 +1,293 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
|
||||
|
||||
#'
|
||||
#'Create a connection string for a SQL Server database
|
||||
#'
|
||||
#'@param driver The driver to use for the connection - defaults to SQL Server
|
||||
#'@param server The server to connect to - defaults to localhost
|
||||
#'@param database The database to connect to - defaults to master
|
||||
#'@param uid The user id for the connection. If uid is NULL, default to Trusted Connection
|
||||
#'@param pwd The password for the connection. If uid is not NULL, pwd is required
|
||||
#'
|
||||
#'@return A fully formed connection string
|
||||
#'
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#'
|
||||
#' connectionInfo()
|
||||
#' [1] "Driver={SQL Server};Server=localhost;Database=master;Trusted_Connection=Yes;"
|
||||
#'
|
||||
#' connectionInfo(server="ServerName", database="AirlineTestDB", uid="username", pwd="pass")
|
||||
#' [1] "Driver={SQL Server};Server=ServerName;Database=AirlineTestDB;uid=username;pwd=pass;"
|
||||
#' }
|
||||
#'
|
||||
#'
|
||||
#'@export
|
||||
connectionInfo <- function(driver = "SQL Server", server = "localhost", database = "master",
|
||||
uid = NULL, pwd = NULL) {
|
||||
authorization <- "Trusted_Connection=Yes"
|
||||
|
||||
if (!is.null(uid)) {
|
||||
if (is.null(pwd)) {
|
||||
stop("Need a password if using uid")
|
||||
} else {
|
||||
authorization <- sprintf("uid=%s;pwd=%s", uid, pwd)
|
||||
}
|
||||
}
|
||||
|
||||
connection <- sprintf("Driver={%s};Server=%s;Database=%s;%s;", driver, server, database, authorization)
|
||||
connection
|
||||
}
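
For readers following along from the Python side, the logic of `connectionInfo` can be mirrored in a few lines. This is a sketch under stated assumptions: `connection_info` is a hypothetical function written for illustration and is not the `ConnectionInfo` class of the Python sqlmlutils package. It builds the same string as the R function above, using Trusted Connection unless a `uid` is given.

```python
def connection_info(driver="SQL Server", server="localhost", database="master",
                    uid=None, pwd=None):
    """Build an ODBC connection string (hypothetical mirror of the R function)."""
    authorization = "Trusted_Connection=Yes"
    if uid is not None:
        if pwd is None:
            # Same rule as the R code: a uid without a password is an error.
            raise ValueError("Need a password if using uid")
        authorization = "uid={};pwd={}".format(uid, pwd)
    # Doubled braces produce the literal { } around the driver name.
    return "Driver={{{}}};Server={};Database={};{};".format(
        driver, server, database, authorization)


print(connection_info())
print(connection_info(server="ServerName", database="AirlineTestDB",
                      uid="username", pwd="pass"))
```

The two calls correspond to the two cases in the roxygen `@examples` section above.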
|
||||
|
||||
|
||||
|
||||
#'
|
||||
#'Execute a function in SQL
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString to the database
|
||||
#'@param func closure. The function to execute
|
||||
#'@param ... A named list of arguments to pass into the function
|
||||
#'@param inputDataQuery character string. A string to query the database.
|
||||
#' The query result is passed as a data frame to the first argument of the function
|
||||
#'
|
||||
#'@return The returned value from the function
|
||||
#'
|
||||
#'@seealso
|
||||
#'\code{\link{executeScriptInSQL}} to execute a script file instead of a function in SQL
|
||||
#'
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#' connection <- connectionInfo(database = "AirlineTestDB")
|
||||
#'
|
||||
#' foo <- function(in_df, arg) {
|
||||
#' list(data = in_df, value = arg)
|
||||
#' }
|
||||
#' executeFunctionInSQL(connection, foo, arg = 12345,
|
||||
#' inputDataQuery = "SELECT top 1 * from airline5000")
|
||||
#'}
|
||||
#'
|
||||
#'
|
||||
#'@export
|
||||
executeFunctionInSQL <- function(connectionString, func, ..., inputDataQuery = "")
|
||||
{
|
||||
inputDataName <- "InputDataSet"
|
||||
listArgs <- list(...)
|
||||
|
||||
if (inputDataQuery != "") {
|
||||
funcArgs <- methods::formalArgs(func)
|
||||
if (length(funcArgs) < 1) {
|
||||
stop("To use the inputDataQuery variable, the function must have at least one input argument")
|
||||
} else {
|
||||
inputDataName <- funcArgs[[1]]
|
||||
}
|
||||
}
|
||||
binArgs <- serialize(listArgs, NULL)
|
||||
|
||||
spees <- speesBuilderFromFunction(func = func, inputDataQuery = inputDataQuery, inputDataName = inputDataName, binArgs)
|
||||
resVal <- execute(connectionString = connectionString, script = spees)
|
||||
return(resVal[[1]])
|
||||
}
|
||||
|
||||
#'
|
||||
#'Execute a script in SQL
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString to the database
|
||||
#'@param script character string. The path to the script to execute in SQL
|
||||
#'@param inputDataQuery character string. A string to query the database.
|
||||
#' The query result is stored as a data frame in the variable "InputDataSet" in the script's environment
|
||||
#'
|
||||
#'@return The returned value from the last line of the script
|
||||
#'
|
||||
#'@seealso
|
||||
#'\code{\link{executeFunctionInSQL}} to execute a user function instead of a script in SQL
|
||||
#'
|
||||
#'@export
|
||||
executeScriptInSQL <- function(connectionString, script, inputDataQuery = "")
|
||||
{
|
||||
|
||||
if (file.exists(script)){
|
||||
print(paste0("Script path exists, using file ", script))
|
||||
} else {
|
||||
stop("Script path doesn't exist")
|
||||
}
|
||||
|
||||
text <- paste(readLines(script), collapse="\n")
|
||||
|
||||
func <- function(InputDataSet, script) {
|
||||
eval(parse(text = script))
|
||||
}
|
||||
|
||||
executeFunctionInSQL(connectionString = connectionString, func = func, script = text, inputDataQuery = inputDataQuery)
|
||||
}
|
||||
|
||||
|
||||
#'
|
||||
#'Execute a SQL query
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString to the database
|
||||
#'@param sqlQuery character string. The query to execute
|
||||
#'
|
||||
#'@return The data frame returned by the query to the database
|
||||
#'
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#' connection <- connectionInfo(database="AirlineTestDB")
|
||||
#' executeSQLQuery(connection, sqlQuery="SELECT top 1 * from airline5000")
|
||||
#'}
|
||||
#'
|
||||
#'
|
||||
#'@export
|
||||
executeSQLQuery <- function(connectionString, sqlQuery)
|
||||
{
|
||||
#We use the serialize method here instead of OutputDataSet <- InputDataSet to preserve column names
|
||||
|
||||
script <- "
|
||||
serializedResult <- as.character(serialize(list(result = InputDataSet), NULL))
|
||||
OutputDataSet <- data.frame(returnVal=serializedResult)"
|
||||
spees <- speesBuilder(script = script, inputDataQuery = sqlQuery, TRUE)
|
||||
execute(connectionString, spees)$result
|
||||
}
|
||||
|
||||
#
|
||||
#Execute and process a script
|
||||
#
|
||||
#@param connectionString character string. The connectionString to the database
|
||||
#@param script character string. The script to execute
|
||||
#
|
||||
#
|
||||
execute <- function(connectionString, script)
|
||||
{
|
||||
tryCatch({
|
||||
dbhandle <- odbcDriverConnect(connectionString)
|
||||
res <- sqlQuery(dbhandle, script)
|
||||
if (typeof(res) == "character") {
|
||||
stop(res[1])
|
||||
}
|
||||
binVal <- res$returnVal
|
||||
}, error = function(e) {
|
||||
stop(paste0("Error in SQL Execution: ", e, "\n"))
|
||||
}, finally ={
|
||||
odbcCloseAll()
|
||||
})
|
||||
binVal <- res$returnVal
|
||||
if (!is.null(binVal)) {
|
||||
resVal <- unserialize(unlist(lapply(lapply(as.character(binVal),as.hexmode), as.raw)))
|
||||
len <- length(resVal)
|
||||
|
||||
# Each piece of the returned value is a different part of the output
|
||||
# 1. The result of the function
|
||||
# 2. The output of the function (e.g. from print())
|
||||
# 3. The warnings of the function
|
||||
# 4. The errors of the function
|
||||
# We raise warnings and errors, print any output, and return the actual function results to the user
|
||||
|
||||
if (len > 1) {
|
||||
output <- resVal[[2]]
|
||||
for(o in output) {
|
||||
cat(paste0(o,"\n"))
|
||||
}
|
||||
}
|
||||
if (len > 2) {
|
||||
warnings <- resVal[[3]]
|
||||
for(w in warnings) {
|
||||
warning(w)
|
||||
}
|
||||
}
|
||||
if (len > 3) {
|
||||
errors <- resVal[[4]]
|
||||
for(e in errors) {
|
||||
stop(paste0("Error in script: ", e))
|
||||
}
|
||||
}
|
||||
return(resVal)
|
||||
} else {
|
||||
return(res)
|
||||
}
|
||||
}
|
||||
|
||||
#
|
||||
#Build an R sp_execute_external_script
|
||||
#
|
||||
#@param script The script to execute
|
||||
#@param inputDataQuery The query on the database
|
||||
#@param withResults Whether to have a result set, outside of the OutputDataSet
|
||||
#
|
||||
speesBuilder <- function(script, inputDataQuery, withResults = FALSE) {
|
||||
|
||||
resultSet <- if (withResults) "with result sets((returnVal varchar(MAX)))" else ""
|
||||
|
||||
sprintf("exec sp_execute_external_script
|
||||
@language = N'R',
|
||||
@script = N'
|
||||
%s
|
||||
',
|
||||
@input_data_1 = N'%s'
|
||||
%s
|
||||
", script, inputDataQuery, resultSet)
|
||||
}
|
||||
|
||||
#
|
||||
#Build a spees call from a function
|
||||
#
|
||||
#@param func The function to make into a spees
|
||||
#@param inputDataQuery The input data query to the database
|
||||
#@param inputDataName The name of the variable to put the data frame from the query into in the script
|
||||
#@param binArgs The (binary) version of all arguments passed into the function
|
||||
#
|
||||
#@return The spees script to execute
|
||||
#The spees script will return a data frame with the results, serialized
|
||||
#
|
||||
speesBuilderFromFunction <- function(func, inputDataQuery, inputDataName, binArgs) {
|
||||
|
||||
funcName <- deparse(substitute(func))
|
||||
# The body is embedded in a T-SQL N'...' literal, so double any single quotes
funcBody <- gsub("'", "''", paste0(deparse(func), collapse = "\n"), fixed = TRUE)
|
||||
|
||||
speesBody <- sprintf("
|
||||
%s <- %s
|
||||
|
||||
|
||||
oldWarn <- options(\"warn\")$warn
|
||||
options(warn=1)
|
||||
|
||||
output <- NULL
|
||||
result <- NULL
|
||||
funerror <- NULL
|
||||
funwarnings <- NULL
|
||||
try(withCallingHandlers({
|
||||
|
||||
binArgList <- unlist(lapply(lapply(strsplit(\"%s\",\";\")[[1]], as.hexmode), as.raw))
|
||||
argList <- as.list(unserialize(binArgList))
|
||||
|
||||
if (nrow(InputDataSet)!=0) {
|
||||
argList <- c(list(%s = InputDataSet), argList)
|
||||
}
|
||||
|
||||
funwarnings <- capture.output(
|
||||
output <- capture.output(
|
||||
result <- do.call(%s, argList)
|
||||
),
|
||||
type=\"message\")
|
||||
|
||||
}, error = function(err) {
|
||||
funerror <<- err
|
||||
}
|
||||
), silent = TRUE
|
||||
)
|
||||
|
||||
options(warn=oldWarn)
|
||||
|
||||
serializedResult <- as.character(serialize(list(result, output, funwarnings, funerror), NULL))
|
||||
OutputDataSet <- data.frame(returnVal=serializedResult)
|
||||
|
||||
", funcName, funcBody, paste0(binArgs,collapse=";"), inputDataName, funcName)
|
||||
|
||||
#Call the spees builder to wrap the function; needs the returnVal resultset
|
||||
speesBuilder(speesBody, inputDataQuery, withResults = TRUE)
|
||||
}
|
||||
|
File diff suppressed because it is too large
|
@ -0,0 +1,378 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
|
||||
#'
|
||||
#'Create a Stored Procedure
|
||||
#'
|
||||
#'This function creates a stored procedure from a function
|
||||
#'on the database and returns the script used to create it.
|
||||
#'
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString to the database
|
||||
#'@param name character string. The name of the stored procedure
|
||||
#'@param func closure. The function to wrap in the stored procedure
|
||||
#'@param inputParams named list. The types of the inputs,
|
||||
#'where the names are the arguments and the values are the types
|
||||
#'@param outputParams named list. The types of the outputs,
|
||||
#'where the names are the arguments and the values are the types
|
||||
#'
|
||||
#'@section Warning:
|
||||
#'You can add output parameters to the stored procedure
|
||||
#'but you will not be able to execute the procedure from R afterwards.
|
||||
#'Any stored procedure with output params must be executed directly in SQL.
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#' connectionString <- connectionInfo()
|
||||
#'
|
||||
#' ### Using a function
|
||||
#' dropSproc(connectionString, "fun")
|
||||
#'
|
||||
#' func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
#' createSprocFromFunction(connectionString, name = "fun",
|
||||
#' func = func, inputParams = list(arg1="character"))
|
||||
#'
|
||||
#' if (checkSproc(connectionString, "fun")) {
|
||||
#' print("Function 'fun' exists!")
|
||||
#' executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
#' }
|
||||
#'
|
||||
#' ### Using a script
|
||||
#' createSprocFromScript(connectionString, name = "funScript",
|
||||
#' script = "path/to/script", inputParams = list(arg1="character"))
|
||||
#'
|
||||
#'}
|
||||
#'
|
||||
#'
|
||||
#'
|
||||
#'@seealso{
|
||||
#'\code{\link{dropSproc}}
|
||||
#'
|
||||
#'\code{\link{executeSproc}}
|
||||
#'
|
||||
#'\code{\link{checkSproc}}
|
||||
#'}
|
||||
#'
|
||||
#'@return Invisibly returns the script used to create the stored procedure
|
||||
#'
|
||||
#'@describeIn createSprocFromFunction Create stored procedure from function
|
||||
#'@export
|
||||
createSprocFromFunction <- function (connectionString, name, func, inputParams = NULL, outputParams = NULL) {
|
||||
|
||||
possibleTypes <- c("posixct", "numeric", "character", "integer", "logical", "raw", "dataframe")
|
||||
|
||||
lapply(inputParams, function(x) {
|
||||
if (!tolower(x) %in% possibleTypes) stop("Possible types are POSIXct, numeric, character, integer, logical, raw, and DataFrame.")
|
||||
})
|
||||
lapply(outputParams, function(x) {
|
||||
if (!tolower(x) %in% possibleTypes) stop("Possible types are POSIXct, numeric, character, integer, logical, raw, and DataFrame.")
|
||||
})
|
||||
|
||||
inputParameters <- methods::formalArgs(func)
|
||||
|
||||
if (!setequal(names(inputParams), inputParameters)){
|
||||
stop("inputParams and function arguments do not match!")
|
||||
}
|
||||
|
||||
procScript <- generateTSQL(func = func, spName = name, inputParams = inputParams, outputParams = outputParams)
|
||||
|
||||
tryCatch({
|
||||
register(procScript, connectionString = connectionString)
|
||||
}, error = function(e) {
|
||||
stop(paste0("Failed during registering procedure ", name, ": ", e))
|
||||
})
|
||||
|
||||
invisible(procScript)
|
||||
}
|
||||
|
||||
#'@describeIn createSprocFromFunction Create stored procedure from script file, returns output of final line
|
||||
#'
|
||||
#'@param script character string. The path to the script to wrap in the stored procedure
|
||||
#'@export
|
||||
createSprocFromScript <- function (connectionString, name, script, inputParams = NULL, outputParams = NULL) {
|
||||
|
||||
if (file.exists(script)){
|
||||
print(paste0("Script path exists, using file ", script))
|
||||
} else {
|
||||
stop("Script path doesn't exist")
|
||||
}
|
||||
|
||||
text <- paste(readLines(script), collapse="\n")
|
||||
|
||||
possibleTypes = c("posixct", "numeric", "character", "integer", "logical", "raw", "dataframe")
|
||||
|
||||
lapply(inputParams, function(x) {
|
||||
if (!tolower(x) %in% possibleTypes) stop("Possible input types are POSIXct, numeric, character, integer, logical, raw, and DataFrame.")
|
||||
})
|
||||
lapply(outputParams, function(x) {
|
||||
if (!tolower(x) %in% possibleTypes) stop("Possible output types are POSIXct, numeric, character, integer, logical, raw, and DataFrame.")
|
||||
})
|
||||
|
||||
procScript <- generateTSQLFromScript(script = text, spName = name, inputParams = inputParams, outputParams = outputParams)
|
||||
|
||||
tryCatch({
|
||||
register(procScript, connectionString = connectionString)
|
||||
}, error = function(e) {
|
||||
stop(paste0("Failed during registering procedure ", name, ": ", e))
|
||||
})
|
||||
|
||||
invisible(procScript)
|
||||
}
|
||||
|
||||
|
||||
#'Drop Stored Procedure
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString to the database
|
||||
#'@param name character string. The name of the stored procedure
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#' connectionString <- connectionInfo()
|
||||
#'
|
||||
#' dropSproc(connectionString, "fun")
|
||||
#'
|
||||
#' func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
#' createSprocFromFunction(connectionString, name = "fun",
|
||||
#' func = func, inputParams = list(arg1 = "character"))
|
||||
#'
|
||||
#' if (checkSproc(connectionString, "fun")) {
|
||||
#' print("Function 'fun' exists!")
|
||||
#' executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
#' }
|
||||
#'}
|
||||
#'
|
||||
#'
|
||||
#'@seealso{
|
||||
#'
|
||||
#'\code{\link{createSprocFromFunction}}
|
||||
#'
|
||||
#'\code{\link{executeSproc}}
|
||||
#'
|
||||
#'\code{\link{checkSproc}}
|
||||
#'}
|
||||
#'
|
||||
#'@importFrom RODBCext sqlExecute
|
||||
#'@import RODBC
|
||||
#'
|
||||
#'@export
|
||||
dropSproc <- function(connectionString, name) {
|
||||
tryCatch({
|
||||
dbhandle <- odbcDriverConnect(connectionString)
|
||||
output <- sqlExecute(dbhandle, "SELECT OBJECT_ID (?)", name, fetch=TRUE)
|
||||
if (!is.na(output)) {
|
||||
output <- sqlQuery(dbhandle, sprintf("DROP PROCEDURE %s", name))
|
||||
} else {
|
||||
output <- "Named procedure doesn't exist"
|
||||
}
|
||||
}, error = function(e) {
|
||||
stop(paste0("Error dropping the stored procedure\n"))
|
||||
}, finally = {
|
||||
odbcCloseAll()
|
||||
})
|
||||
|
||||
if (length(output) > 0) {
|
||||
print(output)
|
||||
return(FALSE)
|
||||
} else {
|
||||
print(paste0("Successfully dropped procedure ", name))
|
||||
return(TRUE)
|
||||
}
|
||||
}
|
||||
|
||||
#'Check if Stored Procedure is in Database
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString to the database
|
||||
#'@param name character string. The name of the stored procedure
|
||||
#'
|
||||
#'@return Whether the stored procedure exists in the database
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#' connectionString <- connectionInfo()
|
||||
#'
|
||||
#' dropSproc(connectionString, "fun")
|
||||
#'
|
||||
#' func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
#' createSprocFromFunction(connectionString, name = "fun",
|
||||
#' func = func, inputParams = list(arg1="character"))
|
||||
#' if (checkSproc(connectionString, "fun")) {
|
||||
#' print("Function 'fun' exists!")
|
||||
#' executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
#' }
|
||||
#'}
|
||||
#'
|
||||
#'
|
||||
#'@seealso{
|
||||
#'\code{\link{createSprocFromFunction}}
|
||||
#'
|
||||
#'\code{\link{dropSproc}}
|
||||
#'
|
||||
#'\code{\link{executeSproc}}
|
||||
#'
|
||||
#'}
|
||||
#'
|
||||
#'@importFrom RODBCext sqlExecute
|
||||
#'@import RODBC
|
||||
#'@export
|
||||
checkSproc <- function(connectionString, name) {
|
||||
tryCatch({
|
||||
dbhandle <- odbcDriverConnect(connectionString)
|
||||
output <- sqlExecute(dbhandle, "SELECT OBJECT_ID (?, N'P')", name, fetch = TRUE)
|
||||
}, error = function(e) {
|
||||
stop(paste0("Error checking the stored procedure: ", conditionMessage(e)))
|
||||
}, finally = {
|
||||
odbcCloseAll()
|
||||
})
|
||||
if (is.na(output)) {
|
||||
return(FALSE)
|
||||
} else {
|
||||
return(TRUE)
|
||||
}
|
||||
}
|
||||
|
||||
#'Execute a Stored Procedure
|
||||
#'
|
||||
#'@param connectionString character string. The connectionString for the database with the stored procedure
|
||||
#'@param name character string. The name of the stored procedure in the database to execute
|
||||
#'@param ... named list. Parameters to pass into the procedure. These MUST be named the same as the arguments to the function.
|
||||
#'
|
||||
#'@section Warning:
|
||||
#'Even though you can create stored procedures with output parameters, you CANNOT execute them
|
||||
#'using this utility due to limitations of RODBC.
|
||||
#'
|
||||
#'@examples
|
||||
#'\dontrun{
|
||||
#' connectionString <- connectionInfo()
|
||||
#'
|
||||
#' dropSproc(connectionString, "fun")
|
||||
#'
|
||||
#' func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
#' createSprocFromFunction(connectionString, name = "fun",
|
||||
#' func = func, inputParams = list(arg1="character"))
|
||||
#'
|
||||
#' if (checkSproc(connectionString, "fun")) {
|
||||
#' print("Function 'fun' exists!")
|
||||
#' executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
#' }
|
||||
#'}
|
||||
#'@seealso{
|
||||
#'\code{\link{createSprocFromFunction}}
|
||||
#'
|
||||
#'\code{\link{dropSproc}}
|
||||
#'
|
||||
#'\code{\link{checkSproc}}
|
||||
#'}
|
||||
#'@importFrom RODBCext sqlExecute
|
||||
#'@import RODBC
|
||||
#'@export
|
||||
executeSproc <- function(connectionString, name, ...) {
|
||||
if (!is.character(name))
|
||||
stop("the argument must be the name of a Sproc")
|
||||
|
||||
res <- createQuery(connectionString = connectionString, name = name, ...)
|
||||
query <- res$query
|
||||
paramOrder <- res$inputParams
|
||||
df = data.frame(...)
|
||||
|
||||
if (nrow(df) != 0 && ncol(df) != 0) {
|
||||
df <- df[paramOrder]
|
||||
}
|
||||
|
||||
tryCatch({
|
||||
dbhandle <- odbcDriverConnect(connectionString)
|
||||
result <- sqlExecute(dbhandle, query, df, fetch = TRUE)
|
||||
}, error = function(e) {
|
||||
stop(paste0("Error in SQL Execution: ", e, "\n"))
|
||||
}, finally ={
|
||||
odbcCloseAll()
|
||||
})
|
||||
|
||||
if (is.list(result)) {
|
||||
return(result)
|
||||
} else if (!is.character(result)) {
|
||||
stop(paste("Error executing the stored procedure:", name))
|
||||
} else {
|
||||
return(NULL)
|
||||
}
|
||||
}
|
||||
|
||||
#
|
||||
# Get the parameters of the stored procedure to create the query
|
||||
#
|
||||
#@param connectionString character string. The connectionString to the database
|
||||
#@param name character string. The name of the stored procedure
|
||||
#
|
||||
#@return the parameters
|
||||
#
|
||||
getSprocParams <- function(connectionString, name) {
|
||||
query <- "SELECT 'Parameter_name' = name, 'Type' = type_name(user_type_id),
|
||||
'Output' = is_output FROM sys.parameters WHERE OBJECT_ID = ?"
|
||||
|
||||
inputDataName <- NULL
|
||||
tryCatch({
|
||||
dbhandle <- odbcDriverConnect(connectionString)
|
||||
|
||||
number <- sqlExecute(dbhandle, "SELECT OBJECT_ID (?)", name, fetch=TRUE)[[1]]
|
||||
|
||||
params <- sqlExecute(dbhandle, query, number, fetch=TRUE)
|
||||
outputParams <- split(params,params$Output)[['1']]
|
||||
inputParams <- split(params,params$Output)[['0']]
|
||||
|
||||
text <- paste0(collapse="", lapply(sqlExecute(dbhandle, "EXEC sp_helptext ?", name, fetch = TRUE), as.character))
|
||||
matched <- regmatches(text, gregexpr("input_data_1_name = [^,]+",text))[[1]]
|
||||
if (length(matched) == 1) {
|
||||
inputDataName <- regmatches(matched, gregexpr("N'.*'",matched))[[1]]
|
||||
inputDataName <- gsub("(N'|')","", inputDataName)
|
||||
}
|
||||
}, error = function(e) {
|
||||
cat(paste0("Error executing the sqlExecute\n"))
|
||||
odbcCloseAll()
|
||||
stop(e)
|
||||
}, finally ={
|
||||
odbcCloseAll()
|
||||
})
|
||||
list(inputParams = inputParams, inputDataName = inputDataName, outputParams = outputParams)
|
||||
}
|
||||
|
||||
#Create the necessary query to execute the stored procedure
|
||||
#
|
||||
#@param connectionString character string. The connectionString to the database
|
||||
#@param name character string. The name of the stored procedure
|
||||
#@param ... The arguments for the stored procedure
|
||||
#
|
||||
#@return the query
|
||||
#
|
||||
createQuery <- function(connectionString, name, ...) {
|
||||
#Get and process params from the stored procedure in the database
|
||||
storedProcParams <- getSprocParams(connectionString = connectionString, name = name)
|
||||
params <- storedProcParams$inputParams
|
||||
inList <- c()
|
||||
|
||||
if (!is.null(params)) {
|
||||
for(i in seq_len(nrow(params))) {
|
||||
parameter_outer <- params[i,]$Parameter_name
|
||||
parameter <- gsub('.{6}$', '', parameter_outer)
|
||||
parameter <- gsub('@','', parameter)
|
||||
type <- params[i,]$Type
|
||||
|
||||
inList <- c(inList,parameter)
|
||||
}
|
||||
}
|
||||
inLabels <- NULL
|
||||
if (!(length(list(...)) == 1 && is.null(list(...)[[1]]))) {
|
||||
inLabels <- labels(list(...))
|
||||
if (!all(inLabels %in% inList)) {
|
||||
stop("You must provide named arguments that match the parameters in the stored procedure.")
|
||||
}
|
||||
}
|
||||
#add necessary variable declarations and value assignments
|
||||
|
||||
query <- paste0("exec ", name)
|
||||
for(p in inList) {
|
||||
paramName <- p
|
||||
query <- paste0(query, " @", paramName, "_outer = ?,")
|
||||
}
|
||||
query <- gsub(",$", "", query)
|
||||
list(query=query, inputParams=inList)
|
||||
}
|
|
@ -0,0 +1,220 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
|
||||
# the list with type conversion info
|
||||
sqlTypes <- list(posixct = "datetime", numeric = "float",
|
||||
character = "nvarchar(max)", integer = "int",
|
||||
logical = "bit", raw = "varbinary(max)", dataframe = "nvarchar(max)")
|
||||
|
||||
getSqlType <- function(rType) {
|
||||
sqlTypes[[tolower(rType)]]
|
||||
}
|
||||
|
||||
# creates the top part of the sql script (up to R code)
|
||||
getHeader <- function(spName, inputParams, outputParams) {
|
||||
header <- c(paste0 ("CREATE PROCEDURE ", spName),
|
||||
handleHeadParams(inputParams, outputParams),
|
||||
"AS",
|
||||
"BEGIN TRY",
|
||||
"exec sp_execute_external_script",
|
||||
"@language = N'R',","@script = N'")
|
||||
return(paste0(header, collapse = "\n"))
|
||||
}
|
||||
|
||||
handleHeadParams <- function(inputParams, outputParams)
|
||||
{
|
||||
paramString <- c()
|
||||
|
||||
makeString <- function(name, d, output = "") {
|
||||
rType <- d[[name]]
|
||||
sqlType <- getSqlType(rType)
|
||||
paste0(" @", name, "_outer ", sqlType, output)
|
||||
}
|
||||
|
||||
for(name in names(inputParams)) {
|
||||
paramString <- c(paramString, makeString(name, inputParams))
|
||||
}
|
||||
for(name in names(outputParams)) {
|
||||
rType <- outputParams[[name]]
|
||||
if (tolower(rType) != "dataframe") {
|
||||
paramString <- c(paramString, makeString(name, outputParams, " output"))
|
||||
}
|
||||
}
|
||||
return(paste0(paramString, collapse = ",\n"))
|
||||
}
|
||||
|
||||
generateTSQL <- function(func, spName, inputParams = NULL, outputParams = NULL ) {
|
||||
# header that creates the stored procedure
|
||||
header <- getHeader(spName, inputParams, outputParams)
|
||||
|
||||
# vector containing R code
|
||||
rCode <- getRCode(func, outputParams)
|
||||
|
||||
# tail of the sp
|
||||
tail <- getTail(inputParams, outputParams)
|
||||
|
||||
# paste0() has no sep argument, so the trailing newline is passed as a plain string
paste0(header, rCode, tail, "\n")
|
||||
}
|
||||
|
||||
generateTSQLFromScript <- function(script, spName, inputParams, outputParams) {
|
||||
# header that creates the stored procedure
|
||||
header <- getHeader(spName, inputParams = inputParams, outputParams = outputParams)
|
||||
|
||||
# vector containing R code
|
||||
rCode <- getRCodeFromScript(script = script, outputParams = outputParams)
|
||||
|
||||
# tail of the sp
|
||||
tail <- getTail(inputParams = inputParams, outputParams = outputParams)
|
||||
|
||||
# paste0() has no sep argument, so the trailing newline is passed as a plain string
paste0(header, rCode, tail, "\n")
|
||||
}
|
||||
|
||||
|
||||
|
||||
# creates the bottom part of the sql script (after R code)
|
||||
getTail <- function(inputParams, outputParams) {
|
||||
tail <- c("'")
|
||||
tailParams <- handleTailParams(inputParams, outputParams)
|
||||
if (tailParams != "")
|
||||
tail <- c("',")
|
||||
tail <- c(tail,
|
||||
tailParams,
|
||||
"END TRY",
|
||||
"BEGIN CATCH",
|
||||
"THROW;",
|
||||
"END CATCH;")
|
||||
return(paste0(tail, collapse = "\n"))
|
||||
}
|
||||
|
||||
handleTailParams <- function(inputParams, outputParams) {
|
||||
inDataString <- c()
|
||||
outDataString <- c()
|
||||
paramString <- c()
|
||||
overallParams <- c()
|
||||
|
||||
makeString <- function(name, d, output = "") {
|
||||
rType <- d[[name]]
|
||||
if (tolower(rType) == "dataframe") {
|
||||
if (output=="") {
|
||||
c(paste0("@input_data_1 = @", name, "_outer"),
|
||||
paste0("@input_data_1_name = N'", name, "'"))
|
||||
} else {
|
||||
c(paste0("@output_data_1_name = N'", name, "'"))
|
||||
}
|
||||
} else {
|
||||
sqlType <- getSqlType(rType)
|
||||
overallParams <<- c(overallParams, paste0("@", name, " ", sqlType, output))
|
||||
paste0("@", name, " = ", "@", name, "_outer", output)
|
||||
}
|
||||
}
|
||||
|
||||
for(name in names(inputParams)) {
|
||||
rType <- inputParams[[name]]
|
||||
if (tolower(rType) == "dataframe") {
|
||||
inDataString <- c(makeString(name, inputParams))
|
||||
} else {
|
||||
paramString <- c(paramString, makeString(name, inputParams))
|
||||
}
|
||||
}
|
||||
for(name in names(outputParams)) {
|
||||
rType <- outputParams[[name]]
|
||||
if (tolower(rType) == "dataframe") {
|
||||
outDataString <- c(makeString(name, outputParams, " output"))
|
||||
} else {
|
||||
paramString <- c(paramString, makeString(name, outputParams, " output"))
|
||||
}
|
||||
}
|
||||
if (length(overallParams) > 0) {
|
||||
overallParams <- paste0(overallParams, collapse = ", ")
|
||||
overallParams <- paste0("@params = N'" , overallParams,"'")
|
||||
}
|
||||
return(paste0(c(inDataString, outDataString, overallParams, paramString), collapse = ",\n"))
|
||||
}
|
||||
|
||||
getRCodeFromScript <- function(script, inputParams, outputParams) {
|
||||
# escape single quotes and get rid of tabs
|
||||
script <- sapply(script, gsub, pattern = "\t", replacement = " ")
|
||||
script <- sapply(script, gsub, pattern = "'", replacement = "''")
|
||||
|
||||
return(paste0(script, collapse = "\n"))
|
||||
}
|
||||
|
||||
getRCode <- function(func, outputParams) {
|
||||
name <- as.character(substitute(func))
|
||||
|
||||
funcBody <- deparse(func)
|
||||
|
||||
# add on the function definition
|
||||
funcBody[1] <- paste(name, "<-", funcBody[1], sep = " ")
|
||||
|
||||
# escape single quotes and get rid of tabs
|
||||
funcBody <- sapply(funcBody, gsub, pattern = "\t", replacement = " ")
|
||||
funcBody <- sapply(funcBody, gsub, pattern = "'", replacement = "''")
|
||||
|
||||
inputParameters <- methods::formalArgs(func)
|
||||
|
||||
funcInputNames <- paste(inputParameters, inputParameters,
|
||||
sep = " = ")
|
||||
funcInputNames <- paste(funcInputNames, collapse = ", ")
|
||||
|
||||
# add function call
|
||||
funcBody <- c(funcBody, paste0("result <- ", name,
|
||||
paste0("(", funcInputNames, ")")))
|
||||
|
||||
# add appropriate ending
|
||||
ending <- getEnding(outputParams)
|
||||
funcBody <- c(funcBody, ending)
|
||||
return(paste0(funcBody, collapse = "\n"))
|
||||
}
|
||||
|
||||
#
|
||||
# Get ending string
|
||||
# We change the result into an OutputDataSet - we only expect a single OutputDataSet result
|
||||
getEnding <- function(outputParams) {
|
||||
outputDataSetName <- "OutputDataSet"
|
||||
for(name in names(outputParams)) {
|
||||
if (tolower(outputParams[[name]]) == "dataframe") {
|
||||
outputDataSetName <- name
|
||||
}
|
||||
}
|
||||
ending <- c( "if (is.data.frame(result)) {",
|
||||
paste0(" ", outputDataSetName," <- result")
|
||||
)
|
||||
|
||||
if (length(outputParams) > 0) {
|
||||
ending <- c(ending, "} else if (is.list(result)) {")
|
||||
|
||||
for(name in names(outputParams)) {
|
||||
if (tolower(outputParams[[name]]) == "dataframe") {
|
||||
ending <- c(ending,paste0(" ", name," <- result$", name))
|
||||
} else {
|
||||
ending <- c(ending,paste0(" ", name, " <- result$", name))
|
||||
}
|
||||
}
|
||||
ending <- c(ending,
|
||||
"} else if (!is.null(result)) {",
|
||||
" stop(\"the R function must return a list\")"
|
||||
)
|
||||
}
|
||||
ending <- c(ending, "}")
|
||||
}
|
||||
|
||||
# @import RODBC
|
||||
# Execute the registration script
|
||||
register <- function(registrationScript, connectionString) {
|
||||
output <- character(0)
|
||||
|
||||
tryCatch({
|
||||
dbhandle <- odbcDriverConnect(connectionString)
|
||||
output <- sqlQuery(dbhandle, registrationScript)
|
||||
}, error = function(e) {
|
||||
stop(paste0("Error in SQL Execution:\n", e))
|
||||
}, finally ={
|
||||
odbcCloseAll()
|
||||
})
|
||||
if (length(output) > 0 ) {
|
||||
stop(output)
|
||||
}
|
||||
}
|
||||
|
|
@ -0,0 +1,151 @@
|
|||
# sqlmlutils

sqlmlutils is an R package to help execute R code on a SQL Server machine.

# Installation

Run
```
R CMD INSTALL dist/sqlmlutils_0.5.0.zip
```
OR
To build a new package file and install, run
```
.\buildandinstall.cmd
```

# Getting started

Shown below are the important functions sqlmlutils provides:
```R
connectionInfo          # Create a connection string for connecting to the SQL Server

executeFunctionInSQL    # Execute an R function inside the SQL database
executeScriptInSQL      # Execute an R script inside the SQL database
executeSQLQuery         # Execute a SQL query on the database and return the resultant table

createSprocFromFunction # Create a stored procedure based on an R function inside the SQL database
createSprocFromScript   # Create a stored procedure based on an R script inside the SQL database
checkSproc              # Check whether a stored procedure exists in the SQL database
dropSproc               # Drop a stored procedure in the SQL database
executeSproc            # Execute a stored procedure in the SQL database

sql_install.packages    # Install packages in the SQL database
sql_remove.packages     # Remove packages from the SQL database
sql_installed.packages  # Enumerate packages that are installed on the SQL database
```
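
For orientation, `connectionInfo` produces the connection string that every other function accepts as `connectionString`. The sketch below shows what such an ODBC string might look like. `sketchConnectionString` is a hypothetical helper written for this illustration, not the package's implementation, and the exact keys the package emits are an assumption; only the argument names and defaults are taken from the documented usage.

```R
# Hypothetical stand-in for connectionInfo(); the real function may format
# its output differently. Argument names and defaults mirror the documented usage.
sketchConnectionString <- function(driver = "SQL Server", server = "localhost",
                                   database = "master", uid = NULL, pwd = NULL) {
    # Use Windows (trusted) authentication unless a user id is supplied
    auth <- if (is.null(uid)) {
        "Trusted_Connection=Yes;"
    } else {
        paste0("UID=", uid, ";PWD=", pwd, ";")
    }
    sprintf("Driver={%s};Server=%s;Database=%s;%s", driver, server, database, auth)
}

connStr <- sketchConnectionString(database = "AirlineTestDB")
print(connStr)
```

Building the string in one place keeps credentials out of the individual calls; whether the package does it exactly this way is not confirmed by this README.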

# Examples

### Execute In SQL
##### Execute an R function in database using sp_execute_external_script

```R
library(sqlmlutils)
connection <- connectionInfo()

funcWithArgs <- function(arg1, arg2){
    return(c(arg1, arg2))
}
result <- executeFunctionInSQL(connection, funcWithArgs, arg1="result1", arg2="result2")
```
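
Results travel back from the server as a serialized R object encoded as hexadecimal strings, which the client decodes and unserializes. The following runs entirely locally and sketches that round trip, assuming the transport returns the hex strings unchanged. `payload`, `hexStrings`, `rawBytes` and `restored` are illustrative names only.

```R
# What the server-side script would produce: a list serialized to raw bytes,
# then turned into two-character hex strings (one per byte).
payload <- list(result = c("result1", "result2"), output = "printed text")
hexStrings <- as.character(serialize(payload, NULL))

# What the client does with the returned column: hex strings -> raw -> R object
rawBytes <- as.raw(as.hexmode(hexStrings))
restored <- unserialize(rawBytes)

print(restored$result)
```

Shipping the whole list as one serialized blob is what lets arbitrary R objects (vectors, models, data frames with their column names) survive a result set that only carries a single character column.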

##### Generate a linear model without the data leaving the machine

```R
library(sqlmlutils)
connection <- connectionInfo(database="AirlineTestDB")

linearModel <- function(in_df, xCol, yCol) {
    lm(paste0(yCol, " ~ ", xCol), in_df)
}

model <- executeFunctionInSQL(connectionString = connection, func = linearModel, xCol = "CRSDepTime", yCol = "ArrDelay",
                              inputDataQuery = "SELECT TOP 100 * FROM airline5000")
model
```
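
When `inputDataQuery` is supplied, the query result is bound to the function's first argument and the remaining named arguments are passed alongside it. The same call can be tried locally before sending it to the server. This sketch substitutes the built-in `mtcars` data set for the query result, so the column names `wt` and `mpg` are stand-ins rather than columns of `airline5000`.

```R
linearModel <- function(in_df, xCol, yCol) {
    lm(paste0(yCol, " ~ ", xCol), in_df)
}

# Stand-in for the data frame the query would return
queryResult <- mtcars

# Bind the data to the first formal argument, then append the named arguments
firstArg <- names(formals(linearModel))[[1]]
argList <- c(setNames(list(queryResult), firstArg), list(xCol = "wt", yCol = "mpg"))

localModel <- do.call(linearModel, argList)
print(coef(localModel))
```

Because the data is matched to the first formal argument by name, the function needs at least one argument whenever a query is given.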

##### Execute a SQL Query from R

```R
library(sqlmlutils)
connection <- connectionInfo(database="AirlineTestDB")

dataTable <- executeSQLQuery(connectionString = connection, sqlQuery="SELECT TOP 100 * FROM airline5000")
stopifnot(nrow(dataTable) == 100)
stopifnot(ncol(dataTable) == 30)
```

### Stored Procedures (Sproc)
##### Create and call a T-SQL stored procedure based on an R function

```R
library(sqlmlutils)

spPredict <- function(inputDataFrame) {
    library(RevoScaleR)
    model <- rxLinMod(ArrDelay ~ CRSDepTime, inputDataFrame)
    rxPredict(model, inputDataFrame)
}

connection <- connectionInfo(database="AirlineTestDB")
inputParams <- list(inputDataFrame = "Dataframe")

name <- "prediction"

createSprocFromFunction(connectionString = connection, name = name, func = spPredict, inputParams = inputParams)
stopifnot(checkSproc(connectionString = connection, name = name))

predictions <- executeSproc(connectionString = connection, name = name, inputDataFrame = "select ArrDelay, CRSDepTime, DayOfWeek from airline5000")
stopifnot(nrow(predictions) == 5000)

dropSproc(connectionString = connection, name = name)
```
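
Behind `createSprocFromFunction`, each entry in `inputParams` is translated to a SQL type and becomes a `@<name>_outer` parameter of the generated procedure. The sketch below reproduces only that header step in simplified form. `sqlTypesSketch` and `sketchProcedureHeader` are names invented for this illustration, and the real generator also emits the script body, error handling and parameter forwarding.

```R
# R type -> SQL type, following the mapping used by the package sources
sqlTypesSketch <- list(posixct = "datetime", numeric = "float",
                       character = "nvarchar(max)", integer = "int",
                       logical = "bit", raw = "varbinary(max)",
                       dataframe = "nvarchar(max)")

# Hypothetical, simplified header builder (illustration only)
sketchProcedureHeader <- function(spName, inputParams) {
    paramLines <- vapply(names(inputParams), function(name) {
        sqlType <- sqlTypesSketch[[tolower(inputParams[[name]])]]
        if (is.null(sqlType)) stop("Unsupported type for parameter ", name)
        paste0("  @", name, "_outer ", sqlType)
    }, character(1))
    paste(c(paste0("CREATE PROCEDURE ", spName),
            paste(paramLines, collapse = ",\n"),
            "AS"), collapse = "\n")
}

header <- sketchProcedureHeader("prediction", list(inputDataFrame = "Dataframe"))
cat(header, "\n")
```

A `Dataframe` input maps to `nvarchar(max)` because the caller passes a query string, as in the `executeSproc` call above, rather than the data itself.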

### Package Management
##### Install and remove packages from SQL Server

```R
library(sqlmlutils)
connection <- connectionInfo(database="AirlineTestDB")

# install glue on sql server
pkgs <- c("glue")
sql_install.packages(connectionString = connection, pkgs, verbose = TRUE, scope="PUBLIC")

# confirm glue is installed on sql server
r <- sql_installed.packages(connectionString = connection, fields=c("Package", "LibPath", "Attributes", "Scope"))
View(r)

# use glue on sql server
useLibraryGlueInSql <- function()
{
    library(glue)

    name <- "Fred"
    age <- 50
    anniversary <- as.Date("1991-10-12")
    glue('My name is {name},',
         'my age next year is {age + 1},',
         'my anniversary is {format(anniversary, "%A, %B %d, %Y")}.')
}

result <- executeFunctionInSQL(connectionString = connection, func = useLibraryGlueInSql)
print(result)

# remove glue from sql server
sql_remove.packages(connectionString = connection, pkgs, scope="PUBLIC")
```

# Notes for Developers

### Running the tests

1. Make sure a SQL Server instance with up-to-date ML Services (R) is running on localhost.
2. Restore the AirlineTestDB from the .bak file in this repo.
3. Make sure Trusted (Windows) authentication works for connecting to the database.

### Notable TODOs and open issues

1. Output Parameter execution does not work - RODBCext limitations?
2. Testing from a Linux client has not been performed.
|
|
@ -0,0 +1,5 @@
|
|||
pushd .
|
||||
cd ..
|
||||
R CMD INSTALL --build R
|
||||
mv sqlmlutils_0.5.0.zip R/dist
|
||||
popd
|
Binary file not shown.
|
@ -0,0 +1,46 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/storedProcedure.R
|
||||
\name{checkSproc}
|
||||
\alias{checkSproc}
|
||||
\title{Check if Stored Procedure is in Database}
|
||||
\usage{
|
||||
checkSproc(connectionString, name)
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString to the database}
|
||||
|
||||
\item{name}{character string. The name of the stored procedure}
|
||||
}
|
||||
\value{
|
||||
Whether the stored procedure exists in the database
|
||||
}
|
||||
\description{
|
||||
Check if Stored Procedure is in Database
|
||||
}
|
||||
\examples{
|
||||
\dontrun{
|
||||
connectionString <- connectionInfo()
|
||||
|
||||
dropSproc(connectionString, "fun")
|
||||
|
||||
func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
createSprocFromFunction(connectionString, name = "fun",
|
||||
func = func, inputParams = list(arg1="character"))
|
||||
if (checkSproc(connectionString, "fun")) {
|
||||
print("Function 'fun' exists!")
|
||||
executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
\code{\link{createSprocFromFunction}}
|
||||
|
||||
\code{\link{dropSproc}}
|
||||
|
||||
\code{\link{executeSproc}}
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,38 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/executeInSQL.R
|
||||
\name{connectionInfo}
|
||||
\alias{connectionInfo}
|
||||
\title{Build a connection string}
|
||||
\usage{
|
||||
connectionInfo(driver = "SQL Server", server = "localhost",
|
||||
database = "master", uid = NULL, pwd = NULL)
|
||||
}
|
||||
\arguments{
|
||||
\item{driver}{The driver to use for the connection - defaults to SQL Server}
|
||||
|
||||
\item{server}{The server to connect to - defaults to localhost}
|
||||
|
||||
\item{database}{The database to connect to - defaults to master}
|
||||
|
||||
\item{uid}{The user id for the connection. If uid is NULL, default to Trusted Connection}
|
||||
|
||||
\item{pwd}{The password for the connection. If uid is not NULL, pwd is required}
|
||||
}
|
||||
\value{
|
||||
A fully formed connection string
|
||||
}
|
||||
\description{
|
||||
Build a fully formed connection string from the driver, server, database and credentials
|
||||
}
|
||||
\examples{
|
||||
\dontrun{
|
||||
|
||||
connectionInfo()
|
||||
# [1] "Driver={SQL Server};Server=localhost;Database=master;Trusted_Connection=Yes;"
|
||||
|
||||
connectionInfo(server="ServerName", database="AirlineTestDB", uid="username", pwd="pass")
|
||||
# [1] "Driver={SQL Server};Server=ServerName;Database=AirlineTestDB;uid=username;pwd=pass;"
|
||||
}
|
||||
|
||||
|
||||
}
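
The two documented outputs suggest how the string is assembled. The following is a minimal sketch, assuming the format is exactly what the examples show; `buildConnectionString` is a hypothetical stand-in written for illustration and is not the package's actual source.

```R
# Hypothetical re-implementation inferred from the documented example outputs.
buildConnectionString <- function(driver = "SQL Server", server = "localhost",
                                  database = "master", uid = NULL, pwd = NULL) {
    if (is.null(uid)) {
        # No user id: fall back to Trusted Connection
        auth <- "Trusted_Connection=Yes;"
    } else {
        # A user id was given, so a password is required
        if (is.null(pwd)) stop("pwd is required when uid is not NULL")
        auth <- sprintf("uid=%s;pwd=%s;", uid, pwd)
    }
    sprintf("Driver={%s};Server=%s;Database=%s;%s", driver, server, database, auth)
}

trusted <- buildConnectionString()
sqlAuth <- buildConnectionString(server = "ServerName", database = "AirlineTestDB",
                                 uid = "username", pwd = "pass")
```

Here `trusted` and `sqlAuth` reproduce the two strings shown in the examples section.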
|
|
@ -0,0 +1,83 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/storedProcedure.R
|
||||
\name{createSprocFromFunction}
|
||||
\alias{createSprocFromFunction}
|
||||
\alias{createSprocFromScript}
|
||||
\title{Create a Stored Procedure}
|
||||
\usage{
|
||||
createSprocFromFunction(connectionString, name, func, inputParams = NULL,
|
||||
outputParams = NULL)
|
||||
|
||||
createSprocFromScript(connectionString, name, script, inputParams = NULL,
|
||||
outputParams = NULL)
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString to the database}
|
||||
|
||||
\item{name}{character string. The name of the stored procedure}
|
||||
|
||||
\item{func}{closure. The function to wrap in the stored procedure}
|
||||
|
||||
\item{inputParams}{named list. The types of the inputs,
|
||||
where the names are the arguments and the values are the types}
|
||||
|
||||
\item{outputParams}{named list. The types of the outputs,
|
||||
where the names are the arguments and the values are the types}
|
||||
|
||||
\item{script}{character string. The path to the script to wrap in the stored procedure}
|
||||
}
|
||||
\value{
|
||||
Invisibly returns the script used to create the stored procedure
|
||||
}
|
||||
\description{
|
||||
This function creates a stored procedure from a function
|
||||
on the database and returns the object.
|
||||
}
|
||||
\section{Functions}{
|
||||
\itemize{
|
||||
\item \code{createSprocFromFunction}: Create stored procedure from function
|
||||
|
||||
\item \code{createSprocFromScript}: Create stored procedure from script file, returns output of final line
|
||||
}}
|
||||
|
||||
\section{Warning}{
|
||||
|
||||
You can add output parameters to the stored procedure
|
||||
but you will not be able to execute the procedure from R afterwards.
|
||||
Any stored procedure with output params must be executed directly in SQL.
|
||||
}
|
||||
|
||||
\examples{
|
||||
\dontrun{
|
||||
connectionString <- connectionInfo()
|
||||
|
||||
### Using a function
|
||||
dropSproc(connectionString, "fun")
|
||||
|
||||
func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
createSprocFromFunction(connectionString, name = "fun",
|
||||
func = func, inputParams = list(arg1="character"))
|
||||
|
||||
if (checkSproc(connectionString, "fun")) {
|
||||
print("Function 'fun' exists!")
|
||||
executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
}
|
||||
|
||||
### Using a script
|
||||
createSprocFromScript(connectionString, name = "funScript",
|
||||
script = "path/to/script", inputParams = list(arg1="character"))
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
\code{\link{dropSproc}}
|
||||
|
||||
\code{\link{executeSproc}}
|
||||
|
||||
\code{\link{checkSproc}}
|
||||
}
|
||||
}
|
|
@ -0,0 +1,44 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/storedProcedure.R
|
||||
\name{dropSproc}
|
||||
\alias{dropSproc}
|
||||
\title{Drop Stored Procedure}
|
||||
\usage{
|
||||
dropSproc(connectionString, name)
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString to the database}
|
||||
|
||||
\item{name}{character string. The name of the stored procedure}
|
||||
}
|
||||
\description{
|
||||
Drop Stored Procedure
|
||||
}
|
||||
\examples{
|
||||
\dontrun{
|
||||
connectionString <- connectionInfo()
|
||||
|
||||
dropSproc(connectionString, "fun")
|
||||
|
||||
func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
createSprocFromFunction(connectionString, name = "fun",
|
||||
func = func, inputParams = list(arg1 = "character"))
|
||||
|
||||
if (checkSproc(connectionString, "fun")) {
|
||||
print("Function 'fun' exists!")
|
||||
executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
|
||||
\code{\link{createSprocFromFunction}}
|
||||
|
||||
\code{\link{executeSproc}}
|
||||
|
||||
\code{\link{checkSproc}}
|
||||
}
|
||||
}
|
|
@ -0,0 +1,40 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/executeInSQL.R
|
||||
\name{executeFunctionInSQL}
|
||||
\alias{executeFunctionInSQL}
|
||||
\title{Execute a function in SQL}
|
||||
\usage{
|
||||
executeFunctionInSQL(connectionString, func, ..., inputDataQuery = "")
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString to the database}
|
||||
|
||||
\item{func}{closure. The function to execute}
|
||||
|
||||
\item{...}{A named list of arguments to pass into the function}
|
||||
|
||||
\item{inputDataQuery}{character string. A string to query the database.
|
||||
The result of the query is read into a data frame and passed as the first argument of the function}
|
||||
}
|
||||
\value{
|
||||
The returned value from the function
|
||||
}
|
||||
\description{
|
||||
Execute a function in SQL
|
||||
}
|
||||
\examples{
|
||||
\dontrun{
|
||||
connection <- connectionInfo(database = "AirlineTestDB")
|
||||
|
||||
foo <- function(in_df, arg) {
|
||||
list(data = in_df, value = arg)
|
||||
}
|
||||
executeFunctionInSQL(connection, foo, arg = 12345,
|
||||
inputDataQuery = "SELECT top 1 * from airline5000")
|
||||
}
|
||||
|
||||
|
||||
}
|
||||
\seealso{
|
||||
\code{\link{executeScriptInSQL}} to execute a script file instead of a function in SQL
|
||||
}
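
To make the argument binding concrete, the sketch below simulates it locally without a database. It assumes only what the documentation states: the query result arrives as a data frame in the first argument, and the remaining arguments are passed through `...`. `simulateExecuteFunction` and `fakeQueryResult` are hypothetical names, and nothing here reflects how the package ships the function to SQL Server.

```R
# Local stand-in for executeFunctionInSQL(); runs func in the current R session.
simulateExecuteFunction <- function(func, ..., inputData = NULL) {
    args <- list(...)
    if (!is.null(inputData)) {
        # Put the data frame first so it binds to the first formal argument
        args <- c(list(inputData), args)
    }
    do.call(func, args)
}

foo <- function(in_df, arg) {
    list(data = in_df, value = arg)
}

# Made-up rows standing in for the result of inputDataQuery
fakeQueryResult <- data.frame(ArrDelay = 6L, DayOfWeek = "Monday")
result <- simulateExecuteFunction(foo, arg = 12345, inputData = fakeQueryResult)
```

Because `arg` is passed by name, the unnamed data frame is matched to `in_df`, which is the same shape of call as the `foo` example above.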
|
|
@ -0,0 +1,27 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/executeInSQL.R
|
||||
\name{executeSQLQuery}
|
||||
\alias{executeSQLQuery}
|
||||
\title{Execute a SQL query}
|
||||
\usage{
|
||||
executeSQLQuery(connectionString, sqlQuery)
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString to the database}
|
||||
|
||||
\item{sqlQuery}{character string. The query to execute}
|
||||
}
|
||||
\value{
|
||||
The data frame returned by the query to the database
|
||||
}
|
||||
\description{
|
||||
Execute a SQL query and return the result as a data frame
|
||||
}
|
||||
\examples{
|
||||
\dontrun{
|
||||
connection <- connectionInfo(database="AirlineTestDB")
|
||||
executeSQLQuery(connection, sqlQuery="SELECT top 1 * from airline5000")
|
||||
}
|
||||
|
||||
|
||||
}
|
|
@ -0,0 +1,25 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/executeInSQL.R
|
||||
\name{executeScriptInSQL}
|
||||
\alias{executeScriptInSQL}
|
||||
\title{Execute a script in SQL}
|
||||
\usage{
|
||||
executeScriptInSQL(connectionString, script, inputDataQuery = "")
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString to the database}
|
||||
|
||||
\item{script}{character string. The path to the script to execute in SQL}
|
||||
|
||||
\item{inputDataQuery}{character string. A string to query the database.
|
||||
The result of the query is read into a data frame and stored in the variable "InputDataSet" in the environment}
|
||||
}
|
||||
\value{
|
||||
The returned value from the last line of the script
|
||||
}
|
||||
\description{
|
||||
Execute a script in SQL
|
||||
}
|
||||
\seealso{
|
||||
\code{\link{executeFunctionInSQL}} to execute a user function instead of a script in SQL
|
||||
}
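
The "last line" return rule can be illustrated locally with base R's `source()`, which reports the value of the final expression it evaluates. This is a sketch under that assumption only: `runScriptLocally` is a hypothetical helper, the script runs in the current session rather than in SQL Server, and the script body is written to a temporary file for the demonstration.

```R
# Hypothetical local analogue of executeScriptInSQL(); no database involved.
runScriptLocally <- function(script, inputData = NULL) {
    if (!file.exists(script)) stop("Script path doesn't exist")
    env <- new.env()
    if (!is.null(inputData)) {
        # Mirror the documented variable name for the query result
        assign("InputDataSet", inputData, envir = env)
    }
    # source() returns a list whose $value is the last evaluated expression
    source(script, local = env)$value
}

scriptPath <- tempfile(fileext = ".R")
writeLines(c("sum1 <- 1 + 2",
             "sum2 <- 5 + 6",
             "product <- sum1 * sum2",
             "product"), scriptPath)

result <- runScriptLocally(scriptPath)
```

The value of `result` is whatever the final line `product` evaluates to, here `(1 + 2) * (5 + 6)`.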
|
|
@ -0,0 +1,49 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/storedProcedure.R
|
||||
\name{executeSproc}
|
||||
\alias{executeSproc}
|
||||
\title{Execute a Stored Procedure}
|
||||
\usage{
|
||||
executeSproc(connectionString, name, ...)
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{character string. The connectionString for the database with the stored procedure}
|
||||
|
||||
\item{name}{character string. The name of the stored procedure in the database to execute}
|
||||
|
||||
\item{...}{named list. Parameters to pass into the procedure. These MUST be named the same as the arguments to the function.}
|
||||
}
|
||||
\description{
|
||||
Execute a Stored Procedure
|
||||
}
|
||||
\section{Warning}{
|
||||
|
||||
Even though you can create stored procedures with output parameters, you CANNOT execute them
|
||||
using this utility due to limitations of RODBC.
|
||||
}
|
||||
|
||||
\examples{
|
||||
\dontrun{
|
||||
connectionString <- connectionInfo()
|
||||
|
||||
dropSproc(connectionString, "fun")
|
||||
|
||||
func <- function(arg1) {return(data.frame(hello = arg1))}
|
||||
createSprocFromFunction(connectionString, name = "fun",
|
||||
func = func, inputParams = list(arg1="character"))
|
||||
|
||||
if (checkSproc(connectionString, "fun")) {
|
||||
print("Function 'fun' exists!")
|
||||
executeSproc(connectionString, "fun", arg1="WORLD")
|
||||
}
|
||||
}
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
\code{\link{createSprocFromFunction}}
|
||||
|
||||
\code{\link{dropSproc}}
|
||||
|
||||
\code{\link{checkSproc}}
|
||||
}
|
||||
}
|
|
@ -0,0 +1,40 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/sqlPackage.R
|
||||
\name{sql_install.packages}
|
||||
\alias{sql_install.packages}
|
||||
\title{sql_install.packages}
|
||||
\usage{
|
||||
sql_install.packages(connectionString, pkgs, skipMissing = FALSE,
|
||||
repos = getOption("repos"), verbose = getOption("verbose"),
|
||||
scope = "private", owner = "")
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{ODBC connection string to Microsoft SQL Server database.}
|
||||
|
||||
\item{pkgs}{character vector of the names of packages whose current versions should be downloaded from the repositories. If repos = NULL, a character vector of file paths of .zip files containing binary builds of packages. (http:// and file:// URLs are also accepted and the files will be downloaded and installed from local copies).}
|
||||
|
||||
\item{skipMissing}{logical. If TRUE, skips missing dependent packages for which otherwise an error is generated.}
|
||||
|
||||
\item{repos}{character vector, the base URL(s) of the repositories to use. Can be NULL to install from local files or directories.}
|
||||
|
||||
\item{verbose}{logical. If TRUE, more detailed information is given during installation of packages.}
|
||||
|
||||
\item{scope}{character string. Should be either "public" or "private". "public" installs the packages to a per-database public location on SQL Server which in turn can be used (referred) by multiple different users. "private" installs the packages to a per-database, per-user private location on SQL Server which is only accessible to the single user.}
|
||||
|
||||
\item{owner}{character string. Should be either empty '' or a valid SQL database user account name. Only 'dbo' or users in 'db_owner' role for a database can specify this value to install packages on behalf of other users. A user who is member of the 'db_owner' group can set owner='dbo' to install on the "public" folder.}
|
||||
}
|
||||
\value{
|
||||
invisible(NULL)
|
||||
}
|
||||
\description{
|
||||
Installs R packages on a SQL Server database. Packages are downloaded on the client and then copied and installed to SQL Server into "public" and "private" folders. Packages in the "public" folders can be loaded by all database users running R script in SQL. Packages in the "private" folder can be loaded only by a single user. 'dbo' users always install into the "public" folder. Users who are members of the 'db_owner' role can install to both "public" and "private" folders. All other users can only install packages to their "private" folder.
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
\code{\link{sql_remove.packages}} to remove packages
|
||||
|
||||
\code{\link{sql_installed.packages}} to enumerate the installed packages
|
||||
|
||||
\code{\link{install.packages}} for the base version of this function
|
||||
}
|
||||
}
|
|
@ -0,0 +1,42 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/sqlPackage.R
|
||||
\name{sql_installed.packages}
|
||||
\alias{sql_installed.packages}
|
||||
\title{sql_installed.packages}
|
||||
\usage{
|
||||
sql_installed.packages(connectionString, priority = NULL, noCache = FALSE,
|
||||
fields = "Package", subarch = NULL, scope = "private", owner = "")
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{ODBC connection string to Microsoft SQL Server database.}
|
||||
|
||||
\item{priority}{character vector or NULL (default). If non-null, used to select packages; "high" is equivalent to c("base", "recommended"). To select all packages without an assigned priority use priority = "NA".}
|
||||
|
||||
\item{noCache}{logical. If TRUE, do not use cached information, nor cache it.}
|
||||
|
||||
\item{fields}{a character vector giving the fields to extract from each package's DESCRIPTION file, or NULL. If NULL, the following fields are used:
|
||||
"Package", "LibPath", "Version", "Priority", "Depends", "Imports", "LinkingTo", "Suggests", "Enhances", "License", "License_is_FOSS", "License_restricts_use", "OS_type", "MD5sum", "NeedsCompilation", and "Built".
|
||||
Unavailable fields result in NA values.}
|
||||
|
||||
\item{subarch}{character string or NULL. If non-null and non-empty, used to select packages which are installed for that sub-architecture}
|
||||
|
||||
\item{scope}{character string which can be "private" or "public".}
|
||||
|
||||
\item{owner}{character string of a user whose private packages shall be listed (available to dbo or db_owner users only)}
|
||||
}
|
||||
\value{
|
||||
matrix with enumerated packages
|
||||
}
|
||||
\description{
|
||||
Enumerates the currently installed R packages on a SQL Server for the current database
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
\code{\link{sql_install.packages}} to install packages
|
||||
|
||||
\code{\link{sql_remove.packages}} to remove packages
|
||||
|
||||
\code{\link{installed.packages}} for the base version of this function
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,41 @@
|
|||
% Generated by roxygen2: do not edit by hand
|
||||
% Please edit documentation in R/sqlPackage.R
|
||||
\name{sql_remove.packages}
|
||||
\alias{sql_remove.packages}
|
||||
\title{sql_remove.packages}
|
||||
\usage{
|
||||
sql_remove.packages(connectionString, pkgs, dependencies = TRUE,
|
||||
checkReferences = TRUE, verbose = getOption("verbose"),
|
||||
scope = "private", owner = "")
|
||||
}
|
||||
\arguments{
|
||||
\item{connectionString}{ODBC connection string to SQL Server database.}
|
||||
|
||||
\item{pkgs}{character vector of names of the packages to be removed.}
|
||||
|
||||
\item{dependencies}{logical. If TRUE, does dependency resolution of the packages being removed and removes the dependent packages also if the dependent packages aren't referenced by other packages outside the dependency closure.}
|
||||
|
||||
\item{checkReferences}{logical. If TRUE, verifies there are no references to the dependent packages by other packages outside the dependency closure. Use FALSE to force removal of packages even when other packages depend on it.}
|
||||
|
||||
\item{verbose}{logical. If TRUE, more detailed information is given during removal of packages.}
|
||||
|
||||
\item{scope}{character string. Should be either "public" or "private". "public" removes the packages from a per-database public location on SQL Server which in turn could have been used (referred) by multiple different users. "private" removes the packages from a per-database, per-user private location on SQL Server which is only accessible to the single user.}
|
||||
|
||||
\item{owner}{character string. Should be either empty '' or a valid SQL database user account name. Only 'dbo' or users in 'db_owner' role for a database can specify this value to remove packages on behalf of other users. A user who is member of the 'db_owner' group can set owner='dbo' to remove packages from the "public" folder.}
|
||||
}
|
||||
\value{
|
||||
invisible(NULL)
|
||||
}
|
||||
\description{
|
||||
Removes R packages from a SQL Server database.
|
||||
}
|
||||
\seealso{
|
||||
{
|
||||
\code{\link{sql_install.packages}} to install packages
|
||||
|
||||
\code{\link{sql_installed.packages}} to enumerate the installed packages
|
||||
|
||||
\code{\link{remove.packages}} for the base version of this function
|
||||
|
||||
}
|
||||
}
|
|
@ -0,0 +1,21 @@
|
|||
Version: 1.0
|
||||
|
||||
RestoreWorkspace: No
|
||||
SaveWorkspace: No
|
||||
AlwaysSaveHistory: Default
|
||||
|
||||
EnableCodeIndexing: Yes
|
||||
UseSpacesForTab: Yes
|
||||
NumSpacesForTab: 4
|
||||
Encoding: UTF-8
|
||||
|
||||
RnwWeave: Sweave
|
||||
LaTeX: pdfLaTeX
|
||||
|
||||
AutoAppendNewline: Yes
|
||||
StripTrailingWhitespace: Yes
|
||||
|
||||
BuildType: Package
|
||||
PackageUseDevtools: Yes
|
||||
PackageInstallArgs: --no-multiarch --with-keep.source
|
||||
PackageRoxygenize: rd,collate,namespace
|
|
@ -0,0 +1,7 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
library(testthat)
|
||||
library(sqlmlutils)
|
||||
|
||||
test_check("sqlmlutils")
|
|
@ -0,0 +1,27 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
library(sqlmlutils)
|
||||
library(methods)
|
||||
library(testthat)
|
||||
|
||||
options(keep.source = TRUE)
|
||||
Sys.setenv(TZ='GMT')
|
||||
Server <- ''
|
||||
if (Server == '') Server <- "."
|
||||
cnnstr <- connectionInfo(server=Server, database="AirlineTestDB")
|
||||
|
||||
testthatDir <- getwd()
|
||||
R_Root <- file.path(testthatDir, "../..")
|
||||
scriptDirectory <- file.path(testthatDir, "scripts")
|
||||
|
||||
TestArgs <- list(
|
||||
# Compute context specifications
|
||||
gitRoot = R_Root,
|
||||
testDirectory = testthatDir,
|
||||
scriptDirectory = scriptDirectory,
|
||||
connectionString = cnnstr
|
||||
)
|
||||
|
||||
options(TestArgs = TestArgs)
|
||||
rm(TestArgs)
|
|
@ -0,0 +1,7 @@
|
|||
foo <- function(t1, t2, t3) {
|
||||
print(t1)
|
||||
warning(t2)
|
||||
return(t3)
|
||||
}
|
||||
|
||||
foo("Hello","WARNING", InputDataSet)
|
|
@ -0,0 +1,5 @@
|
|||
|
||||
sum1 <- 1+2
|
||||
sum2 <- 5+6
|
||||
product <- sum1 * sum2
|
||||
product
|
|
@ -0,0 +1,2 @@
|
|||
product <- num1 * num2
|
||||
out_df <- rbind(in_df, product)
|
|
@ -0,0 +1,203 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
library(testthat)
|
||||
context("executeInSQL tests")
|
||||
|
||||
TestArgs <- options("TestArgs")$TestArgs
|
||||
connection <- TestArgs$connectionString
|
||||
scriptDir <- TestArgs$scriptDirectory
|
||||
|
||||
|
||||
|
||||
test_that("Test with named args", {
|
||||
funcWithArgs <- function(arg1, arg2){
|
||||
print(arg1)
|
||||
return(arg2)
|
||||
}
|
||||
expect_output(
|
||||
expect_equal(
|
||||
executeFunctionInSQL(connection, funcWithArgs, arg1="blah1", arg2="blah2"),
|
||||
"blah2"),
|
||||
"blah1"
|
||||
)
|
||||
})
|
||||
|
||||
test_that("Test ordered arguments", {
|
||||
funcNum <- function(arg1, arg2){
|
||||
stopifnot(typeof(arg1) == "integer")
|
||||
stopifnot(typeof(arg2) == "double")
|
||||
return(arg1 / arg2)
|
||||
}
|
||||
expect_error(executeFunctionInSQL(connection, funcNum, 2))
|
||||
expect_equal(executeFunctionInSQL(connection, funcNum, as.integer(2), 3), 2/3)
|
||||
expect_equal(executeFunctionInSQL(connection, funcNum, as.integer(3), 2), 3/2)
|
||||
})
|
||||
|
||||
test_that("Test Return", {
|
||||
myReturnVal <- function(){
|
||||
return("returned!")
|
||||
}
|
||||
|
||||
val = executeFunctionInSQL(connection, myReturnVal)
|
||||
expect_equal(val, myReturnVal())
|
||||
})
|
||||
|
||||
test_that("Test Warning", {
|
||||
printWarning <- function(){
|
||||
warning("testWarning")
|
||||
print("Hello, this returned")
|
||||
}
|
||||
expect_warning(
|
||||
expect_output(executeFunctionInSQL(connection, printWarning),
|
||||
"Hello, this returned"),
|
||||
"testWarning")
|
||||
|
||||
})
|
||||
|
||||
test_that("Passing in a user defined function", {
|
||||
func1 <- function(){
|
||||
func2 <- function() {
|
||||
return("Success")
|
||||
}
|
||||
return(func2())
|
||||
}
|
||||
|
||||
expect_equal(executeFunctionInSQL(connection, func=func1), "Success")
|
||||
})
|
||||
|
||||
test_that("Returning a function object", {
|
||||
func2 <- function() {
|
||||
return("Success")
|
||||
}
|
||||
func1 <- function(){
|
||||
func2 <- function() {
|
||||
return("Success")
|
||||
}
|
||||
return(func2)
|
||||
}
|
||||
|
||||
expect_equal(executeFunctionInSQL(connection, func=func1), func2)
|
||||
})
|
||||
|
||||
test_that("Calling an object in the environment", {
|
||||
skip("This doesn't work right now because we don't pass the whole environment")
|
||||
|
||||
func2 <- function() {
|
||||
return("Success")
|
||||
}
|
||||
func1 <- function(){
|
||||
return(func2)
|
||||
}
|
||||
|
||||
expect_equal(executeFunctionInSQL(connection, func=func1), func2)
|
||||
})
|
||||
|
||||
test_that("No Parameters test", {
|
||||
noReturn <- function() {
|
||||
}
|
||||
result = executeFunctionInSQL(connection, noReturn)
|
||||
expect_null(result)
|
||||
})
|
||||
|
||||
test_that("Print, Warning, Return test", {
|
||||
|
||||
returnString <- function() {
|
||||
print("hello")
|
||||
warning("uh oh")
|
||||
return("bar")
|
||||
}
|
||||
expect_warning(expect_output(result <- executeFunctionInSQL(connection, returnString), "hello"), "uh oh")
|
||||
|
||||
expect_equal(result , "bar")
|
||||
})
|
||||
|
||||
test_that("Print, Warning, Return test, with args", {
|
||||
|
||||
returnVector <- function(a,b) {
|
||||
print("print")
|
||||
warning("uh oh")
|
||||
return(c(a,b))
|
||||
}
|
||||
expect_warning(expect_output(result <- executeFunctionInSQL(connection, returnVector, "foo", "bar"), "print"), "uh oh")
|
||||
|
||||
expect_equal(result , c("foo","bar"))
|
||||
})
|
||||
|
||||
test_that("Print, Warning, Error test", {
|
||||
testError <- function() {
|
||||
print("print")
|
||||
warning("warning")
|
||||
stop("ERROR")
|
||||
}
|
||||
expect_error(
|
||||
expect_warning(
|
||||
expect_output(
|
||||
result <- executeFunctionInSQL(connection, testError),
|
||||
"print"),
|
||||
"warning"),
|
||||
"ERROR")
|
||||
})
|
||||
|
||||
test_that("Return a DataFrame", {
|
||||
|
||||
returnDF <- function(a, b) {
|
||||
return(data.frame(x = c(foo=a,bar=b)))
|
||||
}
|
||||
result <- executeFunctionInSQL(connection, returnDF, "foo", 2)
|
||||
expect_equal(result, data.frame(x = c(foo="foo",bar=2)))
|
||||
})
|
||||
|
||||
test_that("Return an input DataFrame", {
|
||||
useInputDataSet <- function(in_df) {
|
||||
return(in_df)
|
||||
}
|
||||
result = executeFunctionInSQL(connection, useInputDataSet, inputDataQuery = "SELECT TOP 5 * FROM airline5000")
|
||||
expect_equal(nrow(result), 5)
|
||||
expect_equal(ncol(result), 30)
|
||||
|
||||
useInputDataSet2 <- function(in_df, t1) {
|
||||
return(list(in_df, t1=t1))
|
||||
}
|
||||
result = executeFunctionInSQL(connection, useInputDataSet2, t1=5, inputDataQuery = "SELECT TOP 5 * FROM airline5000")
|
||||
expect_equal(result$t1, 5)
|
||||
expect_equal(ncol(result[[1]]), 30)
|
||||
|
||||
})
|
||||
|
||||
test_that("Variable test", {
|
||||
|
||||
printString <- function(str) {
|
||||
print(str)
|
||||
}
|
||||
expect_output(executeFunctionInSQL(connection, printString, str="Hello"), "Hello")
|
||||
test <- "World"
|
||||
expect_output(executeFunctionInSQL(connection, printString, str=test), test)
|
||||
})
|
||||
|
||||
test_that("Query test", {
|
||||
res <- executeSQLQuery(connectionString = connection, sqlQuery = "SELECT TOP 5 * FROM airline5000")
|
||||
expect_equal(nrow(res), 5)
|
||||
expect_equal(ncol(res), 30)
|
||||
})
|
||||
|
||||
test_that("Script test", {
|
||||
script <- file.path(scriptDir, 'script.txt')
|
||||
|
||||
expect_warning(
|
||||
expect_output(
|
||||
res <- executeScriptInSQL(connectionString=connection, script=script, inputDataQuery = "SELECT TOP 5 * FROM airline5000"),
|
||||
"Hello"),
|
||||
"WARNING")
|
||||
expect_equal(nrow(res), 5)
|
||||
expect_equal(ncol(res), 30)
|
||||
|
||||
script2 <- file.path(scriptDir, 'script2.txt')
|
||||
|
||||
|
||||
expect_output(res <- executeScriptInSQL(connection, script2), "Script path exists")
|
||||
expect_equal(res, 33)
|
||||
|
||||
expect_error(res <- executeScriptInSQL(connection, "non-existent-script.txt"), regexp = "Script path doesn't exist")
|
||||
|
||||
})
|
|
@ -0,0 +1,319 @@
|
|||
# Copyright(c) Microsoft Corporation. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
library(testthat)
|
||||
context("Stored Procedure tests")
|
||||
|
||||
TestArgs <- options('TestArgs')$TestArgs
|
||||
connection <- TestArgs$connectionString
|
||||
scriptDir <- TestArgs$scriptDirectory
|
||||
|
||||
dropIfExists <- function(connectionString, name) {
|
||||
if(checkSproc(connectionString, name))
|
||||
invisible(capture.output(dropSproc(connectionString = connectionString, name = name)))
|
||||
}
|
||||
|
||||
#
|
||||
#Test an empty function (no inputs)
|
||||
test_that("No Parameters test", {
|
||||
noParams <- function() {
|
||||
data.frame(hello = "world")
|
||||
}
|
||||
name = "noParams"
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, noParams, connectionString = connection))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
expect_equal(as.character(executeSproc(connectionString = connection, name)[[1]]) , "world")
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
#
|
||||
#Test multiple input parameters
|
||||
#("posixct", "numeric", "character", "integer", "logical", "raw", "dataframe")
|
||||
test_that("Numeric, POSIXct, Character, Logical test", {
|
||||
inNumCharParams <- function(in1, in2, in3, in4) {
|
||||
data.frame(in1, in2,in3,in4)
|
||||
}
|
||||
|
||||
#TODO: Time zone might not work
|
||||
x <- as.POSIXct(12345678, origin = "1960-01-01")#, tz= "GMT")
|
||||
|
||||
inputParams <- list(in1="numeric", in2="posixct", in3="character", in4="logical")
|
||||
|
||||
name = "inNumCharParams"
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, inNumCharParams, connectionString = connection, inputParams = inputParams))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
res <- executeSproc(name, in1 = 1, in2 = x, in3 = "Hello", in4 = 1, connectionString = connection)
|
||||
|
||||
expect_equal(res[[1]], 1)
|
||||
expect_equal(res[[2]], x)
|
||||
expect_equal(as.character(res[[3]]), "Hello")
|
||||
expect_equal(as.logical(res[[4]]), TRUE)
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
#
|
||||
#Test only an InputDataSet StoredProcedure
|
||||
test_that("Simple InputDataSet test", {
|
||||
inData <- function(in_df) {
|
||||
in_df
|
||||
}
|
||||
|
||||
inputParams <- list(in_df="dataframe")
|
||||
|
||||
name = "inData"
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, inData, connectionString = connection, inputParams = inputParams))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
res <- executeSproc(name, in_df = "SELECT TOP 10 * FROM airline5000", connectionString = connection)
|
||||
expect_equal(nrow(res), 10)
|
||||
expect_equal(ncol(res), 30)
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
|
||||
#
|
||||
#Test InputDataSet with returned OutputDataSet
|
||||
test_that("InputDataSet to OutputDataSet test", {
|
||||
inOutData <- function(in_df) {
|
||||
list(out_df = in_df)
|
||||
}
|
||||
|
||||
inputParams <- list(in_df="dataframe")
|
||||
outputParams <- list(out_df="dataframe")
|
||||
|
||||
name = "inOutData"
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, inOutData, connectionString = connection, inputParams = inputParams, outputParams = outputParams))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
res <- executeSproc(name, in_df = "SELECT TOP 10 * FROM airline5000", connectionString = connection)
|
||||
expect_equal(nrow(res), 10)
|
||||
expect_equal(ncol(res), 30)
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
#
|
||||
#Test InputDataSet query with InputParameters
|
||||
test_that("InputDataSet with InputParameter test", {
|
||||
inDataParams <- function(id, ip) {
|
||||
rbind(id,ip)
|
||||
}
|
||||
|
||||
name = "inDataParams"
|
||||
|
||||
inputParams = list(id = "DataFrame", ip = "numeric")
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, inDataParams, connectionString = connection, inputParams = inputParams))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
res <- executeSproc(name, id = "SELECT TOP 10 * FROM airline5000", ip = 4, connectionString = connection)
|
||||
|
||||
expect_equal(nrow(res), 11)
|
||||
expect_equal(ncol(res), 30)
|
||||
|
||||
expect_error(executeSproc(name, "SELECT TOP 10 * FROM airline5000", ip = 4, connectionString = connection))
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
|
||||
#
|
||||
#Test InputDataSet query with InputParameters with inputs out of order
|
||||
test_that("InputDataSet with InputParameter test, out of order", {
|
||||
inDataParams <- function(id, ip, ip2) {
|
||||
rbind(id,ip)
|
||||
}
|
||||
|
||||
name = "inDataParamsOoO"
|
||||
|
||||
inputParams = list(ip = "numeric", id = "DATAFRAME", ip2 = "character")
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, inDataParams, connectionString = connection, inputParams = inputParams))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
res <- executeSproc(name, ip2 = "Hello", ip = 4, id = "SELECT TOP 10 * FROM airline5000", connectionString = connection)
|
||||
|
||||
expect_equal(nrow(res), 11)
|
||||
expect_equal(ncol(res), 30)
|
||||
|
||||
expect_error(executeSproc(name,ip = 4, "SELECT TOP 10 * FROM airline5000", connectionString = connection))
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
|
||||
test_that("Stored Procedure with Scripts", {
|
||||
inputParams <- list(num1 = "numeric", num2 = "numeric", in_df = "dAtaFrame")
|
||||
outputParams <- list(out_df = "dataframe")
|
||||
|
||||
name="script"
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromScript(
|
||||
connectionString = connection, name=name, file.path(scriptDir, "script3.R"), inputParams = inputParams, outputParams = outputParams))
|
||||
expect_true(checkSproc(connectionString = connection, name = name))
|
||||
|
||||
retVal <- executeSproc(connectionString = connection, name, num1 = 3, num2 = 4, in_df = "select top 10 * from airline5000")
|
||||
|
||||
expect_equal(nrow(retVal), 11)
|
||||
expect_equal(ncol(retVal), 30)
|
||||
|
||||
dropIfExists(connectionString = connection, name = name)
|
||||
expect_false(checkSproc(connectionString = connection, name = name))
|
||||
})
|
||||
|
||||
context("Sprocs with output params (TODO)")
|
||||
|
||||
# TODO: Output params test - execution doesn't work right now
|
||||
test_that("Only OutputParams test", {
|
||||
outsFunc <- function(arg1) {
|
||||
list(res = paste0("Hello ", arg1, "!"))
|
||||
}
|
||||
|
||||
name <- "outsFunc"
|
||||
inputParams <- list(arg1 = "character")
|
||||
outputParams <- list(res = "character")
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
capture.output(createSprocFromFunction(name, outsFunc, connectionString = connection, inputParams = inputParams, outputParams = outputParams))
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
|
||||
#Use T-SQL to verify
|
||||
sql_str = "DECLARE @res nvarchar(max) EXEC outsFunc @arg1_outer = N'T-SQL', @res_outer = @res OUTPUT SELECT @res as N'@res'"
|
||||
out <- system2("sqlcmd.exe", c("-S", "localhost", "-E", "-d","AirlineTestDB", "-Q", paste0('"', sql_str, '"')), stdout=TRUE)
|
||||
expect_true(any(grepl("Hello T-SQL!", out)))
|
||||
#executeSproc(name, connectionString = connection, out1 = "Asd", out2 = "wqe")
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
|
||||
test_that("OutputDataSet and OutputParams test", {
|
||||
outDataParam <- function() {
|
||||
df = data.frame(hello = "world")
|
||||
list(df = df, out1 = "Hello", out2 = "World")
|
||||
}
|
||||
name = "outDataParam"
|
||||
|
||||
outputParams <- list(df = "dataframe", out2 = "character", out1 = "character")
|
||||
|
||||
createSprocFromFunction(name, outDataParam, connectionString = connection, outputParams = outputParams)
|
||||
expect_true(checkSproc(name, connectionString = connection))
|
||||
|
||||
#Use T-SQL to verify
|
||||
sql_str = "DECLARE @out1 nvarchar(max),@out2 nvarchar(max) EXEC outDataParam @out1_outer = @out1 OUTPUT, @out2_outer = @out2 OUTPUT SELECT @out1 as N'@out1'"
|
||||
out <- system2("sqlcmd.exe", c("-S", "localhost", "-E", "-d","AirlineTestDB", "-Q", paste0('"', sql_str, '"')), stdout=TRUE)
|
||||
expect_true(any(grepl("Hello", out)))
|
||||
#res <- executeSproc(connectionString = connection, name)
|
||||
|
||||
dropSproc(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
})
|
||||
|
||||
context("Sproc Negative Tests")
|
||||
|
||||
test_that("Bad input param types or usage", {
|
||||
badParam <- function(arg1) {
|
||||
return(arg1)
|
||||
}
|
||||
inputParams <- list(arg1 = "NotAType")
|
||||
|
||||
expect_error(createSprocFromFunction(connection, "badParam", badParam, inputParams = inputParams))
|
||||
|
||||
inputParams <- list(arg1 = "dataframe")
|
||||
|
||||
name = "badInput"
|
||||
dropIfExists(connection, name)
|
||||
capture.output(createSprocFromFunction(connection, name, badParam, inputParams = inputParams))
|
||||
expect_true(checkSproc(connection, name))
|
||||
|
||||
expect_error(expect_warning(executeSproc(connection, name, arg1=12314532)))
|
||||
res <- executeSproc(connection, name, arg1="SELECT TOP 5 * FROM airline5000")
|
||||
|
||||
expect_equal(ncol(res), 30)
|
||||
expect_equal(nrow(res), 5)
|
||||
dropIfExists(connection, name)
|
||||
})
|
||||
|
||||
test_that("Drop nonexistent sproc",{
|
||||
expect_false(checkSproc(connection, "NonexistentSproc"))
|
||||
expect_output(dropSproc(connection, "NonexistentSproc"), "Named procedure doesn't exist")
|
||||
})
|
||||
|
||||
test_that("Create with bad name",{
|
||||
name = "'''asd''asd''asd"
|
||||
foo = function() {
|
||||
return(NULL)
|
||||
}
|
||||
expect_error(createSprocFromFunction(connection, name, foo))
|
||||
})
|
||||
|
||||
test_that("mismatch input params", {
|
||||
func <- function(arg1, arg2) {
|
||||
return(arg1)
|
||||
}
|
||||
inputParams <- list(arg1 = "dataframe", arg3 = "numeric")
|
||||
|
||||
dropIfExists(connection, "mismatch")
|
||||
expect_error(createSprocFromFunction(connection, "mismatch", func, inputParams = inputParams))
|
||||
|
||||
inputParams <- list(arg1 = "dataframe", arg2 = "qwe", arg3 = "numeric")
|
||||
|
||||
dropIfExists(connection, "mismatch")
|
||||
expect_error(createSprocFromFunction(connection, "mismatch", func, inputParams = inputParams))
|
||||
|
||||
})
|
||||
|
||||
|
||||
test_that("Sproc with Bad Script Path", {
|
||||
name="bad_script_path"
|
||||
|
||||
dropIfExists(name, connectionString = connection)
|
||||
expect_false(checkSproc(name, connectionString = connection))
|
||||
|
||||
expect_error(createSprocFromScript(
|
||||
connectionString = connection, name=name, "bad_script_path.txt"))
|
||||
|
||||
})
|
||||
|
||||
|
||||
|
44
README.md
|
@@ -1,14 +1,38 @@
|
|||
# sqlmlutils
|
||||
|
||||
# Contributing
|
||||
sqlmlutils is a package designed to help users interact with SQL Server and execute R or Python code from an R/Python client.
|
||||
|
||||
This project welcomes contributions and suggestions. Most contributions require you to agree to a
|
||||
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
|
||||
the rights to use your contribution. For details, visit https://cla.microsoft.com.
|
||||
# Installation
|
||||
|
||||
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
|
||||
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
|
||||
provided by the bot. You will only need to do this once across all repos using our CLA.
|
||||
To install sqlmlutils from this repository, run the following commands from the root folder:
|
||||
|
||||
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
|
||||
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
|
||||
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
|
||||
Python:
|
||||
1. If your client is a Linux machine, you can skip this step. If your client is a Windows machine: go to https://www.lfd.uci.edu/~gohlke/pythonlibs/#pymssql and download the correct version of pymssql for your client. Run ```pip install pymssql-2.1.4.dev5-cpXX-cpXXm-win_amd64.whl``` on that file to install pymssql.
|
||||
2. Run
|
||||
```
|
||||
python.exe -m pip install Python/dist/sqlmlutils-0.5.0.zip --upgrade
|
||||
```
|
||||
|
||||
R:
|
||||
```
|
||||
R CMD INSTALL R/dist/sqlmlutils_0.5.0.zip
|
||||
```
|
||||
|
||||
# Details
|
||||
|
||||
sqlmlutils contains 3 main parts:
|
||||
- Execution of Python/R in SQL Server using sp_execute_external_script
|
||||
- Creation and execution of stored procedures created from scripts and functions
|
||||
- Installation and management of packages in SQL Server
|
||||
|
||||
## Execute in SQL
|
||||
|
||||
Execute in SQL provides a convenient way to execute arbitrary Python/R code inside SQL Server using sp_execute_external_script. The user does not need to know any T-SQL to use this function. Function arguments are serialized into binary and passed into the generated T-SQL script. Warnings and printed output are printed at the end of execution, and any results returned by the function are passed back to the client.
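
To make the serialization step concrete, here is a minimal Python sketch of how call arguments could be pickled, hex-encoded, and embedded in a generated `sp_execute_external_script` batch. It is an illustration under stated assumptions, not the library's actual implementation: `build_external_script` and the `@args_blob` parameter name are hypothetical, and the sketch does not show how results travel back to the client.

```python
import pickle


def build_external_script(func_source, func_name, *args, **kwargs):
    """Return a T-SQL batch that would call `func_name` inside SQL Server.

    Hypothetical helper for illustration; sqlmlutils may generate its
    script differently.
    """
    # Serialize the call arguments to bytes, then hex-encode them so they
    # can be written as a T-SQL varbinary literal (0x...).
    payload = pickle.dumps((args, kwargs)).hex()

    # Python code for the server side: rebuild the arguments and call the
    # function. Single quotes are doubled to survive the N'...' literal.
    script = (
        "import pickle\n"
        + func_source.strip() + "\n"
        + "args, kwargs = pickle.loads(args_blob)\n"
        + f"result = {func_name}(*args, **kwargs)\n"
    ).replace("'", "''")

    return (
        "EXEC sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = N'{script}',\n"
        "    @params = N'@args_blob varbinary(max)',\n"
        f"    @args_blob = 0x{payload};"
    )


sql = build_external_script("def add(a, b):\n    return a + b", "add", 2, b=3)
print(sql)
```

The point of the binary payload is that arbitrary argument values can cross into the generated script without being rendered as T-SQL literals one by one.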
|
||||
|
||||
## Stored Procedures (Sprocs)
|
||||
|
||||
The goal of this utility is to allow users to create and execute stored procedures on their database without needing to know the exact syntax for creating one. Functions and scripts are wrapped into a stored procedure and registered in a database; they can then be executed from the Python/R client.
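
The wrapping step can be sketched as follows. The sketch mirrors behavior exercised by the R tests in this commit: type names such as `"numeric"`, `"character"`, and `"dataframe"` are matched case-insensitively, a mismatch between the function's arguments and the declared parameters is an error, and the generated procedure exposes each argument as `@<name>_outer`. The helper `sproc_header` and the T-SQL type mapping are assumptions made for illustration, not the library's code.

```python
import inspect

# Assumed mapping from the type names used in the tests to T-SQL types;
# the mapping used inside sqlmlutils may differ.
SQL_TYPES = {
    "numeric": "float",
    "character": "nvarchar(max)",
    "dataframe": "nvarchar(max)",  # a data frame is supplied as a query string
}


def sproc_header(name, func, input_params):
    """Build a hypothetical CREATE PROCEDURE header for `func`."""
    arg_names = list(inspect.signature(func).parameters)
    if set(arg_names) != set(input_params):
        raise ValueError(
            f"input_params {sorted(input_params)} do not match arguments {arg_names}"
        )
    declarations = []
    for arg in arg_names:  # keep the function's own argument order
        type_name = input_params[arg].lower()  # type names are case-insensitive
        if type_name not in SQL_TYPES:
            raise ValueError(f"unsupported type for {arg!r}: {input_params[arg]!r}")
        declarations.append(f"@{arg}_outer {SQL_TYPES[type_name]}")
    return f"CREATE PROCEDURE {name} " + ", ".join(declarations)


def in_data_params(in_df, ip):
    return (in_df, ip)


header = sproc_header(
    "inDataParams", in_data_params, {"ip": "numeric", "in_df": "DataFrame"}
)
print(header)
```

Validating the parameter list before any T-SQL is generated means a bad declaration fails on the client rather than as a server-side syntax error.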
|
||||
|
||||
## Package Management
|
||||
|
||||
With package management, users can install packages to a remote SQL Server from a client machine. The packages are downloaded on the client and then sent over to SQL Server, where they are installed into library folders. The folders are per-database, so packages are always installed and made available for a specific database. The package management APIs provide PUBLIC and PRIVATE folders. Packages in the PUBLIC folder are accessible to all database users. Packages in the PRIVATE folder are accessible only to the user who installed the package.
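
The visibility rules described above can be summarized in a small model. This is a toy sketch only: `Scope`, `InstalledPackage`, and `can_load` are hypothetical names invented for the example and are not part of the sqlmlutils API.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    PUBLIC = "PUBLIC"
    PRIVATE = "PRIVATE"


@dataclass(frozen=True)
class InstalledPackage:
    name: str
    database: str  # library folders are per-database
    scope: Scope
    owner: str     # the user who installed the package


def can_load(pkg, user, database):
    """Return True if `user`, connected to `database`, can see `pkg`."""
    if pkg.database != database:
        return False  # packages are only available in their own database
    # PUBLIC packages are visible to everyone; PRIVATE only to the installer.
    return pkg.scope is Scope.PUBLIC or pkg.owner == user


shared = InstalledPackage("pkgA", "AirlineTestDB", Scope.PUBLIC, "alice")
personal = InstalledPackage("pkgB", "AirlineTestDB", Scope.PRIVATE, "alice")

print(can_load(shared, "bob", "AirlineTestDB"))
print(can_load(personal, "bob", "AirlineTestDB"))
```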
|
||||
|
|