style: improve style using pre-commit (#1538)

* chore: pre-commit first pass

* Add developer docs

* update version for pre-commit hooks

* remove manual black hook config

Co-authored-by: Puneet Pruthi <ppruthi@microsoft.com>
This commit is contained in:
Puneet Pruthi 2022-06-21 12:31:34 -07:00 committed by GitHub
Parent 9204586f45
Commit 3bd5bef91d
No key matching this signature was found
GPG Key ID: 4AEE18F83AFDEB23
326 changed files: 1296 additions and 1363 deletions

View file

@ -28,6 +28,5 @@
## Acknowledgements
We would like to acknowledge the developers and contributors, both internal and external who helped create this version of SynapseML.\n
{{ end -}}
{{ end -}}

View file

@ -26,4 +26,4 @@ options:
- Subject
notes:
keywords:
- BREAKING CHANGE
- BREAKING CHANGE

View file

@ -3,4 +3,4 @@ target
.git
tools/docker/Dockerfile.dev
pipeline.yaml
.dockerignore
.dockerignore

2
.github/config.yml vendored
View file

@ -14,7 +14,7 @@ newPRWelcomeComment: >
This helps us to create release messages and credit you for your hard work!
Examples of commit messages with semantic prefixes:
- `fix: Fix LightGBM crashes with empty partitions`
- `feat: Make HTTP on Spark back-offs configurable`
- `docs: Update Spark Serving usage`

2
.github/pull_request_template.md vendored
View file

@ -8,4 +8,4 @@ _Describe what tests have you performed to validate your changes before submitti
# Dependency changes
_If you needed to make any changes to dependencies of this project, please describe them here._
_If you needed to make any changes to dependencies of this project, please describe them here._

4
.github/workflows/ci-publish-artifacts.yml vendored
View file

@ -23,11 +23,11 @@ jobs:
fetch-depth: 0
- uses: Azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
creds: ${{ secrets.AZURE_CREDENTIALS }}
- uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "mmlspark-keys"
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw' # comma separated list of secret keys that need to be fetched from the Key Vault
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw' # comma separated list of secret keys that need to be fetched from the Key Vault
id: GetKeyVaultSecrets
- name: Setup Python
uses: actions/setup-python@v2.3.2

View file

@ -27,7 +27,7 @@ jobs:
- uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "mmlspark-keys"
secrets: 'gh-name,gh-email,gh-token' # comma separated list of secret keys that need to be fetched from the Key Vault
secrets: 'gh-name,gh-email,gh-token' # comma separated list of secret keys that need to be fetched from the Key Vault
id: GetKeyVaultSecrets
- name: 'Install Node.js'
uses: actions/setup-node@v2

4
.github/workflows/ci-release-sonatype.yml vendored
View file

@ -48,13 +48,13 @@ jobs:
if: startsWith(steps.getGitTag.outputs.gittag, 'v')
uses: Azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: Get Secrets from KeyVault
if: startsWith(steps.getGitTag.outputs.gittag, 'v')
uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "mmlspark-keys"
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw,pypi-api-token' # comma separated list of secret keys that need to be fetched from the Key Vault
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw,pypi-api-token' # comma separated list of secret keys that need to be fetched from the Key Vault
id: GetKeyVaultSecrets
- name: Setup Python
if: startsWith(steps.getGitTag.outputs.gittag, 'v')

2
.github/workflows/ci-scalastyle.yml vendored
View file

@ -27,5 +27,3 @@ jobs:
distribution: 'temurin'
- name: Run scalastyle
run: sbt scalastyle test:scalastyle

5
.github/workflows/ci-test-synapsee2e.yml vendored
View file

@ -22,11 +22,11 @@ jobs:
fetch-depth: 0
- uses: Azure/login@v1
with:
creds: ${{ secrets.AZURE_CREDENTIALS }}
creds: ${{ secrets.AZURE_CREDENTIALS }}
- uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "mmlspark-keys"
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw' # comma separated list of secret keys that need to be fetched from the Key Vault
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw' # comma separated list of secret keys that need to be fetched from the Key Vault
id: GetKeyVaultSecrets
- name: Setup Python
uses: actions/setup-python@v2.3.2
@ -63,4 +63,3 @@ jobs:
files: '**/test-reports/TEST-*.xml'
check_name: "SynapseE2E Test Results"
comment_title: "SynapseE2E Test Results"

View file

@ -28,7 +28,7 @@ jobs:
uses: Azure/get-keyvault-secrets@v1
with:
keyvault: "mmlspark-keys"
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw' # comma separated list of secret keys that need to be fetched from the Key Vault
secrets: 'storage-key,nexus-un,nexus-pw,pgp-private,pgp-public,pgp-pw' # comma separated list of secret keys that need to be fetched from the Key Vault
id: GetKeyVaultSecrets
- name: Setup Python
uses: actions/setup-python@v2.3.2

2
.github/workflows/ci-tests-r.yml vendored
View file

@ -32,7 +32,7 @@ jobs:
uses: actions/setup-java@v2
with:
java-version: '11'
distribution: 'temurin'
distribution: 'temurin'
- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2.1.1
with:

3
.github/workflows/ci-tests-unit.yml vendored
View file

@ -84,7 +84,7 @@ jobs:
(${FLAKY:-false} && timeout 30m sbt coverage "testOnly com.microsoft.azure.synapse.ml.${PACKAGE}.**") ||
(${FLAKY:-false} && timeout 30m sbt coverage "testOnly com.microsoft.azure.synapse.ml.${PACKAGE}.**")
env:
PACKAGE: ${{ matrix.package.name }}
PACKAGE: ${{ matrix.package.name }}
FFMPEG: ${{ matrix.package.ffmpeg }}
FLAKY: ${{ matrix.package.flaky }}
- name: Publish Test Results
@ -113,4 +113,3 @@ jobs:
chmod +x .codecov
echo "Starting Codecov Upload"
./.codecov -t ${{ steps.GetKeyVaultSecrets.outputs.codecov-token }}

View file

@ -32,7 +32,7 @@ jobs:
uses: actions/setup-java@v2
with:
java-version: '11'
distribution: 'temurin'
distribution: 'temurin'
- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2.1.1
with:

View file

@ -1,6 +1,6 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.2.0
rev: v4.3.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
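The hunk above bumps the pre-commit-hooks revision from v3.2.0 to v4.3.0 for the trailing-whitespace and end-of-file-fixer hooks. As a minimal sketch of how a contributor might exercise the updated config locally (illustrative only, not part of this commit; assumes the `pre-commit` CLI is installed, e.g. via `pip install pre-commit`):

```python
import subprocess

# Illustrative only: install the git hook, then run the configured hooks
# (trailing-whitespace, end-of-file-fixer, ...) over the whole tree,
# as a contributor might do before pushing.
subprocess.run(["pre-commit", "install"], check=True)
subprocess.run(["pre-commit", "run", "--all-files"], check=True)
```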

View file

@ -20,7 +20,7 @@ tools/helm/ @dbanda
vw/ @eisber
isolationforest/ @eisber
recommendation/ @eisber
#Roy's Areas
cyber/ @rolevin

View file

@ -3,8 +3,8 @@
# Synapse Machine Learning
SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines.
SynapseML builds on [Apache Spark](https://github.com/apache/spark) and SparkML to enable new kinds of
machine learning, analytics, and model deployment workflows.
SynapseML builds on [Apache Spark](https://github.com/apache/spark) and SparkML to enable new kinds of
machine learning, analytics, and model deployment workflows.
SynapseML adds many deep learning and data science tools to the Spark ecosystem,
including seamless integration of Spark Machine Learning pipelines with the [Open Neural Network Exchange
(ONNX)](https://onnx.ai),
@ -14,8 +14,8 @@ including seamless integration of Spark Machine Learning pipelines with the [Ope
[OpenCV](http://www.opencv.org/). These tools enable powerful and highly-scalable predictive and analytical models
for a variety of datasources.
SynapseML also brings new networking capabilities to the Spark Ecosystem. With the HTTP on Spark project, users
can embed **any** web service into their SparkML models.
SynapseML also brings new networking capabilities to the Spark Ecosystem. With the HTTP on Spark project, users
can embed **any** web service into their SparkML models.
For production grade deployment, the Spark Serving project enables high throughput,
sub-millisecond latency web services, backed by your Spark cluster.
@ -58,7 +58,7 @@ PySpark](https://mmlspark.blob.core.windows.net/docs/0.9.5/pyspark/index.html).
| <img width="150" src="https://mmlspark.blob.core.windows.net/graphics/emails/isolation forest 3.svg"> | <img width="150" src="https://mmlspark.blob.core.windows.net/graphics/emails/cyberml.svg"> | <img width="150" src="https://mmlspark.blob.core.windows.net/graphics/emails/conditional_knn.svg"> |
|:---:|:---:|:---:|
| [**Isolation Forest on Spark**](https://microsoft.github.io/SynapseML/docs/documentation/estimators/estimators_core/#isolationforest) | [**CyberML**](https://github.com/microsoft/SynapseML/blob/master/notebooks/features/other/CyberML%20-%20Anomalous%20Access%20Detection.ipynb) | [**Conditional KNN**](https://microsoft.github.io/SynapseML/docs/features/other/ConditionalKNN%20-%20Exploring%20Art%20Across%20Cultures/) |
| Distributed Nonlinear Outlier Detection | Machine Learning Tools for Cyber Security | Scalable KNN Models with Conditional Queries |
| Distributed Nonlinear Outlier Detection | Machine Learning Tools for Cyber Security | Scalable KNN Models with Conditional Queries |
## Documentation and Examples
@ -80,7 +80,7 @@ First select the correct platform that you are installing SynapseML into:
<!--te-->
### Synapse Analytics
### Synapse Analytics
In Azure Synapse notebooks please place the following in the first cell of your notebook.
@ -123,7 +123,7 @@ cloud](http://community.cloud.databricks.com), create a new [library from Maven
coordinates](https://docs.databricks.com/user-guide/libraries.html#libraries-from-maven-pypi-or-spark-packages)
in your workspace.
For the coordinates use: `com.microsoft.azure:synapseml_2.12:0.9.5`
For the coordinates use: `com.microsoft.azure:synapseml_2.12:0.9.5`
with the resolver: `https://mmlspark.azureedge.net/maven`. Ensure this library is
attached to your target cluster(s).
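As a rough illustration (not part of this diff), the same Maven coordinates and resolver quoted above can also be supplied to a self-managed PySpark session through the standard `spark.jars.packages` / `spark.jars.repositories` settings; the application name below is arbitrary:

```python
from pyspark.sql import SparkSession

# Resolve SynapseML from the coordinates and custom resolver mentioned above.
spark = (
    SparkSession.builder.appName("synapseml-sketch")
    .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.9.5")
    .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
    .getOrCreate()
)
```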
@ -208,11 +208,11 @@ and some necessary custom wrappers may be missing.
### Building from source
SynapseML has recently transitioned to a new build infrastructure.
SynapseML has recently transitioned to a new build infrastructure.
For detailed developer docs please see the [Developer Readme](website/docs/reference/developer-readme.md)
If you are an existing SynapseML developer, you will need to reconfigure your
development setup. We now support platform independent development and
If you are an existing SynapseML developer, you will need to reconfigure your
development setup. We now support platform independent development and
better integrate with IntelliJ and SBT.
If you encounter issues please reach out to our support email!

View file

@ -14,7 +14,7 @@ Instead, please report them to the Microsoft Security Response Center (MSRC) at
If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).
You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).
Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
@ -38,4 +38,4 @@ We prefer all communications to be in English.
Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd).
<!-- END MICROSOFT SECURITY.MD BLOCK -->
<!-- END MICROSOFT SECURITY.MD BLOCK -->

View file

@ -25,4 +25,4 @@ flags:
- src/main/scala
python:
paths:
- src/main/python
- src/main/python

View file

@ -43,5 +43,5 @@ class BingImageSearch(_BingImageSearch):
SparkSession.builder.getOrCreate()._jvm.com.microsoft.azure.synapse.ml.cognitive.BingImageSearch
)
return Lambda._from_java(
bis.downloadFromUrls(pathCol, bytesCol, concurrency, timeout)
bis.downloadFromUrls(pathCol, bytesCol, concurrency, timeout),
)
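For context, a hedged usage sketch of the wrapper above (not part of this diff): the import path, column names, `concurrency`, `timeout`, and the input DataFrame `urls_df` are all illustrative; the returned `Lambda` behaves like an ordinary Spark transformer.

```python
from synapse.ml.cognitive import BingImageSearch

# Download every URL in column "url" into a new binary column "bytes".
downloader = BingImageSearch.downloadFromUrls("url", "bytes", concurrency=4, timeout=10000)
images_df = downloader.transform(urls_df)  # urls_df: assumed DataFrame with a string "url" column
```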

View file

@ -164,4 +164,3 @@ class DocumentTranslator(override val uid: String) extends CognitiveServicesBase
override def responseDataType: DataType = TranslationStatusResponse.schema
}

View file

@ -340,4 +340,3 @@ class AnalyzeCustomModel(override val uid: String) extends FormRecognizerBase(ui
override protected def responseDataType: DataType = AnalyzeResponse.schema
}

View file

@ -88,5 +88,3 @@ case class FormFieldV3(`type`: String,
valueTime: Option[String],
valueObject: Option[String],
valueArray: Option[Seq[String]])

View file

@ -348,4 +348,3 @@ object SDKConverters {
.toSeq.map(tup => new TAResponseSDK[U](tup._3, tup._1, tup._2))
}
}

View file

@ -13,7 +13,8 @@ from pyspark.sql.types import *
class SimpleHTTPTransformerSmokeTest(unittest.TestCase):
def test_simple(self):
df = spark.createDataFrame([("foo",) for x in range(20)], ["data"]).withColumn(
"inputs", struct("data")
"inputs",
struct("data"),
)
response_schema = (

View file

@ -1 +1 @@
Content like data models tests and end points are organized into projects in the custom speech portal. Each project is specific to a domain and country slash language. For example, you may create a project for call centers that use English in the United States to create your first project select the speech to text slash custom speech, then click new project follow the instructions provided by The Wizard to create your project after you've created a project you should see 4 tabs data testing training. And deployment use the links provided in Next steps to learn how to use each tab.
Content like data models tests and end points are organized into projects in the custom speech portal. Each project is specific to a domain and country slash language. For example, you may create a project for call centers that use English in the United States to create your first project select the speech to text slash custom speech, then click new project follow the instructions provided by The Wizard to create your project after you've created a project you should see 4 tabs data testing training. And deployment use the links provided in Next steps to learn how to use each tab.

View file

@ -1 +1 @@
Custom Speech provides tools that allow you to visually inspect the recognition quality of a model by comparing audio data with the corresponding recognition result. From the Custom Speech portal, you can play back uploaded audio and determine if the provided recognition result is correct. This tool allows you to quickly inspect quality of Microsoft's baseline speech-to-text model or a trained custom model without having to transcribe any audio data.
Custom Speech provides tools that allow you to visually inspect the recognition quality of a model by comparing audio data with the corresponding recognition result. From the Custom Speech portal, you can play back uploaded audio and determine if the provided recognition result is correct. This tool allows you to quickly inspect quality of Microsoft's baseline speech-to-text model or a trained custom model without having to transcribe any audio data.

View file

@ -1 +1 @@
Add a gentleman that I got Sir. Yes Sir, thank you. Yeah, nobody heard me. I like the reassurance of the radio that I can hear it as well.
Add a gentleman that I got Sir. Yes Sir, thank you. Yeah, nobody heard me. I like the reassurance of the radio that I can hear it as well.

View file

@ -51,5 +51,3 @@ class OpenAICompletionSuite extends TransformerFuzzing[OpenAICompletion] with Op
override def reader: MLReadable[_] = OpenAICompletion
}

View file

@ -472,4 +472,3 @@ class ConversationTranscriptionSuite extends TransformerFuzzing[ConversationTran
override def reader: MLReadable[_] = ConversationTranscription
}

View file

@ -270,4 +270,3 @@ class FitMultivariateAnomalySuite extends EstimatorFuzzing[FitMultivariateAnomal
override def modelReader: MLReadable[_] = DetectMultivariateAnomaly
}

View file

@ -197,4 +197,4 @@ namespace SynapseML.Dotnet.Utils
}
#nullable disable
}
}

View file

@ -45,7 +45,7 @@ namespace Synapse.ML.Automl
typeof(DistObject),
"s_className");
for (int i = 0; i < jvmObjects.Length; i++)
{
{
Param param = new Param((JvmObjectReference)jvmObjects[i].Invoke("_1"));
JvmObjectReference distObject = (JvmObjectReference)jvmObjects[i].Invoke("_2");
if (JvmObjectUtils.TryConstructInstanceFromJvmObject(
@ -197,7 +197,7 @@ namespace Synapse.ML.Automl
public override T GetNext() =>
(T)Reference.Invoke("getNext");
public override ParamPair<T> GetParamPair(Param param) =>
new ParamPair<T>((JvmObjectReference)Reference.Invoke("getParamPair", param));

View file

@ -102,7 +102,7 @@ namespace Synapse.ML.LightGBM.Param
(bool)Reference.Invoke("updateOneIterationCustom", gradient, hessian);
/// <summary>
/// Sets the start index of the iteration to predict.
/// Sets the start index of the iteration to predict.
/// If <= 0, starts from the first iteration.
/// </summary>
/// <param name="startIteration">The start index of the iteration to predict.</param>

View file

@ -38,7 +38,7 @@ namespace Synapse.ML.Automl
/// <summary>
/// Creates a new instance of a <see cref="RandomSpace"/>
/// </summary>
public RandomSpace((Param, DistObject)[] value)
public RandomSpace((Param, DistObject)[] value)
: this(SparkEnvironment.JvmBridge.CallConstructor(s_className, value))
{
}
@ -53,5 +53,5 @@ namespace Synapse.ML.Automl
override public IEnumerable<ParamMap> ParamMaps() =>
(IEnumerable<ParamMap>)Reference.Invoke("paramMaps");
}
}

View file

@ -56,7 +56,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetAggregationDepth(int value) =>
WrapAsLogisticRegression(Reference.Invoke("setAggregationDepth", (object)value));
/// <summary>
/// Sets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -66,7 +66,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetElasticNetParam(double value) =>
WrapAsLogisticRegression(Reference.Invoke("setElasticNetParam", (object)value));
/// <summary>
/// Sets family value for <see cref="family"/>
/// </summary>
@ -76,7 +76,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetFamily(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setFamily", (object)value));
/// <summary>
/// Sets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -86,7 +86,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetFeaturesCol(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setFeaturesCol", (object)value));
/// <summary>
/// Sets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -96,7 +96,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetFitIntercept(bool value) =>
WrapAsLogisticRegression(Reference.Invoke("setFitIntercept", (object)value));
/// <summary>
/// Sets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -106,7 +106,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetLabelCol(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setLabelCol", (object)value));
/// <summary>
/// Sets lowerBoundsOnCoefficients value for <see cref="lowerBoundsOnCoefficients"/>
/// </summary>
@ -116,7 +116,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetLowerBoundsOnCoefficients(object value) =>
WrapAsLogisticRegression(Reference.Invoke("setLowerBoundsOnCoefficients", (object)value));
/// <summary>
/// Sets lowerBoundsOnIntercepts value for <see cref="lowerBoundsOnIntercepts"/>
/// </summary>
@ -126,7 +126,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetLowerBoundsOnIntercepts(object value) =>
WrapAsLogisticRegression(Reference.Invoke("setLowerBoundsOnIntercepts", (object)value));
/// <summary>
/// Sets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -136,7 +136,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetMaxBlockSizeInMB(double value) =>
WrapAsLogisticRegression(Reference.Invoke("setMaxBlockSizeInMB", (object)value));
/// <summary>
/// Sets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -146,7 +146,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetMaxIter(int value) =>
WrapAsLogisticRegression(Reference.Invoke("setMaxIter", (object)value));
/// <summary>
/// Sets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -156,7 +156,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetPredictionCol(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setPredictionCol", (object)value));
/// <summary>
/// Sets probabilityCol value for <see cref="probabilityCol"/>
/// </summary>
@ -166,7 +166,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetProbabilityCol(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setProbabilityCol", (object)value));
/// <summary>
/// Sets rawPredictionCol value for <see cref="rawPredictionCol"/>
/// </summary>
@ -176,7 +176,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetRawPredictionCol(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setRawPredictionCol", (object)value));
/// <summary>
/// Sets regParam value for <see cref="regParam"/>
/// </summary>
@ -186,7 +186,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetRegParam(double value) =>
WrapAsLogisticRegression(Reference.Invoke("setRegParam", (object)value));
/// <summary>
/// Sets standardization value for <see cref="standardization"/>
/// </summary>
@ -196,7 +196,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetStandardization(bool value) =>
WrapAsLogisticRegression(Reference.Invoke("setStandardization", (object)value));
/// <summary>
/// Sets threshold value for <see cref="threshold"/>
/// </summary>
@ -206,7 +206,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetThreshold(double value) =>
WrapAsLogisticRegression(Reference.Invoke("setThreshold", (object)value));
/// <summary>
/// Sets thresholds value for <see cref="thresholds"/>
/// </summary>
@ -216,7 +216,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetThresholds(double[] value) =>
WrapAsLogisticRegression(Reference.Invoke("setThresholds", (object)value));
/// <summary>
/// Sets tol value for <see cref="tol"/>
/// </summary>
@ -226,7 +226,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetTol(double value) =>
WrapAsLogisticRegression(Reference.Invoke("setTol", (object)value));
/// <summary>
/// Sets upperBoundsOnCoefficients value for <see cref="upperBoundsOnCoefficients"/>
/// </summary>
@ -236,7 +236,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetUpperBoundsOnCoefficients(object value) =>
WrapAsLogisticRegression(Reference.Invoke("setUpperBoundsOnCoefficients", (object)value));
/// <summary>
/// Sets upperBoundsOnIntercepts value for <see cref="upperBoundsOnIntercepts"/>
/// </summary>
@ -246,7 +246,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegression object </returns>
public LogisticRegression SetUpperBoundsOnIntercepts(object value) =>
WrapAsLogisticRegression(Reference.Invoke("setUpperBoundsOnIntercepts", (object)value));
/// <summary>
/// Sets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -257,7 +257,7 @@ namespace Microsoft.Spark.ML.Classification
public LogisticRegression SetWeightCol(string value) =>
WrapAsLogisticRegression(Reference.Invoke("setWeightCol", (object)value));
/// <summary>
/// Gets aggregationDepth value for <see cref="aggregationDepth"/>
/// </summary>
@ -266,8 +266,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public int GetAggregationDepth() =>
(int)Reference.Invoke("getAggregationDepth");
/// <summary>
/// Gets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -276,8 +276,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetElasticNetParam() =>
(double)Reference.Invoke("getElasticNetParam");
/// <summary>
/// Gets family value for <see cref="family"/>
/// </summary>
@ -286,8 +286,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetFamily() =>
(string)Reference.Invoke("getFamily");
/// <summary>
/// Gets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -296,8 +296,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetFeaturesCol() =>
(string)Reference.Invoke("getFeaturesCol");
/// <summary>
/// Gets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -306,8 +306,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public bool GetFitIntercept() =>
(bool)Reference.Invoke("getFitIntercept");
/// <summary>
/// Gets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -316,8 +316,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetLabelCol() =>
(string)Reference.Invoke("getLabelCol");
/// <summary>
/// Gets lowerBoundsOnCoefficients value for <see cref="lowerBoundsOnCoefficients"/>
/// </summary>
@ -326,8 +326,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetLowerBoundsOnCoefficients() =>
(object)Reference.Invoke("getLowerBoundsOnCoefficients");
/// <summary>
/// Gets lowerBoundsOnIntercepts value for <see cref="lowerBoundsOnIntercepts"/>
/// </summary>
@ -336,8 +336,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetLowerBoundsOnIntercepts() =>
(object)Reference.Invoke("getLowerBoundsOnIntercepts");
/// <summary>
/// Gets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -346,8 +346,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetMaxBlockSizeInMB() =>
(double)Reference.Invoke("getMaxBlockSizeInMB");
/// <summary>
/// Gets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -356,8 +356,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public int GetMaxIter() =>
(int)Reference.Invoke("getMaxIter");
/// <summary>
/// Gets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -366,8 +366,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetPredictionCol() =>
(string)Reference.Invoke("getPredictionCol");
/// <summary>
/// Gets probabilityCol value for <see cref="probabilityCol"/>
/// </summary>
@ -376,8 +376,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetProbabilityCol() =>
(string)Reference.Invoke("getProbabilityCol");
/// <summary>
/// Gets rawPredictionCol value for <see cref="rawPredictionCol"/>
/// </summary>
@ -386,8 +386,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetRawPredictionCol() =>
(string)Reference.Invoke("getRawPredictionCol");
/// <summary>
/// Gets regParam value for <see cref="regParam"/>
/// </summary>
@ -396,8 +396,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetRegParam() =>
(double)Reference.Invoke("getRegParam");
/// <summary>
/// Gets standardization value for <see cref="standardization"/>
/// </summary>
@ -406,8 +406,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public bool GetStandardization() =>
(bool)Reference.Invoke("getStandardization");
/// <summary>
/// Gets threshold value for <see cref="threshold"/>
/// </summary>
@ -416,8 +416,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetThreshold() =>
(double)Reference.Invoke("getThreshold");
/// <summary>
/// Gets thresholds value for <see cref="thresholds"/>
/// </summary>
@ -426,8 +426,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double[] GetThresholds() =>
(double[])Reference.Invoke("getThresholds");
/// <summary>
/// Gets tol value for <see cref="tol"/>
/// </summary>
@ -436,8 +436,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetTol() =>
(double)Reference.Invoke("getTol");
/// <summary>
/// Gets upperBoundsOnCoefficients value for <see cref="upperBoundsOnCoefficients"/>
/// </summary>
@ -446,8 +446,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetUpperBoundsOnCoefficients() =>
(object)Reference.Invoke("getUpperBoundsOnCoefficients");
/// <summary>
/// Gets upperBoundsOnIntercepts value for <see cref="upperBoundsOnIntercepts"/>
/// </summary>
@ -456,8 +456,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetUpperBoundsOnIntercepts() =>
(object)Reference.Invoke("getUpperBoundsOnIntercepts");
/// <summary>
/// Gets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -481,18 +481,18 @@ namespace Microsoft.Spark.ML.Classification
/// <returns>New <see cref="LogisticRegression"/> object, loaded from path.</returns>
public static LogisticRegression Load(string path) => WrapAsLogisticRegression(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -503,8 +503,6 @@ namespace Microsoft.Spark.ML.Classification
private static LogisticRegression WrapAsLogisticRegression(object obj) =>
new LogisticRegression((JvmObjectReference)obj);
}
}

View file

@ -51,7 +51,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetAggregationDepth(int value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setAggregationDepth", (object)value));
/// <summary>
/// Sets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -61,7 +61,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetElasticNetParam(double value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setElasticNetParam", (object)value));
/// <summary>
/// Sets family value for <see cref="family"/>
/// </summary>
@ -71,7 +71,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetFamily(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setFamily", (object)value));
/// <summary>
/// Sets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -81,7 +81,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetFeaturesCol(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setFeaturesCol", (object)value));
/// <summary>
/// Sets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -91,7 +91,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetFitIntercept(bool value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setFitIntercept", (object)value));
/// <summary>
/// Sets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -101,7 +101,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetLabelCol(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setLabelCol", (object)value));
/// <summary>
/// Sets lowerBoundsOnCoefficients value for <see cref="lowerBoundsOnCoefficients"/>
/// </summary>
@ -111,7 +111,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetLowerBoundsOnCoefficients(object value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setLowerBoundsOnCoefficients", (object)value));
/// <summary>
/// Sets lowerBoundsOnIntercepts value for <see cref="lowerBoundsOnIntercepts"/>
/// </summary>
@ -121,7 +121,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetLowerBoundsOnIntercepts(object value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setLowerBoundsOnIntercepts", (object)value));
/// <summary>
/// Sets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -131,7 +131,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetMaxBlockSizeInMB(double value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setMaxBlockSizeInMB", (object)value));
/// <summary>
/// Sets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -141,7 +141,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetMaxIter(int value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setMaxIter", (object)value));
/// <summary>
/// Sets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -151,7 +151,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetPredictionCol(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setPredictionCol", (object)value));
/// <summary>
/// Sets probabilityCol value for <see cref="probabilityCol"/>
/// </summary>
@ -161,7 +161,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetProbabilityCol(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setProbabilityCol", (object)value));
/// <summary>
/// Sets rawPredictionCol value for <see cref="rawPredictionCol"/>
/// </summary>
@ -171,7 +171,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetRawPredictionCol(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setRawPredictionCol", (object)value));
/// <summary>
/// Sets regParam value for <see cref="regParam"/>
/// </summary>
@ -181,7 +181,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetRegParam(double value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setRegParam", (object)value));
/// <summary>
/// Sets standardization value for <see cref="standardization"/>
/// </summary>
@ -191,7 +191,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetStandardization(bool value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setStandardization", (object)value));
/// <summary>
/// Sets threshold value for <see cref="threshold"/>
/// </summary>
@ -201,7 +201,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetThreshold(double value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setThreshold", (object)value));
/// <summary>
/// Sets thresholds value for <see cref="thresholds"/>
/// </summary>
@ -211,7 +211,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetThresholds(double[] value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setThresholds", (object)value));
/// <summary>
/// Sets tol value for <see cref="tol"/>
/// </summary>
@ -221,7 +221,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetTol(double value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setTol", (object)value));
/// <summary>
/// Sets upperBoundsOnCoefficients value for <see cref="upperBoundsOnCoefficients"/>
/// </summary>
@ -231,7 +231,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetUpperBoundsOnCoefficients(object value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setUpperBoundsOnCoefficients", (object)value));
/// <summary>
/// Sets upperBoundsOnIntercepts value for <see cref="upperBoundsOnIntercepts"/>
/// </summary>
@ -241,7 +241,7 @@ namespace Microsoft.Spark.ML.Classification
/// <returns> New LogisticRegressionModel object </returns>
public LogisticRegressionModel SetUpperBoundsOnIntercepts(object value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setUpperBoundsOnIntercepts", (object)value));
/// <summary>
/// Sets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -252,7 +252,7 @@ namespace Microsoft.Spark.ML.Classification
public LogisticRegressionModel SetWeightCol(string value) =>
WrapAsLogisticRegressionModel(Reference.Invoke("setWeightCol", (object)value));
/// <summary>
/// Gets aggregationDepth value for <see cref="aggregationDepth"/>
/// </summary>
@ -261,8 +261,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public int GetAggregationDepth() =>
(int)Reference.Invoke("getAggregationDepth");
/// <summary>
/// Gets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -271,8 +271,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetElasticNetParam() =>
(double)Reference.Invoke("getElasticNetParam");
/// <summary>
/// Gets family value for <see cref="family"/>
/// </summary>
@ -281,8 +281,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetFamily() =>
(string)Reference.Invoke("getFamily");
/// <summary>
/// Gets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -291,8 +291,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetFeaturesCol() =>
(string)Reference.Invoke("getFeaturesCol");
/// <summary>
/// Gets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -301,8 +301,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public bool GetFitIntercept() =>
(bool)Reference.Invoke("getFitIntercept");
/// <summary>
/// Gets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -311,8 +311,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetLabelCol() =>
(string)Reference.Invoke("getLabelCol");
/// <summary>
/// Gets lowerBoundsOnCoefficients value for <see cref="lowerBoundsOnCoefficients"/>
/// </summary>
@ -321,8 +321,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetLowerBoundsOnCoefficients() =>
(object)Reference.Invoke("getLowerBoundsOnCoefficients");
/// <summary>
/// Gets lowerBoundsOnIntercepts value for <see cref="lowerBoundsOnIntercepts"/>
/// </summary>
@ -331,8 +331,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetLowerBoundsOnIntercepts() =>
(object)Reference.Invoke("getLowerBoundsOnIntercepts");
/// <summary>
/// Gets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -341,8 +341,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetMaxBlockSizeInMB() =>
(double)Reference.Invoke("getMaxBlockSizeInMB");
/// <summary>
/// Gets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -351,8 +351,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public int GetMaxIter() =>
(int)Reference.Invoke("getMaxIter");
/// <summary>
/// Gets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -361,8 +361,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetPredictionCol() =>
(string)Reference.Invoke("getPredictionCol");
/// <summary>
/// Gets probabilityCol value for <see cref="probabilityCol"/>
/// </summary>
@ -371,8 +371,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetProbabilityCol() =>
(string)Reference.Invoke("getProbabilityCol");
/// <summary>
/// Gets rawPredictionCol value for <see cref="rawPredictionCol"/>
/// </summary>
@ -381,8 +381,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public string GetRawPredictionCol() =>
(string)Reference.Invoke("getRawPredictionCol");
/// <summary>
/// Gets regParam value for <see cref="regParam"/>
/// </summary>
@ -391,8 +391,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetRegParam() =>
(double)Reference.Invoke("getRegParam");
/// <summary>
/// Gets standardization value for <see cref="standardization"/>
/// </summary>
@ -401,8 +401,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public bool GetStandardization() =>
(bool)Reference.Invoke("getStandardization");
/// <summary>
/// Gets threshold value for <see cref="threshold"/>
/// </summary>
@ -411,8 +411,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetThreshold() =>
(double)Reference.Invoke("getThreshold");
/// <summary>
/// Gets thresholds value for <see cref="thresholds"/>
/// </summary>
@ -421,8 +421,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double[] GetThresholds() =>
(double[])Reference.Invoke("getThresholds");
/// <summary>
/// Gets tol value for <see cref="tol"/>
/// </summary>
@ -431,8 +431,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public double GetTol() =>
(double)Reference.Invoke("getTol");
/// <summary>
/// Gets upperBoundsOnCoefficients value for <see cref="upperBoundsOnCoefficients"/>
/// </summary>
@ -441,8 +441,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetUpperBoundsOnCoefficients() =>
(object)Reference.Invoke("getUpperBoundsOnCoefficients");
/// <summary>
/// Gets upperBoundsOnIntercepts value for <see cref="upperBoundsOnIntercepts"/>
/// </summary>
@ -451,8 +451,8 @@ namespace Microsoft.Spark.ML.Classification
/// </returns>
public object GetUpperBoundsOnIntercepts() =>
(object)Reference.Invoke("getUpperBoundsOnIntercepts");
/// <summary>
/// Gets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -462,7 +462,7 @@ namespace Microsoft.Spark.ML.Classification
public string GetWeightCol() =>
(string)Reference.Invoke("getWeightCol");
/// <summary>
/// Loads the <see cref="LogisticRegressionModel"/> that was previously saved using Save(string).
/// </summary>
@ -470,18 +470,18 @@ namespace Microsoft.Spark.ML.Classification
/// <returns>New <see cref="LogisticRegressionModel"/> object, loaded from path.</returns>
public static LogisticRegressionModel Load(string path) => WrapAsLogisticRegressionModel(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -492,8 +492,6 @@ namespace Microsoft.Spark.ML.Classification
private static LogisticRegressionModel WrapAsLogisticRegressionModel(object obj) =>
new LogisticRegressionModel((JvmObjectReference)obj);
}
}

View file

@ -56,7 +56,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexer object </returns>
public StringIndexer SetHandleInvalid(string value) =>
WrapAsStringIndexer(Reference.Invoke("setHandleInvalid", (object)value));
/// <summary>
/// Sets inputCol value for <see cref="inputCol"/>
/// </summary>
@ -66,7 +66,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexer object </returns>
public StringIndexer SetInputCol(string value) =>
WrapAsStringIndexer(Reference.Invoke("setInputCol", (object)value));
/// <summary>
/// Sets inputCols value for <see cref="inputCols"/>
/// </summary>
@ -76,7 +76,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexer object </returns>
public StringIndexer SetInputCols(string[] value) =>
WrapAsStringIndexer(Reference.Invoke("setInputCols", (object)value));
/// <summary>
/// Sets outputCol value for <see cref="outputCol"/>
/// </summary>
@ -86,7 +86,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexer object </returns>
public StringIndexer SetOutputCol(string value) =>
WrapAsStringIndexer(Reference.Invoke("setOutputCol", (object)value));
/// <summary>
/// Sets outputCols value for <see cref="outputCols"/>
/// </summary>
@ -96,7 +96,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexer object </returns>
public StringIndexer SetOutputCols(string[] value) =>
WrapAsStringIndexer(Reference.Invoke("setOutputCols", (object)value));
/// <summary>
/// Sets stringOrderType value for <see cref="stringOrderType"/>
/// </summary>
@ -107,7 +107,7 @@ namespace Microsoft.Spark.ML.Feature
public StringIndexer SetStringOrderType(string value) =>
WrapAsStringIndexer(Reference.Invoke("setStringOrderType", (object)value));
/// <summary>
/// Gets handleInvalid value for <see cref="handleInvalid"/>
/// </summary>
@ -116,8 +116,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string GetHandleInvalid() =>
(string)Reference.Invoke("getHandleInvalid");
/// <summary>
/// Gets inputCol value for <see cref="inputCol"/>
/// </summary>
@ -126,8 +126,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string GetInputCol() =>
(string)Reference.Invoke("getInputCol");
/// <summary>
/// Gets inputCols value for <see cref="inputCols"/>
/// </summary>
@ -136,8 +136,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string[] GetInputCols() =>
(string[])Reference.Invoke("getInputCols");
/// <summary>
/// Gets outputCol value for <see cref="outputCol"/>
/// </summary>
@ -146,8 +146,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string GetOutputCol() =>
(string)Reference.Invoke("getOutputCol");
/// <summary>
/// Gets outputCols value for <see cref="outputCols"/>
/// </summary>
@ -156,8 +156,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string[] GetOutputCols() =>
(string[])Reference.Invoke("getOutputCols");
/// <summary>
/// Gets stringOrderType value for <see cref="stringOrderType"/>
/// </summary>
@ -181,18 +181,18 @@ namespace Microsoft.Spark.ML.Feature
/// <returns>New <see cref="StringIndexer"/> object, loaded from path.</returns>
public static StringIndexer Load(string path) => WrapAsStringIndexer(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -203,8 +203,6 @@ namespace Microsoft.Spark.ML.Feature
private static StringIndexer WrapAsStringIndexer(object obj) =>
new StringIndexer((JvmObjectReference)obj);
}
}

View file

@ -79,7 +79,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexerModel object </returns>
public StringIndexerModel SetHandleInvalid(string value) =>
WrapAsStringIndexerModel(Reference.Invoke("setHandleInvalid", (object)value));
/// <summary>
/// Sets inputCol value for <see cref="inputCol"/>
/// </summary>
@ -89,7 +89,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexerModel object </returns>
public StringIndexerModel SetInputCol(string value) =>
WrapAsStringIndexerModel(Reference.Invoke("setInputCol", (object)value));
/// <summary>
/// Sets inputCols value for <see cref="inputCols"/>
/// </summary>
@ -99,7 +99,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexerModel object </returns>
public StringIndexerModel SetInputCols(string[] value) =>
WrapAsStringIndexerModel(Reference.Invoke("setInputCols", (object)value));
/// <summary>
/// Sets outputCol value for <see cref="outputCol"/>
/// </summary>
@ -109,7 +109,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexerModel object </returns>
public StringIndexerModel SetOutputCol(string value) =>
WrapAsStringIndexerModel(Reference.Invoke("setOutputCol", (object)value));
/// <summary>
/// Sets outputCols value for <see cref="outputCols"/>
/// </summary>
@ -119,7 +119,7 @@ namespace Microsoft.Spark.ML.Feature
/// <returns> New StringIndexerModel object </returns>
public StringIndexerModel SetOutputCols(string[] value) =>
WrapAsStringIndexerModel(Reference.Invoke("setOutputCols", (object)value));
/// <summary>
/// Sets stringOrderType value for <see cref="stringOrderType"/>
/// </summary>
@ -130,7 +130,7 @@ namespace Microsoft.Spark.ML.Feature
public StringIndexerModel SetStringOrderType(string value) =>
WrapAsStringIndexerModel(Reference.Invoke("setStringOrderType", (object)value));
/// <summary>
/// Gets handleInvalid value for <see cref="handleInvalid"/>
/// </summary>
@ -139,8 +139,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string GetHandleInvalid() =>
(string)Reference.Invoke("getHandleInvalid");
/// <summary>
/// Gets inputCol value for <see cref="inputCol"/>
/// </summary>
@ -149,8 +149,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string GetInputCol() =>
(string)Reference.Invoke("getInputCol");
/// <summary>
/// Gets inputCols value for <see cref="inputCols"/>
/// </summary>
@ -159,8 +159,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string[] GetInputCols() =>
(string[])Reference.Invoke("getInputCols");
/// <summary>
/// Gets outputCol value for <see cref="outputCol"/>
/// </summary>
@ -169,8 +169,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string GetOutputCol() =>
(string)Reference.Invoke("getOutputCol");
/// <summary>
/// Gets outputCols value for <see cref="outputCols"/>
/// </summary>
@ -179,8 +179,8 @@ namespace Microsoft.Spark.ML.Feature
/// </returns>
public string[] GetOutputCols() =>
(string[])Reference.Invoke("getOutputCols");
/// <summary>
/// Gets stringOrderType value for <see cref="stringOrderType"/>
/// </summary>
@ -190,7 +190,7 @@ namespace Microsoft.Spark.ML.Feature
public string GetStringOrderType() =>
(string)Reference.Invoke("getStringOrderType");
/// <summary>
/// Loads the <see cref="StringIndexerModel"/> that was previously saved using Save(string).
/// </summary>
@ -198,18 +198,18 @@ namespace Microsoft.Spark.ML.Feature
/// <returns>New <see cref="StringIndexerModel"/> object, loaded from path.</returns>
public static StringIndexerModel Load(string path) => WrapAsStringIndexerModel(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -220,8 +220,6 @@ namespace Microsoft.Spark.ML.Feature
private static StringIndexerModel WrapAsStringIndexerModel(object obj) =>
new StringIndexerModel((JvmObjectReference)obj);
}
}

View file

@ -56,7 +56,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetAlpha(double value) =>
WrapAsALS(Reference.Invoke("setAlpha", (object)value));
/// <summary>
/// Sets blockSize value for <see cref="blockSize"/>
/// </summary>
@ -66,7 +66,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetBlockSize(int value) =>
WrapAsALS(Reference.Invoke("setBlockSize", (object)value));
/// <summary>
/// Sets checkpointInterval value for <see cref="checkpointInterval"/>
/// </summary>
@ -76,7 +76,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetCheckpointInterval(int value) =>
WrapAsALS(Reference.Invoke("setCheckpointInterval", (object)value));
/// <summary>
/// Sets coldStartStrategy value for <see cref="coldStartStrategy"/>
/// </summary>
@ -86,7 +86,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetColdStartStrategy(string value) =>
WrapAsALS(Reference.Invoke("setColdStartStrategy", (object)value));
/// <summary>
/// Sets finalStorageLevel value for <see cref="finalStorageLevel"/>
/// </summary>
@ -96,7 +96,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetFinalStorageLevel(string value) =>
WrapAsALS(Reference.Invoke("setFinalStorageLevel", (object)value));
/// <summary>
/// Sets implicitPrefs value for <see cref="implicitPrefs"/>
/// </summary>
@ -106,7 +106,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetImplicitPrefs(bool value) =>
WrapAsALS(Reference.Invoke("setImplicitPrefs", (object)value));
/// <summary>
/// Sets intermediateStorageLevel value for <see cref="intermediateStorageLevel"/>
/// </summary>
@ -116,7 +116,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetIntermediateStorageLevel(string value) =>
WrapAsALS(Reference.Invoke("setIntermediateStorageLevel", (object)value));
/// <summary>
/// Sets itemCol value for <see cref="itemCol"/>
/// </summary>
@ -126,7 +126,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetItemCol(string value) =>
WrapAsALS(Reference.Invoke("setItemCol", (object)value));
/// <summary>
/// Sets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -136,7 +136,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetMaxIter(int value) =>
WrapAsALS(Reference.Invoke("setMaxIter", (object)value));
/// <summary>
/// Sets nonnegative value for <see cref="nonnegative"/>
/// </summary>
@ -146,7 +146,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetNonnegative(bool value) =>
WrapAsALS(Reference.Invoke("setNonnegative", (object)value));
/// <summary>
/// Sets numItemBlocks value for <see cref="numItemBlocks"/>
/// </summary>
@ -156,7 +156,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetNumItemBlocks(int value) =>
WrapAsALS(Reference.Invoke("setNumItemBlocks", (object)value));
/// <summary>
/// Sets numUserBlocks value for <see cref="numUserBlocks"/>
/// </summary>
@ -166,7 +166,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetNumUserBlocks(int value) =>
WrapAsALS(Reference.Invoke("setNumUserBlocks", (object)value));
/// <summary>
/// Sets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -176,7 +176,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetPredictionCol(string value) =>
WrapAsALS(Reference.Invoke("setPredictionCol", (object)value));
/// <summary>
/// Sets rank value for <see cref="rank"/>
/// </summary>
@ -186,7 +186,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetRank(int value) =>
WrapAsALS(Reference.Invoke("setRank", (object)value));
/// <summary>
/// Sets ratingCol value for <see cref="ratingCol"/>
/// </summary>
@ -196,7 +196,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetRatingCol(string value) =>
WrapAsALS(Reference.Invoke("setRatingCol", (object)value));
/// <summary>
/// Sets regParam value for <see cref="regParam"/>
/// </summary>
@ -206,7 +206,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetRegParam(double value) =>
WrapAsALS(Reference.Invoke("setRegParam", (object)value));
/// <summary>
/// Sets seed value for <see cref="seed"/>
/// </summary>
@ -216,7 +216,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALS object </returns>
public ALS SetSeed(long value) =>
WrapAsALS(Reference.Invoke("setSeed", (object)value));
/// <summary>
/// Sets userCol value for <see cref="userCol"/>
/// </summary>
@ -227,7 +227,7 @@ namespace Microsoft.Spark.ML.Recommendation
public ALS SetUserCol(string value) =>
WrapAsALS(Reference.Invoke("setUserCol", (object)value));
/// <summary>
/// Gets alpha value for <see cref="alpha"/>
/// </summary>
@ -236,8 +236,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public double GetAlpha() =>
(double)Reference.Invoke("getAlpha");
/// <summary>
/// Gets blockSize value for <see cref="blockSize"/>
/// </summary>
@ -246,8 +246,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetBlockSize() =>
(int)Reference.Invoke("getBlockSize");
/// <summary>
/// Gets checkpointInterval value for <see cref="checkpointInterval"/>
/// </summary>
@ -256,8 +256,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetCheckpointInterval() =>
(int)Reference.Invoke("getCheckpointInterval");
/// <summary>
/// Gets coldStartStrategy value for <see cref="coldStartStrategy"/>
/// </summary>
@ -266,8 +266,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetColdStartStrategy() =>
(string)Reference.Invoke("getColdStartStrategy");
/// <summary>
/// Gets finalStorageLevel value for <see cref="finalStorageLevel"/>
/// </summary>
@ -276,8 +276,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetFinalStorageLevel() =>
(string)Reference.Invoke("getFinalStorageLevel");
/// <summary>
/// Gets implicitPrefs value for <see cref="implicitPrefs"/>
/// </summary>
@ -286,8 +286,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public bool GetImplicitPrefs() =>
(bool)Reference.Invoke("getImplicitPrefs");
/// <summary>
/// Gets intermediateStorageLevel value for <see cref="intermediateStorageLevel"/>
/// </summary>
@ -296,8 +296,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetIntermediateStorageLevel() =>
(string)Reference.Invoke("getIntermediateStorageLevel");
/// <summary>
/// Gets itemCol value for <see cref="itemCol"/>
/// </summary>
@ -306,8 +306,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetItemCol() =>
(string)Reference.Invoke("getItemCol");
/// <summary>
/// Gets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -316,8 +316,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetMaxIter() =>
(int)Reference.Invoke("getMaxIter");
/// <summary>
/// Gets nonnegative value for <see cref="nonnegative"/>
/// </summary>
@ -326,8 +326,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public bool GetNonnegative() =>
(bool)Reference.Invoke("getNonnegative");
/// <summary>
/// Gets numItemBlocks value for <see cref="numItemBlocks"/>
/// </summary>
@ -336,8 +336,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetNumItemBlocks() =>
(int)Reference.Invoke("getNumItemBlocks");
/// <summary>
/// Gets numUserBlocks value for <see cref="numUserBlocks"/>
/// </summary>
@ -346,8 +346,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetNumUserBlocks() =>
(int)Reference.Invoke("getNumUserBlocks");
/// <summary>
/// Gets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -356,8 +356,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetPredictionCol() =>
(string)Reference.Invoke("getPredictionCol");
/// <summary>
/// Gets rank value for <see cref="rank"/>
/// </summary>
@ -366,8 +366,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetRank() =>
(int)Reference.Invoke("getRank");
/// <summary>
/// Gets ratingCol value for <see cref="ratingCol"/>
/// </summary>
@ -376,8 +376,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetRatingCol() =>
(string)Reference.Invoke("getRatingCol");
/// <summary>
/// Gets regParam value for <see cref="regParam"/>
/// </summary>
@ -386,8 +386,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public double GetRegParam() =>
(double)Reference.Invoke("getRegParam");
/// <summary>
/// Gets seed value for <see cref="seed"/>
/// </summary>
@ -396,8 +396,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public long GetSeed() =>
(long)Reference.Invoke("getSeed");
/// <summary>
/// Gets userCol value for <see cref="userCol"/>
/// </summary>
@ -421,18 +421,18 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns>New <see cref="ALS"/> object, loaded from path.</returns>
public static ALS Load(string path) => WrapAsALS(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -443,8 +443,6 @@ namespace Microsoft.Spark.ML.Recommendation
private static ALS WrapAsALS(object obj) =>
new ALS((JvmObjectReference)obj);
}
}
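The ALS wrapper above is a thin .NET binding over Spark MLlib's ALS, so its Set*/Get* pairs map one-to-one onto the usual estimator parameters. For orientation, a minimal PySpark sketch using the same parameters (column names and the ratings_df variable are placeholders) is:

from pyspark.ml.recommendation import ALS, ALSModel

# Same knobs as the setters above: rank, maxIter, regParam, the three column
# names, and coldStartStrategy.
als = ALS(
    rank=10,
    maxIter=15,
    regParam=0.1,
    userCol="userId",
    itemCol="movieId",
    ratingCol="rating",
    coldStartStrategy="drop",  # drop rows that would otherwise score as NaN
)
model = als.fit(ratings_df)                 # ratings_df: an existing DataFrame
model.save("/tmp/als_model")                # mirrors Save(path) above
reloaded = ALSModel.load("/tmp/als_model")  # mirrors the static Load(path) above
scored = reloaded.transform(ratings_df)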

View file

@ -53,7 +53,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALSModel object </returns>
public ALSModel SetBlockSize(int value) =>
WrapAsALSModel(Reference.Invoke("setBlockSize", (object)value));
/// <summary>
/// Sets coldStartStrategy value for <see cref="coldStartStrategy"/>
/// </summary>
@ -63,7 +63,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALSModel object </returns>
public ALSModel SetColdStartStrategy(string value) =>
WrapAsALSModel(Reference.Invoke("setColdStartStrategy", (object)value));
/// <summary>
/// Sets itemCol value for <see cref="itemCol"/>
/// </summary>
@ -73,7 +73,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALSModel object </returns>
public ALSModel SetItemCol(string value) =>
WrapAsALSModel(Reference.Invoke("setItemCol", (object)value));
/// <summary>
/// Sets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -83,7 +83,7 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns> New ALSModel object </returns>
public ALSModel SetPredictionCol(string value) =>
WrapAsALSModel(Reference.Invoke("setPredictionCol", (object)value));
/// <summary>
/// Sets userCol value for <see cref="userCol"/>
/// </summary>
@ -94,7 +94,7 @@ namespace Microsoft.Spark.ML.Recommendation
public ALSModel SetUserCol(string value) =>
WrapAsALSModel(Reference.Invoke("setUserCol", (object)value));
/// <summary>
/// Gets blockSize value for <see cref="blockSize"/>
/// </summary>
@ -103,8 +103,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public int GetBlockSize() =>
(int)Reference.Invoke("getBlockSize");
/// <summary>
/// Gets coldStartStrategy value for <see cref="coldStartStrategy"/>
/// </summary>
@ -113,8 +113,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetColdStartStrategy() =>
(string)Reference.Invoke("getColdStartStrategy");
/// <summary>
/// Gets itemCol value for <see cref="itemCol"/>
/// </summary>
@ -123,8 +123,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetItemCol() =>
(string)Reference.Invoke("getItemCol");
/// <summary>
/// Gets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -133,8 +133,8 @@ namespace Microsoft.Spark.ML.Recommendation
/// </returns>
public string GetPredictionCol() =>
(string)Reference.Invoke("getPredictionCol");
/// <summary>
/// Gets userCol value for <see cref="userCol"/>
/// </summary>
@ -144,7 +144,7 @@ namespace Microsoft.Spark.ML.Recommendation
public string GetUserCol() =>
(string)Reference.Invoke("getUserCol");
/// <summary>
/// Loads the <see cref="ALSModel"/> that was previously saved using Save(string).
/// </summary>
@ -152,18 +152,18 @@ namespace Microsoft.Spark.ML.Recommendation
/// <returns>New <see cref="ALSModel"/> object, loaded from path.</returns>
public static ALSModel Load(string path) => WrapAsALSModel(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -174,8 +174,6 @@ namespace Microsoft.Spark.ML.Recommendation
private static ALSModel WrapAsALSModel(object obj) =>
new ALSModel((JvmObjectReference)obj);
}
}

View file

@ -56,7 +56,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetAggregationDepth(int value) =>
WrapAsLinearRegression(Reference.Invoke("setAggregationDepth", (object)value));
/// <summary>
/// Sets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -66,7 +66,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetElasticNetParam(double value) =>
WrapAsLinearRegression(Reference.Invoke("setElasticNetParam", (object)value));
/// <summary>
/// Sets epsilon value for <see cref="epsilon"/>
/// </summary>
@ -76,7 +76,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetEpsilon(double value) =>
WrapAsLinearRegression(Reference.Invoke("setEpsilon", (object)value));
/// <summary>
/// Sets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -86,7 +86,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetFeaturesCol(string value) =>
WrapAsLinearRegression(Reference.Invoke("setFeaturesCol", (object)value));
/// <summary>
/// Sets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -96,7 +96,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetFitIntercept(bool value) =>
WrapAsLinearRegression(Reference.Invoke("setFitIntercept", (object)value));
/// <summary>
/// Sets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -106,7 +106,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetLabelCol(string value) =>
WrapAsLinearRegression(Reference.Invoke("setLabelCol", (object)value));
/// <summary>
/// Sets loss value for <see cref="loss"/>
/// </summary>
@ -116,7 +116,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetLoss(string value) =>
WrapAsLinearRegression(Reference.Invoke("setLoss", (object)value));
/// <summary>
/// Sets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -126,7 +126,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetMaxBlockSizeInMB(double value) =>
WrapAsLinearRegression(Reference.Invoke("setMaxBlockSizeInMB", (object)value));
/// <summary>
/// Sets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -136,7 +136,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetMaxIter(int value) =>
WrapAsLinearRegression(Reference.Invoke("setMaxIter", (object)value));
/// <summary>
/// Sets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -146,7 +146,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetPredictionCol(string value) =>
WrapAsLinearRegression(Reference.Invoke("setPredictionCol", (object)value));
/// <summary>
/// Sets regParam value for <see cref="regParam"/>
/// </summary>
@ -156,7 +156,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetRegParam(double value) =>
WrapAsLinearRegression(Reference.Invoke("setRegParam", (object)value));
/// <summary>
/// Sets solver value for <see cref="solver"/>
/// </summary>
@ -166,7 +166,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetSolver(string value) =>
WrapAsLinearRegression(Reference.Invoke("setSolver", (object)value));
/// <summary>
/// Sets standardization value for <see cref="standardization"/>
/// </summary>
@ -176,7 +176,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetStandardization(bool value) =>
WrapAsLinearRegression(Reference.Invoke("setStandardization", (object)value));
/// <summary>
/// Sets tol value for <see cref="tol"/>
/// </summary>
@ -186,7 +186,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegression object </returns>
public LinearRegression SetTol(double value) =>
WrapAsLinearRegression(Reference.Invoke("setTol", (object)value));
/// <summary>
/// Sets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -197,7 +197,7 @@ namespace Microsoft.Spark.ML.Regression
public LinearRegression SetWeightCol(string value) =>
WrapAsLinearRegression(Reference.Invoke("setWeightCol", (object)value));
/// <summary>
/// Gets aggregationDepth value for <see cref="aggregationDepth"/>
/// </summary>
@ -206,8 +206,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public int GetAggregationDepth() =>
(int)Reference.Invoke("getAggregationDepth");
/// <summary>
/// Gets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -216,8 +216,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetElasticNetParam() =>
(double)Reference.Invoke("getElasticNetParam");
/// <summary>
/// Gets epsilon value for <see cref="epsilon"/>
/// </summary>
@ -226,8 +226,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetEpsilon() =>
(double)Reference.Invoke("getEpsilon");
/// <summary>
/// Gets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -236,8 +236,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetFeaturesCol() =>
(string)Reference.Invoke("getFeaturesCol");
/// <summary>
/// Gets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -246,8 +246,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public bool GetFitIntercept() =>
(bool)Reference.Invoke("getFitIntercept");
/// <summary>
/// Gets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -256,8 +256,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetLabelCol() =>
(string)Reference.Invoke("getLabelCol");
/// <summary>
/// Gets loss value for <see cref="loss"/>
/// </summary>
@ -266,8 +266,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetLoss() =>
(string)Reference.Invoke("getLoss");
/// <summary>
/// Gets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -276,8 +276,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetMaxBlockSizeInMB() =>
(double)Reference.Invoke("getMaxBlockSizeInMB");
/// <summary>
/// Gets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -286,8 +286,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public int GetMaxIter() =>
(int)Reference.Invoke("getMaxIter");
/// <summary>
/// Gets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -296,8 +296,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetPredictionCol() =>
(string)Reference.Invoke("getPredictionCol");
/// <summary>
/// Gets regParam value for <see cref="regParam"/>
/// </summary>
@ -306,8 +306,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetRegParam() =>
(double)Reference.Invoke("getRegParam");
/// <summary>
/// Gets solver value for <see cref="solver"/>
/// </summary>
@ -316,8 +316,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetSolver() =>
(string)Reference.Invoke("getSolver");
/// <summary>
/// Gets standardization value for <see cref="standardization"/>
/// </summary>
@ -326,8 +326,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public bool GetStandardization() =>
(bool)Reference.Invoke("getStandardization");
/// <summary>
/// Gets tol value for <see cref="tol"/>
/// </summary>
@ -336,8 +336,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetTol() =>
(double)Reference.Invoke("getTol");
/// <summary>
/// Gets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -361,18 +361,18 @@ namespace Microsoft.Spark.ML.Regression
/// <returns>New <see cref="LinearRegression"/> object, loaded from path.</returns>
public static LinearRegression Load(string path) => WrapAsLinearRegression(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -383,8 +383,6 @@ namespace Microsoft.Spark.ML.Regression
private static LinearRegression WrapAsLinearRegression(object obj) =>
new LinearRegression((JvmObjectReference)obj);
}
}
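LinearRegression follows the same pattern: every setter and getter above corresponds to a Spark MLlib parameter. A short PySpark sketch with those parameters (data and column names are placeholders) is:

from pyspark.ml.regression import LinearRegression

lr = LinearRegression(
    featuresCol="features",
    labelCol="label",
    maxIter=100,
    regParam=0.01,
    elasticNetParam=0.5,       # 0 = pure L2, 1 = pure L1
    fitIntercept=True,
    standardization=True,
    solver="auto",
)
# loss="huber" plus epsilon is also available, but huber requires
# elasticNetParam=0.0 (L2-only regularization).
lr_model = lr.fit(train_df)    # train_df: an existing DataFrame
print(lr_model.coefficients, lr_model.intercept)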

View file

@ -51,7 +51,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetAggregationDepth(int value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setAggregationDepth", (object)value));
/// <summary>
/// Sets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -61,7 +61,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetElasticNetParam(double value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setElasticNetParam", (object)value));
/// <summary>
/// Sets epsilon value for <see cref="epsilon"/>
/// </summary>
@ -71,7 +71,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetEpsilon(double value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setEpsilon", (object)value));
/// <summary>
/// Sets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -81,7 +81,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetFeaturesCol(string value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setFeaturesCol", (object)value));
/// <summary>
/// Sets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -91,7 +91,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetFitIntercept(bool value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setFitIntercept", (object)value));
/// <summary>
/// Sets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -101,7 +101,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetLabelCol(string value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setLabelCol", (object)value));
/// <summary>
/// Sets loss value for <see cref="loss"/>
/// </summary>
@ -111,7 +111,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetLoss(string value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setLoss", (object)value));
/// <summary>
/// Sets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -121,7 +121,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetMaxBlockSizeInMB(double value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setMaxBlockSizeInMB", (object)value));
/// <summary>
/// Sets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -131,7 +131,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetMaxIter(int value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setMaxIter", (object)value));
/// <summary>
/// Sets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -141,7 +141,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetPredictionCol(string value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setPredictionCol", (object)value));
/// <summary>
/// Sets regParam value for <see cref="regParam"/>
/// </summary>
@ -151,7 +151,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetRegParam(double value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setRegParam", (object)value));
/// <summary>
/// Sets solver value for <see cref="solver"/>
/// </summary>
@ -161,7 +161,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetSolver(string value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setSolver", (object)value));
/// <summary>
/// Sets standardization value for <see cref="standardization"/>
/// </summary>
@ -171,7 +171,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetStandardization(bool value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setStandardization", (object)value));
/// <summary>
/// Sets tol value for <see cref="tol"/>
/// </summary>
@ -181,7 +181,7 @@ namespace Microsoft.Spark.ML.Regression
/// <returns> New LinearRegressionModel object </returns>
public LinearRegressionModel SetTol(double value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setTol", (object)value));
/// <summary>
/// Sets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -192,7 +192,7 @@ namespace Microsoft.Spark.ML.Regression
public LinearRegressionModel SetWeightCol(string value) =>
WrapAsLinearRegressionModel(Reference.Invoke("setWeightCol", (object)value));
/// <summary>
/// Gets aggregationDepth value for <see cref="aggregationDepth"/>
/// </summary>
@ -201,8 +201,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public int GetAggregationDepth() =>
(int)Reference.Invoke("getAggregationDepth");
/// <summary>
/// Gets elasticNetParam value for <see cref="elasticNetParam"/>
/// </summary>
@ -211,8 +211,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetElasticNetParam() =>
(double)Reference.Invoke("getElasticNetParam");
/// <summary>
/// Gets epsilon value for <see cref="epsilon"/>
/// </summary>
@ -221,8 +221,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetEpsilon() =>
(double)Reference.Invoke("getEpsilon");
/// <summary>
/// Gets featuresCol value for <see cref="featuresCol"/>
/// </summary>
@ -231,8 +231,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetFeaturesCol() =>
(string)Reference.Invoke("getFeaturesCol");
/// <summary>
/// Gets fitIntercept value for <see cref="fitIntercept"/>
/// </summary>
@ -241,8 +241,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public bool GetFitIntercept() =>
(bool)Reference.Invoke("getFitIntercept");
/// <summary>
/// Gets labelCol value for <see cref="labelCol"/>
/// </summary>
@ -251,8 +251,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetLabelCol() =>
(string)Reference.Invoke("getLabelCol");
/// <summary>
/// Gets loss value for <see cref="loss"/>
/// </summary>
@ -261,8 +261,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetLoss() =>
(string)Reference.Invoke("getLoss");
/// <summary>
/// Gets maxBlockSizeInMB value for <see cref="maxBlockSizeInMB"/>
/// </summary>
@ -271,8 +271,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetMaxBlockSizeInMB() =>
(double)Reference.Invoke("getMaxBlockSizeInMB");
/// <summary>
/// Gets maxIter value for <see cref="maxIter"/>
/// </summary>
@ -281,8 +281,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public int GetMaxIter() =>
(int)Reference.Invoke("getMaxIter");
/// <summary>
/// Gets predictionCol value for <see cref="predictionCol"/>
/// </summary>
@ -291,8 +291,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetPredictionCol() =>
(string)Reference.Invoke("getPredictionCol");
/// <summary>
/// Gets regParam value for <see cref="regParam"/>
/// </summary>
@ -301,8 +301,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetRegParam() =>
(double)Reference.Invoke("getRegParam");
/// <summary>
/// Gets solver value for <see cref="solver"/>
/// </summary>
@ -311,8 +311,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public string GetSolver() =>
(string)Reference.Invoke("getSolver");
/// <summary>
/// Gets standardization value for <see cref="standardization"/>
/// </summary>
@ -321,8 +321,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public bool GetStandardization() =>
(bool)Reference.Invoke("getStandardization");
/// <summary>
/// Gets tol value for <see cref="tol"/>
/// </summary>
@ -331,8 +331,8 @@ namespace Microsoft.Spark.ML.Regression
/// </returns>
public double GetTol() =>
(double)Reference.Invoke("getTol");
/// <summary>
/// Gets weightCol value for <see cref="weightCol"/>
/// </summary>
@ -342,7 +342,7 @@ namespace Microsoft.Spark.ML.Regression
public string GetWeightCol() =>
(string)Reference.Invoke("getWeightCol");
/// <summary>
/// Loads the <see cref="LinearRegressionModel"/> that was previously saved using Save(string).
/// </summary>
@ -350,18 +350,18 @@ namespace Microsoft.Spark.ML.Regression
/// <returns>New <see cref="LinearRegressionModel"/> object, loaded from path.</returns>
public static LinearRegressionModel Load(string path) => WrapAsLinearRegressionModel(
SparkEnvironment.JvmBridge.CallStaticJavaMethod(s_className, "load", path));
/// <summary>
/// Saves the object so that it can be loaded later using Load. Note that these objects
/// can be shared with Scala by Loading or Saving in Scala.
/// </summary>
/// <param name="path">The path to save the object to</param>
public void Save(string path) => Reference.Invoke("save", path);
/// <returns>a <see cref="JavaMLWriter"/> instance for this ML instance.</returns>
public JavaMLWriter Write() =>
new JavaMLWriter((JvmObjectReference)Reference.Invoke("write"));
/// <summary>
/// Get the corresponding JavaMLReader instance.
/// </summary>
@ -372,8 +372,6 @@ namespace Microsoft.Spark.ML.Regression
private static LinearRegressionModel WrapAsLinearRegressionModel(object obj) =>
new LinearRegressionModel((JvmObjectReference)obj);
}
}

View file

@ -2,7 +2,7 @@ import sys
import warnings
warnings.warn(
"The mmlspark namespace has been deprecated. Please change import statements to import from synapse.ml"
"The mmlspark namespace has been deprecated. Please change import statements to import from synapse.ml",
)
import synapse.ml
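The shim above only raises a deprecation warning and imports synapse.ml; migrating user code is a one-line import change, for example (the LightGBM class is just one illustrative choice):

# Before (deprecated, triggers the warning above):
#   from mmlspark.lightgbm import LightGBMClassifier
# After:
from synapse.ml.lightgbm import LightGBMClassifier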

View file

@ -51,7 +51,8 @@ class DiscreteHyperParam(object):
ctx = SparkContext.getOrCreate()
self.jvm = ctx.getOrCreate()._jvm
self.hyperParam = self.jvm.com.microsoft.azure.synapse.ml.automl.HyperParamUtils.getDiscreteHyperParam(
values, seed
values,
seed,
)
def get(self):
@ -67,7 +68,9 @@ class RangeHyperParam(object):
ctx = SparkContext.getOrCreate()
self.jvm = ctx.getOrCreate()._jvm
self.rangeParam = self.jvm.com.microsoft.azure.synapse.ml.automl.HyperParamUtils.getRangeHyperParam(
min, max, seed
min,
max,
seed,
)
def get(self):
@ -89,15 +92,16 @@ class GridSpace(object):
if not isinstance(hyperparam, DiscreteHyperParam):
raise ValueError(
"GridSpace only supports DiscreteHyperParam, but hyperparam {} is of type {}".format(
k, type(hyperparam)
)
k,
type(hyperparam),
),
)
values = hyperparam.get().getValues()
hyperparamBuilder.addGrid(javaParam, self.jvm.PythonUtils.toList(values))
self.gridSpace = self.jvm.com.microsoft.azure.synapse.ml.automl.GridSpace(
hyperparamBuilder.build()
hyperparamBuilder.build(),
)
def space(self):
@ -119,7 +123,7 @@ class RandomSpace(object):
javaParam = est._java_obj.getParam(k.name)
hyperparamBuilder.addHyperparam(javaParam, hyperparam.get())
self.paramSpace = self.jvm.com.microsoft.azure.synapse.ml.automl.RandomSpace(
hyperparamBuilder.build()
hyperparamBuilder.build(),
)
def space(self):
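These wrappers forward straight to HyperParamUtils on the JVM, so the Python constructors take the same arguments that appear in the calls above. A minimal construction sketch (the exact signatures, including the optional seed, are assumptions inferred from those calls) is:

from synapse.ml.automl import DiscreteHyperParam, RangeHyperParam

# A fixed set of candidate values vs. a sampled range; both accept a seed.
reg_param_space = DiscreteHyperParam([0.001, 0.01, 0.1])
num_trees_space = RangeHyperParam(20, 100, seed=42)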

View file

@ -55,7 +55,7 @@ def from_java(java_stage, stage_name):
py_stage = py_type._from_java(java_stage)
else:
raise NotImplementedError(
"This Java stage cannot be loaded into Python currently: %r" % stage_name
"This Java stage cannot be loaded into Python currently: %r" % stage_name,
)
return py_stage
@ -87,13 +87,13 @@ class ComplexParamsMixin(MLReadable):
sc._gateway.jvm.com.microsoft.azure.synapse.ml.core.serialize.ComplexParam._java_lang_class
)
is_complex_param = complex_param_class.isAssignableFrom(
java_param.getClass()
java_param.getClass(),
)
service_param_class = (
sc._gateway.jvm.com.microsoft.azure.synapse.ml.param.ServiceParam._java_lang_class
)
is_service_param = service_param_class.isAssignableFrom(
java_param.getClass()
java_param.getClass(),
)
if self._java_obj.isSet(java_param):
if is_complex_param:
@ -120,7 +120,7 @@ class ComplexParamsMixin(MLReadable):
sc._gateway.jvm.com.microsoft.azure.synapse.ml.param.ServiceParam._java_lang_class
)
is_service_param = service_param_class.isAssignableFrom(
self._java_obj.getParam(param.name).getClass()
self._java_obj.getParam(param.name).getClass(),
)
if is_service_param:
getattr(

View file

@ -43,7 +43,7 @@ def _mml_from_java(java_stage):
py_stage = py_type._from_java(java_stage)
else:
raise NotImplementedError(
"This Java stage cannot be loaded into Python currently: %r" % stage_name
"This Java stage cannot be loaded into Python currently: %r" % stage_name,
)
return py_stage

View file

@ -32,14 +32,20 @@ def _make_dot():
if (v is not None) and (u is not None):
vv = (
np.pad(
np.array(v), (0, len(u) - len(v)), "constant", constant_values=1.0
np.array(v),
(0, len(u) - len(v)),
"constant",
constant_values=1.0,
)
if len(v) < len(u)
else np.array(v)
)
uu = (
np.pad(
np.array(u), (0, len(v) - len(u)), "constant", constant_values=1.0
np.array(u),
(0, len(v) - len(u)),
"constant",
constant_values=1.0,
)
if len(u) < len(v)
else np.array(u)
@ -113,7 +119,7 @@ class _UserResourceFeatureVectorMapping:
self.res_feature_vector_mapping_df = res_feature_vector_mapping_df
assert self.history_access_df is None or set(
self.history_access_df.schema.fieldNames()
self.history_access_df.schema.fieldNames(),
) == {tenant_col, user_col, res_col}, self.history_access_df.schema.fieldNames()
def replace_mappings(
@ -269,12 +275,16 @@ class AccessAnomalyModel(Transformer):
t.StructField("has_user2component_mappings_df", t.BooleanType(), False),
t.StructField("has_res2component_mappings_df", t.BooleanType(), False),
t.StructField(
"has_user_feature_vector_mapping_df", t.BooleanType(), False
"has_user_feature_vector_mapping_df",
t.BooleanType(),
False,
),
t.StructField(
"has_res_feature_vector_mapping_df", t.BooleanType(), False
"has_res_feature_vector_mapping_df",
t.BooleanType(),
False,
),
]
],
)
def save(self, path: str, path_suffix: str = "", output_format: str = "parquet"):
@ -309,30 +319,30 @@ class AccessAnomalyModel(Transformer):
is not None,
self.user_res_feature_vector_mapping.res_feature_vector_mapping_df
is not None,
)
),
],
AccessAnomalyModel._metadata_schema(),
)
metadata_df.write.format(output_format).save(
os.path.join(path, "metadata_df", path_suffix)
os.path.join(path, "metadata_df", path_suffix),
)
if self.user_res_feature_vector_mapping.history_access_df is not None:
self.user_res_feature_vector_mapping.history_access_df.write.format(
output_format
output_format,
).save(os.path.join(path, "history_access_df", path_suffix))
if self.user_res_feature_vector_mapping.user2component_mappings_df is not None:
self.user_res_feature_vector_mapping.user2component_mappings_df.write.format(
output_format
output_format,
).save(
os.path.join(path, "user2component_mappings_df", path_suffix)
os.path.join(path, "user2component_mappings_df", path_suffix),
)
if self.user_res_feature_vector_mapping.res2component_mappings_df is not None:
self.user_res_feature_vector_mapping.res2component_mappings_df.write.format(
output_format
output_format,
).save(os.path.join(path, "res2component_mappings_df", path_suffix))
if (
@ -340,9 +350,9 @@ class AccessAnomalyModel(Transformer):
is not None
):
self.user_res_feature_vector_mapping.user_feature_vector_mapping_df.write.format(
output_format
output_format,
).save(
os.path.join(path, "user_feature_vector_mapping_df", path_suffix)
os.path.join(path, "user_feature_vector_mapping_df", path_suffix),
)
if (
@ -350,17 +360,19 @@ class AccessAnomalyModel(Transformer):
is not None
):
self.user_res_feature_vector_mapping.res_feature_vector_mapping_df.write.format(
output_format
output_format,
).save(
os.path.join(path, "res_feature_vector_mapping_df", path_suffix)
os.path.join(path, "res_feature_vector_mapping_df", path_suffix),
)
@staticmethod
def load(
spark: SQLContext, path: str, output_format: str = "parquet"
spark: SQLContext,
path: str,
output_format: str = "parquet",
) -> "AccessAnomalyModel":
metadata_df = spark.read.format(output_format).load(
os.path.join(path, "metadata_df")
os.path.join(path, "metadata_df"),
)
assert metadata_df.count() == 1
@ -385,7 +397,7 @@ class AccessAnomalyModel(Transformer):
history_access_df = (
spark.read.format(output_format).load(
os.path.join(path, "history_access_df")
os.path.join(path, "history_access_df"),
)
if has_history_access_df
else None
@ -393,7 +405,7 @@ class AccessAnomalyModel(Transformer):
user2component_mappings_df = (
spark.read.format(output_format).load(
os.path.join(path, "user2component_mappings_df")
os.path.join(path, "user2component_mappings_df"),
)
if has_user2component_mappings_df
else None
@ -401,7 +413,7 @@ class AccessAnomalyModel(Transformer):
res2component_mappings_df = (
spark.read.format(output_format).load(
os.path.join(path, "res2component_mappings_df")
os.path.join(path, "res2component_mappings_df"),
)
if has_res2component_mappings_df
else None
@ -409,7 +421,7 @@ class AccessAnomalyModel(Transformer):
user_feature_vector_mapping_df = (
spark.read.format(output_format).load(
os.path.join(path, "user_feature_vector_mapping_df")
os.path.join(path, "user_feature_vector_mapping_df"),
)
if has_user_feature_vector_mapping_df
else None
@ -417,7 +429,7 @@ class AccessAnomalyModel(Transformer):
res_feature_vector_mapping_df = (
spark.read.format(output_format).load(
os.path.join(path, "res_feature_vector_mapping_df")
os.path.join(path, "res_feature_vector_mapping_df"),
)
if has_res_feature_vector_mapping_df
else None
@ -516,7 +528,11 @@ class AccessAnomalyModel(Transformer):
.join(res_mapping_df, [tenant_col, res_col], how="left")
.withColumn(output_col, value_calc())
.drop(
user_vec_col, res_vec_col, "user_component", "res_component", seen_token
user_vec_col,
res_vec_col,
"user_component",
"res_component",
seen_token,
)
)
@ -550,7 +566,8 @@ class ConnectedComponents:
.cache()
)
user2index = spark_utils.DataFrameUtils.zip_with_index(
users, col_name="user_component"
users,
col_name="user_component",
)
user2components = user2index
res2components = None
@ -862,7 +879,10 @@ class AccessAnomaly(Estimator):
comp_df = None
scaled_df = self._get_scaled_df(indexed_df).select(
tenant_col, indexed_user_col, indexed_res_col, scaled_likelihood_col
tenant_col,
indexed_user_col,
indexed_res_col,
scaled_likelihood_col,
)
return scaled_df.union(comp_df) if comp_df is not None else scaled_df
@ -887,7 +907,8 @@ class AccessAnomaly(Estimator):
res_mapping_df = (
spark_model.itemFactors.select(
f.col("id").alias(indexed_res_col), f.col("features").alias(res_vec_col)
f.col("id").alias(indexed_res_col),
f.col("features").alias(res_vec_col),
)
.join(df.select(indexed_res_col, tenant_col).distinct(), indexed_res_col)
.select(tenant_col, indexed_res_col, res_vec_col)
@ -896,7 +917,8 @@ class AccessAnomaly(Estimator):
return user_mapping_df, res_mapping_df
def create_spark_model_vectors_df(
self, df: DataFrame
self,
df: DataFrame,
) -> _UserResourceFeatureVectorMapping:
tenant_col = self.tenant_col
indexed_user_col = self.indexed_user_col
@ -990,7 +1012,7 @@ class AccessAnomaly(Estimator):
output_col=self.indexed_res_col,
reset_per_partition=self.separate_tenants,
),
]
],
)
the_indexer_model = the_indexer.fit(df)
@ -1000,10 +1022,11 @@ class AccessAnomaly(Estimator):
enriched_df = self._enrich_and_normalize(indexed_df).cache()
user_res_feature_vector_mapping_df = self.create_spark_model_vectors_df(
enriched_df
enriched_df,
)
user_res_norm_cf_df_model = ModelNormalizeTransformer(
enriched_df, self.rank_param
enriched_df,
self.rank_param,
).transform(user_res_feature_vector_mapping_df)
# convert user and resource indices back to names
@ -1019,10 +1042,10 @@ class AccessAnomaly(Estimator):
# do the actual index to name mapping (using undo_transform)
final_user_mapping_df = user_index_model.undo_transform(
norm_user_mapping_df
norm_user_mapping_df,
).drop(indexed_user_col)
final_res_mapping_df = res_index_model.undo_transform(norm_res_mapping_df).drop(
indexed_res_col
indexed_res_col,
)
tenant_col, user_col, res_col = self.tenant_col, self.user_col, self.res_col
@ -1035,7 +1058,9 @@ class AccessAnomaly(Estimator):
)
user2component_mappings_df, res2component_mappings_df = ConnectedComponents(
tenant_col, user_col, res_col
tenant_col,
user_col,
res_col,
).transform(access_df)
return AccessAnomalyModel(
@ -1112,7 +1137,8 @@ class ModelNormalizeTransformer:
return append_bias
def transform(
self, user_res_cf_df_model: _UserResourceFeatureVectorMapping
self,
user_res_cf_df_model: _UserResourceFeatureVectorMapping,
) -> _UserResourceFeatureVectorMapping:
likelihood_col_token = "__likelihood__"
@ -1140,28 +1166,39 @@ class ModelNormalizeTransformer:
res_col,
res_vec_col,
dot(f.col(user_vec_col), f.col(res_vec_col)).alias(
likelihood_col_token
likelihood_col_token,
),
)
)
scaler_model = scalers.StandardScalarScaler(
likelihood_col_token, tenant_col, user_vec_col
likelihood_col_token,
tenant_col,
user_vec_col,
).fit(fixed_df)
per_group_stats: DataFrame = scaler_model.per_group_stats
assert isinstance(per_group_stats, DataFrame)
append2user_bias = self._make_append_bias(
user_col, res_col, user_col, user_col, self.rank
user_col,
res_col,
user_col,
user_col,
self.rank,
)
append2res_bias = self._make_append_bias(
user_col, res_col, res_col, user_col, self.rank
user_col,
res_col,
res_col,
user_col,
self.rank,
)
fixed_user_mapping_df = (
user_res_cf_df_model.user_feature_vector_mapping_df.join(
per_group_stats, tenant_col
per_group_stats,
tenant_col,
).select(
tenant_col,
user_col,
@ -1178,7 +1215,8 @@ class ModelNormalizeTransformer:
)
fixed_res_mapping_df = user_res_cf_df_model.res_feature_vector_mapping_df.join(
per_group_stats, tenant_col
per_group_stats,
tenant_col,
).select(
tenant_col,
res_col,
@ -1186,5 +1224,6 @@ class ModelNormalizeTransformer:
)
return user_res_cf_df_model.replace_mappings(
fixed_user_mapping_df, fixed_res_mapping_df
fixed_user_mapping_df,
fixed_res_mapping_df,
)
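Taken together, the estimator and model above support the usual fit/transform/save/load cycle. A hedged end-to-end sketch (the import path and constructor keywords are assumptions based on this module's config defaults, and access_df is a placeholder DataFrame) is:

from synapse.ml.cyber.anomaly.collaborative_filtering import (
    AccessAnomaly,
    AccessAnomalyModel,
)

# access_df: one row per (tenant, user, resource) access event.
engine = AccessAnomaly(
    tenantCol="tenant",
    userCol="user",
    resCol="res",
    likelihoodCol="likelihood",
)
model = engine.fit(access_df)
scored = model.transform(access_df)   # adds the anomaly-score output column

# save()/load() persist the plain-DataFrame layout shown above
# (metadata_df plus the mapping DataFrames), not a single binary artifact.
model.save("/tmp/access_anomaly", output_format="parquet")
reloaded = AccessAnomalyModel.load(spark, "/tmp/access_anomaly", output_format="parquet")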

View file

@ -12,7 +12,9 @@ import random
class ComplementAccessTransformer(Transformer):
partitionKey = Param(
Params._dummy(), "partitionKey", "The name of the partition_key field name"
Params._dummy(),
"partitionKey",
"The name of the partition_key field name",
)
indexedColNamesArr = Param(
@ -94,10 +96,10 @@ class ComplementAccessTransformer(Transformer):
.groupBy(partition_key)
.agg(
f.min(curr_col_name).alias(
ComplementAccessTransformer._min_index_token(curr_col_name)
ComplementAccessTransformer._min_index_token(curr_col_name),
),
f.max(curr_col_name).alias(
ComplementAccessTransformer._max_index_token(curr_col_name)
ComplementAccessTransformer._max_index_token(curr_col_name),
),
)
.orderBy(partition_key)
@ -110,8 +112,8 @@ class ComplementAccessTransformer(Transformer):
[
t.StructField(curr_col_name, t.IntegerType())
for curr_col_name in indexed_col_names_arr
]
)
],
),
)
@f.udf(schema)
@ -121,9 +123,10 @@ class ComplementAccessTransformer(Transformer):
[
random.randint(min_index, max_index)
for min_index, max_index in zip(
min_index_arr, max_index_arr
min_index_arr,
max_index_arr,
)
]
],
)
for _ in range(factor)
]
@ -134,7 +137,8 @@ class ComplementAccessTransformer(Transformer):
for limits_df in limits_dfs:
pre_complement_candidates_df = pre_complement_candidates_df.join(
limits_df, partition_key
limits_df,
partition_key,
).cache()
cols = [f.col(partition_key)] + [
@ -151,23 +155,23 @@ class ComplementAccessTransformer(Transformer):
[
f.col(
ComplementAccessTransformer._min_index_token(
curr_col_name
)
curr_col_name,
),
)
for curr_col_name in indexed_col_names_arr
]
],
),
f.array(
[
f.col(
ComplementAccessTransformer._max_index_token(
curr_col_name
)
curr_col_name,
),
)
for curr_col_name in indexed_col_names_arr
]
],
),
)
),
),
)
.select(
@ -178,7 +182,7 @@ class ComplementAccessTransformer(Transformer):
"{0}.{1}".format(
ComplementAccessTransformer._tuple_token(),
curr_col_name,
)
),
).alias(curr_col_name)
for curr_col_name in indexed_col_names_arr
]
@ -192,7 +196,9 @@ class ComplementAccessTransformer(Transformer):
res_df = (
complement_candidates_df.join(
tuples_df, [partition_key] + indexed_col_names_arr, how="left_anti"
tuples_df,
[partition_key] + indexed_col_names_arr,
how="left_anti",
)
.select(*cols)
.orderBy(*cols)

View file

@ -37,7 +37,10 @@ class DataFactory:
self.rand = random.Random(42)
def to_pdf(
self, users: List[str], resources: List[str], likelihoods: List[float]
self,
users: List[str],
resources: List[str],
likelihoods: List[float],
) -> pd.DataFrame:
return pd.DataFrame(
data={
@ -46,7 +49,7 @@ class DataFactory:
AccessAnomalyConfig.default_likelihood_col: [
float(s) for s in likelihoods
],
}
},
)
def tups2pdf(self, tup_arr: List[Tuple[str, str, float]]) -> pd.DataFrame:
@ -129,11 +132,12 @@ class DataFactory:
+ self.edges_between(self.eng_users, self.join_resources, 1.0, True)
+ self.edges_between(self.hr_users, self.hr_resources, ratio, True)
+ self.edges_between(self.fin_users, self.fin_resources, ratio, True)
+ self.edges_between(self.eng_users, self.eng_resources, ratio, True)
+ self.edges_between(self.eng_users, self.eng_resources, ratio, True),
)
def create_clustered_intra_test_data(
self, train: Optional[pd.DataFrame] = None
self,
train: Optional[pd.DataFrame] = None,
) -> pd.DataFrame:
not_set = (
set(
@ -143,7 +147,7 @@ class DataFactory:
row[AccessAnomalyConfig.default_res_col],
)
for _, row in train.iterrows()
]
],
)
if train is not None
else None
@ -154,14 +158,26 @@ class DataFactory:
+ self.edges_between(self.fin_users, self.join_resources, 1.0, True)
+ self.edges_between(self.eng_users, self.join_resources, 1.0, True)
+ self.edges_between(
self.hr_users, self.hr_resources, 0.025, False, not_set
self.hr_users,
self.hr_resources,
0.025,
False,
not_set,
)
+ self.edges_between(
self.fin_users, self.fin_resources, 0.05, False, not_set
self.fin_users,
self.fin_resources,
0.05,
False,
not_set,
)
+ self.edges_between(
self.eng_users, self.eng_resources, 0.035, False, not_set
)
self.eng_users,
self.eng_resources,
0.035,
False,
not_set,
),
)
def create_clustered_inter_test_data(self) -> pd.DataFrame:
@ -174,7 +190,7 @@ class DataFactory:
+ self.edges_between(self.fin_users, self.hr_resources, 0.05, False)
+ self.edges_between(self.fin_users, self.eng_resources, 0.05, False)
+ self.edges_between(self.eng_users, self.fin_resources, 0.035, False)
+ self.edges_between(self.eng_users, self.hr_resources, 0.035, False)
+ self.edges_between(self.eng_users, self.hr_resources, 0.035, False),
)
def create_fixed_training_data(self) -> pd.DataFrame:

View file

@ -23,11 +23,18 @@ class IdIndexerModel(Transformer, HasSetInputCol, HasSetOutputCol):
)
def __init__(
self, input_col: str, partition_key: str, output_col: str, vocab_df: DataFrame
self,
input_col: str,
partition_key: str,
output_col: str,
vocab_df: DataFrame,
):
super().__init__()
ExplainBuilder.build(
self, inputCol=input_col, partitionKey=partition_key, outputCol=output_col
self,
inputCol=input_col,
partitionKey=partition_key,
outputCol=output_col,
)
self._vocab_df = vocab_df
@ -49,7 +56,7 @@ class IdIndexerModel(Transformer, HasSetInputCol, HasSetOutputCol):
.withColumn(
output_col,
f.when(f.col(output_col).isNotNull(), f.col(output_col)).otherwise(
f.lit(0)
f.lit(0),
),
)
.drop(input_col)
@ -102,7 +109,9 @@ class IdIndexer(Estimator, HasSetInputCol, HasSetOutputCol):
)
if self.getResetPerPartition()
else DataFrameUtils.zip_with_index(
df=the_df, start_index=1, col_name=self.getOutputCol()
df=the_df,
start_index=1,
col_name=self.getOutputCol(),
)
)

View file

@ -42,7 +42,10 @@ class PerPartitionScalarScalerModel(ABC, Transformer, HasSetInputCol, HasSetOutp
super().__init__()
ExplainBuilder.build(
self, inputCol=input_col, partitionKey=partition_key, outputCol=output_col
self,
inputCol=input_col,
partitionKey=partition_key,
outputCol=output_col,
)
self._per_group_stats = per_group_stats
self._use_pandas = use_pandas
@ -65,10 +68,12 @@ class PerPartitionScalarScalerModel(ABC, Transformer, HasSetInputCol, HasSetOutp
def is_partitioned(self) -> bool:
if self.partition_key is not None or isinstance(
self.per_group_stats, DataFrame
self.per_group_stats,
DataFrame,
):
assert self.partition_key is not None and isinstance(
self.per_group_stats, DataFrame
self.per_group_stats,
DataFrame,
)
res = True
elif self.partition_key is None or isinstance(self.per_group_stats, Dict):
@ -76,7 +81,7 @@ class PerPartitionScalarScalerModel(ABC, Transformer, HasSetInputCol, HasSetOutp
res = False
else:
assert False, "unsupported type for per_group_stats: {0}".format(
type(self.per_group_stats)
type(self.per_group_stats),
)
return res
@ -105,7 +110,10 @@ class PerPartitionScalarScalerModel(ABC, Transformer, HasSetInputCol, HasSetOutp
class PerPartitionScalarScalerEstimator(
ABC, Estimator, HasSetInputCol, HasSetOutputCol
ABC,
Estimator,
HasSetInputCol,
HasSetOutputCol,
):
partitionKey = Param(
Params._dummy(),
@ -123,7 +131,10 @@ class PerPartitionScalarScalerEstimator(
super().__init__()
ExplainBuilder.build(
self, inputCol=input_col, partitionKey=partition_key, outputCol=output_col
self,
inputCol=input_col,
partitionKey=partition_key,
outputCol=output_col,
)
self._use_pandas = use_pandas
@ -137,7 +148,8 @@ class PerPartitionScalarScalerEstimator(
@abstractmethod
def _create_model(
self, per_group_stats: Union[DataFrame, Dict[str, float]]
self,
per_group_stats: Union[DataFrame, Dict[str, float]],
) -> PerPartitionScalarScalerModel:
raise NotImplementedError()
@ -183,7 +195,11 @@ class StandardScalarScalerModel(PerPartitionScalarScalerModel):
):
super().__init__(
input_col, partition_key, output_col, per_group_stats, use_pandas
input_col,
partition_key,
output_col,
per_group_stats,
use_pandas,
)
self.coefficient_factor = coefficient_factor
@ -247,7 +263,8 @@ class StandardScalarScaler(PerPartitionScalarScalerEstimator):
]
def _create_model(
self, per_group_stats: Union[DataFrame, Dict[str, float]]
self,
per_group_stats: Union[DataFrame, Dict[str, float]],
) -> PerPartitionScalarScalerModel:
return StandardScalarScalerModel(
self.input_col,
@ -277,7 +294,11 @@ class LinearScalarScalerModel(PerPartitionScalarScalerModel):
):
super().__init__(
input_col, partition_key, output_col, per_group_stats, use_pandas
input_col,
partition_key,
output_col,
per_group_stats,
use_pandas,
)
self.min_required_value = min_required_value
self.max_required_value = max_required_value
@ -287,12 +308,13 @@ class LinearScalarScalerModel(PerPartitionScalarScalerModel):
def actual_delta():
return f.col(LinearScalarScalerConfig.max_actual_value_token) - f.col(
LinearScalarScalerConfig.min_actual_value_token
LinearScalarScalerConfig.min_actual_value_token,
)
def a():
return f.when(
actual_delta() != f.lit(0), f.lit(req_delta) / actual_delta()
actual_delta() != f.lit(0),
f.lit(req_delta) / actual_delta(),
).otherwise(f.lit(0.0))
def b():
@ -301,7 +323,7 @@ class LinearScalarScalerModel(PerPartitionScalarScalerModel):
self.max_required_value
- a() * f.col(LinearScalarScalerConfig.max_actual_value_token),
).otherwise(
f.lit((self.min_required_value + self.max_required_value) / 2.0)
f.lit((self.min_required_value + self.max_required_value) / 2.0),
)
def norm(x):
@ -370,15 +392,16 @@ class LinearScalarScaler(PerPartitionScalarScalerEstimator):
return [
f.min(f.col(input_col)).alias(
LinearScalarScalerConfig.min_actual_value_token
LinearScalarScalerConfig.min_actual_value_token,
),
f.max(f.col(input_col)).alias(
LinearScalarScalerConfig.max_actual_value_token
LinearScalarScalerConfig.max_actual_value_token,
),
]
def _create_model(
self, per_group_stats: Union[DataFrame, Dict[str, float]]
self,
per_group_stats: Union[DataFrame, Dict[str, float]],
) -> PerPartitionScalarScalerModel:
return LinearScalarScalerModel(
self.input_col,
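Both scaler estimators compute their statistics per partition key and then apply them in a transform step, exactly as in the AccessAnomaly normalization above. A small sketch for the standard scaler (the positional argument order follows the StandardScalarScaler call earlier in this diff; the import path and df are assumptions):

from synapse.ml.cyber.feature.scalers import StandardScalarScaler

# Standardize a score column per tenant: mean/std are computed within each tenant.
scaler_model = StandardScalarScaler("likelihood", "tenant", "likelihood_z").fit(df)
standardized = scaler_model.transform(df)
# LinearScalarScaler works the same way but rescales into a required [min, max] range.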

View file

@ -88,7 +88,8 @@ class DataFrameUtils:
window = window.orderBy(*order_by_columns)
return df.withColumn(
col_name, f.row_number().over(window) - 1 + start_index
col_name,
f.row_number().over(window) - 1 + start_index,
)
else:
if len(order_by_col) > 0:
@ -96,7 +97,7 @@ class DataFrameUtils:
df = df.orderBy(*order_by_columns)
output_schema = t.StructType(
df.schema.fields + [t.StructField(col_name, t.LongType(), True)]
df.schema.fields + [t.StructField(col_name, t.LongType(), True)],
)
return (
df.rdd.zipWithIndex()
@ -195,13 +196,15 @@ class ExplainBuilder:
else:
assert (
ExplainBuilder.get_method(
explainable, to_camel_case("get", param_name)
explainable,
to_camel_case("get", param_name),
)
is not None
), "no_getter"
assert (
ExplainBuilder.get_method(
explainable, to_camel_case("set", param_name)
explainable,
to_camel_case("set", param_name),
)
is not None
), "no_setter"

View file

@ -55,7 +55,9 @@ class ModelSchema:
def __repr__(self):
return "ModelSchema<name: {}, dataset: {}, loc: {}>".format(
self.name, self.dataset, self.uri
self.name,
self.dataset,
self.uri,
)
def toJava(self, sparkSession):
@ -107,7 +109,9 @@ class ModelDownloader:
self._ctx = sparkSession.sparkContext
self._model_downloader = (
self._ctx._jvm.com.microsoft.azure.synapse.ml.downloader.ModelDownloader(
sparkSession._jsparkSession, localPath, serverURL
sparkSession._jsparkSession,
localPath,
serverURL,
)
)
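ModelDownloader wraps the JVM downloader constructed above; typical use points it at a local or remote cache path and fetches a pretrained model by name (downloadByName and the model name are assumptions, not confirmed by this diff):

from synapse.ml.downloader import ModelDownloader

downloader = ModelDownloader(spark, "file:///tmp/models/")
model_schema = downloader.downloadByName("ResNet50")
print(model_schema)   # ModelSchema<name: ..., dataset: ..., loc: ...>, per __repr__ above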

View file

@ -82,14 +82,23 @@ setattr(pyspark.sql.streaming.DataStreamWriter, "continuousServer", _writeContSe
def _parseRequest(
self, apiName, schema, idCol="id", requestCol="request", parsingCheck="none"
self,
apiName,
schema,
idCol="id",
requestCol="request",
parsingCheck="none",
):
ctx = SparkContext.getOrCreate()
jvm = ctx._jvm
extended = jvm.com.microsoft.azure.synapse.ml.io.DataFrameExtensions(self._jdf)
dt = jvm.org.apache.spark.sql.types.DataType
jResult = extended.parseRequest(
apiName, dt.fromJson(schema.json()), idCol, requestCol, parsingCheck
apiName,
dt.fromJson(schema.json()),
idCol,
requestCol,
parsingCheck,
)
sql_ctx = pyspark.SQLContext.getOrCreate(ctx)
return DataFrame(jResult, sql_ctx)
@ -112,7 +121,7 @@ setattr(pyspark.sql.DataFrame, "makeReply", _makeReply)
def _readImage(self):
return self.format(image_source).schema(
StructType([StructField("image", ImageSchema, True)])
StructType([StructField("image", ImageSchema, True)]),
)

View file

@ -22,7 +22,7 @@ BinaryFileSchema = StructType(
[
StructField(BinaryFileFields[0], StringType(), True),
StructField(BinaryFileFields[1], BinaryType(), True),
]
],
)
"""
Schema for Binary Files.
@ -34,7 +34,12 @@ Schema records consist of BinaryFileFields name, Type, and ??
def readBinaryFiles(
self, path, recursive=False, sampleRatio=1.0, inspectZip=True, seed=0
self,
path,
recursive=False,
sampleRatio=1.0,
inspectZip=True,
seed=0,
):
"""
Reads the directory of binary files from the local or remote (WASB) source
@ -57,7 +62,12 @@ def readBinaryFiles(
sql_ctx = pyspark.SQLContext.getOrCreate(ctx)
jsession = sql_ctx.sparkSession._jsparkSession
jresult = reader.read(
path, recursive, jsession, float(sampleRatio), inspectZip, seed
path,
recursive,
jsession,
float(sampleRatio),
inspectZip,
seed,
)
return DataFrame(jresult, sql_ctx)

View file

@ -45,8 +45,8 @@ HTTPRequestDataType = StructType().fromJson(
'{"name":"value","type":"string","nullable":true,"metadata":{}}]},"nullable":true,"metadata":{}},'
'{"name":"isChunked","type":"boolean","nullable":false,"metadata":{}},'
'{"name":"isRepeatable","type":"boolean","nullable":false,"metadata":{}},'
'{"name":"isStreaming","type":"boolean","nullable":false,"metadata":{}}]},"nullable":true,"metadata":{}}]}'
)
'{"name":"isStreaming","type":"boolean","nullable":false,"metadata":{}}]},"nullable":true,"metadata":{}}]}',
),
)

View file

@ -17,7 +17,7 @@ from pyspark.sql.types import StructType
class JSONOutputParser(_JSONOutputParser):
def setDataType(self, value):
jdt = SparkContext.getOrCreate()._jvm.org.apache.spark.sql.types.DataType.fromJson(
value.json()
value.json(),
)
self._java_obj = self._java_obj.setDataType(jdt)
return self
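setDataType converts a PySpark StructType into its JVM counterpart before handing it to the parser. A short HTTP-on-Spark style sketch (column names and responses_df are placeholders; the setInputCol/setOutputCol calls are assumed from the parser's usual params):

from pyspark.sql.types import StructType, StructField, StringType
from synapse.ml.io.http import JSONOutputParser

parser = (JSONOutputParser()
          .setDataType(StructType([StructField("answer", StringType(), True)]))
          .setInputCol("response")
          .setOutputCol("parsed"))
parsed_df = parser.transform(responses_df)   # responses_df holds raw HTTP responses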

View file

@ -19,7 +19,10 @@ class ConditionalBallTree(object):
"""
if java_obj is None:
self._jconditional_balltree = SparkContext._active_spark_context._jvm.com.microsoft.azure.synapse.ml.nn.ConditionalBallTree.apply(
keys, values, labels, leafSize
keys,
values,
labels,
leafSize,
)
else:
self._jconditional_balltree = java_obj
@ -35,7 +38,9 @@ class ConditionalBallTree(object):
return [
(bm.index(), bm.distance())
for bm in self._jconditional_balltree.findMaximumInnerProducts(
queryPoint, conditioner, k
queryPoint,
conditioner,
k,
)
]
@ -45,6 +50,6 @@ class ConditionalBallTree(object):
@staticmethod
def load(filename):
java_obj = SparkContext._active_spark_context._jvm.com.microsoft.azure.synapse.ml.nn.ConditionalBallTree.load(
filename
filename,
)
return ConditionalBallTree(None, None, None, None, java_obj=java_obj)
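ConditionalBallTree performs label-conditioned nearest-neighbour search. Given the constructor and methods above, a hedged sketch (the import path and the keys/values/labels/query variables are assumptions) is:

from synapse.ml.nn import ConditionalBallTree   # import path assumed

# keys: feature vectors, values: per-point ids, labels: per-point categories.
tree = ConditionalBallTree(keys, values, labels, 50)   # leafSize = 50

# Best k inner-product matches among points whose label is in the conditioner set;
# the wrapper above returns them as (index, distance) tuples.
matches = tree.findMaximumInnerProducts(query, {"dog", "cat"}, 5)

reloaded = ConditionalBallTree.load("/tmp/cbt.tree")   # static load shown above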

View file

@ -21,7 +21,10 @@ from synapse.ml.core.schema.Utils import *
@inherit_doc
class UDFTransformer(
ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer
ComplexParamsMixin,
JavaMLReadable,
JavaMLWritable,
JavaTransformer,
):
"""
Args:
@ -36,16 +39,22 @@ class UDFTransformer(
def __init__(self, inputCol=None, inputCols=None, outputCol=None, udf=None):
super(UDFTransformer, self).__init__()
self._java_obj = self._new_java_obj(
"com.microsoft.azure.synapse.ml.stages.UDFTransformer"
"com.microsoft.azure.synapse.ml.stages.UDFTransformer",
)
self.inputCol = Param(
self, "inputCol", "inputCol: The name of the input column (default: )"
self,
"inputCol",
"inputCol: The name of the input column (default: )",
)
self.inputCols = Param(
self, "inputCols", "inputCols: The names of the input columns (default: )"
self,
"inputCols",
"inputCols: The names of the input columns (default: )",
)
self.outputCol = Param(
self, "outputCol", "outputCol: The name of the output column"
self,
"outputCol",
"outputCol: The name of the output column",
)
self.udf = Param(
self,
@ -126,7 +135,9 @@ class UDFTransformer(
name = getattr(udf, "_name", getattr(udf, "__name__", None))
name = name if name else udf.__class__.__name__
userDefinedFunction = UserDefinedFunction(
udf.func, returnType=udf.returnType, name=name
udf.func,
returnType=udf.returnType,
name=name,
)
self._java_obj = self._java_obj.setUDF(userDefinedFunction._judf)
self._udf = udf
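A minimal usage sketch of the UDFTransformer constructor shown above, assuming the Python module path mirrors the JVM package; the input data and column names are made up.

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
from synapse.ml.stages import UDFTransformer  # assumed import path

spark = SparkSession.builder.getOrCreate()
input_df = spark.createDataFrame([("hello",), ("world",)], ["text"])

# A plain pyspark UDF; UDFTransformer picks up its func, returnType, and name.
to_upper = udf(lambda s: s.upper() if s is not None else None, StringType())

transformer = UDFTransformer(inputCol="text", outputCol="text_upper", udf=to_upper)
transformer.transform(input_df).show()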


@ -206,4 +206,3 @@ object CodeGen {
}
}


@ -83,4 +83,3 @@ object DotnetCodegen {
}
}


@ -496,4 +496,3 @@ trait RWrappable extends BaseWrappable {
}
trait Wrappable extends PythonWrappable with RWrappable with DotnetWrappable


@ -204,7 +204,3 @@ trait ParamGroup extends Serializable {
def appendParams(sb: ParamsStringBuilder): ParamsStringBuilder
}


@ -86,4 +86,3 @@ class CountSelectorModel(val uid: String) extends Model[CountSelectorModel]
}
}


@ -330,5 +330,3 @@ class ImageLIME(val uid: String) extends Transformer with LIMEBase
}
}


@ -87,4 +87,3 @@ class TextLIME(val uid: String) extends Model[TextLIME]
}
}


@ -135,4 +135,3 @@ class Consolidator[T] {
}
}


@ -110,4 +110,3 @@ class UDFTransformer(val uid: String) extends Transformer with Wrappable with Co
def copy(extra: ParamMap): UDFTransformer = defaultCopy(extra)
}


@ -38,4 +38,3 @@ object ParamInjections {
}
}
}


@ -140,4 +140,3 @@ class PipelineArraySerializer extends Serializer[Array[PipelineStage]] {
Pipeline.load(path.toString).getStages
}
}


@ -76,4 +76,3 @@ trait OptimizedKNNFitting extends KNNParams with BasicLogging {
}
}


@ -30,7 +30,12 @@ def materialized_cache(df: DataFrame) -> DataFrame:
class BasicStats:
def __init__(
self, count: int, min_value: float, max_value: float, mean: float, std: float
self,
count: int,
min_value: float,
max_value: float,
mean: float,
std: float,
):
self.count = count
self.min = min_value
@ -69,7 +74,9 @@ class StatsMap:
def create_stats(
df, tenant_col: str, value_col: str = AccessAnomalyConfig.default_output_col
df,
tenant_col: str,
value_col: str = AccessAnomalyConfig.default_output_col,
) -> StatsMap:
stat_rows = (
df.groupBy(tenant_col)
@ -130,15 +137,18 @@ class Dataset:
inter_test_pdf = factory.create_clustered_inter_test_data()
curr_training = sc.createDataFrame(training_pdf).withColumn(
AccessAnomalyConfig.default_tenant_col, f.lit(tid)
AccessAnomalyConfig.default_tenant_col,
f.lit(tid),
)
curr_intra_test = sc.createDataFrame(intra_test_pdf).withColumn(
AccessAnomalyConfig.default_tenant_col, f.lit(tid)
AccessAnomalyConfig.default_tenant_col,
f.lit(tid),
)
curr_inter_test = sc.createDataFrame(inter_test_pdf).withColumn(
AccessAnomalyConfig.default_tenant_col, f.lit(tid)
AccessAnomalyConfig.default_tenant_col,
f.lit(tid),
)
self.training = (
@ -170,7 +180,7 @@ class Dataset:
AccessAnomalyConfig.default_user_col,
AccessAnomalyConfig.default_res_col,
],
)
),
).count()
== 0
), f"self.training.join is not 0"
@ -203,8 +213,9 @@ class Dataset:
return materialized_cache(
sc.createDataFrame(training_pdf).withColumn(
AccessAnomalyConfig.default_tenant_col, f.lit(0)
)
AccessAnomalyConfig.default_tenant_col,
f.lit(0),
),
)
def get_default_access_anomaly_model(self):
@ -212,7 +223,8 @@ class Dataset:
return self.default_access_anomaly_model
access_anomaly = AccessAnomaly(
tenantCol=AccessAnomalyConfig.default_tenant_col, maxIter=10
tenantCol=AccessAnomalyConfig.default_tenant_col,
maxIter=10,
)
self.default_access_anomaly_model = access_anomaly.fit(self.training)
self.default_access_anomaly_model.preserve_history = False
@ -242,7 +254,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
t.StructField(user_col, t.StringType(), False),
t.StructField(res_col, t.StringType(), False),
t.StructField(likelihood_col, t.DoubleType(), False),
]
],
)
user_model_schema = t.StructType(
@ -250,7 +262,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
t.StructField(tenant_col, t.StringType(), False),
t.StructField(user_col, t.StringType(), False),
t.StructField(user_vec_col, t.ArrayType(t.DoubleType()), False),
]
],
)
res_model_schema = t.StructType(
@ -258,17 +270,18 @@ class TestModelNormalizeTransformer(unittest.TestCase):
t.StructField(tenant_col, t.StringType(), False),
t.StructField(res_col, t.StringType(), False),
t.StructField(res_vec_col, t.ArrayType(t.DoubleType()), False),
]
],
)
df = materialized_cache(
sc.createDataFrame(
[["0", "roy", "res1", 4.0], ["0", "roy", "res2", 8.0]], df_schema
)
[["0", "roy", "res1", 4.0], ["0", "roy", "res2", 8.0]],
df_schema,
),
)
user_mapping_df = materialized_cache(
sc.createDataFrame([["0", "roy", [1.0, 1.0, 0.0, 1.0]]], user_model_schema)
sc.createDataFrame([["0", "roy", [1.0, 1.0, 0.0, 1.0]]], user_model_schema),
)
res_mapping_df = materialized_cache(
@ -278,7 +291,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
["0", "res2", [4.0, 4.0, 1.0, 0.0]],
],
res_model_schema,
)
),
)
user_res_feature_vector_mapping_df = UserResourceFeatureVectorMapping(
@ -298,7 +311,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
model_normalizer = ModelNormalizeTransformer(df, 2)
fixed_user_res_mapping_df = model_normalizer.transform(
user_res_feature_vector_mapping_df
user_res_feature_vector_mapping_df,
)
assert fixed_user_res_mapping_df.check()
@ -316,14 +329,14 @@ class TestModelNormalizeTransformer(unittest.TestCase):
assert (
fixed_user_res_mapping_df.user_feature_vector_mapping_df.filter(
f.size(f.col(user_vec_col)) == 4
f.size(f.col(user_vec_col)) == 4,
).count()
== user_mapping_df.count()
), f"{fixed_user_mapping_df.filter(f.size(f.col(user_vec_col)) == 4).count()} != {user_mapping_df.count()}"
assert (
fixed_user_res_mapping_df.res_feature_vector_mapping_df.filter(
f.size(f.col(res_vec_col)) == 4
f.size(f.col(res_vec_col)) == 4,
).count()
== res_mapping_df.count()
), f"{fixed_res_mapping_df.filter(f.size(f.col(res_vec_col)) == 4).count()} != {res_mapping_df.count()}"
@ -352,7 +365,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
t.StructField(tenant_col, t.StringType(), False),
t.StructField(user_col, t.StringType(), False),
t.StructField(user_vec_col, t.ArrayType(t.DoubleType()), False),
]
],
)
res_model_schema = t.StructType(
@ -360,7 +373,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
t.StructField(tenant_col, t.StringType(), False),
t.StructField(res_col, t.StringType(), False),
t.StructField(res_vec_col, t.ArrayType(t.DoubleType()), False),
]
],
)
user_mapping_df = materialized_cache(
@ -374,7 +387,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
for i in range(num_users)
],
user_model_schema,
)
),
)
res_mapping_df = materialized_cache(
@ -388,7 +401,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
for i in range(num_resources)
],
res_model_schema,
)
),
)
df = (
@ -411,7 +424,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
for i in range(num_users)
],
user_model_schema,
)
),
)
res_mapping_df = materialized_cache(
@ -425,7 +438,7 @@ class TestModelNormalizeTransformer(unittest.TestCase):
for i in range(num_resources)
],
res_model_schema,
)
),
)
df = (
@ -468,14 +481,14 @@ class TestModelNormalizeTransformer(unittest.TestCase):
assert (
user_res_norm_cf_mapping.user_feature_vector_mapping_df.filter(
f.size(f.col(user_vec_col)) == 4
f.size(f.col(user_vec_col)) == 4,
).count()
== user_mapping_df.count()
)
assert (
user_res_norm_cf_mapping.res_feature_vector_mapping_df.filter(
f.size(f.col(res_vec_col)) == 4
f.size(f.col(res_vec_col)) == 4,
).count()
== res_mapping_df.count()
)
@ -556,25 +569,35 @@ class TestAccessAnomaly(unittest.TestCase):
assert loaded_model.res_vec_col == res_vec_col
intra_test_scored = model.transform(data_set.intra_test).orderBy(
tenant_col, user_col, res_col
tenant_col,
user_col,
res_col,
)
intra_test_scored_tag = loaded_model.transform(data_set.intra_test).orderBy(
tenant_col, user_col, res_col
tenant_col,
user_col,
res_col,
)
assert_frame_equal(
intra_test_scored.toPandas(), intra_test_scored_tag.toPandas()
intra_test_scored.toPandas(),
intra_test_scored_tag.toPandas(),
)
inter_test_scored = model.transform(data_set.inter_test).orderBy(
tenant_col, user_col, res_col
tenant_col,
user_col,
res_col,
)
inter_test_scored_tag = loaded_model.transform(data_set.inter_test).orderBy(
tenant_col, user_col, res_col
tenant_col,
user_col,
res_col,
)
assert_frame_equal(
inter_test_scored.toPandas(), inter_test_scored_tag.toPandas()
inter_test_scored.toPandas(),
inter_test_scored_tag.toPandas(),
)
def test_enrich_and_normalize(self):
@ -610,7 +633,7 @@ class TestAccessAnomaly(unittest.TestCase):
output_col=indexed_res_col,
reset_per_partition=False,
),
]
],
)
the_indexer_model = the_indexer.fit(training)
@ -626,10 +649,10 @@ class TestAccessAnomaly(unittest.TestCase):
assert unindexed_df.filter(f.col(res_col).isNull()).count() == 0
enriched_indexed_df = materialized_cache(
access_anomaly._enrich_and_normalize(indexed_df)
access_anomaly._enrich_and_normalize(indexed_df),
)
enriched_df = materialized_cache(
without_ffa(the_indexer_model.undo_transform(enriched_indexed_df))
without_ffa(the_indexer_model.undo_transform(enriched_indexed_df)),
)
assert enriched_df.filter(f.col(user_col).isNull()).count() == 0
@ -638,7 +661,7 @@ class TestAccessAnomaly(unittest.TestCase):
assert (
enriched_df.filter(
(get_department(user_col) == get_department(res_col))
& (f.col(scaled_likelihood_col) == 1.0)
& (f.col(scaled_likelihood_col) == 1.0),
).count()
== 0
)
@ -646,21 +669,21 @@ class TestAccessAnomaly(unittest.TestCase):
assert (
enriched_df.filter(
(get_department(user_col) != get_department(res_col))
& (f.col(scaled_likelihood_col) != 1.0)
& (f.col(scaled_likelihood_col) != 1.0),
).count()
== 0
)
assert (
enriched_df.filter(
(get_department(user_col) != get_department(res_col))
(get_department(user_col) != get_department(res_col)),
).count()
== enriched_df.filter(f.col(scaled_likelihood_col) == 1.0).count()
)
assert (
enriched_df.filter(
(get_department(user_col) == get_department(res_col))
(get_department(user_col) == get_department(res_col)),
).count()
== enriched_df.filter(f.col(scaled_likelihood_col) != 1.0).count()
)
@ -675,7 +698,7 @@ class TestAccessAnomaly(unittest.TestCase):
(f.col(scaled_likelihood_col) >= low_value)
& (f.col(scaled_likelihood_col) <= high_value)
)
| (f.col(scaled_likelihood_col) == 1.0)
| (f.col(scaled_likelihood_col) == 1.0),
).count()
== enriched_df.count()
)
@ -708,7 +731,7 @@ class TestAccessAnomaly(unittest.TestCase):
assert data_set.training.count() == res_df.count()
assert (
res_df.filter(
f.col(AccessAnomalyConfig.default_output_col).isNull()
f.col(AccessAnomalyConfig.default_output_col).isNull(),
).count()
== 0
)
@ -731,7 +754,7 @@ class TestAccessAnomaly(unittest.TestCase):
f.col(tenant_col).alias("df1_tenant"),
f.col(user_col).alias("df1_user"),
f.col(res_col).alias("df1_res"),
)
),
)
df2 = materialized_cache(
@ -739,7 +762,7 @@ class TestAccessAnomaly(unittest.TestCase):
f.col(tenant_col).alias("df2_tenant"),
f.col(user_col).alias("df2_user"),
f.col(res_col).alias("df2_res"),
)
),
)
df_joined_same_department = materialized_cache(
@ -751,7 +774,7 @@ class TestAccessAnomaly(unittest.TestCase):
& (df1.df1_res == df2.df2_res),
)
.groupBy("df1_tenant", df1.df1_user, df2.df2_user)
.agg(f.count("*").alias("count"))
.agg(f.count("*").alias("count")),
)
stats_same_map = create_stats(df_joined_same_department, "df1_tenant", "count")
@ -771,7 +794,7 @@ class TestAccessAnomaly(unittest.TestCase):
& (df1.df1_res == df2.df2_res),
)
.groupBy("df1_tenant", df1.df1_user, df2.df2_user)
.agg(f.count("*").alias("count"))
.agg(f.count("*").alias("count")),
)
assert df_joined_diff_department.count() == 0
@ -793,7 +816,8 @@ class TestAccessAnomaly(unittest.TestCase):
def report_cross_access(self, model: AccessAnomalyModel):
training_scores = materialized_cache(model.transform(data_set.training))
training_stats: StatsMap = create_stats(
training_scores, AccessAnomalyConfig.default_tenant_col
training_scores,
AccessAnomalyConfig.default_tenant_col,
)
print("training_stats")
@ -812,7 +836,7 @@ class TestAccessAnomaly(unittest.TestCase):
assert (
model.user_res_feature_vector_mapping.user2component_mappings_df.select(
"component"
"component",
)
.distinct()
.count()
@ -821,7 +845,7 @@ class TestAccessAnomaly(unittest.TestCase):
assert (
model.user_res_feature_vector_mapping.res2component_mappings_df.select(
"component"
"component",
)
.distinct()
.count()
@ -830,12 +854,14 @@ class TestAccessAnomaly(unittest.TestCase):
intra_test_scores = model.transform(data_set.intra_test)
intra_test_stats = create_stats(
intra_test_scores, AccessAnomalyConfig.default_tenant_col
intra_test_scores,
AccessAnomalyConfig.default_tenant_col,
)
inter_test_scores = model.transform(data_set.inter_test)
inter_test_stats = create_stats(
inter_test_scores, AccessAnomalyConfig.default_tenant_col
inter_test_scores,
AccessAnomalyConfig.default_tenant_col,
)
print("test_stats")
@ -868,7 +894,7 @@ class TestConnectedComponents:
t.StructField(user_col, t.StringType(), False),
t.StructField(res_col, t.StringType(), False),
t.StructField(likelihood_col, t.DoubleType(), False),
]
],
)
df = sc.createDataFrame(
@ -919,7 +945,7 @@ class TestConnectedComponents:
res_col = "res"
df = sc.createDataFrame(
DataFactory(single_component=False).create_clustered_training_data()
DataFactory(single_component=False).create_clustered_training_data(),
).withColumn(tenant_col, f.lit(0))
cc = ConnectedComponents(tenant_col, user_col, res_col)


@ -16,7 +16,7 @@ class TestComplementAccessTransformer(unittest.TestCase):
t.StructField("tenant", t.StringType(), nullable=True),
t.StructField("user", t.IntegerType(), nullable=True),
t.StructField("res", t.IntegerType(), nullable=True),
]
],
)
return sc.createDataFrame(


@ -8,7 +8,10 @@ from synapsemltest.spark import *
class ExplainTester:
def check_explain(
self, explainable: Any, params: List[str], type_count_checker: Callable
self,
explainable: Any,
params: List[str],
type_count_checker: Callable,
):
explained = explainable.explainParams()
@ -22,7 +25,7 @@ class ExplainTester:
else:
parts = name.split("_")
return prefix + "".join(
[parts[i][0:1].upper() + parts[i][1:] for i in range(len(parts))]
[parts[i][0:1].upper() + parts[i][1:] for i in range(len(parts))],
)
values = []
@ -48,7 +51,7 @@ class ExplainTester:
for vv in arr
if (tt is not None and isinstance(vv, tt))
or (tt is None and vv is None)
]
],
)
assert type_count_checker(count_instance(values, str), str)


@ -18,7 +18,7 @@ class TestIndexers(unittest.TestCase):
t.StructField("res", t.StringType(), nullable=True),
t.StructField("expected_uid", t.IntegerType(), nullable=True),
t.StructField("expected_rid", t.IntegerType(), nullable=True),
]
],
)
return sc.createDataFrame(
@ -49,7 +49,7 @@ class TestIndexers(unittest.TestCase):
[
indexers.IdIndexer("user", "tenant", "actual_uid", True),
indexers.IdIndexer("res", "tenant", "actual_rid", True),
]
],
)
df = self.create_sample_dataframe()
@ -69,7 +69,7 @@ class TestIndexers(unittest.TestCase):
[
indexers.IdIndexer("user", "tenant", "actual_uid", True),
indexers.IdIndexer("res", "tenant", "actual_rid", True),
]
],
)
df = self.create_sample_dataframe()
@ -80,7 +80,7 @@ class TestIndexers(unittest.TestCase):
assert new_df.filter(f.col("actual_rid") <= 0).count() == 0
orig_df = model.undo_transform(
new_df.select("tenant", "actual_uid", "actual_rid")
new_df.select("tenant", "actual_uid", "actual_rid"),
)
assert (
@ -107,7 +107,7 @@ class TestIndexers(unittest.TestCase):
[
indexers.IdIndexer("user", "tenant", "actual_uid", False),
indexers.IdIndexer("res", "tenant", "actual_rid", False),
]
],
)
df = self.create_sample_dataframe()
@ -157,7 +157,9 @@ class TestIdIndexerExplain(ExplainTester):
params = ["inputCol", "partitionKey", "outputCol", "resetPerPartition"]
self.check_explain(
indexers.IdIndexer("input", "tenant", "output", True), params, counts
indexers.IdIndexer("input", "tenant", "output", True),
params,
counts,
)


@ -16,7 +16,7 @@ class TestScalers(unittest.TestCase):
t.StructField("tenant", t.StringType(), nullable=True),
t.StructField("name", t.StringType(), nullable=True),
t.StructField("score", t.FloatType(), nullable=True),
]
],
)
return sc.createDataFrame(
@ -43,7 +43,7 @@ class TestScalers(unittest.TestCase):
0
== new_df.filter(
f.col("name").cast(t.IntegerType())
!= f.col("new_score").cast(t.IntegerType())
!= f.col("new_score").cast(t.IntegerType()),
).count()
)
@ -127,7 +127,9 @@ class TestStandardScalarScalerExplain(ExplainTester):
params = ["inputCol", "partitionKey", "outputCol", "coefficientFactor"]
self.check_explain(
StandardScalarScaler("input", "tenant", "output"), params, counts
StandardScalarScaler("input", "tenant", "output"),
params,
counts,
)
@ -146,7 +148,9 @@ class TestLinearScalarScalerExplain(ExplainTester):
"maxRequiredValue",
]
self.check_explain(
LinearScalarScaler("input", "tenant", "output"), params, counts
LinearScalarScaler("input", "tenant", "output"),
params,
counts,
)


@ -26,12 +26,17 @@ class TestDataFrameUtils(unittest.TestCase):
return dataframe
def create_string_type_dataframe(
self, field_names: List[str], data: List[Tuple[str]]
self,
field_names: List[str],
data: List[Tuple[str]],
) -> DataFrame:
return sc.createDataFrame(
data,
StructType(
[StructField(name, StringType(), nullable=True) for name in field_names]
[
StructField(name, StringType(), nullable=True)
for name in field_names
],
),
)
@ -47,7 +52,9 @@ class TestDataFrameUtils(unittest.TestCase):
def test_zip_with_index_sort_by_column_within_partitions(self):
dataframe = self.create_sample_dataframe()
result = DataFrameUtils.zip_with_index(
df=dataframe, partition_col="tenant", order_by_col="user"
df=dataframe,
partition_col="tenant",
order_by_col="user",
)
expected = [
("OrgA", "Alice", 0),
@ -72,7 +79,9 @@ class TestDataFrameUtils(unittest.TestCase):
class TestExplainBuilder(unittest.TestCase):
class ExplainableObj(Transformer, HasSetInputCol, HasSetOutputCol):
partitionKey = Param(
Params._dummy(), "partitionKey", "The name of the column to partition by."
Params._dummy(),
"partitionKey",
"The name of the column to partition by.",
)
secondPartitionKey = Param(


@ -84,7 +84,7 @@ class RankingSpec(unittest.TestCase):
print(
metric
+ ": "
+ str(RankingEvaluator(k=3, metricName=metric).evaluate(output))
+ str(RankingEvaluator(k=3, metricName=metric).evaluate(output)),
)
# def test_adapter_evaluator_als(self):


@ -1 +1 @@
Content like data models tests and end points are organized into projects in the custom speech portal. Each project is specific to a domain and country slash language. For example, you may create a project for call centers that use English in the United States to create your first project select the speech to text slash custom speech, then click new project follow the instructions provided by The Wizard to create your project after you've created a project you should see 4 tabs data testing training. And deployment use the links provided in Next steps to learn how to use each tab.
Content like data models tests and end points are organized into projects in the custom speech portal. Each project is specific to a domain and country slash language. For example, you may create a project for call centers that use English in the United States to create your first project select the speech to text slash custom speech, then click new project follow the instructions provided by The Wizard to create your project after you've created a project you should see 4 tabs data testing training. And deployment use the links provided in Next steps to learn how to use each tab.


@ -110,4 +110,4 @@ RandomForestClassification_TelescopeData.csv_AUPR,0.8553584233805505,0.01,true
DecisionTreeClassification_TelescopeData.csv_AUROC,0.6679763401598343,0.01,true
DecisionTreeClassification_TelescopeData.csv_AUPR,0.6646757705708718,0.01,true
LogisticRegression_TelescopeData.csv_AUROC,0.5,0.01,true
LogisticRegression_TelescopeData.csv_AUPR,0.35347472322262236,0.01,true
LogisticRegression_TelescopeData.csv_AUPR,0.35347472322262236,0.01,true


@ -10,4 +10,4 @@ binary_transfusion.csv,0.7619987556051406,0.01,true
binary_breast-cancer-wisconsin.csv,0.9106799862802264,0.01,true
binary_fertility_Diagnosis.train.csv,0.8787878787878788,0.01,true
binary_bank.train.csv,0.8847234595193773,0.01,true
binary_TelescopeData.csv,0.6483818077309704,0.01,true
binary_TelescopeData.csv,0.6483818077309704,0.01,true


@ -24,4 +24,3 @@ class RankingAdapterModelSpec extends RankingTestBase with TransformerFuzzing[Ra
override def reader: MLReadable[_] = RankingAdapterModel
}


@ -47,4 +47,3 @@ class RecommendationIndexerModelSpec extends RankingTestBase with TransformerFuz
override def reader: MLReadable[_] = RankingAdapterModel
}


@ -81,7 +81,8 @@ class TensorInfo(ValueInfo):
def __str__(self):
return "TensorInfo(shape={}, type={})".format(
"[" + ",".join(map(str, self.shape)) + "]", self.type
"[" + ",".join(map(str, self.shape)) + "]",
self.type,
)
@classmethod
@ -128,7 +129,11 @@ class MapInfo(ValueInfo):
class SequenceInfo(ValueInfo):
def __init__(
self, length: int, sequence_of_maps: bool, map_info: MapInfo, sequence_type: str
self,
length: int,
sequence_of_maps: bool,
map_info: MapInfo,
sequence_type: str,
):
self.length = length
self.sequence_of_maps = sequence_of_maps
@ -153,9 +158,10 @@ class SequenceInfo(ValueInfo):
length = java_gateway.get_field(java_sequence_info, "length")
sequence_of_maps = java_gateway.get_field(java_sequence_info, "sequenceOfMaps")
map_info = MapInfo.from_java(
java_gateway.get_field(java_sequence_info, "mapInfo")
java_gateway.get_field(java_sequence_info, "mapInfo"),
)
sequence_type = java_gateway.get_field(
java_sequence_info, "sequenceType"
java_sequence_info,
"sequenceType",
).toString()
return cls(length, sequence_of_maps, map_info, sequence_type)


@ -3,36 +3,5 @@
# If any command fails, exit immediately with that command's exit status
set -eo pipefail
get_installed_version() {
echo $(pip freeze | grep "$1==" | sed -e "s/$1==//g")
}
get_version_from_lint_requirements() {
echo $(cat requirements.txt | grep "$1==" | sed -e "s/$1==//g")
}
validate_version() {
package_name=$1
installed_version=$(get_installed_version $package_name)
expected_version=$(get_version_from_lint_requirements $package_name)
if [ "$installed_version" != "$expected_version" ]; then
echo "Found version mismatch for $package_name:"
echo "Installed: $installed_version"
echo " Expected: $expected_version"
echo
echo "Please run:"
echo "pip install $package_name==$expected_version"
return 1
fi
}
diff=$(git diff --cached --diff-filter ACMRTUXB --name-only | grep '\.py$' || true)
if [ ! -z "$diff" ]; then
validate_version black
echo "Running black..."
black --check $diff
fi
echo "Running scalastyle.."
sbt scalastyle test:scalastyle


@ -38,12 +38,14 @@ class LightGBMModelMixin:
sparse_values = [float(v) for v in vector.values]
return list(
self._java_obj.getSparseFeatureShaps(
vector.size, sparse_indices, sparse_values
)
vector.size,
sparse_indices,
sparse_values,
),
)
else:
raise TypeError(
"Vector argument to getFeatureShaps must be a pyspark.linalg sparse or dense vector type"
"Vector argument to getFeatureShaps must be a pyspark.linalg sparse or dense vector type",
)
def getBoosterBestIteration(self):
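For context, a rough sketch of calling the feature-SHAP helper whose sparse branch is reflowed above; `lgbm_model` stands in for an already fitted SynapseML LightGBM model, and the feature values are made up.

from pyspark.ml.linalg import Vectors

# Dense vectors take the dense path; sparse vectors hit getSparseFeatureShaps above.
dense_shaps = lgbm_model.getFeatureShaps(Vectors.dense([0.5, 1.2, -0.3]))
sparse_shaps = lgbm_model.getFeatureShaps(Vectors.sparse(3, [0, 2], [0.5, -0.3]))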


@ -28,4 +28,3 @@ case class RankerTrainParams(passThroughArgs: Option[String],
.appendParamValueIfNotThere("max_position", Option(maxPosition))
}
}


@ -30,4 +30,4 @@ LightGBMClassifier_random.forest.train.csv_goss,0.9945531305363942,0.1,true
LightGBMClassifier_transfusion.csv_gbdt,0.7824315000985611,0.1,true
LightGBMClassifier_transfusion.csv_rf,0.7596441947565543,0.1,true
LightGBMClassifier_transfusion.csv_dart,0.7791494184900454,0.1,true
LightGBMClassifier_transfusion.csv_goss,0.7858170707668046,0.1,true
LightGBMClassifier_transfusion.csv_goss,0.7858170707668046,0.1,true


@ -18,4 +18,4 @@ LightGBMRegressor_machine.train.csv_goss,101.33455012081083,100.0,false
LightGBMRegressor_Concrete_Data.train.csv_gbdt,10.74708157454394,1.0,false
LightGBMRegressor_Concrete_Data.train.csv_rf,11.073837461910365,1.0,false
LightGBMRegressor_Concrete_Data.train.csv_dart,11.293857554640548,1.0,false
LightGBMRegressor_Concrete_Data.train.csv_goss,10.724926727333045,1.0,false
LightGBMRegressor_Concrete_Data.train.csv_goss,10.724926727333045,1.0,false


@ -170,4 +170,4 @@
},
"nbformat": 4,
"nbformat_minor": 1
}
}


@ -126,4 +126,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}


@ -310,4 +310,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
}


@ -355,4 +355,4 @@
},
"nbformat": 4,
"nbformat_minor": 1
}
}


@ -123,4 +123,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}


@ -259,4 +259,4 @@
},
"nbformat": 4,
"nbformat_minor": 1
}
}


@ -261,4 +261,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}


@ -1099,4 +1099,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}


@ -689,4 +689,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}


@ -244,4 +244,4 @@
"metadata": {},
"nbformat": 4,
"nbformat_minor": 0
}
}
