This commit is contained in:
Terry Kim 2019-06-05 07:44:14 -07:00 committed by GitHub
Parent eb26baa462
Commit a0f2620880
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
11 changed files with 74 additions and 21 deletions

View File

@@ -36,7 +36,7 @@
 <tbody align="center">
 <tr>
 <td >2.3.*</td>
-<td rowspan=3><a href="https://github.com/dotnet/spark/releases/tag/v0.2.0">v0.2.0</a></td>
+<td rowspan=4><a href="https://github.com/dotnet/spark/releases/tag/v0.3.0">v0.3.0</a></td>
 </tr>
 <tr>
 <td>2.4.0</td>
@@ -45,12 +45,11 @@
 <td>2.4.1</td>
 </tr>
 <tr>
-<td>2.4.2</td>
-<td><a href="https://github.com/dotnet/spark/issues/60">Not supported</a></td>
+<td>2.4.3</td>
 </tr>
 <tr>
-<td>2.4.3</td>
-<td>master branch</td>
+<td>2.4.2</td>
+<td><a href="https://github.com/dotnet/spark/issues/60">Not supported</a></td>
 </tr>
 </tbody>
 </table>

View File

@@ -3,7 +3,7 @@
 <modelVersion>4.0.0</modelVersion>
 <groupId>com.microsoft.spark</groupId>
 <artifactId>microsoft-spark-benchmark</artifactId>
-<version>0.2.0</version>
+<version>0.3.0</version>
 <inceptionYear>2019</inceptionYear>
 <properties>
 <encoding>UTF-8</encoding>

View File

@@ -7,7 +7,7 @@ These instructions will show you how to run a .NET for Apache Spark app using .N
 - Download and install the following: **[.NET Core 2.1 SDK](https://dotnet.microsoft.com/download/dotnet-core/2.1)** | **[OpenJDK 8](https://openjdk.java.net/install/)** | **[Apache Spark 2.4.1](https://archive.apache.org/dist/spark/spark-2.4.1/spark-2.4.1-bin-hadoop2.7.tgz)**
 - Download and install **[Microsoft.Spark.Worker](https://github.com/dotnet/spark/releases)** release:
 - Select a **[Microsoft.Spark.Worker](https://github.com/dotnet/spark/releases)** release from .NET for Apache Spark GitHub Releases page and download into your local machine (e.g., `~/bin/Microsoft.Spark.Worker`).
-- **IMPORTANT** Create a [new environment variable](https://help.ubuntu.com/community/EnvironmentVariables) `DotnetWorkerPath` and set it to the directory where you downloaded and extracted the Microsoft.Spark.Worker (e.g., `~/bin/Microsoft.Spark.Worker`).
+- **IMPORTANT** Create a [new environment variable](https://help.ubuntu.com/community/EnvironmentVariables) `DOTNET_WORKER_DIR` and set it to the directory where you downloaded and extracted the Microsoft.Spark.Worker (e.g., `~/bin/Microsoft.Spark.Worker`).
 For detailed instructions, you can see [Building .NET for Apache Spark from Source on Ubuntu](../building/ubuntu-instructions.md).
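A minimal sketch of how an application or test might confirm that the renamed variable is visible to the process. Only the `DOTNET_WORKER_DIR` name comes from this change; the small console program around it is illustrative:

```csharp
using System;
using System.IO;

class WorkerDirCheck
{
    static void Main()
    {
        // DOTNET_WORKER_DIR replaces the old DotnetWorkerPath variable.
        string workerDir = Environment.GetEnvironmentVariable("DOTNET_WORKER_DIR");

        if (string.IsNullOrEmpty(workerDir) || !Directory.Exists(workerDir))
        {
            Console.Error.WriteLine(
                "DOTNET_WORKER_DIR is not set or does not point to the extracted Microsoft.Spark.Worker directory.");
            return;
        }

        Console.WriteLine($"Worker will be resolved from: {workerDir}");
    }
}
```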

View File

@@ -7,7 +7,7 @@ These instructions will show you how to run a .NET for Apache Spark app using .N
 - Download and install the following: **[.NET Core 2.1 SDK](https://dotnet.microsoft.com/download/dotnet-core/2.1)** | **[Visual Studio 2019](https://www.visualstudio.com/downloads/)** | **[Java 1.8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)** | **[Apache Spark 2.4.1](https://archive.apache.org/dist/spark/spark-2.4.1/spark-2.4.1-bin-hadoop2.7.tgz)**
 - Download and install **[Microsoft.Spark.Worker](https://github.com/dotnet/spark/releases)** release:
 - Select a **[Microsoft.Spark.Worker](https://github.com/dotnet/spark/releases)** release from .NET for Apache Spark GitHub Releases page and download into your local machine (e.g., `c:\bin\Microsoft.Spark.Worker\`).
-- **IMPORTANT** Create a [new environment variable](https://www.java.com/en/download/help/path.xml) `DotnetWorkerPath` and set it to the directory where you downloaded and extracted the Microsoft.Spark.Worker (e.g., `c:\bin\Microsoft.Spark.Worker`).
+- **IMPORTANT** Create a [new environment variable](https://www.java.com/en/download/help/path.xml) `DOTNET_WORKER_DIR` and set it to the directory where you downloaded and extracted the Microsoft.Spark.Worker (e.g., `c:\bin\Microsoft.Spark.Worker`).
 For detailed instructions, you can see [Building .NET for Apache Spark from Source on Windows](../building/windows-instructions.md).

View File

@@ -0,0 +1,46 @@
# .NET for Apache Spark 0.3 Release Notes
### Release Notes
Below are some of the highlights from this release.
* [Apache Spark 2.4.3](https://spark.apache.org/news/spark-2-4-3-released.html) support ([#118](https://github.com/dotnet/spark/pull/118))
* dotnet/spark is now using [dotnet/arcade](https://github.com/dotnet/arcade) as the build infrastructure ([#113](https://github.com/dotnet/spark/pull/113))
* [Source Link](https://github.com/dotnet/sourcelink) is now supported for the NuGet package ([#40](https://github.com/dotnet/spark/issues/40)).
* Fixed the issue where Microsoft.Spark.dll is not signed ([#119](https://github.com/dotnet/spark/issues/119)).
* Pickling performance is improved ([#111](https://github.com/dotnet/spark/pull/111)).
* Performance improvement PRs in the Pickling Library: [irmen/Pyrolite#64](https://github.com/irmen/Pyrolite/pull/64), [irmen/Pyrolite#67](https://github.com/irmen/Pyrolite/pull/67)
* ArrayType and MapType are supported as UDF return types ([#112](https://github.com/dotnet/spark/issues/112#issuecomment-493297068), [#114](https://github.com/dotnet/spark/pull/114))
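As an illustration of the ArrayType return support above, a minimal sketch assuming the `Microsoft.Spark.Sql.Functions.Udf` API; the column name, alias, and sample query are made up for the example:

```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

class ArrayUdfExample
{
    static void Main()
    {
        SparkSession spark = SparkSession.Builder().GetOrCreate();
        DataFrame df = spark.Sql("SELECT 'hello world' AS sentence");

        // A UDF whose .NET return type is string[]; the result surfaces as an ArrayType column.
        Func<Column, Column> splitWords = Udf<string, string[]>(s => s.Split(' '));

        // "words" is an illustrative alias for the resulting array column.
        df.Select(splitWords(df["sentence"]).Alias("words")).Show();

        spark.Stop();
    }
}
```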
### Supported Spark Versions
The following table outlines the supported Spark versions along with the corresponding microsoft-spark JAR to use:
<table>
<thead>
<tr>
<th>Spark Version</th>
<th>microsoft-spark JAR</th>
</tr>
</thead>
<tbody align="center">
<tr>
<td>2.3.*</td>
<td>microsoft-spark-2.3.x-0.3.0.jar</td>
</tr>
<tr>
<td>2.4.0</td>
<td rowspan=3>microsoft-spark-2.4.x-0.3.0.jar</td>
</tr>
<tr>
<td>2.4.1</td>
</tr>
<tr>
<td>2.4.3</td>
</tr>
<tr>
<td>2.4.2</td>
<td><a href="https://github.com/dotnet/spark/issues/60">Not supported</a></td>
</tr>
</tbody>
</table>

View File

@@ -1,7 +1,7 @@
 <?xml version="1.0" encoding="utf-8"?>
 <Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
 <PropertyGroup>
-<VersionPrefix>0.2.0</VersionPrefix>
+<VersionPrefix>0.3.0</VersionPrefix>
 <PreReleaseVersionLabel>prerelease</PreReleaseVersionLabel>
 <RestoreSources>
 $(RestoreSources);

View File

@@ -36,7 +36,7 @@ namespace Microsoft.Spark.E2ETest
 AppDomain.CurrentDomain.BaseDirectory);
 #elif NETCOREAPP2_1
 // For .NET Core, the user must have published the worker as a standalone
-// executable and set DotnetWorkerPath to the published directory.
+// executable and set the worker path to the published directory.
 if (string.IsNullOrEmpty(Environment.GetEnvironmentVariable(workerDirEnvVarName)))
 {
 throw new Exception(

View File

@@ -65,6 +65,22 @@ namespace Microsoft.Spark.Sql
 public SparkSession NewSession() =>
 new SparkSession((JvmObjectReference)_jvmObject.Invoke("newSession"));
+/// <summary>
+/// Returns the specified table/view as a DataFrame.
+/// </summary>
+/// <param name="tableName">Name of a table or view</param>
+/// <returns>DataFrame object</returns>
+public DataFrame Table(string tableName)
+=> new DataFrame((JvmObjectReference)_jvmObject.Invoke("table", tableName));
+/// <summary>
+/// Executes a SQL query using Spark, returning the result as a DataFrame.
+/// </summary>
+/// <param name="sqlText">SQL query text</param>
+/// <returns>DataFrame object</returns>
+public DataFrame Sql(string sqlText)
+=> new DataFrame((JvmObjectReference)_jvmObject.Invoke("sql", sqlText));
 /// <summary>
 /// Returns a DataFrameReader that can be used to read non-streaming data in
 /// as a DataFrame.
@@ -80,14 +96,6 @@ namespace Microsoft.Spark.Sql
 public DataStreamReader ReadStream() =>
 new DataStreamReader((JvmObjectReference)_jvmObject.Invoke("readStream"));
-/// <summary>
-/// Executes a SQL query using Spark, returning the result as a DataFrame.
-/// </summary>
-/// <param name="sqlText">SQL query text</param>
-/// <returns>DataFrame object</returns>
-public DataFrame Sql(string sqlText)
-=> new DataFrame((JvmObjectReference)_jvmObject.Invoke("sql", sqlText));
 /// <summary>
 /// Returns UDFRegistraion object with which user-defined functions (UDF) can
 /// be registered.
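A short usage sketch of the `Table` and `Sql` methods shown in the hunk above; the session setup and the `people` view name are illustrative:

```csharp
using Microsoft.Spark.Sql;

class TableAndSqlExample
{
    static void Main()
    {
        SparkSession spark = SparkSession.Builder().AppName("sketch").GetOrCreate();

        // Register an illustrative temp view so Table() has something to resolve.
        spark.Sql("SELECT 1 AS id, 'a' AS value").CreateOrReplaceTempView("people");

        // Sql(): run a query and get the result back as a DataFrame.
        DataFrame filtered = spark.Sql("SELECT id, value FROM people WHERE id = 1");

        // Table(): look up the same view by name as a DataFrame.
        DataFrame people = spark.Table("people");

        filtered.Show();
        people.Show();

        spark.Stop();
    }
}
```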

View File

@@ -4,7 +4,7 @@
 <parent>
 <groupId>com.microsoft.scala</groupId>
 <artifactId>microsoft-spark</artifactId>
-<version>0.2.0</version>
+<version>0.3.0</version>
 </parent>
 <artifactId>microsoft-spark-2.3.x</artifactId>
 <inceptionYear>2019</inceptionYear>

View File

@@ -4,7 +4,7 @@
 <parent>
 <groupId>com.microsoft.scala</groupId>
 <artifactId>microsoft-spark</artifactId>
-<version>0.2.0</version>
+<version>0.3.0</version>
 </parent>
 <artifactId>microsoft-spark-2.4.x</artifactId>
 <inceptionYear>2019</inceptionYear>

View File

@@ -4,7 +4,7 @@
 <groupId>com.microsoft.scala</groupId>
 <artifactId>microsoft-spark</artifactId>
 <packaging>pom</packaging>
-<version>0.2.0</version>
+<version>0.3.0</version>
 <properties>
 <encoding>UTF-8</encoding>
 </properties>