зеркало из https://github.com/dotnet/spark.git
Adding MacOs getting started (#239)
This commit is contained in:
Родитель
192997cb55
Коммит
8dc789fa6c
|
@ -68,6 +68,7 @@
|
|||
These instructions will show you how to run a .NET for Apache Spark app using .NET Core.
|
||||
- [Windows Instructions](docs/getting-started/windows-instructions.md)
|
||||
- [Ubuntu Instructions](docs/getting-started/ubuntu-instructions.md)
|
||||
- [MacOs Instructions](docs/getting-started/macos-instructions.md)
|
||||
|
||||
## Build Status
|
||||
|
||||
|
|
|
@ -0,0 +1,85 @@
|
|||
# Getting Started with Spark .NET on MacOS
|
||||
|
||||
These instructions will show you how to run a .NET for Apache Spark app using .NET Core on MacOSX.
|
||||
|
||||
## Pre-requisites
|
||||
|
||||
- Download and install **[.NET Core 2.1 SDK](https://dotnet.microsoft.com/download/dotnet-core/2.1)**
|
||||
- Install **[Java 8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html)**
|
||||
- Select the appropriate version for your operating system e.g., `jdk-8u231-macosx-x64.dmg`.
|
||||
- Install using the installer and verify you are able to run `java` from your command-line
|
||||
- Download and install **[Apache Spark 2.4.4](https://archive.apache.org/dist/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz)**:
|
||||
- Add the necessary environment variables SPARK_HOME e.g., `~/bin/spark-2.4.4-bin-hadoop2.7/`
|
||||
```bash
|
||||
export SPARK_HOME=~/bin/spark-2.4.4-bin-hadoop2.7/
|
||||
export PATH="$SPARK_HOME/bin:$PATH"
|
||||
source ~/.bashrc
|
||||
```
|
||||
- Download and install **[Microsoft.Spark.Worker](https://github.com/dotnet/spark/releases)** release:
|
||||
- Select a **[Microsoft.Spark.Worker](https://github.com/dotnet/spark/releases)** release from .NET for Apache Spark GitHub Releases page and download into your local machine (e.g., `/bin/Microsoft.Spark.Worker/`).
|
||||
- **IMPORTANT** Create a new environment variable using ```export DOTNET_WORKER_DIR <your_path>``` and set it to the directory where you downloaded and extracted the Microsoft.Spark.Worker (e.g., `/bin/Microsoft.Spark.Worker/`).
|
||||
|
||||
|
||||
## Authoring a .NET for Apache Spark App
|
||||
- Use the `dotnet` CLI to create a console application.
|
||||
```
|
||||
dotnet new console -o HelloSpark
|
||||
```
|
||||
- Install `Microsoft.Spark` Nuget package into the project from the [spark nuget.org feed](https://www.nuget.org/profiles/spark) - see [Ways to install Nuget Package](https://docs.microsoft.com/en-us/nuget/consume-packages/ways-to-install-a-package)
|
||||
```
|
||||
cd HelloSpark
|
||||
dotnet add package Microsoft.Spark
|
||||
```
|
||||
- Replace the contents of the `Program.cs` file with the following code:
|
||||
```csharp
|
||||
using Microsoft.Spark.Sql;
|
||||
|
||||
namespace HelloSpark
|
||||
{
|
||||
class Program
|
||||
{
|
||||
static void Main(string[] args)
|
||||
{
|
||||
var spark = SparkSession.Builder().GetOrCreate();
|
||||
var df = spark.Read().Json("people.json");
|
||||
df.Show();
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
- Use the `dotnet` CLI to build the application:
|
||||
```bash
|
||||
dotnet build
|
||||
```
|
||||
|
||||
## Running your .NET for Apache Spark App
|
||||
- Open your terminal and navigate into your app folder:
|
||||
```bash
|
||||
cd <your-app-output-directory>
|
||||
```
|
||||
- Create `people.json` with the following content:
|
||||
```json
|
||||
{ "name" : "Michael" }
|
||||
{ "name" : "Andy", "age" : 30 }
|
||||
{ "name" : "Justin", "age" : 19 }
|
||||
```
|
||||
- Run your app
|
||||
```bash
|
||||
spark-submit \
|
||||
--class org.apache.spark.deploy.dotnet.DotnetRunner \
|
||||
--master local \
|
||||
microsoft-spark-2.4.x-<version>.jar \
|
||||
dotnet HelloSpark.dll
|
||||
```
|
||||
**Note**: This command assumes you have downloaded Apache Spark and added it to your PATH environment variable to be able to use `spark-submit`, otherwise, you would have to use the full path (e.g., `~/spark/bin/spark-submit`).
|
||||
|
||||
- The output of the application should look similar to the output below:
|
||||
```text
|
||||
+----+-------+
|
||||
| age| name|
|
||||
+----+-------+
|
||||
|null|Michael|
|
||||
| 30| Andy|
|
||||
| 19| Justin|
|
||||
+----+-------+
|
||||
```
|
Загрузка…
Ссылка в новой задаче