.NET for Apache Spark Roadmap

The goal of the .NET for Apache Spark project is to provide an easy-to-use, .NET-friendly integration with the popular big data platform, Apache Spark. This document describes the tentative plan for the project in the short and long term.

.NET for Apache Spark is a community effort, and we welcome community feedback on our plans. The best way to give feedback is to open an issue in this repo. We are also excited to receive contributions (check out the contribution guide). It's always a good idea to open an issue for discussion before embarking on a large code change, to make sure that effort is not duplicated. Where we know that work is already underway, we have marked the item with (*) below.

Short Term

User Experience

1:1 API compatibility for DataFrames with Apache Spark 2.3.x, Apache Spark 2.4.x and Apache Spark 3.0.x (*)

Performance Optimizations

Improvements to C# Pickling Library
Improvements to Arrow .NET Library
Exploiting .NET Vectorization (*)
Micro-benchmarking framework for Interop

Benchmarks

Benchmarking scripts for all languages, covering both dataset generation and running queries against the dataset (*)
Published reproducible benchmarks against TPC-H (industry-standard database benchmark) (*)

Tooling Improvements

VS Code support (*)
Jupyter integration with C# & F# notebook support (*)
Improved user experience for .NET app submission to a remote Spark cluster

Longer Term

User Experience

Idiomatic C# and F# APIs

Performance Optimizations

Contribute extensible interop layer to Apache Spark

Benchmarks

Published reproducible benchmarks against TPC-DS (industry-standard database benchmark)

Tooling Improvements

Visual Studio Extension for .NET app submission to a remote Spark cluster
Visual Studio Extension for .NET app debugging
Make it easy to copy/paste Scala examples into Visual Studio
