Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs


spark-eventhubs

This is the source code of EventHubsReceiver for Spark Streaming.

See the examples for how to use this library to process streaming data from Azure Event Hubs.

For the latest integration of EventHubs and Spark Streaming, see the project documentation.

Latest Release: 2.1.3 (For Spark 2.1.x), 2.0.9 (For Spark 2.0.x), 1.6.3 (For Spark 1.6.x)

Change Log

Usage

For help with Spark cluster setup and running Spark-EventHubs applications in your cluster, check out our Getting Started page.

Getting Officially Released Version

Official releases are published to the Maven Central repository. Add the following dependency to your project to reference Spark-EventHubs:

Maven Dependency

<!-- https://mvnrepository.com/artifact/com.microsoft.azure/spark-streaming-eventhubs_[2.10 for spark 1.6. 2.11 for spark 2.0.x or spark 2.1.x] -->
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>spark-streaming-eventhubs_[2.10 for spark 1.6. 2.11 for spark 2.0.x or spark 2.1.x]</artifactId>
    <version>[change it to latest version]</version>
</dependency>

SBT Dependency

// https://mvnrepository.com/artifact/com.microsoft.azure/spark-streaming-eventhubs_2.11
libraryDependencies += "com.microsoft.azure" % "spark-streaming-eventhubs_[2.10 for spark 1.6. 2.11 for spark 2.0.x or spark 2.1.x]" % "[change it to latest version]"
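
As an illustration, here is how the SBT line might look with the placeholders filled in for Spark 2.1.x. This is a sketch only: the Scala suffix `_2.11` and version `2.1.3` are taken from the Latest Release line in this README, and the current version should be confirmed on Maven Central before use.

```scala
// build.sbt (sketch): Spark 2.1.x, which pairs with the Scala 2.11 artifact.
// Version 2.1.3 is the release this README lists for Spark 2.1.x; verify it is still current.
libraryDependencies += "com.microsoft.azure" % "spark-streaming-eventhubs_2.11" % "2.1.3"
```

For Spark 1.6.x the same pattern would use the `_2.10` suffix and the 1.6.x release listed above.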

Maven Central for other dependency coordinates

https://mvnrepository.com/artifact/com.microsoft.azure/spark-streaming-eventhubs_[2.10 for spark 1.6. 2.11 for spark 2.0.x or spark 2.1.x]/[change it to latest version]

Getting Staging Version

Staging versions of Spark-EventHubs are published on GitHub. To use one, first declare GitHub as a source repository by adding the following entry to pom.xml:

<repository>
    <id>spark-eventhubs</id>
    <url>https://raw.github.com/hdinsight/spark-eventhubs/maven-repo/</url>
    <snapshots>
        <enabled>true</enabled>
        <updatePolicy>always</updatePolicy>
    </snapshots>
</repository>

You can then add the following dependency to your project to use the pre-release version.

Maven Dependency

<!-- https://mvnrepository.com/artifact/com.microsoft.azure/spark-streaming-eventhubs_2.11 -->
<dependency>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>spark-streaming-eventhubs_[2.10 for spark 1.6. 2.11 for spark 2.0.x or spark 2.1.x]</artifactId>
    <version>2.1.0-SNAPSHOT</version>
</dependency>

SBT Dependency

// https://mvnrepository.com/artifact/com.microsoft.azure/spark-streaming-eventhubs_2.11
libraryDependencies += "com.microsoft.azure" % "spark-streaming-eventhubs_2.11" % "2.1.0-SNAPSHOT"
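
The `<repository>` entry above applies to Maven only. An SBT build needs an equivalent resolver before it can find the snapshot artifact. The following is a minimal sketch, assuming the same staging URL can be used as an SBT resolver. The resolver name `spark-eventhubs` simply mirrors the Maven `<id>` and is otherwise arbitrary.

```scala
// build.sbt (sketch): assumed SBT counterpart of the Maven <repository> entry.
// The URL is copied from the pom.xml snippet in this README.
resolvers += "spark-eventhubs" at "https://raw.github.com/hdinsight/spark-eventhubs/maven-repo/"

// Snapshot dependency as listed above (Scala 2.11 build).
libraryDependencies += "com.microsoft.azure" % "spark-streaming-eventhubs_2.11" % "2.1.0-SNAPSHOT"
```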

Build Prerequisites

To build and run the examples, you need:

  1. Java 1.8 SDK
  2. Maven 3.x
  3. Scala 2.11

Build Command

mvn clean
mvn install 

These commands build the spark-streaming-eventhubs jar and install it into the local Maven cache. You can then build any Spark Streaming application that references this jar.

Integrate Structured Streaming and Azure Event Hubs

Integrate Spark Streaming and EventHubs with Direct DStream