Коммит
d2fbc24bcb
|
@ -0,0 +1,222 @@
|
|||
#Consuming Events with the Java client for Azure Event Hubs
|
||||
|
||||
Consuming events from Event Hubs is different from typical messaging infrastuctures like queues or topic
|
||||
subscriptions, where a consumer simply fetches the "next" message.
|
||||
|
||||
Event Hubs puts the consumer in control of the offset from which the log shall be read,
|
||||
and the consumer can repeatedly pick a different or the same offset and read the event stream from chosen offsets while
|
||||
the events are being retained. Each partition is therefore loosely analogous to a tape drive that you can wind back to a
|
||||
particular mark and then play back to the freshest data available.
|
||||
|
||||
Azure Event Hubs consumers also need to be aware of the partitioning model chosen for an Event Hub as receivers explicitly
|
||||
interact with partitions. Any Event Hub's event store is split up into at least 4 partitions, each maintaining a separate event log. You can think
|
||||
of partitions like lanes on a highway. The more events the Event Hub needs to handle, the more lanes (partitions) you have
|
||||
to add. Each partition can handle at most the equivalent of 1 "throughput unit", equivalent to at most 1000 events per
|
||||
second and at most 1 Megabyte per second.
|
||||
|
||||
The common consumption model for Event Hubs is that multiple consumers (threads, processes, compute nodes) process events
|
||||
from a single Event Hub in parallel, and coordinate which consumer is responsible for pulling events from which partition.
|
||||
|
||||
##Getting Started
|
||||
|
||||
This library is available for use in Maven projects from the Maven Central Repository, and can be referenced using the
|
||||
following dependency declaration inside of your Maven project file:
|
||||
|
||||
```XML
|
||||
<dependency>
|
||||
<groupId>com.microsoft.azure</groupId>
|
||||
<artifactId>azure-eventhubs-clients</artifactId>
|
||||
<version>0.6.0</version>
|
||||
</dependency>
|
||||
```
|
||||
|
||||
For different types of build environments, the latest released JAR files can also be [explicitly obtained from the
|
||||
Maven Central Repository]() or from [the Release distribution point on GitHub]().
|
||||
|
||||
|
||||
For a simple event publisher, you'll need to import the *com.microsoft.azure.eventhubs* package for the Event Hub client classes
|
||||
and the *com.microsoft.azure.servicebus* package for utility classes like common exceptions that are shared with the
|
||||
Azure Service Bus Messaging client.
|
||||
|
||||
|
||||
```Java
|
||||
import com.microsoft.azure.eventhubs.*;
|
||||
import com.microsoft.azure.servicebus.*;
|
||||
```
|
||||
|
||||
The receiver code creates an *EventHubClient* from a given connecting string
|
||||
|
||||
```Java
|
||||
EventHubClient ehClient = EventHubClient.createFromConnectionString(str).get();
|
||||
```
|
||||
|
||||
The receiver code then creates (at least) one *PartitionReceiver* that will receive the data. The receiver is seeded with
|
||||
an offset, in the snippet below it's simply the start of the log.
|
||||
|
||||
```Java
|
||||
String partitionId = "0"; // API to get PartitionIds will be released as part of V0.2
|
||||
PartitionReceiver receiver = ehClient.createReceiver(
|
||||
EventHubClient.DefaultConsumerGroupName,
|
||||
partitionId,
|
||||
PartitionReceiver.StartOfStream,
|
||||
false).get();
|
||||
```
|
||||
|
||||
Once the receiver is initialized, getting events is just a matter of calling the *receive()* method in a loop. Each call
|
||||
to *receive()* will fetch an eneumerable batch of events to process.
|
||||
|
||||
```Java
|
||||
Iterable<EventData> receivedEvents = receiver.receive().get();
|
||||
```
|
||||
|
||||
##Consumer Groups
|
||||
|
||||
Event Hub receivers always receive via Consumer Groups. A consumer group is a named entity on an Event Hub that is
|
||||
conceptually similar to a Messaging Topic subscription, even though it provides no server-side filtering capabilities.
|
||||
|
||||
Each Event Hub has a "default consumer group" that is created with the Event Hub, which is also the one used in
|
||||
the samples.
|
||||
|
||||
The primary function of consumers groups is to provide a shared coordination context for multiple concurrent consumers
|
||||
processing the same event stream in parallel. There can be at most 5 concurrent readers on a partition per consumer group;
|
||||
it is however *recommended* that there is only one active receiver on a partition per consumer group. The [Ownership, Failover,
|
||||
and Epochs](#ownership-failover-and-epochs) section below explains how to ensure this.
|
||||
|
||||
You can create up to 20 such consumer groups on an Event Hub via the Azure portal or the HTTP API.
|
||||
|
||||
##Using Offsets
|
||||
|
||||
Each Event Hub has a configurable event retention period, which defaults to one day and can be extended to seven days.
|
||||
By contacting Microsoft product support you can ask for further extend the retention period to up to 30 days.
|
||||
|
||||
There are three options for a consumer to pick at which point into the retained event stream it wants to
|
||||
begin receiving events:
|
||||
|
||||
1. **Start of stream** Receive from the start of the retained stream, as shown in the example above. This option will start
|
||||
with the oldest available retained event in the partition and then continously deliver events until all available events
|
||||
have been read.
|
||||
2. **Time offset**. This option will start with the oldest event in the partition that has been received into the Event Hub
|
||||
after the given instant.
|
||||
|
||||
``` Java
|
||||
PartitionReceiver receiver = ehClient.createReceiver(
|
||||
EventHubClient.DefaultConsumerGroupName,
|
||||
partitionId,
|
||||
Instant.now()).get();
|
||||
```
|
||||
3. **Absolute offset** This option is commonly used to resume receiving events after a previous receiver on the partition
|
||||
has been aborted or suspended for any reason. The offset is a system-supplied string that should not be interpreted by
|
||||
the application. The next section will discuss scenarios for using this option.
|
||||
|
||||
``` Java
|
||||
PartitionReceiver receiver = ehClient.createReceiver(
|
||||
EventHubClient.DefaultConsumerGroupName,
|
||||
partitionId,
|
||||
savedOffset).get();
|
||||
```
|
||||
|
||||
|
||||
##Ownership, Failover, and Epochs
|
||||
|
||||
As mentioned in the overview above, the common consumption model for Event Hubs is that multiple consumers process events
|
||||
from a single Event Hub in parallel. Depending on the amount of processing work required and the data volume that has to be
|
||||
worked through, and also dependent on how resilient the system needs to be against failures, these consumers may be spread
|
||||
across multiple different compute nodes (VMs).
|
||||
|
||||
A simple setup for this is to create a fixed assignment of Event Hub partitions to compute nodes. For instance, you
|
||||
could have two compute nodes handling events from 8 Event Hub partitions, assigning the first 4 partitions to the
|
||||
first node and assigning the second set of 4 to the second node.
|
||||
|
||||
The downside of such a simple model with fixed assignments is that if one of the compute nodes becomes unavailable, no events
|
||||
get processed for the partitions owned by that node.
|
||||
|
||||
The alternative is to make ownership dynamic and have all processing nodes reach consensus about who owns which partition,
|
||||
which is referred to as "[leader election](https://en.wikipedia.org/wiki/Leader_election)" or "consensus" in literature.
|
||||
One infrastructure for negotiating leaders is [Apache Zookeeper] (https://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection),
|
||||
another one one is [leader election over Azure Blobs](https://msdn.microsoft.com/de-de/library/dn568104.aspx).
|
||||
|
||||
> The "event processor host" is a forthcoming extension to this Java client that provides an implementation of leader
|
||||
> election over Azure blobs. The event processor host for Java is very similar to the respective implementation available
|
||||
> for C# clients.
|
||||
|
||||
As the number of event processor nodes grows or shrinks, a leader election model will yield a redistribution of partition
|
||||
ownership. More nodes each own fewer partitions, fewer nodes each own more partitions. Since leader election occurs
|
||||
external to the Event Hub clients, there's a mechanism needed to allow a new leader for a partition to force the old leader
|
||||
to let go of the partition, meaning it must be forced to stop receiving and processing events from the partition.
|
||||
|
||||
That mechanism is called **epochs**. An epoch is an integer value that acts as a label for the time period during which the
|
||||
"current" leader for the partition retains its ownership. The epoch value is provided as an argument to the
|
||||
*EventHubClient::createEpochReciver* method.
|
||||
|
||||
``` Java
|
||||
epochValue = 1
|
||||
PartitionReceiver receiver1 = ehClient.createEpochReceiver(
|
||||
EventHubClient.DefaultConsumerGroupName,
|
||||
partitionId,
|
||||
savedOffset,
|
||||
epochValue).get();
|
||||
```
|
||||
|
||||
When a new partition owner takes over, it creates a receiver for the same partition, but with a greater epoch value. This will instantly
|
||||
cause the previous receiver to be dropped (the service initiates a shutdown of the link) and the new receiver to take over.
|
||||
|
||||
``` Java
|
||||
/* obtain checkpoint data */
|
||||
epochValue = 2
|
||||
PartitionReceiver receiver2 = ehClient.createEpochReceiver(
|
||||
EventHubClient.DefaultConsumerGroupName,
|
||||
partitionId,
|
||||
savedOffset,
|
||||
epochValue).get();
|
||||
```
|
||||
|
||||
The new leader obviously also needs to know at which offset processing shall continue. For this, the current owner of a partition should
|
||||
periodically record its progress on the event stream to a shared location, tracking the offset of the last processed message. This is
|
||||
called "checkpointing". In case of the aforementioned Azure Blob lease election model, the blob itself is a great place to keep this information.
|
||||
|
||||
How often an event processor writes checkpoint information depends on the use-case. Frequent checkpointing may cause excessive writes to
|
||||
the checkpoint state location. Too infrequent checkpointing may cause too many events to be re-processed as the new onwer picks up from
|
||||
an outdated offset.
|
||||
|
||||
##AMQP 1.0
|
||||
Azure Event Hubs requires using the AMQP 1.0 protocol for consuming events.
|
||||
|
||||
AMQP 1.0 is a TCP based protocol. For Azure Event Hubs, all traffic *must* be protected using TLS (SSL) and is using
|
||||
TCP port 5672. For the WebSocket binding of AMQP, traffic flows via port 443.
|
||||
|
||||
##Connection Strings
|
||||
|
||||
Azure Event Hubs and Azure Service Bus share a common format for connection strings. A connection string holds all required
|
||||
information to set up a connection with an Event Hub. The format is a simple property/value list of the form
|
||||
{property}={value} with pairs separated by ampersands (&).
|
||||
|
||||
| Property | Description |
|
||||
|-----------------------|------------------------------------------------------------|
|
||||
| Endpoint | URI for the Event Hubs namespace. Typically has the form *sb://{namespace}.servicebus.windows.net/* |
|
||||
| EntityPath | Relative path of the Event Hub in the namespace. Commonly this is just the Event Hub name |
|
||||
| SharedAccessKeyName | Name of a Shared Access Signature rule configured for the Event Hub or the Event Hub name. For publishers, the rule must include "Send" permissions. |
|
||||
| SharedAccessKey | Base64-encoded value of the Shared Access Key for the rule |
|
||||
| SharedAccessSignature | A previously issued [Shared Access Signature token](https://azure.microsoft.com/en-us/documentation/articles/service-bus-sas-overview/) |
|
||||
|
||||
A connection string will therefore have the following form:
|
||||
|
||||
```
|
||||
Endpoint=sb://clemensveu.servicebus.windows.net&EntityPath=myeventhub&SharedAccessSignature=....
|
||||
```
|
||||
|
||||
Consumers generally have a different relationship with the Event Hub than publishers. Usually there are relatively few consumers
|
||||
and those consumers enjoy a high level of trust within the context of a system. The relationshiop between an event consumer
|
||||
and the Event Hub is commonly also much longer-lived.
|
||||
|
||||
It's therefore more common for a consumer to be directly configured with a SAS key rule name and key as part of the
|
||||
connection string. In order to prevent the SAS key from leaking, it is stil advisable to use a long-lived
|
||||
token rather than the naked key.
|
||||
|
||||
A generated token will be configured into the connection string with the *SharedAccessSignature* property.
|
||||
|
||||
More information about Shared Access Signature in Service Bus and Event Hubs about about how to generate the required tokens
|
||||
in a range of languages [can be found on the Azure site.](https://azure.microsoft.com/en-us/documentation/articles/service-bus-sas-overview/)
|
||||
|
||||
The easiest way to obtain a token for development purposes is to copy the connection string from the Azure portal. These tokens
|
||||
do include key name and key value outright. The portal does not issue tokens.
|
||||
|
|
@ -0,0 +1,204 @@
|
|||
#Publishing Events with the Java client for Azure Event Hubs
|
||||
|
||||
The vast majority of Event Hub applications using this and the other client libraries are and will be event publishers.
|
||||
And for most of these publishers, publishing events is extremely simple and handled with just a few API gestures.
|
||||
|
||||
##Getting Started
|
||||
|
||||
This library is available for use in Maven projects from the Maven Central Repository, and can be referenced using the
|
||||
following dependency declaration inside of your Maven project file:
|
||||
|
||||
```XML
|
||||
<dependency>
|
||||
<groupId>com.microsoft.azure</groupId>
|
||||
<artifactId>azure-eventhubs-clients</artifactId>
|
||||
<version>0.6.0</version>
|
||||
</dependency>
|
||||
```
|
||||
|
||||
For different types of build environments, the latest released JAR files can also be [explicitly obtained from the
|
||||
Maven Central Repository]() or from [the Release distribution point on GitHub]().
|
||||
|
||||
|
||||
For a simple event publisher, you'll need to import the *com.microsoft.azure.eventhubs* package for the Event Hub client classes
|
||||
and the *com.microsoft.azure.servicebus* package for utility classes like common exceptions that are shared with the
|
||||
Azure Service Bus Messaging client.
|
||||
|
||||
|
||||
```Java
|
||||
import com.microsoft.azure.eventhubs.*;
|
||||
import com.microsoft.azure.servicebus.*;
|
||||
```
|
||||
|
||||
Using an Event Hub connection string, which holds all required connection information including an authorization key or token
|
||||
(see [Connection Strings](#connection-strings)), you then create an *EventHubClient* instance.
|
||||
|
||||
```Java
|
||||
EventHubClient ehClient = EventHubClient.createFromConnectionString(str).get();
|
||||
```
|
||||
|
||||
Once you have the client in hands, you can package any arbitrary payload as a plain array of bytes and send it. The samples
|
||||
we use to illustrate the functionality send a UTF-8 encoded JSON data, but you can transfer any format you wish.
|
||||
|
||||
```Java
|
||||
EventData sendEvent = new EventData(payloadBytes);
|
||||
ehClient.send(sendEvent).get();
|
||||
```
|
||||
|
||||
The entire client API is built for Java 8's concurrent task model, generally returning
|
||||
[*CompleteableFuture<T>*](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html), so the
|
||||
*.get()* suffixing the operations in the snippets above just wait until the respective operation is complete.
|
||||
|
||||
##AMQP 1.0
|
||||
Azure Event Hubs allows for publishing events using the HTTPS and AMQP 1.0 protocols. The Azure Event Hub endpoints
|
||||
also support AMQP over the WebSocket protocol, allowing event traffic to leverage the same outbound TCP port as
|
||||
HTTPS.
|
||||
|
||||
This client library is built on top of the [Apache Qpid Proton-J]() libraries and supports AMQP, which is significantly
|
||||
more efficient at publishing event streams than HTTPS. AMQP 1.0 is an international standard published as ISO/IEC 19464:2014.
|
||||
|
||||
AMQP is session-oriented and sets up the required addressing information and authorization information just once for each
|
||||
send link, while HTTPS requires doing so with each sent message. AMQP also has a compact binary format to express common
|
||||
event properties, while HTTPS requires passing message metadata in a verbose text format. AMQP can also keep a significant
|
||||
number of events "in flight" with asynchronous and robust acknowledgement flow, while HTTPS enforces a strict request-reply
|
||||
pattern.
|
||||
|
||||
AMQP 1.0 is a TCP based protocol. For Azure Event Hubs, all traffic *must* be protected using TLS (SSL) and is using
|
||||
TCP port 5672.
|
||||
|
||||
The current release of Proton-J does not yet support the WebSocket protocol. If your event publishers routinely operate
|
||||
in environments where tunneling over the HTTPS TCP port 443 is required due to Firewall policy reasons, please refer to
|
||||
the section [Publishing via HTTPS](#Publishing via HTTPS) below for a temporary alternative.
|
||||
|
||||
##Connection Strings
|
||||
|
||||
Azure Event Hubs and Azure Service Bus share a common format for connection strings. A connection string holds all required
|
||||
information to set up a connection with an Event Hub. The format is a simple property/value list of the form
|
||||
{property}={value} with pairs separated by ampersands (&).
|
||||
|
||||
| Property | Description |
|
||||
|-----------------------|------------------------------------------------------------|
|
||||
| Endpoint | URI for the Event Hubs namespace. Typically has the form *sb://{namespace}.servicebus.windows.net/* |
|
||||
| EntityPath | Relative path of the Event Hub in the namespace. Commonly this is just the Event Hub name |
|
||||
| SharedAccessKeyName | Name of a Shared Access Signature rule configured for the Event Hub or the Event Hub name. For publishers, the rule must include "Send" permissions. |
|
||||
| SharedAccessKey | Base64-encoded value of the Shared Access Key for the rule |
|
||||
| SharedAccessSignature | A previously issued Shared Access Signature token |
|
||||
|
||||
A connection string will therefore have the following form:
|
||||
|
||||
```
|
||||
Endpoint=sb://clemensveu.servicebus.windows.net&EntityPath=myeventhub&SharedAccessSignature=....
|
||||
```
|
||||
|
||||
###Tokens
|
||||
The preferred model for event publishers is to generate a token from the SAS rule key and give that to the publisher instead
|
||||
of giving the direct access to the signing key. Tokens can be extremely short lived, for a few minutes, when a sender shall
|
||||
only obtain temporary send access to the Event Hub. They can also be very long-lived, for several months, when a sender needs
|
||||
permanent send access to the Event Hub. When the key on which the token is based in invalidated, all tokens become invalid and
|
||||
must be reissued.
|
||||
|
||||
The generated token will be configured into the connection string with the *SharedAccessSignature* property.
|
||||
|
||||
More information about Shared Access Signature in Service Bus and Event Hubs about about how to generate the required tokens
|
||||
in a range of languages [can be found on the Azure site.](https://azure.microsoft.com/en-us/documentation/articles/service-bus-sas-overview/)
|
||||
|
||||
The easiest way to obtain a token for development purposes is to copy the connection string from the Azure portal. These tokens
|
||||
do include key name and key value outright. The portal does not issue tokens.
|
||||
|
||||
##Advanced Operations
|
||||
|
||||
The publisher example shown in the overview above sends an event into the Event Hub without further qualification. This is
|
||||
the preferred and most flexible and reliable option. For specific needs, Event Hubs offers two extra options to
|
||||
qualify send operations: Publisher policies and partion addressing.
|
||||
|
||||
###Publisher Policies
|
||||
|
||||
To a publisher, a publisher policy is largely just a suffix appended to the Event Hub path, which has the form
|
||||
|
||||
```
|
||||
<event-hub-name>/publishers/<policyname>
|
||||
```
|
||||
|
||||
The name of of the policy is commonly chosen by the Event Hub owner, and might be the name of the publisher's account
|
||||
or the identifier of a publishing device, or some randomly chosen string uniquely assigned to the sender.
|
||||
|
||||
With publisher policies, the Event Hub owner will generally hold on to the signing key of a SAS rule conferring only "Send"
|
||||
permission, and issue a send-only token to the publisher as described in the [Tokens](#tokens) section above. This token is
|
||||
scoped to the path shown above and can only be used to publish to this particular policy.
|
||||
|
||||
The special functionality of the publisher policy is twofold:
|
||||
|
||||
* The Event Hub owner can issue a very long-lived token to the publisher whose expiration may be several months or even years
|
||||
into the future. Should the owner suspect that the publisher has been compromised or acts maliciously, the Event Hub owner
|
||||
can revoke access for this particular publisher instead of having to invalidate the signing key and thus revoking all tokens.
|
||||
The revocation gesture is an HTTPS management operation that is documented [here].
|
||||
* Each event that is sent via a publisher policy has its PartitionKey property set, on the Event Hub side, to the publisher
|
||||
name to prevent publisher spoofing. The publisher has no control over the value of this field when sending via a publisher
|
||||
policy. The property therefore serves as proof to the event consumer that the publisher was indeed in possesion of a valid
|
||||
token for the publisher policy. The alternative to this mechanism is some form of digital signature applied to the payload.
|
||||
|
||||
There are some potential availability and reliability caveats associated with using punblisher policies that are discussed
|
||||
in the following section about partition addressing.
|
||||
|
||||
###Partition Addressing
|
||||
|
||||
Any Event Hub's event store is split up into at least 4 partitions, each maintaining a separate event log. You can think
|
||||
of partitions like lanes on a highway. The more events the Event Hub needs to handle, the more lanes (partitions) you have
|
||||
to add. Each partition can handle at most the equivalent of 1 "throughput unit", equivalent to at most 1000 events per
|
||||
second and at most 1 Megabyte per second.
|
||||
|
||||
In some cases, publisher applications need to address partitions directly in order to pre-categorize events for consumption.
|
||||
A partition is directly addressed either by using the partition's identifier or by using some string (partition key) that gets
|
||||
consistently hashed to a particular partition.
|
||||
|
||||
This capability, paired with a large number of partitions, may appear attractive for implementing a fine grained, per publisher
|
||||
subscription scheme similar to what Topics offer in Service Bus Messaging - but it's not at all how the capability should be used
|
||||
and it's likely not going to yield satisfying results.
|
||||
|
||||
Partition addressing is designed as a routing capability that consistently assigns events from the same sources to the same partition allowing
|
||||
downstream consumer systems to be optimized, but under the assumption of very many of such sources (hundreds, thousands) share
|
||||
the same partition. If you need fine-grained content-based routing, Service Bus Topics might be the better option.
|
||||
|
||||
####Using Partition Keys
|
||||
|
||||
Of the two addressing options, the preferable one is to let the hash algorithm map the event to the appropriate partition.
|
||||
The gesture is a straightforward extra override to the send operation supplying the partition key:
|
||||
|
||||
```Java
|
||||
EventData sendEvent = new EventData(payloadBytes);
|
||||
> ehClient.send(sendEvent, partitionKey).get();
|
||||
```
|
||||
|
||||
####Using Partition Ids
|
||||
|
||||
If you indeed need to target a specific partition, for instance because you must use a particular distribution strategy,
|
||||
you can send directly to the partition, but doing so requires an extra gesture so that you don't accidentally choose this
|
||||
option. To send to a partition you explicitly need to create a client object that is tued to the partition as shown below:
|
||||
|
||||
```Java
|
||||
EventHubClient ehClient = EventHubClient.createFromConnectionString(str).get();
|
||||
> EventHubSender sender = ehClient.createPartitionSender("0").get();
|
||||
EventData sendEvent = new EventData(payloadBytes);
|
||||
sender.send(sendEvent).get();
|
||||
```
|
||||
|
||||
####Special considerations for partitions and publisher policies
|
||||
|
||||
Using partitions or publisher policies (which are effectively a special kind of partition key) may impact throughput
|
||||
and availability of your Event Hub solution.
|
||||
|
||||
When you do a regular send operation that does not prescribe a particular partition, the Event Hub will choose a
|
||||
partition at random, ensuring about equal distribution of events across partitions. Sticking with the above analogy,
|
||||
all highway lanes get the same traffic.
|
||||
|
||||
If you explicitly choose the partition key or partition-id, it's up to you to take care that traffic is evenly
|
||||
distributed, otherwise you may end up with a traffic jam (in the form of throttling) on one partition while there's
|
||||
little or no traffic on another partition.
|
||||
|
||||
Also, like every other aspect of distributed systems, the log storage backing any partition may rarely and briefly slow
|
||||
down or experience congestion. If you leave choosing the target partition for an event to Event Hubs, it can flexibly
|
||||
react to such availability blips for publishers.
|
||||
|
||||
Generally, you should *not* use partitioning as a traffic prioritization scheme, and you should *not* use it
|
||||
for fine grained assignment of particular kinds of events to a particular partitions. Partitions are a load
|
||||
distribution mechanism, not a filtering model.
|
145
java/readme.md
145
java/readme.md
|
@ -4,14 +4,136 @@ Azure Event Hubs is a highly scalable publish-subscribe service that can ingest
|
|||
This lets you process and analyze the massive amounts of data produced by your connected devices and applications. Once Event Hubs has collected the data,
|
||||
transform and store it by using any real-time analytics provider or with batching/storage adapters.
|
||||
|
||||
Refer to the [documentation](https://azure.microsoft.com/services/event-hubs/) to learn more about Event Hubs in general.
|
||||
Refer to the [documentation on the Microsoft Azure site](https://azure.microsoft.com/services/event-hubs/) to learn more about Event Hubs in general.
|
||||
|
||||
|
||||
##Overview
|
||||
|
||||
##Instructions to open the Windows Azure EventHubs Java SDK in Eclipse:
|
||||
This Java client library for Azure Event Hubs allows for both sending events to and receiving events from an Azure Event Hub.
|
||||
|
||||
1. Maven is expected to be installed and Configured - version > 3.3.9
|
||||
2. After git-clone'ing to the project - open shell and navigate to the location where the 'pom.xml' is present
|
||||
An **event publisher** is a source of telemetry data, diagnostics information, usage logs, or other log data, as
|
||||
part of an emvbedded device solution, a mobile device application, a game title running on a console or other device,
|
||||
some client or server based business solution, or a web site.
|
||||
|
||||
An **event consumer** picks up such information from the Event Hub and processes it. Processing may involve aggregation, complex
|
||||
computation and filtering. Processing may also involve distribution or storage of the information in a raw or transformed fashion.
|
||||
Event Hub consumers are often robust and high-scale platform infrastructure parts with built-in analytics capabilites, like Azure
|
||||
Stream Analytics, Apache Spark, or Apache Storm.
|
||||
|
||||
Most applications will act either as an event publisher or an event consumer, but rarely both. The exception are event
|
||||
consumers that filter and/or transform event streams and then forward them on to another Event Hub; an example for such a consumer
|
||||
and (re-)publisher is Azure Stream Analytics.
|
||||
|
||||
We'll therefore only give a glimpse at publishing and receiving here in this overview and provide further detail in
|
||||
the [Publishing Events](PublishingEvents.md) and [Consuming Events](ConsumingEvents.md) guides.
|
||||
|
||||
###Publishing Events
|
||||
|
||||
The vast majority of Event Hub applications using this and the other client libraries are and will be event publishers.
|
||||
And for most of these publishers, publishing events is extremely simple.
|
||||
|
||||
With your Java application referencing this client library,
|
||||
which is quite simple in a Maven build [as we explain in the guide](PublishingEvents.md), you'll need to import the
|
||||
*com.microsoft.azure.eventhubs* package with the *EventData* and *EventHubClient* classes.
|
||||
|
||||
|
||||
```Java
|
||||
import com.microsoft.azure.eventhubs.*;
|
||||
```
|
||||
|
||||
Using an Event Hub connection string, which holds all required conenction information including an authorization key or token,
|
||||
you then create an *EventHubClient* instance, which manages a secure AMQP 1.0 connection to the Event Hub.
|
||||
|
||||
```Java
|
||||
EventHubClient ehClient = EventHubClient.createFromConnectionString(str).get();
|
||||
```
|
||||
|
||||
Once you have the client in hands, you can package any arbitrary payload as a plain array of bytes and send it.
|
||||
|
||||
```Java
|
||||
EventData sendEvent = new EventData(payloadBytes);
|
||||
ehClient.send(sendEvent).get();
|
||||
```
|
||||
|
||||
The entire client API is built for Java 8's concurrent task model, generally returning
|
||||
[*CompleteableFuture<T>*](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html), so the
|
||||
*.get()* suffixing the operations in the snippets above just wait until the respective operation is complete.
|
||||
|
||||
Learn more about publishing events, including advanced options, and when you should and shouldn't use those options,
|
||||
[in the event publisher guide](PublishingEvents.md).
|
||||
|
||||
###Consuming Events
|
||||
|
||||
Consuming events from Azure Event Hubs is a bit more complex than sending events, because the receivers need to be
|
||||
aware of Event Hub's partitioning model, while senders can most often ignore it.
|
||||
|
||||
Any Event Hub's event store is split up into at least 4 partitions, each maintaining a separate event log. You can think
|
||||
of partitions like lanes on a highway. The more events the Event Hub needs to handle, the more lanes (partitions) you have
|
||||
to add. Each partition can handle at most the equivalent of 1 "throughput unit", equivalent to at most 1000 events per
|
||||
second and at most 1 Megabyte per second.
|
||||
|
||||
Consuming messages is also quite different compared to typical messaging infrastuctures like queues or topic
|
||||
subscriptions, where the consumer simply fetches the "next" message. Azure Event Hubs puts the consumer in control of
|
||||
the offset from which the log shall be read, and the consumer can repeatedly pick a different or the same offset and read
|
||||
the event stream from chosen offsets while the events are being retained. Each partition is therefore loosely analogous
|
||||
to a tape drive that you can wind back to a particular mark and then play back to the freshest data available.
|
||||
|
||||
Just like the sender, the receiver code imports the package and creates an *EventHubClient* from a given connecting string
|
||||
|
||||
```Java
|
||||
EventHubClient ehClient = EventHubClient.createFromConnectionString(str).get();
|
||||
```
|
||||
|
||||
The receiver code then creates (at least) one *PartitionReceiver* that will receive the data. The receiver is seeded with
|
||||
an offset, in the snippet below it's simply the start of the log.
|
||||
|
||||
```Java
|
||||
String partitionId = "0"; // API to get PartitionIds will be released as part of V0.2
|
||||
PartitionReceiver receiver = ehClient.createReceiver(
|
||||
EventHubClient.DefaultConsumerGroupName,
|
||||
partitionId,
|
||||
PartitionReceiver.StartOfStream,
|
||||
false).get();
|
||||
```
|
||||
|
||||
Once the receiver is initialized, getting events is just a matter of calling the *receive()* method in a loop. Each call
|
||||
to *receive()* will fetch an eneumerable batch of events to process.
|
||||
|
||||
```Java
|
||||
Iterable<EventData> receivedEvents = receiver.receive().get();
|
||||
```
|
||||
|
||||
As you might imagine, there's quite a bit more to know about partitions, about distributing the workload of processing huge and
|
||||
fast data streams across several receiver machines, and about managing offsets in such a multi-machine scenario such that
|
||||
data is not repeatedly read or, worse, skipped. You can find this and other details discussed in
|
||||
the [Consuming Events](ConsumingEvents.md) guide.
|
||||
|
||||
##Using the library
|
||||
|
||||
You will generally not have to build this client library yourself. The build model and options are documented in the
|
||||
[Contributor's Guide](developer.md), which also explains how to create and submit proposed patches and extensions, and how to
|
||||
build a private version of the client library using a snapshot version of the foundational Apache Qpid Proton-J library, which
|
||||
this library uses as its AMQP 1.0 protocol core.
|
||||
|
||||
This library is available for use in Maven projects from the Maven Central Repository, and can be referenced using the
|
||||
following dependency declaration. The dependency declaration will in turn pull futrher required dependencies, specifically
|
||||
the required version of Apache Qpid Proton-J, and the crytography library BCPKIX by the Legion of Bouncy Castle.
|
||||
|
||||
```XML
|
||||
<dependency>
|
||||
<groupId>com.microsoft.azure</groupId>
|
||||
<artifactId>azure-eventhubs-clients</artifactId>
|
||||
<version>0.6.0</version>
|
||||
</dependency>
|
||||
```
|
||||
|
||||
For different types of build environments, the latest released JAR files can also be [explicitly obtained from the
|
||||
Maven Central Repository]() or from [the Release distribution point on GitHub]().
|
||||
|
||||
###Explore the client library with the Eclipse IDE
|
||||
|
||||
1. Maven is expected to be installed and configured - version > 3.3.9
|
||||
2. After git-clone'ing to the project, open the shell and navigate to the location where the 'pom.xml' is present
|
||||
3. Run these commands to prepare this maven project to be opened in Eclipse:
|
||||
- mvn -Declipse.workspace=<path_to_workspace> eclipse:configure-workspace
|
||||
- mvn eclipse:eclipse
|
||||
|
@ -19,5 +141,16 @@ Refer to the [documentation](https://azure.microsoft.com/services/event-hubs/) t
|
|||
5. If you see any Build Errors - make sure the Execution Environment is set to java sdk version 1.7 or higher
|
||||
* [go to Project > Properties > 'Java Build Path' > Libraries tab. Click on 'JRE System Library (V x.xx)' and Edit this to be 1.7 or higher]
|
||||
|
||||
##Contributing
|
||||
[Refer to the developer.md](developer.md) to find out how to contribute to Event Hubs Java client.
|
||||
##How to provide feedback
|
||||
|
||||
First, if you experience any issues with the runtime behavior of the Azure Event Hubs service, please consider filing a support request
|
||||
right away. Your options for [getting support are enumerated here](https://azure.microsoft.com/support/options/). In the Azure portal,
|
||||
you can file a support request from the "Help and support" menu in the upper right hand corner of the page.
|
||||
|
||||
If you find issues in this library or have suggestions for improvement of code or documentation, [you can file an issue in the project's
|
||||
GitHub repository.](https://github.com/Azure/azure-event-hubs/issues). Issues related to runtime behavior of the service, such as
|
||||
sporadic exceptions or apparent service-side performance or reliability issues can not be handled here.
|
||||
|
||||
Generally, if you want to discuss Azure Event Hubs or this client library with the community and the maintainers, you can turn to
|
||||
[stackoverflow.com under the #azure-eventhub tag](http://stackoverflow.com/questions/tagged/azure-eventhub) or the
|
||||
[MSDN Service Bus Forum](https://social.msdn.microsoft.com/Forums/en-US/home?forum=servbus).
|
Загрузка…
Ссылка в новой задаче