@azure/event-processor-host

Please note that a newer package, @azure/event-hubs, is available as of January 2020. While this package will continue to receive critical bug fixes, we strongly encourage you to upgrade. See the migration guide for more details.

Azure Event Processor Host helps you efficiently receive events from an EventHub. It creates EventHub Receivers across all the partitions in the provided consumer group of an EventHub and delivers the messages received across all of those partitions to you. It checkpoints metadata about the received messages at regular intervals in an Azure Storage Blob. This makes it easy to continue receiving messages later from where you left off.

Conceptual Overview

  • More information about Azure Event Processor Host can be found over here.
  • General overview of how the Event Processor Host SDK works internally can be found over here.

Prerequisites

  • Node.js version: 6.x or higher.
  • We still encourage you to install the latest available LTS version of Node.js from https://nodejs.org.
  • Installing Node.js on Windows or macOS is simple with the installers available on the Node.js website. If you are using a Linux-based OS, you can find easy-to-follow, one-step installation instructions over here.

Installation

npm install @azure/event-processor-host

IDE

This SDK is written in TypeScript and has thorough source code documentation. We highly recommend using VS Code or another IDE with good IntelliSense support so that you can take full advantage of that documentation.

Debug logs

You can set the following environment variable to get the debug logs.

  • Getting debug logs only from the Event Processor Host SDK
export DEBUG=azure:eph*
  • Getting debug logs from the Event Processor Host SDK and the protocol level library.
export DEBUG=azure:eph*,rhea*
  • Getting debug logs from the Event Processor Host SDK, the Event Hub SDK and the protocol level library.
export DEBUG=azure*,rhea*
  • If you are not interested in viewing the message transformation (which consumes a lot of console/disk space), then you can set the DEBUG environment variable as follows:
export DEBUG=azure*,rhea*,-rhea:raw,-rhea:message,-azure:amqp-common:datatransformer
  • If you are interested only in errors, then you can set the DEBUG environment variable as follows:
export DEBUG=azure:eph:error,azure:event-hubs:error,azure-amqp-common:error,rhea-promise:error,rhea:events,rhea:frames,rhea:io,rhea:flow
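
The DEBUG values above are comma-separated namespace patterns in which `*` acts as a wildcard and a leading `-` excludes a namespace. As a rough illustration of how such a value selects namespaces, here is a minimal sketch. `isEnabled` is a hypothetical helper written for this explanation; it is not part of this SDK, and it only approximates the matching rules of the logging library that reads DEBUG.

```javascript
// Hypothetical helper (not part of the SDK): approximates how a DEBUG value
// selects logger namespaces. "*" is a wildcard; a leading "-" excludes.
function isEnabled(pattern, namespace) {
  // Turn one glob such as "azure:eph*" into an anchored regular expression.
  const toRegExp = (glob) =>
    new RegExp(
      "^" + glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*") + "$"
    );
  const parts = pattern
    .split(",")
    .map((p) => p.trim())
    .filter(Boolean);
  const excludes = parts.filter((p) => p.startsWith("-")).map((p) => toRegExp(p.slice(1)));
  const includes = parts.filter((p) => !p.startsWith("-")).map((p) => toRegExp(p));
  // An exclusion always wins over an inclusion.
  if (excludes.some((re) => re.test(namespace))) {
    return false;
  }
  return includes.some((re) => re.test(namespace));
}

const pattern = "azure*,rhea*,-rhea:raw,-rhea:message";
console.log(isEnabled(pattern, "azure:eph:error")); // true
console.log(isEnabled(pattern, "rhea:raw")); // false
console.log(isEnabled("azure:eph*", "rhea:frames")); // false
```

Under these assumptions, adding `-rhea:raw` to a broad pattern such as `azure*,rhea*` silences only that one namespace while leaving the other rhea loggers enabled.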

Logging to a file

  • Set the DEBUG environment variable as shown above and then run your test script as follows:
    • Logging statements from your test script go to out.log and logging statements from the sdk go to debug.log.
      node your-test-script.js > out.log 2>debug.log
      
    • Logging statements from your test script and the sdk go to the same file, out.log, by redirecting stdout to the file and then redirecting stderr to stdout (&1):
      node your-test-script.js >out.log 2>&1
      
    • Logging statements from your test script and the sdk go to the same file out.log (Bash shorthand for the same redirection).
      node your-test-script.js &> out.log
      

Recommendation

  • The multiple-EPH sample below runs two EPH instances in the same process. Since Node.js is single threaded, that one process has to balance lease management (acquire, renew, steal, update) against receiving messages across all the partitions. Running each EPH instance in a separate process or on a separate machine should give better results.
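
One way to follow this recommendation is to fork one Node.js child process per EPH instance. The sketch below is a minimal illustration and not part of the SDK: `launchEphWorkers`, the `EPH_HOST_NAME` environment variable, and the `./eph-worker.js` script are hypothetical names chosen for this example. It assumes the worker script creates and starts a single EPH using the host name it reads from its environment.

```javascript
const { fork } = require("child_process");

/**
 * Hypothetical helper: forks one child process per EPH host name.
 * @param hostNames Names to give the EPH instances, one process each.
 * @param workerScript Path to a script that starts a single EPH (assumed to exist).
 * @param forkFn Function used to create the process; defaults to child_process.fork.
 */
function launchEphWorkers(hostNames, workerScript, forkFn = fork) {
  return hostNames.map((hostName) =>
    // Each worker inherits the parent's environment plus its own host name.
    forkFn(workerScript, [], {
      env: { ...process.env, EPH_HOST_NAME: hostName }
    })
  );
}

// Usage (assumes ./eph-worker.js exists and reads process.env.EPH_HOST_NAME):
// const workers = launchEphWorkers(["eph-1", "eph-2"], "./eph-worker.js");
```

The `forkFn` parameter exists only so the helper can be exercised without spawning real processes; in normal use you would omit it.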

Examples

  • Examples can be found over here.

Usage

NOTE

The following samples focus on EPH (Event Processor Host) which is responsible for receiving messages. For sending messages to the EventHub, please use the @azure/event-hubs package from npm. More information about the event hub client can be found over here. You can also use this example that sends multiple messages batched together. You should be able to run the send example from one terminal window and see those messages being received in the singleEph or multipleEph example being run in the second terminal window.

Single EPH instance.

const { EventProcessorHost, delay } = require("@azure/event-processor-host");

const path = process.env.EVENTHUB_NAME;
const storageCS = process.env.STORAGE_CONNECTION_STRING;
const ehCS = process.env.EVENTHUB_CONNECTION_STRING;
const storageContainerName = "test-container";

async function main() {
  // Create the Event Processor Host
  const eph = EventProcessorHost.createFromConnectionString(
    EventProcessorHost.createHostName("my-host"),
    storageCS,
    storageContainerName,
    ehCS,
    {
      eventHubPath: path,
      // onEphError is an option, so it belongs inside the options object.
      onEphError: (error) => {
        console.log("This handler will notify you of any internal errors that happen " +
        "during partition and lease management: %O", error);
      }
    }
  );
  let count = 0;
  // Message event handler
  const onMessage = async (context/*PartitionContext*/, data /*EventData*/) => {
    console.log(">>>>> Rx message from '%s': '%s'", context.partitionId, data.body);
    count++;
    // let us checkpoint every 100th message that is received across all the partitions.
    if (count % 100 === 0) {
      return await context.checkpoint();
    }
  };
  // Error event handler
  const onError = (error) => {
    console.log(">>>>> Received Error: %O", error);
  };
  // start the EPH
  await eph.start(onMessage, onError);
  // After some time, let's say 2 minutes
  await delay(120000);
  // This will stop the EPH.
  await eph.stop();
}

main().catch((err) => {
  console.log(err);
});
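
The sample above keeps one counter for all partitions and checkpoints whichever partition happens to deliver every 100th message. If you would rather count per partition, the following is a minimal sketch of that variation. `checkpointEvery` is a hypothetical helper, not part of the SDK; it relies only on the `partitionId` property and the `checkpoint()` method of the context that the sample above already uses.

```javascript
/**
 * Hypothetical helper: wraps a message handler so that context.checkpoint()
 * is awaited after every `interval` messages seen on the same partition.
 */
function checkpointEvery(interval, handler) {
  const counts = new Map(); // partitionId -> number of messages seen so far
  return async (context, data) => {
    await handler(context, data);
    const seen = (counts.get(context.partitionId) || 0) + 1;
    counts.set(context.partitionId, seen);
    if (seen % interval === 0) {
      await context.checkpoint();
    }
  };
}

// Usage with the `eph` and `onError` from the sample above:
// const onMessage = checkpointEvery(100, async (context, data) => {
//   console.log(">>>>> Rx message from '%s': '%s'", context.partitionId, data.body);
// });
// await eph.start(onMessage, onError);
```

Checkpointing less often reduces writes to the storage account, at the cost of reprocessing more messages after a restart.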

Multiple EPH instances in the same process.

This example creates 2 instances of EPH in the same process. It is also perfectly fine to create multiple EPH instances in different processes on the same machine or on different machines.

const { EventProcessorHost, delay } = require("@azure/event-processor-host");

// set the values from environment variables.
const path = process.env.EVENTHUB_NAME || "";
const storageCS = process.env.STORAGE_CONNECTION_STRING;
const ehCS = process.env.EVENTHUB_CONNECTION_STRING;

// set the names of eph and the lease container.
const storageContainerName = "test-container";
const ephName1 = "eph-1";
const ephName2 = "eph-2";

/**
 * The main function that executes the sample.
 */
async function main() {
  // 1. Start eph-1.
  const eph1 = await startEph(ephName1);
  await sleep(20);
  // 2. After 20 seconds start eph-2.
  const eph2 = await startEph(ephName2);
  await sleep(90);
  // 3. Now, load will be evenly balanced between eph-1 and eph-2. After 90 seconds stop eph-1.
  await stopEph(eph1);
  await sleep(40);
  // 4. Now, eph-2 will acquire the leases for all the partitions. After 40 seconds stop eph-2.
  await stopEph(eph2);
}

// calling the main().
main().catch((err) => {
  console.log("Exiting from main() due to an error: %O.", err);
});

/**
 * Sleeps for the given number of seconds.
 * @param timeInSeconds Time to sleep in seconds.
 */
async function sleep(timeInSeconds /**number**/) {
  console.log(">>>>>> Sleeping for %d seconds..", timeInSeconds);
  await delay(timeInSeconds * 1000);
}

/**
 * Creates an EPH with the given name and starts the EPH.
 * @param ephName The name of the EPH.
 * @returns {Promise<EventProcessorHost>} Promise<EventProcessorHost>
 */
async function startEph(ephName /**string**/) {
  // Create the Event Processor Host
  const eph = EventProcessorHost.createFromConnectionString(
    ephName,
    storageCS,
    storageContainerName,
    ehCS,
    {
      eventHubPath: path,
      // This method will provide errors that occur during lease and partition management. The
      // errors that occur while receiving messages will be provided in the onError handler
      // provided in the eph.start() method.
      onEphError: (error) => {
        console.log(">>>>>>> [%s] Error: %O", ephName, error);
      }
    }
  );
  // Message handler
  let count = 0;
  const onMessage /**OnReceivedMessage**/ = async (
    context /**PartitionContext**/,
    data /**EventData**/
  ) => {
    count++;
    console.log(
      "##### [%s] %d - Rx message from '%s': '%s'",
      ephName,
      count,
      context.partitionId,
      data.body
    );
    // Checkpointing every 200th event that is received across all the partitions.
    if (count % 200 === 0) {
      try {
        console.log(
          "***** [%s] EPH is currently receiving messages from partitions: %O",
          ephName,
          eph.receivingFromPartitions
        );
        await context.checkpoint();
        console.log("$$$$ [%s] Successfully checkpointed message number %d", ephName, count);
      } catch (err) {
        console.log(
          ">>>>>>> [%s] An error occurred while checkpointing msg number %d: %O",
          ephName,
          count,
          err
        );
      }
    }
  };
  // Error handler
  const onError /**OnReceivedError**/ = (error) => {
    console.log(">>>>> [%s] Received Error: %O", ephName, error);
  };
  console.log(">>>>>> Starting the EPH - %s", ephName);
  await eph.start(onMessage, onError);
  return eph;
}

/**
 * Stops the given EventProcessorHost.
 * @param eph The event processor host.
 * @returns {Promise<void>} Promise<void>
 */
async function stopEph(eph /**EventProcessorHost**/) {
  console.log(">>>>>> Stopping the EPH - '%s'.", eph.hostName);
  await eph.stop();
  console.log(">>>>>> Successfully stopped the EPH - '%s'.", eph.hostName);
}

EPH with IotHub connection string

const { EventProcessorHost, delay } = require("@azure/event-processor-host");

const path = process.env.EVENTHUB_NAME || "";
const storageCS = process.env.STORAGE_CONNECTION_STRING;
const iothubCS = process.env.IOTHUB_CONNECTION_STRING;
const storageContainerName = "test-container";

async function main() {
  // Create the Event Processor Host
  const eph = await EventProcessorHost.createFromIotHubConnectionString(
    EventProcessorHost.createHostName("my-host"),
    storageCS,
    storageContainerName,
    iothubCS,
    {
      eventHubPath: path
    }
  );
  let count = 0;
  // Message event handler
  const onMessage = async (context /*PartitionContext*/, data /*EventData*/) => {
    console.log(">>>>> Rx message from '%s': '%s'", context.partitionId, data.body);
    count++;
    // let us checkpoint every 100th message that is received across all the partitions.
    if (count % 100 === 0) {
      return await context.checkpoint();
    }
  };
  // Error event handler
  const onError = (error) => {
    console.log(">>>>> Received Error: %O", error);
  };
  // start the EPH
  await eph.start(onMessage, onError);
  // After some time, let's say 2 minutes
  await delay(120000);
  // This will stop the EPH.
  await eph.stop();
}

main().catch((err) => {
  console.log(err);
});

AMQP Dependencies

This package depends on the rhea library for managing connections and for sending and receiving events over the AMQP protocol.
