* added EventHubsConf but haven't integrated it yet. build is stable!
* putting a pin in these EventHubsConf changes to focus on Spark 2.2
* WIP: implementation and tests complete. Need to fix issue related to Spark 2.2
* updating connector to work with Spark 2.2
* minor update to comments
* setting timeouts in EventHubClientWrapper
* change EventHubsConf.copy to EventHubsConf.clone
* temporarily disabling tests. progress tracker tests are being problematic and they are going to be removed in the next phase of cleanup
* driver-side translation added. dstream re-written. rdd re-written. configuration documentation added.
* EventHubsSource partial rewrite complete. Committing progress because I need to pause and fix a bug in an older version
* EventHubsSource re-write complete. Moving on to testing. Re-write was substantial, so I expect further changes will be needed as we fine tune the connector
* Fixed client, starting tests
* moving all client functionality into the client. added simulated eventhubs. going to start reworking the tests now
* cleaned out old tests. updated code. everything is building, no tests yet.
* updated EventHubsConfSuite, all tests passing
* test utils set up, first RDD is passing
* adding RDD tests
* finalized sequence number support in eventhubsconf, dstream, and source
* basic stream tests done, moving to checkpointing tests
* finished DStream tests. moving to Source tests
* tests for EventHubsSourceOffset and JsonUtils
* removing excessive stack trace printing
* first few source tests. running into a cast exception due to EH Java Client, gonna take care of that now
* fixing how source handles EnqueueTime from EventData
* added maxSeqNoPerTrigger and corresponding source tests
* additional Source tests
* decoupled simulated client from simulated eventhubs. extended simulated eventhubs to allow sending events
* rdd, dstream, and source tests adapted to new simulated eventhubs
* adding AddEventHubsData integration tests. switching machines.
* modifying eventhubsclient to avoid false positive data loss reports
* additional structured streaming integration tests
* adding support for national and private clouds via setDomainName in EventHubsConf
* added final integration tests for struct streaming
* EnqueuedTime is converted to java.sql.Timestamp
* removing unused imports
* moving to eventhubs java client 1.0.0
* Remove isValid from EventHubsConf
* maxRatePerPartition refactoring
* Client refactoring - signature changes and removing unused methods
* EventHubsConf refactoring
* Common package is removed
* dropping default max rate
* Support for JavaRDD and JavaInputDStream
* Rename Position to EventPosition
* misc cleanup
* Support multiple simulated eventhubs at once
* remove sql containsProps and userDefinedKeys options
* parallelized all loops in EventHubsClient.translate
* removing unnecessary comments
* adding javadoc comments
* conn str builder tests
* EventHubsConf tests added
* Minor bug fixes and EventPosition serialization issue is fixed
* Simulated client is enabled in tests
* Moving non-util files out of utils package
* ClientWrapper fix
* Minor bug fixes in tests
* Moved to Spark 2.3, all tests passing
* EventPosition bug fix
* Receive until we get null, only make API call for partition count once
* moving defaults into package.scala
* removing out of date docs, adding structured streaming integration guide
* spark streaming integration guide
* Removing old information from docs
* Updated PySpark docs
* updating doc name
* Updating minor issues in docs. Added experimental tag to four apis in eventhubsconf
* Adding support for batch styled queries in structured streaming
* Update struct streaming docs to reflect new batch query support
* docs/README formatting
* doc formatting
* add batch style query code sample in docs
* EventData: remove inclusive flag from public api. Starts are always inclusive, ends are always exclusive
* Updating public apis to take NameAndPartition instead of PartitionId
* Fixing javadoc issues in EventPosition
* updating readme
* updating templates for pull requests, issues, and contributing
* moving test resource to test directory
* renaming EventHubsClientWrapper to EventHubsClient
* fixing access issues in NameAndPartition
* reorganizing test resources
* Accommodating breaking changes in java client
* Additional tracing in translate method
* Client connection pooling and thread pooling first draft
* Minor bug fix to connection pool
* remove failOnDataLoss option
* Adding EventHubsSink
* Adding send functionality to TestUtils
* First batched writes passing
* More unit tests for EventHubsRelation and EventHubsSink
* Additional Sink tests
* Final Sink test updates
* Adding Sink documentation to integration guide
* Adding databricks docs
* removing concurrent jobs limit in spark streaming
* Check for EventData expiration each batch
* Rebase
* Adding preferred location in Spark Streaming and Struct Streaming
* concurrency bug fix in EventHubsClient
* Minor logging fix
* retry client create until successful
* Update structured-streaming-eventhubs-integration.md
* Update azure_eventhubs_support.md
* Update spark-streaming-eventhubs-integration.md
* Update structured-streaming-eventhubs-integration.md
* Update azure_eventhubs_support.md
* Update README.md
* add toString for simulated eventhubs
* Update structured-streaming-eventhubs-integration.md
* Update spark-streaming-eventhubs-integration.md
* Update azure_eventhubs_support.md
* Updating docs - typo fixes and reorganizing
* fixing NPE in RDD
* Moving to proper Spark 2.3.0 release and Java client 1.0.0 release
* Enabling unit and integration tests in Travis
* Updating CONTRIBUTING.md
* additional traces in client pool
* making naming less verbose
* more naming updates
* moving EventHubsUtils to eventhubs.common - going to add constants there and soon we'll support Spark core
* RateControlUtils cleanup
* additional cleanup
* updating examples to reflect new package location
* Removed unused API and exposed the option to create a DStream from a single Event Hubs instance
* changing variable name for consistency
* removing dependency on EventHubs Namespace parameter
* namespace name does not need to be passed for DStream to be created
* require that only one namespace exists within ehParams
* updating examples to reflect new API
* ignore scalastyle output
* test eventhubs 0.12
* release of 2.0.4
* fix compilation error
* fix NPE
* further fix NPE
* fix classcastexception
* fix failed test cases
* include scalaj to jar file
* do not limit to use WASB
* upgrade to 0.13
* release note of 2.0.4
* longer waiting interval
* restendpoint (#18)
This CR contains workarounds for several issues we found in EventHub/Spark, specifically:
1) Unreliable rest endpoint in EventHub -> fail the application if no response after retry
2) Spark checkpoint issue (https://issues.apache.org/jira/browse/SPARK-19278) -> disable cleanup for progress file but keep that for temp files
3) Too many files in EventHub client -> one-one mapping of Client/Receiver
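
For illustration only, workaround 1) can be pictured as a bounded retry that fails loudly once attempts run out. This is a minimal Python sketch, not the connector's Scala code; `query_with_retry`, `EndpointUnavailableError`, and the default attempt count are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class EndpointUnavailableError(RuntimeError):
    """Raised when no attempt produced a response (hypothetical name)."""


def query_with_retry(
    query: Callable[[], Optional[T]],
    max_attempts: int = 3,       # assumed default, not taken from the connector
    delay_seconds: float = 0.0,  # pause between attempts
) -> T:
    """Call `query` until it returns a non-None response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = query()
        if response is not None:
            return response
        # Sleep only when another attempt will follow.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    # No response after all retries: fail instead of continuing silently.
    raise EndpointUnavailableError(f"no response after {max_attempts} attempts")
```

A caller would wrap the REST request in `query` and let the raised error stop the job, which mirrors the "fail the application" behaviour described in 1).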