what/why
The throttling_test was broken by PR #7785, since it depends on the consumer having partitions assigned before the producer starts.
This PR adds the ability to wait for partitions to be assigned in the console consumer before considering it started.
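A sketch of how a test might use this (constructor arguments other than the new flag, and the surrounding producer/kafka objects, are assumed for illustration; the flag name comes from the caveat below):
```python
from kafkatest.services.console_consumer import ConsoleConsumer

# Hypothetical usage sketch inside a ducktape test method; test_context,
# kafka, topic, and producer would come from the enclosing test.
consumer = ConsoleConsumer(test_context, num_nodes=1, kafka=kafka, topic=topic,
                           wait_until_partitions_assigned=True)
consumer.start()   # only considered started once partitions are assigned
producer.start()   # safe: the consumer cannot miss the first messages
```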
caveat
This does not support starting the JmxTool inside the console consumer for custom metrics while using the `wait_until_partitions_assigned` flag, since the code assumes one JmxTool running per node.
I think a proper fix for this would be to make JmxTool its own standalone single-node service.
alternatives
We could use the EndToEnd test suite, which uses the verifiable producer/consumer under the hood, but unfortunately more changes were necessary to get that working (specifically, the suite doesn't seem to play nicely with the ProducerPerformanceService).
Reviewers: Mathew Wong <mwong@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
The test_throttled_reassignment test fails because the consumer used to validate the reassignment does not start in time to consume all messages. This does not seem to be an issue with the throttling of the reassignment itself, since increasing the timeout allowed the test to pass multiple consecutive runs locally.
This test seemed to rely on the default JmxTool for the console consumer that was removed in this commit: 179d0d7
The console consumer would check whether it had partitions assigned before beginning to consume. Although the test occasionally failed with the JmxTool, it began to fail much more often after the removal.
Failure messages followed the format below, with varying numbers of missed messages; the missing messages are the first ones sent by the producer:
535 acked message did not make it to the Consumer. They are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19...plus 515 more. Total Acked: 192792, Total Consumed: 192259. We validated that the first 535 of these missing messages correctly made it into Kafka's data files. This suggests they were lost on their way to the consumer.
In the scope of the test, this error suggests that the test is falling into the race condition described in produce_consume_validate.py, which has the timeout to prevent the consumer from missing initial messages.
This can serve as a temporary fix until the consumer startup logic is addressed more thoroughly.
Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Do not initialize `JmxTool` by default when running console consumer. In order to support this, we remove `has_partitions_assigned` and its only usage in an assertion inside `ProduceConsumeValidateTest`, which did not seem to contribute much to the validation.
Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>
In some system tests, a Streams app is started and then prints a message to stdout, which the system test waits for to confirm the node has been brought up successfully. It then greps for certain log messages in a retriable loop.
But waiting for the Streams app to start and print to stdout does not mean the log file has been created yet, so the grep may return an error. Although this occurs in a retriable loop, it is assumed that grep will not fail: the result is piped to wc and then blindly converted to an int in the Python function, which throws a ValueError when the output is an error message rather than a number.
We should catch the ValueError and return 0 so the loop can try again, rather than crashing immediately.
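A minimal sketch of the suggested handling, assuming ducktape's `ssh_output` helper (the function and pipeline shape are illustrative, not the exact test code):
```python
def occurrences_in_log(node, pattern, log_file):
    # grep can fail if the log file has not been created yet; its error
    # message then reaches this function instead of a count.
    output = node.account.ssh_output(
        "grep '%s' %s | wc -l" % (pattern, log_file), allow_fail=True)
    try:
        return int(output)
    except ValueError:
        # Not a number yet -- report zero matches and let the retriable
        # loop call us again once the log file exists.
        return 0
```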
Reviewers: Bill Bejeck <bbejeck@gmail.com>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
Two tests using 50k replicas on 8 brokers:
* Do a rolling restart with clean shutdown, delete topics
* Run produce bench and consumer bench on a subset of topics
Reviewed-By: David Jacot <djacot@confluent.io>, Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>
Recently, the system tests test_rebalance_[simple|complex] failed
repeatedly with a verification error. The most probable cause was
the missing clean-up of the state directory of one of the processors.
A node is cleaned up when a service on that node is started and when
a test is torn down.
If the clean-up flag clean_node_enabled of an EOS Streams service is
unset, the clean-up of the node is skipped.
The clean-up flag of processor1 in the EOS tests should stay set before
its first start, so that the node is cleaned before the service is started.
For the multiple restarts of processor1 that follow, the clean-up flag should
be unset to re-use the local state.
After the multiple restarts are done, the clean-up flag of processor1 should
again be set to trigger node clean-up during the test teardown.
A dirty node can lead to test failures when tests from the Streams EOS suite are
scheduled on the same node, because the state store would not start empty,
since it reads the local state that was not cleaned up.
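In ducktape terms, the intended handling of the flag looks roughly like this (processor1 is the EOS Streams service from the test; the restart count is illustrative):
```python
processor1.start()                     # flag still set: node cleaned before first start
processor1.clean_node_enabled = False  # re-use the local state across restarts
for _ in range(2):                     # illustrative number of restarts
    processor1.stop()
    processor1.start()
processor1.clean_node_enabled = True   # set again: teardown cleans the node
```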
Reviewers: Matthias J. Sax <mjsax@apache.org>, Andrew Choi <andchoi@linkedin.com>, Bill Bejeck <bbejeck@gmail.com>
The consumer's `committed` API does not return an entry in the response map for a requested partition if there is no committed offset. The transactional message copier, which is used in the transaction system test, did not account for this. If the first transaction attempted by the copier was randomly aborted, then we would not seek to the beginning as expected, which means we would fail to copy some of the records.
This patch fixes the problem by iterating over the assignment rather than the result of `committed` when resetting offsets. It also enables additional logging in the transactional message copier service to make it easier to find problems in the future.
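A Python analogue of the fix (the copier itself is Java); kafka-python's `committed` returns None when a partition has no committed offset, mirroring the missing map entry described above:
```python
from kafka import KafkaConsumer

def reset_offsets(consumer: KafkaConsumer):
    # Iterate over the assignment, not over the committed-offsets result,
    # so partitions without a committed offset still get rewound.
    for tp in consumer.assignment():
        offset = consumer.committed(tp)  # None if nothing was ever committed
        if offset is None:
            consumer.seek_to_beginning(tp)
        else:
            consumer.seek(tp, offset)
```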
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes #7653 from hachikuji/fix-transaction-system-test
(cherry picked from commit 903d66e2f9)
Signed-off-by: Manikumar Reddy <manikumar@confluent.io>
We added more version info to the log messages but must not have cleanly backported the test update -- looking back, it seems like older versions might also need a fix (separate PR, as the version numbers differ in 2.1-2.3).
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Inside onLeavePrepare we would look into the assignment and try to revoke the owned tasks and notify users via RebalanceListener#onPartitionsRevoked, and then clear the assignment.
However, the subscription's assignment is already cleared in `this.subscriptions.unsubscribe();`, which means the user's rebalance listener would never be triggered. In other words, from the consumer client's point of view nothing is owned after unsubscribe, but from the caller's point of view the partitions have not been revoked yet. For callers like Kafka Streams, which rely on the rebalance listener to maintain their internal state, this leads to inconsistent state management and failure cases.
Before KIP-429 this issue was hidden, since every time the consumer re-joined the group it would revoke everything anyway, regardless of the passed-in parameters of the rebalance listener; with KIP-429 it is much easier to reproduce.
Our fixes are the following (a toy sketch of the ordering follows the list):
• Inside unsubscribe, first do onLeavePrepare / maybeLeaveGroup and then subscription.unsubscribe. This way we are guaranteed that the Streams tasks are all closed as revoked by then.
• [Optimization] If the generation is reset due to a fatal error from a join / heartbeat response etc., then we know that all partitions are lost, and we should not trigger onPartitionsRevoked but instead just onPartitionsLost inside onLeavePrepare. This is because we don't want to commit for lost partitions during a rebalance that is doomed to fail, since we don't have any generation info.
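A toy model of the ordering fix (not the actual consumer internals, just the shape of the bug and the fix):
```python
class Subscriptions:
    def __init__(self, partitions):
        self.owned = set(partitions)

    def unsubscribe(self):
        self.owned.clear()

class Consumer:
    def __init__(self, partitions, listener):
        self.subscriptions = Subscriptions(partitions)
        self.listener = listener

    def _on_leave_prepare(self, lost=False):
        owned = set(self.subscriptions.owned)  # empty if already cleared!
        if lost:
            self.listener.on_partitions_lost(owned)
        else:
            self.listener.on_partitions_revoked(owned)

    def unsubscribe(self):
        # Fixed ordering: fire the listener while the assignment is still
        # known, *then* clear it. The bug did these the other way around,
        # so the listener always saw an empty set.
        self._on_leave_prepare()
        self.subscriptions.unsubscribe()
```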
Reviewers: Matthias J. Sax <matthias@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
MM2 added a few connector classes to Connect's classpath, and given these additions, the assertions in the Connect REST system tests need to be adjusted.
This fix makes sure that the loaded Connect plugins are a superset of the connectors expected by the test.
Testing: The change is straightforward. The fix was tested with local system test runs.
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Closes #7578 from kkonstantine/minor-fix-connect-test-after-mm2-classes
All four flavors of the repartition/optimization tests have been reported as flaky and failed in one place or another:
* RepartitionOptimizingIntegrationTest.shouldSendCorrectRecords_OPTIMIZED
* RepartitionOptimizingIntegrationTest.shouldSendCorrectRecords_NO_OPTIMIZATION
* RepartitionWithMergeOptimizingIntegrationTest.shouldSendCorrectRecords_OPTIMIZED
* RepartitionWithMergeOptimizingIntegrationTest.shouldSendCorrectRecords_NO_OPTIMIZATION
They're pretty similar so it makes sense to knock them all out at once. This PR does three things:
* Switch to in-memory stores wherever possible
* Name all operators and update the Topology accordingly (not really a flaky-test fix, but the topology names had to be updated anyway because of the in-memory stores, so we figured we might as well)
* Port to TopologyTestDriver -- this is the "real" fix; it should make a big difference, as these repartition tests required multiple round trips to the Kafka cluster (while using only the default timeout)
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Key improvements with this PR:
* tasks will remain available for IQ during a rebalance (but not during restore)
* continue restoring and processing standby tasks during a rebalance
* continue processing active tasks during rebalance until the RecordQueue is empty
* only revoked tasks must be suspended/closed
* StreamsPartitionAssignor tries to return tasks to their previous consumers within a client
* but does not try to commit, for now (pending KAFKA-7312)
Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Small follow-up to trunk PR #7423
While debugging the 2.3 VP PR, we realized we should remove the leader tracking from the VP system test altogether. We'd already merged the corresponding trunk PR, so I made a quick new PR for trunk (this also fixes a missed version bump in one of the log messages).
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Implemented KIP-507 to secure the internal Connect REST endpoints that are only for intra-cluster communication. A new V2 of the Connect subprotocol enables this feature, where the leader generates a new session key, shares it with the other workers via the configuration topic, and workers send and validate requests to these internal endpoints using the shared key.
Currently the internal `POST /connectors/<connector>/tasks` endpoint is the only one that is secured.
This change adds unit tests and makes some small alterations to system tests to target the new `sessioned` Connect subprotocol. A new integration test ensures that the endpoint is actually secured (i.e., requests with missing/invalid signatures are rejected with a 400 Bad Request status).
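The request-signing idea, sketched with Python's hmac module (the exact algorithm, header layout, and key distribution are Connect implementation details not shown here):
```python
import base64
import hashlib
import hmac

def sign(session_key: bytes, request_body: bytes) -> str:
    # HMAC the request body with the shared session key (algorithm illustrative).
    digest = hmac.new(session_key, request_body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

def is_valid(session_key: bytes, request_body: bytes, signature: str) -> bool:
    # Constant-time comparison; a missing or invalid signature is rejected.
    return hmac.compare_digest(sign(session_key, request_body), signature)
```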
Author: Chris Egerton <chrise@confluent.io>
Reviewed: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
Instead of sending the leader's version and having older members try to blindly upgrade.
The only other real change here is that we also set the VERSION_PROBING error code and return early from onAssignment when we are upgrading the subscription version we use (not just when downgrading it), since this implies the whole group has finished the rolling upgrade and all members should rejoin with the new subscription version.
This also piggy-backs a fix for a potentially dangerous edge case where every thread of an instance is assigned the same set of active tasks.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* Leader instance uses dictionary encoding on the wire to send topic partitions
* Topic names (most expensive component) are mapped to an integer using the dictionary
* Follower instances receive the dictionary and decode the topic names back (see the sketch after this list)
* Purely an on-the-wire optimization, no in-memory structures changed
* Test case added for version 5 AssignmentInfo
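The idea in miniature (an illustration of dictionary encoding, not the actual AssignmentInfo wire format):
```python
def encode(assignment):
    # assignment: {consumer_id: [(topic, partition), ...]}
    topics = sorted({t for tps in assignment.values() for t, _ in tps})
    index = {t: i for i, t in enumerate(topics)}
    compact = {c: [(index[t], p) for t, p in tps]
               for c, tps in assignment.items()}
    return topics, compact  # dictionary sent once; topic names become small ints

def decode(topics, compact):
    return {c: [(topics[i], p) for i, p in tps]
            for c, tps in compact.items()}
```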
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Fix a bug where ClientCompatibilityFeaturesTest fails when running multiple iterations.
Also, fix a typo in tests/docker/Dockerfile.
Reviewers: Ismael Juma <ismael@juma.me.uk>
This adds a basic system test that enables rack-aware brokers with the rack-aware replica selector for fetch from followers (KIP-392). The test asserts that the follower was read from at least once and that all the messages that were produced were successfully consumed.
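The configuration involved is roughly the following (the property names are the real KIP-392 ones; the values are illustrative):
```python
broker_props = {
    "broker.rack": "rack-a",
    "replica.selector.class":
        "org.apache.kafka.common.replica.RackAwareReplicaSelector",
}
consumer_props = {
    "client.rack": "rack-a",  # consumer prefers replicas in its own rack
}
```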
Reviewers: Jason Gustafson <jason@confluent.io>
Corrected the AbstractHerder to correctly identify task configs that contain variables for externalized secrets. The original method incorrectly used `matcher.matches()` instead of `matcher.find()`. The former expects the entire string to match the regex, whereas the latter can find the pattern anywhere within the input string (which is what this use case requires).
Added unit tests to cover various cases of a config with externalized secrets, and updated system tests to cover the case where a config value contains additional characters besides the secret, requiring the regex pattern to be found anywhere in the string (as opposed to matching the complete string).
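A Python analogue of the distinction (the pattern here is simplified, not Connect's actual one); `re.fullmatch` behaves like Java's `matches()` and `re.search` like `find()`:
```python
import re

pattern = re.compile(r"\$\{([^}:]+):([^}:]+):([^}]+)\}")  # simplified variable syntax
value = "jdbc:mysql://db-host/db?password=${file:/run/secrets.properties:db-password}"

print(bool(pattern.fullmatch(value)))  # False: the entire string must match
print(bool(pattern.search(value)))     # True: the variable is found within the string
```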
Author: Arjun Satish <arjun@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
To avoid transient system test failures, tolerate a small amount of data loss due to truncation in upgrade system tests that use older message formats predating KIP-101, where data loss was possible.
Reviewers: Ismael Juma <ismael@juma.me.uk>
ZooKeeper 3.5.5 is the first stable release in the 3.5.x series. The key new feature
is TLS support, but there are a few more noteworthy features:
* Dynamic reconfiguration
* Local sessions
* New node types: Container, TTL
* Ability to remove watchers
* Multi-threaded commit processor
* Upgraded to Netty 4.1
See the release notes for more detail:
https://zookeeper.apache.org/doc/r3.5.5/releasenotes.html
In addition to the version bump, we:
* Add `commons-cli` dependency as it's required by `ZooKeeperMain`, but specified as
`provided` in their pom.
* Remove unnecessary `ZooKeeperMainWrapper`, the bug it worked around was fixed
upstream a long time ago.
* Ignore non-zero exit in one system test invocation of `ZooKeeperMain`:
`ZooKeeperMainWrapper` always returned `0`, and `ZooKeeperService.query` relies
on that for correct behavior.
Reviewers: Jason Gustafson <jason@confluent.io>
ZkUtils was removed so we don't need this anymore.
Also:
* Fix ZkSecurityMigrator and ReplicaManagerTest not to
reference ZkClient classes.
* Remove references to zkclient in various `log4j.properties`
and `import-control.xml`.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
Follow on to #6731, this PR adds broker-side support for [KIP-392](https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica) (fetch from followers).
Changes:
* All brokers will handle FetchRequest regardless of leadership
* Leaders can compute a preferred replica to return to the client
* New ReplicaSelector interface for determining the preferred replica
* Incremental fetches will include partitions with no records if the preferred replica has been computed
* Adds new JMX to expose the current preferred read replica of a partition in the consumer
Two new conditions were added for completing a delayed fetch. They both relate to communicating the high watermark to followers without waiting for a timeout:
* For regular fetches, if the high watermark changes within a single fetch request
* For incremental fetch sessions, if the follower's high watermark is lower than the leader's
A new JMX attribute `preferred-read-replica` was added to the `kafka.consumer:type=consumer-fetch-manager-metrics,client-id=some-consumer,topic=my-topic,partition=0` object. This was added to support the new system test which verifies that the fetch from follower behavior works end-to-end. This attribute could also be useful in the future when debugging problems with the consumer.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>
Connect tests were using a String version for KafkaService instead of the expected KafkaVersion object. This broke due to recent changes to KafkaVersion. It turns out that the tests with String versions were running compatibility tests against `dev` brokers rather than the older broker versions they were expecting to run against. When the version parameter was fixed, tests using 0.9.0.1 brokers started failing, since new clients are not compatible with 0.9.0.1 brokers. So this PR fixes the version parameter and removes the two tests against 0.9.0.1 brokers.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>
This adds a new Trogdor fault spec for inducing network latency on a network device for system testing. It operates very similarly to the existing network partition spec by executing the `tc` linux utility.
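For reference, inducing latency with `tc` looks roughly like this (the device name and delay are illustrative; the actual spec drives this through Trogdor agents):
```python
import subprocess

# Add 100 ms of latency on eth0 via the netem queueing discipline (needs root).
subprocess.run(["tc", "qdisc", "add", "dev", "eth0", "root", "netem",
                "delay", "100ms"], check=True)
# ... exercise the system under latency, then remove the rule.
subprocess.run(["tc", "qdisc", "del", "dev", "eth0", "root", "netem"],
               check=True)
```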
This PR fixes a bug in static group membership. Previously we limited the `member.id` replacement in JoinGroup to cases where the group is in the Stable state. This is error-prone and could potentially allow duplicate consumers reading from the same topic. For example, imagine a case where two unknown members join in the `PrepareRebalance` stage at the same time.
The PR fixes the following things (a client-side config sketch follows the list):
1. Replace `member.id` whenever we see a known static member rejoin the group with an unknown member.id.
2. Immediately fence any ongoing join/sync group callback to terminate the duplicate member early.
3. Clearly handle the Dead/Empty cases as exceptional.
4. Return the old leader id upon static member leader rejoin to avoid a trivial member assignment being triggered.
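For context, a static member is identified by `group.instance.id` (a real consumer config from KIP-345); a sketch of the client side, with confluent-kafka style property names:
```python
conf = {
    "bootstrap.servers": "localhost:9092",   # illustrative values
    "group.id": "my-group",
    "group.instance.id": "consumer-1",       # static membership: with this fix the
                                             # broker replaces the member.id whenever
                                             # this static member rejoins
}
```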
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
We've seen `ReplicaVerificationToolTest.test_replica_lags` fail occasionally due to errors such as the following:
```
RemoteCommandError: ubuntuworker7: Command 'kill -15 2896' returned non-zero exit status 1. Remote error message: bash: line 0: kill: (2896) - No such process
```
The problem seems to be a shutdown race condition when using `max_messages` with the producer: the process may already be gone, which causes the kill signal to fail.
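A sketch of tolerating the race in ducktape terms (the error-message matching is assumed from the failure shown above):
```python
from ducktape.cluster.remoteaccount import RemoteCommandError

def stop_producer(node):
    try:
        node.account.kill_process("producer", clean_shutdown=True)
    except RemoteCommandError as e:
        # With max_messages the process may have exited on its own already,
        # so a missing pid is expected rather than a real failure.
        if "No such process" not in str(e):
            raise
```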
Author: Jason Gustafson <jason@confluent.io>
Reviewers: Gwen Shapira
Closes #6906 from hachikuji/fix-failing-replicat-verification-test