drbd: add comment why we want to first call local-io-error, then send state

Even though we really want to get the state information about our bad disk to the peer as soon as possible, it is useful to first call the local-io-error handler. People may chose to hard-reset the box from there. If that looks and behaves exactly like a "regular node crash", without bumping the data generation UUIDs on the peer in between, it makes it easier to deal with. If you intend to return from the local-io-error handler, then better return as quickly as possible to avoid triggering other timeouts. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-02-19 13:43:55 +01:00 · 2015-02-19 13:43:55 +01:00 · dc99562a48
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@ -1859,6 +1859,10 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,

 			was_io_error = test_and_clear_bit(WAS_IO_ERROR, &device->flags);

+			/* Intentionally call this handler first, before drbd_send_state().
+			 * See: 2932204 drbd: call local-io-error handler early
+			 * People may chose to hard-reset the box from this handler.
+			 * It is useful if this looks like a "regular node crash". */
 			if (was_io_error && eh == EP_CALL_HELPER)
 				drbd_khelper(device, "local-io-error");