WSL2-Linux-Kernel/net/ceph
Ilya Dryomov 67645d7619 libceph: fix ceph_msg_revoke()
There are a number of problems with revoking a "was sending" message:

(1) We never make any attempt to revoke data - only kvecs contibute to
con->out_skip.  However, once the header (envelope) is written to the
socket, our peer learns data_len and sets itself to expect at least
data_len bytes to follow front or front+middle.  If ceph_msg_revoke()
is called while the messenger is sending message's data portion,
anything we send after that call is counted by the OSD towards the now
revoked message's data portion.  The effects vary, the most common one
is the eventual hang - higher layers get stuck waiting for the reply to
the message that was sent out after ceph_msg_revoke() returned and
treated by the OSD as a bunch of data bytes.  This is what Matt ran
into.

(2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs
is wrong.  If ceph_msg_revoke() is called before the tag is sent out or
while the messenger is sending the header, we will get a connection
reset, either due to a bad tag (0 is not a valid tag) or a bad header
CRC, which kind of defeats the purpose of revoke.  Currently the kernel
client refuses to work with header CRCs disabled, but that will likely
change in the future, making this even worse.

(3) con->out_skip is not reset on connection reset, leading to one or
more spurious connection resets if we happen to get a real one between
con->out_skip is set in ceph_msg_revoke() and before it's cleared in
write_partial_skip().

Fixing (1) and (3) is trivial.  The idea behind fixing (2) is to never
zero the tag or the header, i.e. send out tag+header regardless of when
ceph_msg_revoke() is called.  That way the header is always correct, no
unnecessary resets are induced and revoke stands ready for disabled
CRCs.  Since ceph_msg_revoke() rips out con->out_msg, introduce a new
"message out temp" and copy the header into it before sending.

Cc: stable@vger.kernel.org # 4.0+
Reported-by: Matt Conner <matt.conner@keepertech.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Matt Conner <matt.conner@keepertech.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:08 +01:00
..
crush crush: sync up with userspace 2015-06-25 11:49:31 +03:00
Kconfig libceph: select CRYPTO_CBC in addition to CRYPTO_AES 2014-10-14 21:03:20 +04:00
Makefile
armor.c
auth.c
auth_none.c
auth_none.h
auth_x.c libceph: add nocephx_sign_messages option 2015-11-02 23:37:46 +01:00
auth_x.h libceph: store session key in cephx authorizer 2014-12-17 20:09:50 +03:00
auth_x_protocol.h
buffer.c libceph: nuke ceph_kvfree() 2014-12-17 20:09:50 +03:00
ceph_common.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2015-11-13 09:24:40 -08:00
ceph_fs.c
ceph_hash.c
ceph_strings.c libceph: nuke pool op infrastructure 2015-02-19 13:31:37 +03:00
crypto.c KEYS: Merge the type-specific data with the payload data 2015-10-21 15:18:36 +01:00
crypto.h libceph: introduce ceph_x_authorizer_cleanup() 2015-11-02 23:36:48 +01:00
debugfs.c libceph: expose client options through debugfs 2015-04-20 18:55:39 +03:00
messenger.c libceph: fix ceph_msg_revoke() 2016-01-21 19:36:08 +01:00
mon_client.c libceph: use keepalive2 to verify the mon session is alive 2015-09-08 23:14:30 +03:00
msgpool.c
osd_client.c libceph: clear msg->con in ceph_msg_release() only 2015-11-02 23:37:46 +01:00
osdmap.c libceph: set 'exists' flag for newly up osd 2015-09-08 23:14:29 +03:00
pagelist.c libceph: reference counting pagelist 2014-10-14 12:56:48 -07:00
pagevec.c libceph: use kvfree() instead of open-coding it 2015-06-25 11:49:28 +03:00
snapshot.c