Граф коммитов

1129 Коммитов

Автор SHA1 Сообщение Дата
Ilya Dryomov 20cf67dcb7 libceph: fix race between delayed_work() and ceph_monc_stop()
commit 69c7b2fe4c9cc1d3b1186d1c5606627ecf0de883 upstream.

The way the delayed work is handled in ceph_monc_stop() is prone to
races with mon_fault() and possibly also finish_hunting().  Both of
these can requeue the delayed work which wouldn't be canceled by any of
the following code in case that happens after cancel_delayed_work_sync()
runs -- __close_session() doesn't mess with the delayed work in order
to avoid interfering with the hunting interval logic.  This part was
missed in commit b5d91704f5 ("libceph: behave in mon_fault() if
cur_mon < 0") and use-after-free can still ensue on monc and objects
that hang off of it, with monc->auth and monc->monmap being
particularly susceptible to quickly being reused.

To fix this:

- clear monc->cur_mon and monc->hunting as part of closing the session
  in ceph_monc_stop()
- bail from delayed_work() if monc->cur_mon is cleared, similar to how
  it's done in mon_fault() and finish_hunting() (based on monc->hunting)
- call cancel_delayed_work_sync() after the session is closed

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/66857
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-18 13:07:42 +02:00
Jordan Rife 58f0e6324e libceph: use kernel_connect()
commit 7563cf17dc upstream.

Direct calls to ops->connect() can overwrite the address parameter when
used in conjunction with BPF SOCK_ADDR hooks. Recent changes to
kernel_connect() ensure that callers are insulated from such side
effects. This patch wraps the direct call to ops->connect() with
kernel_connect() to prevent unexpected changes to the address passed to
ceph_tcp_connect().

This change was originally part of a larger patch targeting the net tree
addressing all instances of unprotected calls to ops->connect()
throughout the kernel, but this change was split up into several patches
targeting various trees.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/netdev/20230821100007.559638-1-jrife@google.com/
Link: https://lore.kernel.org/netdev/9944248dba1bce861375fcce9de663934d933ba9.camel@redhat.com/
Fixes: d74bad4e74 ("bpf: Hooks for sys_connect")
Signed-off-by: Jordan Rife <jrife@google.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-19 23:05:36 +02:00
Ilya Dryomov cd6872f2cf libceph: fix potential hang in ceph_osdc_notify()
commit e6e2843230 upstream.

If the cluster becomes unavailable, ceph_osdc_notify() may hang even
with osd_request_timeout option set because linger_notify_finish_wait()
waits for MWatchNotify NOTIFY_COMPLETE message with no associated OSD
request in flight -- it's completely asynchronous.

Introduce an additional timeout, derived from the specified notify
timeout.  While at it, switch both waits to killable which is more
correct.

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11 15:13:55 +02:00
Ilya Dryomov bb25c5c0e4 rbd: harden get_lock_owner_info() a bit
commit 8ff2c64c97 upstream.

- we want the exclusive lock type, so test for it directly
- use sscanf() to actually parse the lock cookie and avoid admitting
  invalid handles
- bail if locker has a blank address

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 10:22:46 +02:00
Ilya Dryomov db8ca8d9b4 libceph: harden msgr2.1 frame segment length checks
commit a282a2f105 upstream.

ceph_frame_desc::fd_lens is an int array.  decode_preamble() thus
effectively casts u32 -> int but the checks for segment lengths are
written as if on unsigned values.  While reading in HELLO or one of the
AUTH frames (before authentication is completed), arithmetic in
head_onwire_len() can get duped by negative ctrl_len and produce
head_len which is less than CEPH_PREAMBLE_LEN but still positive.
This would lead to a buffer overrun in prepare_read_control() as the
preamble gets copied to the newly allocated buffer of size head_len.

Cc: stable@vger.kernel.org
Fixes: cd1a677cad ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Reported-by: Thelford Williams <thelford@google.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23 13:47:53 +02:00
Ilya Dryomov 24501d51c6 libceph: fix potential use-after-free on linger ping and resends
commit 75dbb685f4 upstream.

request_reinit() is not only ugly as the comment rightfully suggests,
but also unsafe.  Even though it is called with osdc->lock held for
write in all cases, resetting the OSD request refcount can still race
with handle_reply() and result in use-after-free.  Taking linger ping
as an example:

    handle_timeout thread                     handle_reply thread

                                              down_read(&osdc->lock)
                                              req = lookup_request(...)
                                              ...
                                              finish_request(req)  # unregisters
                                              up_read(&osdc->lock)
                                              __complete_request(req)
                                                linger_ping_cb(req)

      # req->r_kref == 2 because handle_reply still holds its ref

    down_write(&osdc->lock)
    send_linger_ping(lreq)
      req = lreq->ping_req  # same req
      # cancel_linger_request is NOT
      # called - handle_reply already
      # unregistered
      request_reinit(req)
        WARN_ON(req->r_kref != 1)  # fires
        request_init(req)
          kref_init(req->r_kref)

                   # req->r_kref == 1 after kref_init

                                              ceph_osdc_put_request(req)
                                                kref_put(req->r_kref)

            # req->r_kref == 0 after kref_put, req is freed

        <further req initialization/use> !!!

This happens because send_linger_ping() always (re)uses the same OSD
request for watch ping requests, relying on cancel_linger_request() to
unregister it from the OSD client and rip its messages out from the
messenger.  send_linger() does the same for watch/notify registration
and watch reconnect requests.  Unfortunately cancel_request() doesn't
guarantee that after it returns the OSD client would be completely done
with the OSD request -- a ref could still be held and the callback (if
specified) could still be invoked too.

The original motivation for request_reinit() was inability to deal with
allocation failures in send_linger() and send_linger_ping().  Switching
to using osdc->req_mempool (currently only used by CephFS) respects that
and allows us to get rid of request_reinit().

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-25 09:57:28 +02:00
Baokun Li dc915ecde8 libceph: fix doc warnings in cls_lock_client.c
Add description to fix the following W=1 kernel build warnings:

 net/ceph/cls_lock_client.c:28: warning: Function parameter or
  member 'osdc' not described in 'ceph_cls_lock'
 net/ceph/cls_lock_client.c:28: warning: Function parameter or
  member 'oid' not described in 'ceph_cls_lock'
 net/ceph/cls_lock_client.c:28: warning: Function parameter or
  member 'oloc' not described in 'ceph_cls_lock'

[ idryomov: tweak osdc description ]

Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-28 23:49:25 +02:00
zuoqilin da6ebb4d67 libceph: remove unnecessary ret variable in ceph_auth_init()
There is no necessary to define variable assignment, just return
directly to simplify the steps.

Signed-off-by: zuoqilin <zuoqilin@yulong.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-28 23:49:25 +02:00
Zheng Yongjun 1e6de263d1 libceph: fix some spelling mistakes
Fix some spelling mistakes in comments:

- enconding ==> encoding
- ambigous ==> ambiguous
- orignal ==> original
- encyption ==> encryption

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-28 23:49:25 +02:00
Ilya Dryomov d71a95e7ff libceph: kill ceph_none_authorizer::reply_buf
We never receive authorizer replies with cephx disabled, so it is
bogus.  Also, it still uses the old zero-length array style.

Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-06-28 23:49:25 +02:00
Ilya Dryomov 03af4c7bad libceph: set global_id as soon as we get an auth ticket
Commit 61ca49a910 ("libceph: don't set global_id until we get an
auth ticket") delayed the setting of global_id too much.  It is set
only after all tickets are received, but in pre-nautilus clusters an
auth ticket and the service tickets are obtained in separate steps
(for a total of three MAuth replies).  When the service tickets are
requested, global_id is used to build an authorizer; if global_id is
still 0 we never get them and fail to establish the session.

Moving the setting of global_id into protocol implementations.  This
way global_id can be set exactly when an auth ticket is received, not
sooner nor later.

Fixes: 61ca49a910 ("libceph: don't set global_id until we get an auth ticket")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24 21:03:17 +02:00
Ilya Dryomov 3c0d089432 libceph: don't pass result into ac->ops->handle_reply()
There is no result to pass in msgr2 case because authentication
failures are reported through auth_bad_method frame and in MAuth
case an error is returned immediately.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-06-24 21:03:16 +02:00
Linus Torvalds 7ac86b3dca Notable items here are a series to take advantage of David Howells'
netfs helper library from Jeff, three new filesystem client metrics
 from Xiubo, ceph.dir.rsnaps vxattr from Yanhu and two auth-related
 fixes from myself, marked for stable.  Interspersed is a smattering
 of assorted fixes and cleanups across the filesystem.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmCT8IITHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzizgqCACYbyY4Yr/2C8fZsn+P9rd97zRTbcC6
 eufTZwnlECLnc89BxJQRk9a2UpDJfC8RMM3/9tmiulc8G4M+ggVbdFQTCzsZox3c
 vLAunGeVyfKIY+16Bv2RNuoO3KeeZm5aB3jXJ5QcUPcXmd4XnHKI1FU2ebC56UJb
 pxxfHpE6fb59r6Ek1e5uUFyta4KDMrvwXozghuAPEgT1GpKeA9zMIGI0CkQbBHlW
 PWHpcahTiT6GWa/d9ud0CnfssiBxVydWyKTz9xppYC6LNdsZUf9tBmYYGRklcjoA
 yAwPSuqxNmg+7uWubEawc0+a/3fXORgp2SF7Rbp1XYE+HpfnMF1J+nIn
 =IO5c
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.13-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "Notable items here are

   - a series to take advantage of David Howells' netfs helper library
     from Jeff

   - three new filesystem client metrics from Xiubo

   - ceph.dir.rsnaps vxattr from Yanhu

   - two auth-related fixes from myself, marked for stable.

  Interspersed is a smattering of assorted fixes and cleanups across the
  filesystem"

* tag 'ceph-for-5.13-rc1' of git://github.com/ceph/ceph-client: (24 commits)
  libceph: allow addrvecs with a single NONE/blank address
  libceph: don't set global_id until we get an auth ticket
  libceph: bump CephXAuthenticate encoding version
  ceph: don't allow access to MDS-private inodes
  ceph: fix up some bare fetches of i_size
  ceph: convert some PAGE_SIZE invocations to thp_size()
  ceph: support getting ceph.dir.rsnaps vxattr
  ceph: drop pinned_page parameter from ceph_get_caps
  ceph: fix inode leak on getattr error in __fh_to_dentry
  ceph: only check pool permissions for regular files
  ceph: send opened files/pinned caps/opened inodes metrics to MDS daemon
  ceph: avoid counting the same request twice or more
  ceph: rename the metric helpers
  ceph: fix kerneldoc copypasta over ceph_start_io_direct
  ceph: use attach/detach_page_private for tracking snap context
  ceph: don't use d_add in ceph_handle_snapdir
  ceph: don't clobber i_snap_caps on non-I_NEW inode
  ceph: fix fall-through warnings for Clang
  ceph: convert ceph_readpages to ceph_readahead
  ceph: convert ceph_write_begin to netfs_write_begin
  ...
2021-05-06 10:27:02 -07:00
Ilya Dryomov 3f1c6f2122 libceph: allow addrvecs with a single NONE/blank address
Normally, an unused OSD id/slot is represented by an empty addrvec.
However, it also appears to be possible to generate an osdmap where
an unused OSD id/slot has an addrvec with a single blank address of
type NONE.  Allow such addrvecs and make the end result be exactly
the same as for the empty addrvec case -- leave addr intact.

Cc: stable@vger.kernel.org # 5.11+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-05-04 16:06:15 +02:00
Ilya Dryomov 61ca49a910 libceph: don't set global_id until we get an auth ticket
With the introduction of enforcing mode, setting global_id as soon
as we get it in the first MAuth reply will result in EACCES if the
connection is reset before we get the second MAuth reply containing
an auth ticket -- because on retry we would attempt to reclaim that
global_id with no auth ticket at hand.

Neither ceph_auth_client nor ceph_mon_client depend on global_id
being set ealy, so just delay the setting until we get and process
the second MAuth reply.  While at it, complain if the monitor sends
a zero global_id or changes our global_id as the session is likely
to fail after that.

Cc: stable@vger.kernel.org # needs backporting for < 5.11
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2021-04-27 23:52:24 +02:00
Ilya Dryomov 7807dafda2 libceph: bump CephXAuthenticate encoding version
A dummy v3 encoding (exactly the same as v2) was introduced so that
the monitors can distinguish broken clients that may not include their
auth ticket in CEPHX_GET_AUTH_SESSION_KEY request on reconnects, thus
failing to prove previous possession of their global_id (one part of
CVE-2021-20288).

The kernel client has always included its auth ticket, so it is
compatible with enforcing mode as is.  However we want to bump the
encoding version to avoid having to authenticate twice on the initial
connect -- all legacy (CephXAuthenticate < v3) are now forced do so in
order to expose insecure global_id reclaim.

Marking for stable since at least for 5.11 and 5.12 it is trivial
(v2 -> v3).

Cc: stable@vger.kernel.org # 5.11+
URL: https://tracker.ceph.com/issues/50452
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2021-04-27 23:52:24 +02:00
Lu Wei 3f9143f10c net: ceph: Fix a typo in osdmap.c
Modify "inital" to "initial" in net/ceph/osdmap.c.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 17:05:07 -07:00
Ilya Dryomov d7ef2e59e3 libceph: remove osdtimeout option entirely
Commit 83aff95eb9 ("libceph: remove 'osdtimeout' option") deprecated
osdtimeout over 8 years ago, but it is still recognized.  Let's remove
it entirely.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-02-16 12:09:52 +01:00
Ilya Dryomov afd56e78dd libceph: deprecate [no]cephx_require_signatures options
These options were introduced in 3.19 with support for message signing
and are rather useless, as explained in commit a51983e4dd ("libceph:
add nocephx_sign_messages option").  Deprecate them.

In case there is someone out there with a cluster that lacks support
for MSG_AUTH feature (very unlikely but has to be considered since we
haven't formally raised the bar from argonaut to bobtail yet), make
nocephx_sign_messages also waive MSG_AUTH requirement.  This is probably
how it should have been done in the first place -- if we aren't going
to sign, requiring the signing feature makes no sense.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-02-16 12:09:52 +01:00
Ilya Dryomov 9d5ae6f3c5 libceph: fix "Boolean result is used in bitwise operation" warning
This line dates back to 2013, but cppcheck complained because commit
2f713615dd ("libceph: move msgr1 protocol implementation to its own
file") moved it.  Add parenthesis to silence the warning.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-01-21 16:49:59 +01:00
Ilya Dryomov 4972cf605f libceph, ceph: disambiguate ceph_connection_operations handlers
Since a few years, kernel addresses are no longer included in oops
dumps, at least on x86.  All we get is a symbol name with offset and
size.

This is a problem for ceph_connection_operations handlers, especially
con->ops->dispatch().  All three handlers have the same name and there
is little context to disambiguate between e.g. monitor and OSD clients
because almost everything is inlined.  gdb sneakily stops at the first
matching symbol, so one has to resort to nm and addr2line.

Some of these are already prefixed with mon_, osd_ or mds_.  Let's do
the same for all others.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
2021-01-04 17:31:32 +01:00
Ilya Dryomov 10f42b3e64 libceph: zero out session key and connection secret
Try and avoid leaving bits and pieces of session key and connection
secret (gets split into GCM key and a pair of GCM IVs) around.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2021-01-04 17:31:32 +01:00
Ilya Dryomov f5f2c9a0e3 libceph: align session_key and con_secret to 16 bytes
crypto_shash_setkey() and crypto_aead_setkey() will do a (small)
GFP_ATOMIC allocation to align the key if it isn't suitably aligned.
It's not a big deal, but at the same time easy to avoid.

The actual alignment requirement is dynamic, queryable with
crypto_shash_alignmask() and crypto_aead_alignmask(), but shouldn't
be stricter than 16 bytes for our algorithms.

Fixes: cd1a677cad ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28 20:34:33 +01:00
Ilya Dryomov ad32fe8801 libceph: fix auth_signature buffer allocation in secure mode
auth_signature frame is 68 bytes in plain mode and 96 bytes in
secure mode but we are requesting 68 bytes in both modes.  By luck,
this doesn't actually result in any invalid memory accesses because
the allocation is satisfied out of kmalloc-96 slab and so exactly
96 bytes are allocated, but KASAN rightfully complains.

Fixes: cd1a677cad ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Reported-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28 20:34:32 +01:00
Linus Torvalds be695ee29e The big ticket item here is support for msgr2 on-wire protocol, which
adds the option of full in-transit encryption using AES-GCM algorithm
 (myself).  On top of that we have a series to avoid intermittent
 errors during recovery with recover_session=clean and some MDS request
 encoding work from Jeff, a cap handling fix and assorted observability
 improvements from Luis and Xiubo and a good number of cleanups.  Luis
 also ran into a corner case with quotas which sadly means that we are
 back to denying cross-quota-realm renames.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl/beWITHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi4i0CACnvd87l2n7dndig7p5d5lVsmo8tAFs
 wHYHaIVisWKMcqKoT+YJajSgzaonxjzvYiyCzwLxV7s7vI7cswAwjEfYT7tTDRp2
 pnO1+4N/1ftznnTk/1QdqwOQLUg5UtdgWvFCaXQF+Vr/YroZomKJPaK8fXK882pC
 9FBjoLNy1HWySsoXPCxJktmDzpEEyYRNJg0vquxm7mxwTgQErupWlwEFjNg5LBkm
 gC0UoKhCE3DeUrXnoq21Ga62RIajxHofTooNx7dg+JiSVgluW+nORaWDYJXNzwLC
 j5puSe4pWIah+gmcwIFuyNz4ddkvVL4URvsYPGkVFYXlEefQjErc10Jh
 =6b9f
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.11-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "The big ticket item here is support for msgr2 on-wire protocol, which
  adds the option of full in-transit encryption using AES-GCM algorithm
  (myself).

  On top of that we have a series to avoid intermittent errors during
  recovery with recover_session=clean and some MDS request encoding work
  from Jeff, a cap handling fix and assorted observability improvements
  from Luis and Xiubo and a good number of cleanups.

  Luis also ran into a corner case with quotas which sadly means that we
  are back to denying cross-quota-realm renames"

* tag 'ceph-for-5.11-rc1' of git://github.com/ceph/ceph-client: (59 commits)
  libceph: drop ceph_auth_{create,update}_authorizer()
  libceph, ceph: make use of __ceph_auth_get_authorizer() in msgr1
  libceph, ceph: implement msgr2.1 protocol (crc and secure modes)
  libceph: introduce connection modes and ms_mode option
  libceph, rbd: ignore addr->type while comparing in some cases
  libceph, ceph: get and handle cluster maps with addrvecs
  libceph: factor out finish_auth()
  libceph: drop ac->ops->name field
  libceph: amend cephx init_protocol() and build_request()
  libceph, ceph: incorporate nautilus cephx changes
  libceph: safer en/decoding of cephx requests and replies
  libceph: more insight into ticket expiry and invalidation
  libceph: move msgr1 protocol specific fields to its own struct
  libceph: move msgr1 protocol implementation to its own file
  libceph: separate msgr1 protocol implementation
  libceph: export remaining protocol independent infrastructure
  libceph: export zero_page
  libceph: rename and export con->flags bits
  libceph: rename and export con->state states
  libceph: make con->state an int
  ...
2020-12-17 11:53:52 -08:00
Ilya Dryomov 2f0df6cfa3 libceph: drop ceph_auth_{create,update}_authorizer()
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov ce287162d9 libceph, ceph: make use of __ceph_auth_get_authorizer() in msgr1
This shouldn't cause any functional changes.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov cd1a677cad libceph, ceph: implement msgr2.1 protocol (crc and secure modes)
Implement msgr2.1 wire protocol, available since nautilus 14.2.11
and octopus 15.2.5.  msgr2.0 wire protocol is not implemented -- it
has several security, integrity and robustness issues and therefore
considered deprecated.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 00498b9941 libceph: introduce connection modes and ms_mode option
msgr2 supports two connection modes: crc (plain) and secure (on-wire
encryption).  Connection mode is picked by server based on input from
client.

Introduce ms_mode option:

  ms_mode=legacy        - msgr1 (default)
  ms_mode=crc           - crc mode, if denied fail
  ms_mode=secure        - secure mode, if denied fail
  ms_mode=prefer-crc    - crc mode, if denied agree to secure mode
  ms_mode=prefer-secure - secure mode, if denied agree to crc mode

ms_mode affects all connections, we don't separate connections to mons
like it's done in userspace with ms_client_mode vs ms_mon_client_mode.

For now the default is legacy, to be flipped to prefer-crc after some
time.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 313771e80f libceph, rbd: ignore addr->type while comparing in some cases
For libceph, this ensures that libceph instance sharing (share option)
continues to work.  For rbd, this avoids blocklisting alive lock owners
(locker addr is always LEGACY, while watcher addr is ANY in nautilus).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov a5cbd5fc22 libceph, ceph: get and handle cluster maps with addrvecs
In preparation for msgr2, make the cluster send us maps with addrvecs
including both LEGACY and MSGR2 addrs instead of a single LEGACY addr.
This means advertising support for SERVER_NAUTILUS and also some older
features: SERVER_MIMIC, MONENC and MONNAMES.

MONNAMES and MONENC are actually pre-argonaut, we just never updated
ceph_monmap_decode() for them.  Decoding is unconditional, see commit
23c625ce30 ("libceph: assume argonaut on the server side").

SERVER_MIMIC doesn't bear any meaning for the kernel client.

Since ceph_decode_entity_addrvec() is guarded by encoding version
checks (and in msgr2 case it is guarded implicitly by the fact that
server is speaking msgr2), we assume MSG_ADDR2 for it.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 8921f25116 libceph: factor out finish_auth()
In preparation for msgr2, factor out finish_auth() so it is suitable
for both existing MAuth message based authentication and upcoming msgr2
authentication exchange.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov c1c0ce78f4 libceph: drop ac->ops->name field
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 59711f9ec2 libceph: amend cephx init_protocol() and build_request()
In msgr2, initial authentication happens with an exchange of msgr2
control frames -- MAuth message and struct ceph_mon_request_header
aren't used.  Make that optional.

Stop reporting cephx protocol as "x".  Use "cephx" instead.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 285ea34fc8 libceph, ceph: incorporate nautilus cephx changes
- request service tickets together with auth ticket.  Currently we get
  auth ticket via CEPHX_GET_AUTH_SESSION_KEY op and then request service
  tickets via CEPHX_GET_PRINCIPAL_SESSION_KEY op in a separate message.
  Since nautilus, desired service tickets are shared togther with auth
  ticket in CEPHX_GET_AUTH_SESSION_KEY reply.

- propagate session key and connection secret, if any.  In preparation
  for msgr2, update handle_reply() and verify_authorizer_reply() auth
  ops to propagate session key and connection secret.  Since nautilus,
  if secure mode is negotiated, connection secret is shared either in
  CEPHX_GET_AUTH_SESSION_KEY reply (for mons) or in a final authorizer
  reply (for osds and mdses).

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 6610fff278 libceph: safer en/decoding of cephx requests and replies
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov f79e25b087 libceph: more insight into ticket expiry and invalidation
Make it clear that "need" is a union of "missing" and "have, but up
for renewal" and dout when the ticket goes missing due to expiry or
invalidation by client.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov a56dd9bf47 libceph: move msgr1 protocol specific fields to its own struct
A couple whitespace fixups, no functional changes.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 2f713615dd libceph: move msgr1 protocol implementation to its own file
A pure move, no other changes.

Note that ceph_tcp_recv{msg,page}() and ceph_tcp_send{msg,page}()
helpers are also moved.  msgr2 will bring its own, more efficient,
variants based on iov_iter.  Switching msgr1 to them was considered
but decided against to avoid subtle regressions.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:50 +01:00
Ilya Dryomov 566050e17e libceph: separate msgr1 protocol implementation
In preparation for msgr2, define internal messenger <-> protocol
interface (as opposed to external messenger <-> client interface, which
is struct ceph_connection_operations) consisting of try_read(),
try_write(), revoke(), revoke_incoming(), opened(), reset_session() and
reset_protocol() ops.  The semantics are exactly the same as they are
now.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 6503e0b69c libceph: export remaining protocol independent infrastructure
In preparation for msgr2, make all protocol independent functions
in messenger.c global.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 699921d9e6 libceph: export zero_page
In preparation for msgr2, make zero_page global.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 3fefd43e74 libceph: rename and export con->flags bits
In preparation for msgr2, move the defines to the header file.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 6d7f62bfb5 libceph: rename and export con->state states
In preparation for msgr2, rename msgr1 specific states and move the
defines to the header file.

Also drop state transition comments.  They don't cover all possible
transitions (e.g. NEGOTIATING -> STANDBY, etc) and currently do more
harm than good.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 30be780a87 libceph: make con->state an int
unsigned long is a leftover from when con->state used to be a set of
bits managed with set_bit(), clear_bit(), etc.  Save a bit of memory.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 2f68738037 libceph: don't export ceph_messenger_{init_fini}() to modules
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov fd1a154cad libceph: make sure our addr->port is zero and addr->nonce is non-zero
Our messenger instance addr->port is normally zero -- anything else is
nonsensical because as a client we connect to multiple servers and don't
listen on any port.  However, a user can supply an arbitrary addr:port
via ip option and the port is currently preserved.  Zero it.

Conversely, make sure our addr->nonce is non-zero.  A zero nonce is
special: in combination with a zero port, it is used to blocklist the
entire ip.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 771294fe07 libceph: factor out ceph_con_get_out_msg()
Move the logic of grabbing the next message from the queue into its own
function.  Like ceph_con_in_msg_alloc(), this is protocol independent.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov fc4c128e15 libceph: change ceph_con_in_msg_alloc() to take hdr
ceph_con_in_msg_alloc() is protocol independent, but con->in_hdr (and
struct ceph_msg_header in general) is msgr1 specific.  While the struct
is deeply ingrained inside and outside the messenger, con->in_hdr field
can be separated.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00
Ilya Dryomov 8ee8abf797 libceph: change ceph_msg_data_cursor_init() to take cursor
Make it possible to have local cursors and embed them outside struct
ceph_msg.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-14 23:21:49 +01:00