I've tried to separate out as many individually coherent changes from
this work as I could into their own commits, but here's where I run
out and have to commit the rest of this major refactoring as a
big-bang change.
Most of ssh.c is now no longer in ssh.c: all five of the main
coroutines that handle layers of the SSH-1 and SSH-2 protocols now
each have their own source file to live in, and a lot of the
supporting functions have moved into the appropriate one of those too.
The new abstraction is a vtable called 'PacketProtocolLayer', which
has an input and output packet queue. Each layer's main coroutine is
invoked from the method ssh_ppl_process_queue(), which is usually
(though not exclusively) triggered automatically when things are
pushed on the input queue. In SSH-2, the base layer is the transport
protocol, and it contains a pair of subsidiary queues by which it
passes some of its packets to the higher SSH-2 layers - first userauth
and then connection, which are peers at the same level, with the
former abdicating in favour of the latter at the appropriate moment.
SSH-1 is simpler: the whole login phase of the protocol (crypto setup
and authentication) is all in one module, and since SSH-1 has no
repeat key exchange, that setup layer abdicates in favour of the
connection phase when it's done.
ssh.c itself is now about a tenth of its old size (which all by itself
is cause for celebration!). Its main job is to set up all the layers,
hook them up to each other and to the BPP, and to funnel data back and
forth between that collection of modules and external things such as
the network and the terminal. Once it's set up a collection of packet
protocol layers, it communicates with them partly by calling methods
of the base layer (and if that's ssh2transport then it will delegate
some functionality to the corresponding methods of its higher layer),
and partly by talking directly to the connection layer no matter where
it is in the stack by means of the separate ConnectionLayer vtable
which I introduced in commit 8001dd4cb, and to which I've now added
quite a few extra methods replacing services that used to be internal
function calls within ssh.c.
(One effect of this is that the SSH-1 and SSH-2 channel storage is now
no longer shared - there are distinct struct types ssh1_channel and
ssh2_channel. That means a bit more code duplication, but on the plus
side, a lot fewer confusing conditionals in the middle of half-shared
functions, and less risk of a piece of SSH-1 escaping into SSH-2 or
vice versa, which I remember has happened at least once in the past.)
The bulk of this commit introduces the five new source files, their
common header sshppl.h and some shared supporting routines in
sshcommon.c, and rewrites nearly all of ssh.c itself. But it also
includes a couple of other changes that I couldn't separate easily
enough:
Firstly, there's a new handling for socket EOF, in which ssh.c sets an
'input_eof' flag in the BPP, and that responds by checking a flag that
tells it whether to report the EOF as an error or not. (This is the
main reason for those new BPP_READ / BPP_WAITFOR macros - they can
check the EOF flag every time the coroutine is resumed.)
Secondly, the error reporting itself is changed around again. I'd
expected to put some data fields in the public PacketProtocolLayer
structure that it could set to report errors in the same way as the
BPPs have been doing, but in the end, I decided propagating all those
data fields around was a pain and that even the BPPs shouldn't have
been doing it that way. So I've reverted to a system where everything
calls back to functions in ssh.c itself to report any connection-
ending condition. But there's a new family of those functions,
categorising the possible such conditions by semantics, and each one
has a different set of detailed effects (e.g. how rudely to close the
network connection, what exit status should be passed back to the
whole application, whether to send a disconnect message and/or display
a GUI error box).
I don't expect this to be immediately perfect: of course, the code has
been through a big upheaval, new bugs are expected, and I haven't been
able to do a full job of testing (e.g. I haven't tested every auth or
kex method). But I've checked that it _basically_ works - both SSH
protocols, all the different kinds of forwarding channel, more than
one auth method, Windows and Linux, connection sharing - and I think
it's now at the point where the easiest way to find further bugs is to
let it out into the wild and see what users can spot.
Having redesigned it a few days ago in commit 562cdd4df, I'm changing
it again, this time to fix a potential race condition on the _output_
side: the last change was intended to cope with a server sending an
asynchronous message like IGNORE immediately after enabling
compression, and this one fixes the case in which _we_ happen to
decide to send an IGNORE while a compression request is still pending.
I couldn't fix this until after the BPP was reorganised to have an
explicit output queue of packets, but now it does, I can simply defer
processing that queue on to the output raw-data bufchain if we're
waiting for a compression request to be answered. Once it is answered,
the BPP can release any pending packets.
This is a convenient place for it because it abstracts away the
difference in disconnect packet formats between SSH-1 and -2, so when
I start restructuring, I'll be able to call it even from places that
don't know which version of SSH they're running.
Now, instead of writing each packet straight on to the raw output
bufchain by calling the BPP's format_packet function, the higher
protocol layers will put the packets on to a queue, which will
automatically trigger a callback (using the new mechanism for
embedding a callback in any packet queue) to make the BPP format its
queue on to the raw-output bufchain. That in turn triggers a second
callback which moves the data to the socket.
This means in particular that the CBC ignore-message workaround can be
moved into the new BPP routine to process the output queue, which is a
good place for it because then it can easily arrange to only put an
ignore message at the start of any sequence of packets that are being
formatted as a single output blob.
I've rewritten these macros so that they don't keep rewriting the same
value into the crLine variable. They now write it just once, before
ever testing the condition.
The point isn't the extra efficiency (which is surely negligible);
it's to make it safe to abort a coroutine and free its entire state at
unexpected moments. If you use one of these macros with a condition
that has side effects, say crWaitUntil(func()), and one of the side
effects can be to free the entire object that holds the coroutine
state, then the write to crLine after testing the condition would
previously have caused a stale-pointer dereference. But now that only
happened once, _before_ the condition was first evaluated; so as long
as func() returns false in the event that it frees the coroutine
state, it's safe - crWaitUntil will see the false condition and return
without touching the state object, and then it'll never be called
again because the whole object will have gone away.
This means that someone putting things on a packet queue doesn't need
to separately hold a pointer to someone who needs notifying about it,
or remember to call the notification function every time they push
things on the queue. It's all taken care of automatically, without
having to put extra stuff at the call sites.
The precise semantics are that the callback will be scheduled whenever
_new_ packets appear on the queue, but not when packets are removed.
(Because the expectation is that the callback is notifying whoever is
consuming the queue.)
This paves the way for me to reorganise ssh.c in a way that will mean
I don't have a ConnectionLayer available yet at the time I have to
create the connshare. The constructor function now takes a mere
Frontend, for generating setup-time Event Log messages, and there's a
separate ssh_connshare_provide_connlayer() function I can call later
once I have the ConnectionLayer to provide.
NFC for the moment: the new provide_connlayer function is called
immediately after ssh_connection_sharing_init.
Commit 6a8b9d381, which created the Channel vtable and moved the agent
forwarding implementation of it out into agentf.c, managed to set the
rcvd_eof flag to TRUE in agentf_new(), meaning that we behave exactly
as if the first agent request was followed by an incoming EOF.
Now the three 'proper' BPPs each have a BPP_READ() macro that wraps up
the fiddly combination of crMaybeWaitUntilV and bufchainery they use
to read a fixed-length amount of input data. The sshverstring 'BPP'
doesn't read fixed-length data in quite the same way, but it has a
similar BPP_WAITFOR macro.
No functional change. Mostly this is just a cleanup to make the code
more legible, but also, the new macros will be a good place to
centralise anything else that needs doing on every read, such as EOF
checking.
This is essentially trivial, because the only thing it needed from the
Ssh structure was the Conf. So the version in sshcommon.c just takes
an actual Conf as an argument, and now it doesn't need access to the
big structure definition any more.
We define a macro in terms of INT_MAX, so we ought to include
<limits.h> to ensure INT_MAX is defined, rather than depending on
every call site to have remembered to do that themselves.
This is a new idea I've had to make memory-management of PktIn even
easier. The idea is that a PktIn is essentially _always_ an element of
some linked-list queue: if it's not one of the queues by which packets
move through ssh.c, then it's a special 'free queue' which holds
packets that are unowned and due to be freed.
pq_pop() on a PktInQueue automatically relinks the packet to the free
queue, and also triggers an idempotent callback which will empty the
queue and really free all the packets on it. Hence, you can pop a
packet off a real queue, parse it, handle it, and then just assume
it'll get tidied up at some point - the only constraint being that you
have to finish with it before returning to the application's main loop.
The exception is that it's OK to pq_push() the packet back on to some
other PktInQueue, because a side effect of that will be to _remove_ it
from the free queue again. (And if _all_ the incoming packets get that
treatment, then when the free-queue handler eventually runs, it may
find it has nothing to do - which is harmless.)
Now I've got a list macro defining all the packet types we recognise,
I can use it to write a test for 'is this a recognised code?', and use
that in turn to centralise detection of completely unrecognised codes
into the binary packet protocol, where any such messages will be
handled entirely internally and never even be seen by the next level
up. This lets me remove another big pile of boilerplate in ssh.c.
This allows me to share just one definition of the packet types
between the enum declarations in ssh.h and the string translation
functions in sshcommon.c. No functional change.
The style of list macro is slightly unusual; instead of the
traditional 'X-macro' in which LIST(X) expands to invocations of the
form X(list element), this is an 'X-y macro', where LIST(X,y) expands
to invocations of the form X(y, list element). That style makes it
possible to wrap the list macro up in another macro and pass a
parameter through from the wrapper to the per-element macro. I'm not
using that facility just yet, but I will in the next commit.
In order to list cross-certifiable host keys in the GUI specials menu,
the SSH backend has been inventing new values on the end of the
Telnet_Special enumeration, starting from the value TS_LOCALSTART.
This is inelegant, and also makes it awkward to break up special
handlers (e.g. to dispatch different specials to different SSH
layers), since if all you know about a special is that it's somewhere
in the TS_LOCALSTART+n space, you can't tell what _general kind_ of
thing it is. Also, if I ever need another open-ended set of specials
in future, I'll have to remember which TS_LOCALSTART+n codes are in
which set.
So here's a revamp that causes every special to take an extra integer
argument. For all previously numbered specials, this argument is
passed as zero and ignored, but there's a new main special code for
SSH host key cross-certification, in which the integer argument is an
index into the backend's list of available keys. TS_LOCALSTART is now
a thing of the past: if I need any other open-ended sets of specials
in future, I can add a new top-level code with a nicely separated
space of arguments.
While I'm at it, I've removed the legacy misnomer 'Telnet_Special'
from the code completely; the enum is now SessionSpecialCode, the
struct containing full details of a menu entry is SessionSpecial, and
the enum values now start SS_ rather than TS_.
Vtable objects only need to be globally visible throughout the code if
they're used directly in some interchangeable way, e.g. by passing
them to a constructor like cipher_new that's the same for all
implementations of the vtable, or by directly looking up public data
fields in the vtable itself.
But the BPPs are never used like that: each BPP has its own
constructor function with a different type signature, so the BPP types
are not interchangeable in any way _before_ an instance of one has
been constructed. Hence, their vtable objects don't need external
linkage.
Pavel Kryukov reports that gcc 8 didn't like that buffer being the
same size as the one from which I was sprintf("%s")ing into it. The
easiest fix is to stop trying to guess buffer sizes and use dupprintf.
Commit 6a5d4d083 introduced a foolish list-handling bug: concatenating
a non-empty queue to an empty queue would set the tail of the output
list to the _head_ of the non-empty one, instead of to its tail. Of
course, you don't notice this until you have more than one packet in
the queue in question!
I've just noticed that we call ssh1_bpp_start_compression even if the
server responded to our compression request with SSH1_SMSG_FAILURE!
Also, while I'm here, there's a potential race condition if the server
were to send an unrelated message (such as SSH1_MSG_IGNORE)
immediately after the SSH1_SMSG_SUCCESS that indicates compression
being enabled - the BPP would try to decode the compressed IGNORE
message before the SUCCESS got to the higher layer that would tell the
BPP it should have enabled compression. Fixed that by changing the
method by which we tell the BPP what's going on.
Moved the typedef of BinaryPacketProtocol into defs.h on the general
principle that it's just the kind of thing that ought to go there;
also removed the declaration of pq_base_init from ssh.h on the grounds
that there's never been any such function! (At least, not in public
source control - it existed in an early draft of commit 6e24b7d58.)
Originally, it controlled whether ssh.c should send terminal messages
(such as login and password prompts) to terminal.c or to stderr. But
we've had the from_backend() abstraction for ages now, which even has
an existing flag to indicate that the data is stderr rather than
stdout data; applications which set FLAG_STDERR are precisely those
that link against uxcons or wincons, so from_backend will do the
expected thing anyway with data sent to it with that flag set. So
there's no reason ssh.c can't just unconditionally pass everything
through that, and remove the special case.
FLAG_STDERR was also used by winproxy and uxproxy to decide whether to
capture standard error from a local proxy command, or whether to let
the proxy command send its diagnostics directly to the usual standard
error. On reflection, I think it's better to unconditionally capture
the proxy's stderr, for three reasons. Firstly, it means proxy
diagnostics are prefixed with 'proxy:' so that you can tell them apart
from any other stderr spew (which used to be particularly confusing if
both the main application and the proxy command were instances of
Plink); secondly, proxy diagnostics are now reliably copied to packet
log files along with all the other Event Log entries, even by
command-line tools; and thirdly, this means the option to suppress
proxy command diagnostics after the main session starts will actually
_work_ in the command-line tools, which it previously couldn't.
A more minor structure change is that copying of Event Log messages to
stderr in verbose mode is now done by wincons/uxcons, instead of
centrally in logging.c (since logging.c can now no longer check
FLAG_STDERR to decide whether to do it). The total amount of code to
do this is considerably smaller than the defensive-sounding comment in
logevent.c explaining why I did it the other way instead :-)
When PuTTY is configured to display stderr diagnostics from the proxy
command but only until the main session starts, this flag is how the
SSH backend indicates the point at which the session starts. It was
previously set during version-string parsing, and I forgot to find a
new home for it when I moved the version string parsing out into the
new verstring BPP module in commit af8e526a7.
Now reinstated, at the point where that BPP gets back to us and tells
us what protocol version it's chosen.
Apparently introduced just now in commit 6c5cc49e2; thanks to Colin
Harrison for pointing it out very promptly.
All this FROMFIELD business, helpful as it is, doesn't change the fact
that you can still absentmindedly cast something to the wrong type if
you're specifying the type explicitly!
The helper functions mm_shuffle_pd_i0 and mm_shuffle_pd_i1 need the
FUNC_ISA macro (which expands to __attribute__((target("sse4.1,aes")))
when building with clang) in order to avoid a build error complaining
that their use of the _mm_shuffle_pd intrinsic is illegal without at
least sse2.
This build error is new in the recently released clang 7.0.0, compared
to the svn trunk revision I was previously building with. But it
certainly seems plausible to me, so I assume there's been some
pre-release tightening up of the error reporting. In any case, those
helper functions are only ever called from other functions with the
same attribute, so it shouldn't cause trouble.
I've removed the encrypted_len fields from PktIn and PktOut, which
were used to communicate from the BPP to ssh.c how much each packet
contributed to the amount of data encrypted with a given set of cipher
keys. It seems more sensible to have the BPP itself keep that counter
- especially since only one of the three BPPs even needs to count it
at all. So now there's a new DataTransferStats structure which the BPP
updates, and ssh.c only needs to check it for overflow and reset the
limits.
That function _did_ depend on ssh.c's internal facilities, namely the
layout of 'struct ssh_channel'. In place of that, it now takes an
extra integer argument telling it where to find the channel id in
whatever data structure you give it a tree of - so now I can split up
the SSH-1 and SSH-2 channel handling without losing the services of
that nice channel-number allocator.
While I'm at it, I've brought it all into a single function: the
parsing of data from Conf, the list of modes, and even the old
callback system for writing to the destination buffer is now a simple
if statement that formats mode parameters as byte or uint32 depending
on SSH version. Also, the terminal speeds and the end byte are part of
the same setup, so it's all together in one place instead of scattered
all over ssh.c.
It doesn't really have to be in ssh.c sharing that file's internal
data structures; it's as much an independent object implementation as
any of the less trivial Channel instances. So it's another thing we
can get out of that too-large source file.
It's really just a concatenator for a pair of linked lists, but
unhelpfully restricted in which of the lists it replaces with the
output. Better to have a three-argument function that puts the output
wherever you like, whether it overlaps either or neither one of the
inputs.
We used it to suppress reading from the network at every point during
protocol setup where PuTTY was waiting for a user response to a dialog
box (e.g. a host key warning). The purpose of this was to avoid
dropping an important packet while the coroutine was listening to one
of its other input parameters (as it were). But now everything is
queue-based, packets will stay queued until we're ready to look at
them anyway; so it's better _not_ to freeze the connection, so that
messages we _can_ handle in between (e.g. SSH_MSG_DEBUG or
SSH_MSG_IGNORE) can still be processed.
That dispenses with all uses of ssh_set_frozen except for its use by
ssh_throttle_conn to exert back-pressure on the server in SSH1 which
doesn't have per-channel windows. So I've moved that last use _into_
ssh_throttle_conn, and now the function is completely gone.
These are essentially data-structure maintenance, and it seems silly
to have them be part of the same file that manages the topmost
structure of the SSH connection.
Some upcoming restructuring I've got planned will need to pass output
packets back and forth on queues, as well as input ones. So here's a
change that arranges that we can have a PktInQueue and a PktOutQueue,
sharing most of their implementation via a PacketQueueBase structure
which links together the PacketQueueNode fields in the two packet
structures.
There's a tricksy bit of macro manoeuvring to get all of this
type-checked, so that I can't accidentally link a PktOut on to a
PktInQueue or vice versa. It works by having the main queue functions
wrapped by macros; when receiving a packet structure on input, they
type-check it against the queue structure and then automatically look
up its qnode field to pass to the underlying PacketQueueBase function;
on output, they translate a returned PacketQueueNode back to its
containing packet type by calling a 'get' function pointer.
Now there's a centralised routine in misc.c to do the sanitisation,
which copies data on to an outgoing bufchain. This allows me to remove
from_backend_untrusted() completely from the frontend API, simplifying
code in several places.
Two use cases for untrusted-terminal-data sanitisation were in the
terminal.c prompts handler, and in the collection of SSH-2 userauth
banners. Both of those were writing output to a bufchain anyway, so
it was very convenient to just replace a bufchain_add with
sanitise_term_data and then not have to worry about it again.
There was also a simplistic sanitiser in uxcons.c, which I've now
replaced with a call to the good one - and in wincons.c there was a
FIXME saying I ought to get round to that, which now I have!
Getting it out of the overgrown ssh.c is worthwhile in itself! But
there are other benefits of this reorganisation too.
One is that I get to remove ssh->current_incoming_data_fn, because now
_all_ incoming network data is handled by whatever the current BPP is.
So now we only indirect through the BPP, not through some other
preliminary function pointer _and_ the BPP.
Another is that all _outgoing_ network data is now handled centrally,
including our outgoing version string - which means that a hex dump of
that string now shows up in the raw-data log file, from which it was
previously conspicuous by its absence.
I'm quite surprised I haven't needed this for anything else yet. I
suppose if I had it, I could have written most of my ptrlen_eq_strings
in terms of it, and saved a lot of gratuitous runtime strlens.
This should make it easier to do formatted-string based logging
outside ssh.c, because I can wrap up a local macro in any source file
I like that expands to logevent_and_free(wherever my Frontend is,
dupprintf(macro argument)).
It caused yet another stub function to be needed in testbn, but there
we go.
(Also, while I'm here, removed a redundant declaration of logevent
itself from ssh.h. The one in putty.h is all we need.)
This replaces the previous log(n)^2 algorithm for channel-number
allocation, which binary-searched the space of tree indices using a
log-time call to index234() at each step, with a single log-time pass
down the tree which only has to check the returned channel number
against the returned tree index at each step.
I'm under no illusions that this was a critical performance issue, but
it's been offending my sense of algorithmic elegance for a while.
This is a thing I've been meaning to set up for a while: it's a
pull-based search system (that is, the caller takes each step of the
search manually, rather than providing a callback), which lets the
caller inspect every step of the search, including the index of each
candidate element in the tree. This allows flexible kinds of query
that play the element and its index off against each other.
I've also rewritten the existing findrelpos234() search function using
the new one as a primitive, because that simplifies it a lot!
Added a missing #include of string.h; arranged that the test program
reports the number of errors detected at the end so I don't have to
search its entire output for "ERROR"; made the test program return a
nonzero exit status if any error was found.
This is a vtable that wraps up all the functionality required from the
SSH connection layer by associated modules like port forwarding and
connection sharing. This extra layer of indirection adds nothing
useful right now, but when I later separate the SSH-1 and SSH-2
connection layer implementations, it will be convenient for each one
to be able to implement this vtable in terms of its own internal data
structures.
To simplify this vtable, I've moved a lot of the logging duties
relating to connection sharing out of ssh.c into sshshare.c: now it
handles nearly all the logging itself relating to setting up
connection sharing in the first place and downstreams connecting and
disconnecting. The only exception is the 'Reusing a shared connection'
announcement in the console window, which is now done in ssh.c by
detecting downstream status immediately after setup.
The tree234 storing currently active port forwardings - both local and
remote - now lives in portfwd.c, as does the complicated function that
updates it based on a Conf listing the new set of desired forwardings.
Local port forwardings are passed to ssh.c via the same route as
before - once the listening port receives a connection and portfwd.c
knows where it should be directed to (in particular, after the SOCKS
exchange, if any), it calls ssh_send_port_open.
Remote forwardings are now initiated by calling ssh_rportfwd_alloc,
which adds an entry to the rportfwds tree (which _is_ still in ssh.c,
and still confusingly sorted by a different criterion depending on SSH
protocol version) and sends out the appropriate protocol request.
ssh_rportfwd_remove cancels one again, sending a protocol request too.
Those functions look enough like ssh_{alloc,remove}_sharing_rportfwd
that I've merged those into the new pair as well - now allocating an
rportfwd allows you to specify either a destination host/port or a
sharing context, and returns a handy pointer you can use to cancel the
forwarding later.
Clients outside ssh.c - all implementations of Channel - will now not
see the ssh_channel data type itself, but only a subobject of the
interface type SshChannel. All the sshfwd_* functions have become
methods in that interface type's vtable (though, wrapped in the usual
kind of macros, the call sites look identical).
This paves the way for me to split up the SSH-1 and SSH-2 connection
layers and have each one lay out its channel bookkeeping structure as
it sees fit; as long as they each provide an implementation of the
sshfwd_ method family, the types behind that need not look different.
A minor good effect of this is that the sshfwd_ methods are no longer
global symbols, so they don't have to be stubbed in Unix Pageant to
get it to compile.