A recent test-compile at high warning level points out that if you
define a macro with a ... at the end of the parameter list, then every
call should at least include the comma before the variadic part. That
is, if you #define MACRO(x,y,...) then you shouldn't call MACRO(1,2)
with no comma after the 2. But that's what I had done in one of my
definitions of FUNC0 in the fiddly testcrypt system.
In a similar vein, it's a mistake to use the preprocessor 'defined'
operator when it's expanded from another macro. Adjusted the setup of
BB_OK in mpint_i.h to avoid doing that.
(Neither of these has yet caused a problem in any real compile, but
best to fix them before they do.)
Similarly to the previous commit, this function had an inconsistent
parameter list between Unix and Windows, because the Windows source
file that defines it (winnet.c) didn't include ssh.h where its
prototype lives, so the compiler never checked.
Luckily, the discrepancy was that the Windows version of the function
was declared as taking an extra parameter which it ignored, so the fix
is very easy.
In commit b4c8fd9d8 which introduced the Seat trait, I got a bit
confused about the prototype of new_prompts(). Previously it took a
'Frontend *' parameter; I edited the call sites to pass a 'Seat *'
instead, but the actual function definition takes no parameters at all
- and rightly so, because the 'Frontend *' inside the prompts_t has
been removed and _not_ replaced with a 'Seat *', so the constructor
would have nothing to do with such a thing anyway.
But I wrote the function declaration in putty.h with '()' rather than
'(void)' (too much time spent in C++), and so the compiler never
spotted the mismatch.
Now new_prompts() is consistently nullary everywhere it appears: the
prototype in the header is a proper (void) one, and the call sites
have been modified to not pointlessly give it a Seat or null pointer.
Leak Sanitiser was kind enough to point this out to me during testing
of the port forwarding rework: chan_log_close_msg() returns a
dynamically allocated char *, which the caller is supposed to free.
In commit 09954a87c I introduced the portfwdmgr_connect_socket()
system, which opened a port forwarding given a callback to create the
Socket itself, with the aim of using it to make forwardings to Unix-
domain sockets and Windows named pipes (both initially for agent
forwarding).
But I forgot that a year and a bit ago, in commit 834396170, I already
introduced a similar low-level system for creating a PortForwarding
around an unusual kind of socket: the portfwd_raw_new() system, which
in place of a callback uses a two-phase setup protocol (you create the
socket in between the two setup calls, and can roll it back if the
socket can't be created).
There's really no need to have _both_ these systems! So now I'm
merging them, which is to say, I'm enhancing portfwd_raw_new to have
the one new feature it needs, and throwing away the newer system
completely. The new feature is to be able to control the initial state
of the 'ready' flag: portfwd_raw_new was always used for initiating
port forwardings in response to an incoming local connection, which
means you need to start off with ready=false and set it true when the
other end of the SSH connection sends back OPEN_CONFIRMATION. Now it's
being used for initiating port forwardings in response to a
CHANNEL_OPEN, we need to be able to start with ready=true.
This commit reverts 09954a87c2 and its
followup fix 12aa06ccc9, and simplifies
the agent_connect system down to a single trivial function that makes
a Socket given a Plug.
I carefully set up separate mechanisms for the "-96" suffix on the
hash name and the "bug-compatible" in parens after it, so that the
latter could share its parens with annotations from the underlying
hash. And then I forgot to _use_ the second mechanism!
Also added ssh2_mac_text_name to the testcrypt API so I could check it
easily. The result before this fix:
>>> ssh2_mac_text_name(ssh2_mac_new("hmac_sha1_96_buggy", None))
'HMAC-SHA-1-96 (bug-compatible) (unaccelerated)'
And after, which is what I intended all along:
>>> ssh2_mac_text_name(ssh2_mac_new("hmac_sha1_96_buggy", None))
'HMAC-SHA-1-96 (bug-compatible, unaccelerated)'
In an early draft of the new PRNG, before I decided to get rid of
random_byte() and replace it with random_read(), it was important
after generating a hash-worth of PRNG output to buffer it so as to
return it a byte at a time. So the PRNG data structure itself had to
keep a hash-sized buffer of pending output, and be able to return the
next byte from it on every random_byte() call.
But when random_read() came in, there was no need to do that any more,
because at the end of a read, the generator is re-seeded and the
remains of any generated data is deliberately thrown away. So the
pending_output buffer has no need to live in the persistent prng
object; it can be relegated to a local variable inside random_read
(and a couple of other functions that used the same buffer since it
was conveniently there).
A side effect of this is that we're no longer yielding the bytes of
each hash in reverse order, because only the previous silly code
structure made it convenient. Fortunately, of course, nothing is
depending on that - except the cryptsuite tests, which I've updated.
This was pointed out as a compiler warning when I test-built with
up-to-date clang-cl. It looks as if it would cause the IDM_FULLSCREEN
item on the system menu to be wrongly greyed/ungreyed, but in fact I
think it's benign, because MF_BYCOMMAND == 0. So it's _just_ a
warning fix, luckily!
A user reports that Visual Studio 2013 and earlier have printf
implementations in their C library that don't support the 'z' modifier
to indicate that an integer argument is size_t. The 'I' modifier
apparently works in place of it.
To avoid littering ifdefs everywhere, I've invented my own inttypes.h
style macros to wrap size_t formatting directives, which are defined
to %zu and %zx normally, or %Iu and %Ix in old-VS mode. Those are in
defs.h, and they're used everywhere that a %z might otherwise get into
the Windows build.
An assortment of errors: int vs size_t confusion (probably undetected
since the big switchover in commit 0cda34c6f), some outright spurious
parameters after the format string (copy-paste errors), a particularly
silly one in pscp.c (a comma between two halves of what should have
been a single string literal), and a _missing_ format string in ssh.c
(but luckily in a context where the only text that would be wrongly
treated as a format string was error messages generated elsewhere in
PuTTY).
I've added the gcc-style attribute("printf") to a lot of printf-shaped
functions in this code base that didn't have it. To make that easier,
I moved the wrapping macro into defs.h, and also enabled it if we
detect the __clang__ macro as well as __GNU__ (hence, it will be used
when building for Windows using clang-cl).
The result is that a great many format strings in the code are now
checked by the compiler, where they were previously not. This causes
build failures, which I'll fix in the next commit.
In an early draft of commit de38a4d82 I used 'void *' as the reqid
type, and then I thought better of it and made it a special type of
its own, in keeping with my usual idea that it's better to have your
casts somewhat checked than totally unchecked. One remnant of the
'void *' version got past me. Now fixed.
A user reports that the ReadFile call in console_get_userpass_input
fails with ERROR_NOT_ENOUGH_MEMORY on Windows 7, and further reports
that this problem only happens if you tell ReadFile to read more than
31366 bytes in a single call.
That seems to be a thing that other people have found as well: I
turned up a similar workaround in Ruby's Win32 support module, except
that there it's for WriteConsole. So I'm reducing my arbitrary read
size of 64K to 16K, which is well under that limit.
This issue became noticeable in PuTTY as of the recent commit
cd6bc14f0, which reworked console_get_userpass_input to use strbufs.
Previously we were trying to read an amount proportional to the
existing size of the buffer, so as to grow the buffer exponentially to
save quadratic-time reallocation. That was OK in practice, since the
initial read size was nice and small. But in principle, the same bug
was present in that version of the code, just latent - if we'd ever
been called on to read a _really large_ amount of data, then
_eventually_ the input size parameter to ReadFile would have grown
beyond that mysterious limit!
The previous commit introduced the PageantAsyncOp trait, with only one
trivial implementation. This one introduces a second implementation
used for SSH2_AGENTC_SIGN_REQUEST only, which waits to actually
construct the signature until after a callback has happened. This will
allow the signing process to be suspended in order to request a dialog
box and wait for a user response to decide _how_ to reply.
Suspended signing operations relating to a particular key are held in
a linked list dangling from the corresponding PageantKey structure.
This will allow them all to be found easily if that key should be
deleted from the agent: in that situation pending signatures from the
key will all return SSH_AGENT_FAILURE.
Still no functional change, however: the new coroutine doesn't
actually wait for anything, it just contains a comment noting that it
could if it wanted to.
This is a pure refactoring: no functional change expected.
This commit introduces two new small vtable-style APIs. One is
PageantClient, which identifies a particular client of the Pageant
'core' (meaning the code that handles each individual request). This
changes pageant_handle_msg into an asynchronous operation: you pass in
an agent request message and an identifier, and at some later point,
the got_response method in your PageantClient will be called with the
answer (and the same identifier, to allow you to match requests to
responses). The trait vtable also contains a logging system.
The main importance of PageantClient, and the reason why it has to
exist instead of just passing pageant_handle_msg a bare callback
function pointer and context parameter, is that it provides robustness
if a client stops existing while a request is still pending. You call
pageant_unregister_client, and any unfinished requests associated with
that client in the Pageant core will be cleaned up, so that you're
guaranteed that after the unregister operation, no stray callbacks
will happen with a stale pointer to that client.
The WM_COPYDATA interface of Windows Pageant is a direct client of
this API. The other client is PageantListener, the system that lives
in pageant.c and handles stream-based agent connections for both Unix
Pageant and the new Windows named-pipe IPC. More specifically, each
individual connection to the listening socket is a separate
PageantClient, which means that if a socket is closed abruptly or
suffers an OS error, that client can be unregistered and any pending
requests cancelled without disrupting other connections.
Users of PageantListener have a second client vtable they can use,
called PageantListenerClient. That contains _only_ logging facilities,
and at the moment, only Unix Pageant bothers to use it (and even that
only in debugging mode).
Finally, internally to the Pageant core, there's a new trait called
PageantAsyncOp which describes an agent request in the process of
being handled. But at the moment, it has only one trivial
implementation, which is handed the full response message already
constructed, and on the next toplevel callback, passes it back to the
PageantClient.
This is preparation to allow Pageant to be able to return to its GUI
main loop in the middle of handling a request (e.g. present a dialog
box to the user related to that particular request, and wait for the
user's response). In order to do that, we need the main thread's
Windows message loop to never be blocked by a WM_COPYDATA agent
request.
So I've split Pageant's previous hidden window into two hidden
windows, each with a subset of the original roles, and created in
different threads so that they get independent message loops. The one
in the main thread receives messages relating to Pageant's system tray
icon; the one in the subthread has the identity known to (old) Pageant
clients, and receives WM_COPYDATA messages only. Each WM_COPYDATA is
handled by passing the request back to the main thread via an event
object integrated into the Pageant main loop, and then waiting for a
second event object that the main thread will signal when the answer
comes back, and not returning from the WndProc handler until the
response arrives.
Hence, if an agent request received via WM_COPYDATA requires GUI
activity, then the main thread's GUI message loop will be able to do
that in parallel with all Pageant's other activity, including other
GUI activity (like the key list dialog box) and including responding
to other requests via named pipe.
I can't stop WM_COPYDATA requests from blocking _each other_, but this
allows them not to block anything else. And named-pipe requests block
nothing at all, so as clients switch over to the new IPC, even that
blockage will become less and less common.
Those chomp operations in wincons.c and uxcons.c looked ugly, and I'm
not totally convinced they couldn't underrun the buffer by 1 byte in
weird circumstances. strbuf_chomp is neater.
UBsan pointed out another memcpy from NULL (again with length 0) in
the prompts_t system. When I looked at it, I realised that firstly
prompt_ensure_result_size was an early not-so-good implementation of
sgrowarray_nm that would benefit from being replaced with a call to
the real one, and secondly, the whole system for storing prompt
results should really have been replaced with strbufs with the no-move
option, because that's doing all the same jobs better.
So, now each prompt_t holds a strbuf in place of its previous manually
managed string. prompt_ensure_result_size is gone (the console
prompt-reading functions use strbuf_append, and everything else just
adds to the strbuf in the usual marshal.c way). New functions exist to
retrieve a prompt_t's result, either by reference or copied.
These are better than my previous approach of just assigning to
sb->len, because firstly they check by assertion that the new length
is within range, and secondly they preserve the invariant that the
byte stored in the buffer just after the length runs out is \0.
Switched to using the new functions everywhere a grep could turn up
opportunities.
UBsan points out that if the input pointer is NULL, we'll pass it to
memcpy, which is technically illegal by the C standard _even_ if the
length you pass with it is zero.
This stops cgtest from leaving detritus all over my git checkout.
There's a --keep option to revert to the previous behaviour, just in
case I actually want the detritus on some occasion - although in that
situation I might also need to arrange that the various intermediate
files all go by different names, because otherwise there's a good
chance that the one I cared about would already have been overwritten.
This bug applies to both the new stream-based agent forwarding, and
ordinary remote->local TCP port forwardings, because it was introduced
by the preliminary infrastructure in commit 09954a87c.
new_connection() and sk_new() accept a SockAddr *, and take ownership
of it. So it's a mistake to make an address, connect to it, and then
sk_addr_free() it: the free will decrement its reference count to
zero, and then the Socket made by the connection will be holding a
stale pointer. But that's exactly what I was doing in the version of
portfwdmgr_connect() that I rewrote in that refactoring. And then I
made the same error again in commit ae1148267 in the Unix stream-based
agent forwarding.
Now both fixed. Rather than remove the sk_addr_free() to make the code
look more like it used to, I've instead solved the problem by adding
an sk_addr_dup() at the point of making the connection. The idea is
that that should be more robust, in that it will still do the right
thing if portfwdmgr_connect_socket should later change so as not to
call its connect helper function at all.
The new Windows stream-based agent forwarding is unaffected by this
bug, because it calls new_named_pipe_client() with a pathname in
string format, without first wrapping it into a SockAddr.
You can now restrict testing to a single key type (for quicker round
trips once you know what you're debugging). Also --help, on general
principles now that there's more than one option.
In setting up the ECC tests for cmdgen, I noticed that OpenSSH and
PuTTYgen disagree on the bit length to put in a key fingerprint for an
ed25519 key: we think 255, they think 256.
On reflection, I think 255 is more accurate, which is why I bodged
get_fp() in the test suite to ignore that difference when checking our
key fingerprint against OpenSSH's. But having done that, it now seems
silly that if you unnecessarily specify a bit count at ed25519
generation time, cmdgen will insist that it be 256!
255 is now permitted everywhere an ed25519 bit count is input. 256 is
also still allowed for backwards compatibility but 255 is preferred by
the error message if you give any other value.
We've supported ECC keys for a while, but cgtest has never tested them
before. Now it does.
This wasn't quite as simple as adding two extra key types to the list.
I had to add a system of per-key-type flags in the tests to trigger
different expectations and workarounds: the new key types can't be
converted to and from ssh.com format, they behave differently from
rsa1 if you try (in that they'll get as far as asking for the
passphrase _before_ realising the key is an unsupported kind), and
also it turns out we disagree with OpenSSH ssh-keygen on the bit count
to write in the fingerprint of an ed25519 key. (We say 255, and they
say 256.)
But having fixed all those things, the tests pass.
I've adjusted the cmdgen main program so that it does all early
returns via the 'goto out' idiom, so that they still go through all
the last-minute freeing steps. That meant I had to adjust a few of the
last-minute freeing steps so they don't try to do impossible things
like freeing SSH2_WRONG_PASSPHRASE or calling a vtable method of a
null object. Also added a couple of completely missing frees, in
cmdgen itself ('outfiletmp') and in the cgtest wrapper main ('fp').
Now cgtest gets a completely clean run through Leak Sanitiser.
The self-test mode of command-line PuTTYgen used to be compiled by
manually setting a #define, so that it would _replace_ the puttygen
binary. Therefore, it was useful to still have it behave like puttygen
if invoked with arguments, so that you didn't have to annoyingly
recompile back and forth to switch between manual and automated
testing.
But now that cgtest is built _alongside_ puttygen, there's no need for
that. If someone needs the non-test version of puttygen, it's right
there next to cgtest. So I've removed that weird special case, and
replaced it with a new command-line syntax for cgtest which supports a
-v option (which itself replaces configuration via an awkward
environment variable CGTEST_VERBOSE).
A user reports that if the ^E answerback string is configured to be
empty, then causing the answerback to be sent fails the assertion in
ldisc_send introduced in commit c269dd013.
I thought I'd caught all of the remaining cases of this in commit
4634cd47f, but apparently not.
When I took the key comments out of the RSAKey / ssh2_userkey
structures stored in pageant.c's trees and moved them into the new
containing PageantKey structure, I forgot that that would mean Windows
Pageant - which tries to read the comments _from_ those structures -
would now not be able to show comments in its list box.
Comments are small, so the easiest fix is just to duplicate them in
both places.
I think this is on balance a cleanup in its own right. But the main
purpose is that now Pageant has its own struct that wraps the RSAKey
and ssh2_userkey objects it gets from the rest of the code. So I'll be
able to put extra Pageant-specific data in that struct in future.
Well, actually, two new test programs. agenttest.py is the actual
test; it depends on agenttestgen.py which generates a collection of
test private keys, using the newly exposed testcrypt interface to our
key generation code.
In this commit I've also factored out some Python SSH marshalling code
from cryptsuite, and moved it into a module ssh.py which the agent
tests can reuse.
This adds stability tests (of the form 'make sure this behaves
tomorrow the same way it behaved today, taking on faith that the
latter was right') for all the new in-memory APIs for public and
private key load/save.
All the functions that read and write public and private keys to a
FILE * are now small wrappers on top of an underlying set of functions
that read data in the same format from a BinarySource, or write it to
a strbuf. This sets me up to deal with key files in contexts other
than on disk.
This is like PTRLEN_LITERAL, but you can use it in a _declaration_ of
a compile-time constant ptrlen, instead of a literal in expression
context. 'const ptrlen foo = PTRLEN_DECL_LITERAL("bar");'
There are new functions here to get the next contiguous string of
characters from a given set (like strspn/strcspn, only for
BinarySource, and returning a ptrlen of what they skipped over). Also
we can get a line of text with the trailing newline chomped off.
Finally, I've provided a function to rewind a BinarySource to a
previous position with error checking.
Now they have names that are more consistent (no more userkey_this but
that_userkey); a bit shorter; and, most importantly, all the current
functions end in _f to indicate that they deal with keys stored in
disk files. I'm about to add a second set of entry points that deal
with keys via the more general BinarySource / BinarySink interface,
which will sit alongside these with a different suffix.
They're not much use for 'real' key generation, since like all the
other randomness-using testcrypt functions, they need you to have
explicitly queued up some random data. But for generating keys for
test purposes, they have the great virtue that they deliver the key in
the internal format, where we can generate all the various public and
private blobs from it as well as the on-disk formats.
A minor change to one of the keygen functions itself: rsa_generate now
fills in the 'bits' and 'bytes' fields of the returned RSAKey, without
which it didn't actually work to try to generate a public blob from
it. (We'd never noticed before, because no previous client of
rsa_generate even tried that.)
This doesn't affect what files are _legal_: the spec said we tolerated
three kinds of line ending, and it still says we tolerate the same
three. But I noticed that we're actually outputting \n by preference,
whereas the spec said we prefer \r\n. I'd rather change the docs than
the code.
I accepted both 'int' and 'uint' as function argument types, but
hadn't previously noticed that only 'uint' is handled properly as a
return type. Now both are.
It wasn't expanding the output strbuf to the full size of the key
modulus, so the output delivered to Python was only a part of the
mpint it should have been.
(Also, that was logically speaking a buffer overrun - we were writing
to the strbuf buffer beyond its length - although in practice I think
the _physical_ size of the buffer was large enough not to show it up
even under ASan. In any case, a buffer overrun only in the test suite,
and in a function I hadn't even got round to testing, is about the
best place to have one.)
While I'm here, I've also changed the way that the testcrypt wrapper
on rsa_ssh1_encrypt indicates failure: now we have the 'opt_'
mechanism, it can do that by returning None rather than "".