While debugging some new code, I ran valgrind in leak-checking mode
and it pointed out a handful of existing memory leaks, which got in the
way of spotting any _new_ leaks I might be introducing :-)
This was one: in the case where an asynchronous agent query on Unix is
aborted, the dynamically allocated buffer holding the response was not
freed.
The new AES routines are compiled into the code on any platform where
the compiler can be made to generate the necessary AES-NI and SSE
instructions. But not every CPU will support those instructions, so
the pure-software routines haven't gone away: both sets of functions
sit side by side in the code, and at key setup time we check the CPUID
bitmap to decide which set to select.
(This reintroduces function pointers into AESContext, replacing the
ones that we managed to remove a few commits ago.)
The outer routines are the ones which handle the CBC encrypt, CBC
decrypt and SDCTR cipher modes. Previously each of those had to be
able to dispatch to one of the per-block-size core routines, which
made it worth dividing the system up into two layers. But now there's
only one set of core routines, they may as well be inlined into the
outer ones.
Also as part of this commit, the nasty undef/redef of MAKEWORD and
LASTWORD have been removed, and the different macro definitions now
have different macro _names_, to make it clearer which one is used
where.
They're not really part of AES at all, in that they were part of the
Rijndael design but not part of the subset standardised by NIST. More
relevantly, they're not used by any SSH cipher definition, so they're
just adding complexity to the code which is about to get in the way of
refactoring it.
Removing them means there's only one pair of core encrypt/decrypt
functions, so the 'encrypt' and 'decrypt' function pointer fields can
be completely removed from AESContext.
Apparently I forgot to edit that when I originally imported this AES
implementation into PuTTY's SSH code from the more generically named
source file in which I'd originally developed it.
ATTR_REVERSE was being handled in the front ends, and was causing the
foreground and background colours to be switched. (I'm not completely
sure why I made that design decision; it might be purely historical,
but then again, it might also be because reverse video is one effect
on the fg and bg colours that must still be performed even in unusual
frontend-specific situations like display-driven monochrome mode.)
This affected both explicit reverse video enabled using SGR 7, and
also the transient reverse video arising from mouse selection. Thanks
to Markus Gans for reporting the bug in the latter, which when I
investigated it turned out to affect the former as well.
After fixing the previous two bugs, I thought it was probably a good
idea to re-check _everywhere_ in terminal.c where curr_attr is used,
to make sure that if curr_truecolour also needed updating at the same
time then that was being done.
I spotted this myself while looking through the code in search of the
cause of the background-colour-erase bug: saving and restoring the
cursor via ESC 7 / ESC 8 ought to also save and restore the current
graphics rendition attributes including foreground and background
colour settings, but it was not saving and restoring the new
term->curr_truecolour along with term->curr_attr.
So there's now a term->save_truecolour to keep that in, and also a
term->alt_save_truecolour to take account of the fact that all the
saved cursor state variables get swapped out _again_ when switching
between the main and alternate screens.
(However, there is not a term->alt_truecolour to complete the cross
product, because the _active_ graphics rendition is carried over when
switching between the terminal screens; it's only the _saved_ one from
ESC 7 / ESC 8 that is saved separately. That's consistent with the
behaviour we've had all along for ordinary fg/bg colour selection.)
I've done this on a 'where possible' basis: in Windows paletted mode
(in case anyone is still using an old enough graphics card to need
that!) I simply haven't bothered, and will completely ignore the dim
flag.
Markus Gans points out that some applications which (not at all
unreasonably) don't trust $TERM to tell them the full capabilities of
their terminal will use the sequence "OSC 4 ; nn ; ? BEL" to ask for
the colour-palette value in position nn, and they may not particularly
care _what_ the results are but they will use them to decide whether
the right number of colour palette entries even exist.
Otherwise, moving the cursor (at least in active, filled-cell mode) on
to a true-coloured character cell causes it to vanish completely
because the cell's colours override the thing that differentiates the
cursor.
I'm not sure if any X11 monochrome visuals or Windows paletted display
modes are still around, but just in case they are, we shouldn't
attempt true colour on either kind of display.
I know some users don't like any colour _at all_, and we have a
separate option to turn off xterm-style 256-colour sequences, so it
seems remiss not to have an option to disable true colour as well.
A mouse drag which manages to reach x < 0 (via SetCapture or
equivalent) was treated as having the coordinates of (x_max, y-1).
This is intended to be useful when the mouse drag is part of ordinary
raster-ordered selection.
But we were leaving that treatment enabled even for mouse actions that
went to xterm mouse tracking mode - thanks to Markus Gans for
reporting that - and when I investigated, I realised that this isn't a
sensible transformation in _rectangular_ selection mode either. Fixed
both.
This is a heavily rewritten version of a patch originally by Lorenz
Diener; it was tidied up somewhat by Christian Brabandt, and then
tidied up more by me. The basic idea is to add to the termchar
structure a pair of small structs encoding 24-bit RGB values, each
with a flag indicating whether it's turned on; if it is, it overrides
any other specification of fg or bg colour for that character cell.
I've added a test line to colours.txt containing a few example colours
from /usr/share/X11/rgb.txt. In fact it makes quite a good demo to run
the whole of rgb.txt through this treatment, with a command such as
perl -pe 's!^\s*(\d+)\s+(\d+)\s+(\d+).*$!\e[38;2;$1;$2;$3m$&\e[m!' rgb.txt
This causes PuTTY processes spawned from its system-tray menu to run
with the -restrict-acl option (or rather, the synonymous &R prefix
used by my auto-constructed command lines for easier parsing).
The previous behaviour of Pageant was never to pass -restrict-acl to
PuTTY, even when started with -restrict-acl itself; this is not
actually a silly thing to want to do, because Pageant might well have
more need of -restrict-acl than PuTTY (it stores longer-term and more
powerful secrets) and conversely PuTTY might have more need to _not_
restrict its ACL than Pageant (in that among the things enabled by an
unrestricted ACL are various kinds of accessibility software, which is
more useful on the more user-facing PuTTY than on Pageant).
But for those who want to lock everything down with every security
option possible (even though -restrict-acl is only an ad-hoc
precaution and cannot deliver any hard guarantees), this new option
should fill in the UI gap.
After a conversation this week with a user who tried to use it, it's
clear that Borland C can't build the up-to-date PuTTY without having
to make too many compromises of functionality (unsupported API
details, no 'long long' type), even above the issues that could be
worked round with extra porting ifdefs.
A user reports that a remote window title query, if the window title
is empty or if the option to return it is disabled, fails the
assertion in ldisc_send that I introduced as part of commit c269dd013
to catch any lingering uses of ldisc_send with length 0 that should
have turned into ldisc_echoedit_update. Added a check for len > 0
guarding that ldisc_send call, and likewise at one or two others I
noticed on my way here.
(Probably at some point I should decide that the period of smoking out
lingering old-style ldisc_send(0) calls is over, and declare it safe
to remove that assertion again and get rid of all the cumbersome
safety checks at call sites like these ones. But not quite yet.)
I've just upgraded my build process to a version of lld-link
that knows how to read .res, and I think it's a slightly more
commonly found format, so less confusing to encounter.
64-bit PuTTY should be loading gssapi64.dll, not gssapi32.dll. (In
contrast to the Windows system API DLLs, such as secur32.dll which is
also mentioned in the same source file; those keep the "32" in their
name whether we're in Win32 or Win64.)
When it calls through ocr->handler() to process the response to a
channel request, sometimes that call ends up back in the main SSH-2
authconn coroutine, and sometimes _that_ will call bomb_out(), which
closes the whole SSH connection and frees all the channels - so that
when control returns back up the call stack to
ssh2_msg_channel_response itself which continues working with the
channel it was passed, it's using freed memory and things go badly.
This is the sort of thing I'd _like_ to fix using some kind of
large-scale refactoring along the lines of moving all the actual
free() calls out into top-level callbacks, so that _any_ function
which is holding a pointer to something can rely on that pointer still
being valid after it calls a subroutine. But I haven't worked out all
the details of how that system should work, and doubtless it will turn
out to have problems of its own once I do, so here's a point fix which
simply checks if the whole SSH session has been closed (which is easy
- much easier than checking if that _channel_ structure still exists)
and fixes the immediate bug.
(I think this is the real fix for the problem reported by the user I
mention in commit f0126dd19, because I actually got the details wrong
in the log message for that previous commit: the user's SSH server
wasn't rejecting the _opening_ of the main session channel, it was
rejecting the "shell" channel request, so this code path was the one
being exercised. Still, the other bug was real too, so no harm done!)
A user reported a nonsensical assertion failure (claiming that
ssh->version != 2) which suggested that a channel had somehow outlived
its parent Ssh in the situation where the opening of the main session
channel is rejected by the server. Checking with valgrind suggested
that things start to go wrong at the point where we free the half-set-
up ssh->mainchan before having filled in its type field, so that the
switch in ssh_channel_close_local() picks an arbitrary wrong action.
I haven't reproduced the same failure the user reported, but with this
change, Unix plink is now valgrind-clean in that failure situation.
chiark's ftp server sometimes randomly refuses downloads. In the case
where this happens at postcheck time, this isn't really
release-blocking (all the files have been test-downloaded by precheck
already, so the main aim at this stage is to check that the 'latest'
symlink points to the right place, and even one or two successful
downloads are good enough to confirm that in practice). So now I can
add --no-ftp to the postcheck command line if that makes my life
easier.
This should have been part of commit ea0ab1c82; it's part of the
general revamp that we regenerate the autoconf files ourselves in a
clean directory - so we don't depend on them being present, but we
also don't depend on them being _absent_ either.
But when I made that commit my immediate priority was to get --setver
to work from a completely clean checkout, not from one already
littered with cruft, so I didn't check quite as carefully that my
changes fixed the problem in the latter case too :-)
In recent releases we've taken to making the actual release build (or
rather, candidates for it) ahead of time so that we can do some
slightly more thorough last-minute testing of the exact binaries that
we're going to release to everyone. It's time I actually wrote that
procedure down in the checklist, so that I remember what it is.
In particular, we had the idea that we should not properly GPG-sign
the release until the last moment, and use the presence of a set of
full GPG signatures as a means of distinguishing the real release
build from an RC that accidentally got out into the wild somehow. This
checklist update formalises that too, and documents the method I used
of ensuring the binaries weren't tampered with between RC building and
release signing (by making a signature on just the sha512sums). I also
include in this commit an extra command-line option to sign.sh to make
that preliminary signature step more convenient.
Previously, it demanded that your checkout was in a state where you
had run autoconf but not configure; so if you previously did have a
Makefile then you had to 'make distclean' to remove it, whereas if you
previously had no generated files at all (e.g. starting from a
completely clean checkout) then you had to run mkfiles.pl and
mkauto.sh to generate 'configure'.
This is obviously confusing, and moreover, the dependence on prior
generated files is fragile and prone to them having been generated
wrong. Adjusted the script so that it uses 'git archive' to get a
clean directory containing only the version-controlled files, and then
runs scripts itself to get that directory into the state it wants.
I listed a lot of other build options, but not that one. The release
checklist still recommends doing test builds with it, so it seems
sensible to arrange that you can tell if a build _is_ one of those or
not.
A lenof() being applied to a non-array, a couple of missing frees on
an error path, and a case in pfd_receive() where we could have gone
round a loop again after freeing the port forwarding structure it was
working on.
Like every other toolchain I've tried, my Coverity scanning build has
its share of random objections to parts of my Windows API type-
checking system. I do wonder if that bright idea was worth the hassle
- but it would probably cost all the annoyance all over again to back
out now...
Oops - I found this in an editor autosave file after pushing the
previous commit. I meant it to be part of the previous commit. Oh
well, better late than never.
Ilya Shipitsin sent me a list of errors reported by a tool 'cppcheck',
which I hadn't seen before, together with some fixes for things
already taken off that list. This change picks out all the things from
the remaining list that I could quickly identify as actual errors,
which it turns out are all format-string goofs along the lines of
using a %d with an unsigned int, or a %u with a signed int, or (in the
cases in charset/utf8.c) an actual _size_ mismatch which could in
principle have caused trouble on a big-endian target.