The idea is to arrange that an ssh_hash object can be reused without
having to free it and allocate a new one. So the 'final' method has
been replaced with 'digest', which does everything except the trailing
free; and there's also a new pair of methods 'reset' and 'copyfrom'
which overwrite the state of a hash with either the starting state or
a copy of another state. Meanwhile, the 'new' allocator function has
stopped performing 'reset' as a side effect; now it _just_ does the
administrative stuff (allocation, setting up vtables), and returns an
object which isn't yet ready to receive any actual data, expecting
that the caller will either reset it or copy another hash state into
it.
In particular, that means that the SHA-384 / SHA-512 pair no longer
need separate 'new' methods, because only the 'reset' part has to
change between them.
This commit makes no change to the user-facing API of wrapper
functions in ssh.h, except to add new functions which nothing yet
calls. The user-facing ssh_hash_new() calls the new and reset methods
in succession, and the copy and final methods still exist to do
new+copy and digest+free.
The number of people has been steadily increasing who read our source
code with an editor that thinks tab stops are 4 spaces apart, as
opposed to the traditional tty-derived 8 that the PuTTY code expects.
So I've been wondering for ages about just fixing it, and switching to
a spaces-only policy throughout the code. And I recently found out
about 'git blame -w', which should make this change not too disruptive
for the purposes of source-control archaeology; so perhaps now is the
time.
While I'm at it, I've also taken the opportunity to remove all the
trailing spaces from source lines (on the basis that git dislikes
them, and is the only thing that seems to have a strong opinion one
way or the other).
Apologies to anyone downstream of this code who has complicated patch
sets to rebase past this change. I don't intend it to be needed again.
Similarly to the 'AES (unaccelerated)' naming scheme I added in the
AES rewrite, the hash functions that have multiple implementations now
each come with an annotation saying which one they are.
This was more tricky for hashes than for ciphers, because the
annotation for a hash has to be a separate string literal from the
base text name, so that it can propagate into the name field for each
HMAC wrapper without looking silly.
Keeping that information alongside the hashes themselves seems more
sensible than having the HMAC code know that fact about everything it
can work with.
All the hash-specific state structures, and the functions that
directly accessed them, are now local to the source files implementing
the hashes themselves. Everywhere we previously used those types or
functions, we're now using the standard ssh_hash or ssh2_mac API.
The 'simple' functions (hmacmd5_simple, SHA_Simple etc) are now a pair
of wrappers in sshauxcrypt.c, each of which takes an algorithm
structure and can do the same conceptual thing regardless of what it
is.
This supersedes the '#ifdef TEST' main programs in sshsh256.c and
sshsh512.c. Now there's no need to build those test programs manually
on the rare occasion of modifying the hash implementations; instead
testcrypt is built every night and will run these test vectors.
RFC 6234 has some test vectors for HMAC-SHA-* as well, so I've
included the ones applicable to this implementation.
I noticed a few of these in the course of preparing the previous
commit. I must have been writing that idiom out by hand for _ages_
before it became totally habitual to #define it as 'lenof' in every
codebase I touch. Now I've gone through and replaced all the old
verbosity with nice terse lenofs.
This is the commit that f3295e0fb _should_ have been. Yesterday I just
added some typedefs so that I didn't have to wear out my fingers
typing 'struct' in new code, but what I ought to have done is to move
all the typedefs into defs.h with the rest, and then go through
cleaning up the legacy 'struct's all through the existing code.
But I was mostly trying to concentrate on getting the test suite
finished, so I just did the minimum. Now it's time to come back and do
it better.
The annoying int64.h is completely retired, since C99 guarantees a
64-bit integer type that you can actually treat like an ordinary
integer. Also, I've replaced the local typedefs uint32 and word32
(scattered through different parts of the crypto code) with the
standard uint32_t.
Ian Jackson points out that the Linux kernel has a macro of this name
with the same purpose, and suggests that it's a good idea to use the
same name as they do, so that at least some people reading one code
base might recognise it from the other.
I never really thought very hard about what order FROMFIELD's
parameters should go in, and therefore I'm pleasantly surprised to
find that my order agrees with the kernel's, so I don't have to
permute every call site as part of making this change :-)
The new version of ssh_hash has the same nice property as ssh2_mac,
that I can make the generic interface object type function directly as
a BinarySink so that clients don't have to call h->sink() and worry
about the separate sink object they get back from that.
Just as I did a few commits ago with the low-level SHA_Bytes type
functions, the ssh_hash and ssh_mac abstract types now no longer have
a direct foo->bytes() update method at all. Instead, each one has a
foo->sink() function that returns a BinarySink with the same lifetime
as the hash context, and then the caller can feed data into that in
the usual way.
This lets me get rid of a couple more duplicate marshalling routines
in ssh.c: hash_string(), hash_uint32(), hash_mpint().
In fact, those functions don't even exist any more. The only way to
get data into a primitive hash state is via the new put_* system. Of
course, that means put_data() is a viable replacement for every
previous call to one of the per-hash update functions - but just
mechanically doing that would have missed the opportunity to simplify
a lot of the call sites.
I've finally got tired of all the code throughout PuTTY that repeats
the same logic about how to format the SSH binary primitives like
uint32, string, mpint. We've got reasonably organised code in ssh.c
that appends things like that to 'struct Packet'; something similar in
sftp.c which repeats a lot of the work; utility functions in various
places to format an mpint to feed to one or another hash function; and
no end of totally ad-hoc stuff in functions like public key blob
formatters which actually have to _count up_ the size of data
painstakingly, then malloc exactly that much and mess about with
PUT_32BIT.
It's time to bring all of that into one place, and stop repeating
myself in error-prone ways everywhere. The new marshal.h defines a
system in which I centralise all the actual marshalling functions, and
then layer a touch of C macro trickery on top to allow me to (look as
if I) pass a wide range of different types to those functions, as long
as the target type has been set up in the right way to have a write()
function.
This commit adds the new header and source file, and sets up some
general centralised types (strbuf and the various hash-function
contexts like SHA_State), but doesn't use the new calls for anything
yet.
(I've also renamed some internal functions in import.c which were
using the same names that I've just defined macros over. That won't
last long - those functions are going to go away soon, so the changed
names are strictly temporary.)
This permits a hash state to be cloned in the middle of being used, so
that multiple strings with the same prefix can be hashed without
having to repeat all the computation over the prefix.
Having done that, we'll also sometimes need to free a hash state that
we aren't generating actual hash output from, so we need a free method
as well.
SHA-384 was previously not implemented at all, but is a trivial
adjustment to SHA-512 (different starting constants, and truncate the
output hash). Both are now exposed as 'ssh_hash' structures so that
key exchange methods can ask for them.