This change makes sccache run rustc with `--color=always`, removing any other
`--color` option from the compiler options so that it's not included in the
cache key.
A `color_mode` method is added to the `CompilerHasher` trait, and the value
is sent back to the client in the `CompileFinished` message, so the client can
determine whether specific `--color` options were passed without having to
do its own command-line parsing.
The client uses the new `strip-ansi-escapes` crate to remove escape codes
from the compiler output in situations where color output is not desired
(see the sketch after this list):
* When `--color=never` was passed on the compiler commandline.
* When `--color=auto` is in effect (the default) but the output is not
a terminal.
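A minimal sketch of that client-side logic, assuming the current
`strip-ansi-escapes` API (where `strip` returns a `Vec<u8>`) and using std's
`IsTerminal` for the tty check; the `ColorMode` enum here is illustrative:

```rust
use std::io::{self, IsTerminal, Write};

enum ColorMode {
    Auto,
    Always,
    Never,
}

fn emit_compiler_output(mode: ColorMode, output: &[u8]) -> io::Result<()> {
    let stdout = io::stdout();
    let want_color = match mode {
        ColorMode::Always => true,
        ColorMode::Never => false,
        // The default: colorize only when writing to a terminal.
        ColorMode::Auto => stdout.is_terminal(),
    };
    if want_color {
        stdout.lock().write_all(output)
    } else {
        // Cached output was produced with --color=always; strip the
        // escape codes before handing it to the user.
        stdout.lock().write_all(&strip_ansi_escapes::strip(output))
    }
}
```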
The existing cargo tests were lightly modified to run the compile once with
`--color=never` and once with `--color=always` to validate that cache hits
are not affected by color options.
This cache module uses a Memcached cluster. To make sccache use this, set
SCCACHE_MEMCACHED to a list of `tcp://<hostname>:<port>` addresses, separated
by whitespace.
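For example (hypothetical hostnames, default memcached port):
SCCACHE_MEMCACHED="tcp://cache-1.example.com:11211 tcp://cache-2.example.com:11211"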
This commit alters the main sccache server to operate and orchestrate its own
GNU make style jobserver. This is primarily intended for interoperation with
rustc itself.
The Rust compiler currently has a multithreaded mode where it will execute code
generation and optimization on the LLVM side of things in parallel. This
parallelism, however, can overload a machine quickly if not properly accounted
for (e.g. if 10 rustcs all spawn 10 threads...). The usage of a GNU make style
jobserver is intended to arbitrate and rate limit all these rustc instances to
ensure that one build's maximal parallelism never exceeds a particular amount.
Currently, for Rust, Cargo is the primary driver for setting up a jobserver.
Cargo creates and manages one per compilation, ensuring that any one `cargo
build` invocation never exceeds a maximal parallelism. When sccache enters the
picture, however, the story gets slightly odder.
The jobserver implementation on Unix relies on inheritance of file descriptors
in spawned processes. With sccache, however, there's no inheritance, as the
actual rustc invocation is spawned by the server, not the client. In that case
the env vars used to configure the jobserver are usually incorrect.
To handle this problem this commit bakes a jobserver directly into sccache
itself. That jobserver overrides whatever jobserver the client has configured
in its own env vars, ensuring correct operation. The two jobservers' settings
may not match (there's no way to configure sccache's jobserver right now), but
hopefully that's not too much of a problem for the foreseeable future.
The implementation here is a thin wrapper around the `jobserver` crate with a
futures-based interface. That interface is hooked into the mock command
infrastructure to automatically acquire a jobserver token when spawning a
process and automatically drop the token when the process exits. Additionally,
all spawned processes now automatically receive a configured jobserver.
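As a rough sketch (the names here are hypothetical, not sccache's actual
types), the wrapper can push the crate's blocking `acquire` onto a thread pool
and yield the token as a future:

```rust
extern crate futures;
extern crate futures_cpupool;
extern crate jobserver;

use futures::Future;
use futures_cpupool::CpuPool;
use jobserver::{Acquired, Client};

#[derive(Clone)]
struct AsyncJobserver {
    client: Client,
    pool: CpuPool,
}

impl AsyncJobserver {
    // Resolves once the jobserver grants a token; the token is
    // released back to the jobserver when the `Acquired` is dropped,
    // i.e. when the spawned process exits.
    fn acquire(&self) -> Box<Future<Item = Acquired, Error = std::io::Error> + Send> {
        let client = self.client.clone();
        Box::new(self.pool.spawn_fn(move || client.acquire()))
    }
}
```

Spawned commands can then be handed this jobserver via the crate's
`Client::configure`, which sets up the child's inherited descriptors and env
vars.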
cc rust-lang/rust#42867, the original motivation for this commit
Since cargo gained jobserver support, it now passes a CARGO_MAKEFLAGS
environment variable which is different for every build (it contains file
descriptor numbers or semaphore values). sccache uses all environment
variables starting with `CARGO_` as inputs to the hash key for Rust
compilation, so this broke things. I think it was mostly a problem on Windows,
where the values were semaphores; on POSIX platforms the values are file
descriptors, which are likely to be the same between runs of cargo.
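One way to implement the fix, which I believe matches the intent here, is to
skip CARGO_MAKEFLAGS when collecting `CARGO_`-prefixed variables for the hash
(hypothetical helper):

```rust
// Hypothetical sketch: CARGO_* environment variables still feed the
// hash key, but CARGO_MAKEFLAGS is skipped since its value (fd numbers
// or semaphore names) changes from build to build.
fn env_vars_to_hash(env: &[(String, String)]) -> Vec<(String, String)> {
    env.iter()
        .filter(|(k, _)| k.starts_with("CARGO_") && k.as_str() != "CARGO_MAKEFLAGS")
        .cloned()
        .collect()
}
```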
This change also adds a test for compiling a simple Rust crate with
cargo and verifying that we get a cache hit. I pulled in the `assert_cli`
crate to write that test, which worked nicely.
The Firefox build system does a compile with `-Fonul` to determine the
-showIncludes prefix string. Unfortunately my change in 16ecd0ba31
makes this fail, because `NamedTempFile::persist` cannot rename the output to
`nul`. Ideally we'd handle this more gracefully, but for now just refuse
to cache compiles that output to `nul`.
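The stopgap amounts to a check along these lines (hypothetical helper name):

```rust
use std::path::Path;

// Refuse to cache a compile whose object output is the Windows `nul`
// device, since the cached object can't be persisted over it.
fn output_is_cacheable(path: &Path) -> bool {
    path.file_name()
        .map_or(false, |f| !f.eq_ignore_ascii_case("nul"))
}
```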
I was seeing intermittent rustc panics when compiling with sccache.
It turns out that rustc will wind up looking for rlibs in the output dir,
and sometimes it finds ones that we're in the middle of writing out from
cache and panics trying to read the half-written file.
This change makes us write the cache entry to a tempfile and atomically
rename it to the final name.
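A minimal sketch of that write path, assuming the `tempfile` crate:

```rust
use std::io::{self, Write};
use std::path::Path;

use tempfile::NamedTempFile;

// Write the cache entry to a tempfile in the destination directory,
// then atomically rename it into place. Readers (like rustc) see
// either the old file or the complete new one, never a partial write.
fn write_atomically(dest: &Path, data: &[u8]) -> io::Result<()> {
    let dir = dest
        .parent()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "no parent dir"))?;
    let mut tmp = NamedTempFile::new_in(dir)?;
    tmp.write_all(data)?;
    // persist() renames, which is atomic on the same filesystem.
    tmp.persist(dest).map_err(|e| e.error)?;
    Ok(())
}
```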
Local benchmarking showed that the implementation of SHA-512 in the *ring* crate
is 3x faster than the implementation of SHA-1 in the `sha1` crate. I've also
noticed that 80%+ of sccache's runtime on a fully cached build is spent hashing.
With this change I noticed a decrease from 108s to 92s when building a fully
cached LLVM over the network. Not a huge win, but if the network were faster
it could perhaps add up!
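For reference, the hashing boils down to *ring*'s digest API (a standalone
sketch, not sccache's actual hashing code):

```rust
use std::fs::File;
use std::io::{BufReader, Read};
use std::path::Path;

use ring::digest::{Context, SHA512};

// Stream a file through SHA-512 and return the raw digest bytes.
fn hash_file(path: &Path) -> std::io::Result<Vec<u8>> {
    let mut ctx = Context::new(&SHA512);
    let mut reader = BufReader::new(File::open(path)?);
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        ctx.update(&buf[..n]);
    }
    Ok(ctx.finish().as_ref().to_vec())
}
```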
Closes #108
Currently the system tests are not running in our Windows CI because
cl.exe is not in PATH. They also won't run in a local `cargo test` if
not run from a shell with the MSVC environment set. This is unfortunate
for many reasons. This change fixes that by using the `gcc` crate to locate
MSVC. The compiler invocations in the system tests needed some fixup to
use the MSVC environment variables returned by the `gcc` crate as well.
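The lookup boils down to something like this (the target triple here is
illustrative; these registry helpers live on in the `gcc` crate's successor,
`cc`):

```rust
extern crate gcc;

use std::process::Command;

// Locate cl.exe via the Windows registry rather than relying on PATH.
fn msvc_cl() -> Option<Command> {
    let tool = gcc::windows_registry::find_tool("x86_64-pc-windows-msvc", "cl.exe")?;
    // to_command() returns a Command with the MSVC environment
    // (INCLUDE, LIB, PATH, ...) already applied, so the tests no
    // longer depend on being run from an MSVC shell.
    Some(tool.to_command())
}
```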
This cache module uses a Redis instance. To make sccache use this,
set SCCACHE_REDIS to `redis://[:<passwd>@]<hostname>[:port][/<db>]`.
The maximum and current cache sizes are retrieved with the Redis
INFO and CONFIG GET commands.
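For example (hypothetical host, password, and database):
SCCACHE_REDIS=redis://:hunter2@cache.example.com:6379/0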
This commit migrates away from the `protobuf` crate to instead just working with
bincode on the wire as a serialization format. This is done by leveraging a few
different crates:
* The `bincode` and `serde_derive` crates are used to define serialization
for Rust structures as well as provide a bincode implementation.
* The `tokio_io::codec::length_delimited` module implements framing via length
prefixes to transform an asynchronous stream of bytes into a literal `Stream`
of `BytesMut`.
* The `tokio_serde_bincode` crate is then used to tie it all together, parsing
these `BytesMut` as the request/response types of sccache.
Most of the changes here are related to moving away from the protobuf API
throughout the codebase (e.g. `has_foo` and `take_foo`) towards a more
rustic-ish API that just uses enums/structs. Overall it felt quite natural (as
one would expect) to just use the raw enum/struct values.
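For illustration, the wire story now amounts to plain derived types plus
bincode (assuming bincode 1.x's API; these request shapes are hypothetical,
not sccache's actual protocol types):

```rust
#[macro_use]
extern crate serde_derive;
extern crate bincode;

// A plain Rust enum with derived serialization; no generated
// has_foo/take_foo accessors, just ordinary pattern matching.
#[derive(Serialize, Deserialize, Debug, PartialEq)]
enum Request {
    GetStats,
    Compile { exe: String, args: Vec<String> },
}

fn main() {
    let req = Request::Compile {
        exe: "rustc".to_string(),
        args: vec!["-O".to_string()],
    };
    let bytes = bincode::serialize(&req).unwrap();
    let back: Request = bincode::deserialize(&bytes).unwrap();
    assert_eq!(req, back);
}
```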
This may not be quite as performant as before, but that doesn't really matter
for sccache's use case, where perf is hugely dominated by actually compiling
and hashing, so I'm not too worried about it.
My personal motivation for this is twofold:
1. Using `protobuf` was a little clunky throughout the codebase and definitely
had some sharp edges that felt good to smooth out.
2. There's currently what I believe to be a mysterious segfault and/or stray
   write happening in sccache, and I'm not sure where. The `protobuf` crate
   has a lot of `unsafe` code, and in lieu of actually auditing it I figured
   it'd be good to kill two birds with one stone. I have no idea if this fixes
   my segfault problem (I never could reproduce it), but I figured it's worth
   a shot.
This is just an update of the Tokio stack abstractions and doesn't have much
impact on sccache itself, but I figured it'd be good to update a few deps and
pick up the recent version of things!
Plus, I'd like to start using the length_delimited module soon...
Previous versions of futures-cpupool depended on crossbeam, which
unfortunately has known (spurious) segfaults, so the updated version no longer
uses crossbeam and instead relies on channels from the standard library.
I was sporadically receiving a segfault locally when trying to debug issues on
Windows, and in tracking this down I discovered blackbeam/named_pipe#3, which
leads to segfaults locally on startup.
This switches the one use case to the relevant functionality in
`mio-named-pipes` (already pulled in as part of `tokio-process`) and then
otherwise mirrors the same logic as the Unix version, just waiting for a byte
with a timeout.
This commit pulls in Hyper from its master branch, which contains the
asynchronous/Tokio integration. All HTTP usage, namely when interacting
with S3, is now powered by async Hyper and works with futures rather than
synchronous requests on thread pools.
This commit replaces all process spawning with the futures-powered
versions in the `tokio-process` crate. Most functions were already prepared
to return futures; a few needed minor updates to do so.
Additionally, some `&CpuPool` arguments have shown up to allow executing
work on a thread pool, such as filesystem operations. A number of
compilers write data to tempdirs and such before running a compiler, and
these are all executed on thread pools rather than the main thread.
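For reference, spawning through `tokio-process` (futures 0.1 / tokio-core era)
looks roughly like this standalone sketch:

```rust
extern crate tokio_core;
extern crate tokio_process;

use std::process::Command;

use tokio_core::reactor::Core;
use tokio_process::CommandExt;

fn main() {
    let mut core = Core::new().unwrap();
    let handle = core.handle();
    // output_async runs the child on the event loop and resolves to
    // its std::process::Output once it exits.
    let child = Command::new("rustc").arg("--version").output_async(&handle);
    let output = core.run(child).unwrap();
    println!("{}", String::from_utf8_lossy(&output.stdout));
}
```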
My main takeaway from this commit is that translating synchronous code to
asynchronous code isn't always easy. The "easy" version ends up being
relatively wasteful in terms of clones, which matters for low-level
performance but probably not much in the context of spawning processes.
Other than that, though, I found that there was a translation for all the
patterns involved, and I didn't actually hit any bugs (yet at least) from the
translation.
This commit rewrites the `server` module of sccache to be backed with Tokio. The
previous version was written with `mio`, which Tokio is built on, but is
unfortunately less ergonomic. Tokio is the state-of-the-art for asynchronous
programming in Rust and sccache serves as a great testing ground for ergonomics!
It's intended that the support added here will eventually extend to many other
operations that sccache does. For example, thread spawning has all been
replaced with `CpuPool` to have a shared pool for I/O operations and such
(namely the filesystem). Eventually the HTTP requests made by the S3 backend
can be integrated with the Tokio branch of Hyper to run on the event loop
instead of in a worker thread. I'd also like to extend this with
`tokio-process` to move process spawning off helper threads, but I'm leaving
that to a future commit.
Overall I found the transition quite smooth, with the high-level
architecture looking like:
* The `tokio-proto` crate is used in streaming mode. The streaming part is used
for the one RPC sccache gets which requires a second response to be sent later
on. This second response is the "response body" in tokio-proto terms.
* All of sccache's logic is manifested in an implementation of the `Service`
trait.
* The transport layer is provided with `tokio_core::io::{Framed, Codec}`, and
simple deserialization/serialization is performed with protobuf.
Some differences in design are:
* The `SccacheService` for now is just a bunch of reference-counted pointers,
  making it cheap to clone. As the futures it returns progress, they will each
  retain a reference to a cloned copy of the `SccacheService`. Before, all this
  data was just stored and manipulated in a struct directly, but it's now
  managed through shared, reference-counted state (see the sketch after this
  list).
* The storage backends share a thread pool with the main server instead of
spawning threads.
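A minimal sketch of that service shape (tokio-service 0.1 era; the
request/response types here are illustrative):

```rust
extern crate futures;
extern crate tokio_service;

use std::cell::RefCell;
use std::io;
use std::rc::Rc;

use futures::{future, Future};
use tokio_service::Service;

#[derive(Clone)]
struct SccacheService {
    // Shared, reference-counted state; cloning the service is cheap.
    stats: Rc<RefCell<u64>>,
}

impl Service for SccacheService {
    type Request = String;
    type Response = String;
    type Error = io::Error;
    type Future = Box<Future<Item = Self::Response, Error = Self::Error>>;

    fn call(&self, req: Self::Request) -> Self::Future {
        *self.stats.borrow_mut() += 1;
        // Futures returned here may retain a clone of `self`.
        Box::new(future::ok(format!("handled: {}", req)))
    }
}
```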
And finally, some things I've learned along the way:
* Sharing data between futures isn't a trivial operation. It took an explicit
decision to use `Rc` and I'm not sure I'm 100% happy with how the ergonomics
played out.
* Shutdown is pretty tricky here. I've tried to carry over all the previous
logic but it definitely required not using `TcpServer` in tokio-proto at the
very least, and otherwise required a few custom futures and such to track the
various states. I have a hunch that tokio-proto could provide more options out
of the box for something like this.
This change makes `get_cached_or_compile` return a `Future` for the
cache write instead of blocking on it. The server then waits for the
result on a thread. It also changes the server to record cache write
timings, and display the average in the stats.
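The new shape is roughly the following (a hypothetical signature with a
stand-in result type, not the actual code):

```rust
extern crate futures;

use std::time::{Duration, Instant};

use futures::{future, Future};

struct CompileOutcome; // stand-in for the real compile result

// Return the compile result plus a future for the cache write; the
// future resolves to the write duration so the server can record it.
fn get_cached_or_compile() -> (CompileOutcome, Box<Future<Item = Duration, Error = String> + Send>) {
    let start = Instant::now();
    // The real code kicks off the storage write here; this stand-in
    // resolves immediately with the elapsed time.
    let write = future::ok::<(), String>(()).map(move |_| start.elapsed());
    (CompileOutcome, Box::new(write))
}
```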
Unlike sccache1, you need to specify both bucket and region, like:
SCCACHE_BUCKET=sccache
SCCACHE_REGION=us-east-1
Pretty limited testing, no automated tests currently.