Also allow individual builders to override GOPROXY explicitly.
The vetall builder expects to build golang.org/x/tools from a specific
revision, and in module mode that implies that it needs access to all
of the module dependencies of that revision.
The individual subrepo builders may need a GOPROXY to resolve
dependencies when the default value of GO111MODULE changes to "on",
since they will enter module mode automatically even though they are
running within GOPATH.
Updates golang/go#30228
Change-Id: I40f6f2ea3c5ed05eaaaf3503ba97e6ccd3f20e1f
Reviewed-on: https://go-review.googlesource.com/c/build/+/165738
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
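A minimal sketch of the per-builder GOPROXY override described in the commit above. The function name, the env-slice layout, and the proxy URL are hypothetical and are not taken from the x/build code; the sketch only shows the "explicit value wins over the default" rule.

```go
package main

import (
	"fmt"
	"strings"
)

// goproxyFor returns the GOPROXY value a builder should run with.
// An explicit GOPROXY entry in the builder's own environment wins over
// the default. Hypothetical helper, for illustration only.
func goproxyFor(builderEnv []string, defaultProxy string) string {
	proxy := defaultProxy
	for _, kv := range builderEnv {
		if strings.HasPrefix(kv, "GOPROXY=") {
			// Later entries override earlier ones, matching how os/exec
			// treats duplicate keys in Cmd.Env.
			proxy = strings.TrimPrefix(kv, "GOPROXY=")
		}
	}
	return proxy
}

func main() {
	const def = "http://proxy.example:30157" // made-up default proxy address
	fmt.Println(goproxyFor([]string{"GO111MODULE=on"}, def))
	fmt.Println(goproxyFor([]string{"GO111MODULE=on", "GOPROXY=off"}, def))
}
```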
Now that we can do nested virtualization on GCE, that means we can run
the Android emulator (which requires KVM) on GCE and at least get
fast trybots and such for android-amd64.
Updates golang/go#23824
Change-Id: I0da38c7fa0f15492230a31291d2921ba72f2151d
Reviewed-on: https://go-review.googlesource.com/c/163738
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We previously hard-coded a Linux-only static map of 2015 data in for
the critical path scheduling of cmd/dist tests over N sharded
buildlets. That worked well only for Linux and only in 2015.
Instead, query BigQuery to find out what the recent timing data looks
like for all builders.
I'd started to work on this back in CL 30716 (Oct 2016) but apparently
never finished. Yay me. But skip the writing-to-CSV step. BigQuery is
much faster than I remember (maybe it got faster?), so just query it
directly. The query takes about 2 seconds, and we only do it every
hour (which is still overkill; daily is probably fine).
Change-Id: I498fc09dfaf24fb1f11b2c0ab4b952b2f15f9c32
Reviewed-on: https://go-review.googlesource.com/c/160037
Reviewed-by: Andrew Bonventre <andybons@golang.org>
The new BuildConfig.MinimumGoVersion field specifies the minimum
Go version the builder is allowed to use. It's useful when some
of the builders are too new, and do not support all of the supported
Go releases (e.g., openbsd-amd64-64 and freebsd-amd64-12_0 currently
require Go 1.11 and don't work on Go 1.10).
It only needs to be set when a builder doesn't support all supported
Go releases, since we don't typically test unsupported Go releases.
To allow cmd/coordinator to use this field and filter out work it
receives from maintner/maintnerd's GoFindTryWork RPC call,
we add a GoVersion slice to apipb.GerritTryWorkItem,
and populate it in the maintapi.apiService.GoFindTryWork method.
For trybots on the Go repo, the GoVersion field is determined from
the branch name. For "release-branch.goX.Y" branches, it parses out
the major-minor Go version from the branch name. For master and
other branches, it assumes the latest Go release.
For trybots on subrepos, we already have the Go version information
available, so use it directly.
Afterwards, all that's left is to modify newTrySet in cmd/coordinator
to make use of BuildConfig.MinimumGoVersion and work.GoVersion to skip
builders that are too new for the Go version that needs to be tested.
Fixes golang/go#29265
Change-Id: I50b01830647e33e37e9eb8b89e0f2518812fa44f
Reviewed-on: https://go-review.googlesource.com/c/155463
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
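The branch-name rule and the minimum-version filter described above can be sketched as follows. The type and function names here are hypothetical and much simpler than the real dashboard and maintapi code; only the "release-branch.goX.Y" naming comes from the commit message.

```go
package main

import "fmt"

// goVersion is a major-minor Go version such as 1.11.
type goVersion struct{ Major, Minor int }

// less reports whether v is older than u.
func (v goVersion) less(u goVersion) bool {
	if v.Major != u.Major {
		return v.Major < u.Major
	}
	return v.Minor < u.Minor
}

// versionForBranch parses "release-branch.goX.Y" branch names.
// Any other branch (master, dev branches) is assumed to be at latest.
func versionForBranch(branch string, latest goVersion) goVersion {
	var v goVersion
	if _, err := fmt.Sscanf(branch, "release-branch.go%d.%d", &v.Major, &v.Minor); err == nil {
		return v
	}
	return latest
}

// builderSupports reports whether a builder with the given minimum
// version can test Go version v. A zero minimum means "no restriction".
func builderSupports(minimum, v goVersion) bool {
	return !v.less(minimum)
}

func main() {
	latest := goVersion{1, 12}
	minimum := goVersion{1, 11} // e.g. a builder that needs Go 1.11 or newer
	for _, branch := range []string{"master", "release-branch.go1.11", "release-branch.go1.10"} {
		v := versionForBranch(branch, latest)
		fmt.Printf("%s: go%d.%d, run builder: %v\n", branch, v.Major, v.Minor, builderSupports(minimum, v))
	}
}
```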
Only the "oauth2" and "build" repos for now, as a test. We'll lock
down policy more later and decide when to do this automatically.
Also, this currently only runs buildlets which run in our GCP project,
because we're not yet proxying a localhost:3000 port from the
reverse buildlets to an authenticated TLS connection back to our
module proxy service on GKE.
Updates golang/go#14594
Fixes golang/go#29637
Change-Id: I6f05da2186b38dc8056081252563a82c50f0ce05
Reviewed-on: https://go-review.googlesource.com/c/157438
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Andrew Bonventre <andybons@golang.org>
Part of getting the x/build repo passing on the builders.
Change-Id: I3f2055cbe91c03ddc0a5152bfdbc0f377f354f47
Reviewed-on: https://go-review.googlesource.com/c/157441
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
subTryBuilders no longer exists; it was factored out into another
package in CL 145157. That logic now happens within the call to
dashboard.TryBuildersForProject function. Remove the reference to it
since it's inaccurate and confusing.
Change-Id: I74d57717541a4ee4fe57f3ae9f46d3fc097cfce1
Reviewed-on: https://go-review.googlesource.com/c/155462
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This change takes care of a TODO comment. It makes the builder
configuration more centralized and contained in the dashboard
package.
Invert it for consistency with BuildConfig.BuildRepo method.
Updates golang/go#26791
Change-Id: I46368adadb85f2ec730da4fc0abe5fd6a112a7c7
Reviewed-on: https://go-review.googlesource.com/c/149738
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This change is required for Alan's vet refactor and change to how
vetall works in CL 149097.
Change-Id: Ia7864c1b2430c696a457d3f0de2820b9037d78ab
Reviewed-on: https://go-review.googlesource.com/c/149658
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Fixes assumption broken by https://golang.org/cl/143538 where number
of builders (types of builders) no longer equals the number of trybot
builds. (We now run linux-amd64 1 or 3 times depending on the repo)
Fixes golang/go#28714
Change-Id: I3b85adbb79508890d16311fc75f4b48ffc1f3c78
Reviewed-on: https://go-review.googlesource.com/c/149437
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
BuildConfig grew way too large and has too many slices & such to be a
value type. Make it a pointer type, which matches HostConfig.
Change-Id: Ie625bece9d6d8c1ec6cff26e77416bd9b1f256d8
Reviewed-on: https://go-review.googlesource.com/c/145077
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
This is the coordinator half of the change to make the coordinator
test subrepos against two prior releases of Go.
The next change that's required is for the maintnerd API server to
return the name of the past two releases.
For now this change is a no-op. But once maintnerd starts returning
more data, this change will start adding two more linux-amd64
builders per subrepo trybot.
Updates golang/go#17626
Change-Id: I1cedbc2e4eee51fb003c8bcc8e072fd10717a833
Reviewed-on: https://go-review.googlesource.com/c/143538
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
The coordinator used to run on GCE and for some reason (graceful
upgrades, freeing up memory leaks?), the process would exit after it
had been up for 10 minutes and had no work to do.
Delete that. We run on Kubernetes now and don't need graceful upgrades
and shouldn't be leaking resources. (If we do, we should be fixing
that instead.)
Also it appears that this code no longer does anything.
Change-Id: I9b5f2881692117b7c99dd2e513ee146f43170791
Reviewed-on: https://go-review.googlesource.com/132075
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Once containers run on COS instead of Kubernetes, one name (Kube*) is
wrong and the other (GCE) is ambiguous. So rename them now to be more
specific.
No behavior changes. Just renaming in this step, to reduce size of
next CL.
Updates golang/go#25108
Change-Id: Ib09eb682ef74acbbf6ed50b46074f834ef5e0c0b
Reviewed-on: https://go-review.googlesource.com/111639
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Issue golang/go#22852 modifies formatting for Go tables. If the
code stays as is, users running Go 1.9 will format the file one way
and users running tip will format the file a different way. This is
problematic when git-codereview attempts to check the file is gofmt'ed
correctly, and does not let you mail commits that aren't.
Adding another newline makes the file compatible with tip and the
stable release branch and avoids this problem.
Change-Id: I0dbfe01db6e07578993e793e3710702ff2518436
Reviewed-on: https://go-review.googlesource.com/104836
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Load style.css so we share some styles with the Go Build Coordinator
homepage. Add some custom styles so text in tables is a little bigger
- there's less of it on the trybot page.
Instead of formatting to the nearest tenth of a minute, we can round
to the nearest second - Duration.Round() is now two Go versions old
- and use more natural duration formatting.
Instead of issuing many small writes to the ResponseWriter, write HTML
to a buffer and then write it to the ResponseWriter all at once. (We
should just use a template but that's a bigger project.)
Add a "/try-dev" handler you can enable with the `-dev` build tag
which makes it easy to view and test the "trybot status" page locally,
without requiring a builder.
Change-Id: I28617d02ed857c28d2bb2d9ccfb05ca9dc572212
Reviewed-on: https://go-review.googlesource.com/103870
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Allow any origin to make cross-origin requests to get builder
statuses.
Also fixes a bug with an ineffectual Header().Set(...) call since
WriteHeader was called before it.
Change-Id: Ic2867624a286dad7268ddb94c3bb5afb52bf84ac
Reviewed-on: https://go-review.googlesource.com/98995
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This adds a darwin-amd64-race builder. Of the four platforms that
support the race detector, this was the only one missing a builder.
To make up for the extra work, skip some redundant tests that the Mac
(and other) builds do. The test directory is really slow, and doesn't
have platform-specific code. We have amd64 & 386 test coverage for
that directory via the Linux builders, which is fine. No need to do it
on some of the heavier VM builders, or for trybots.
Fixes golang/go#17674
Change-Id: I888119b3fdd28884b87a59d78746ab1fef6960c7
Reviewed-on: https://go-review.googlesource.com/82395
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The problem is especially severe on the slow builders,
like darwin/arm64, which can sometimes not have run
any commit from the past 12 hours and still pick an old
commit for its next trial. It would be far better for it to
prefer new work.
It's possible that this new logic should be disabled for
some of the auto-scaling builders, but for now it seems
like we can try it for all of them and see if that's OK.
Update golang/go#19178.
Change-Id: I32cc67c0c2c84130b40b250675b40aadb4a0a681
Reviewed-on: https://go-review.googlesource.com/70430
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Sarah Adams <shadams@google.com>
Instead of polling Gerrit every 60 seconds, poll maintnerd every
second. This should improve TryBot-requested to TryBot-running latency
by 30 seconds on average.
(Ideally it'd long poll for changes and not even have ~500ms delay,
but this is an improvement.)
Also, in the process this does most of the work for golang/go#17626.
And this cleans up the repo head tracking complication in the coordinator
and just asks maintner for it instead.
Running benchmarks in the coordinator has been disabled since
2017-06-23 but this CL disables it further, removing some code it'd
need to work. But considering that benchmarking would need code
repairs to get working again anyway, I consider this acceptable in the
short term. I left TODO notes.
Updates golang/go#17626
Updates golang/go#19871
Change-Id: Idab43f65a9cca861bffb37748b036a5c9baee864
Reviewed-on: https://go-review.googlesource.com/51590
Reviewed-by: Andrew Bonventre <andybons@golang.org>
This adds an SSH server to farmer.golang.org on port 2222 that proxies
SSH connections to users' gomote-created buildlet instances.
For example:
$ gomote create openbsd-amd64-60
user-bradfitz-openbsd-amd64-60-1
$ gomote ssh user-bradfitz-openbsd-amd64-60-1
Warning: Permanently added '[localhost]:33351' (ECDSA) to the list of known hosts.
OpenBSD 6.0 (GENERIC.MP) #2319: Tue Jul 26 13:00:43 MDT 2016
Welcome to OpenBSD: The proactively secure Unix-like operating system.
Please use the sendbug(1) utility to report bugs in the system.
Before reporting a bug, please try to reproduce it with the latest
version of the code. With bug reports, please try to ensure that
enough information to reproduce the problem is enclosed, and if a
known fix for it exists, include that as well.
$
As before, if the coordinator process is restarted (or crashes, is
evicted, etc), all gomote instances die.
Not yet supported:
* scp (help wanted)
* not all host types are configured. most are. some will need slight
config tweaks to the Docker image (e.g. adding openssh-server)
Currently supported:
* linux-amd64 (host type shared by 386, nacl)
* linux-arm
* linux-arm64
* darwin
* freebsd
* openbsd
* plan9-386
* windows
Implementation details:
* the ssh server process listens on port 2222 in the coordinator
(farmer.golang.org), which is behind a GKE TCP load balancer.
* the ssh server library is github.com/gliderlabs/ssh
* authentication is done via Github users' public keys. It's assumed
that gomote user == github user. But there's a mapping in the code
for known exceptions.
* we can't give out access to this too widely. too many things are
accessible from within the host environment if you look in the right
places. Details omitted. But the Go team and other trusted gomote
users can use this.
* the buildlet binary has a new /connect-ssh handler that acts like a
CONNECT request but instead of taking an explicit host:port, just
says "give me your machine's SSH connection". The buildlet can also
start sshd if needed for the environment. The /connect-ssh handler
also installs the coordinator's public key.
* a new buildlet client library method "ConnectSSH" hits the /connect-ssh
handler and returns a net.Conn.
* the coordinator's ssh.Handler is just running the OpenSSH ssh client.
* because the OpenSSH ssh child process can't connect to a net.Conn,
an ephemeral localhost port is created on the coordinator to proxy
between the ssh client and the net.Conn returned by ConnectSSH.
* The /connect-ssh handler requires http.Hijacker, which requires
fully compliant net.Conn implementations as of Go 1.8. So I needed
to flesh out revdial too, testing it with the
golang.org/x/net/nettest package.
* plan9 doesn't have an ssh server, so we use 0intro's new conterm
program (drawterm without GUI support) to connect to plan9 from the
coordinator ssh proxy instead of using the OpenSSH ssh client
binary.
* windows doesn't have an ssh server, so we enable the telnet service
and the coordinator ssh proxy uses telnet instead on the backend
on the private network. (There is a Windows ssh server but only in
new versions.)
Happy debugging over ssh!
Fixes golang/go#19956
Change-Id: I80a62064c5f85af1f195f980c862ba29af4015f0
Reviewed-on: https://go-review.googlesource.com/50750
Reviewed-by: Herbie Ong <herbie@google.com>
Reviewed-by: Jessie Frazelle <me@jessfraz.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
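The ephemeral localhost port trick from the implementation notes above can be sketched like this: listen on 127.0.0.1:0, accept one connection (in the real system, from the ssh child process), and copy bytes in both directions between it and an existing net.Conn. This is an assumed simplification; the function names are invented, and a net.Pipe echo stands in for the buildlet side.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// proxyLocal listens on an ephemeral localhost port, accepts a single
// connection, and copies bytes between it and backend in both
// directions. It returns the address a local client should dial.
func proxyLocal(backend net.Conn) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		return "", err
	}
	go func() {
		c, err := ln.Accept()
		ln.Close() // only one client is expected
		if err != nil {
			backend.Close()
			return
		}
		done := make(chan struct{}, 2)
		go func() { io.Copy(c, backend); done <- struct{}{} }()
		go func() { io.Copy(backend, c); done <- struct{}{} }()
		<-done // when either direction ends, tear down both sides
		c.Close()
		backend.Close()
	}()
	return ln.Addr().String(), nil
}

// echoThroughProxy sends msg through the proxy to an echoing backend
// and returns what comes back.
func echoThroughProxy(msg string) (string, error) {
	near, far := net.Pipe()
	go io.Copy(far, far) // stand-in for the remote end: echo everything

	addr, err := proxyLocal(near)
	if err != nil {
		return "", err
	}
	c, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer c.Close()
	if _, err := io.WriteString(c, msg); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(c, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := echoThroughProxy("ping")
	fmt.Println(got, err)
}
```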
tryKey had both a Project and Repo string field, and both had the same
value from Gerrit. Delete Repo and use Project (the Gerrit term) instead.
Change-Id: Ie1a68376a980dfc2b839544bc56faa63655ddb58
Reviewed-on: https://go-review.googlesource.com/51530
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
FreeBSD 9.x and 10.x are currently broken on GCE. Move to FreeBSD 11
for Trybots.
Also, disable NetBSD in the config, not in the code.
And disable plan9 and openbsd-386, which are also broken on GCE due to
the same GCE bug.
Change-Id: I86f9b2690f39b6aa76b4bb3a900a4302c9dc7830
Reviewed-on: https://go-review.googlesource.com/50011
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This moves getSourceTgz* to a new package called sourcecache. The
original functions couldn't be left in coordinator.go because that
would cause a single process to have multiple caches.
The code for building and running benchmarks is moved into the buildgo
package, and the public API becomes GoBuilder.EnumerateBenchmarks and
Run on the resulting BenchmarkItem objects.
Updates golang/go#19871
Change-Id: I28b660e1cdaa6d1c6b0378c08de30f5e58316cc6
Reviewed-on: https://go-review.googlesource.com/44211
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This moves RunMake from coordinator.go, allowing other packages to
build Go on a buildlet.
Updates golang/go#19871
Change-Id: Ic0e7c056020ef434f0620ae816f52b9112ea1c8d
Reviewed-on: https://go-review.googlesource.com/44176
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This package contains the BuilderRev type moved from cmd/coordinator.
The rest of the CL is simply updating coordinator with the new
exported names of the type and its fields.
This refactoring is in preparation for moving the benchmark building and running
code into a separate package.
(Most of the diff could have been avoided with a type alias, but I
assume we'd rather not do that.)
Updates golang/go#19871
Change-Id: Ib6ce49431c8529d6b4e72725d3cd652b9d0160db
Reviewed-on: https://go-review.googlesource.com/44175
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This moves main.spanLogger and main.eventSpan to spanlog.Logger and
spanlog.Span. This change is necessary to start pulling build and
benchmark code out of coordinator. The interfaces cannot simply be
copied because Logger contains a function whose return type is
Span, so the interface must be defined in exactly one place.
Updates golang/go#19871
Change-Id: I0a48192e6a5c8f5d0445f4f3d3cce8d91c90f8d3
Reviewed-on: https://go-review.googlesource.com/44174
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This refactoring is in preparation for moving benchmarks.go to a
separate package.
Updates golang/go#19871
Change-Id: I2b30bf5416937e52b603aec8102131fdccceee42
Reviewed-on: https://go-review.googlesource.com/44173
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We still run some of it (at least for now, it's harmless, and in its
own goroutine), but not the -c part.
Fixes golang/go#20222
Change-Id: I37a39a41c8a5f2ebaee70aeba390c58b22399d3d
Reviewed-on: https://go-review.googlesource.com/43619
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This was lost in one of the many refactorings of CL 38306.
Updates golang/go#19871
Change-Id: I4514d4350feb1420861eae9a9dd021fee5b5ba6f
Reviewed-on: https://go-review.googlesource.com/43052
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
- Make URLs point to correct external IP
- Disable windows-amd64-2008 builder type that doesn't exist in
the staging farm
Updates golang/go#18817
Change-Id: Id64a63694f90e70c4fd78f9d1433ed5031822111
Reviewed-on: https://go-review.googlesource.com/42850
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This will also be used by maintner.golang.org shortly.
Updates golang/go#19866
Change-Id: Id952065831920a206e3cb97bd1f451ceaea34927
Reviewed-on: https://go-review.googlesource.com/42146
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This adds a boolean to flag a cross compiler config to always be
used for every build, not just when in staging or if it's a try-bot.
It's turned on for the arm5spacemonkey build machines since they take
about 27 minutes to run make.bash.
Fixes golang/go#20118
Change-Id: I13a398d1645a2b5f0c87c5fffc7933922ac1365e
Reviewed-on: https://go-review.googlesource.com/41755
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The windows-amd64-2008 is the same OS as windows-amd64-gce but is an
auto-generated image.
TODO: 386 auto-generated Windows builders, and then maybe we'll move
TryBots to Windows Server 2016. One step at a time. This should be a
no-op. I verified performance is the same.
Updates golang/go#17513
Change-Id: I34984db14b87d03771e15465978b1687df6895f7
Reviewed-on: https://go-review.googlesource.com/41611
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We've been logging event spans to datastore for years, but I'd lost
this CL and just found it again. This does two things: syncs the
datastore logs to BigQuery, and starts to use the from-BigQuery timing
info in the coordinator for scheduling sharded tests.
The plan was to have a job occasionally do a BigQuery query and write
out the results to a CSV file on GCS. The code to read that CSV file
is in this CL, but that code path is disabled, so this CL should be a
no-op.
A future change will periodically do the query and write the CSV file,
and then we can start using the new code path and remove the static
map of expected test durations.
Updates golang/go#12669
Change-Id: Ibe5b41d6a3009c2ade8ab728fa1cad646788e621
Reviewed-on: https://go-review.googlesource.com/30716
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Benchmarks are treated as unit tests and distributed to the test
helpers, which allows them to fit in our 5m trybot budget.
Currently we only run the go1 and x/benchmarks. Running package
benchmarks is a TODO.
This feature is disabled by default, and is enabled by the
"farmer-run-bench" project attribute.
Updates golang/go#19178
Updates golang/go#19871
Change-Id: I9c3a14da60c3662e7e2cb4e71953060915cc4364
Reviewed-on: https://go-review.googlesource.com/38306
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Previously we printed duration to seven decimal points after the
second, which isn't helpful or necessary to determine how long a
build took. Instead, round the duration to the nearest tenth of a
second (if the build took more than 10 seconds), the nearest hundredth
of a second (if it took 1-10 seconds), or the nearest tenth of a
millisecond (if it took less than a second), which should be more than
enough precision and is much easier to read.
Change-Id: I1c29d4a81335bf9ee3cddda0a341d3f321e82f6b
Reviewed-on: https://go-review.googlesource.com/40855
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The snapshot code had an old workaround that's no longer relevant. Remove it.
Also, add a BuildConfig.SkipSnapshot bool, and use it for the slow
mips builders.
Fixes golang/go#19953
Change-Id: I114bb0a524184eaaae5be4715ce63f6adc519c2e
Reviewed-on: https://go-review.googlesource.com/40505
Reviewed-by: Sarah Adams <shadams@google.com>
The NetBSD builders were supposed to all be skipped by
buildRev.skipBuild, but there was another code path generating work
(the bootstrap-a-new-builder-column code) that wasn't using skipBuild,
which meant when the NetBSD column totally disappeared (as we wanted),
the builder tried to bring it back to life, doing a bunch of hanging
builds.
Change-Id: I976016a485e241b46f9cf8bb8b371e1038bb5f8c
Reviewed-on: https://go-review.googlesource.com/40470
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently all builds start and think they're running, but most are
just fighting over a mutex to grab a builder. That will be fixed, but
in the meantime it's nice to see what's actually working vs what's
waiting on e.g. arm5 hardware which won't be available for hours.
This is a baby step towards more monitoring. Currently this is just HTML
output, but the same data could be exported via JSON or something else later
for graphing.
Updates golang/go#19178 (add a buildlet scheduler)
Updates golang/go#15760 (monitor everything)
Change-Id: I36e16ea0919afe8023fe7fedd981f2e857f0d6df
Reviewed-on: https://go-review.googlesource.com/40397
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-- move policy of which builders are trybots out of coordinator
and into dashboard/builders.go.
-- move some GCE-specific code from coordinator.go to gce.go.
-- rename an old "watcher" reference to "gitmirror"
-- add some docs
-- actually add the Alpine builder, missing from https://golang.org/cl/33890
Fixes golang/go#17891
Change-Id: Ia63671ca09aec322ed57b3663e0ac5042cdc56f2
Reviewed-on: https://go-review.googlesource.com/40395
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Map iteration is random, so it's possible for two simultaneous calls
to getStatus to deadlock due to the use of defer to unlock.
Change-Id: Ib2c0b4122bd5ea17fde5e4c60da1a853a12ed2fd
Reviewed-on: https://go-review.googlesource.com/40301
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
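A sketch of the hazard described above, under the assumption that getStatus ranged over a map and locked each entry with a deferred unlock. With defer, every lock is held until the function returns, and because two callers may visit the map in different orders, each can end up waiting on a lock the other holds. Unlocking inside the loop keeps at most one lock held at a time. The types and names below are invented.

```go
package main

import (
	"fmt"
	"sync"
)

// buildStatus is a hypothetical per-build record guarded by its own mutex.
type buildStatus struct {
	mu    sync.Mutex
	state string
}

// snapshot copies the state of every build while holding at most one
// lock at a time.
func snapshot(statuses map[string]*buildStatus) map[string]string {
	out := make(map[string]string, len(statuses))
	for name, st := range statuses {
		st.mu.Lock()
		out[name] = st.state
		// Unlock now rather than with defer: a deferred unlock would keep
		// this lock held while the loop goes on to acquire the others.
		st.mu.Unlock()
	}
	return out
}

func main() {
	statuses := map[string]*buildStatus{
		"linux-amd64": {state: "running"},
		"darwin-arm64": {state: "waiting"},
	}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			snapshot(statuses) // safe to call concurrently
		}()
	}
	wg.Wait()
	fmt.Println(len(snapshot(statuses)), "builds")
}
```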
statusDone contains the last maxStatusDone (currently 30) logs, but
logs for in-progress try sets are still available via the trySet until
the trySet is complete.
Currently, if you go to the main status page of the farmer, there is
an "Active Trybot Runs" section at the top. If you click on the status
link (taking you to /try?commit=xxx), it shows partial logs for each
of the buildlets. However, if you click the /temporarylogs link for
one of those buildlets, you get a 404 if that buildStatus has fallen
off the end of maxStatusDone. This CL makes it so that any
/temporarylogs link you can get from a /try page will always contain
the actual log, which we were already storing in memory anyway.
Change-Id: Ia07b9c8b1a16da23936b9b32e39827e591e3e471
Reviewed-on: https://go-review.googlesource.com/39312
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Specifically test that the context is not canceled when the create
function returns, which is the case that bit us yesterday. This
still fails "go vet" and we should figure out how to pass the cancel
function down or otherwise restructure the code.
Add the beginnings of a buildlet test framework, which will hopefully
make it easier to write more tests without needing to start up a
cluster or integrate with GCE. It would be nice if buildlet.Client
was an interface, not a concrete type, so we could store various
pieces of information on it and then test for them.
Change-Id: I75acc3252b59110647a8ad8d8f7c7cce66598a17
Reviewed-on: https://go-review.googlesource.com/37146
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The one remaining error reported by "go vet" is the un-canceled
context in cmd/coordinator/remote.go, which isn't an error.
Change-Id: Ie8d9269fc77203cf82d0a724dd33e4871a8113a7
Reviewed-on: https://go-review.googlesource.com/39350
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Previously the server would fail to start if a valid token could not
be found. The local development environment still doesn't do much, but
it should let you view and edit the HTML and confirm that the server
starts, so ignore the error if we are running in dev mode.
Add a short README explaining how to start and view the coordinator
server locally.
Fixes golang/go#19828.
Fixes golang/go#18291.
Change-Id: I91a3ce49e1e9ea18ca19f3867edb2c71fc1b5124
Reviewed-on: https://go-review.googlesource.com/39297
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The x/crypto/autocert package recently updated to use the context
package in the standard library, which means that cmd/coordinator no
longer works.
Also update the Dockerfile to build the acme/autocert dependency
that was recently added to cmd/coordinator.
Change-Id: I016ba797e1e67c50822c7ac786f502fcc008100b
Reviewed-on: https://go-review.googlesource.com/39330
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Starting the coordinator in the "watcher" role makes the binary
immediately exit with an error, so remove it.
Change-Id: Icfe4d7e60d0c7edf6e02804fd3dfcc6f76e4daf7
Reviewed-on: https://go-review.googlesource.com/39293
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Since farmer.golang.org has a valid certificate in production now, we
should use that certificate for retrieving data.
Change-Id: I3685d84953ea11e6ac09b4692fa27c73538a565c
Reviewed-on: https://go-review.googlesource.com/39290
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The buildlets were partially updated to support this in
https://golang.org/cl/38792 but overlooked that ServerName == "go" was
still hard-coded. This CL also fixes that in the buildlet.
Fixes golang/go#16442
Change-Id: Ia2b794bdf9df8ab75875b9951b53a7bb5f5f6afe
Reviewed-on: https://go-review.googlesource.com/38798
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
These never work and are a waste of resources and unnecessary scary
red on the page.
Change-Id: I2d8f8908096dee01618129a9082e98f3ffd92bb7
Reviewed-on: https://go-review.googlesource.com/37467
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Will be used for dynamic creation/destruction of Mac VMs in subsequent CL.
Updates golang/go#9495 (Mac virtualization)
Updates golang/go#15760 (monitoring)
Change-Id: I48b17589b258d5d742bad5a3ddae18de98778149
Reviewed-on: https://go-review.googlesource.com/37457
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(been in production for past few days)
Updates golang/go#19166
Change-Id: Iee301aac5aa87c6644ac1253d7c0bd6c522ff066
Reviewed-on: https://go-review.googlesource.com/37380
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The talks repo got too big.
Also clean up the gitmirror fetching code while there. It no longer
runs as a subprocess, so we don't need to account for its start-up
time.
Fixes golang/go#19162
Change-Id: I0f80b95360d079989254c26ea0406dab9633f0c1
Reviewed-on: https://go-review.googlesource.com/37217
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This lets us set a timeout on the HTTP request, which lets us time out the
writeSnapshot call in cmd/coordinator.
Fixes golang/go#18812.
Change-Id: I370448df4d95130c9c5b30ba32459ce844a6c967
Reviewed-on: https://go-review.googlesource.com/36897
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
If the Gerrit query reported no CLs needed trybot runs, we returned
too early and didn't clear our state. Remove that 'if len(cis) == 0'
early return. The optimization was buggy and not even worth much if
correct.
Also rename some confusing variables.
Change-Id: I485d303c36cc477e3ac651ec25b2c777f512b658
Reviewed-on: https://go-review.googlesource.com/36808
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Switch all API methods that make requests to Gerrit to take a
context.Context as their first argument. Adds a package example and
a simple test that we make requests to the correct endpoint and that
the Client can handle correct responses and error responses from the
Gerrit server.
Switches all code in the x/build tree to use the new Gerrit
client. There are several projects outside the tree that import
x/build/gerrit; I'll submit CLs against those to pull in the new
interface once this gets merged.
Documents that the API is unstable.
Fixes golang/go#18742.
Fixes golang/go#18743.
Change-Id: Ifa78cbb058981e23cf5769955f6312fcbe08e174
Reviewed-on: https://go-review.googlesource.com/35559
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Saves 4.5 minutes or so by using fast x86 machines to build the ARM
build instead of running make.bash on Scaleway ARM machines.
We still run the tests on ARM, and have a separate builder only
running make.bash on ARM (see prior golang.org/cl/29670)
Fixes golang/go#17105
Updates golang/go#17104
Change-Id: I1cb7b0e5b1cc8b644195f262328884ed3aff120a
Reviewed-on: https://go-review.googlesource.com/29677
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This is a new builder in prep for the change to the "linux-arm"
builder where the GOARCH=arm make.bash will be cross-compiled from a
Kubernetes container on fast hardware.
Updates golang/go#17105 (cross-compile ARM builders' make.bash)
Updates golang/go#17104 (5 minute trybots)
Change-Id: Icfd2644d77639f731151abe54839322960418254
Reviewed-on: https://go-review.googlesource.com/29670
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Our builders have names of the form "GOOS-GOARCH" or
"GOOS-GOARCH-suffix".
Over time we've grown many builders. This CL doesn't change
that. Builders continue to be named and operate as before.
Previously the build configuration file (dashboard/builders.go) made
each builder type ("linux-amd64-race", etc) define how to create a
host running a buildlet of that type, even though many builders had
identical host configs. For example, these builders all share the same
host type (a Kubernetes container):
linux-amd64
linux-amd64-race
linux-386
linux-386-387
And these are the same host type (a GCE VM):
windows-amd64-gce
windows-amd64-race
windows-386-gce
This CL creates a new concept of a "hostType" which defines how
the buildlet is created (Kube, GCE, Reverse, and how), and then each
builder itself references a host type.
Users never see the hostType (except perhaps in gomote list output),
and they never need to care about it.
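
As a rough illustration of the split described above, here is a minimal
sketch in which each builder only names a host type and several builders
share one host definition. The type names, field names, host type names
and image names are all hypothetical and are not the actual
dashboard/builders.go schema.

```go
package main

import "fmt"

// hostConfig describes how a buildlet host is created.
// Fields are illustrative assumptions, not the real schema.
type hostConfig struct {
	Kind  string // "kube", "gce", or "reverse"
	Image string // hypothetical container or VM image name
}

// buildConfig is a builder; it only references a host type by name.
type buildConfig struct {
	Name     string
	HostType string
}

// hosts holds one entry per host type (names are made up).
var hosts = map[string]hostConfig{
	"host-linux-kube":  {Kind: "kube", Image: "linux-x86-std"},
	"host-windows-gce": {Kind: "gce", Image: "windows-buildlet"},
}

// builders lists the builder names from the message above.
var builders = []buildConfig{
	{Name: "linux-amd64", HostType: "host-linux-kube"},
	{Name: "linux-amd64-race", HostType: "host-linux-kube"},
	{Name: "linux-386", HostType: "host-linux-kube"},
	{Name: "linux-386-387", HostType: "host-linux-kube"},
	{Name: "windows-amd64-gce", HostType: "host-windows-gce"},
	{Name: "windows-amd64-race", HostType: "host-windows-gce"},
	{Name: "windows-386-gce", HostType: "host-windows-gce"},
}

// hostFor resolves a builder name to its shared host configuration.
func hostFor(name string) (hostConfig, error) {
	for _, b := range builders {
		if b.Name != name {
			continue
		}
		h, ok := hosts[b.HostType]
		if !ok {
			return hostConfig{}, fmt.Errorf("builder %q: unknown host type %q", name, b.HostType)
		}
		return h, nil
	}
	return hostConfig{}, fmt.Errorf("unknown builder %q", name)
}

func main() {
	for _, b := range builders {
		h, err := hostFor(b.Name)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%-20s -> %s (%s)\n", b.Name, b.HostType, h.Kind)
	}
}
```

With this shape, a pre-warmed host of one type can serve any builder
that references it.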
Reverse buildlets now can only be one hostType at a time, which
simplifies things. We stopped using multiple roles per machine
once we moved to VMs for OS X.
gomote continues to operate as it did previously but its underlying
protocol changed and clients will need to be updated. As a new
feature, gomote now has a new flag to let you reuse a buildlet host
connection for different builder rules if they share the same
underlying host type. But users can ignore that.
This CL is a long-standing TODO (previously attempted and aborted) and
will make many things easier and faster, including the linux-arm
cross-compilation effort, and keeping pre-warmed buildlets of VM types
ready to go.
Updates golang/go#17104
Change-Id: Iad8387f48680424a8441e878a2f4762bf79ea4d2
Reviewed-on: https://go-review.googlesource.com/29551
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Also, don't start obtaining test sharding buildlets early if they're
reverse buildlets. Reverse buildlets are either immediately available,
or they're not. No point monopolizing them earlier than needed.
Updates golang/go#17104
Change-Id: If5a0bbd0c59b55750adfeeaa8d0f81cdbcc8ad48
Reviewed-on: https://go-review.googlesource.com/29473
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
It's been there for the past two releases. We can assume it's present.
Change-Id: I94ac107d4c52ea823d42a63ced571ff4c81532a2
Reviewed-on: https://go-review.googlesource.com/29471
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
If the coordinator has a better guess as to how long various tests
take, then it can do better critical path scheduling and reduce the
overall time the sharded tests take to complete.
This table still needs to die and be based on recent empirical data,
but at least it's more accurate now, after a long delay in being
updated.
Update is from https://github.com/golang/go/issues/17104#issuecomment-247123458
Updates golang/go#17104
Change-Id: I115aad23fbdb0cde1b196e71a4131fbe36480cc0
Reviewed-on: https://go-review.googlesource.com/29167
Reviewed-by: David Crawshaw <crawshaw@golang.org>
I'm aiming to have trybot runs finish in under 5 minutes.
This CL removes openbsd-386-gce58 and freebsd-386-gce101 from the trybot set.
openbsd-386-gce58 is the slowest builder. Over the past week it took an
average of 722 seconds (95th percentile: 923 seconds), and
that's sharded over 4 machines. Too slow. It's not worth the resources
to keep it as a trybot. It hasn't caught any interesting bugs. This
builder will still run, but not as a pre-submit trybot.
freebsd-386-gce101 is not slow, but we're removing it to shift its
resources to shard other builders wider.
The coordinator now supports varying the build sharding width based on
whether a build is for a trybot or not. This CL defines separate
numbers for each, sharding builds wider as needed for some trybots.
freebsd-amd64-gce101 goes from 4 to 5 machines in try runs, and down
to 3 when not in try runs.
linux-amd64-race gets one more machine during try runs, and one fewer
in regular runs.
linux-arm goes from 7 machines always, to 3 or 8, depending on whether
it's a try run.
openbsd-amd64-58 goes from 4 to 3 or 6.
windows-amd64-gce goes from 4 to 2 or 6.
windows-amd64-race goes from 4 to 2 or 6.
darwin-amd64-10_11 goes from 3 to 3 or 4.
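
A minimal sketch of per-builder sharding widths that depend on whether
the build is a try run. The type and function names are hypothetical.
The freebsd numbers come from the text above; for linux-arm the sketch
assumes the larger of the two numbers applies to try runs.

```go
package main

import "fmt"

// shardWidth holds the number of machines used for a builder.
type shardWidth struct {
	Regular int // post-submit builds
	Try     int // pre-submit trybot runs
}

// widths is keyed by builder name.
var widths = map[string]shardWidth{
	"freebsd-amd64-gce101": {Regular: 3, Try: 5},
	// Assumption: "3 or 8" means 3 regular, 8 for try runs.
	"linux-arm": {Regular: 3, Try: 8},
}

// machines returns how many machines to use for a build.
// Builders without an entry are not sharded (width 1).
func machines(builder string, isTry bool) int {
	w, ok := widths[builder]
	if !ok {
		return 1
	}
	if isTry {
		return w.Try
	}
	return w.Regular
}

func main() {
	for _, isTry := range []bool{false, true} {
		fmt.Printf("freebsd-amd64-gce101 try=%v: %d machines\n",
			isTry, machines("freebsd-amd64-gce101", isTry))
	}
}
```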
I'll see how these do over the next few days and readjust as needed.
Also in this CL: fix the constants for the expected duration of
make.bash, which impact when we schedule the creation of test sharding
helper buildlets. We were creating them too early before, wasting
resources.
Change-Id: I38a9b24841e196f1eb668de058c49af8c1d1c64f
Reviewed-on: https://go-review.googlesource.com/29116
Reviewed-by: Quentin Smith <quentin@golang.org>
Ignore the old darwin-{amd64,386}-10_10 builders. Don't give them an
error, but pretend they don't exist.
Also: switch trybots from OS X 10.10 to OS X 10.11, and re-enable
sharding. Let's hope for the best. See golang/go#12979.
This also enables subrepo tests for all OS X versions.
darwin-386-* is currently offline, pending some golang/go#17009
Updates golang/go#9495 (OS X virtualization)
Change-Id: I4d53a79087404b5e8051d1aff0c668a92625f442
Reviewed-on: https://go-review.googlesource.com/28583
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
If the buildlet interface reports that "dist test list" was a
"remoteErr" (the remote side successfully ran the command and relayed
its exit status to us, and it was a failure), don't treat that as an
"exec error" (that is, a communication or temporary error).
This is what tied up all the builders forever the other night when the
arm and ppc64 builds were broken, unable to run the go binary, since
they were forever retrying and not making progress. They should've
reported an error to the dashboard.
This was the fix deployed last night.
Change-Id: I08ce15d87eac5374eb907d4a7c72278106ecaba9
Reviewed-on: https://go-review.googlesource.com/27655
Reviewed-by: Andrew Gerrand <adg@golang.org>
* Use ssh instead of https. (git-remote-https seems buggy and slow,
unable to deal with 40k+ refs efficiently at all)
* Add /debug/watcher/<repo> status pages on the coordinator (proxying
through to the watcher child process)
* more logging
* do mirroring of refs in batches, prioritizing heads and tags over
refs/changes/* (although this is admittedly less useful now that we
use ssh instead of https).
* update go-watcher-world from git 1.7 to git 2.8 (and Debian sid)
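
A minimal sketch of the batching idea from the list above: order refs so
that heads and tags come before refs/changes/*, then cut the ordered
list into fixed-size batches. The function names and the batch size are
assumptions; the watcher's real code may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// refPriority ranks refs: lower values are mirrored first.
func refPriority(ref string) int {
	switch {
	case strings.HasPrefix(ref, "refs/heads/"):
		return 0
	case strings.HasPrefix(ref, "refs/tags/"):
		return 1
	case strings.HasPrefix(ref, "refs/changes/"):
		return 3
	default:
		return 2
	}
}

// batchRefs returns refs in priority order, split into batches of at
// most size refs each. The input slice is not modified.
func batchRefs(refs []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	sorted := append([]string(nil), refs...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return refPriority(sorted[i]) < refPriority(sorted[j])
	})
	var batches [][]string
	for len(sorted) > 0 {
		n := size
		if n > len(sorted) {
			n = len(sorted)
		}
		batches = append(batches, sorted[:n])
		sorted = sorted[n:]
	}
	return batches
}

func main() {
	refs := []string{
		"refs/changes/01/1/1",
		"refs/tags/go1.6",
		"refs/heads/master",
		"refs/changes/02/2/1",
	}
	for i, b := range batchRefs(refs, 2) {
		fmt.Println("batch", i, b)
	}
}
```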
Fixes golang/go#16388
Change-Id: If3e2cf67afddd544892886a466938e8f46df8c95
Reviewed-on: https://go-review.googlesource.com/25110
Reviewed-by: Andrew Gerrand <adg@golang.org>
* Move buildlet pools to top
* add #id elements to headings so we can link to #fragments
* collapse details of recently-completed builds
* link to build logs of recently-completed builds anyway
Change-Id: I39fc109e673c9bbbfe2e4904dbdbd6b4cbe42b6f
Reviewed-on: https://go-review.googlesource.com/22825
Reviewed-by: Andrew Gerrand <adg@golang.org>
Now that it's on Kubernetes, it's fast and cheap to shard it wider.
Change-Id: I2858f4c417ef26661575a348fbfc47f4c25d81a2
Reviewed-on: https://go-review.googlesource.com/22795
Reviewed-by: Andrew Gerrand <adg@golang.org>
If a commit doesn't have the new cmd/dist -compile-only flag, just skip
the ssacheck tests.
Change-Id: I41a5ce39968a073cd4be97e31bd26ffc5cf68699
Reviewed-on: https://go-review.googlesource.com/22793
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The blue in-progress gophers have been missing in build.golang.org
because the magic log span strings changed.
Change-Id: I702883d8a6f6eea31aaaaa25ebe10e929992a677
Reviewed-on: https://go-review.googlesource.com/22747
Reviewed-by: Ross Light <light@google.com>
We already had processStartTime.
Change-Id: I3b1522d9b365ad06af19518354d2bde9af766f4c
Reviewed-on: https://go-review.googlesource.com/22703
Reviewed-by: Andrew Gerrand <adg@golang.org>
For now, just log the process's start time and alive/heartbeat time.
Updates golang/go#12669
Change-Id: I4dd28f7ba761b5487f86a4cdb0f721b2aeb4fa57
Reviewed-on: https://go-review.googlesource.com/22701
Reviewed-by: Andrew Gerrand <adg@golang.org>
Most of these are documentation fixes.
Change-Id: If100cc2ef0ee0b8b47a1e96e4c51a2be8caf5842
Reviewed-on: https://go-review.googlesource.com/22366
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
These IDs will become the cloud datastore keys for keeping history and logs.
Many more CLs will follow: changing farmer URLs, logging to datastore,
fetching old logs from datastore, etc.
Updates golang/go#13076 (make all logs URLs permanent)
Updates golang/go#12669 (collect logs, stats)
Change-Id: I5b9fd21bf23581c59724b0ed32c8459baa9683f7
Reviewed-on: https://go-review.googlesource.com/21968
Reviewed-by: Andrew Gerrand <adg@golang.org>
Until we can use historical data, keep this roughly-accurate handmade
table up to date. This helps with the coordinator's critical path
scheduling of tests over buildlets. (I noticed this stuck at the end
of a build, rather than the beginning)
Change-Id: I6e64b138b1f5b3163157d93c59ba19c25c959730
Reviewed-on: https://go-review.googlesource.com/21966
Reviewed-by: Andrew Gerrand <adg@golang.org>
Improvements to support rapid scheduling of many build jobs:
- Retry logic in Kubernetes client to handle sporadic connection
closes from their API server under heavy load
- Cluster autoscaler scales on default CPU utilization metric
- Debug mode allows scheduling multiple builds to test scaling
- Account for scheduled vs. provisioned resources in a cluster
and use that information to estimate when a build's pod
will be scheduled and in running state
- Use estimated scheduled time to set context timeout
- Track pod lifecycle (requested time, estimated available time,
actual available time, terminate time, etc)
Change-Id: I14d6c5e01af0970dbb3390a29d1ee5c43049fff8
Reviewed-on: https://go-review.googlesource.com/19524
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Next step is logging it to datastore/bigquery. Then stats and
visualizing with trace viewer later.
Updates golang/go#12669
Change-Id: I039097791993dbe3d4aceb18d2a08db8299f3c05
Reviewed-on: https://go-review.googlesource.com/21640
Reviewed-by: Andrew Gerrand <adg@golang.org>
Now events and phases of the build are represented by a span, with a
stop and start, and each has timing and a success status.
These will be nicer to visualize (later) and will be able to be logged and
queried in Cloud Datastore + BigQuery.
Also, remove the overloaded and not-to-be-used "rev" parameter from
the GetBuildlet interface. An upcoming change disconnects the
buildlet's lifetime from that of a single build, so the rev isn't even
known at the time of the GetBuildlet call; the buildlet will be handed
out to the most worthy current waiter (and the rev might not have even
existed at the time the GetBuildlet call started).
Updates golang/go#12669
Change-Id: I02b28fea8bbbc16dc95c1ea0507418be01aa8ca8
Reviewed-on: https://go-review.googlesource.com/21570
Reviewed-by: Andrew Gerrand <adg@golang.org>
One was in buildenv anyway.
The other hasn't been used in ages.
Change-Id: I1d5f652ca3bf713f59d50a11bdba1339653867bb
Reviewed-on: https://go-review.googlesource.com/20978
Reviewed-by: Andrew Gerrand <adg@golang.org>
Also update all users, and rename another template field.
Includes changes to coordinator, gomote, and release.
Change-Id: I1c4408eadbcb83d61063a910dfa18cc395952bc2
Reviewed-on: https://go-review.googlesource.com/20976
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
buildenv.Environment type defines configuration options:
- Coordinator uses the GCE project name to look up config. A custom
config name can be provided at runtime to override.
- The conventional prod and stage project names ('symbolic-datum-552'
and 'go-dashboard-dev') map to prod and staging configuration structs.
- Production and staging status is explicitly defined in configuration.
- GCS bucket names for buildlet, logs, and snapshots are
configurable.
Change-Id: I7e6d7874eb0bdfe35dbdd5fcf6212ab50d576b88
Reviewed-on: https://go-review.googlesource.com/19502
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
OpenBSD 5.8 is the current release and OpenBSD 5.6 is no longer supported.
Revise build script:
- Use the auto installer and disklabel templates built into later versions of
OpenBSD, rather than entirely using expect.
- Rather than duplicating the entire script for openbsd-386, provide an ARCH
environment variable that switches between openbsd/amd64 and openbsd/i386.
Have the openbsd-386 script invoke the openbsd-amd64 script with the
appropriate environment.
- Remove the 'ignore classless-static-routes' option for dhclient, as it is
no longer needed for OpenBSD 5.7 and later.
- Clean up after ourselves, rather than leaving a bunch of temporary files
lying around.
Updates issue golang/go#13029.
Change-Id: Ic1b11dd5eded317b7be32b8f1c2485617ac02b78
Reviewed-on: https://go-review.googlesource.com/18358
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We were burning a fair bit of CPU here.
Updates golang/go#12671
Change-Id: Iba16e1caa80a1f0b7330098209acaf6d46ce048f
Reviewed-on: https://go-review.googlesource.com/17130
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* flag allows debug (trigger a build) in prod mode
* use default zone (us-central1-f) in staging
* compute.CloudPlatformScope satisfies compute and storage scope
requirements
* new buildlet for linux-amd64 on Kubernetes
Updates golang/go#12546
Change-Id: I1a0cbc0e0c0552b7ac56943c646378194508d48f
Reviewed-on: https://go-review.googlesource.com/17102
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
There are several places where the coordinator passes a CL's Change-ID
back to the Gerrit API. However, these APIs will fail if there are
multiple changes with the same Change-ID (for example, if there's a
cherry-pick). One consequence of this is that the coordinator can't
post comments on changes with ambiguous Change-IDs, which means it
can't post trybot status or update the TryBot-Result label.
Because of this ambiguity, the Gerrit APIs also accept a more complete
and unique triple of (project, branch, Change-ID). This commit adds a
ChangeTriple method to tryKey that returns this triple and modifies
all API calls to use this, as well as a few places on the status
pages.
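
A minimal sketch of such a method. Gerrit's REST API accepts a change
identifier of the form project~branch~Change-Id; the struct fields here
are assumptions about tryKey, and the values in main are only an
example.

```go
package main

import "fmt"

// tryKey identifies a change under test. Field names are assumed.
type tryKey struct {
	Project  string // e.g. "build"
	Branch   string // e.g. "master"
	ChangeID string // the I... Change-Id from the commit message
}

// ChangeTriple returns the unambiguous identifier that Gerrit accepts
// in place of a bare Change-Id: "project~branch~Change-Id".
// A project name containing "/" would need URL-encoding before being
// placed in a request path; that step is left out of this sketch.
func (k tryKey) ChangeTriple() string {
	return fmt.Sprintf("%s~%s~%s", k.Project, k.Branch, k.ChangeID)
}

func main() {
	k := tryKey{
		Project:  "build",
		Branch:   "master",
		ChangeID: "Ia13f66a5cb27a14dfc4c95b7e460110d2dbefe62",
	}
	fmt.Println(k.ChangeTriple())
}
```

A cherry-pick of the same change to another branch shares the Change-Id
but yields a different triple.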
Fixes golang/go#13331.
Change-Id: Ia13f66a5cb27a14dfc4c95b7e460110d2dbefe62
Reviewed-on: https://go-review.googlesource.com/17071
Reviewed-by: Andrew Gerrand <adg@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
GCE rolled back the change that broke Plan 9.
Fixes golang/go#13077
Change-Id: Ifb04ce74fd593a2ab773ff56f8850a6f252ace48
Reviewed-on: https://go-review.googlesource.com/16947
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* status page shows kube pool details
* pods created by the coordinator are tracked
* pods that fail to create are deleted
* pods older than delete-at are deleted
* pods created by a different coordinator are deleted
Updates golang/go#12546
Change-Id: I4c4f8ff906962b4a014a66d0a9d490ff17710d62
Reviewed-on: https://go-review.googlesource.com/16101
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
GCE broke it. GCE is being fixed, but not immediately.
Updates golang/go#13077
Change-Id: I289e382b0a4affc4fc8faee5da053474d1cabc35
Reviewed-on: https://go-review.googlesource.com/16314
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
* Replaced cancel with context.Context
* StartPod can be canceled
* Wait for buildlet to come online, but fail fast if pod fails first
* Support timeout waiting for pod to leave pending phase
* Use Kubernetes watch API (long poll)
Updates golang/go#12546
Change-Id: I792a3b8fed615362a0290feee7de0c2cefe43c0e
Reviewed-on: https://go-review.googlesource.com/15285
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Not sure what I was thinking. Hopefully a copy/paste-o.
Fixes golang/go#12813
Change-Id: Idf63301f8f4098db07c3203575a085f7dab8e6a6
Reviewed-on: https://go-review.googlesource.com/15253
Reviewed-by: Austin Clements <austin@google.com>
* set correct metadata as env vars on each container in pod
* stage0 detects when running in a pod
* add scope to support farmer on GCE communicating with Google
Container Engine API
* dev mode supports GCE and Kubernetes even when not running
on GCE
* pod buildlet returns a configured http.Client
* pod buildlets work, but pods are not removed after completion
(or on creation failure)
Updates golang/go#12546
Change-Id: If91673b49223130c1e7077c130f1abe1e7966d02
Reviewed-on: https://go-review.googlesource.com/15041
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Find the default Kubernetes cluster and configure a client to talk to it.
Use application default credentials.
Updates golang/go#12546
Change-Id: Ifb1ce57f52f4fbbee3267f8cc3cf02a78146bd5b
Reviewed-on: https://go-review.googlesource.com/14532
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This CL makes the main buildlet and any helper buildlets all
equivalent once we get to the final stage of the build (test
execution). Now the death of the main buildlet isn't critical and the
testing continues as long as any buildlet is alive (or any helper
buildlet is on its way).
It also makes timeouts (20 minutes) of remote commands fatal, now that
health checking is reliable for all buildlet types.
Fixes golang/go#12685
Change-Id: I07d1e1fabf1027ee24ce679e4fcc8bd09bed6d83
Reviewed-on: https://go-review.googlesource.com/14774
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We were creating corrupt snapshots before (golang/go#12671), so
double-check them before use, for now. We can relax this in the future
(see TODO).
Fixes golang/go#12671
Change-Id: I10b7929d4b78c717f29c67a0ce495ecd435a1d26
Reviewed-on: https://go-review.googlesource.com/14772
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The previous blacklisting was because of golang/go#12668, since fixed.
Further, the tests are now sharded (golang/go#12623) and this test
name doesn't even exist, so this change is a no-op.
Change-Id: Id141923e904a5aec91e53ba6f1d44e6cdf137cd2
Reviewed-on: https://go-review.googlesource.com/14773
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Also show build steps with -x in Makefile. It's slow to use go build,
but go install doesn't quite work. Need to figure that out.
Change-Id: I79093dc7c84912b3f04c9e4708c9f65086c4445c
Reviewed-on: https://go-review.googlesource.com/14733
Reviewed-by: Andrew Gerrand <adg@golang.org>
Now Close always means destroy.
The old code & API was a half-baked implementation of a (lack of)
design where there was a difference between being done with a buildlet
(with it possibly being reused by somebody else?) and you wanting to
nuke it completely. Unfortunately this just grew messier and more
broken over time.
This attempts to clean it all up.
We can add the sharing ideas back later when there's actually a design
and implementation. (We're not losing anything with this CL, because
nothing ever shared buildlets)
In fact, this CL fixes a problem where reverse buildlets weren't
getting their underlying net.Conns closed when their healthchecks
failed, leading to 4+ hour (and counting) build hangs due to buildlets
getting killed at inopportune moments (e.g. me testing running a
reverse buildlet on my home mac to help out with the dashboard
backlog)
Change-Id: I07be09f4d5f0f09d35e51e41c48b1296b71bb9b5
Reviewed-on: https://go-review.googlesource.com/14585
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
* reverse buildlet rework (multiplexed TCP connections, instead
of a hacky reverse roundtripper)
* scaleway ARM image improvements
* parallel gzip implementation, which makes things ~8x faster on
Scaleway.
* merge watcher into the coordinator, for easier deployments
Change-Id: I55d769f982e6583b261435309faa1f718a15fde1
Reviewed-on: https://go-review.googlesource.com/12665
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
- record 'go list' failures in sub-repos as test failures.
- don't try to run sharded tests on old go 1.4 trees.
- improved logging.
Change-Id: Ia0ca72caf5fc02cddfefdd1c87d68108901a16be
Reviewed-on: https://go-review.googlesource.com/14284
Reviewed-by: Andrew Gerrand <adg@golang.org>
The coordinator doesn't know how to fetch dependencies outside
golang.org/x/... so we should allow it to use Go 1.5's new vendoring
support.
Change-Id: I5e9a8a7e574a1dc728b2b0a8a35616001ff6bc46
Reviewed-on: https://go-review.googlesource.com/12783
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
We solved the mentioned issue when we got cmd/release working.
Fixes golang/go#11812.
Change-Id: I163f6757c5e930629fb8b840701e9e3ce7edf3cf
Reviewed-on: https://go-review.googlesource.com/12754
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This creates a mechanism for clients (such as cmd/release and
cmd/gomote) to obtain buildlets via the coordinator. Previously
cmd/release and cmd/gomote could only create GCE VMs themselves, and
required the GCE project's credentials. In addition to the awkwardness
of needing to hand out the GCE credentials, it also meant ARM and
Darwin buildlets (which use the reverse buildlet pool) weren't usable.
Instead, this creates a new auth mechanism where the coordinator is
contacted over TLS with key pinning (the CA system isn't used) in the
same way that the reverse builders already dialed into the
coordinator, and then a "user build type" and hash are sent as the
username and password. The same master key is used to sign user
builder keys, and they always start with "user-" (which isn't a GOOS).
Then the coordinator provides an API to create and list buildlets.
They auto-expire after a duration and are auto-renewed upon use.
The buildlet library (as used by cmd/release etc) then proxies HTTP
requests via the coordinator to the backend buildlet.
See doc/remote-buildlet.txt for protocol details.
Change-Id: I12e27eae788fdd91927cb182b950893dc759f8e9
Reviewed-on: https://go-review.googlesource.com/11901
Reviewed-by: Andrew Gerrand <adg@golang.org>
Add background heartbeats to detect dead GCE VMs (the OpenBSD
buildlets seem to hang a lot).
Add timeouts to test executions.
Take helper buildlets out of service once they're marked bad.
Keep the in-order buildlet running forever when sharding tests, in
case all the helpers die. (observed once)
Keep a cache of recently deleted VMs and don't try to delete VMs again
if we've recently deleted them. (they're slow to delete)
Make reverse buildlets more paranoid in their health checking and closing
of the connection.
Make status page link to /try set URLs.
Also, better logging (more sometimes, less others), and misc bug fixes.
Change-Id: I57a5e8e39381234006cac4dd799b655d64be71bb
Reviewed-on: https://go-review.googlesource.com/10981
Reviewed-by: Andrew Gerrand <adg@golang.org>
There were too many blue gophers too early on build.golang.org.
Wait until we have quota first (GCE quota or reverse buildlet pool
acquired).
Updates golang/go#10716
Change-Id: I076e89f319d68d55a61f3f8faf0242342c5ae825
Reviewed-on: https://go-review.googlesource.com/10921
Reviewed-by: Andrew Gerrand <adg@golang.org>
Also: heartbeat reverse buildlets more often
Also: adjust cmd/go test expected duration.
Change-Id: I75fbb66a0b20ac7357ad7cf78fc545101ac9aa33
Reviewed-on: https://go-review.googlesource.com/10884
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
Per request of iant and rsc.
Change-Id: Ifae7b6c1a83240382635a87ccc2c8d8aa429b901
Reviewed-on: https://go-review.googlesource.com/10848
Reviewed-by: Andrew Gerrand <adg@golang.org>