Commit graph

98 commits

Author SHA1 Message Date
Alexander Rakoczy 22bdacadd1 cmd/coordinator,buildlet: keep active GCE SSH sessions alive
This CL skips deleting active remote buildlets.

The coordinator has multiple ways of tracking stale buildlets. For our
GCE buildlets, we periodically delete old VMs after their expiration
time, typically 45 minutes after their creation. The expiration tracking
in coordinator/gce.go does not account for remote buildlets, which are
buildlets created by users or cmd/release. Remote buildlets have their
own staleness checks and cleanup process, so we should skip the GCE
specific cleanup logic for them.

This adds an additional field to the buildlet Client in order to
correlate a GCE VM with a buildlet.

Updates golang/go#37001

Change-Id: Ib0acdf79c4dfbee6e0061c513f98b749d4b9cc64
Reviewed-on: https://go-review.googlesource.com/c/build/+/217722
Run-TryBot: Alexander Rakoczy <alex@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2020-02-06 15:35:01 +00:00
Alexander Rakoczy 65fa539f5d buildlet: use a dedicated service account on GCE
buildlet should use a specific service account, allowing fine-grained
permission control that is different from the default service account
for whichever project they are run in.

Change-Id: I7a86308d6b65f370dfc49649ef10686d1d8b2974
Reviewed-on: https://go-review.googlesource.com/c/build/+/210958
Run-TryBot: Alexander Rakoczy <alex@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
2019-12-13 17:33:05 +00:00
Brad Fitzpatrick 326548a346 cmd/coordinator, all: fix more things related to multi-zone buildlets
This fixes stuff in CL 210498 and CL 210237.

I renamed the Zone field to ControlZone both to make it more clear and
to force compilation errors wherever Zone was used previously, which
revealed some things that were missed.

Updates golang/go#35987

Change-Id: I2f890727ece86d093a90a3b47701caa58de6ccbc
Reviewed-on: https://go-review.googlesource.com/c/build/+/210541
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
2019-12-09 19:46:22 +00:00
Carlos Amedee 96d53bbd3f cmd/coordinator: support deploying vms in multiple GCE zones
GCE zones can and will run out of resources. This will randomly
select a zone from a list of zones to deploy each new VM to.

Fixes golang/go#35987

Change-Id: I57acad5c4e81d108f7db8f5bb1ff221a1845a422
Reviewed-on: https://go-review.googlesource.com/c/build/+/210237
Run-TryBot: Carlos Amedee <carlos@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-12-06 17:33:13 +00:00
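The random zone selection described in commit 96d53bbd3f above can be sketched as follows. This is an illustration under assumptions: `pickZone` is a hypothetical helper, and the zone names are placeholders rather than the coordinator's configured list.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickZone returns a random zone from zones, or fallback when the list is empty.
func pickZone(zones []string, fallback string) string {
	if len(zones) == 0 {
		return fallback
	}
	return zones[rand.Intn(len(zones))]
}

func main() {
	// Placeholder zone names; the real list lives in the build environment config.
	zones := []string{"us-central1-a", "us-central1-b", "us-central1-f"}
	fmt.Println("deploying VM to", pickZone(zones, "us-central1-f"))
}
```

Picking per VM spreads load across zones, so a single zone running out of resources affects only a fraction of new VMs.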
Brad Fitzpatrick 6191a53cff cmd/coordinator, cmd/gomote: show periodic status updates to stderr
Also modernizes some code in the coordinator.

Updates golang/go#35354 (or fixes. But we could return more info.)

Change-Id: Ifc1aa85ca217a0932e388ec5d36ef0737b90c63d
Reviewed-on: https://go-review.googlesource.com/c/build/+/207841
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-11-20 22:32:37 +00:00
Brad Fitzpatrick d4e4f4e852 buildlet: add context argument to most of the remaining Client methods
(And update all the callers in the tree.)

Updates golang/go#35707

Change-Id: I54769bbe374f31ae1dd07776b27818db91ce8c70
Reviewed-on: https://go-review.googlesource.com/c/build/+/208157
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-11-20 17:29:10 +00:00
Brad Fitzpatrick 5a36865fda buildlet: add context argument to Client.Exec
(And update all the callers in the tree.)

Updates golang/go#35707

Change-Id: I504ef73ea4ba7f8028f47a658c1cd519c7b5d986
Reviewed-on: https://go-review.googlesource.com/c/build/+/207908
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-11-20 06:29:03 +00:00
Brad Fitzpatrick 7c2b75383d cmd/gomote: add gomote rdp subcommand to open an RDP proxy to a Windows buildlet
Fixes golang/go#26090

Change-Id: Ib58fb7767384778b67d0d62d34c1132d70d75c23
Reviewed-on: https://go-review.googlesource.com/c/build/+/207378
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-11-15 18:35:36 +00:00
Brad Fitzpatrick 34578470c8 buildlet, cmd/coordinator: stop using legacy Request.Cancel for cancellation
Maybe this will solve the golang/go#28365 problems. But at least it
gets us into codepaths that are known & trusted, and removes use of
deprecated API.

Also, add more logging to help debug golang/go#28365.

Updates golang/go#28365

Change-Id: Ibff2b03fd82573cbeedbbc22d12c30ae1a3c3aa0
Reviewed-on: https://go-review.googlesource.com/c/build/+/203217
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-10-24 19:35:26 +00:00
Brad Fitzpatrick f24504d3b6 all: remove old, overloaded use of IN_KUBERNETES
Fixes golang/go#34956

Change-Id: Ie6f521eee5995102d6dc13f92ecca6601a68661b
Reviewed-on: https://go-review.googlesource.com/c/build/+/201739
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2019-10-17 19:02:58 +00:00
Dmitri Shuralyov 7b90c2e750 cmd/coordinator: find inner modules when testing repos
Previously, we were invoking a single 'go test' run at the repository
root with the import path pattern of 'golang.org/x/{repo}/...'. This
pattern does not match packages that are located in nested modules
in the repository.

Look for go.mod files in all subdirectories of the repository to find
all inner modules. Then, run 'go test' inside each module root, thus
testing all packages in all modules of the repository. If one of the
test invocations fails, keep testing others, and report all failures.

When looking for inner modules, consider only those that have a module
path that would not be ignored by the go tool and aren't vendored.
This way, go.mod files inside testdata directories aren't treated as
if they're proper modules.

This is being done only when the tests are running in module mode,
since module boundaries don't exist in GOPATH mode.

Fixes golang/go#32528

Change-Id: I9f8558982885c9955d3b34127c80c485d713b380
Reviewed-on: https://go-review.googlesource.com/c/build/+/194559
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2019-09-17 16:56:21 +00:00
Dmitri Shuralyov 6225d660dc cmd/release: create release archive after make.bash and before all.bash
Binary releases need to build Go and include binaries such as bin/go,
bin/gofmt, and others. Previously, this was accomplished by running
all.bash script for some GOOS/GOARCH pairs, and make.bash for others
where it wasn't viable to run tests as part of the release process.

This change makes the release process more consistent by always
packaging the release archive file after running make.bash. We still
run all.bash in situations where it was previously run, but we do so
after the release file has already been created. This avoids the
risk of any changes to GOROOT that may occur as part of all.bash
(including changing file permissions to be read-only) being included
in the final release file.

Add a step to check that files in the buildlet's $WORKDIR/go and
$WORKDIR/go/bin directories have expected permissions before
creating the release file.

Fixes golang/go#33537
Updates golang/go#30316

Change-Id: I7d40716dba656a8aca711377f2995df4880166c5
Reviewed-on: https://go-review.googlesource.com/c/build/+/189537
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2019-08-21 16:40:51 +00:00
Brad Fitzpatrick 6994fc2d58 dashboard, buildlet, cmd/debugnewvm: clean up HostConfig.ContainerVMImage
The debugnewvm command was printing out the wrong VM name, since the
old method had also assumed (but not documented) that it only applied
to nested virtualization containers. But remove that requirement and
make the empty string mean unspecified instead (which we currently
take to mean Container-Optimized OS), and then use its new
definition.

Change-Id: Ieca138285aa567b1c24d585c5aa180f8a1534154
Reviewed-on: https://go-review.googlesource.com/c/build/+/177919
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-05-21 16:42:08 +00:00
Brad Fitzpatrick 4847158763 buildlet: increase SSH connect timeout
5 seconds is too short on linux-arm, it seems (starting sshd,
conditionally creating a host key, etc.)

Change-Id: I2015908eb892d0b2dd580db8ba96c10726502cbb
Reviewed-on: https://go-review.googlesource.com/c/build/+/175998
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-05-08 16:27:26 +00:00
Brad Fitzpatrick 4f0f4bb614 revdial/v2: add new simpler, non-multiplexing revdial implementation
The old revdial has a simple multiplexing protocol that was like
HTTP/2 but without flow control, etc. But it was too simple (no flow
control) and too complex. Instead, just use one TCP connection per
reverse dialed connection. For now, the NAT'ed machine needs to go
re-connect for each incoming connection, but in practice that's just
once.

The old implementation is retained for now until all the buildlets are
updated.

Updates golang/go#31639

Change-Id: Id94c98d2949e695b677531b1221a827573543085
Reviewed-on: https://go-review.googlesource.com/c/build/+/174082
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-04-29 17:41:24 +00:00
Brad Fitzpatrick 69dd6b2c22 env/linux-x86-vmx: add new Debian host that's like Container-Optimized OS + vmx
This adds scripts to create a new builder host image that acts like
Container-Optimized OS (has docker, runs konlet on startup) but with a
Debian 9 kernel + userspace that permits KVM for nested
virtualization.

Updates golang/go#15581 (solaris)
Updates golang/go#23060 (dragonfly)
Updates golang/go#30262 (riscv)
Updates golang/go#30267 (fuchsia)
Updates golang/go#23824 (android)

Change-Id: Ib1d3a250556703856083c222be2a70c4e8d91884
Reviewed-on: https://go-review.googlesource.com/c/163301
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-02-21 22:30:49 +00:00
Brad Fitzpatrick 88cd9dd988 buildlet: change image name for COS-with-vmx buildlet
The COS image I'd forked from earlier didn't have CONFIG_KVM or
CONFIG_KVM_INTEL enabled in its kernel, so even though I'd enabled the
VMX license bit for the VM, the kernel was unable to use it.

Now I've instead rebuilt the ChromiumOS "lakitu" board with a modified
kernel config:

   https://cloud.google.com/container-optimized-os/docs/how-to/building-from-open-source

More docs later. Still tinkering. Nothing uses this yet.

Updates golang/go#15581 (solaris)
Updates golang/go#23060 (dragonfly)
Updates golang/go#30262 (riscv)
Updates golang/go#30267 (fuchsia)
Updates golang/go#23824 (android)

Change-Id: Id2839066e67d9ddda939d96c5f4287af3267a769
Reviewed-on: https://go-review.googlesource.com/c/163057
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-02-19 20:44:46 +00:00
Brad Fitzpatrick 0261b66eb0 dashboard, buildlet: add a disabled builder with nested virt, for testing
This adds a linux-amd64 COS builder that should be just like our
existing linux-amd64 COS builder except that it's using a forked image
that has the VMX license bit enabled for nested virtualization. (GCE
appears to be using the license mechanism as some sort of opt-in
mechanism for features that aren't yet GA; might go away?)

Once this is in, it won't do any new builds as regular+trybot builders
are disabled. But it means I can then use gomote + debugnewvm to work
on preparing the other four image types.

Updates golang/go#15581 (solaris)
Updates golang/go#23060 (dragonfly)
Updates golang/go#30262 (riscv)
Updates golang/go#30267 (fuchsia)
Updates golang/go#23824 (android)

Change-Id: Ic55f17eea17908dba7f58618d8cd162a2ed9b015
Reviewed-on: https://go-review.googlesource.com/c/162959
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2019-02-15 22:52:44 +00:00
Brad Fitzpatrick e9fe3dc893 buildlet, dashboard: add min CPU platform knob, require Skylake for OpenBSD
I thought this would be enough for OpenBSD to select the TSC on its own without
being forced to (as in CL 160319), but apparently it is not:

    https://github.com/golang/go/issues/29223#issuecomment-459035482

So it seems like we want both this CL and CL 160319.

Updates golang/go#29223

Change-Id: I0a092d62881d8dcce0ef1129d8d32d8f4025b6ac
Reviewed-on: https://go-review.googlesource.com/c/160457
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2019-01-30 21:55:30 +00:00
Brad Fitzpatrick 6ae8750e84 all: use Container-Optimized VMs instead of Kubernetes for buildlet containers
Fixes golang/go#25108

Change-Id: I084669b52b699700ed26a7fdd890d9205a8b9dc9
Reviewed-on: https://go-review.googlesource.com/111267
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2018-05-11 03:29:07 +00:00
Brad Fitzpatrick 23fc7515b0 buildlet: remove support for preemptible VMs we haven't used in a long time
Change-Id: I867c4b491327962410768291f8656665c2399e7b
Reviewed-on: https://go-review.googlesource.com/111641
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-05 17:19:19 +00:00
Brad Fitzpatrick 14d99737ae all: rename Kube to Container and IsGCE to IsVM
Once containers run on COS instead of Kubernetes, one name (Kube*) is
wrong and the other (GCE) is ambiguous. So rename them now to be more
specific.

No behavior changes. Just renaming in this step, to reduce size of
next CL.

Updates golang/go#25108

Change-Id: Ib09eb682ef74acbbf6ed50b46074f834ef5e0c0b
Reviewed-on: https://go-review.googlesource.com/111639
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-05 16:50:07 +00:00
Brad Fitzpatrick 9929a3fe2b all: move to using the oauth2/google.Credentials type, add buildenv accessor
This removes some duplication of scopes and how to get the
TokenSource and which credentials to use.

And update the coordinator deps, since its rev of
golang.org/x/oauth2/google was too old to have the new type.

I want to clean this up more, but I need to make some changes to
the oauth2/google package first. More later.

Change-Id: Ic2799ec2ec62f67c65de6380b373fe915a43003e
Reviewed-on: https://go-review.googlesource.com/111266
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2018-05-04 17:04:37 +00:00
Brad Fitzpatrick a6dd626c4c all: use "context" instead of golang.org/x/net/context
Change-Id: I106c81a16eb5b39ff38c6f896095a27f597b3f8d
Reviewed-on: https://go-review.googlesource.com/99016
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-03-06 23:26:52 +00:00
Brad Fitzpatrick 73f88a6d4c all: add README.md files where missing, and tool to keep them updated
Change-Id: I385171c415bf168c04c6c3a7a996bff88964af84
Reviewed-on: https://go-review.googlesource.com/52856
Reviewed-by: Andrew Bonventre <andybons@golang.org>
2017-08-02 22:17:52 +00:00
Brad Fitzpatrick 4eceee2d0f cmd/coordinator, cmd/buildlet, cmd/gomote: add SSH support
This adds an SSH server to farmer.golang.org on port 2222 that proxies
SSH connections to users' gomote-created buildlet instances.

For example:

    $ gomote create openbsd-amd64-60
    user-bradfitz-openbsd-amd64-60-1

    $ gomote ssh user-bradfitz-openbsd-amd64-60-1
    Warning: Permanently added '[localhost]:33351' (ECDSA) to the list of known hosts.
    OpenBSD 6.0 (GENERIC.MP) #2319: Tue Jul 26 13:00:43 MDT 2016

    Welcome to OpenBSD: The proactively secure Unix-like operating system.

    Please use the sendbug(1) utility to report bugs in the system.
    Before reporting a bug, please try to reproduce it with the latest
    version of the code.  With bug reports, please try to ensure that
    enough information to reproduce the problem is enclosed, and if a
    known fix for it exists, include that as well.

    $

As before, if the coordinator process is restarted (or crashes, is
evicted, etc), all gomote instances die.

Not yet supported:

* scp (help wanted)
* not all host types are configured. most are. some will need slight
  config tweaks to the Docker image (e.g. adding openssh-server)

Currently supported:

* linux-amd64 (host type shared by 386, nacl)
* linux-arm
* linux-arm64
* darwin
* freebsd
* openbsd
* plan9-386
* windows

Implementation details:

* the ssh server process listens on port 2222 in the coordinator
  (farmer.golang.org), which is behind a GKE TCP load balancer.

* the ssh server library is github.com/gliderlabs/ssh

* authentication is done via Github users' public keys. It's assumed
  that gomote user == github user. But there's a mapping in the code
  for known exceptions.

* we can't give out access to this too widely. too many things are
  accessible from within the host environment if you look in the right
  places. Details omitted. But the Go team and other trusted gomote
  users can use this.

* the buildlet binary has a new /connect-ssh handler that acts like a
  CONNECT request but instead of taking an explicit host:port, just
  says "give me your machine's SSH connection". The buildlet can also
  start sshd if needed for the environment. The /connect-ssh handler
  also installs the coordinator's public key.

* a new buildlet client library method "ConnectSSH" hits the /connect-ssh
  handler and returns a net.Conn.

* the coordinator's ssh.Handler is just running the OpenSSH ssh client.

* because the OpenSSH ssh child process can't connect to a net.Conn,
  an ephemeral localhost port is created on the coordinator to proxy
  between the ssh client and the net.Conn returned by ConnectSSH.

* The /connect-ssh handler requires http.Hijacker, which requires
  fully compliant net.Conn implementations as of Go 1.8. So I needed
  to flesh out revdial too, testing it with the
  golang.org/x/net/nettest package.

* plan9 doesn't have an ssh server, so we use 0intro's new conterm
  program (drawterm without GUI support) to connect to plan9 from the
  coordinator ssh proxy instead of using the OpenSSH ssh client
  binary.

* windows doesn't have an ssh server, so we enable the telnet service
  and the coordinator ssh proxy uses telnet instead on the backend
  on the private network. (There is a Windows ssh server but only in
  new versions.)

Happy debugging over ssh!

Fixes golang/go#19956

Change-Id: I80a62064c5f85af1f195f980c862ba29af4015f0
Reviewed-on: https://go-review.googlesource.com/50750
Reviewed-by: Herbie Ong <herbie@google.com>
Reviewed-by: Jessie Frazelle <me@jessfraz.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-07-28 18:21:11 +00:00
Brad Fitzpatrick 81725e0e0e cmd/debugnewvm: new debug tool to create buildlets on GCE for testing
Updates golang/go#21153

Change-Id: I1d80424a3272e7ee21eb176b020273d5903e444b
Reviewed-on: https://go-review.googlesource.com/51030
Reviewed-by: Sarah Adams <shadams@google.com>
2017-07-25 22:16:56 +00:00
Josh Bleecher Snyder d5f7ed3f5b buildlet: add config file to alter username
My gomote username is josharian.
My whoami is josh.
As a result, I need to run 'gomote -user=josharian'
every time, which is an annoyance.

Add a simple config file to allow a user to change their user name.
On startup, gomote checks whether user-$(whoami).user exists.
If so, it reads it to get the real username.
So I have a config file ~/.config/gomote/user-josh.user
with the contents 'josharian'.

It's a bit of an ugly hack, but it works and spares me some pain.

Change-Id: I372d5a786b99c9e3c6a57f25b0a38b9146f23598
Reviewed-on: https://go-review.googlesource.com/42230
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-05-01 18:04:27 +00:00
Brad Fitzpatrick acd9ecfc7a dashboard: add configs for new windows 2008, 2012, 2016 builders
See also: https://golang.org/cl/41142

Updates golang/go#17513

Change-Id: Ie743ae4604e65892e28423ed0af008450b647197
Reviewed-on: https://go-review.googlesource.com/41393
Reviewed-by: Jeff Johnson <jrjohnson@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
2017-04-21 23:29:29 +00:00
Kevin Burke f2cd214fae all: fix vet errors
Replace oauth2.NoContext (deprecated) with context.Background(),
which has been available for two consecutive releases.

Add more cloud.google.com/go packages to the cmd/coordinator
Dockerfile to fix an error building cmd/coordinator. A dependency is
not present in master of cloud.google.com/go, but was present in the
older revision, and was not getting checked out correctly during the
"go get" step. In addition, we were failing to fetch dependencies for
some packages that coordinator depends on. I added instructions for
hopefully doing this more systematically in the future.

Fix the gitmirror Dockerfile which has the same problem.

Change-Id: Id6c2220482350a686b87742ec7915c457a689e52
Reviewed-on: https://go-review.googlesource.com/40852
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-04-15 23:42:12 +00:00
Brad Fitzpatrick 9a094f9b85 buildlet: fix doc typo
Change-Id: I2e43aa33634eb63b7cbdcc6ba6296fe8e857ad81
Reviewed-on: https://go-review.googlesource.com/36912
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-13 18:03:35 +00:00
Brad Fitzpatrick 1e7c6919f1 cmd/buildlet, cmd/release: disable pargzip for release builds
Fixes golang/go#19052

Change-Id: Icf2923ea72290bd99b78e1c2fd9675668157b3ac
Reviewed-on: https://go-review.googlesource.com/36910
Reviewed-by: Quentin Smith <quentin@golang.org>
2017-02-13 17:51:36 +00:00
Kevin Burke 392d3a9c58 buildlet: add context to GetTar
This lets us set a timeout on the HTTP request, which lets us time out the
writeSnapshot call in cmd/coordinator.

Fixes golang/go#18812.

Change-Id: I370448df4d95130c9c5b30ba32459ce844a6c967
Reviewed-on: https://go-review.googlesource.com/36897
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-02-12 17:37:23 +00:00
shawnps 72ebb1e3e7 all: fix typos
Change-Id: I16c20d6eb746a3ec81021c2a367d74c258437019
Reviewed-on: https://go-review.googlesource.com/34921
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-01-07 18:01:14 +00:00
Brad Fitzpatrick 71265acedb all: adjust things for upgrade from GKE 1.2 to GKE 1.4
We hit GKE bugs and changes when upgrading from GKE 1.2 to 1.4.

The main issue is that Kubernetes doesn't reserve CPU or memory for
itself on nodes, so things were OOMing and getting killed. And when
Docker or Kubernetes got killed themselves, they were wedging and not
recovering.

So we're going to run a daemonset (a pod on all nodes) to reserve space
for Kubernetes. That's not in this CL.

But this CL got us limping along and was already in production. It
doubles resource RAM usage for jobs, so fewer things schedule per node.
While we're at it, let jobs use more CPU if it's available.

Also, disable auto-scaling. It was off before by hand. Force it off
programmatically too. And make the node count 5, like it was by hand.

Also, force un-graceful pod deletes, since GKE 1.3 or something
introduced a graceful-vs-ungraceful distinction, which we weren't
handling previously and therefore pods never were being deleted.

Change-Id: I3606e4e2e92c496d8194503d510921bd1614d34e
Reviewed-on: https://go-review.googlesource.com/33490
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-12-01 23:48:21 +00:00
Brad Fitzpatrick e4d08a5aad cmd/gomote, buildenv, buildlet: move config code to common places
Split off from Quentin's https://golang.org/cl/29399

Change-Id: I4578f8f485e97d6b9844fb12e84779167755752e
Reviewed-on: https://go-review.googlesource.com/29858
Reviewed-by: Quentin Smith <quentin@golang.org>
2016-09-27 18:47:08 +00:00
Brad Fitzpatrick 328c9b8918 all: split builder config into builder & host configs
Our builders are named of the form "GOOS-GOARCH" or
"GOOS-GOARCH-suffix".

Over time we've grown many builders. This CL doesn't change
that. Builders continue to be named and operate as before.

Previously the build configuration file (dashboard/builders.go) made
each builder type ("linux-amd64-race", etc) define how to create a
host running a buildlet of that type, even though many builders had
identical host configs. For example, these builders all share the same
host type (a Kubernetes container):

   linux-amd64
   linux-amd64-race
   linux-386
   linux-386-387

And these are the same host type (a GCE VM):

   windows-amd64-gce
   windows-amd64-race
   windows-386-gce

This CL creates a new concept of a "hostType" which defines how
the buildlet is created (Kube, GCE, Reverse, and how), and then each
builder itself references a host type.

Users never see the hostType (except perhaps in gomote list output),
and they never need to care about it.

Reverse buildlets now can only be one hostType at a time, which
simplifies things. We were no longer using multiple roles per machine
once moving to VMs for OS X.

gomote continues to operate as it did previously but its underlying
protocol changed and clients will need to be updated. As a new
feature, gomote now has a new flag to let you reuse a buildlet host
connection for different builder rules if they share the same
underlying host type. But users can ignore that.

This CL is a long-standing TODO (previously attempted and aborted) and
will make many things easier and faster, including the linux-arm
cross-compilation effort, and keeping pre-warmed buildlets of VM types
ready to go.

Updates golang/go#17104

Change-Id: Iad8387f48680424a8441e878a2f4762bf79ea4d2
Reviewed-on: https://go-review.googlesource.com/29551
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2016-09-22 23:13:34 +00:00
Brad Fitzpatrick 44e74d53da buildlet: don't leak file descriptors to buildlets
Also, add count of fds and goroutines to the coordinator's status
page.

Change-Id: I857e609623cfa280716d5d079180d0e4021d0bac
Reviewed-on: https://go-review.googlesource.com/27550
Reviewed-by: Quentin Smith <quentin@golang.org>
2016-08-23 03:38:21 +00:00
Brad Fitzpatrick 940da928a2 buildlet: fix gomote crash
ctx was initialized in the wrong place.

Change-Id: I7fb3c56071a3e4d5cc2199e07fcc4b8ecbc7a674
Reviewed-on: https://go-review.googlesource.com/22967
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-10 03:51:36 +00:00
Brad Fitzpatrick 79f01e0ca7 buildlet: don't hang in HTTP requests if heartbeats fail
Fixes golang/go#15582

Change-Id: Iab31cb4770728d2cc405f930c973d87bae7fbf10
Reviewed-on: https://go-review.googlesource.com/22890
Reviewed-by: Andrew Gerrand <adg@golang.org>
2016-05-09 21:13:28 +00:00
Brad Fitzpatrick 4a3f4a8c16 kubernetes: avoid the WatchPod streaming API
Poll every 5 seconds instead.

Change-Id: I0a8a30fee1118b2aecf4b3cbbf342e07afc27602
Reviewed-on: https://go-review.googlesource.com/22820
Reviewed-by: Evan Brown <evanbrown@google.com>
2016-05-05 20:21:37 +00:00
Evan Brown 34ff1d9bc8 all: kubernetes builder autoscaling
Improvements to support rapid scheduling of many build jobs:

- Retry logic in Kubernetes client to handle sporadic connection
  closes from their API server under heavy load

- Cluster autoscaler scales on default CPU utilization metric

- Debug mode allows scheduling multiple builds to test scaling

- Account for scheduled vs. provisioned resources in a cluster
  and use that information to estimate when a build's pod
  will be scheduled and in running state

- Use estimated scheduled time to set context timeout

- Track pod lifecycle (requested time, estimated available time,
  actual available time, terminate time, etc)

Change-Id: I14d6c5e01af0970dbb3390a29d1ee5c43049fff8
Reviewed-on: https://go-review.googlesource.com/19524
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-04-07 17:22:57 +00:00
Evan Brown a731151878 all: consolidate configuration for coordinator and gomote
buildenv.Environment type defines configuration options:

- Coordinator uses the GCE project name to lookup config. A custom
  config name can be provided at runtime to override.

- The conventional prod and stage project names ('symbolic-datum-552'
  and 'go-dashboard-dev') map to prod and staging configuration structs.

- Production and staging status is explicitly defined in configuration.

- GCS bucket names for buildlet, logs, and snapshots are
  configurable.

Change-Id: I7e6d7874eb0bdfe35dbdd5fcf6212ab50d576b88
Reviewed-on: https://go-review.googlesource.com/19502
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-03-04 22:24:11 +00:00
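The project-name lookup described in commit a731151878 above could be modeled as a map keyed by GCE project. The `environment` struct, its fields, and `lookupEnv` are hypothetical simplifications of the buildenv.Environment type; only the two project names and their prod/staging roles come from the commit message.

```go
package main

import "fmt"

// environment is a hypothetical, trimmed-down configuration record.
type environment struct {
	ProjectName string
	IsProd      bool
}

var environments = map[string]*environment{
	"symbolic-datum-552": {ProjectName: "symbolic-datum-552", IsProd: true},
	"go-dashboard-dev":   {ProjectName: "go-dashboard-dev", IsProd: false},
}

// lookupEnv selects a configuration by GCE project name unless an explicit
// override name is given.
func lookupEnv(gceProject, override string) (*environment, error) {
	name := gceProject
	if override != "" {
		name = override
	}
	if e, ok := environments[name]; ok {
		return e, nil
	}
	return nil, fmt.Errorf("no configuration for %q", name)
}

func main() {
	e, err := lookupEnv("symbolic-datum-552", "")
	fmt.Println(e.IsProd, err) // true <nil>

	e, err = lookupEnv("symbolic-datum-552", "go-dashboard-dev")
	fmt.Println(e.IsProd, err) // false <nil>
}
```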
Evan Brown c8b96fa9fa all: report cluster utilization to Cloud Monitoring
Change-Id: Iec8e03fdef27582756d96b731c2fb12b56175ca9
Reviewed-on: https://go-review.googlesource.com/18396
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-01-26 17:10:41 +00:00
Evan Brown 2e452e1be8 all: improve Kubernetes API client
* pod creation returns an *api.Pod
* WatchPod replaces WatchPodStatus

Updates golang/go#12546

Change-Id: I34bb6e0d994e552b41a8082cc4672a663ce961a3
Reviewed-on: https://go-review.googlesource.com/17100
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-11-20 17:56:59 +00:00
Evan Brown 83f9748046 all: display pool status and delete failed/old pods
* status page shows kube pool details
* pods created by the coordinator are tracked
* pods that fail to create are deleted
* pods older than delete-at are deleted
* pods created by a different coordinator are deleted

Updates golang/go#12546

Change-Id: I4c4f8ff906962b4a014a66d0a9d490ff17710d62
Reviewed-on: https://go-review.googlesource.com/16101
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-11-05 11:33:24 +00:00
Matthew Dempsky d5f422f474 all: fix vet warnings
Change-Id: Ia4519d9b050136b4d46dc8361f546e3c4205a9cd
Reviewed-on: https://go-review.googlesource.com/16288
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-10-25 04:14:39 +00:00
Evan Brown 956434c23a all: improved monitoring of buildlet pods after creation
* Replaced cancel with context.Context
* StartPod can be canceled
* Wait for buildlet to come online, but fail fast if pod fails first
* Support timeout waiting for pod to leave pending phase
* Use Kubernetes watch API (long poll)

Updates golang/go#12546

Change-Id: I792a3b8fed615362a0290feee7de0c2cefe43c0e
Reviewed-on: https://go-review.googlesource.com/15285
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-10-15 23:45:50 +00:00
Evan Brown 4dfb85f32a all: basic support for linux-amd64 buildlets on Kubernetes
* set correct metadata as env vars on each container in pod

* stage0 detects when running in a pod

* add scope to support farmer on GCE communicating with Google
  Container Engine API

* dev mode supports GCE and Kubernetes even when not running
  on GCE

* pod buildlet returns a configured http.Client

* pod buildlets work, but pods are not removed after completion
  (or on creation failure)

Updates golang/go#12546

Change-Id: If91673b49223130c1e7077c130f1abe1e7966d02
Reviewed-on: https://go-review.googlesource.com/15041
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-10-01 17:25:38 +00:00
Evan Brown 882e20c4c9 cmd/coordinator: discover kube cluster
Find the default Kubernetes cluster and configure a client to talk to it.
Use application default credentials.

Updates golang/go#12546

Change-Id: Ifb1ce57f52f4fbbee3267f8cc3cf02a78146bd5b
Reviewed-on: https://go-review.googlesource.com/14532
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-09-21 18:28:38 +00:00