gopls/internal/test/marker: seed the cache before running tests

The marker tests are heavily parallelized, and many import common
standard library packages. As a result, depending on concurrency, they
perform a LOT of duplicate type checking and analysis.

Seeding the cache before running the tests resulted in an ~80% decrease
in CPU time on my workstation, from ~250s to ~50s. That is close to the
~40s of CPU time observed on a second invocation, whose cache is already
seeded by the previous run. I also observed a ~33% decrease in run time.
Admittedly my workstation has 48 cores, and so I'd expect less of an
improvement on smaller machines.

Change-Id: Ied15062aa8d847a887cc8293c37cb3399e7a82b6
Reviewed-on: https://go-review.googlesource.com/c/tools/+/588940
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Rob Findley 2024-05-30 14:08:30 +00:00 committed by Robert Findley
Parent 01018ba9ed
Commit 8d54ca127f
1 changed file: 64 additions and 0 deletions


@@ -26,6 +26,7 @@ import (
	"sort"
	"strings"
	"testing"
	"time"
	"github.com/google/go-cmp/cmp"
@@ -107,6 +108,10 @@ func Test(t *testing.T) {
	// Opt: use a shared cache.
	cache := cache.New(nil)
	// Opt: seed the cache and file cache by type-checking and analyzing common
	// standard library packages.
	seedCache(t, cache)
	for _, test := range tests {
		test := test
		t.Run(test.name, func(t *testing.T) {
@@ -264,6 +269,65 @@ func Test(t *testing.T) {
	}
}
// seedCache populates the file cache by type checking and analyzing standard
// library packages that are reachable from tests.
//
// Most tests are themselves small codebases, and yet may reference large
// amounts of standard library code. Since tests are heavily parallelized, they
// naively end up type checking and analyzing many of the same standard library
// packages. By seeding the cache, we ensure cache hits for these standard
// library packages, significantly reducing the amount of work done by each
// test.
//
// The following command was used to determine the set of packages to import
// below:
//
//	rm -rf ~/.cache/gopls && \
//	go test -count=1 ./internal/test/marker -cpuprofile=prof -v
//
// Look through the individual test timings to see which tests are slow, then
// look through the imports of slow tests to see which standard library
// packages are imported. Choose high level packages such as go/types that
// import others such as fmt or go/ast. After doing so, re-run the command and
// verify that the total samples in the collected profile decreased.
func seedCache(t *testing.T, cache *cache.Cache) {
	start := time.Now()
	// See the doc string for details on how this seed was produced.
	seed := `package p
import (
	"net/http"
	"sort"
	"go/types"
	"testing"
)
var (
	_ = http.Serve
	_ = sort.Slice
	_ types.Type
	_ testing.T
)
`
	// Create a test environment for the seed file.
	env := newEnv(t, cache, map[string][]byte{"p.go": []byte(seed)}, nil, nil, fake.EditorConfig{})
	// See other TODO: this cleanup logic is too messy.
	defer env.Editor.Shutdown(context.Background()) // ignore error
	defer env.Sandbox.Close()                       // ignore error
	env.Awaiter.Await(context.Background(), integration.InitialWorkspaceLoad)
	// Opening the file is necessary to trigger analysis.
	env.OpenFile("p.go")
	// As a checksum, verify that the file has no errors after analysis.
	// This isn't strictly necessary, but helps avoid incorrect seeding due to
	// typos.
	env.AfterChange(integration.NoDiagnostics())
	t.Logf("warming the cache took %s", time.Since(start))
}
// A marker holds state for the execution of a single @marker
// annotation in the source.
type marker struct {