Speed up the loading of the worker status page.
Change-Id: I1503b5448fa6ef686a91fdc8fbd0490bf354177c
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/721278
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Julie Qiu <julieqiu@google.com>
Inserting playground links is now done behind a feature flag. There were
issues in dev when batch processing our entire dataset due to the load
we were putting on the playground server.
Updates b/154333737
Change-Id: I4119d9e48943bd693a6e46f74ea5a7cf8f0ad9a9
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720896
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Fix staticcheck error:
should not use built-in type string as key for value; define
your own type to avoid collisions (SA1029)
Change-Id: I3f52e988113b51cef990fe0d4c86366010689b8f
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720886
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Introduce internal/xcontext, which provides a way to "detach"
a context from its parent's timeout and cancellation signals.
(Copied from golang.org/x/tools.)
Use it when the worker does a fetch, to prevent the fetch
from being canceled while retaining the parent context's values.
Change-Id: I91fd7a5790b5654983ee72d8054fb74c45f9b417
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720905
Reviewed-by: Julie Qiu <julieqiu@google.com>
If getting the source info fails, log at Info rather than Error
to avoid cluttering the logs with non-serious errors.
Fixes b/154274456.
Change-Id: I5b5128a9bc242c2e94423c95c5e3c5d9786c78b2
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720901
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Julie Qiu <julieqiu@google.com>
Have the task IDs change every hour, instead of every 3 hours.
We want to be able to retry a task more frequently.
Fixes #154277084.
Change-Id: Ib687775705e019ac62ea75efd29eff540e6fa2d8
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720902
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Julie Qiu <julieqiu@google.com>
Previously experiments were always inactive for the worker, because:
(1) X-Forwarded-For was empty since the request was coming from
Cloud Tasks. This is fixed by setting all experiments with rollout=100
to always be on. (There shouldn't be any case where worker flags
would be only partially on.)
(2) Experiments were not being set on the new context produced by
trace.StartSpan and when a contextWithCancel was created and passed to
fetch.FetchModule. These are now set with the new
experiment.WithExperiment function.
Change-Id: Ie14b699bf435fecc8791c3ad435f73afe579812f
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720661
Reviewed-by: Jonathan Amsterdam <jba@google.com>
An error occurred when inserting into readmes:
invalid byte sequence for encoding "UTF8"
This is now fixed.
Change-Id: I51f7d0c22f833b052daabc405495518c4c5e47a4
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720665
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
There are 500 errors when inserting into package_imports due to
deadlines. We had this problem in the past with imports, and sorting by
path previously solved the issue.
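A minimal sketch of sorting rows before a bulk insert, so that concurrent transactions touch rows in the same order. The packageImport type and sortImports helper are hypothetical names, not the code in this change:

```go
package main

import (
	"fmt"
	"sort"
)

// packageImport is a hypothetical row for the package_imports table.
type packageImport struct {
	FromPath, ToPath string
}

// sortImports orders rows by importing path, then by imported path, giving
// every insert the same deterministic order.
func sortImports(imports []packageImport) {
	sort.Slice(imports, func(i, j int) bool {
		if imports[i].FromPath != imports[j].FromPath {
			return imports[i].FromPath < imports[j].FromPath
		}
		return imports[i].ToPath < imports[j].ToPath
	})
}

func main() {
	rows := []packageImport{
		{"example.com/b", "fmt"},
		{"example.com/a", "strings"},
		{"example.com/a", "fmt"},
	}
	sortImports(rows)
	fmt.Println(rows)
}
```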
Change-Id: Ia49408d4aa448852434b4dfa2fa6b1261fbfdaee
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/720664
Reviewed-by: Jonathan Amsterdam <jba@google.com>
unparam was not being run previously. It now runs as part of
all.bash.
Change-Id: I71ee838b817ce1eb32c0fb60b34c5ef4c7cd4d25
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/718554
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Make log messages clearer by logging the function that finished when
logging successes.
Change-Id: I5f60683628c6a37c878d3c60208fa7f2d29156c9
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/719982
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
We're still seeing issues with UpsertVersionMap, even though the queue
has been dialed down to insert only one module at a time.
Change UpsertVersionMap to be two queries:
(1) select moduleID: should be a relatively trivial query that just
fetches the module_id for a given path and version from the modules
table
(2) upsert version_map: same as query before the data model changes,
which upserts a row in the version_map table, but now it also upserts
a module_id
This will allow us to get more information on where in UpsertVersionMap
the query is failing.
Note that both the current query on master and the new one succeed when
running the worker locally and connecting to the dev database.
Change-Id: I3c1d0a621294fa0e38bdd2165a35460dbacda4e6
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/719980
Reviewed-by: Jonathan Amsterdam <jba@google.com>
The experiment middleware is added to the worker, so that we can run
experiments from the worker.
Change-Id: Iaeb01fbf5480efb2c1f89a5eeabea81a3198c655
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/718553
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Upgrade github.com/google/go-cmp to v0.4.0 to avoid a panic
in the race detector.
Change-Id: I9b96d536f92c0a765e5f8612b59bed61cde2b74e
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/718642
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Julie Qiu <julieqiu@google.com>
This can be done with
gcloud builds submit
or by hooking up a CI system.
Change-Id: I07b1967ffd239aab1ed8a7c7993739f3017be490
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/718641
CI-Result: Cloud Build <devtools-proctor-result-processor@system.gserviceaccount.com>
Reviewed-by: Julie Qiu <julieqiu@google.com>
The /populate-search-documents endpoint was no longer used, and it was
broken: it would have added to search_documents everything from the
packages table that wasn't already there, but we now omit alternative
modules and their packages from search_documents. This would have
put them back.
However, now that we are exploring changes to our search algorithm, we
do need a way to reprocess all search documents. So add a new
endpoint, /repopulate-search-documents, that upserts packages in
search_documents that haven't been updated since a given time.
Change-Id: Icbae9de078774111f3adb61a35dee95c4711dffa
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/717851
Reviewed-by: Julie Qiu <julieqiu@google.com>
Simplify the error logic of most handlers by allowing
them to return an error.
Basically the same thing we did to the frontend in
75ed0713f34c997516b5c1c3e3f0e1ddb6cd8bfa.
Leave a couple of handlers as they are because they do something unusual.
Change-Id: Ib7c5f4f4752945a32c84fc3b49987e3712997521
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/717849
Reviewed-by: Julie Qiu <julieqiu@google.com>
licenses.module_id and version_map.module_id are now populated.
saveModule is refactored so that inserting the module is done in
multiple functions, instead of one large function, for readability. The
functionality has not changed and the pieces of a module are still being
inserted in a single transaction.
Change-Id: I3603a388be5dbb90ce8f05ae9c237989f965e6e4
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/717237
Reviewed-by: Jonathan Amsterdam <jba@google.com>
For consistency with our use of "module" instead of "version" in the data
model, fetchAndInsertVersion is renamed.
Change-Id: I515af4a62701601708798acec1e0283cfcb823bf
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/717846
Reviewed-by: Jonathan Amsterdam <jba@google.com>
FetchVersion will now return all of the directories in a module.
Module.Directories represents all of the directories in a module. A
directory is redefined as the path and all of the entities that exist at
that path, including README, package documentation, package imports, and
licenses.
Change-Id: I3ebb58800102a2d705fe7bcadaa3ddb476e4d9f6
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/705335
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Change the way that the Postgres ts_vector (list of search tokens) is computed.
- Use the path_tokens text configuration when creating the tsvector for the path.
- Construct sections B and C of the search document by combining the synopsis
  with part of the README. Parts of this processing are:
  - Extract only the text of a markdown README, to remove images and
    other extraneous information.
  - Add alternatives to certain words in the synopsis and README. For
    example, add "postgresql" whenever we see "postgres".
- Modify the ts_rank call in the code to use a B weight of 1.
- Change the call to the database search function so that it invokes
  the function that has a B weight of 1.
These changes will require re-computing the
search_documents.tsv_search_tokens column. That should be done after
these are deployed.
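The "add alternatives" step can be sketched as follows. The alternatives map holds only the one pair named above, and addAlternatives is a hypothetical helper. It splits on whitespace and does not strip punctuation:

```go
package main

import (
	"fmt"
	"strings"
)

// alternatives maps a word to an extra term to index alongside it. Only
// the pair mentioned in this message is listed.
var alternatives = map[string]string{
	"postgres": "postgresql",
}

// addAlternatives is a hypothetical helper. It returns text with the
// alternative term appended after each matching word.
func addAlternatives(text string) string {
	var out []string
	for _, w := range strings.Fields(text) {
		out = append(out, w)
		if alt, ok := alternatives[strings.ToLower(w)]; ok {
			out = append(out, alt)
		}
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(addAlternatives("a Postgres driver"))
}
```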
Change-Id: Ib81601326f11efd81c8bc733694a000eccecf12b
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/705958
Reviewed-by: Julie Qiu <julieqiu@google.com>
The database.DB type can now represent a DB connection in the
middle of a transaction. Such a DB is created only by calling
DB.Transact.
The resulting API is much simpler, since all the ...Tx methods
disappear.
Change-Id: I41afada87738e1eacdec2fcf115902edddeff867
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/716719
Reviewed-by: Julie Qiu <julieqiu@google.com>
During doc fetch, share each example to the Go playground,
then add a link to that shared example alongside the code.
Fixes golang/go#36865
Change-Id: Iaff51f99dd0d6d4fb71463304ee7cb747f037cd7
Add a text configuration that ignores parts of hyphenated words, to
reduce the inflated search ranking of paths with hyphenated elements.
We'll use this to create part of a search document that is taken from
the package path.
Currently we split a path into tokens, generating all sub-paths as
well as splitting at hyphens. For example,
    github.com/CrunchyData/postgres-operator/apiserver
results in the tokens
    postgres
    postgres-operator
    postgres-operator/apiserver
among others. The default text configuration splits at hyphens too, so
the three tokens above result in three copies of "postgres", when we'd
expect only one. That skews the search results to favor this package
more than it should.
If we use a text configuration that ignores hyphenated word parts, the
above tokens will result in only one instance of "postgres".
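The counting argument above can be checked with a simplified model. This is not Postgres's actual parser: countLexeme is a hypothetical helper that splits tokens at slashes, and optionally at hyphens, then counts exact matches:

```go
package main

import (
	"fmt"
	"strings"
)

// countLexeme counts how many times word appears among the pieces of
// tokens. Each token is split at "/", and also at "-" when splitHyphens
// is true. This only approximates a text search configuration.
func countLexeme(tokens []string, word string, splitHyphens bool) int {
	n := 0
	for _, tok := range tokens {
		for _, elem := range strings.Split(tok, "/") {
			parts := []string{elem}
			if splitHyphens {
				parts = strings.Split(elem, "-")
			}
			for _, p := range parts {
				if p == word {
					n++
				}
			}
		}
	}
	return n
}

func main() {
	tokens := []string{
		"postgres",
		"postgres-operator",
		"postgres-operator/apiserver",
	}
	fmt.Println(countLexeme(tokens, "postgres", true))  // hyphen parts counted
	fmt.Println(countLexeme(tokens, "postgres", false)) // hyphen parts ignored
}
```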
Change-Id: Ide76d80d32c079b6e3a07c7434ddddd61b39aadf
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/712881
Reviewed-by: Julie Qiu <julieqiu@google.com>
The golang text configuration was intended to prevent stemming for
terms like "postgres" and "NATS". However, it's too great a loss to
drop stemming for other, ordinary English words (e.g. "logging" to
"log"). And though "postgres" is stemmed to the nonsensical "postgr",
that happens for both the document and the search query, so search
quality isn't affected.
Change-Id: I730590e27499fca9f70a3ce65a6e6ea364f4adda
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/712880
Reviewed-by: Julie Qiu <julieqiu@google.com>
Tests for the fetch package are rewritten to make test cases clearer and
easier to add/edit.
Previously, there were multiple tests testing similar things in
fetch_test.go: TestExtractPackagesFromZip, TestFetchVersion, and
TestFetchVersion_Alternative.
The different cases from these tests are now moved to fetchdata_test.go,
and tested at the FetchVersion level.
Change-Id: Iceee81b4b889b350f0c10c53c2c5ca750a8db891
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/711173
Reviewed-by: Jonathan Amsterdam <jba@google.com>
An issue is fixed with the CSP that prevented images from
rendering properly.
The CSP frame-src is also removed since it is no longer being used.
Change-Id: I22aad093d218a3def7880289047b03471696a59d
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/711758
Reviewed-by: Jonathan Amsterdam <jba@google.com>