This was left in by accident, and does not apply to this action. For
alignment purposes, move the action button to the leftmost form input.
Change-Id: I792ef15da4537d6c64a1a10988a5a4c597f0e9bb
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/467200
Reviewed-by: Julie Qiu <julieqiu@google.com>
This CL modifies postgres.InsertVersion to overwrite version data, rather
than defaulting to ON CONFLICT DO NOTHING. This is achieved by first
deleting any existing version within the transaction, which will cascade
to all version data.
Testing this was a bit difficult due to the way the fake proxy is
implemented. To make this and future testing easier, the fake proxy was
modified to dynamically generate its
endpoints based on version information, allowing for the serving of
purely in-memory versions.
Since some common patterns are emerging in our tests, add an
internal/testhelper package.
Fixes b/132710180
Change-Id: I91137335d58c133d9de06ffe88e66b1f7846aa44
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/466417
Reviewed-by: Julie Qiu <julieqiu@google.com>
This makes license sorting deterministic and explicit: sort by proximity
to the package (or rather, by descending directory depth). For licenses
in the same directory, sort by FileType and then Type.
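
A minimal sketch of the ordering described above. The License struct and
its field values are simplified stand-ins, not the project's actual type,
and the final tie-break on FilePath is an added assumption so that the
order is total.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"strings"
)

// License is a hypothetical, simplified stand-in for the real license type.
type License struct {
	FilePath string // slash-separated path inside the module, e.g. "a/b/LICENSE"
	FileType string // assumed label for the kind of license file
	Type     string // detected license type, e.g. "MIT"
}

// depth reports how many directories sit above the file.
// A file in the module root has depth 0.
func depth(filePath string) int {
	dir := path.Dir(filePath)
	if dir == "." {
		return 0
	}
	return strings.Count(dir, "/") + 1
}

// sortLicenses orders licenses by descending directory depth, then by
// FileType, then by Type. FilePath is a last-resort tie-break (assumption).
func sortLicenses(ls []License) {
	sort.Slice(ls, func(i, j int) bool {
		if di, dj := depth(ls[i].FilePath), depth(ls[j].FilePath); di != dj {
			return di > dj
		}
		if ls[i].FileType != ls[j].FileType {
			return ls[i].FileType < ls[j].FileType
		}
		if ls[i].Type != ls[j].Type {
			return ls[i].Type < ls[j].Type
		}
		return ls[i].FilePath < ls[j].FilePath
	})
}

func main() {
	ls := []License{
		{FilePath: "LICENSE", FileType: "LICENSE", Type: "MIT"},
		{FilePath: "a/b/LICENSE", FileType: "LICENSE", Type: "MIT"},
		{FilePath: "a/COPYING", FileType: "COPYING", Type: "Apache-2.0"},
	}
	sortLicenses(ls)
	for _, l := range ls {
		fmt.Println(l.FilePath, l.Type)
	}
}
```

Comparing depths stands in for "same directory" here on the assumption
that all licenses being sorted lie on the path from the module root to
one package, so two licenses at equal depth share a directory.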
Fixes b/130744679
Change-Id: Ie245fcfa820f7dad6c1a8bd43c0e43350d38caa4
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/467261
Reviewed-by: Julie Qiu <julieqiu@google.com>
This CL adds ON DELETE CASCADE to foreign keys
referencing the versions table, so that deleting a version also deletes
all corresponding data.
Updates b/132710180
Updates b/131789052
Change-Id: I793d2b8b1769cda8b667d3d32c28728e79ac4b4a
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/466436
Reviewed-by: Andrew Bonventre <andybons@google.com>
Package name is removed from the search results, to reflect the updated
mocks.
The following error is fixed in content/static/search.tmpl:
search.tmpl:28:66: executing "main_content" at <$e.Type>: can't evaluate field Type in type string
Change-Id: If8bbf800fe2c2c4d73bece96fc28b88596762fba
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/467625
Reviewed-by: Robert Findley <rfindley@google.com>
plainto_tsquery is now used to rank relevance in db.Search, instead of
to_tsquery with terms separated by OR operators.
plainto_tsquery transforms the unformatted text querytext to a tsquery
value. The text is parsed and normalized much as for to_tsvector, then
the & (AND) tsquery operator is inserted between surviving words. More
info: https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES
plainto_tsquery improves multi-word searches and sanitizes user
input.
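
To illustrate the difference, here is a rough sketch. orQuery approximates
the previous approach of joining terms with the | operator for to_tsquery;
searchSQL shows the raw text being bound as a parameter to plainto_tsquery.
The SQL text and column choices are assumptions for illustration, not the
query used by db.Search.

```go
package main

import (
	"fmt"
	"strings"
)

// orQuery sketches the old approach: split the user's text on whitespace
// and join the terms with the tsquery OR operator for use with to_tsquery.
// Any tsquery operator characters typed by the user pass straight through,
// which can yield a query string that to_tsquery rejects.
func orQuery(userInput string) string {
	return strings.Join(strings.Fields(userInput), " | ")
}

// searchSQL is a hypothetical query in the new style. The user's text is
// passed unmodified as $1, and plainto_tsquery parses and normalizes it.
const searchSQL = `
SELECT package_path
FROM mvw_search_documents
WHERE tsv_search_tokens @@ plainto_tsquery($1)
ORDER BY ts_rank(tsv_search_tokens, plainto_tsquery($1)) DESC`

func main() {
	fmt.Println(orQuery("http client"))   // well-formed OR query
	fmt.Println(orQuery("http & client")) // stray operator leaks into the query
	fmt.Println(searchSQL)
}
```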
Fixes b/131860942
Fixes b/131908265
Updates b/131861451
Change-Id: I933cb3cef199055de19526b7ca6790752b3df4bd
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/461556
Reviewed-by: Robert Findley <rfindley@google.com>
This updates go.mod and go.sum now that the dependency on
golang.org/x/tools/go/packages was removed in cl/464096.
Change-Id: Ia1d475c35b1deac397172bf7f6ccaa1d7260614a
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/465675
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
This adds additional directives to our content security policy to avoid
known attack vectors. Notably JavaScript is disallowed entirely for now,
since we don't use it.
Additionally, some other security-related headers are set to prevent
clickjacking and MIME sniffing. Accordingly, middleware is renamed to
'SecureHeaders'.
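
A minimal sketch of what such a middleware might look like. The header
values below are illustrative assumptions and not the project's actual
policy; responseHeader is a helper invented for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// SecureHeaders wraps h and sets security-related response headers.
// The exact directives here are assumptions chosen for illustration.
func SecureHeaders(h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Disallow JavaScript entirely and forbid framing by other pages.
		w.Header().Set("Content-Security-Policy",
			"default-src 'self'; script-src 'none'; frame-ancestors 'none'")
		// Older clickjacking defense, still honored by browsers.
		w.Header().Set("X-Frame-Options", "DENY")
		// Prevent MIME sniffing of response bodies.
		w.Header().Set("X-Content-Type-Options", "nosniff")
		h.ServeHTTP(w, r)
	})
}

// responseHeader sends a GET through the middleware wrapped around a
// trivial handler and returns the named response header.
func responseHeader(name string) string {
	h := SecureHeaders(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Header().Get(name)
}

func main() {
	for _, name := range []string{
		"Content-Security-Policy",
		"X-Frame-Options",
		"X-Content-Type-Options",
	} {
		fmt.Printf("%s: %s\n", name, responseHeader(name))
	}
}
```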
Fixes b/130541353
Change-Id: I56e3d2d5ec7ddb2799a9150d142df405d7452b91
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/462179
Reviewed-by: Andrew Bonventre <andybons@google.com>
As we've started to parameterize tab metadata, it is now possible to
implement some simplifications to both our templates and our rendering.
In this CL, a formal 'TabSettings' type is introduced, and used to both
eliminate the contextual 'templateName' template func, and to generate
the module nav via a range action.
Change-Id: I454150158a9b9b5e401f45d52aeb2f00543d0cda
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/461525
Reviewed-by: Andrew Bonventre <andybons@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
As pointed out in b/132284555, it's a little unclear what's going on
when a package is not redistributable: what exactly can't be shown?
This adds a 'DisplayName' property to details, so that we can improve
the error message to make it explicit that the details you're trying to
look at are intentionally hidden.
Fixes b/132284555
Change-Id: I25511ab15b317bb8ca0c53ef994b5f6d5b81a022
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/462576
Reviewed-by: Andrew Bonventre <andybons@google.com>
Previously, the high-level approach taken by extractPackagesFromZip
was to validate and extract the module zip contents to a temporary
work directory on disk, then use the go/packages API to interpret
the Go packages in that directory.
At this processing stage, we're only looking to extract package-scope
information about the Go packages we find inside a module, and nothing
that would require having the full transitive closure of dependencies.
We're also aiming to extract information from potentially broken, large,
incomplete, or otherwise adversarial Go packages in a way where we can
have tight control over the amount of computational resources we allot.
At this time, go/packages is not well-suited for this use-case. Even
when operating in the packages.NeedName | packages.NeedFiles mode,
it can end up indirectly requesting additional information from the
internet and doing other unhelpful work. Additionally, incomplete or
broken go.mod files can cause errors that prevent us from getting the
package name and files information. Finally, the vendor directory can
influence the results and introduce problems; we want to ignore it.
See issues golang.org/issue/31893 and golang.org/issue/31894.
This change implements the Go package processing behavior we need
more directly: by traversing the zip files, looking only at the .go
files in each directory, and parsing the .go files via the go/parser
package. The build.Context.MatchFile method from the go/build package
is used for implementing the build constraint satisfaction logic.
We explicitly do nothing that would require additional work to fetch
and process dependencies or use the internet; only the .go files in
the module zip are processed.
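
The approach can be sketched roughly as follows. This is not the code in
this change: packageNames and newZip are hypothetical helpers, size
validation is omitted, and only the package clause is parsed. It shows
one way to point build.Context.MatchFile at in-memory zip contents via
the context's OpenFile and JoinPath hooks.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"go/build"
	"go/parser"
	"go/token"
	"io"
	"path"
	"strings"
)

// newZip builds an in-memory zip from a map of file name to contents.
// It exists only to make this sketch self-contained.
func newZip(files map[string]string) (*zip.Reader, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, src := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, src); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

// packageNames (hypothetical) maps each directory in the zip to the package
// name declared by its buildable, non-test .go files. Nothing touches disk.
func packageNames(zr *zip.Reader) (map[string]string, error) {
	src := make(map[string][]byte)
	for _, f := range zr.File {
		if !strings.HasSuffix(f.Name, ".go") || strings.HasSuffix(f.Name, "_test.go") {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return nil, err
		}
		data, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, err
		}
		src[f.Name] = data
	}

	// Point the build context at the in-memory files, not the file system.
	ctxt := build.Default
	ctxt.JoinPath = path.Join
	ctxt.OpenFile = func(name string) (io.ReadCloser, error) {
		data, ok := src[name]
		if !ok {
			return nil, fmt.Errorf("%s: not in zip", name)
		}
		return io.NopCloser(bytes.NewReader(data)), nil
	}

	names := make(map[string]string)
	fset := token.NewFileSet()
	for name, data := range src {
		dir, base := path.Dir(name), path.Base(name)
		if match, err := ctxt.MatchFile(dir, base); err != nil || !match {
			continue // excluded by build constraints, or unreadable
		}
		f, err := parser.ParseFile(fset, name, data, parser.PackageClauseOnly)
		if err != nil {
			continue // tolerate broken files rather than failing the module
		}
		names[dir] = f.Name.Name
	}
	return names, nil
}

func main() {
	zr, err := newZip(map[string]string{
		"m@v1.0.0/foo/foo.go":   "package foo\n",
		"m@v1.0.0/tools/gen.go": "//go:build ignore\n// +build ignore\n\npackage main\n",
	})
	if err != nil {
		panic(err)
	}
	names, err := packageNames(zr)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

In this sketch the file under tools/ carries an "ignore" build
constraint, so MatchFile reports no match and the directory is skipped.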
Now that we're taking a lower-level approach to package processing,
it no longer requires the .go files to be on disk. It means we can
process module zip files completely in-memory, one directory at a
time (while still validating for excessively large files/packages),
without having to extract module zip contents to disk before we're
able to start processing.
This approach should result in a significant increase in our ability
to extract Go packages from modules successfully, an improvement in
resource use when processing large modules, and an overall speed-up
in processing time.
For comparison, the time to process the module gocloud.dev@v0.13.0
from scratch with the previous approach was:
0m31.476s
After this change:
0m1.465s
This change deletes the no-longer-used functionality for extracting
files from a zip file to disk.
Fixes b/130218622
Fixes b/130089836
Fixes b/130218992
Fixes b/130089703
Updates b/128784074
Fixes b/130219508
Fixes b/130089884
Fixes b/130219846
Fixes b/130089686
Fixes b/130089413
Fixes b/130089785
Change-Id: I002df8b70bd4cfdffe3229e491ea4cf51ad7fa22
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/464096
Reviewed-by: Robert Findley <rfindley@google.com>
Use an en-dash for numeric ranges when summarizing the current page of
search results, and don't include the result range when all results fit
on a single page. This trivially covers the case where there are no
results.
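
A rough sketch of the summary logic described above. resultSummary is a
hypothetical helper, and the wording used when no range is shown is an
assumption; only the en-dash and the single-page rule come from this
change.

```go
package main

import "fmt"

// resultSummary (hypothetical) describes the current page of results.
// page is 1-based. When everything fits on one page, including the case
// of zero results, no range is shown.
func resultSummary(page, perPage, total int) string {
	if total <= perPage {
		return fmt.Sprintf("%d results", total) // wording is an assumption
	}
	start := (page-1)*perPage + 1
	end := page * perPage
	if end > total {
		end = total
	}
	// \u2013 is an en-dash, used for numeric ranges.
	return fmt.Sprintf("%d\u2013%d of %d", start, end, total)
}

func main() {
	fmt.Println(resultSummary(1, 10, 95))
	fmt.Println(resultSummary(10, 10, 95))
	fmt.Println(resultSummary(1, 10, 7))
	fmt.Println(resultSummary(1, 10, 0))
}
```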
Fixes b/132252091
Fixes b/132249393
Change-Id: Id2b48c5a777c7e4998f45b8d044744e79423ec9b
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/462086
Reviewed-by: Andrew Bonventre <andybons@google.com>
Implement the simplifications to our license algorithms discussed in
b/131921712. Also, expand the set of redistributable licenses to include
all (detectable) OSI approved licenses, and add COPYING.txt to the list
of license file names.
Update detect_test to not depend on the proxy testdata.
Additionally, fix a couple places where license FilePath was operated on
using the filepath library, rather than path. filepath is technically
incorrect, since the path in question is coming from the zip so always
uses forward slashes.
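
As a small illustration of the distinction (licenseDir is a hypothetical
helper, not a function from this change): the path package always treats
"/" as the separator, which matches zip entry names on every platform,
whereas filepath follows the host operating system's separator.

```go
package main

import (
	"fmt"
	"path"
)

// licenseDir (hypothetical) returns the directory containing a license
// file, given its slash-separated path inside a module zip.
func licenseDir(zipPath string) string {
	return path.Dir(zipPath)
}

func main() {
	fmt.Println(licenseDir("example.com/m@v1.0.0/internal/LICENSE"))
}
```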
Fixes b/131921712
Fixes b/131741680
Change-Id: Iae1e4f6caf5e7ffffef3debe6537783f6438b52f
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/461991
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
This change cleans up some inconsistent style in templates.
Standardizing on no spaces was discussed in
Change-Id: I71a231d305100bfaf7c69cf762c20b2fdeef1904
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/462089
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Refactor the cron server to move functionality out of main.go and into a
new cron.Server. This allows writing http tests that better exercise the
full functionality of the cron handlers.
Additionally add two new handlers: /indexupdate/ and /fetchversions/, as
specified in the ETL design.
For manual testing and debugging, revamp the cron server's root handler
to display information about recent versions, and to support making
requests to the cron actions.
Updates b/131614520
Updates b/131612196
Updates b/131614691
Change-Id: I44e9198a367cbf95a90c47a8892cb8ea5eba2cc9
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/460768
Reviewed-by: Julie Qiu <julieqiu@google.com>
This CL modifies the vendored license detection to allow for packages
named 'vendor'.
Fixes b/132193937
Change-Id: I61c1001ae6fc8c69cfc7fcdce5bc4900d883686d
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/461082
Reviewed-by: Julie Qiu <julieqiu@google.com>
Imports and importedby are now grouped based on whether they are internal or external
to the package's module. This makes it easier to understand information about
module dependencies.
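
One way to express the grouping, as a sketch: groupImports is a
hypothetical helper, and treating an import as internal when it equals
the module path or lies beneath it is an assumption about how "internal
to the module" is decided.

```go
package main

import (
	"fmt"
	"strings"
)

// groupImports (hypothetical) splits import paths into those inside the
// given module and those outside it, preserving input order.
func groupImports(modulePath string, imports []string) (internal, external []string) {
	for _, p := range imports {
		// The trailing slash avoids matching sibling modules such as
		// "example.com/mod2" when modulePath is "example.com/mod".
		if p == modulePath || strings.HasPrefix(p, modulePath+"/") {
			internal = append(internal, p)
		} else {
			external = append(external, p)
		}
	}
	return internal, external
}

func main() {
	in, ex := groupImports("example.com/mod", []string{
		"example.com/mod/internal/db",
		"example.com/mod2/util",
		"fmt",
	})
	fmt.Println("internal:", in)
	fmt.Println("external:", ex)
}
```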
Package names are also removed from the imports view to make it easier
to scan the package paths, and for consistency with the module and importedby
views.
Fixes b/132050335
Fixes b/132072505
Updates b/132071971
Change-Id: Ia849dd94129d2f5a95e56a43fd882e5801f0a532
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/460416
Reviewed-by: Robert Findley <rfindley@google.com>
PersistentVolume resources are added for writing to /tmp and /go.
Module zip files that are downloaded from the proxy are written to /tmp
for use by golang.org/x/tools/go/packages.Load for extracting package
information.
/go is set as the GOPATH in the Dockerfile.
golang.org/x/tools/go/packages.Load writes data to /go when running go
list.
Data stored in PersistentVolumes will continue to exist even as Pods are
deleted and recreated, which allows us to use them as a persistent
cache. As a result, the PersistentVolumes fetchdisk-tmp-pvc and
fetchdisk-gopath-pvc are created, with the ReadWriteOnce accessMode.
A StatefulSet is used to create the deployment, to prevent potential
deadlocks when pods are created and deleted. More info:
https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes#deployments_vs_statefulsets
cmd/fetch/README.md is updated with info on how to deploy the fetch
service with these changes.
Change-Id: Ic4964ec43c68edf20e1319c600158b824aff44c1
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/455000
Reviewed-by: Julie Qiu <julieqiu@google.com>
This unifies license-related functionality into a new internal/license
package, with the intention of making it easier to understand handling
of licenses. It also makes some type names a little cleaner.
A few minor changes are made along the way (e.g. LicenseInfo ->
Metadata), and tests are added for license matching.
Updates b/131921712
Updates b/131741680
Change-Id: I6a766513e971cea74938fbc60f0220c702e66bcd
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/459926
Reviewed-by: Julie Qiu <julieqiu@google.com>
Add new IndexVersion and VersionState types, along with functions for
storing, updating, and retrieving them from postgres.
Updates b/131614050
Change-Id: Ifbc26b061bbdc86c8c727d948f7b4f075d405cb2
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/459933
Reviewed-by: Julie Qiu <julieqiu@google.com>
Search results were previously indicated as "<start> of <total>" (such as
"0 of 95").
They are now displayed as "<start>-<end> of <total>" (such as "1-10 of
95").
Fixes b/131929222
Updates b/131908602
Change-Id: I266d7094a261b8f3674b6e2368c618c4d6a73221
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/460700
Reviewed-by: Robert Findley <rfindley@google.com>
The packages on the module view are now listed in order
of the package path to preserve directory structure.
Fixes b/131859852
Change-Id: I1924fd48c310e981e5659a9a3e5c44bb1e6bde1f
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/459561
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
The mvw_search_documents materialized view is now used for search.
postgres.RefreshSearchDocuments is introduced, which will be used
to refresh data in the materialized view in a later CL.
Fixes b/131861129
Fixes b/131859794
Fixes b/131911629
Updates b/131908602
Change-Id: I33209b65b7d00420660bd6dc58b03250c9fc144a
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/459559
Reviewed-by: Robert Findley <rfindley@google.com>
The search query has been optimized with the following:
* Using a where clause to filter out tsv_search_tokens
* Inline query to get packages and licenses and join with the versions
table in a subquery
This reduced a search for 'cloud' from 7934.626 ms to 1746.115 ms.
Updates b/130090305
Change-Id: I6c71800db9055e02474bfe30d4c9d2ee1c4f7e8a
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457637
Reviewed-by: Robert Findley <rfindley@google.com>
At the moment, there is no limit on the number of page links on the
search page, which becomes messy when search has a lot of results.
Page numbers are now calculated using pagesToLink, which returns an
integer slice representing the page numbers that will be displayed. It
optimizes for the current page to be in the middle of that range
(similar to google.com).
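
A sketch of how such a function might work. The signature below,
including the maxLinks parameter, is an assumption for illustration and
may differ from the actual pagesToLink.

```go
package main

import "fmt"

// pagesToLink returns the page numbers to display, at most maxLinks of
// them, keeping current near the middle of the range where possible.
// Pages are 1-based. The signature is hypothetical.
func pagesToLink(current, numPages, maxLinks int) []int {
	if numPages <= 0 || maxLinks <= 0 {
		return nil
	}
	// Center the window on the current page.
	start := current - maxLinks/2
	// Near the end, shift the window left so it stays full.
	if start > numPages-maxLinks+1 {
		start = numPages - maxLinks + 1
	}
	// Near the beginning, clamp to the first page.
	if start < 1 {
		start = 1
	}
	end := start + maxLinks - 1
	if end > numPages {
		end = numPages
	}
	pages := make([]int, 0, end-start+1)
	for p := start; p <= end; p++ {
		pages = append(pages, p)
	}
	return pages
}

func main() {
	fmt.Println(pagesToLink(1, 20, 5))
	fmt.Println(pagesToLink(10, 20, 5))
	fmt.Println(pagesToLink(20, 20, 5))
	fmt.Println(pagesToLink(2, 3, 5))
}
```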
Fixes b/131836875
Updates b/131862035
Change-Id: I9dafc35bff2278a4850bbded086a0000b5f76d3d
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/458088
Reviewed-by: Robert Findley <rfindley@google.com>
A bug was introduced in commit cc073d7624cfc3ca6fd8641101a5ffcc6bd1996d, causing
there to be two schema migrations numbered 000042.
000042_add_module_version_state is renumbered to 000043_add_module_version_state.
Change-Id: Ib25c631d5c8f662c6f8e8aa5c4da6fc395f7298e
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/459560
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Add a module_version_state table to implement the ETL state for new
module versions.
Updates b/131614050
Change-Id: I7496b428588b1d4f461c5da7376012e8513d842a
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/456916
Reviewed-by: Julie Qiu <julieqiu@google.com>
The materialized view mvw_search_documents is created to cache data
needed for generating search results. It contains information about the
number of importers for each package, tsvector tokens used to search,
and package metadata.
This view will be used by internal/postgres.Search in a later CL.
Updates b/131861129
Change-Id: Ib021c1e5cdec07f18304ff8ec5f7198419887dd8
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/458701
Reviewed-by: Robert Findley <rfindley@google.com>
The output is not actionable because thirdparty packages are copied
from an external source, and we can't modify them directly in this
project.
Remove trailing slash from a relative import path "./internal/secrets".
It's more common to not include a trailing slash when specifying import
paths.
Fixes b/131821500
Change-Id: Id7458b1474a4218422035cb15e305e2b2a356b64
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457840
Reviewed-by: Robert Findley <rfindley@google.com>
At least temporarily, relax the content security policy to allow
arbitrary images, since this is currently breaking images in README
rendering.
Updates b/130541353
Change-Id: Ifbdea9c5600851f998553b413916ddab3d7ffd43
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/459719
Reviewed-by: Andrew Bonventre <andybons@google.com>
This change is the first step in applying proper structure
and styling based on mockups from UX.
The template structure has changed with a single "base" page,
base.html used as the first template parsed when constructing
each page. Other templates have been updated to use this new
structure.
An option has been added to the frontend controller to reload
templates on each request to ease development.
Change-Id: I6856ad3f854249aaf29e46fc6c3e563039ab931d
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457866
Reviewed-by: Julie Qiu <julieqiu@google.com>
Update to newer versions of the github.com/golang-migrate/migrate/v4,
cloud.google.com/go, github.com/google/go-cmp, gocloud.dev, and
google.golang.org/genproto modules in order to significantly reduce
the size of the transitive closure of the required modules (from 640
down to 362).
The latest released version of gocloud.dev module is currently v0.13.0,
and does not yet include the change to drop many unnecessary modules
(specifically, see the go.mod/go.sum file diffs in the commit
4a87797b25).
The next release version v0.14.0 of gocloud.dev will include that
improvement, and it's going to come out within a month. We can update
to it then. All tests pass with the new dependency versions.
This reduces the deploy time of ./cmd/frontend (without a module proxy)
from 9-10 minutes to 2-3 minutes. The deploy time can be improved even
further by using a proxy.
Also update to new versions of htmlg and component packages that have
more visible license information.
Change-Id: I12851ac7a3e66f1d66348d83239686375466ea18
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457839
Reviewed-by: Robert Findley <rfindley@google.com>
Add a ContentSecurityPolicy middleware to the frontend. For now, this
adds just a basic policy; the associated bug will not be closed until we
have finalized the policy.
Updates b/130541353
Change-Id: I6da1724525eacc02ad334a15e09c226f28ba6001
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457874
Reviewed-by: Andrew Bonventre <andybons@google.com>
Clearly I've missed a couple warnings from our presubmit, so I'm making
it louder:
+ Use terminal colors to call out errors and warnings.
+ Always run header check on all (known) internal files, so that it
doesn't go silent after a bad commit is merged.
Change-Id: I88116e1dcb0ed6c15b8c1369cbb37de00bd57efd
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457641
Reviewed-by: Julie Qiu <julieqiu@google.com>
This CL introduces initial package documentation rendering code
for the discovery website. It performs the rendering during package
processing in the fetch service. There are many minor details left
to tweak, but it has the right general shape.
Also reduce the scope of work done by packages.Load to work on the
extracted module, and not try to pull in its transitive closure of
dependencies from the internet. What's left is to come up with a way
of writing a test for this, so the behavior doesn't regress.
Load Work Sans, Roboto, and Source Code Pro fonts for use on the package
documentation page.
Update x/tools module to latest version.
Change-Id: I08fbbea8ef2b964c7b083a6dae3a725d2b7e4b68
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457636
Reviewed-by: Julie Qiu <julieqiu@google.com>
The following indexes are added:
* idx_semver_sort ON versions (module_path, major DESC, minor DESC, patch DESC, prerelease DESC)
* idx_package_licenses ON package_licenses(version, module_path, file_path)
* idx_imports_to_path ON imports(to_path, from_path)
The following columns are altered to TYPE TEXT COLLATE "C", since they are used for sorting:
* documents.module_path
* documents.package_path
* documents.series_path
* licenses.file_path
* licenses.module_path
* package_licenses.module_path
* package_licenses.package_path
The primary key for imports is reordered to (to_path, from_path, from_version) to improve
performance for aggregating importedby count.
Fixes b/129600165
Updates b/130090305
Change-Id: Ia158884e5eaab1e5b49a651af38a5439b961052e
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457638
Reviewed-by: Robert Findley <rfindley@google.com>
If switching between branches with different migrations, it's likely
that migration.Up will fail on your test databases. This can be a pain
now that there are multiple test databases to fix.
To handle this, automate the re-creation of the database if the
migration fails during the Up step.
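
The control flow can be sketched as below. migrateUp, errDirty, and the
two callbacks are hypothetical names; in practice the callbacks would
wrap the migration library's Up step and the drop-and-create logic for
the test database.

```go
package main

import (
	"errors"
	"fmt"
)

// errDirty is a stand-in error used by the demo below.
var errDirty = errors.New("migration failed: schema out of sync")

// migrateUp (hypothetical) runs up. If it fails, the database is
// re-created via recreate and up is retried exactly once.
func migrateUp(up, recreate func() error) error {
	if err := up(); err == nil {
		return nil
	}
	if err := recreate(); err != nil {
		return fmt.Errorf("recreating database: %v", err)
	}
	return up()
}

func main() {
	recreated := false
	up := func() error {
		if !recreated {
			return errDirty // simulate a database left over from another branch
		}
		return nil
	}
	recreate := func() error {
		recreated = true
		return nil
	}
	fmt.Println("error:", migrateUp(up, recreate), "recreated:", recreated)
}
```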
Also, remove no-longer-needed migration for discovery-database-test.
Updates b/130719094
Change-Id: Ibb0550b95c2e7a3e437402cfe21ea22e460ac27b
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457036
Reviewed-by: Julie Qiu <julieqiu@google.com>
The default tab is changed to "doc".
The module path is now displayed on the module tab.
Updates b/124309981
Fixes b/131746914
Change-Id: I85957b1b7dd587ad54544e9c269f3f630619a32b
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/457476
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>