Commit Graph

25 Commits

Author SHA1 Message Date
Julie Qiu bf8adebf67 cmd/cron,internal/cron: introduce max parallelism for requests to fetch service
Requests made from the cron endpoint /new/ to the fetch service are now
parallelized using cron.FetchVersions and by specifying a workers flag
for the maximum number of requests in flight.

Fixes b/128540164

Change-Id: Ic0885739ae668495cd91e5dacfd39846b16e83f3
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/442300
Reviewed-by: Robert Findley <rfindley@google.com>
2020-03-27 16:46:35 -04:00
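A minimal sketch of the bounded parallelism described in commit bf8adebf67, assuming a counting semaphore built from a buffered channel. The name fetchAll and its parameters are hypothetical; this is not the actual cron.FetchVersions signature, which the commit message does not show.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch once per version while keeping at most `workers`
// calls in flight. It returns one error slot per input version.
// fetchAll is a hypothetical helper, not the real cron.FetchVersions.
func fetchAll(versions []string, workers int, fetch func(string) error) []error {
	if workers < 1 {
		workers = 1
	}
	errs := make([]error, len(versions))
	sem := make(chan struct{}, workers) // counting semaphore
	var wg sync.WaitGroup
	for i, v := range versions {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` calls are already running
		go func(i int, v string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			errs[i] = fetch(v)
		}(i, v)
	}
	wg.Wait()
	return errs
}

func main() {
	versions := []string{"example.com/a@v1.0.0", "example.com/b@v1.2.3"}
	errs := fetchAll(versions, 2, func(v string) error {
		fmt.Println("fetching", v)
		return nil
	})
	fmt.Println(len(errs), "requests finished")
}
```

The workers argument plays the role of the workers flag mentioned in the message: raising it allows more simultaneous requests to the fetch service.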
Rob Findley 25fea74665 internal/frontend: rename 'tabName' to the more correct 'pageName'
Change-Id: Iadc768b1e5fad23f49f688cee857d59e2f72d9a2
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/446470
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:34 -04:00
Julie Qiu 79707fd405 internal/fetch: do not read non-.go files greater than 10MB
extractPackagesFromZip is updated so that it will skip non-.go files
greater than 10MB when parsing packages. If there is a .go file in the
module greater than 10MB, an error will be returned.

detectLicense is updated so that it will skip license files found
that are greater than 10MB.

Fixes b/130089430

Change-Id: I8ea97d514b7b07968bbc219c9b257f125df50e33
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/446245
Reviewed-by: Robert Findley <rfindley@google.com>
2020-03-27 16:46:34 -04:00
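The size rule from commit 79707fd405 can be sketched as a small predicate. This is an illustration only: shouldRead and errGoFileTooLarge are hypothetical names, and whether the real limit is 10<<20 or 10*1000*1000 bytes is not stated in the message, so the constant below is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxFileSize mirrors the 10MB limit from the commit message.
// The exact byte count (10<<20) is an assumption.
const maxFileSize = 10 << 20

var errGoFileTooLarge = errors.New(".go file exceeds size limit")

// shouldRead reports whether a zip entry with the given name and
// uncompressed size should be read. Oversized non-.go files are skipped;
// an oversized .go file is an error.
func shouldRead(name string, size uint64) (bool, error) {
	if size <= maxFileSize {
		return true, nil
	}
	if strings.HasSuffix(name, ".go") {
		return false, fmt.Errorf("%s (%d bytes): %w", name, size, errGoFileTooLarge)
	}
	return false, nil // oversized non-.go file: skip without failing
}

func main() {
	entries := []struct {
		name string
		size uint64
	}{
		{"mod@v1.0.0/main.go", 2048},
		{"mod@v1.0.0/testdata/blob.bin", maxFileSize + 1},
		{"mod@v1.0.0/generated.go", maxFileSize + 1},
	}
	for _, e := range entries {
		ok, err := shouldRead(e.name, e.size)
		fmt.Println(e.name, ok, err)
	}
}
```

With archive/zip, the size argument would typically come from the UncompressedSize64 field of each entry's header.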
Dmitri Shuralyov e22e11aa97 internal/frontend: use blackfriday/v2 from github.com instead of gopkg.in
The canonical import path of blackfriday v2 in module mode is:

	github.com/russross/blackfriday/v2

It can be seen in the go.mod file of the latest version (v2.0.1)
at https://github.com/russross/blackfriday/blob/v2.0.1/go.mod#L1.
Also, https://github.com/russross/blackfriday#versions says:

> With package management you should import github.com/russross/blackfriday
> and specify that you're using version 2.0.0.

The wording there needs to be improved/updated, but for module mode,
it means to add a /v2 suffix.

Change-Id: Ie833acef8ecd1727a902723c07da4a918cdc7e68
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/446244
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:34 -04:00
Rob Findley 860eff1d8c cron: use the index timestamp for polling
This fixes a latent bug where clock offsets cause us to miss versions.
Spanner provides a monotonicity guarantee so that its commit time is
viable for pagination, if not ideal.

Unfortunately it's only monotonic, not strictly increasing: querying for
since=t_0 returns all versions with Timestamp >= t_0, and therefore our
polling will always overlap by at least one module.

Fixes b/129700670

Change-Id: Iab9a2ed0445c913ef5f1d590d30152adda9a157e
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/444659
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:34 -04:00
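To illustrate the overlap described in commit 860eff1d8c, here is a small in-memory sketch of inclusive polling. All names (indexVersion, since, poll, demoIndex) are hypothetical, and the deduplication by a seen set is one possible way to absorb the overlap, not necessarily what the cron does.

```go
package main

import (
	"fmt"
	"time"
)

// indexVersion is a hypothetical stand-in for one module index entry.
type indexVersion struct {
	Path, Version string
	Timestamp     time.Time
}

// since returns every entry with Timestamp >= t, like the inclusive query.
func since(index []indexVersion, t time.Time) []indexVersion {
	var out []indexVersion
	for _, v := range index {
		if !v.Timestamp.Before(t) {
			out = append(out, v)
		}
	}
	return out
}

// poll returns the entries not seen before, plus the timestamp to use
// for the next poll (the largest timestamp observed so far).
func poll(index []indexVersion, t time.Time, seen map[string]bool) ([]indexVersion, time.Time) {
	next := t
	var fresh []indexVersion
	for _, v := range since(index, t) {
		if v.Timestamp.After(next) {
			next = v.Timestamp
		}
		key := v.Path + "@" + v.Version
		if !seen[key] {
			seen[key] = true
			fresh = append(fresh, v)
		}
	}
	return fresh, next
}

// demoIndex returns two sample entries one minute apart.
func demoIndex() []indexVersion {
	t0 := time.Date(2019, 4, 1, 12, 0, 0, 0, time.UTC)
	return []indexVersion{
		{"example.com/a", "v1.0.0", t0},
		{"example.com/b", "v0.2.0", t0.Add(time.Minute)},
	}
}

func main() {
	idx := demoIndex()
	seen := map[string]bool{}
	t := idx[0].Timestamp
	for i := 0; i < 2; i++ {
		var fresh []indexVersion
		fresh, t = poll(idx, t, seen)
		fmt.Printf("poll %d: %d new, next since=%s\n", i+1, len(fresh), t.Format(time.RFC3339))
	}
}
```

On the second poll the inclusive query still returns the boundary entry, but it is recognized as already seen and not reported again.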
Julie Qiu 5be9805028 internal/fetch,internal/postgres: fix no packages error and error messages display
validateVersions used to return an error if a version did not have any
packages. These versions should be inserted into the database, since we
could have a module without any packages.

Error messages corresponding to a given version no longer print the full
struct. They will only print the module path and version.

GetLatestPackageForPaths is also moved to internal/postgres/search.go, since
it is used for search.

Fixes b/130027196.

Change-Id: Ief0ee0ab977a47ee0840da14ef2336f082b2db28
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/444661
Reviewed-by: Channing Kimble-Brown <ckimblebrown@google.com>
2020-03-27 16:46:34 -04:00
Rob Findley 89cdb14112 discovery: add a timeout for client requests
Use the Timeout middleware to add a timeout for incoming requests.
As a notable exception, don't do this for module fetch, as we want
module processing to continue even if the client cancels their request
early.  Instead, set an explicit timeout for module processing.

Additionally, bind work done on behalf of the request to the request
context.  This required switching to the context-aware sql APIs, setting
a context on outbound HTTP requests, and quite a bit of 'ctx' threading.

Fixes b/128689909

Change-Id: Ic732808b6df89a61106f5d14b8a9ecaaefb95c4f
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/443023
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:34 -04:00
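A sketch of the request timeout from commit 89cdb14112, assuming the standard library's http.TimeoutHandler as the middleware (the message does not say which Timeout middleware is used). The helpers slow, withTimeout, and statusFor are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// slow simulates request-scoped work that honors context cancellation.
func slow(d time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(d):
			fmt.Fprintln(w, "done")
		case <-r.Context().Done():
			// The timeout fired or the client went away; stop working.
		}
	})
}

// withTimeout wraps h so that requests running longer than d receive
// a 503 response. http.TimeoutHandler is an assumed choice here.
func withTimeout(h http.Handler, d time.Duration) http.Handler {
	return http.TimeoutHandler(h, d, "request timed out")
}

// statusFor sends one GET request through h and returns the status code.
func statusFor(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println("fast handler:", statusFor(withTimeout(slow(time.Millisecond), time.Second)))
	fmt.Println("slow handler:", statusFor(withTimeout(slow(time.Second), 10*time.Millisecond)))
}
```

Work that must outlive the client request, such as module fetch in the commit message, would instead create its own context with context.WithTimeout rather than inherit the request context.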
Julie Qiu 06f2b9a513 internal/frontend,internal/postgres: implement fetchSearchPage
frontend.fetchSearchPage is implemented, which returns all of the data
needed to generate results for a search page.

postgres.Search is updated so that results with a rank score of 0 are
not returned.

Change-Id: I2e41716ce0cfa9db70b0a84930b4e2568a2c969b
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/443906
Reviewed-by: Robert Findley <rfindley@google.com>
2020-03-27 16:46:34 -04:00
Andrew Bonventre 4673ce2411 discovery: make small improvements for developer ergonomics
+ Fix a test that fails due to precision differences in time values
  in Postgres
+ Provide an example URL when visiting the root path of the fetch
  service
+ Update handlers to return a 404 instead of 500 if a user provides
  an invalid path
+ Don't require a full URL when parsing module data since the path
  is all that's needed

Change-Id: I9f8dc13b0a4995e6cb848f86ad41f6fed2759f82
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/439693
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:34 -04:00
Channing Kimble-Brown a247bd68db internal, internal/fetch: update parseVersion return types and version struct
This change updates parseVersion so that it returns both a
VersionType and an error, and adds a VersionType field to the
Version struct.

Change-Id: I35527ab135b21502beead3103d5075914da49a37
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/437555
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:34 -04:00
Julie Qiu d1a9c54f60 internal/frontend: display READMEs written in markdown as HTML
READMEs written in markdown will be displayed as HTML.

internal.Version.ReadMe is changed from type string to []byte.

Minor changes are made to the Overview page to reflect changes for
package discovery.

Change-Id: I52e5ae091cf8069154415e02c7d88c7911f3a205
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/437553
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:34 -04:00
Channing Kimble-Brown a6b5740369 internal, internal/fetch: add VersionType and getVersionType
The database needs an enum column called version_type to indicate
the type of version a package has. To insert packages with the
correct version type, VersionType is added, along with the function
getVersionType, which parses a package's version to determine its
VersionType.

getVersionType is also used to make sure that the version is valid
in FetchAndInsertVersion.

Change-Id: Ia15c23c3783dbb83934ce8b0b580a25bafa219c2
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/437333
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:34 -04:00
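One way the classification in commit a6b5740369 could look, as a sketch. The commit names VersionType and getVersionType but does not list the enum values, so the three values below (release, prerelease, pseudo) and the simplified regular expressions are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// VersionType values are assumed; the commit message does not list them.
type VersionType string

const (
	VersionTypeRelease    VersionType = "release"
	VersionTypePrerelease VersionType = "prerelease"
	VersionTypePseudo     VersionType = "pseudo"
)

var (
	// semverRE is a simplified semantic version pattern with a leading "v".
	semverRE = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$`)
	// pseudoRE is a simplified check for the timestamp-and-hash suffix
	// that Go pseudo-versions end with.
	pseudoRE = regexp.MustCompile(`[.-]\d{14}-[0-9a-f]{12}$`)
)

// getVersionType classifies v, or returns an error if v is not a valid
// version under the simplified pattern above.
func getVersionType(v string) (VersionType, error) {
	m := semverRE.FindStringSubmatch(v)
	if m == nil {
		return "", fmt.Errorf("invalid version %q", v)
	}
	switch {
	case pseudoRE.MatchString(v):
		return VersionTypePseudo, nil
	case m[4] != "":
		return VersionTypePrerelease, nil
	default:
		return VersionTypeRelease, nil
	}
}

func main() {
	for _, v := range []string{"v1.2.3", "v1.2.3-beta.1", "v0.0.0-20190327164635-bf8adebf67ab", "1.2.3"} {
		t, err := getVersionType(v)
		fmt.Println(v, t, err)
	}
}
```

Returning an error for unparseable input is what lets the same function double as a validity check, as the message describes for FetchAndInsertVersion.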
Julie Qiu f6ffaa5bfb internal/postgres,internal/fetch: implement read/write packages
When a module zip is downloaded by the fetch service, its packages will
be extracted. The synopsis for each package will also be determined
using go/doc.

These packages will be written to postgres with postgres.InsertVersion.
Because we expect to add more version data over time,
postgres.InsertVersion will no longer fail on duplicate key errors.

A package can be retrieved with postgres.GetPackage.

Change-Id: I9916d82c1479914f0b91ca02105aab86e578aac7
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/436915
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:34 -04:00
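For the synopsis step in commit f6ffaa5bfb, a minimal sketch using go/parser and go/doc. The helper name synopsis is hypothetical, and the real code may obtain the package comment differently.

```go
package main

import (
	"fmt"
	"go/doc"
	"go/parser"
	"go/token"
)

// synopsis parses only the package clause of src and returns the first
// sentence of the package comment, or "" if there is no package comment.
func synopsis(src string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "pkg.go", src, parser.ParseComments|parser.PackageClauseOnly)
	if err != nil {
		return "", err
	}
	if f.Doc == nil {
		return "", nil
	}
	return doc.Synopsis(f.Doc.Text()), nil
}

func main() {
	src := "// Package foo does bar. It also does baz.\npackage foo\n"
	s, err := synopsis(src)
	fmt.Printf("%q %v\n", s, err)
}
```

Newer Go releases mark doc.Synopsis as deprecated in favor of the Synopsis method on doc.Package; the function is used here because it works without building a full doc.Package.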
Julie Qiu 31b7830d5d internal/proxy,internal/fetch: create module zipfiles during test setup
All module zips inside internal/proxy/testdata were previously created
manually. They are now created automatically when tests are run.

Test data for modules that was previously inside internal/fetch/testdata
is moved to internal/proxy/testdata.

Tests inside internal/proxy and internal/fetch are updated to reflect
these changes.

Change-Id: I348fd4ebfe6c4ec0751945966984afc3859a2330
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/435777
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
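Creating module zips at test time, as in commit 31b7830d5d, might look roughly like this. buildZip and listZip are hypothetical helpers that work in memory; the actual test setup may write files under testdata instead.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"sort"
)

// buildZip writes files (entry name -> contents) into an in-memory zip.
func buildZip(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	names := make([]string, 0, len(files))
	for n := range files {
		names = append(names, n)
	}
	sort.Strings(names) // deterministic entry order
	for _, n := range names {
		w, err := zw.Create(n)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(files[n])); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listZip returns the entry names stored in a zip archive.
func listZip(data []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	data, err := buildZip(map[string]string{
		"my.mod/module@v1.0.0/go.mod": "module my.mod/module\n",
		"my.mod/module@v1.0.0/foo.go": "// Package foo is a sample.\npackage foo\n",
	})
	if err != nil {
		panic(err)
	}
	names, err := listZip(data)
	fmt.Println(names, err)
}
```

Generating the archives from plain source trees keeps the test fixtures reviewable, since binary zips no longer need to be checked in.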
Julie Qiu dc7cd4ceb0 discovery: change replace directive for sos.googlesource.com/sos
The replace directive for sos.googlesource.com/sos is moved to
the root of the discovery repo to fix a build error when deploying GAE.

Fixes b/128783757.

Change-Id: I02d290f2f98e161e992b8191d9f74a8fc2d54531
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/435776
Reviewed-by: Channing Kimble-Brown <ckimblebrown@google.com>
2020-03-27 16:46:33 -04:00
Channing Kimble-Brown f75d561711 migrations: adding and fixing licenses
Some of the migrations files were missing licenses or said
'Copyright 2009' instead of 'Copyright 2019' and needed to be
fixed.

Change-Id: I36a6c627b8893ad227b608ea96d4f257e9ab0562
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/435003
Reviewed-by: Julie Qiu <julieqiu@google.com>
2020-03-27 16:46:33 -04:00
Julie Qiu cfb8fdf20b internal/cron: create proxy index cron
The proxy index cron is created, with a job to get new versions from the
module index. The cron will:

1. query the module index for new versions since a given timestamp
2. write each version to the version_logs table
3. make a request to the fetch service for each version to be
downloaded

The fetch client is also fixed to make a GET request to
https://<fetch-url>/<module>/@v/<version>.

Change-Id: I838029d94f9b2782e0c1066ec7932931b47fe01e
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/426749
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
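Steps 2 and 3 of the cron in commit cfb8fdf20b, together with the fetch URL shape, can be sketched as follows. The types and function names (version, fetchURL, runOnce) are hypothetical, the result of step 1 is passed in as a slice, and logging and HTTP are abstracted as callbacks so the sketch runs without a database or network.

```go
package main

import (
	"fmt"
	"strings"
)

// version is a hypothetical stand-in for one new module version.
type version struct{ Path, Version string }

// fetchURL builds <fetch-url>/<module>/@v/<version>.
func fetchURL(base string, v version) string {
	return strings.TrimSuffix(base, "/") + "/" + v.Path + "/@v/" + v.Version
}

// runOnce logs every version first (step 2), then asks the fetch
// service to download each one (step 3).
func runOnce(base string, versions []version, writeLog func(version) error, get func(url string) error) error {
	for _, v := range versions {
		if err := writeLog(v); err != nil {
			return fmt.Errorf("logging %s@%s: %w", v.Path, v.Version, err)
		}
	}
	for _, v := range versions {
		if err := get(fetchURL(base, v)); err != nil {
			return fmt.Errorf("fetching %s@%s: %w", v.Path, v.Version, err)
		}
	}
	return nil
}

func main() {
	versions := []version{{"example.com/a", "v1.0.0"}, {"example.com/b", "v0.2.0"}}
	err := runOnce("https://fetch.example.com", versions,
		func(v version) error { fmt.Println("log", v.Path, v.Version); return nil },
		func(url string) error { fmt.Println("GET", url); return nil },
	)
	fmt.Println("err:", err)
}
```

In a real client the get callback would issue an HTTP GET, for example with http.Get, and treat non-2xx responses as errors.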
Channing Kimble-Brown 8f815ef30a cmd/frontend, internal/frontend: create frontend binary and load overview
There is now a frontend binary that accepts HTTP requests for:

GET /name?v=<version>

This route renders an overview page, which displays the name, version
publish date and readme for a specified module. The readme has not yet
been rendered into HTML.

Also note that the tabs are not fully functional or accessible and
these issues will be thoroughly addressed in a future CL.

Fixes b/124438879
Fixes b/124438779
Fixes b/124438684

Change-Id: I569dd3549bdea781183e2d7d29be375141fb5c3d
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/425454
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
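A sketch of how a handler for the GET /name?v=<version> route in commit 8f815ef30a might read its inputs. The handler name overview, the helper get, the plain-text response, and the 400 status for missing inputs are all assumptions; the real handler renders an HTML overview page.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// overview reads the module name from the URL path and the version from
// the "v" query parameter. Responding 400 when either is missing is an
// assumption, not something the commit message specifies.
func overview(w http.ResponseWriter, r *http.Request) {
	name := strings.TrimPrefix(r.URL.Path, "/")
	version := r.URL.Query().Get("v")
	if name == "" || version == "" {
		http.Error(w, "usage: /<name>?v=<version>", http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "%s@%s", name, version)
}

// get sends one GET request to overview and returns status and body.
func get(target string) (int, string) {
	rec := httptest.NewRecorder()
	overview(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(get("/example.com/mod?v=v1.0.0"))
	fmt.Println(get("/example.com/mod"))
}
```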
Julie Qiu b2c75fd60f internal/thirdparty: add script to download cmd/go/internal packages
The command internal/thirdparty/download.go is added, which is used to
download cmd/go/internal packages from https://go.googlesource.com/go.
It also fixes the imports.

The cmd/go/internal packages module and semver are added using this
tool. These will be used to get requirements from a given mod file.

Change-Id: I81b0e065df0d7f544dcb6e5c985082a61866d7c9
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/424991
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
Julie Qiu fba6d34c13 internal/fetch: implement license detection
detectLicenses is implemented, which searches for possible license files
in the contents directory of the provided zip path, runs them against a
license classifier, and returns all licenses whose confidence score is
above 97%.

Change-Id: I91333beb9a84f1a89a7bbd054c1b81c98022d1ce
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/426792
Reviewed-by: Katie Hockman <katiehockman@google.com>
2020-03-27 16:46:33 -04:00
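The confidence cutoff from commit fba6d34c13 could be applied as in this sketch. The match type, its fields, and the function confident are hypothetical; the classifier itself is not modeled, only the filtering of its results.

```go
package main

import "fmt"

// match is a hypothetical classifier result for one candidate file.
type match struct {
	File       string
	License    string
	Confidence float64 // 0.0 to 1.0
}

// minConfidence is the 97% threshold named in the commit message.
const minConfidence = 0.97

// confident keeps only matches whose confidence is above the threshold.
func confident(matches []match) []match {
	var out []match
	for _, m := range matches {
		if m.Confidence > minConfidence {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	results := confident([]match{
		{"LICENSE", "MIT", 0.99},
		{"COPYING", "GPL-3.0", 0.97},
		{"NOTICE", "Apache-2.0", 0.50},
	})
	for _, m := range results {
		fmt.Println(m.File, m.License, m.Confidence)
	}
}
```

Because the message says "above 97%", a score of exactly 0.97 is treated as below the bar here.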
Julie Qiu 7fd4e6812f internal/postgres: update InsertVersion error message
The error message for attempting to insert an invalid version is updated
to specify the fields that cause the error.

Change-Id: Ic92da9940c6fa337d082eb126e25aee42591721d
Reviewed-on: https://team-review.git.corp.google.com/c/golang/discovery/+/426796
Reviewed-by: Katie Hockman <katiehockman@google.com>
2020-03-27 16:46:33 -04:00
Julie Qiu 79ac5ae337 internal/fetch: implement fetch and insert version to postgres
An initial version of FetchAndInsertVersion is implemented, which:

(1) Downloads the given module version from the module proxy
(2) Processes the contents:
  - Calculate series name
  - Get the contents of the README
  - Get the contents of the license
  - Get packages for the module
(3) Writes the data to postgres.

Change-Id: Iabce23879124a03599dc5779302e69cc5e432f88
Reviewed-on: https://team-review.git.corp.google.com/c/422995
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
Julie Qiu 359416308b internal/fetch: implement get packages from a module zip
packagesInModuleZip is implemented, which parses the zip for a module version and returns
its packages.

Change-Id: I5fb12dfbe3b5912fa748ce5b77cd47f01055d60a
Reviewed-on: https://team-review.git.corp.google.com/c/420373
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
Julie Qiu 54ce811d27 discovery/internal/sourcestorage: implement read/write from GCS
The discovery/internal/sourcestorage package is created to read/write
from a GCS bucket.

The following methods are implemented:
- OpenBucket(ctx context.Context, bucket string) (*Bucket, error)
- Write(ctx context.Context, key string, data []byte) error
- Read(ctx context.Context, key string) ([]byte, error)

Change-Id: Ib85f4beba32b08c3cd9935ba5792de22dd76ed45
Reviewed-on: https://team-review.git.corp.google.com/c/414335
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00
Julie Qiu dc3e6e940c discovery/internal/postgres: read/write to the version_logs table
The discovery/internal/postgres package is created to interact
with the discovery database. The following methods are implemented:
- LastestProxyIndexUpdate() *time.Time
- InsertNewVersionsToLog(newVersions []*internal.VersionLog) error

These will be used by the proxy index cron when fetching new versions
from the module index.

The version_log table is renamed to version_logs. This was initially
a typo.

A dependency on github.com/lib/pq v1.0.0 is introduced.

Change-Id: I720a836dc85a37a5df863e879ce6f60082c795f1
Reviewed-on: https://team-review.git.corp.google.com/c/413847
Reviewed-by: Andrew Bonventre <andybons@google.com>
2020-03-27 16:46:33 -04:00