Because:
- It turns out we can invert the target for #integration tags with a regex
This Commit:
- Applies a negation to our unit test filter, so it runs every test that does not match #integration.
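As a rough illustration of the negated filter, a negative lookahead selects every title that does not contain #integration. This is only a sketch: the pattern, the helper name, and the test titles are assumptions, not the repository's actual configuration.

```typescript
// Hypothetical pattern: matches any single-line title that does NOT contain "#integration".
const UNIT_ONLY = /^(?!.*#integration).*$/;

// Hypothetical helper: returns the titles a unit-test run would pick up under this filter.
function selectUnitTests(titles: string[]): string[] {
  return titles.filter((title) => UNIT_ONLY.test(title));
}

const titles = [
  'hashes passwords',
  '#integration - writes to the database',
  'parses config',
];
console.log(selectUnitTests(titles));
```

The lookahead keeps the inversion inside the pattern itself, which helps when the test runner only accepts a single regex and has no separate "exclude" option.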
Because:
- we want to use the latest node LTS major version
This commit:
- upgrades FxA to use node 18, with two workarounds:
  - Webpack uses a hash algorithm that's no longer supported by default
    in node 17+, causing build failures; --openssl-legacy-provider is
    used as the workaround
  - dns.lookup in node 17+ by default returns results in the order the
    resolver provides them, which could lead to 'localhost' resolving
    to ::1; --dns-result-order=ipv4first is used as the workaround
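One possible way to apply both workarounds together is through the NODE_OPTIONS environment variable. The sketch below only composes that string. The two flags come from this commit; the helper name and the idea of merging them into an existing value are assumptions, and the commit does not say where the flags are actually set.

```typescript
// The two workaround flags named in this commit.
const NODE18_WORKAROUNDS = [
  '--openssl-legacy-provider', // re-enables the legacy hash algorithm Webpack relies on
  '--dns-result-order=ipv4first', // orders IPv4 addresses ahead of IPv6 in dns.lookup results
];

// Hypothetical helper: appends any missing workaround flag to an existing NODE_OPTIONS value.
function withNode18Workarounds(existing: string = ''): string {
  const current = existing.split(/\s+/).filter((flag) => flag.length > 0);
  const missing = NODE18_WORKAROUNDS.filter((flag) => current.indexOf(flag) === -1);
  return current.concat(missing).join(' ');
}

console.log(withNode18Workarounds('--max-old-space-size=4096'));
```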
Because:
- TypeScript errors were occurring during the docker build of the deploy workflow
- We want to catch TypeScript errors as early as possible
This Commit:
- Adds a compile job that runs tsc --noEmit across all workspaces
Because:
- Something was off for base-install in jobs using the ci-base-browser-latest images. Yarn cache hits were not happening as anticipated. Also, in the event the yarn install could be skipped, the postinstall script was taking longer than expected.
- An occasional race condition where the deploy-fxa-ci-browser-image could be built off a stale base image was detected.
- Since ci-base-latest and ci-base-browsers-latest weren’t pushed at the same time, occasionally different builds would be used in a pipeline. Although this never directly led to problems, it did seem like the potential was there.
- The ci-* docker images were larger than desired.
- While investigating optimizations, it became apparent that job spin-up times could be further reduced by using CircleCI workspaces.
- Nightly builds are useful, and we would occasionally encounter a regression on main that wasn’t present in PRs due to differences in resulting state after a merge.
- SRE needs a better way to trigger smoke tests in parallel.
This commit:
- Reworks the Dockerfile.
- Puts everything into one multi-stage build, which results in better layer caching.
- Creates new tag names ci-builder, ci-test-runner, & ci-functional-test-runner
- Makes sure these images are as small as possible
- Updates references to these new images in the executors
- Ensures that the ci images are built on the same machine and pushed at the same time. This addresses the potential race condition described in the because section above.
- Short circuits the docker build for ci images if there are no npm package changes detected. Since the base images are really just a way to decrease build time by caching package dependencies in a docker image that compresses better than the CircleCI cache, we don’t actually need to build the image unless a package change has occurred.
- Leverages CircleCI workspaces. The build now primes the workspace and it is then restored in the subsequent jobs. This ends up saving time, because we skip the build / install step, and it allows us to use much smaller docker images for any job running post build.
- Adds a script to extract the current state of the yarn cache from a docker image, so the yarn cache can be kept fresh. This was an oversight in the initial pass.
- For functional tests, the operation that starts the pm2 stack and the actual jobs have been separated into different steps. This gives us better timing metrics, and also lets us see which step fails most often. Starting the stack actually takes up a considerable amount of time, and shouldn’t be confused with the time it takes to execute tests.
- For functional tests (both playwright and content), a memory optimization was made by not including fxa-shared or fxa-react in the ‘run start’ operation. Both of these workspaces are now built in the ‘build’ stage, so there is no reason to include them when spinning up the stack. This actually resulted in a noticeable reduction in memory usage. This is probably due to the fact that these pm2 tasks were running a watch operation, which is more memory intensive than just running a build.
- Ensures that the entire git history isn't copied into the base image, which results in a smaller image. (Our git history is surprisingly large!) This is done by setting a depth of 1 when cloning and depth of 2 when fetching. These changes also address FXA-6676, because the clone and fetch operations have been modified.
- Adds a nightly workflow, so that we can run a full test suite to guard against regressions resulting from a merge into main. See FXA-6626 for more information. We can also use this nightly flow to postpone any CI tasks that run on main, but aren’t urgent.
- Retains the ability to manually trigger tasks that were shifted to the nightly workflow. These tasks include deploying storybook or deploying packages.
- Adds the ability to trigger smoke test workflows with pipeline parameters. It’s possible this will simplify executing smoke tests in parallel for SRE.
- Fixes the issue of slow base-install script performance when the yarn install operation could be skipped. The issue was that when we invoke postinstall directly, fxa-shared would be built. This is already taken care of in the build step, so doing so in the base install was an unneeded redundancy.
- Adds a few other minor improvements and fixes, such as avoiding a couple of redundant build / lint operations, making the build script a bit more robust, and cleaning up the config file a bit.
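The short-circuit for the ci image build described above can be pictured as a check over the list of changed files. This is a minimal sketch under stated assumptions: the commit does not say how package changes are detected, so the file pattern and the helper name here are hypothetical.

```typescript
// Hypothetical pattern: files whose changes affect the installed dependencies.
const DEPENDENCY_FILES = /(^|\/)(package\.json|yarn\.lock)$/;

// Hypothetical helper: true when at least one changed path is a manifest or lockfile,
// meaning the cached dependency image is stale and should be rebuilt.
function needsCiImageRebuild(changedFiles: string[]): boolean {
  return changedFiles.some((file) => DEPENDENCY_FILES.test(file));
}

console.log(needsCiImageRebuild(['packages/fxa-shared/package.json', 'README.md']));
console.log(needsCiImageRebuild(['README.md']));
```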
Because:
- We wanted to run a few preliminary checks before proceeding to more
expensive CI jobs. Checks include:
  - Compiling TypeScript in commonly referenced workspace packages
  - Linting code that has changed
  - Executing unit tests for code that has changed
- We wanted to partition test operations into unit tests and
integration tests. Unit tests can be run relatively quickly and
require no additional infrastructure. Integration tests require
additional infrastructure and generally have longer execution
times. Now that jobs are blocked from running until preliminary
checks pass, one of which is unit tests, it is important to draw a
distinction between these two types of tests.
- We want to avoid unnecessary yarn installs and TypeScript
compilations, which are time consuming.
- We want to make sure that test results are published and failing tests
can be easily viewed in the CI.
This Commit:
- Creates a build-and-validate job in the CI that builds, lints, and
unit tests code prior to running any other jobs.
- Creates a unit-test job in the CI config
- Creates an integration-test job in the CI config
- Removes redundant calls to compile workspace packages. These
are now built up front, cached, and restored as needed for future
runs.
- Extends the create-lists script functionality to generate commands
that can be executed with the parallel command.
- Removes unnecessary yarn install operations. Invoking yarn workspace
focus results in a yarn install. In the case of running tests this is largely
unnecessary, because we already do a yarn install in the base-install
step.
- Makes sure test results are exported as JUnit XML so the CI can report
back on failing tests. This was done for a couple of workspace
packages, but many were lacking the capability. All test:unit and
test:integration npm scripts now export this data.
- Fixes the following issues encountered along the way:
  - Adds logs to monitor heap usage of jest tests. Some
    jest tests are still using a lot of memory.
  - Moves a few slow / long running tests from unit tests to
    integration tests.
  - Ensures that jest.transform for ts-jest always has the
    config option isolatedModules set to true. This
    definitely decreases memory overhead and resolves some
    of the OOM errors we were hitting. It was configured in
    some places but not everywhere.
  - Exports test results files for all tests
  - Exports all test artifacts
  - Uses GNU parallel to run tests in parallel. Turns out yarn
    workspaces foreach would give a false positive when an OOM
    was encountered. Fortunately, the parallel command offered an
    acceptable workaround, and even offers some nice features
    like the load argument, which allows us to control test execution a
    bit more efficiently.
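To illustrate the idea of generating commands for the parallel command, here is a small sketch that emits one shell command per workspace, one per line. It is not the create-lists script itself: the helper name, the workspace names in the usage line, and the exact command shape are assumptions.

```typescript
// Hypothetical helper: builds one command per workspace, newline separated,
// which is the line-per-job input format GNU parallel reads from stdin or a file.
function buildTestCommands(workspaces: string[], script: string): string {
  return workspaces
    .map((name) => `yarn workspace ${name} run ${script}`)
    .join('\n');
}

console.log(buildTestCommands(['fxa-shared', 'fxa-auth-server'], 'test:unit'));
// The resulting list could then be fed to something like:
//   parallel --load 50% < commands.txt
// (the 50% threshold and the file name are illustrative values only)
```

Because each line is an independent process, a job that dies from an OOM exits with a non-zero status of its own, which parallel then reflects in its own exit status.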