The main issue was that our foreign keys were missing the indexes on
the child key columns that SQLite suggests in
https://www.sqlite.org/foreignkeys.html.
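For reference, the fix follows the pattern from those docs: add an
index on each child key column. A minimal sketch with hypothetical
table names (the real schema lives in the suggest crate):

```rust
use rusqlite::Connection;

// Hypothetical child table `keywords` referencing `suggestions`. Without
// an index on the child key column, every DELETE from `suggestions` must
// scan all of `keywords` to enforce the foreign key.
fn add_child_key_index(conn: &Connection) -> rusqlite::Result<()> {
    conn.execute_batch(
        "CREATE INDEX IF NOT EXISTS keywords_suggestion_id
         ON keywords(suggestion_id);",
    )
}
```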
My first attempt was to add the indexes and run the benchmarks on my
machine:
- `ingest-again-amp-wikipedia` now completed successfully and returned
a time of ~1000ms.
- `ingest-amp-wikipedia` (the test for ingesting the first time)
  increased from ~450ms to ~575ms. It seems that maintaining these
  indexes is non-trivial, but there's no other way to make re-ingestion
  work at a reasonable speed without a major rework of the system.
The second try was to remove the foreign keys altogether, which led to
an even bigger speedup:
- `ingest-again-amp-wikipedia` was now ~600ms
- `ingest-amp-wikipedia` was now ~400ms, even faster than the
baseline.
- Database size stayed the same as on `main`, whereas the previous
  change increased it from 19m to 26m.
We discussed the pros and cons of this in
https://github.com/mozilla/application-services/pull/6273 and decided to
go with dropping the foreign keys. This represents some loss of safety,
but not much, because:
- Ingestion is deterministic: if we set up the relationships correctly
  for one user, then we set them up correctly for all users.
- We regularly throw away all data in the tables and re-create them.
AMP/Wikipedia dominates the total ingestion time, which is why I
focused on this benchmark. The second-slowest benchmark is amp-mobile,
which does very similar work with half the data and finishes in about
half the time. After that, all other benchmarks were < 15ms.
Updated the `assert_schema_matches_new_database` method to also check
that the indexes match.
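A sketch of one way the index comparison can work, reading the index
definitions out of `sqlite_master` for both databases (illustrative;
the real method's queries may differ):

```rust
use rusqlite::Connection;

// Collect the SQL for all explicitly created indexes, in a stable order,
// so two databases can be compared with a plain `assert_eq!`.
fn index_sql(conn: &Connection) -> rusqlite::Result<Vec<String>> {
    let mut stmt = conn.prepare(
        "SELECT sql FROM sqlite_master
         WHERE type = 'index' AND sql IS NOT NULL
         ORDER BY name",
    )?;
    let rows = stmt.query_map([], |row| row.get(0))?;
    rows.collect()
}

// In the test: assert_eq!(index_sql(&db)?, index_sql(&new_db)?);
```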
Added new `ingest-again-*` benchmarks that test re-ingesting all
suggestions with a full database. In particular, this tests the
`drop_suggestions` method that's causing us issues.
Right now `ingest-again-amp-wikipedia` is extremely slow; I've always
had to interrupt it before it finishes.
Also updated `cargo suggest-debug-ingestion-sizes` to print out the
total database size. I want to use this to check how new indexes affect
it.
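For reference, SQLite can report its own size via pragmas; a sketch of
one way to compute the total (an assumption about the implementation;
the command could equally stat the file on disk):

```rust
use rusqlite::Connection;

fn database_size_bytes(conn: &Connection) -> rusqlite::Result<u64> {
    // Total size = number of pages * bytes per page.
    let page_count: u64 = conn.query_row("PRAGMA page_count", [], |r| r.get(0))?;
    let page_size: u64 = conn.query_row("PRAGMA page_size", [], |r| r.get(0))?;
    Ok(page_count * page_size)
}
```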
The goal here is to track down some foreign key failures that we're
seeing but have no context on where they're happening.
- Record a breadcrumb when we start to ingest different suggestion types.
- Record what we were doing when we saw an SQL error. This should
change the message from `FOREIGN KEY constraint failed` to something
like `FOREIGN KEY constraint failed (context: mdn insert)`, which I
think will help greatly.
I also wanted to record which field the foreign key error was
happening on, but AFAICT this is not possible with SQLite.
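The context wrapping itself can be as simple as the sketch below
(hypothetical helper and callee names; the component's real error
plumbing is richer than a `String`):

```rust
// Run an operation and tag any SQL error with what we were doing at the
// time, e.g. turning `FOREIGN KEY constraint failed` into
// `FOREIGN KEY constraint failed (context: mdn insert)`.
fn with_context<T>(
    context: &str,
    operation: impl FnOnce() -> rusqlite::Result<T>,
) -> Result<T, String> {
    operation().map_err(|e| format!("{e} (context: {context})"))
}

// Hypothetical usage:
// with_context("mdn insert", || insert_mdn_record(&dao, &record))?;
```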
I added the `extend` crate to help with some of this code. It's not
really needed, but I think it's worth the dependency. We've been using
it for `uniffi-bindgen-gecko-js`, so it's already vetted.
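For anyone unfamiliar, `extend` generates extension traits so you can
hang methods off types you don't own; its basic pattern looks like
this:

```rust
use extend::ext;

// `#[ext]` expands this impl block into an extension trait plus an
// implementation of it for `Vec<T>`.
#[ext]
impl<T: Ord> Vec<T> {
    fn sorted(mut self) -> Self {
        self.sort();
        self
    }
}

fn main() {
    assert_eq!(vec![2, 3, 1].sorted(), vec![1, 2, 3]);
}
```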
* Abstracts the API toward a more general-purpose "tab command" rather
  than being purely about closing individual tabs, with the intention
  of making it easier to add, say, "close all inactive" (because I know
  with 100% certainty this will be a very early request); see the
  sketch after this list.
* Updates the schema creation code to be less foot-gun-y.
* Renames a few things so that we consistently use `device_id` for the
  FxA ID and `client_id` for the Sync client ID.
* Fixes the case where the device ID != FxA ID when syncing.
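A rough sketch of the shape this enables (illustrative names, not the
component's actual types):

```rust
// One command enum instead of a close-tabs-only API; a new command is
// just a new variant.
pub enum TabCommand {
    CloseTab { urls: Vec<String> },
    // e.g. the anticipated follow-up:
    // CloseInactive,
}

pub struct PendingCommand {
    // The FxA device id of the target device (not the Sync `client_id`).
    pub device_id: String,
    pub command: TabCommand,
}
```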
Added an `interrupt_everything` method that interrupts both the read
and write connections. This is intended for iOS, which will use it to
interrupt the suggest component during shutdown. See
https://bugzilla.mozilla.org/show_bug.cgi?id=1897299 for a discussion on
next steps here.
Created a new `WriteScope` type that stores an interrupt scope that can
be used for multiple `write` calls. This allows the ingestion code to
catch all interruptions that happened after the ingestion started. With
the old system, if `interrupt_everything` was called in between
ingesting two suggestion types, we might miss it.
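In very reduced form, the idea looks like this (stand-in types; the
component actually builds on the `interrupt-support` crate):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Stand-in for the real interrupt handle: each `interrupt()` bumps a
// generation counter.
#[derive(Default)]
struct InterruptHandle {
    generation: AtomicU64,
}

impl InterruptHandle {
    fn interrupt(&self) {
        self.generation.fetch_add(1, Ordering::SeqCst);
    }
}

// Captures the generation once, at ingestion start. Any `interrupt()`
// after that point, including one landing between two `write` calls, is
// still observed, which is exactly what per-call scopes could miss.
struct WriteScope {
    handle: Arc<InterruptHandle>,
    start_generation: u64,
}

impl WriteScope {
    fn new(handle: Arc<InterruptHandle>) -> Self {
        let start_generation = handle.generation.load(Ordering::SeqCst);
        Self { handle, start_generation }
    }

    fn was_interrupted(&self) -> bool {
        self.handle.generation.load(Ordering::SeqCst) != self.start_generation
    }
}
```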
- Added some extra utility functions/types for testing.
- Added a module with test data like "Los Pollos Hermanos" that we can
repeatedly use in the tests.
The end result is that the tests are much more compact, which makes them
easier to understand and maintain.
Also added a new mock RS client for the tests. It works better with the
new system and also fixes a weird aspect of the old mock client. The
old client would always return all records for each request, even though
the store now requests one provider type at a time. When the test store
requested the weather records, it would also get all the mock AMP
data. The tests still passed, but I found this odd.
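Boiled down, the new mock only answers with matching records
(hypothetical structure; the real mock lives in the suggest test
code):

```rust
struct MockRecord {
    record_type: String,
    payload: String,
}

struct MockRemoteSettingsClient {
    records: Vec<MockRecord>,
}

impl MockRemoteSettingsClient {
    // Return only the records matching the request, mirroring how the
    // store now requests one provider type at a time, rather than
    // returning everything on every request like the old mock did.
    fn get_records(&self, record_type: &str) -> Vec<&MockRecord> {
        self.records
            .iter()
            .filter(|r| r.record_type == record_type)
            .collect()
    }
}
```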
I'm stopping here, before replacing all the test code, to get
feedback. If others like this approach I can continue replacing the
rest of the tests.
Made the API more high-level rather than just a copy of the remote
settings client API. It now takes a `SuggestRemoteSettingsRecordRequest`,
which is specific to the kinds of requests we make, and outputs a list
of records linked to their attachment data (sketched after the list
below). This way:
- It's simpler to mock up in the tests since the mocks can match the
actual requests more directly. I'm hoping to use this to simplify
the store tests which have gotten really out of hand.
- We abstract away some of the remote settings client details, which
  makes it easier to switch how we make requests in the future.
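Concretely, the shape is roughly this (illustrative field names; the
actual types are defined in the suggest crate):

```rust
// The request names the kind of records we want instead of exposing raw
// remote settings parameters.
pub struct SuggestRemoteSettingsRecordRequest {
    pub record_type: String, // e.g. "amp-wikipedia" or "weather"
}

// Each returned record is already linked to its attachment data, so
// callers never issue separate attachment fetches.
pub struct RecordWithAttachment {
    pub record_json: String,
    pub attachment: Option<Vec<u8>>,
}
```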
This fixes a regression from 9197c0bdce.
`NimbusAssembleToolsTask` wouldn't find `nimbus-fml` when run on an
x86-64 Linux host, because `getArchOs()` returns
`x86_64-unknown-linux` on that platform, while the directory in the
archive is named `x86_64-unknown-linux-musl`. Using `getArchOs()` as
the prefix fixes this for both x86-64 and ARM64 Linux hosts.
`NimbusAssembleToolsTask` now throws if no files match the unzip spec,
to catch issues like this in the future.
This is a substantial refactor of the plugin that should fix issues
like bug 1854254 and bug 1856461, and speed up FML generation. These
bugs had the same underlying cause: the plugin was trying to configure
its tasks without observing the Gradle build lifecycle, leading to
inconsistencies and accidental interdependencies.
Further, because the old tasks didn't declare their inputs or outputs,
Gradle couldn't "see" the full task graph, causing too much or too
little to be rebuilt. The easiest way to force Gradle to pick up the
changes used to be a full clobber.
This commit resolves all those issues by:
* Adopting Gradle's lazy configuration API [1], which lets
Gradle track inputs and outputs, and avoids ordering issues
caused by realizing the task graph at the wrong time [2].
As a bonus, we should be able to work much better with
`afterEvaluate` [3] in m-c's Gradle scripts.
* Adopting the new Android Gradle Plugin `SourceDirectories`
API that supports lazy configuration [4], which
replaces the deprecated `registerJavaGeneratingTask` API.
* Adding task classes and custom types to help with naming
and configuring inputs and outputs.
[1]: https://docs.gradle.org/current/userguide/lazy_configuration.html
[2]: https://docs.gradle.org/current/userguide/task_configuration_avoidance.html
[3]: https://mbonnin.hashnode.dev/my-life-after-afterevaluate
[4]: https://developer.android.com/reference/tools/gradle-api/8.4/com/android/build/api/variant/SourceDirectories
This can be run with `cargo relevancy`.
It currently supports classifying your history and printing the results.
The main way we get history is from Sync. You can also point it at a
places DB, but the DB must come from a mobile Firefox; we can't read
desktop databases.
This gives us nicer errors, since they now include where the error
happened in the JSON data. "Unexpected null value" is not very useful
on its own, but if that's paired with
`record_custom_details.category_to_domains.version` then you have a
chance of figuring out the bug.
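One way to get this kind of path information is the
`serde_path_to_error` crate (an assumption on my part; the component
may build the path differently):

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct CustomDetails {
    version: u64,
}

fn parse(data: &str) -> Result<CustomDetails, String> {
    let mut de = serde_json::Deserializer::from_str(data);
    serde_path_to_error::deserialize(&mut de).map_err(|err| {
        // `err.path()` renders like `category_to_domains.version`.
        let path = err.path().to_string();
        format!("{} at {path}", err.into_inner())
    })
}
```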
Gave the error a more specific name and updated the error handling to
report the error to Sentry if we see it. I don't think this affects
desktop at all, but once we have this on mobile I think we'll want to
surface these errors.
This wasn't really pulling its weight. The goal was to handle records
with newer schema versions than the client's schema. However, whenever
we update the schema we also reset the database tables and re-ingest
everything. There was some discussion on avoiding the DB reset, but no
plans to actually do it, so let's just remove the code.
c6fdae855c caused Send Tab commands to
be sent with `"ttl": "null"` to the FxA server, since Send Tab
doesn't specify an explicit TTL.
This fixes bug 1895711.
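For illustration, one way to avoid sending a bogus `ttl` with serde
(an assumption about the fix; field names are illustrative, not the
actual FxA client types) is to omit the field entirely when no TTL is
specified:

```rust
use serde::Serialize;

#[derive(Serialize)]
struct SendCommandBody {
    payload: String,
    // With no explicit TTL, skip the field instead of sending
    // `"ttl": "null"`.
    #[serde(skip_serializing_if = "Option::is_none")]
    ttl: Option<u64>,
}
```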