## Description
We do not use `jest-dev-server` directly. Other things depend on it so
it doesn't disappear from the lockfile, but no need for the explicit
direct dependency. I confirmed the jest tests still pass in both
packages.
## Description
Updates dependencies to get to ws@8.17.1 (or ws@7.5.10) to address
https://nvd.nist.gov/vuln/detail/CVE-2024-37890. Updating socket.io to
4.8.0 was necessary in some cases to get the required dependency ranges.
The socket.io update from 4.7.5 to 4.8.0 is semver-minor but contains [a breaking
change in the type of the `close()`
function](https://github.com/socketio/socket.io/pull/4971/files), so two
places had to be updated to account for that.
Refresh snapshot lifecycle tests are timing out when run against the FRS
and ODSP drivers. While debugging, I detected a throttling error when
requesting a snapshot update, because the timeout for requesting a new
snapshot was too short. After increasing that timeout from 100ms to 1s,
the throttling error disappeared completely in local tests against ODSP,
with no impact on test duration (this could be subject to further
calibration).
My hypothesis is that when running in the lab, the snapshot refresh fails
silently and, since the refresh never happens, the deferred promises that
await it never resolve, which ends in the test timeout.
## Description
Misc improvements found when working on
https://github.com/microsoft/FluidFramework/pull/22874 .
Many of these are places where schema types were not captured properly,
or incorrect typing wasn't detected due to the incorrect variance.
## Breaking Changes
The change to `InsertableObjectFromSchemaRecord` could impact something:
it's strictly a bug fix, but it's unclear what implications the previous
incorrect version could have had. I suspect no realistic code would be
broken by these more correct types, but it's possible something could be
impacted.
# @fluid-tools/build-infrastructure
This change introduces a new package containing types and helper
functions that will be used across multiple build-tools packages,
including `@fluidframework/build-tools` and `@fluid-tools/build-cli`.
The primary purpose of this package is to provide a common way to
enumerate Packages, Release Groups, and Workspaces
across a Fluid repo.
**The package is private and nothing depends on it yet.**
## Background
Today both fluid-build and build-cli (flub) need to understand the
layout of the repo - what packages belong to what release groups, what
folders are workspace roots vs. independent packages, etc. We have been
using the types and classes from fluid-build in the flub code, but the
relationship has always been strange because flub has a concept of
release groups while fluid-build uses a MonoRepo class which is
analogous to a Workspace.
Conceptually, Release Groups and Workspaces are distinct, but in code
there is not a strong representation of the "release group" concept. In
flub we use the MonoRepo class and a `type ReleaseGroup = string`
together as a stand-in for the ReleaseGroup concept which has meant that
release groups and workspaces (the MonoRepo class) are more or less the
same thing in practice.
That has always been a very weird situation that has gotten complicated
as the repo has grown, and now we find that we want the benefits of
interdependencies in between packages provided by workspaces but the
flexibility of releasing and versioning groups of packages within those
workspaces. These changes are a step towards that future.
## The goal
The eventual goal state - far beyond this change - is to have all the shared
types and data about the repo layout and other shared code in the
build-infrastructure package, and then have flub and fluid-build be in
individual packages. They shouldn't depend on each other - all the
shared logic - especially the types - should be in the infra package.
## How the API and interfaces were developed
The interfaces exposed by this package were extracted from the uses of
packages and release groups in build-tools and build-cli. The minimal
set of properties and functions needed by both of those packages was
used. In some cases, duplicative properties or functions were removed.
For example, the Package type had `dependencies` and
`combinedDependencies`; now it just has the latter and the caller can
filter out what it doesn't want or extend the shared interface.
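
As a rough illustration of the distinction described in the Background section, the sketch below models workspaces and release groups as separate types instead of a `MonoRepo` class plus a string alias. Apart from `combinedDependencies`, every interface, property, and function name here is hypothetical and is not the package's actual API.

```typescript
// Hypothetical shapes for illustration only; not the real build-infrastructure API.
interface Dependency {
  readonly name: string;
  readonly kind: "prod" | "dev" | "peer";
}

interface Package {
  readonly name: string;
  // One combined list of dependencies; callers filter out what they don't want.
  readonly combinedDependencies: readonly Dependency[];
}

// A release group is a set of packages that are versioned and released together.
interface ReleaseGroup {
  readonly name: string;
  readonly packages: readonly Package[];
}

// A workspace is an install/interdependency boundary that may hold several release groups.
interface Workspace {
  readonly name: string;
  readonly directory: string;
  readonly releaseGroups: readonly ReleaseGroup[];
}

// Enumerates every package in a workspace, across all of its release groups.
function packagesInWorkspace(workspace: Workspace): Package[] {
  const result: Package[] = [];
  for (const group of workspace.releaseGroups) {
    for (const pkg of group.packages) {
      result.push(pkg);
    }
  }
  return result;
}

// Example of a caller filtering the combined dependency list down to one kind.
function prodDependencyNames(pkg: Package): string[] {
  return pkg.combinedDependencies.filter((d) => d.kind === "prod").map((d) => d.name);
}
```

Keeping the two concepts as separate types lets one workspace contain several independently versioned release groups, which is the flexibility the Background section asks for.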
## Description
This gets rid of unnecessary mutations of the commit that becomes the
trunk base commit during trunk trimming in the EditManager. The mutation
of the `revision` property was done to avoid an additional check in the
`getSummaryData()` codepath, and the mutation of the `change` property
was done merely for consistency. This gets rid of both in favor of just
doing the simple check in `getSummaryData()`, since it is best to avoid
mutating objects that are otherwise considered to be immutable.
This PR also:
* adds a test for a scenario that is currently failing - namely,
reverting a commit (e.g. via an undo) when the SharedTree is detached.
This will be fixed in a follow-up PR.
* adds documentation for the special `"root"` revision tag.
## Description
Modify the codecoverageAnalysis name to include the job attempt number,
because re-upload attempts fail if the folder already exists.
Co-authored-by: Jatin Garg <jatingarg@Jatins-MacBook-Pro-2.local>
## Description
Updates the `get-func-name` transitive dependency to address
[CVE-2023-43646](https://nvd.nist.gov/vuln/detail/CVE-2023-43646). It
came through `chai@4.3.7`; updating to `chai@4.5.0` was enough to get a
fixed version of `get-func-name`. Also updated the transitive
dependency `loupe` from 2.3.6 to 2.3.7 in a few places (even though
`chai` doesn't strictly need it), where the newer version guarantees
that its dependency ranges point to fixed versions of its dependencies.
ODSP driver reuses sockets for clients connected on the same node
process. When signals are received on the shared socket, we must filter
based on `targetClientId` property to make sure *only* the specified
client receives the signal.
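
A minimal sketch of the filtering described above, assuming a simplified signal shape. Only the `targetClientId` property comes from this description; the interface and function names are hypothetical, not the driver's actual code.

```typescript
// Hypothetical, simplified signal shape; only `targetClientId` is from the description.
interface SignalMessage {
  content: unknown;
  // When set, the signal is addressed to exactly one client.
  targetClientId?: string;
}

// Returns the signals that a given client should process: broadcasts
// (no target) plus signals explicitly addressed to that client.
function filterSignalsForClient(
  signals: readonly SignalMessage[],
  clientId: string,
): SignalMessage[] {
  return signals.filter(
    (signal) => signal.targetClientId === undefined || signal.targetClientId === clientId,
  );
}
```

Each client sharing the socket would apply this check with its own client ID before dispatching signals.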
## Description
The `StartupCheck` implementation was consumed by Gitrest and Historian.
However, since the r11 packages consumed by Gitrest and Historian are
not updated as frequently as r11s consumed by repos such as FRS, this
dependency caused a breakage in Gitrest in FRS.
As a result, the solution is to make the startupCheck parameter in
resourceFactories more generic by typing it as `IReadinessCheck`. This
will prevent any future class dependencies from breaking.
## Breaking Changes
There should be none as this change makes the parameter more 'generic'.
That is, old implementations should still work fine.
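
To illustrate why typing the parameter as an interface avoids the breakage, here is a minimal sketch. The method name `isReady`, the `SimpleStartupCheck` class, and the `createResources` function are assumptions for illustration and do not reflect the real r11s signatures.

```typescript
// Hypothetical interface; the method name is an assumption, not the real r11s API.
interface IReadinessCheck {
  isReady(): Promise<boolean>;
}

// A concrete implementation, playing the role that `StartupCheck` plays.
class SimpleStartupCheck implements IReadinessCheck {
  private ready = false;

  public setReady(): void {
    this.ready = true;
  }

  public async isReady(): Promise<boolean> {
    return this.ready;
  }
}

interface ServiceResources {
  readonly readinessCheck: IReadinessCheck;
}

// Because the parameter is an interface, any structurally compatible object
// is accepted, regardless of which package version defined its class.
function createResources(readinessCheck: IReadinessCheck): ServiceResources {
  return { readinessCheck };
}
```

A consumer on an older package version can pass its own object with a matching shape, so the factory no longer depends on a specific class identity.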
---------
Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
## Description
- It's
[documented](https://learn.microsoft.com/en-us/azure/devops/pipelines/troubleshooting/troubleshooting?view=azure-devops#variables-having--single-quote-appended)
that ADO logging commands (`echo ##vso[<something>]`) don't interact
well with `set -x`, so undoing that part of
https://github.com/microsoft/FluidFramework/pull/22659 because we use
them in several places. I could have wrapped each use in `set +x;
<command>; set -x` but I think that would warrant a comment every time.
We have lived forever without having each command in a script printed
back to us, so I think it's fine to keep that behavior.
- Adds steps to install the correct node version in the job to upload
dev manifests. It wasn't using it, and since the build agent image made
node 18 available by default (by coincidence?) when installing the
several versions that we install in our image, everything worked. But in
the latest version of the image, re-generated automatically, node 16
seems to be the default (unsure why, separate topic I'm looking into),
and that uncovered this missing step.
The second bullet point is highlighted with a comment below. Everything
else in the PR is about the first one.
Disabling code coverage in internal pipelines to save time since we
don't use this data currently. (We can re-enable this if we end up
needing it later.) I made this change per Alex's suggestion here:
https://github.com/microsoft/FluidFramework/pull/22762/files#r1804966589.
These are the times for the Build stage in my test runs (9m 12s saved on
average):
1. With code coverage: 53m 41s, 58m 3s, 52m 8s, 50m 37s
2. Without code coverage: 46m 35s, 43m 42s, 44m 6s, 43m 16s
See runs for test/nocc branch (ignoring first one since CodeQL Finalize
step takes super long the first time in a branch).
## Description
While we should avoid inline scripts in `Bash@3` tasks in YML pipelines
as much as possible, wherever we have them, we should make them robust.
Adding `set -eux -o pipefail` to prevent common issues (like a
multi-command script failing in the middle but not causing the task to
fail because some other command after it succeeded) seems like a net
good.
GitHub Copilot was pretty helpful in explaining exactly what the line
does:
> - set -e: Exit immediately if a command exits with a non-zero status.
> - set -u: Treat unset variables as an error and exit immediately.
> - set -x: Print commands and their arguments as they are executed.
> - -o pipefail: The return value of a pipeline is the status of the
last command to exit with a non-zero status, or zero if no command
exited with a non-zero status.
>
> These options are often used in scripts to make them more robust and
easier to debug.
Motivated by [this pipeline
run](https://dev.azure.com/fluidframework/internal/_build/results?buildId=296239&view=logs&j=39ff69ce-e37a-5454-6dd7-938cc0c22dc6&t=b4f667a1-f59a-5f88-c7c4-0c8c8ea4419c)
(msft internal) where the reported error occurred way after the first
things started to fail but their tasks didn't.
- use `details` for all elements
- SignalLatency: shorten names now that data is packed into details
- SignalLost/SignalOutOfOrder: rename `trackingSequenceNumber` to
`expectedSequenceNumber`
- SignalOutOfOrder: avoid logging `contents.type` when there is a
chance it could be customer content.
- update tests:
- logger.assertMatch to use true inlineDetailsProp argument
- explicit `reconnectCount` for SignalLatency
- explicit `expectedSequenceNumber` and
`clientBroadcastSignalSequenceNumber` for SignalLost
This addresses all known `npm audit` issues in the docs project.
The docs project had some old unused dependencies and used the
**download** package which is apparently unmaintained. I replaced it
with a homegrown fetch wrapper called
[dill](https://dill.tylerbutler.com/introduction/).
## Description
Updates the `latest-version` dependency to the (drumroll...) latest
version 😄. No significant changes in [any of the
releases](https://github.com/sindresorhus/latest-version/releases)
(mostly updating Node version support), except the fact that upgrading
their dependency on `package-json` now filters deprecated versions by
default. @tylerbutler do you think that affects us? Do we need to
provide the option explicitly to maintain previous behavior?
The motivation is to get rid of our dependency on `got@9.6.0`, which (in
build-tools) only comes as a transitive dependency of
`latest-version@5.1`.
Skip the token cache if the token will expire within 5 minutes.
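
A minimal sketch of the expiry check, assuming a cache entry that records an absolute expiry time. The `CachedToken` shape and the `getUsableToken` helper are hypothetical names for illustration.

```typescript
// Hypothetical cache entry; field names are assumptions for illustration.
interface CachedToken {
  token: string;
  expiresAtMs: number; // absolute expiry time in epoch milliseconds
}

// Tokens with this much lifetime or less are treated as unusable.
const expiryBufferMs = 5 * 60 * 1000;

// Returns the cached token only if it remains valid for longer than the
// buffer; otherwise returns undefined so the caller fetches a fresh token.
function getUsableToken(
  cached: CachedToken | undefined,
  nowMs: number = Date.now(),
): string | undefined {
  if (cached === undefined) {
    return undefined;
  }
  return cached.expiresAtMs - nowMs > expiryBufferMs ? cached.token : undefined;
}
```

Passing the current time as a parameter keeps the check easy to unit test.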
---------
Co-authored-by: Yunho <yunho-macbookpro2024@DESKTOP-M86HBMH.redmond.corp.microsoft.com>
Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MacBook-Pro.local>
Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MBP.guest.corp.microsoft.com>
fluid-build was failing to load in repos without a config file. This
behavior was inadvertently broken when the `getFluidBuildConfig`
function was changed to throw an error when a config was not found. This
prevented the existing default config fallback logic from being used.
This change clarifies what the default config is, returns it
consistently from `getFluidBuildConfig` when no config is found, and
updates the FluidRepo loading logic to account for the change.
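
The fallback behavior can be sketched roughly as follows. The config shape, the `DEFAULT_CONFIG` value, and the `getConfigOrDefault` helper are hypothetical and only illustrate returning a default instead of throwing; they are not the actual `getFluidBuildConfig` implementation.

```typescript
// Hypothetical config shape and default value, for illustration only.
interface BuildConfig {
  version: number;
  repoPackages: Record<string, string>;
}

const DEFAULT_CONFIG: BuildConfig = {
  version: 1,
  repoPackages: { root: "" },
};

// `search` is a hypothetical stand-in for whatever locates and parses a
// config file; it returns undefined when no config is found.
function getConfigOrDefault(search: () => BuildConfig | undefined): BuildConfig {
  // Return the default consistently rather than throwing when nothing is found.
  return search() ?? DEFAULT_CONFIG;
}
```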
## Description
Cleans up imports in example app so it imports everything it can from
`fluid-framework` instead of individual packages. It still needs to
import from `@fluid-experimental/ai-collab` and `@fluidframework/tree`
for `@alpha` APIs.
## Description
This PR follows the r11s PR -
https://github.com/microsoft/FluidFramework/pull/22819 - to remove the
usage of the `StartupCheck` singleton by Historian and Gitrest. This
singleton along with r11 package mismatch caused bugs. Hence, now I pass
the implementation of the startup probe as a resource to the server.
## Breaking Changes
Changes the resourceFactory args of both Gitrest and Historian.
---------
Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
## Description
In the protocol handler, we have duplicate for loops that populate the
Audience from Quorum members (lines 67-69 already do this). This PR
deletes the duplicate code.
## Description
The singleton implementation of `StartupCheck` causes bugs when the r11
packages do not match the historian packages consumed. Hence, I decided
to switch to a non-singleton implementation. This introduces the
`StartupCheck` as an implementation of `IReadinessCheck`. This probe is
a resource provided to all HTTP services in r11.
## Breaking Changes
Changes Resource and Runner objects for Alfred, Riddler and Nexus to
include the `StartupCheck` object.
---------
Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
During rollback of delete from string, the delta callback is invoked
before the lengths are updated, which results in the callback code
seeing messed up string content. This fix swaps the order and adds
testing for the events.
We recently updated the pnpm config in the repo to do a lockfile-only
install by default. This updates the build tools to pass the
`--no-frozen-lockfile` flag when installing, so the lockfile will be
updated if needed.
## Description
- The URL for the ai-collab sample app is now `http://localhost:3000`
instead of `http://localhost:3000/tasks-list`, for simplicity. The
README had a typo so it pointed to a bad URL, so that has been fixed to
point to the new simplified URL too.
- Adds a check before using `window` to account for NextJS's server-side
rendering, to avoid an error in the console output when running, and
letting `next build` complete successfully.
- Removes NextJS-related npm scripts that would be important in a prod
application but we don't really use in the sample app.
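
A minimal sketch of the kind of guard described in the second bullet, assuming the code only needs the current URL. The helper names are hypothetical. In an app with DOM typings the usual idiom is `typeof window !== "undefined"`; the sketch goes through `globalThis` only so that it also compiles without DOM typings.

```typescript
// Shape of the one browser global this sketch reads.
interface WindowLike {
  location: { href: string };
}

// Looks up `window` via globalThis so the code runs during server-side
// rendering, where no `window` global exists.
function getWindow(): WindowLike | undefined {
  return (globalThis as unknown as { window?: WindowLike }).window;
}

function isBrowser(): boolean {
  return getWindow() !== undefined;
}

// Returns the page URL in the browser, or the fallback when rendering on the server.
function getCurrentUrl(fallback: string): string {
  const win = getWindow();
  return win === undefined ? fallback : win.location.href;
}
```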
This change makes it possible to change the default behavior of
IContainerRuntimeWithResolveHandle_Deprecated.resolveHandle by changing
the value of the default wait header, which controls whether the
resolveHandle call waits indefinitely for requests it can't handle or
returns a 404 error indicating that the datastore does not exist:
```
{
"mimeType": "text/plain",
"status": 404,
"value": "not found: <missing id>",
}
```
This behavior can be changed by setting this value in the config
provider:
`"Fluid.ContainerRuntime.WaitHeaderDefault": true`
Additionally, this change adds some telemetry that can help to
understand how wait is being used.
---------
Co-authored-by: jzaffiro <110866475+jzaffiro@users.noreply.github.com>
ProseMirror example crashes due to #17546 when NestBegin/End reference
types were removed without fixing up the handling of the open and close
paragraph markers. I'm adding back the deleted code and creating a
property to indicate begin and end (instead of NestBegin/End) in the
example.
This resulted in:
error.js:127 Uncaught (in promise) DataProcessingError: There is no mark
type referenceRangeLabels in this schema at Mark.fromJSON
(index.js:510:1) at Schema.markFromJSON (index.js:2438:1) at Array.map
(<anonymous>) at Node.fromJSON (index.js:1515:1) at Schema.nodeFromJSON
(index.js:2431:1) at Array.map (<anonymous>) at Fragment.fromJSON
(index.js:319:1) at Node.fromJSON (index.js:1522:1) at
Schema.nodeFromJSON (index.js:2431:1) at new FluidCollabManager
(fluidCollabManager.ts:144:32)
## Description
This PR adds circuit breaker functionality to the scriptorium lambda.
It handles exceptions where a service restart is not helpful and we
instead want to wait and retry. For example, when MongoDB is
unavailable or down and scriptorium is not able to write ops to the db,
restarting the service doesn't help; instead we wait and retry after
some time. The circuit breaker pattern helps in such cases by
maintaining an open/closed/halfOpen state.
So in scriptorium, all the calls to db are wrapped by the circuit
breaker, and in case of such errors, the circuit will open and pause the
lambda (i.e. pause the incoming messages). After some time, the circuit
will go to halfOpen state and call a healthCheck function - if it
succeeds, the circuit will close and resume the incoming messages, else
it will stay open and paused.
We can configure various options, like error threshold, reset timeout,
the errors for which we want to engage the circuit breaker, etc. Also, if
the circuit is not able to close or resume for some time (configurable),
we will fall back to restarting the service to avoid being in an endless
state of waiting.
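
The state transitions described above can be sketched as follows. This is a simplified, synchronous illustration with an injectable clock, not the implementation in this PR: the class, option, and method names are hypothetical, real db calls would be asynchronous, and the error filter and restart fallback are omitted.

```typescript
type CircuitState = "closed" | "open" | "halfOpen";

// Hypothetical options; names are assumptions for illustration.
interface CircuitBreakerOptions {
  errorThreshold: number; // consecutive failures before the circuit opens
  resetTimeoutMs: number; // how long to stay open before probing
  healthCheck: () => boolean; // probe run in the halfOpen state
  now?: () => number; // injectable clock, defaults to Date.now
}

class CircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly now: () => number;

  constructor(private readonly options: CircuitBreakerOptions) {
    this.now = options.now ?? (() => Date.now());
  }

  public getState(): CircuitState {
    return this.state;
  }

  // Runs `action` if the circuit allows it; returns undefined while paused.
  public execute<T>(action: () => T): T | undefined {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.options.resetTimeoutMs) {
        return undefined; // still paused
      }
      // Reset timeout elapsed: probe with the health check.
      this.state = "halfOpen";
      if (!this.options.healthCheck()) {
        this.open(); // stay open and restart the reset timer
        return undefined;
      }
      this.close();
    }
    try {
      const result = action();
      this.failures = 0;
      return result;
    } catch (error) {
      this.failures++;
      if (this.failures >= this.options.errorThreshold) {
        this.open();
      }
      throw error;
    }
  }

  private open(): void {
    this.state = "open";
    this.openedAt = this.now();
  }

  private close(): void {
    this.state = "closed";
    this.failures = 0;
  }
}
```

In this sketch, a caller that receives `undefined` would keep incoming messages paused and try again later, which corresponds to the lambda pause and resume behavior described above.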
This PR is for scriptorium, and once we validate and roll this out in
production, we will add the same pattern for document lambdas too.
Summary of changes made in this PR:
- Circuit Breaker Implementation: Adds a circuit breaker pattern to
scriptorium->db calls, with various configuration options for error
thresholds, reset timeouts, and error filters.
- Pause and Resume Methods: Adds pause and resume methods for lambdas,
context, documentContext, partition, partitionManager, kafkaRunner,
rdKafkaConsumer, and lambda to manage message flow during circuit
breaker states.
- Health Check for MongoDB: Adds a health check method to the MongoDB
class and exposes a healthCheck property from the MongoManager class.
## Testing
- [X] Added unit tests for circuit breaker.
- [X] Tested the scriptorium end to end functionality locally by forcing
the db to be unavailable in the local setup.
- [x] Tested in dev cluster by changing mongo db settings to replicate a
networking error.
We will roll this out slowly by testing in each ring.
---------
Co-authored-by: Shubhangi Agarwal <shuagarwal@microsoft.com>
Bumped client from 2.4.0 to 2.5.0.
Some new packages in the client had `workspace:^` deps that were
corrected to use tilde, so the lockfile had to be updated.
Command used:
```shell
flub release -g client -v
```