Commit graph

21362 commits

Author SHA1 Message Date
Alex Villarreal 7a8c8d0e0f
refactor(devtools): Remove unnecessary explicit dependency (#22882)
## Description

We do not use `jest-dev-server` directly. Other things depend on it so
it doesn't disappear from the lockfile, but no need for the explicit
direct dependency. I confirmed the jest tests still pass in both
packages.
2024-10-23 20:46:22 +00:00
Daniel Madrid 1dc40df5ca
Treat t9s as local for refresh snapshot tests (#22881)
Following up on https://github.com/microsoft/FluidFramework/pull/22871,
treat t9s as local to avoid possible timeouts or issues while running
against it, since we haven't seen any with the current 100ms timeout.
2024-10-23 12:08:14 -07:00
Alex Villarreal b10c0a9729
refactor: Update `less` dependency (#22862)
## Description

Update the `less` dependency to remove some references to optional
dependencies that are flagged as having CVEs.
2024-10-23 14:02:35 -05:00
Alex Villarreal 1b5ced0d94
refactor(server): Update ws to address CVE (#22845)
## Description

Updates dependencies to get to ws@8.17.1 (or ws@7.5.10) to address
https://nvd.nist.gov/vuln/detail/CVE-2024-37890. Updating socket.io to
4.8.0 was necessary in some cases to get the necessary dependency ranges.

socket.io 4.7.5-4.8.0 is a minor semver update but contains [a breaking
change in the type of the `close()`
function](https://github.com/socketio/socket.io/pull/4971/files), so two
places had to be updated to account for that.
2024-10-23 14:01:25 -05:00
Daniel Madrid 4b8d1c5408
Increase snapshotRefreshTimeoutMs for non local drivers on e2e tests (#22871)
Refresh snapshot lifecycle tests are timing out when run against the FRS
and ODSP drivers. While debugging, I detected a throttling error when
requesting a snapshot update, because the timeout for requesting a new
snapshot was too short. After increasing that timeout from 100ms to 1
second, the throttling error disappeared completely in local tests run
against ODSP, without affecting test duration (this could be subject to
further calibration).

My hypothesis is that when running in the lab, the snapshot refresh fails
silently, and because we never refresh, the deferred promises that await
it never resolve, which ends in the timeout.
2024-10-23 10:38:28 -07:00
Craig Macomber (Microsoft) 7405cf5ad2
tree: Cleanup related to contravariant input types (#22875)
## Description

Misc improvements found when working on
https://github.com/microsoft/FluidFramework/pull/22874 .

Many of these are places where schema types were not captured properly,
or incorrect typing wasn't detected due to the incorrect variance.

## Breaking Changes

The change to `InsertableObjectFromSchemaRecord` could impact something:
it's strictly a bug fix but it's unclear what implications the incorrect
version could have. I suspect no realistic code would be broken by these
more correct types, but it's possible something could be impacted.
2024-10-23 00:01:20 +00:00
Tyler Butler b8e887ead4
feat(build-tools): Add build-infrastructure package (#22853)
# @fluid-tools/build-infrastructure

This change introduces a new package containing types and helper
functions that will be used across multiple build-tools packages,
including `@fluidframework/build-tools` and `@fluid-tools/build-cli`.

The primary purpose of this package is to provide a common way to
enumerate Packages, Release Groups, and Workspaces
across a Fluid repo.

**The package is private and nothing depends on it yet.**

## Background

Today both fluid-build and build-cli (flub) need to understand the
layout of the repo - what packages belong to what release groups, what
folders are workspace roots vs. independent packages, etc. We have been
using the types and classes from fluid-build in the flub code, but the
relationship has always been strange because flub has a concept of
release groups while fluid-build uses a MonoRepo class which is
analogous to a Workspace.

Conceptually, Release Groups and Workspaces are distinct, but in code
there is not a strong representation of the "release group" concept. In
flub we use the MonoRepo class and a `type ReleaseGroup = string`
together as a stand-in for the ReleaseGroup concept which has meant that
release groups and workspaces (the MonoRepo class) are more or less the
same thing in practice.

That has always been a very weird situation that has gotten complicated
as the repo has grown, and now we find that we want the benefits of
interdependencies in between packages provided by workspaces but the
flexibility of releasing and versioning groups of packages within those
workspaces. These changes are a step towards that future.

## The goal

The eventual goal state - far beyond this change - is to have all the shared
types and data about the repo layout and other shared code in the
build-infrastructure package, and then have flub and fluid-build be in
individual packages. They shouldn't depend on each other - all the
shared logic - especially the types - should be in the infra package.

## How the API and interfaces were developed

The interfaces exposed by this package were extracted from the uses of
packages and release groups in build-tools and build-cli. The minimal
set of properties and functions needed by both of those packages was
used. In some cases, duplicative properties or functions were removed.
For example, the Package type had `dependencies` and
`combinedDependencies`; now it just has the latter and the caller can
filter out what it doesn't want or extend the shared interface.
2024-10-22 15:44:59 -07:00
tyler-cai-microsoft 6251e16cf1
Data Migration: Create some testing infrastructure for testing data migration solutions (#22830)
[AB#20657](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/20657)

Build some basic testing infrastructure for data migration to save us
some time from having to write the tests over and over again. Obviously,
these tests can be iterated over and are not complete, but they should
be a good baseline starting point.
2024-10-22 13:11:52 -07:00
Noah Encke a562685d11
When trimming the EditManager trunk, avoid mutating revision and change of the new trunk base commit (#22869)
## Description

This gets rid of unnecessary mutations of the commit that becomes the
trunk base commit during trunk trimming in the EditManager. The mutation
of the `revision` property was done to avoid an additional check in the
`getSummaryData()` codepath, and the mutation of the `change` property
was done merely for consistency. This gets rid of both in favor of just
doing the simple check in `getSummaryData()`, since it is best to avoid
mutating objects that are otherwise considered to be immutable.

This PR also:
* adds a test for a scenario that is currently failing - namely,
reverting a commit (e.g. via an undo) when the SharedTree is detached.
This will be fixed in a follow-up PR.
* adds documentation for the special `"root"` revision tag.
2024-10-22 13:07:59 -07:00
Craig Macomber (Microsoft) 05197d6d3f
tree: Improve non-schema aware typing (#22763)
## Description

See changeset

## Breaking Changes

See changeset
2024-10-22 12:31:01 -07:00
Jatin Garg 5806a27f6e
Modify codecoverageAnalysis name to include job attempt number (#22870)
## Description

Modify codecoverageAnalysis name to include job attempt number. This is
because re-upload attempts fail if the folder already exists.

Co-authored-by: Jatin Garg <jatingarg@Jatins-MacBook-Pro-2.local>
2024-10-22 19:28:30 +00:00
Alex Villarreal 5c17a9c053
refactor: Update `get-func-name` dependency to address CVE (#22861)
## Description

Updates the `get-func-name` transitive dependency to address
[CVE-2023-43646](https://nvd.nist.gov/vuln/detail/CVE-2023-43646). It
came through `chai@4.3.7`; updating to `chai@4.5.0` was enough to get a
fixed version of `get-func-name`. Also did a few updates of transitive
dependency `loupe` from 2.3.6 to 2.3.7 (even though `chai` doesn't
strictly need it), where they guarantee that their dependency ranges
now point to fixed versions of their dependencies.
2024-10-22 11:52:42 -05:00
WillieHabi c61fb608f2
odspDocumentDeltaConnection targetClientId check (#22749)
ODSP driver reuses sockets for clients connected on the same node
process. When signals are received on the shared socket, we must filter
based on `targetClientId` property to make sure *only* the specified
client receives the signal.
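
A minimal sketch of the filtering described above. The `Signal` shape and the `shouldDeliver` and `filterSignals` helpers are hypothetical names for illustration; only the `targetClientId` property comes from this change, and the actual driver code may differ.

```typescript
// Hypothetical signal shape; only `targetClientId` is named in the commit message.
interface Signal {
	clientId: string | null;
	content: unknown;
	targetClientId?: string;
}

// Decide whether a signal arriving on a shared socket should reach a given client.
// Broadcast signals (no targetClientId) go to every client on the socket;
// targeted signals go only to the client they name.
function shouldDeliver(signal: Signal, localClientId: string): boolean {
	return signal.targetClientId === undefined || signal.targetClientId === localClientId;
}

// Each client sharing the socket applies the filter before surfacing signals.
function filterSignals(signals: Signal[], localClientId: string): Signal[] {
	return signals.filter((signal) => shouldDeliver(signal, localClientId));
}
```

With this shape, a signal targeted at client "b" is dropped for every other client on the same socket, while untargeted signals still reach all of them.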
2024-10-22 07:49:37 -07:00
dhr-verma b6aa07a614
Replaced StartupCheck with IReadinessCheck in Gitrest and Historian (#22868)
## Description

The `StartupCheck` implementation was consumed by Gitrest and Historian.
However, since the r11 packages consumed by Gitrest and Historian are
not updated as frequently as r11s consumed by repos such as FRS, this
dependency caused a breakage in Gitrest in FRS.

As a result, the solution is to make the startupCheck parameter in
resourceFactories more generic - `IReadinessCheck`. This will
prevent any future class dependencies from breaking.
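
A rough sketch of why typing the parameter as an interface helps. The names `IReadinessCheck` and `StartupCheck` come from this change, but the members shown here (`isReady`, `setReady`, `IReadinessStatus`) and the `createResources` function are assumptions for illustration, not the actual server API.

```typescript
// Assumed shapes for illustration only.
interface IReadinessStatus {
	ready: boolean;
}

interface IReadinessCheck {
	isReady(): Promise<IReadinessStatus>;
}

// One possible implementation of the interface.
class StartupCheck implements IReadinessCheck {
	private started = false;

	public setReady(): void {
		this.started = true;
	}

	public async isReady(): Promise<IReadinessStatus> {
		return { ready: this.started };
	}
}

// Hypothetical factory: it accepts the interface rather than the concrete class, so any
// object with the right shape works, even one built from a different package version.
async function createResources(readinessCheck: IReadinessCheck): Promise<string> {
	const status = await readinessCheck.isReady();
	return status.ready ? "ready" : "starting";
}
```

Because TypeScript interfaces are structural, a caller can pass `new StartupCheck()` or any plain object that provides `isReady()`.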

## Breaking Changes

There should be none as this change makes the parameter more 'generic'.
That is, old implementations should still work fine.

---------

Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
2024-10-21 23:01:11 -07:00
dhr-verma 71ad22bea0
Replaced StartupCheck with IReadinessCheck (#22867)
## Description

The `StartupCheck` implementation was consumed by Gitrest and Historian.
However, since the r11 packages consumed by Gitrest and Historian are
not updated as frequently as r11s consumed by repos such as FRS, this
dependency caused a breakage in Gitrest in FRS.

As a result, the solution is to make the startupCheck parameter in
resourceFactories more generic - `IReadinessCheck`. This will
prevent any future class dependencies from breaking.

## Breaking Changes

There should be none as this change makes the parameter more 'generic'.
That is, old implementations should still work fine.

---------

Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
2024-10-21 19:02:01 -07:00
Alex Villarreal 0ef059c270
refactor(server): Update dependencies to remove `ip` (CVE) (#22860)
## Description

Updates dependencies so we get rid of the transitive dependency on `ip`
which is flagged for a CVE.
2024-10-21 17:18:51 -05:00
Alex Villarreal bd1807ffe0
refactor(ci): Fix for ADO logging commands and fix for robustness (#22858)
## Description

- It's
[documented](https://learn.microsoft.com/en-us/azure/devops/pipelines/troubleshooting/troubleshooting?view=azure-devops#variables-having--single-quote-appended)
that ADO logging commands (`echo ##vso[<something>]`) don't interact
well with `set -x`, so undoing that part of
https://github.com/microsoft/FluidFramework/pull/22659 because we use
them in several places. I could have wrapped each use in `set +x;
<command>; set -x` but I think that would warrant a comment every time.
We have lived forever without having each command in a script printed
back to us, so I think it's fine to keep that behavior.
- Adds steps to install the correct node version in the job to upload
dev manifests. It wasn't using it, and since the build agent image made
node 18 available by default (by coincidence?) when installing the
several versions that we install in our image, everything worked. But in
the latest version of the image, re-generated automatically, node 16
seems to be the default (unsure why, separate topic I'm looking into),
and that uncovered this missing step.

The second bullet point is highlighted with a comment below. Everything
else in the PR is about the first one.
2024-10-21 15:35:22 -05:00
Scarlett Lee e0679665ca
Disable code coverage for internal pipeline runs (#22855)
Disabling code coverage in internal pipelines to save time since we
don't use this data currently. (We can re-enable this if we end up
needing it later.) I made this change per Alex's suggestion here:
https://github.com/microsoft/FluidFramework/pull/22762/files#r1804966589.

These are the times for the Build stage in my test runs (9m 12s saved on
average):
1. With code coverage: 53m 41s, 58m 3s, 52m 8s, 50m 37s
2. Without code coverage: 46m 35s, 43m 42s, 44m 6s, 43m 16s
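
For reference, the average saving can be recomputed from the eight timings listed above. This is only a small worked check; the helper names are made up for the example.

```typescript
// Convert a "53m 41s" style duration to seconds.
function toSeconds(duration: string): number {
	const match = /^(\d+)m (\d+)s$/.exec(duration);
	if (match === null) {
		throw new Error(`Unrecognized duration: ${duration}`);
	}
	return Number(match[1]) * 60 + Number(match[2]);
}

// Arithmetic mean of a list of numbers.
function average(values: number[]): number {
	return values.reduce((sum, value) => sum + value, 0) / values.length;
}

const withCoverage = ["53m 41s", "58m 3s", "52m 8s", "50m 37s"].map(toSeconds);
const withoutCoverage = ["46m 35s", "43m 42s", "44m 6s", "43m 16s"].map(toSeconds);

// Difference between the two averages, in seconds (552.5 for these inputs).
const savedSeconds = average(withCoverage) - average(withoutCoverage);
const minutes = Math.floor(savedSeconds / 60);
const seconds = Math.floor(savedSeconds % 60);
console.log(`${minutes}m ${seconds}s saved on average`); // prints "9m 12s saved on average"
```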

See runs for test/nocc branch (ignoring first one since CodeQL Finalize
step takes super long the first time in a branch).
2024-10-21 13:32:42 -07:00
Craig Macomber (Microsoft) f688418771
Remove tsc-multi leftovers (#22789)
## Description

Remove remaining references to tsc-multi.
2024-10-21 12:49:54 -07:00
Alex Villarreal 8915f8fb89
refactor(ci): Add `set -eux -o pipefail` to all bash@3 tasks (#22659)
## Description

While we should avoid inline scripts in `Bash@3` tasks in YML pipelines
as much as possible, wherever we have them, we should make them robust.
Adding `set -eux -o pipefail` to prevent common issues (like a
multi-command script failing in the middle but not causing the task to
fail because some other command after it succeeded) seems like a net
good.

GitHub Copilot was pretty helpful for understanding exactly what the line
does:

> - set -e: Exit immediately if a command exits with a non-zero status.
> - set -u: Treat unset variables as an error and exit immediately.
> - set -x: Print commands and their arguments as they are executed.
> - -o pipefail: The return value of a pipeline is the status of the
last command to exit with a non-zero status, or zero if no command
exited with a non-zero status.
>
> These options are often used in scripts to make them more robust and
easier to debug.

Motivated by [this pipeline
run](https://dev.azure.com/fluidframework/internal/_build/results?buildId=296239&view=logs&j=39ff69ce-e37a-5454-6dd7-938cc0c22dc6&t=b4f667a1-f59a-5f88-c7c4-0c8c8ea4419c)
(msft internal) where the reported error occurred way after the first
things started to fail but their tasks didn't.
2024-10-21 11:39:18 -05:00
Jason Hartman e6566f6358
fix(client): reformat signal telemetry details (#22804)
- use `details` for all elements
- SignalLatency: shorten names now that data is packed into details
- SignalLost/SignalOutOfOrder: rename `trackingSequenceNumber` to
`expectedSequenceNumber`
- SignalOutOfOrder: avoid logging `contents.type` when there is a
chance it could be customer content.
- update tests:
  - logger.assertMatch to use true inlineDetailsProp argument
  - explicit `reconnectCount` for SignalLatency
- explicit `expectedSequenceNumber` and
`clientBroadcastSignalSequenceNumber` for SignalLost
2024-10-18 23:58:18 +00:00
Jason Hartman c0aa7fb764
test(client): presence tracker improvements (#22737)
+ add tinylicious reference and start script
+ add webpack source map
2024-10-18 16:30:37 -07:00
Tyler Butler d4f8a58e2e
build(docs): Remove and/or replace outdated dependencies (#22269)
This addresses all known `npm audit` issues in the docs project.

The docs project had some old unused dependencies and used the
**download** package which is apparently unmaintained. I replaced it
with a homegrown fetch wrapper called
[dill](https://dill.tylerbutler.com/introduction/).
2024-10-18 20:08:32 +00:00
Alex Villarreal cf339a9872
refactor: Update `latest-version` dependency (#22848)
## Description

Updates the `latest-version` dependency to the (drumroll...) latest
version 😄. No significant changes in [any of the
releases](https://github.com/sindresorhus/latest-version/releases)
(mostly updating Node version support), except the fact that upgrading
their dependency on `package-json` now filters deprecated versions by
default. @tylerbutler do you think that affects us? Do we need to
provide the option explicitly to maintain previous behavior?

The motivation is to get rid of our dependency on `got@9.6.0`, which (in
build-tools) only comes as a transitive dependency of
`latest-version@5.1`.
2024-10-18 13:05:56 -05:00
Jatin Garg 3ed9d075bb
Add code coverage reporting to client build yaml file (#22762)
## Description


[AB#15187](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/15187)

Add code coverage reporting to client build yaml file.

---------

Co-authored-by: Jatin Garg <jatingarg@Jatins-MacBook-Pro-2.local>
Co-authored-by: Alex Villarreal <716334+alexvy86@users.noreply.github.com>
Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
2024-10-17 14:26:57 -07:00
jzaffiro a6e1efe9da
Disable parallel forks flaky test (#22837)
This test is known to be flaky, and is causing noise for OCE. Disabling for
now, until a fix can be published.


[AB#14900](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/14900),
[AB#20297](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/20297)
2024-10-17 13:33:07 -07:00
Tyler Butler 6095d7f4f2
build: Remove references to deleted readme-command package (#22831)
The readme-command package was deleted some time ago but there were
still some references to it in configs and comments.
2024-10-17 12:36:37 -07:00
yunho-microsoft 4df3b36633
Fix token cache error: invalid expire time (#22761)
Skip the token cache if the token will expire within 5 minutes.
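
A minimal sketch of that check, assuming a cached entry that records an absolute expiry time. The `CachedToken` shape and `getUsableToken` function are hypothetical and not taken from the actual change.

```typescript
// Hypothetical cached-token shape; the field names are assumptions for illustration.
interface CachedToken {
	token: string;
	expiresAtMs: number; // absolute expiry time, in milliseconds since the epoch
}

// 5 minutes, per the commit message.
const expiryBufferMs = 5 * 60 * 1000;

// Returns the cached token only if it stays valid for longer than the buffer;
// otherwise returns undefined so the caller fetches a fresh token.
function getUsableToken(
	cached: CachedToken | undefined,
	nowMs: number = Date.now(),
): string | undefined {
	if (cached === undefined) {
		return undefined;
	}
	return cached.expiresAtMs - nowMs > expiryBufferMs ? cached.token : undefined;
}
```

Passing the current time as a parameter keeps the check easy to test without waiting for a real token to age.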

---------

Co-authored-by: Yunho <yunho-macbookpro2024@DESKTOP-M86HBMH.redmond.corp.microsoft.com>
Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MacBook-Pro.local>
Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MBP.guest.corp.microsoft.com>
2024-10-17 18:45:06 +00:00
Tyler Butler ee61ed119a
improvement(fluid-build): Optimize timer calls (#22834)
fluid-build's main function was unnecessarily calling
timer.getTotalTime() repeatedly.
2024-10-17 10:56:13 -07:00
Tyler Butler 88843657f4
fix(fluid-build): Load default config when no config is found (#22825)
fluid-build was failing to load in repos without a config file. This
behavior was inadvertently broken when the `getFluidBuildConfig`
function was changed to throw an error when a config was not found. This
prevented the existing default config fallback logic from being used.

This change clarifies what the default config is, returns it
consistently from `getFluidBuildConfig` when no config is found, and
updates the FluidRepo loading logic to account for the change.
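
The fallback behavior can be sketched roughly as follows. The `BuildConfig` shape, `defaultConfig` value, and `getConfig` function are hypothetical stand-ins, not the real `getFluidBuildConfig` signature.

```typescript
// Hypothetical config shape; the real fluid-build config has different fields.
interface BuildConfig {
	version: number;
	repoPackages: Record<string, string>;
}

// A single, clearly defined default that is returned whenever no config is found.
const defaultConfig: BuildConfig = { version: 1, repoPackages: {} };

// `search` stands in for whatever locates and parses a config file; it returns
// undefined when no config file exists.
function getConfig(search: () => BuildConfig | undefined): BuildConfig {
	// Fall back to the default instead of throwing, so repos without a config still load.
	return search() ?? defaultConfig;
}
```

Returning the default from one place means callers never need their own "config missing" branch.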
2024-10-17 10:01:47 -07:00
Alex Villarreal 693dcf25b5
refactor(examples/ai-collab): Clean up imports (#22824)
## Description

Cleans up imports in example app so it imports everything it can from
`fluid-framework` instead of individual packages. It still needs to
import from `@fluid-experimental/ai-collab` and `@fluidframework/tree`
for `@alpha` APIs.
2024-10-16 15:35:17 -05:00
dhr-verma 878582128c
Removed singleton Startup check usage by Historian and Gitrest (#22826)
## Description

This PR follows the r11s PR -
https://github.com/microsoft/FluidFramework/pull/22819 - to remove the
usage of the `StartupCheck` singleton by Historian and Gitrest. This
singleton along with r11 package mismatch caused bugs. Hence, now I pass
the implementation of the startup probe as a resource to the server.

## Breaking Changes

Changes the resourceFactory args of both Gitrest and Historian.

---------

Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
2024-10-16 20:34:41 +00:00
Tyler Butler 65ec324a2a
build(client): Update type test baselines to ~2.4.0 (#22817)
Commands used:

```shell
pnpm exec flub typetests -g client --reset --normalize --exact '~2.4.0'
pnpm i --no-frozen-lockfile
pnpm run typetests:gen
```
2024-10-16 20:26:57 +00:00
jzaffiro 530b0acdd5
Allow quorum member removal when detached (#22812)
When a quorum member is removed and we do the associated cleanup, the
assumption was that the runtime was attached. This is not necessarily
true, and causes this assert to fail in valid cases.


[AB#19980](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/19980)
2024-10-16 18:12:48 +00:00
WillieHabi 6e55c5a816
Remove duplicate code for Audience population from snapshot (#22801)
## Description
In the protocol handler, we have duplicate `for` loops that populate Audience
from Quorum members (lines 67-69 already do this). This PR deletes the
duplicate code.
2024-10-16 07:55:06 -07:00
Tyler Butler f2a97b2a9d
build(client,build-tools): Upgrade biome to 1.9.3 (#22721)
2024-10-15 19:41:20 -07:00
Sonali Deshpande dabc03d236
Add `.legacy-compat` release report (#22687)
This PR updates release reports to incorporate a `.legacy-compat.json`
file, which lets you specify and manage the declared compatible version
range. The legacy compat range can be extended to any release group or
independent package.

### Example

With a corrected configuration as shown below:

```json
releaseReport: {
	legacyCompatInterval: {
		"client": 25,
		"@fluidframework/eslint-config-fluid": 10,
	},
}
```

The legacy compatibility release report will appear as follows:

```json
...
	"@fluid-internal/platform-dependent": ">=2.3.1 <2.25.0",
	"@fluid-internal/replay-tool": ">=2.3.1 <2.25.0",
	"@fluid-internal/tablebench": ">=2.3.1 <2.25.0",
	"@fluid-internal/test-driver-definitions": ">=2.3.1 <2.25.0",
	"@fluid-internal/test-service-load": ">=2.3.1 <2.25.0",
	"@fluid-internal/test-snapshots": ">=2.3.1 <2.25.0",
	"@fluid-private/changelog-generator-wrapper": ">=2.3.1 <2.25.0",
	"@fluid-private/stochastic-test-utils": ">=2.3.1 <2.25.0",
	"@fluid-private/test-dds-utils": ">=2.3.1 <2.25.0",
	"@fluid-private/test-drivers": ">=2.3.1 <2.25.0",
	"@fluid-private/test-end-to-end-tests": ">=2.3.1 <2.25.0",
	"@fluid-private/test-loader-utils": ">=2.3.1 <2.25.0",
	"@fluid-private/test-pairwise-generator": ">=2.3.1 <2.25.0",
	"@fluid-private/test-version-utils": ">=2.3.1 <2.25.0",
	"@fluid-tools/benchmark": "^0.50.0",
	"@fluid-tools/build-cli": "^0.47.0",
	"@fluid-tools/fetch-tool": ">=2.3.1 <2.25.0",
	"@fluid-tools/markdown-magic": ">=2.3.1 <2.25.0",
	"@fluid-tools/version-tools": "^0.47.0",
	"@fluidframework/agent-scheduler": ">=2.3.1 <2.25.0",
	"@fluidframework/app-insights-logger": ">=2.3.1 <2.25.0",
	"@fluidframework/aqueduct": ">=2.3.1 <2.25.0",
	"@fluidframework/azure-client": ">=2.3.1 <2.25.0",
	"@fluidframework/azure-end-to-end-tests": ">=2.3.1 <2.25.0",
	"@fluidframework/azure-local-service": ">=2.3.1 <2.25.0",
	"@fluidframework/azure-service-utils": ">=2.3.1 <2.25.0",
	"@fluidframework/build-common": "^2.0.3",
	"@fluidframework/build-tools": "^0.47.0",
	"@fluidframework/bundle-size-tools": "^0.47.0",
	"@fluidframework/cell": ">=2.3.1 <2.25.0",
	"@fluidframework/container-definitions": ">=2.3.1 <2.25.0",
	...
```

For a complete view of the legacy compatibility settings with an
interval of 25, refer to the commit:
29e58739a3
2024-10-16 01:48:45 +00:00
dhr-verma c5d7bde895
Removed the singleton implementation of the startup probe (#22819)
## Description

The singleton implementation of `StartupCheck` causes bugs when the r11
packages do not match the historian packages consumed. Hence, I decided
to switch to a non-singleton implementation. This introduces the
`StartupCheck` as an implementation of `IReadinessCheck`. This probe is
a resource provided to all HTTP services in r11.

## Breaking Changes

Changes Resource and Runner objects for Alfred, Riddler and Nexus to
include the `StartupCheck` object.

---------

Co-authored-by: Tyler Butler <tyler@tylerbutler.com>
2024-10-15 17:16:03 -07:00
Scarlett Lee 027f5985df
Update lengths before delta event in rollback (#22818)
During rollback of delete from string, the delta callback is invoked
before the lengths are updated, which results in the callback code
seeing messed up string content. This fix swaps the order and adds
testing for the events.
2024-10-15 16:05:22 -07:00
Tyler Butler 0334d003b0
fix(build-tools): Run install with `--no-frozen-lockfile` (#22814)
We recently updated the pnpm config in the repo to do a lockfile-only
install by default. This updates the build tools to pass the
`--no-frozen-lockfile` flag when installing, so the lockfile will be
updated if needed.
2024-10-15 15:45:03 -07:00
Jatin Garg a390d2f246
build(client): Bump build-tools dependencies to 0.49.0 (#22808)
build(client): Bump build-tools dependencies to 0.49.0

---------

Co-authored-by: Jatin Garg <jatingarg@Jatins-MacBook-Pro-2.local>
2024-10-15 19:28:02 +00:00
Tyler Butler d4d3ac1ec4
build(build-tools): Add vscode workspace for build-tools (#22810)
2024-10-15 12:06:53 -07:00
Alex Villarreal 065969e7fd
refactor(examples/ai-collab): Simplify and fix a few things (#22807)
## Description

- The URL for the ai-collab sample app is now `http://localhost:3000`
instead of `http://localhost:3000/tasks-list`, for simplicity. The
README had a typo so it pointed to a bad URL, so that has been fixed to
point to the new simplified URL too.
- Adds a check before using `window` to account for NextJS's server-side
rendering, which avoids an error in the console output when running and
lets `next build` complete successfully.
- Removes NextJS-related npm scripts that would be important in a prod
application but we don't really use in the sample app.
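
A sketch of the kind of guard the second bullet describes. In browser-targeted code the usual form is `typeof window !== "undefined"`; the `globalThis` lookup below is an equivalent written so the snippet also compiles without DOM typings. The `getPageUrl` helper is hypothetical and not part of the sample app.

```typescript
// Look up `window` through globalThis so this compiles with or without DOM typings.
// With DOM typings available, `typeof window !== "undefined"` is the usual check.
const globalScope = globalThis as unknown as {
	window?: { location: { href: string } };
};

// Returns the current page URL in the browser, or undefined during server-side
// rendering, where the `window` global does not exist.
function getPageUrl(): string | undefined {
	if (globalScope.window === undefined) {
		return undefined;
	}
	return globalScope.window.location.href;
}
```

Code that runs during server-side rendering can then treat `undefined` as "not in a browser yet" instead of throwing a `ReferenceError`.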
2024-10-15 18:48:51 +00:00
Tony Murphy 2e1a70eb9e
Default Runtime Request Header Wait Config and Telemetry (#22769)
This change makes it possible to change the default behavior of
`IContainerRuntimeWithResolveHandle_Deprecated.resolveHandle` by changing
the value of the default wait header, which controls whether the
resolveHandle call waits indefinitely for requests it can't handle or
returns a 404 error indicating that the datastore does not exist:
```
{
  "mimeType": "text/plain",
  "status": 404,
  "value": "not found: <missing id>",
}
```
This behavior can be changed by setting this value in the config
provider:
`"Fluid.ContainerRuntime.WaitHeaderDefault": true`

Additionally, this change adds some telemetry that can help to
understand how wait is being used.

---------

Co-authored-by: jzaffiro <110866475+jzaffiro@users.noreply.github.com>
2024-10-15 17:15:19 +00:00
Scarlett Lee aa4b2a9f21
Fix handling of ProseMirror open and close markers (#22736)
ProseMirror example crashes due to #17546 when NestBegin/End reference
types were removed without fixing up the handling of the open and close
paragraph markers. I'm adding back the deleted code and creating a
property to indicate begin and end (instead of NestBegin/End) in the
example.

This resulted in:
error.js:127 Uncaught (in promise) DataProcessingError: There is no mark
type referenceRangeLabels in this schema at Mark.fromJSON
(index.js:510:1) at Schema.markFromJSON (index.js:2438:1) at Array.map
(<anonymous>) at Node.fromJSON (index.js:1515:1) at Schema.nodeFromJSON
(index.js:2431:1) at Array.map (<anonymous>) at Fragment.fromJSON
(index.js:319:1) at Node.fromJSON (index.js:1522:1) at
Schema.nodeFromJSON (index.js:2431:1) at new FluidCollabManager
(fluidCollabManager.ts:144:32)
2024-10-15 10:01:14 -07:00
Shubhangi 256bf0899c
Circuit breaker implementation in scriptorium (#22730)
## Description

This PR adds circuit breaker functionality to the scriptorium lambda. It
handles exceptions where restarting the service does not help and we
instead want to wait and retry. For example, when MongoDB is
unavailable and scriptorium cannot write ops to the db, restarting the
service doesn't help; instead we wait and retry after some time. The
Circuit Breaker pattern helps in such cases by maintaining an
open/closed/halfOpen state.

So in scriptorium, all the calls to db are wrapped by the circuit
breaker, and in case of such errors, the circuit will open and pause the
lambda (i.e. pause the incoming messages). After some time, the circuit
will go to halfOpen state and call a healthCheck function - if it
succeeds, the circuit will close and resume the incoming messages, else
it will stay open and paused.

We can configure various options, like error threshold, reset timeout,
the errors for which we want to engage the circuit breaker, etc. Also if
the circuit is not able to close or resume for some time (configurable),
we will fall back to restarting the service to avoid being in an endless
state of waiting.
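
To illustrate the state machine described above, here is a simplified, generic circuit breaker sketch. It is not the scriptorium implementation: the class, option names, and `execute` method are assumptions, and it rejects calls while open rather than pausing a Kafka consumer. The restart fallback is omitted.

```typescript
type CircuitState = "closed" | "open" | "halfOpen";

// Hypothetical options; names are illustrative only.
interface CircuitBreakerOptions {
	errorThreshold: number; // consecutive failures before the circuit opens
	resetTimeoutMs: number; // how long to stay open before probing again
	healthCheck: () => Promise<void>; // rejects while the dependency is still down
	now?: () => number; // injectable clock, for testing
}

class CircuitBreaker {
	private state: CircuitState = "closed";
	private failures = 0;
	private openedAt = 0;
	private readonly options: CircuitBreakerOptions;
	private readonly now: () => number;

	constructor(options: CircuitBreakerOptions) {
		this.options = options;
		this.now = options.now ?? (() => Date.now());
	}

	public getState(): CircuitState {
		return this.state;
	}

	// Runs `action` through the breaker. While open, calls are rejected without touching
	// the dependency; once the reset timeout elapses, the health check decides whether
	// the circuit closes again or stays open.
	public async execute<T>(action: () => Promise<T>): Promise<T> {
		if (this.state === "open") {
			if (this.now() - this.openedAt < this.options.resetTimeoutMs) {
				throw new Error("Circuit is open");
			}
			this.state = "halfOpen";
			try {
				await this.options.healthCheck();
			} catch {
				this.open();
				throw new Error("Circuit is open");
			}
			this.state = "closed";
			this.failures = 0;
		}
		try {
			const result = await action();
			this.failures = 0;
			return result;
		} catch (error) {
			this.failures++;
			if (this.failures >= this.options.errorThreshold) {
				this.open();
			}
			throw error;
		}
	}

	private open(): void {
		this.state = "open";
		this.openedAt = this.now();
	}
}
```

A caller would wrap each database write in `breaker.execute(...)`, so repeated failures stop new calls from reaching the database until the health check passes.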

This PR is for scriptorium, and once we validate and roll this out in
production, we will add the same pattern for document lambdas too.

Summary of changes made in this PR: 
- Circuit Breaker Implementation: Adds a circuit breaker pattern to
scriptorium->db calls, with various configuration options for error
thresholds, reset timeouts, and error filters.
- Pause and Resume Methods: Adds pause and resume methods for lambdas,
context, documentContext, partition, partitionManager, kafkaRunner,
rdKafkaConsumer, and lambda to manage message flow during circuit
breaker states.
- Health Check for MongoDB: Adds a health check method to the MongoDB
class and exposes a healthCheck property from the MongoManager class.

## Testing

- [X] Added unit tests for circuit breaker.
- [X] Tested the scriptorium end to end functionality locally by forcing
the db to be unavailable in the local setup.
- [x] Tested in dev cluster by changing mongo db settings to replicate a
networking error.

We will roll this out slowly by testing in each ring.

---------

Co-authored-by: Shubhangi Agarwal <shuagarwal@microsoft.com>
2024-10-15 09:35:40 -07:00
Alex Villarreal ae197872de
refactor: Disable CodeQL in test pipelines (#22802)
## Description

Disables the auto-injection of CodeQL tasks in our test pipelines. The
CodeQL tasks always "fail" (the ADO step still succeeds but the task
can't find source code so it can't run CodeQL and logs warnings) in
pipelines that don't check out the repo, which is the case
for our test pipelines.
[Example](https://dev.azure.com/fluidframework/internal/_build/results?buildId=300062&view=logs&j=26483432-40f5-552c-5792-70aed8cde738&t=fb288d9b-a668-57e8-da86-3cec1930f710&l=55)
(msft internal). The task itself says that it should only run for build
pipelines (where source code exists).
2024-10-15 09:21:50 -05:00
Tyler Butler b053eb67bd
build(client): Generate changelogs for 2.4 release (#22798)
2024-10-14 19:38:11 -07:00
Jatin Garg fcd15e0cb3
[bump] build-tools: 0.49.0 => 0.50.0 (minor) (#22800)
Bumped build-tools from 0.49.0 to 0.50.0.

Co-authored-by: Jatin Garg <jatingarg@Jatins-MacBook-Pro-2.local>
2024-10-14 16:04:54 -07:00
Tyler Butler 3d5cbdc22c
[bump] client: 2.4.0 => 2.5.0 (minor) (#22799)
Bumped client from 2.4.0 to 2.5.0.

Some new packages in the client had `workspace:^` deps that were
corrected to use tilde, so the lockfile had to be updated.

Command used:

```shell
flub release -g client -v
```
2024-10-14 15:57:23 -07:00