FluidFramework

Commit Graph

Author	SHA1	Message	Date
Zach Newton	7abf5cfd3b	fix(gitrest): Handle FileSystem Errors in HTTP Responses (#22986 ) ## Description Currently, there are some filesystem operations in Gitrest that result in a generic 400 HTTP error code, rather than a helpful HTTP status and message based on the error that occurred. This PR adds some wrapper functions that help determine if an error is a FileSystemError (or RedisFSError, which is similar) and bubble that up as a NetworkError that can be parsed for the HTTP response.	2024-11-05 13:04:13 -08:00
Alex Villarreal	c56a5218b8	refactor: Update dependencies so path-to-regexp gets to a version without CVE (#22928 ) ## Description Updates our transitive dependencies on `path-to-regexp` to versions that fixed https://nvd.nist.gov/vuln/detail/CVE-2024-45296 . Accomplished by updating our direct dependencies on `sinon` to a mix of version 18 and 19, since that's the main way in which we get transitive dependencies on `path-to-regexp`. `@types/sinon` was also opportunistically updated to the latest version where it wasn't already up to date.	2024-10-30 11:48:31 -05:00
Alex Villarreal	1edf5091de	refactor: Update transitive dependency on `tar` to address CVE (#22932 ) ## Description Updates `tar` to version `6.2.1` to address https://nvd.nist.gov/vuln/detail/CVE-2024-28863 . Done by adding a `pnpm.overrides` entry `"tar": "^6.2.1"` to the package.json of each of the affected packages, running `pnpm i --no-frozen-lockfile`, then removing the override from package.json and running the same command again.	2024-10-30 11:46:01 -05:00
Alex Villarreal	c8ea391f49	refactor: Update dependency on cookie to address CVE (#22847 ) ## Description Updates the `cookie` dependency to address [a CVE in the `cookie` package](https://nvd.nist.gov/vuln/detail/CVE-2024-47764). This required updating `express` since it declares a hardcoded (no range) dependency on `cookie`.	2024-10-25 10:40:56 -05:00
kekachmar	ff6a3bf1b8	add isEphemeralContainer to relevant metrics (#22890 ) Adds `isEphemeralContainer` as a property to relevant metrics	2024-10-24 20:04:24 +00:00
dhr-verma	b621e4a25d	Enabled support for a token issuance endpoint in Alfred (#22884 ) ## Description This PR adds support for a fluid token issuance endpoint. This endpoint can be used to issue fluid access tokens based on a custom implementation of the `IFluidAccessTokenGenerator` interface. One use of this endpoint is to enable access using cloud identity providers such as Entra-ID. Alfred has a new endpoint - `api/v1/tenants/:tenantid/accesstoken`. This endpoint expects a `Bearer` token and creates an access token for a given fluid tenant. This PR also adds support to inject a custom implementation of the `IFluidAccessTokenGenerator`. This can be used to implement any custom logic that is needed for a business use case. It also has unit tests for the following: 1) Throttling of the endpoint 2) Validation of cases where a `Bearer` token is not provided or an invalid authorization method is used 3) Validation of a valid token creation path 4) Validation of token creation failure due to invalid token signature and due to unauthorized access ## Breaking Changes This PR adds a new resourceFactory arg of type `IFluidAccessTokenGenerator`. This arg is customizable as well. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-24 17:48:47 +00:00
Alex Villarreal	9901775f65	refactor: Update build-tools dev deps in common-utils, protocol-definitions, server/historian, and server/gitrest (addresses CVE-2024-43788) (#22885 ) ## Description Updates the build-tools and build-cli dev dependencies in protocol-definitions, common-utils, server/historian, and server/gitrest. This gets webpack updated to the latest 5.x version, which addresses https://nvd.nist.gov/vuln/detail/CVE-2024-43788 . Also updates eslint-config-fluid in protocol-definitions, server/historian, and server/gitrest, just to keep up with the latest version.	2024-10-24 10:46:27 -05:00
Alex Villarreal	1b5ced0d94	refactor(server): Update ws to address CVE (#22845 ) ## Description Updates dependencies to get to ws@8.17.1 (or ws@7.5.10) to address https://nvd.nist.gov/vuln/detail/CVE-2024-37890. Updating socket.io to 4.8.0 was necessary in some cases to get the necessary dependency ranges. socket.io 4.7.5-4.8.0 is a minor semver update but contains [a breaking change in the type of the `close()` function](https://github.com/socketio/socket.io/pull/4971/files), so two places had to be updated to account for that.	2024-10-23 14:01:25 -05:00
dhr-verma	b6aa07a614	Replaced StartupCheck with IReadinessCheck in Gitrest and Historian (#22868 ) ## Description The `StartupCheck` implementation was consumed by Gitrest and Historian. However, since the r11 packages consumed by Gitrest and Historian are not updated as frequently as r11s consumed by repos such as FRS, this dependency caused a breakage in Gitrest in FRS. As a result, the solution is to make the startupCheck parameter in resourceFactories to be more generic - `IReadinessCheck`. This will prevent any future class dependencies from breaking. ## Breaking Changes There should be none as this change makes the parameter more 'generic'. That is, old implementations should still work fine. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-21 23:01:11 -07:00
dhr-verma	71ad22bea0	Replaced StartupCheck with IReadinessCheck (#22867 ) ## Description The `StartupCheck` implementation was consumed by Gitrest and Historian. However, since the r11 packages consumed by Gitrest and Historian are not updated as frequently as r11s consumed by repos such as FRS, this dependency caused a breakage in Gitrest in FRS. As a result, the solution is to make the startupCheck parameter in resourceFactories to be more generic - `IReadinessCheck`. This will prevent any future class dependencies from breaking. ## Breaking Changes There should be none as this change makes the parameter more 'generic'. That is, old implementations should still work fine. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-21 19:02:01 -07:00
Alex Villarreal	0ef059c270	refactor(server): Update dependencies to remove `ip` (CVE) (#22860 ) ## Description Updates dependencies so we get rid of the transitive dependency on `ip` which is flagged for a CVE.	2024-10-21 17:18:51 -05:00
Tyler Butler	6095d7f4f2	build: Remove references to deleted readme-command package (#22831 ) The readme-command package was deleted some time ago but there were still some references to it in configs and comments.	2024-10-17 12:36:37 -07:00
yunho-microsoft	4df3b36633	Fix token cache error: invalid expire time (#22761 ) Skip token cache if the token is about to expire in 5 minutes. --------- Co-authored-by: Yunho <yunho-macbookpro2024@DESKTOP-M86HBMH.redmond.corp.microsoft.com> Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MacBook-Pro.local> Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MBP.guest.corp.microsoft.com>	2024-10-17 18:45:06 +00:00
dhr-verma	878582128c	Removed singleton Startup check usage by Historian and Gitrest (#22826 ) ## Description This PR follows the r11s PR - https://github.com/microsoft/FluidFramework/pull/22819 - to remove the usage of the `StartupCheck` singleton by Historian and Gitrest. This singleton along with r11 package mismatch caused bugs. Hence, now I pass the implementation of the startup probe as a resource to the server. ## Breaking Changes Changes the resourceFactory args of both Gitrest and Historian. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-16 20:34:41 +00:00
dhr-verma	c5d7bde895	Removed the singleton implementation of the startup probe (#22819 ) ## Description The singleton implementation of `StartupCheck` causes bugs when the r11 packages do not match the historian packages consumed. Hence, I decided to switch to a non-singleton implementation. This introduces the `StartupCheck` as an implementation of `IReadinessCheck`. This probe is a resource provided to all HTTP services in r11. ## Breaking Changes Changes Resource and Runner objects for Alfred, Riddler and Nexus to include the `StartUpCheck` object. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-15 17:16:03 -07:00
Shubhangi	256bf0899c	Circuit breaker implementation in scriptorium (#22730 ) ## Description This PR is to add circuit breaker functionality for scriptorium lambda. It is to handle the exceptions where service restart is not helpful and instead, we want to wait and retry again. For example, when mongo db is unavailable/down, and scriptorium is not able to write ops to the db, restarting the service doesn't help; instead we would wait and retry after some time. Circuit Breaker pattern helps in such cases by maintaining open/closed/halfOpen state. So in scriptorium, all the calls to db are wrapped by the circuit breaker, and in case of such errors, the circuit will open and pause the lambda (i.e. pause the incoming messages). After some time, the circuit will go to halfOpen state and call a healthCheck function - if it succeeds, the circuit will close and resume the incoming messages, else it will stay open and paused. We can configure various options, like error threshold, reset timeout, the errors for which we want to engage the circuit breaker, etc. Also if the circuit is not able to close or resume for some time (configurable), we will fall back to restarting the service to avoid being in an endless state of waiting. This PR is for scriptorium, and once we validate and roll this out in production, we will add the same pattern for document lambdas too. Summary of changes made in this PR: - Circuit Breaker Implementation: Adds a circuit breaker pattern to scriptorium->db calls, with various configuration options for error thresholds, reset timeouts, and error filters. - Pause and Resume Methods: Adds pause and resume methods for lambdas, context, documentContext, partition, partitionManager, kafkaRunner, rdKafkaConsumer, and lambda to manage message flow during circuit breaker states. - Health Check for MongoDB: Adds a health check method to the MongoDB class and exposes a healthCheck property from the MongoManager class. ## Testing - [X] Added unit tests for circuit breaker. - [X] Tested the scriptorium end to end functionality locally by forcing the db to be unavailable in the local setup. - [x] Tested in dev cluster by changing mongo db settings to replicate a networking error. We will roll this out slowly by testing in each ring. --------- Co-authored-by: Shubhangi Agarwal <shuagarwal@microsoft.com>	2024-10-15 09:35:40 -07:00
Mark Fields	de6928b528	Stop parsing op contents in DeltaManager - runtime will do it (#22750 ) A long time ago (`5acfef448f`) we added support in ContainerRuntime to parse op contents if it's a string. The intention was to stop parsing in DeltaManager once that saturated. This is that long overdue follow-up. Taking this opportunity to make a few things hopefully clearer in ContainerRuntime too: * Highlighting where/how the serialization/deserialization of `contents` happens * Highlighting the different treatment/expectations for runtime v. non-runtime messages during `process` flow ## Deprecations: Deprecating use of `contents` on the event arg `op` for `batchBegin`/`batchEnd` events, they're in for a surprise. I added a changeset for this case.	2024-10-11 23:01:59 +00:00
yunho-microsoft	03d6823692	Improve socket errors for AFR (#22745 ) This PR includes: 1. Send back retryAfterMs for draining errors 2. Use Network for token revocation errors; deprecate TokenRevocationErrors --------- Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MacBook-Pro.local> Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MBP.guest.corp.microsoft.com>	2024-10-10 11:06:02 -07:00
Brandon	fffe980734	Session discovery metrics/monitoring (#22681 ) ## Description - Add metrics to know where time is being spent during session discovery - Broken down into two primary pieces: verifyStorageToken and getSession - GetSession is further broken down into three parts: checkDocumentExistence, updateExistingSession, and createNewSession - checkDocumentExistence is the DB call that is made to retrieve the doc and see if it exists - updateExistingSession will only happen if the session is not yet alive/discovered - createNewSession will only happen if the session is undefined (docs created before the concept of service sessions) --------- Co-authored-by: Brandon Diaz <“BrandonLouisDiaz@gmail.com”>	2024-10-10 14:01:31 -04:00
Alex Villarreal	4228a21d96	fix: Update transitive dependencies on `braces` to address CVE (#22768 ) ## Description Updates transitive dependencies on `braces` from 3.0.2 to 3.0.3 to address [CVE-2024-4068](https://nvd.nist.gov/vuln/detail/CVE-2024-4068). A couple of applications of `flub modify lockfile --dependency braces --version 3.0.3 --releaseGroup <release group>`, and some manual updates in packages/release groups that we can't target with `flub`, basically doing the same thing but manually (add an override in package.json, install dependencies, remove override, install dependencies again to clean up override from the lockfile). In a few cases I got unrelated updates, mostly about node types, which I reverted manually. Server packages also got semver update from 7.6.0 to 7.6.3 which seems fine.	2024-10-09 14:16:38 -05:00
yunho-microsoft	9e1f6bf859	add new error code: TokenRevoked (#22723 ) Add a new error code: TokenRevoked to InternalErrorCode enum for driver to handle token revocation scenario: should refresh token and reconnect. Co-authored-by: Yunho <yunho-macbookpro2024@Yunhos-MBP.guest.corp.microsoft.com>	2024-10-03 16:19:58 -07:00
dhr-verma	ebed7e613c	Added support for health probes to Gitrest and Historian (#22710 ) ## Description This PR takes in the r11 changes - https://github.com/microsoft/FluidFramework/pull/22635 - and adds support for the `/healthz` endpoints for `Historian` and `Gitrest`. 1. `/healthz/startup`: Startup readiness check endpoint 2. `/healthz/ready`: Service lifecycle readiness check endpoint 3. `/healthz/ping`: Liveness endpoint. This endpoint was not added for `Historian` as it already has an existing ping endpoint `/repos/ping` These are needed to support Kubernetes Health Checks. The readiness endpoint would need a custom implementation of IReadinessCheck. If this is not provided, the endpoint will not be created. ## Breaking Changes Adds customizations to the ResourceFactory and Runners of each of the services mentioned above. These are used to inject an implementation of IReadinessCheck. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-02 17:09:23 +00:00
dhr-verma	9d41303ccf	Added support for health probes for all HTTP services in Routerlicious (#22635 ) ## Description This PR adds support for the following endpoints for `Riddler, Nexus, and Alfred`: 1) `/healthz/startup`: Startup readiness check endpoint 2) `/healthz/ready`: Service lifecycle readiness check endpoint 3) `/healthz/ping`: Liveness endpoint. This endpoint was not added for `Alfred` as it already has an existing ping endpoint `/api/v1/ping` These are needed to support Kubernetes Health Checks. The startup endpoint relies on a new singleton class introduced in this PR - `StartupChecker`. This class returns the `startup` status as `isReady: true` after the service runner is created. The readiness endpoint would need a custom implementation of `IReadinessCheck`. If this is not provided, the endpoint will not be created. To support HTTP endpoints in Nexus, it also adds a request listener to the HTTP server setup in Nexus. ## Breaking Changes Adds customizations to the ResourceFactory and Runners of each of the services mentioned above. These are used to inject an implementation of `IReadinessCheck`. --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-10-01 15:56:44 -07:00
Alex Villarreal	a127c7cebe	refactor(server): Remove deprecated version property from docker-compose files (#22546 ) ## Description The version property in docker-compose files is deprecated and only used for backwards compatibility. When using the docker-compose files in ADO we get warnings like these: ![image](https://github.com/user-attachments/assets/d9ec81d5-7fc9-4ef7-80a4-8d20502f1e93) I don't think we support using older versions of docker compose so removing the optional/deprecated property seems fine. See https://docs.docker.com/reference/compose-file/version-and-name/#version-top-level-element-optional	2024-10-01 15:22:25 -05:00
Pradeep Vairamani	fbda4c0ad2	Upgrade express and body-parser (#22600 ) Upgrades the express and body-parser packages in historian and gitrest to address [CVE-2024-45590](https://nvd.nist.gov/vuln/detail/CVE-2024-45590). Release notes for express [4.20.0](https://github.com/expressjs/express/releases/tag/4.20.0) and [4.21.0](https://github.com/expressjs/express/releases/tag/4.21.0) Release notes for body-parser [1.20.3](https://github.com/expressjs/body-parser/releases/tag/1.20.3) Follow up to #22480 Co-authored-by: Pradeep Vairamani <pradeep@Pradeeps-MacBook-Pro-2.local>	2024-09-24 10:24:10 -07:00
Alex Villarreal	0b17d50af8	fix: Have alfred redirect requests to nexus when appropriate in local docker environment (#22535 ) ## Description This PR makes it so alfred redirects requests whose path starts with `/socket.io` to nexus for handling instead of trying to handle them itself, specifically in the case of a local routerlicious environment running in docker. ### Context While trying to run our e2e tests against a local routerlicious environment running in docker I noticed that some compat tests with older versions (1.x) were failing consistently, and looking at the server logs I realized that requests for the delta stream were being received by alfred, who doesn't handle them anymore since https://github.com/microsoft/FluidFramework/pull/19227. That PR updated the kubernetes manifests so requests to alfred's URL where the path starts with `/socket.io` are actually routed to nexus now. I believe that was necessary because older versions of the driver would not understand new settings for the deltaStreamUrl. That makes things work for an AKS deployment, but we missed doing the same thing for the local docker environment, which this PR fixes.	2024-09-19 10:53:48 -05:00
Alex Villarreal	8987bc9e76	fix(server): Remove bad comma in server configmap (#22560 ) ## Description Removes a trailing comma that results in an invalid JSON config and kubernetes pods crashing trying to load it. Introduced recently in https://github.com/microsoft/FluidFramework/pull/22442.	2024-09-18 12:43:35 -05:00
WillieHabi	f8d3fed16c	Client support for targeted signals (#22321 ) ## Description Client side changes needed to support targeting signals to a specific client id. Signals are now sent with v2 signals protocol (`ISentSignalMessage`) Unnecessary override of `submitSignal` function is removed from localDocumentDeltaConnection. This is handled in documentDeltaConnection of base driver These changes follow the server changes to support targeted signals #19519 [ADO Task 7026](https://dev.azure.com/fluidframework/internal/_workitems/edit/7026)	2024-09-14 05:00:38 +02:00
Mark Fields	0795a20d22	Some overdue cleanup from prior message layer/refactoring work (#22404 )	2024-09-13 17:57:14 +00:00
Tyler Butler	697bb0cc7d	build(server): Upgrade express to 4.21.0 and body-parser to 1.20.3 (#22480 ) Upgrades the express and body-parser packages to address [CVE-2024-45590](https://nvd.nist.gov/vuln/detail/CVE-2024-45590). The package.json range for body-parser was `"^1.17.1"`, but we were already resolved to 1.20.2 in our lockfile anyway, so this is really just a patch bump. The express upgrade is the bigger change. - Release notes for express [4.20.0](https://github.com/expressjs/express/releases/tag/4.20.0) and [4.21.0](https://github.com/expressjs/express/releases/tag/4.21.0) - Release notes for body-parser [1.20.3](https://github.com/expressjs/body-parser/releases/tag/1.20.3)	2024-09-12 18:00:56 -07:00
Matt Rakow	d9f0c37395	Update webpack-related dependencies (#22447 )	2024-09-10 23:11:36 +00:00
Zach Newton	fff9bab5a5	server: Use HSCAN for Nexus getAllSessions (#22442 ) ## Description During peak traffic hours, the RedisCollaborationSessionManager introduced in #22381 could potentially return thousands of sessions. After 1,600 sessions, this exceeds the recommended maximum Redis response size of 200kb (each session+key is about 172 bytes) for optimal efficiency. To improve efficiency, we can use [Redis HSCAN](https://redis.io/docs/latest/commands/hscan/) to fetch sessions from Redis in batches. Here, the default number of sessions per batch is 800 (half the maximum) to allow wiggle room for future session information. ### Tests Added some unit tests for the RedisCollaborationSessionManager, and bumped the `ioredis-mock` version to include stipsan/ioredis-mock#1300.	2024-09-10 19:43:52 +00:00
Zach Newton	e892b97f87	server: Use key for docId tenantId instead of nexus session redis fields (#22439 ) ## Description It is redundant and a waste of space to store the documentId and tenantId in redis fields when they are already present in the key. Improves #22381	2024-09-09 10:45:28 -07:00
Matt Rakow	63aeb13082	Update/remove some deps using old semver (#22420 ) Updates: * `pm2` * `@changesets/cli` * `@changesets/types` * `sass` * `sass-loader` Removes * `typescript-formatter`	2024-09-06 16:21:16 -07:00
Joel Zhu	eaad5963c3	Use the net Library for IP Type (#22405 ) Use the net library for IP type detection instead of your custom method. Some IP addresses may not be recognized or printed correctly if you use your own regular expression method.	2024-09-06 16:12:42 -07:00
Zach Newton	9a932a638b	server: add collab session tracking to Nexus lambda (#22381 ) ## Description Currently, the only reliable way to track a session in R11s is via Deli's `SessionResult` metric, which depends on Join/Leave Ops and Deli's "close" handler. This session tracking does not account for sessions that only have Reader clients with no Ops. This PR introduces an optional, alternative method for tracking collaboration sessions within the Nexus lambda itself, which is able to account for Read-only sessions. > Note: This is an alternative to #9191 which requires creating Orderer connections to manage read clients using Deli, as well as keep-alive pings from the frontend (Nexus in our case). We do not want to spin up Deli and create Orderer connections for read sessions. ### Solution Design Details > Context The original design attempted to only use information already available from `IClientManager` to understand active session information and act accordingly. However, the "currently connected client list" available via `IClientManager` was insufficient for handling various multi-instance scenarios such as clients leaving from separate Nexus instances causing the session to "terminate" too quickly/twice or a Nexus instance shutting down causing a session end timer to be lost. 1. "Session Creation (First Client Join)": When a client for a given document connects to the socket server while no other clients are connected/active for that document, and the previous session either never existed or was inactive for more than 10 minutes, the session is "created/started." 2. "Session Expansion/Continuation (Client Join)": When a client for a given document connects to the socket server while other clients are connected/active for that document, or the previous session has been inactive for less than 10 minutes, the session is updated with information about that new client, and any existing timers are reset. 3. Session End (Last Client Leave): When the only remaining connected client for a given document disconnects from the socket server, the session is updated with "last client leave time" and a 10 minute timeout is started. 4. Session Timeout (Inactive for 10 minutes): When a session's inactivity timer expires and there are still no clients in the session according to the ClientManager, the session is logged as "ended" and cleaned up. All of the above "session" information is stored within a Redis HashMap that allows the list of current sessions to be retrieved and iterated over, or a single session to be retrieved and updated. ## Breaking Changes ### Firm Input Validation When the client sends a malformed connect message (i.e. the message does not contain all expected properties with expected types), Nexus will emit a `connect_document_error` message with a 400 error code, indicating malformed user input to the client. #### Context Nexus currently makes a lot of type assumptions about the client's `IConnect` message in the `connect_document` event handler. This can cause the service to crash due to unhandled TypeErrors at runtime. This PR introduces strong type checks for the incoming `IConnect` message and its internal `IClient` details so that Nexus can safely access the expected properties in that message. ## Reviewer Guidance - Main Session Tracking Logic: server/r11s/packages/services/src `redisSessionManager.ts` and `sessionTracker.ts` - Main Nexus Session Tracking: server/r11s/packages/lambdas/src/nexus `connect.ts` and `disconnect.ts` - There is also a small refactor in `disconnect.ts` to make the Disconnect handler structure more similar to the Connect handler by moving the internal loops into their own named functions. - Type Validation: server/r11s/packages/lambdas/src/nexus `index.ts` and `protocol.ts` --------- Co-authored-by: Tyler Butler <tyler@tylerbutler.com>	2024-09-06 19:32:08 +00:00
Matt Rakow	afe20defdf	Update build-tools versions to latest (0.44.0) (#22407 ) This also requires updates to some typetests since the format has changed.	2024-09-06 11:47:26 -07:00
Alex Villarreal	f8505c1a8c	Update axios dependencies (#22388 ) ## Description Updates axios dependencies to the latest version (in package.json direct dependencies and in transitive dependencies in lockfiles) throughout the repo to address a few CVEs.	2024-09-05 13:23:42 -05:00
zhangxin511	c4870068b5	Add isEphemeralContainer information to session logs (#22284 ) ## Description We don't have a good way of hooking up connect document metrics with isEphemeralContainer flags. Get session is the entry point for connecting to a document, so this will give us more accurate information. ## Breaking Changes N/A --------- Co-authored-by: Xin Zhang <zhangxin@microsoft.com>	2024-09-04 11:33:27 -04:00
dhr-verma	c8e16500ff	Vermadhr/correlation id source tracking (#22292 ) ## Description Refactors and changes the prop `correlationIdSource` to `requestSource` to avoid ambiguity in understanding whether we are tracking request origin or correlationId origin.	2024-08-22 18:00:29 +00:00
dhr-verma	7fd8c786b3	Added correlationId source tracking (#22280 ) ## Description This PR adds telemetry to track the origin of the correlation associated with an API call by adding a new telemetry prop - `correlationIdSource`. If the client sends a correlationId in the `x-correlation-id` header or in the `x-telemetry-header`, then the source is set as `"correlationIdSource": "client"`. Else the correlationId is generated by the server and the prop is set as `"correlationIdSource": "server"`. ## Breaking Changes Updates `ITelemetryContextProperties` to include the `correlationIdSource` property.	2024-08-21 19:16:39 +00:00
kekachmar	afe31cfe89	make consumeLoopTimeoutDelay configurable (#22253 ) Change to make consumeLoopTimeoutDelay configurable	2024-08-19 13:53:50 -04:00
Zach Newton	f461368cf6	server: upgrade server packages in Historian and Gitrest (#22220 ) ## Description Upgrading Routerlicious server packages in Gitrest and Historian to pull in changes from #22109. Adds `getTelemetryContextProperties` param to each BasicRestWrapper instantiation	2024-08-16 23:37:11 +00:00
Zach Newton	24ad74864f	server: Switch Ephemeral Container expired to document deleted message (#22217 ) ## Description Customers depend on the "Document is deleted..." message, not the error code. Some of our E2E tests do too. When an EC is considered expired, just say it's "deleted" to match existing client logic. Follow-up to move to a better message: [ADO #12867](https://dev.azure.com/fluidframework/internal/_workitems/edit/12867)	2024-08-15 20:50:28 +00:00
Zach Newton	38194426d6	server: add a couple doc comments (#22213 ) ## Description Missed adding some doc comments in #22109	2024-08-14 22:58:28 +00:00
Zach Newton	18b76b29ff	server: telemetry context header (#22109 ) ## Description Global TelemetryContext was implemented several major server versions ago. At the same time, the old `getCorrelationId` and `bindCorrelationId` method of tracking correlationId was deprecated. This PR removes usage of those methods, and also adds a new Telemetry Context header that can be extended to track other information for the lifetime of an API request. For the new `x-telemetry-context` header, the old `x-correlation-id` header will still be respected (for now) if the `x-telemetry-context` header does not contain a `correlationId` property. BasicRestWrapper now takes in an optional `getTelemetryContextProperties` method, similar to how it takes a `getCorrelationId` method. This is used to generate the telemetryContext header on outgoing requests from within R11s. `x-correlation-id` is still generated. ## Breaking Changes - `enableGlobalTelemetryContext` config switched to `true` in code. Was already true in configs. - `bindCorrelationId` usage was removed from Gitrest, Historian, and Routerlicious Rest APIs, meaning `getCorrelationId` without `enableGlobalTelemetryContext: true` will not work anymore. I'm leaving the old `getCorrelationId` and `bindCorrelationId` methods in for 1 more release cycle out of an abundance of caution, even though it has been deprecated for almost a year.	2024-08-14 14:59:13 -07:00
Alex Villarreal	e9d1a83787	refactor: Address CredScan warning in server pipelines (#22179 ) ## Description This PR fixes the CredScan warnings we were getting in the server pipelines, before they become a blocker that makes the pipeline runs fail. The auto-injected CredScan task in server pipelines was complaining about things that we had already indicated should be skipped (through the CredScanSuppressions.json file). Turns out that for docker builds, the file is expected in the "root context" for the docker build, not at the root of the repo like it is for some other auto-injected tasks. This PR makes it so we copy the file to the necessary new location in the server pipelines. It also replaces a bunch of fake usernames/passwords in a file's comments with "PLACEHOLDER" which the CredScan task automatically skips (pro-tip: don't use "PLACEHOLDER" as your actual password 😄). Finally, it adds more suppressions for files that are part of test code in some server dependencies.	2024-08-14 17:35:30 +00:00
Zach Newton	c1e343e4c6	server: explicitly reject requests for expired Ephemeral Containers (#22174 ) ## Description Currently, we rely on an Ephemeral Container to either 1) be cleaned up by the Deli lambda on session end, or 2) expire due to DB and Redis TTL values. There are inconsistencies in configurations and TTL behaviors regardless of configs, so we want to explicitly reject access to Ephemeral containers that are older than a certain time. This PR causes all Historian requests and Alfred getSession requests to fail with an explicit `404 - Ephemeral Container Expired: ...` error when the container was created longer ago than the EphemeralDocumentTTL config value. It also changes Gitrest's Ephemeral TTL configuration to use an explicit EphemeralDocumentTTL value for consistency, rather than an implicit general Redis TTL value. The defaults for these values are remaining as 24 hours.	2024-08-12 21:57:20 +00:00
Zach Newton	e9614754e9	Make ephemeral Document TTL configurable (#22164 ) ## Description Makes the ephemeral container DB TTL added in #19981 configurable.	2024-08-08 23:51:04 +00:00
Tyler Butler	8a0e4190f2	ci: Fix docker pipelines to correctly pack packages (#22072 ) This change reverts part of the changes made in #21018. The past changes inadvertently caused the packages to be published without any built content. I have verified from test builds that the published packages do have built content with this change. In this change, the pack process for docker pipelines is once again run with a unique shell command that is run in the docker container, and the package lists are created directly in the pipeline instead of by a script. This is unfortunate from a maintenance perspective because it means there are two slightly different pack paths depending on the pipeline. That said, this is by far the most straightforward fix.	2024-07-31 19:55:41 -05:00
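
Illustration for commit 256bf0899c (circuit breaker in scriptorium): the open/closed/halfOpen cycle that message describes can be sketched roughly as below. This is a minimal, synchronous sketch and not the code from the PR. The names `CircuitBreaker`, `CircuitBreakerOptions`, `errorThreshold`, `resetTimeoutMs`, `healthCheck`, and the injectable `now` clock are all hypothetical. Pausing the lambda, error filters, and the fallback restart are left out.

```typescript
// Hypothetical sketch of the circuit breaker pattern; not the FluidFramework implementation.
type CircuitState = "closed" | "open" | "halfOpen";

interface CircuitBreakerOptions {
    errorThreshold: number; // consecutive failures before the circuit opens
    resetTimeoutMs: number; // how long to stay open before probing again
    healthCheck: () => boolean; // probe run when the circuit is halfOpen
    now?: () => number; // injectable clock (assumption, for deterministic tests)
}

class CircuitBreaker {
    private state: CircuitState = "closed";
    private failures = 0;
    private openedAt = 0;
    private readonly options: CircuitBreakerOptions;
    private readonly now: () => number;

    constructor(options: CircuitBreakerOptions) {
        this.options = options;
        this.now = options.now ?? (() => Date.now());
    }

    // Lazily moves open -> halfOpen once the reset timeout has elapsed.
    getState(): CircuitState {
        if (this.state === "open" && this.now() - this.openedAt >= this.options.resetTimeoutMs) {
            this.state = "halfOpen";
        }
        return this.state;
    }

    // Wraps one call to the protected resource (for example, a DB write).
    call<T>(operation: () => T): T {
        const state = this.getState();
        if (state === "open") {
            throw new Error("circuit open: call rejected");
        }
        if (state === "halfOpen") {
            if (!this.options.healthCheck()) {
                this.trip(); // probe failed: stay open for another timeout
                throw new Error("circuit open: health check failed");
            }
            this.reset(); // probe succeeded: close and resume
        }
        try {
            const result = operation();
            this.failures = 0;
            return result;
        } catch (error) {
            this.failures += 1;
            if (this.failures >= this.options.errorThreshold) {
                this.trip();
            }
            throw error;
        }
    }

    private trip(): void {
        this.state = "open";
        this.openedAt = this.now();
    }

    private reset(): void {
        this.state = "closed";
        this.failures = 0;
    }
}

// Usage with a fake clock and a fake database flag.
let clock = 0;
let dbUp = false;
const breaker = new CircuitBreaker({
    errorThreshold: 2,
    resetTimeoutMs: 1000,
    healthCheck: () => dbUp,
    now: () => clock,
});
const writeOps = (): string => {
    if (!dbUp) {
        throw new Error("db unavailable");
    }
    return "written";
};
function attempt(): string {
    try {
        return breaker.call(writeOps);
    } catch (error) {
        return `failed: ${(error as Error).message}`;
    }
}
```

While the circuit is open, callers are rejected immediately and never reach the database, which is the point at which the commit message says the lambda is paused.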


2596 Commits