Merged PR 791646: Spellcheck various docs

Spellcheck various docs
Michael Pysson 2024-06-26 21:52:30 +00:00
Parent 004ded682e
Commit 0c99477c88
14 changed files: 25 additions and 25 deletions

View file

@ -7,7 +7,7 @@ bxlAnalayzer.exe is a tool that lives next to bxl.exe. It can provide analysis o
The analyzer works on both on Windows and macOS (unless otherwise specified). For macOS, remove the ".exe" extension from the examples and correct the paths.
## Specifying inputs
Most analysis modes require a pointer to an execution log from a prior BuildXL build session. This can either be specified on the command line with `/xl:[PathTo.xlg file]` or the analyzer will find the log of the last BuildXL sesison run if left blank. See the analyzer help text for more details
Most analysis modes require a pointer to an execution log from a prior BuildXL build session. This can either be specified on the command line with `/xl:[PathTo.xlg file]` or the analyzer will find the log of the last BuildXL session run if left blank. See the analyzer help text for more details
# Analysis modes
The analyzer application has many different modes added by the BuildXL team as part of the core product as well as other consumers of BuildXL. See the help text for a full listing of the various analyzer modes: `bxlanalyzer.exe /help`. Specify the mode using the `/m:` option.

View file

@ -11,7 +11,7 @@ But this is problematic when processes enumerate directories that change as a re
## Virtual filesystems
To address the mutating directory fingerprint, BuildXL only uses the actual filesystem to compute the directory fingerprints for directories under read-only mounts. For writable mounts, BuildXL doesn't always query the physical filesytem. Instead it looks at the following virtual filesystems, depending on the configuration:
To address the mutating directory fingerprint, BuildXL only uses the actual filesystem to compute the directory fingerprints for directories under read-only mounts. For writable mounts, BuildXL doesn't always query the physical filesystem. Instead it looks at the following virtual filesystems, depending on the configuration:
* Full graph based filesystem
* Minimal Pip filesystem
@ -36,5 +36,5 @@ The default behavior depends on whether partial evaluation is enabled (<code>/us
The filesystem may be overridden by using the <code>/filesystemMode</code> option. It has the following settings:
1. RealAndMinimalPipGraph - real filesystem for non-writable mounts, Minimal Pip filesystem for writable mounts. This is the default when partial evaluation is enabled.
1. RealAndPipGraph - same as above, except uses the Full Graph filesystem instead of the Minimal Pip filesytem. Default when partial evaluation is disabled
1. AlwaysMinimalGraph - Uses the Minimal Pip filesytem for writable and non-writable mounts.
1. RealAndPipGraph - same as above, except uses the Full Graph filesystem instead of the Minimal Pip filesystem. Default when partial evaluation is disabled
1. AlwaysMinimalGraph - Uses the Minimal Pip filesystem for writable and non-writable mounts.
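As a reading aid for the settings in this hunk, the following Python sketch maps each `/filesystemMode` value to the filesystem consulted for a mount. Only the mode names and their described behavior come from the documentation text; `default_mode`, `filesystem_for_mount`, and the returned string labels are hypothetical names introduced for illustration, not BuildXL APIs.

```python
from enum import Enum


class FileSystemMode(Enum):
    # Values mirror the setting names listed in the documentation.
    REAL_AND_MINIMAL_PIP_GRAPH = "RealAndMinimalPipGraph"
    REAL_AND_PIP_GRAPH = "RealAndPipGraph"
    ALWAYS_MINIMAL_GRAPH = "AlwaysMinimalGraph"


def default_mode(partial_evaluation_enabled: bool) -> FileSystemMode:
    """Default described in the docs: depends on partial evaluation."""
    if partial_evaluation_enabled:
        return FileSystemMode.REAL_AND_MINIMAL_PIP_GRAPH
    return FileSystemMode.REAL_AND_PIP_GRAPH


def filesystem_for_mount(mode: FileSystemMode, mount_is_writable: bool) -> str:
    """Return a label (hypothetical) for the filesystem used for a mount."""
    if mode is FileSystemMode.ALWAYS_MINIMAL_GRAPH:
        # Minimal Pip filesystem for writable and non-writable mounts.
        return "minimal-pip"
    if not mount_is_writable:
        # Both remaining modes use the real filesystem for non-writable mounts.
        return "real"
    if mode is FileSystemMode.REAL_AND_MINIMAL_PIP_GRAPH:
        return "minimal-pip"
    return "full-graph"
```

For example, `filesystem_for_mount(default_mode(False), True)` yields the full graph label, matching the documented default when partial evaluation is disabled.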

View file

@ -1,30 +1,30 @@
# Plugin mode
Plugin mode is a way of providing the extensibilty of changing the default behavior in BuildXL. Plugin is running in a separate process. BuildXL commnuicate with plugin over grpc thus user can define rpc methods for their own plugins.
Plugin mode is a way of providing the extensibility of changing the default behavior in BuildXL. Plugin is running in a separate process. BuildXL communicates with plugin over grpc thus user can define rpc methods for their own plugins.
## How does it works
* Pass `/enablePlugins+` command arguments to BuildXL to allow plugin mode
* Pass `/pluginPaths:<list of paths>` (each path can be seperated by `;`) command arguments to BuildXL to tell where to find plugins.
* Pass `/pluginPaths:<list of paths>` (each path can be separated by `;`) command arguments to BuildXL to tell where to find plugins.
* BuildXL will load plugins one by one
* BuildXL will shutdown all plugins when build is done
* BuildXL will choose the first plugin that can handle the request
* BuildXL will have plugin client to commnuicate with each plugin over grpc(one client per plugin)
* BuildXL will have plugin client to communicate with each plugin over grpc(one client per plugin)
## Required operations
Plugin implementation should conform a set of rpc operations:
1. `Start`: instruct plugin to start and load any necessary resources
1. `Stop`: instruct plugin to stop and clean up
1. `SupportedOperation`: get operations that plugin support. This is used to register in BuildXL thus BuildXL can dispatch message accordingly.
1. `Send`: send `PluginMessage` to plugin and expect to receive `PluginMessageResponse`. Both `PluginMessage` and `PluginMessageResponse` contain dynamic payload. Based on payload type in message received by plugin, it can infer type of request and repond accordingly.
1. `Send`: send `PluginMessage` to plugin and expect to receive `PluginMessageResponse`. Both `PluginMessage` and `PluginMessageResponse` contain dynamic payload. Based on payload type in message received by plugin, it can infer type of request and respond accordingly.
__Note:__ we don't support two plugins have overlapped `supportedMessageType`, one messageType per plugin. This restriction may be subject to change in future
## How to implement new plugin
### plugin client
plugin client implmentation is in BuildXL codebase and it is supposed to bootstrap any plugin if you follow the requirements. Generally speaking, you don't need to make change to it, unless:
plugin client implementation is in BuildXL codebase and it is supposed to bootstrap any plugin if you follow the requirements. Generally speaking, you don't need to make change to it, unless:
*a new type of plugin message
*add a new type of plugin supported operation
*add new extensibile point
*add new extensible point
### plugin
Your plugin implementation should have implementations for handle those requrired rpc operations, see `LogParsePluginExample.cs` as an example
Your plugin implementation should have implementations for handle those required rpc operations, see `LogParsePluginExample.cs` as an example
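The lifecycle and dispatch rules described in this file can be illustrated with a small in-process Python model. This is a sketch under stated assumptions, not the BuildXL implementation: the real plugins run in separate processes and are reached over grpc, and the classes `Plugin` and `PluginManager`, the message type `"LogParse"`, and the choice to raise an error on overlapping message types are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Plugin:
    """Stand-in for a plugin exposing the four required operations."""
    name: str
    handlers: Dict[str, Callable[[object], object]]
    running: bool = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def supported_operations(self) -> List[str]:
        return list(self.handlers)

    def send(self, message_type: str, payload: object) -> object:
        # The payload type tells the plugin which kind of request this is.
        return self.handlers[message_type](payload)


class PluginManager:
    """Hypothetical host side: one registered plugin per message type."""

    def __init__(self) -> None:
        self._plugins: List[Plugin] = []
        self._by_type: Dict[str, Plugin] = {}

    def load(self, plugin: Plugin) -> None:
        plugin.start()
        operations = plugin.supported_operations()
        overlap = [op for op in operations if op in self._by_type]
        if overlap:
            # Assumption: reject the plugin; the docs only say overlap is unsupported.
            plugin.stop()
            raise ValueError(f"message types already registered: {overlap}")
        for op in operations:
            self._by_type[op] = plugin
        self._plugins.append(plugin)

    def dispatch(self, message_type: str, payload: object) -> Optional[object]:
        plugin = self._by_type.get(message_type)
        return None if plugin is None else plugin.send(message_type, payload)

    def shutdown(self) -> None:
        # Mirrors "shutdown all plugins when build is done".
        for plugin in self._plugins:
            plugin.stop()
```

A manager built this way would load each plugin in turn, route every message to the single plugin registered for its type, and stop all plugins at the end of the build.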

View file

@ -2,7 +2,7 @@
BuildXL runs all build tools in a sandbox to observe their actions and, in some cases, prevent processes from taking certain actions. However, running tools in a sandbox has some limitations: the process (and child processes) lifespan is confined to the lifespan of the corresponding pip. This means that any child process that tries to survive the main process will be terminated. This behavior doesn't allow processes like telemetry, compiler, code generation services, etc. to be correctly modeled with BuildXL.
In order to accomodate this type of scenarios, BuildXL enables **trusted** tools a way to configure child processes to escape the sandbox.
In order to accommodate this type of scenarios, BuildXL enables **trusted** tools a way to configure child processes to escape the sandbox.
## Configuring a breakaway process

View file

@ -92,4 +92,4 @@ The arrows represent the data-flow. Pip-Cons only reads `d.cpp` and `a.h` it do
Note that we also track “directory enumeration” in BuildXL that might cause a pip to rerun. I.e. if we observe a process doing the equivalent of `dir`/`ls` we record this as well. If a new file is introduced between builds we will rerun the pip. Now there are some policies that we allow to be configured on a per-tool basis to restrict and not go crazy but in short you can assume BuildXL will do the right thing.
We also track probes i.e. if a process looks for the existence of a file. So if you delete a file from a sealed directory that was previously read, we will rerun the pip. If the previous run checked for the existance of `d.h` and the next run you introduce this file, we will rerun the pip.
We also track probes i.e. if a process looks for the existence of a file. So if you delete a file from a sealed directory that was previously read, we will rerun the pip. If the previous run checked for the existence of `d.h` and the next run you introduce this file, we will rerun the pip.
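The probe rule in the paragraph above can be pictured with a deliberately simplified Python sketch. It assumes the previous run recorded, for each probed path, whether the file existed; the function name `must_rerun` and the dictionary representation are hypothetical, and BuildXL's actual tracking is not described in this excerpt.

```python
from typing import Callable, Dict


def must_rerun(recorded_probes: Dict[str, bool],
               exists_now: Callable[[str], bool]) -> bool:
    """Rerun if any probed path changed between 'present' and 'absent'."""
    return any(exists_now(path) != existed
               for path, existed in recorded_probes.items())
```

With `recorded_probes = {"d.h": False}`, introducing `d.h` before the next build makes `must_rerun` return `True`, which matches the example in the text.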

View file

@ -2,7 +2,7 @@
Server mode is an optimization to speed up back-to-back builds. It is recommended for use on dev machine builds but not in a datacenter. It is currently only available on Windows.
## How does server mode work?
It works by spawning a second bxl.exe 'server' process as a child process of the user initated bxl.exe process. This second process is the one that actually performs the build. It communicates back with the original bxl.exe 'client' process to send console output and results back to the user. When the build completes, the server process stays alive and keeps some state in memory to make subsequent builds faster.
It works by spawning a second bxl.exe 'server' process as a child process of the user initiated bxl.exe process. This second process is the one that actually performs the build. It communicates back with the original bxl.exe 'client' process to send console output and results back to the user. When the build completes, the server process stays alive and keeps some state in memory to make subsequent builds faster.
In a subsequent build, the user will launch a second bxl.exe client process which will connect to the existing server process to perform the build. If no build is requested of a server process for 60 minutes, it will shut itself down.

View file

@ -15,7 +15,7 @@ When infileA was changed, the change affected input for processC is outfileB.
The feature is currently used for changelist code coverage. BuildXL computes the source change affected input list for QTest pip. QTest will only process the file listed as affected when it computes code coverage results. This reduces the QTest's instrumentation time.
## Enabling The Feature
BuildXL needs to know the source changes for the computation. BuildXL will only perform the compution for the process that requires to know its affected inputs by providing the path of a file that the result can be written into.
BuildXL needs to know the source changes for the computation. BuildXL will only perform the computation for the process that requires to know its affected inputs by providing the path of a file that the result can be written into.
### Source Change Tracking
Currently, BuildXL doesn't check the source change of the enlistment itself. It requires source change provided through the command line argument `/inputChanges:<path-to-file-containing-change-list>`. Full paths of the changed source files should be listed in this file.

View file

@ -22,7 +22,7 @@ The behavior described above may be modified with a few command line arguments.
bxl.exe accepts `/normalizeReadTimestamps-` as an argument to disable timestamp normalization. But remember that timestamps of output files may not necessarily be preserved after a pip runs since they get hardlinked into the cache to preserve space. If the file with the same content has earlier been produced and is in the cache, it will be deduped and will get the timestamp from the first time the file was introduced into the cache.
## Other platforms
BuildXL's macOS sandbox implementation doesn't have a mechanism to fake timestamps. It would be possible via interposing, but that feature requires System Integrity Protection to be disabled and thus is undesireable. In order to enable distributed builds and shared cache on macOS, source control needs to ensure timestamp consistency.
BuildXL's macOS sandbox implementation doesn't have a mechanism to fake timestamps. It would be possible via interposing, but that feature requires System Integrity Protection to be disabled and thus is undesirable. In order to enable distributed builds and shared cache on macOS, source control needs to ensure timestamp consistency.
To deal with precomiled headers in clang, the recommendation is to utilize the `-fno-pch-timestamp` option. This instructs clang to ignore the timestamp check. This puts trust in the build graph specified to BuildXL to invoke clang appropriately when the header file content changes.

View file

@ -1,5 +1,5 @@
# Overview
There are two variants of BuildXL development: Public (default) and Internal (for Microsoft internal developers). The difference comes down to a few dependencies which are only availably internally within Microsoft today, like the connections to an internal cache server. The aquisition path for machine prerequesites may also differ slightly.
There are two variants of BuildXL development: Public (default) and Internal (for Microsoft internal developers). The difference comes down to a few dependencies which are only available internally within Microsoft today, like the connections to an internal cache server. The acquisition path for machine prerequisites may also differ slightly.
If you are a Microsoft internal developer, the Internal variant is automatically selected based on your user domain on Windows. On Linux and macOS you need to specify --internal in bxl.sh.

View file

@ -5,7 +5,7 @@ This document describes the high level design and the components of distributed
- **Orchestrator** – this machine initiates the build and is responsible for constructing the initial schedule, connects to workers, and orchestrating the builds.
- **Worker** - a machine that receives requests to execute certain pip steps and reports the result back to the orchestrator
- **Cache** - the [cache](../../Public/Src/Cache/README.md) is used to exchange files between the orchestrator and the worker (e.g. files for reconstructing the pip graph, input files for execution)
- [**Worker**](../../Public/Src/Engine/Scheduler/Distribution/Worker.cs) object - A local or remote worker capable of executing processes and IPC pips. The scheduler keeps a list of Workers (wich always has a single local worker, and additionally the remote workers when running distributed builds).
- [**Worker**](../../Public/Src/Engine/Scheduler/Distribution/Worker.cs) object - A local or remote worker capable of executing processes and IPC pips. The scheduler keeps a list of Workers (which always has a single local worker, and additionally the remote workers when running distributed builds).
- [**PipExecutionStep**](../../Public/Src/Engine/Scheduler/PipExecutionStep.cs) - A specific step in a pip's execution. Some of these steps can be distributed and executed in a remote worker.
- [**OrchestratorService**](../../Public/Src/Engine/Dll/Distribution/OrchestratorService.cs) - This service runs only in the orchestrator and is in charge of keeping track of the remote workers and receiving their messages (i.e., attachment and pip execution completions, and error events). `OrchestratorService` does not directly _send_ messages to the workers: this is done mainly by the scheduler itself through a `RemoteWorker` instance.
- [**WorkerService**](../../Public/Src/Engine/Dll/Distribution/WorkerService.cs) - This service runs only in the workers and is in charge of communicating with the orchestrator, both receiving and sending messages (for orchestrator attachment, pip step execution requests/results and warning/error events).
@ -35,7 +35,7 @@ The orchestrator starts with a list of the addresses for all the workers and sen
- `NotStarted` -> `Starting`: Before it sends the attachment request
- `Starting` -> `Started`: After successfully sending the Attach RPC
- `Started` -> `Attached`: After receiving the attach completion RPC from the worker (as described above)
- `Attached` -> `Running`: After validating that it can succesfully pull from the cache the validation content pushed by the worker. This checks that we can communicate content between orchestrator and worker through the cache (as the worker successfully retrieved the pip graph and we succesfully retrieved this validation content).
- `Attached` -> `Running`: After validating that it can successfully pull from the cache the validation content pushed by the worker. This checks that we can communicate content between orchestrator and worker through the cache (as the worker successfully retrieved the pip graph and we successfully retrieved this validation content).
After transitioning to the `Running` state the `RemoteWorker` starts the thread that will send requests to the remote worker.
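The attachment sequence listed in this hunk amounts to a linear state machine, sketched below in Python. Only the state names and the order of transitions come from the text; the class `RemoteWorkerModel` and its `advance` method are hypothetical, and the sketch covers only the transitions visible in this excerpt (the full document may define others).

```python
from enum import Enum


class WorkerState(Enum):
    NOT_STARTED = "NotStarted"
    STARTING = "Starting"
    STARTED = "Started"
    ATTACHED = "Attached"
    RUNNING = "Running"


# Each state and the only successor listed for it in the excerpt.
_NEXT = {
    WorkerState.NOT_STARTED: WorkerState.STARTING,  # before sending the attachment request
    WorkerState.STARTING: WorkerState.STARTED,      # after the Attach RPC was sent successfully
    WorkerState.STARTED: WorkerState.ATTACHED,      # after the attach completion RPC arrives
    WorkerState.ATTACHED: WorkerState.RUNNING,      # after validation content was pulled from the cache
}


class RemoteWorkerModel:
    """Hypothetical model of the orchestrator-side view of one worker."""

    def __init__(self) -> None:
        self.state = WorkerState.NOT_STARTED

    def advance(self, new_state: WorkerState) -> None:
        if _NEXT.get(self.state) is not new_state:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state
```

Rejecting out-of-order transitions makes it explicit that a worker cannot reach `Running` without passing through the cache validation step.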

View file

@ -16,7 +16,7 @@ The following commands compile the codebase in debug mode:
Lower level filter for compiling a specific qualifier:
* `.\bxl "/f:tag='targetFramework=net472'" /q:Release` to compile targeting net472 in Release mode.
* `.\bxl "/f:tag='targetRuntime=win-x64'"` an alternative way to compile everyting targeting Windows OS.
* `.\bxl "/f:tag='targetRuntime=win-x64'"` an alternative way to compile everything targeting Windows OS.
* `.\bxl "/f:(tag='targetFramework=netcoreapp3.1')and(tag='targetRuntime=win-x64') -Use Dev"` to only target netcore app for Windows.
### How to run a single unit test?

View file

@ -5,7 +5,7 @@
BuildXL is a build system that manages build tasks at the process (or _pip_ for short) level. BuildXL "expects" every process to be _short-lived_, _atomic_, and _well-defined_ (SLAWD), meaning:
- _short-lived_: the process performs a single (conceptual) operation and terminates;
- _atomic_: once started, neither can the process receive any new commands, nor can its outputs be accessed by any of its dependants before it terminates;
- _atomic_: once started, neither can the process receive any new commands, nor can its outputs be accessed by any of its dependents before it terminates;
- _well-defined_: process inputs and outputs are all statically known and fully specified.
Furthermore, BuildXL enforces that the runtime behavior of each process adheres to its input/output specification. This makes it difficult (or even impossible) to spawn and use *background* processes in a concrete build, i.e., any process that does not fit the definition above. Both Unix daemons and Windows Services fall into this category, but any resident process may apply too.
@ -235,7 +235,7 @@ function uploadDrop() {
* Service pips should not be cached. A service starts only if some pip which requires the service needs to execute (cache miss).
* Sercice pips may choose on their own accord to skip work. In the drop example described above the `dropd.exe` service pip makes sure the external service is aware of the output file but it will avoid uploading content when the remote system already has the content.
* Service pips may choose on their own accord to skip work. In the drop example described above the `dropd.exe` service pip makes sure the external service is aware of the output file but it will avoid uploading content when the remote system already has the content.
### Sandbox Considerations

View file

@ -11,12 +11,12 @@ The following features are not properly supported, or are only partially support
The following features are not supported on the PTrace sandbox.
- Blocking disallowed file accesses
## maOS Support History
## macOS Support History
In the past there was a push to bring BuildXL to macOS to provide cached and distributed builds to the a number of Microsoft teams. BuildXL moved to .netcore and scrubbed the codebase to add Unix support. The core bxl executable can be cross compiled on Windows to run on macOS. The major component that needed to be rewritten for macOS is the file access monitoring layer. This is what allows BuildXL to provide reliable caching.
There are a number of options for monitoring process trees and the files they access on unix platforms, and slightly fewer on macOS. Thorough analysis and prototyping was performed and all existing frameworks had issues that prevented their use. The last resort was writing a custom Kernel Extension (KEXT). This was able to satisfy the requirements for high performance and lossless file access tracking for child process trees. It enabled moving forward with macOS support but it came with the risk of of using a technology that might not be supported long term.
In 2020, it was announced that KEXT support would be deprecated from new versions of macOS. Apple provided a replacement for the core functionality in [Endpoint Security](https://developer.apple.com/documentation/endpointsecurity). A prototype Endpoint Security based monitoring sandbox for BuildXL has been implemented, but at the time it proved to drop too many events to be practical for our main use case. That use case was primarily C++ and various scripts where including probes to non-existing files was important for the correcness of caching. Those probes were the highest volume of events which caused the Endpoint Security to be lossy. Between this and competing priorities on the build graph translation work, the decision was made to cease the effort.
In 2020, it was announced that KEXT support would be deprecated from new versions of macOS. Apple provided a replacement for the core functionality in [Endpoint Security](https://developer.apple.com/documentation/endpointsecurity). A prototype Endpoint Security based monitoring sandbox for BuildXL has been implemented, but at the time it proved to drop too many events to be practical for our main use case. That use case was primarily C++ and various scripts where including probes to non-existing files was important for the correctness of caching. Those probes were the highest volume of events which caused the Endpoint Security to be lossy. Between this and competing priorities on the build graph translation work, the decision was made to cease the effort.
Porting BuildXL to macOS helped jump start the team's other cross platform investments, namely Linux. The process execution and monitoring layer is common now across many Microsoft build products, including the Windows and Linux monitoring layer. The macOS layer remains in the state of being implemented, but too lossy for practical use.

View file

@ -81,7 +81,7 @@ The proper credentials need to be provided in order for the BuildXL cache to sto
Under github [codespaces](https://github.com/features/codespaces), a VSCode extension [Azure Devops Codespaces Authentication](https://github.com/microsoft/ado-codespaces-auth/) can be used for providing seamless authentication against Azure DevOps using Entra ID login. BuildXL will look for the `azure-auth-helper` tool under `PATH`, and interact with this auth helper in order to get a valid Entra ID token to use to access the blob account. The authenticated user (or containing security group) needs to have `Storage Blob Data Contributor` permissions to access the blob account. BuildXL will only use this auth method when the aforementioned tool is found in `PATH` and the `StorageAccountEndpoint` is provided.
### Using interactive browser authentication
A user-interactive authentication mechanism via a web browser can be used to acquire an Entra ID token. This auth mechanism will only be attempted if the `StorageAccountEndpoint` is provided and BuildXL is run with the `/interactive` flag, indicating that this is a developer build and therefore interactive prompts are allowed. The interactive prompt will try to acquier a token via Entra ID authentication. Similarly to the above auth method, the blob storage account needs to have configured access such that the authenticated user (or containing security group) has `Storage Blob Data Contributor` permissions.
A user-interactive authentication mechanism via a web browser can be used to acquire an Entra ID token. This auth mechanism will only be attempted if the `StorageAccountEndpoint` is provided and BuildXL is run with the `/interactive` flag, indicating that this is a developer build and therefore interactive prompts are allowed. The interactive prompt will try to acquire a token via Entra ID authentication. Similarly to the above auth method, the blob storage account needs to have configured access such that the authenticated user (or containing security group) has `Storage Blob Data Contributor` permissions.
### Using a connection string
The environment variable `BlobCacheFactoryConnectionString` has to be set in the context of the running BuildXL instance containing the connection string for the blob storage account. Please check [here](https://learn.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string) for details. If you are running your build in an Azure pipeline, the recommendation is to securely store the environment variable containing the connection string in an [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview) and allow your pipeline to consume it. BuildXL will only use this method if the aforementioned environment variable is set.
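The preconditions stated for the three authentication methods in this excerpt can be summarized in a short Python sketch. The function name, parameter names, and returned labels are hypothetical; the excerpt does not say in which order BuildXL tries the methods, so the list order below carries no meaning, and the full document may describe further methods.

```python
import shutil
from typing import List, Mapping, Optional


def applicable_auth_methods(storage_account_endpoint: Optional[str],
                            interactive: bool,
                            env: Mapping[str, str],
                            auth_helper_on_path: bool) -> List[str]:
    """List the methods whose documented preconditions are met (not a precedence order)."""
    methods: List[str] = []
    if auth_helper_on_path and storage_account_endpoint:
        # azure-auth-helper found in PATH and StorageAccountEndpoint provided.
        methods.append("codespaces-auth-helper")
    if storage_account_endpoint and interactive:
        # StorageAccountEndpoint provided and BuildXL run with /interactive.
        methods.append("interactive-browser")
    if env.get("BlobCacheFactoryConnectionString"):
        # Connection string supplied through the environment variable.
        methods.append("connection-string")
    return methods


def auth_helper_available() -> bool:
    """Check PATH for the helper tool named in the documentation."""
    return shutil.which("azure-auth-helper") is not None
```

A caller could pass `os.environ` as `env` and `auth_helper_available()` as `auth_helper_on_path` to see which mechanisms a given machine is set up for.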