Microsoft CodeQL analyzer's suppression format is slightly different
than GitHub's, and expects the suppression comment to be one line.
Update suppression comments in `pkg\ociwclayer\import.go` to conform.
Suppress warnings for "uncontrolled process operation" in `init\init.c`
and `vsockexec\vsockexec.c`.
Suppress "incorrect conversion between integer types" in
`internal\jobobject\limits.go`, and add fix to
`internal\guest\runtime\hcsv2\uvm.go`.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* Expose NUMA config to containers. Use HCS device affinity so that UVM is configured on same NUMA node as the device. Expose SLIT configuration to UVM to gather NUMA node distances.
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Fixing lint errors
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Fix linter errors
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Addressing review comments
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Addressing review comments
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Schema changes
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* OS build version check
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Removing some checks not required for container platform
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Removing a TODO comment
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Conditionalize setting PropagateNumaAffinity for newer OS build only
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Changing variable name from propagationEnabled to propagateAffinity for better readability
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
* Modifying comment and not initializing pointer as it happens implicitly.
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
---------
Signed-off-by: Apurv Barve <apurvbarve@microsoft.com>
Dependebot PRs:
- hcsshim/2140
- hcsshim/2145
- hcsshim/2146
- hcsshim/2149
- hcsshim/2153
- hcsshim/2154
- hcsshim/2159
- hcsshim/2161
- hcsshim/2174
- hcsshim/2175
- hcsshim/2176
Update protobuf files.
`google.golang.org/grpc v1.64.0` deprecated `Dial[Context]` and
`WithBlock`.
Replacing either is non-trivial, and left for a future PR.
Update dependabot file to ignore patch updates: they rarely provide bug
fixes and increase repo churn.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* verity-boot: append hash device to rootfs
Turned out that dev nodes for SCSI devices may not be
determenistic, where the hash device and rootfs may end
up appearing under /dev/sda and /dev/sdb respectively.
Instead of mounting a separate hash device, append the
verity Merkle tree to rootfs ext4 filesystem, similarly
to how it's done for layer VHDs and mount single VHD.
Remove redundant hash device code.
The default `GuestStateFile` filename was changed to `kernel.vmgs`.
Update the IVGM kernel init to reflect the changes.
The kernel command looks something like this:
8250_core.nr_uarts=0 panic=-1 debug loglevel=7 root=/dev/dm-0 \
dm-mod.create="dmverity,,,ro,0 173768 verity \
1 /dev/sda /dev/sda 4096 4096 21721 21721 sha256 \
42896a788a58da77b6acb8ddf53aa744bd269c19146cfdf48eb8fc5529a52e62 \
a1c38923e44adffdd21f84e9185248c884fa28e767795d1025e5804e1c3df905" \
init=/startup.sh
To break this down a little further:
dm-mod.create="<name>,<uuid>,<minor>,<flags>,[table {verity_params}]"
table="<start_sector> <num_sectors> <target_type> verity_params"
verity_params="<version> <data_device> <hash_device> <data_block_size> \
<hash_block_size> <num_data_blocks> <hash_start_block> \
<algorithm> <root_digest> <salt> [<opt_params>]"
With the example above we get:
name: "dmverity"
uuid: ""
minor: ""
flags: "ro"
table: 0 0 173768 verity <verity_params>
verity_params:
version: 1
data_device: /dev/sda
hash_device: /dev/sda
data_block_size: 4096
hash_block_size: 4096
num_data_blocks: 21721
hash_start_block: 21721
algorithm: "sha256"
root_digest: "42896a788a58da77b6acb8ddf53aa744bd269c19146cfdf48eb8fc5529a52e62"
salt: "a1c38923e44adffdd21f84e9185248c884fa28e767795d1025e5804e1c3df905"
The support for booting non-SNP UVMs with dm-verity has also been added
as part of this PR. A new annotation can be used to pass the `dm-mod.create`
parameters to kernel. The assumption that the rootfs VHD will also have Merkle
tree appended after ext4 filesystem still holds. The new annotation is
"io.microsoft.virtualmachine.lcow.dmverity-create-args" and must be used
in conjunction with an existing "io.microsoft.virtualmachine.lcow.dmverity-mode"
annotation.
Add an internal "io.microsoft.virtualmachine.console.pipe" annotation, which
can be used to set the serial for the UVM for debugging purposes.
Note that dm-verity boot has a dependency on `CONFIG_DM_INIT` kernel config.
---------
Signed-off-by: Maksim An <maksiman@microsoft.com>
CimFS is currently not supported with HyperV isolation. However, we still have code that
handles processing of UtilityVM layer during image import. After the layer refactoring
change we need to update this code as well. But since this code isn't being used anywhere
updating it doesn't make much sense. There are no tests for this either. This code is
removed for now and we can add it back later once the plan for running HyperV isolated
containers with CimFS is more solid.
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Current layer writer interface forces us to calculate the CIM path from the layer path by
making assumptions about CIM storage. This isn't a very good approach, better way is to be
explicit about what information the layer writer needs from containerd. This change
updates the CIM layer writer to take in layer CIM & parent CIM paths as inputs. This also
means a corresponding changes needs to be made in containerd.
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Remove unused/legacy functional test flags/environment variables.
Unify [TestMain] control flow, so there is only one exit call, and
`defer` is used to run cleanup after the tests are run.
Standardize UVM `default[L|W]COWOptions` to accept a context, and add
context parameter to `namespacedContext`
Remove all build tags aside from `functional`, since features are used
to select the tests to run. This standardizes the functional tests with
the cri-containerd tests, even though the feature names themselves are
different.
Add `test/pkg/uvm.CreateWCOW` function to mirror `CreateLCOW`, and add
`Create` and `CreateAndStart` functions that pick LCOW or WCOW based on
the options provided.
Have uVM scratch and image layers be created under a dedicated and
persisted folder within `%TEMP%` that is excluded from Windows defender.
(The folder will be removed during OS restart, regardless of contents.)
Remove copied OCI spec options from `test/internal/oci`, add new
options for creating HostProcess containers.
Add a `internal\sync.OnceValue`(`Ctx`) function that mirrors
`sync.OnceValues` (introduced in go1.21) to have a type-safe `Once`
function.
Check that required privileges are held (only once) when unpacking
Windows layers.
Fix LCOW tests in `lcow_test.go` that were setting `KernelDirect`
without also updating `KernelFile`.
Add `util.Context` function to create context that times out before test
timeout, to help with timing issues and allow time for cleanup and
logging.
Rename `cri_util` to `criutil`, since underscores are frowned upon in
package names.
Add a `test` prefix to `github.com/Microsoft/hcsshim/test/pkg/*` and
`github.com/Microsoft/hcsshim/test/internal/*` imports to be consistent
across all `test/functional/*` files.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* [deps] Omni-bus dependency update
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* upgrade containerd to see if tests pass
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
---------
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
CodeQL is generating a warning for tar extraction code suggesting that the tar file entries are used in an
unsanitized way and that could lead to file system traversal attacks. However, during tar extraction all the
files are written to the disk using the `internal/safefile` package which ensures all the filesystem
operations during layer extraction happen under the layer root directory. So this warning can be safely
suppressed.
Signed-off-by: Amit Barve <ambarve@microsoft.com>
* Working DM-Verity boot using 5..15 kernel
Signed-off-by: Ken Gordon <Ken.Gordon@microsoft.com>
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Working to boot 6.1 or 5.15 kernels with vhd supplied userland and merkle tree.
Signed-off-by: Ken Gordon <Ken.Gordon@microsoft.com>
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* PR https://github.com/microsoft/hcsshim/pull/1886 changes which are required or gcs cannot start on 6.1
Signed-off-by: Ken Gordon <Ken.Gordon@microsoft.com>
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Use "modern" igvm tooling from github repo.
Signed-off-by: Ken Gordon <Ken.Gordon@microsoft.com>
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Clean up Makefile
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Add boot doc
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Remove startup_2 as it is now redundant
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Tidying
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* print opts
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* debug
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* debug
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Remove extra err
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Rm fmt
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Clean up startups
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Kick CI
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Add HvSock port annotation
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Clean up merge
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Mark ups pre-rebasing
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* gofmt
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* More concise Makefile snp target
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Apply nits
Signed-off-by: Joe Powell <joepowell@microsoft.com>
---------
Signed-off-by: Ken Gordon <Ken.Gordon@microsoft.com>
Signed-off-by: Joe Powell <joepowell@microsoft.com>
Co-authored-by: Ken Gordon <Ken.Gordon@microsoft.com>
* Minor fixes for cimfs writer
Adds minor fixes like updating the Windows build which supports CimFS, using safefile for creating directories in CimFS writer etc.
* Always expand volume when expanding sandbox VHD
Currently, ExpandScratchSize or ExpandSandboxSize functions expand the VHD itself but don't expand the volume
on that VHD (unless we are on 19H1 & build < 19020). This works because for legacy layers the PrepareLayer
call made just before starting the container will automatically expand the volume to match the size of the
VHD. However, in case of CimFS layers we don't call PrepareLayer at all, so in that case we need to expand the
volume at the time of expanding the VHD.
This also means in case of legacy layers, we might have a small perf hit because the VHD is mounted twice for
expansion (once here and once during the PrepareLayer call). But as long as the perf hit is minimal, we should
be okay.
Signed-off-by: Amit Barve <ambarve@microsoft.com>
* Lint common error wrapping issues, update README
Enable `errorlint` to catch common issues with wrapping and testing for errors.
Wherever possible, switched to using `errors.Is` and `errors.As`.
Exceptions:
- function is defined in the same package and explicitly returns a
know error variable
- returns from functions in `io`, `binary`, `context`, `syscall`,
`golang.org/x/sys/windows`, or `golang.org/x/sys/unix` that are
(relatively) stable in error return value and type
- conversion would interact with with `github.com/pkg/errors`
- conversion would be non-trivial and require additional
testing/validation
- specifically, legacy code in `runhcs` and the root of the repo
Rename `context` to `ctx` in `pkg\go-runhcs\*.go` to avoid
overshadowing `context` package.
Update `README.md`:
- run markdown formatter (spaces around code blocks and headers, raw link URLS)
- add section on linter and go generate (similar to go-winio's)
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* PR: hcserrors(+tests), README
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
---------
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Add (Windows) build tags to necessary files, and add
`internal\vhdx\doc.go` so that go language server does not complain that
package is missing on Linux.
Not all updated files are Windows specific. Some (eg,
`internal\gcs\iochannel.go`) are only used by Windows code, so the go
build tag prevents `unused` lint errors when `GOOS=linux`.
Export the `parseAnnotations(Uint32|Uint64|String)` functions in
`internal\oci\annotations.go` since other functions in the
file are used in Linux files and that was the only way to avoid `unused`
lint errors.
Finally updated lint (in `.github\workflows\ci.yml`) and codeql
(in `.github\workflows\codeql.yml`) jobs to run on entire repo for
Linux, rather then specific directories.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* Add support for Linux kernel 6.x to fetch attestation report
Signed-off-by: Takuro Sato <takurosato@microsoft.com>
* Hard code ioctl code
Signed-off-by: Takuro Sato <takurosato@microsoft.com>
---------
Signed-off-by: Takuro Sato <takurosato@microsoft.com>
Allow callers to specify additional registry values in the WCOW OS via
the `io.microsoft.virtualmachine.wcow.additional-reg-keys` annotation.
The intent is to test and validate bug fixes or debug uVM behavior
(eg, via setting values in `SYSTEM\CurrentControlSet\Services\wcifs`,
`SYSTEM\CurrentControlSet\Policies\Microsoft\FeatureManagement\Overrides\*`)
without requiring a new package.
The annotation is under `internal/annotations`, since it is not
suitable for end users to rely on.
Additionally, limit the settable registry keys to prevent it being used
(abused) as a catch all mechanism to arbitrarily modify uVM behavior.
Additionally, add
[RegistryValueType](https://learn.microsoft.com/en-us/virtualization/api/hcs/schemareference#RegistryValueType)
and
[RegistryHive](https://learn.microsoft.com/en-us/virtualization/api/hcs/schemareference#RegistryHive)
enum types for HCS v2 schema.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Adds a new CimLayerWriter that implements the same LayerWriter interface that the legacy layer writer
implements. This CimLayerWriter can be used in containerd to pull images into the cimfs format.
Signed-off-by: Amit Barve <ambarve@microsoft.com>
- Extend hcsTask.Update() to process and add
mount for running process isolated and hyperV
wcow containers
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
Add an option to mount partitioned disks with dmverity.
Additionally add support for reading verity information from
within the guest. The expectation is that verity hash device
is appended to the read-only file system. The functionality
can be enabled by passing a container annotation.
Host no longer reads verity superblock and as a result
the `DeviceVerityInfo` protocol message is being
deprecated. The guest will always attempt to read verity
super-block when non-empty security policy is passed.
Security policy is expected to be empty only in regular
LCOW scenarios.
Signed-off-by: Maksim An <maksiman@microsoft.com>
* Allow setting HclEnabled to false
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Ensure HclEnabled field can still be omitted
Signed-off-by: Joe Powell <joepowell@microsoft.com>
---------
Signed-off-by: Joe Powell <joepowell@microsoft.com>
* Enable all go vet checks
Turn on all [go vet](https://pkg.go.dev/cmd/vet) checks (except for
`fieldalignment`, and ignore shadowing `err` variables.
Caught a couple minor bugs:
- ncproxy did not set the panic file for the service
- `nil`-field access in logs
- not updating `processorLimits` in `(*UtilityVM).Update`
Simplified a couple `if` statements clauses where
[conditional evaluation](https://go.dev/ref/spec#Logical_operators)
made `!= nil` checks redundant.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* PR: err name; decl
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
---------
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
offline registry API is required during CimFS layer import. This commit adds Go wrappers around it. It also
exports some constants from the wclayer package so that those constants can be used by the cim layer package
Signed-off-by: Amit Barve <ambarve@microsoft.com>
* Update containerd1.7; google.golang.org/protobuf
Update to containerd 1.7 and move to `google.golang.org/protobuf`
from `github.com/gogo/protobuf/gogoproto`.
These two changes are intertwined, since containerd 1.7 changes its
ttrpc task server definitions and protobuff generation (as well as some
other API changes).
Additionally, the task server gRPC code is imported from containerd
directly, rather than being generated here, and that code now explicitly
imports `google.golang.org/protobuf` instead of
`github.com/gogo/protobuf/gogoproto`.
Upgrading to `google.golang.org/protobuf` also requires updating
the `containerd/cgroups` dependency to v3
(`github.com/containerd/cgroups/v3/cgroup1/stats/`).
The new `protoc-gen-go-grpc` generators do not allow directives such as
`gogoproto.customname`, so the `go-fix-acronym` command is used to
update acronym customization (which is what containerd does).
Updated `Protobuild.toml` to specify new generators.
Added an `Update-Proto.ps1` script to re-generate protobuf files locally
and in GitHub CI.
Add `protobuild` and protobuff `grpc` and `ttrpc` generators to
`tools.go` so they are tracked and vendored, and can be trivially
installed via `go install`.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
* Vendor protobuf import changes
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
---------
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
Explicitly track `github.com/golang/mock/mockgen` as a go dependency.
(There is no change in our go.mod or go.sum, since the package is
already being used by the generated code and elsewhere).
Add `//go:generate` directives to create the mocked files and ensure the
results are always up to date.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>
This change completely removes the support for cloning/late cloning that
was added a few years ago. The reasoning behind this is as follows:
- There are no plans to utilize cloning at this point in time.
- The cloning support required extensive/invasive changes across many
parts of the shim. This has made future changes and refactorings more
difficult in some cases. While these changes are still possible, it
seems like an unnecessary burden if we are not going to use cloning.
- The cloning functionality was never actually utilized, and thus may
still have had fixes needed to be production-ready.
If cloning is needed again in the future, we will be able to revert this
commit to add it back.
Signed-off-by: Kevin Parsons <kevpar@microsoft.com>
In some circumstances, the policy decision object returned from a policy
denial causes the resulting error message to exceed the maximum error length
imposed by Service Fabric. This PR adds some truncation logic to reduce the
size of the decision object so it first into the limit. Firstly, all standard
capability sets (privileged and unprivileged) are replaced with a placeholder.
Then, if the message is above the length limit then the following things are
truncated until the message is below the threshold:
1. `reason.error_objects`
2. `input`
3. `reason` (the rest of the reason object returned from the policy)
Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
This commit makes two major changes:
1. All policy enforcement points now receive a context objects and use it
to log policy errors and denial decisions.
2. Policy denials are now conveyed as structured JSON objects.
Whereas previously policy denial was surfaced as a text error message,
the policy now generates a bracketed base64 encoded string:
policydecision< (base64) >policydecision
When decoded, this will be a JSON object with the following structure:
```json
{
"input": <redacted input object>,
"decision": "deny",
"reason": <reason object explaining decision>,
"policyError": <optional error field>
}
```
NB: the `"policyError"` field above is only present if the denial was
triggered by an actual error in the Rego policy.
Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
Due to `execve` limitation on the size of environment variable, write the
base64 encoded security policy, UVM reference info and host AMD certificate
to container's rootfs.
Update existing test accordingly.
Signed-off-by: Maksim An <maksiman@microsoft.com>
This commit adds enforcement over the seccomp profile associated with a container. The
policy author can measure their seccomp profile and include this measurement in the
policy. Subsequently, they can provided that same seccomp profile to the orchestrator
(e.g. via an annotation) and GCS will measure the provided profile and provide this as
input to the policy engine.
This commit also adds a series of CRI tests for security context enforcement.
Fixing error with privileged exec_in_container
Adding CRI test for privileged exec in container
Signed-off-by: Matthew A Johnson <matjoh@microsoft.com>
Remove `//nolint` directives for varcheck, deadcode, and
structcheck; they were deprecated in golangci-lint v1.49.
Removed `//nolint` directives for `unused`; it appears a new version of
that linter is less false-positive prone.
Fix instance of loop variable being captured in closure for
`Test_RunPodSandbox_Concurrently` in `policy_test.go`.
Signed-off-by: Hamza El-Saawy <hamzaelsaawy@microsoft.com>