* Add a parameter for enabling Entra ID RBAC on key vaults
* Add an RP-level feature flag for determining whether to use the mock MSI RP
* Tweak the mock identity URL to play nicely with the mock MSI RP
* Add Azure SDK client wrappers for new clients (federated identity credentials control plane and key vault data plane)
* Vendor in new Azure SDK clients and update msi-dataplane
* Lay groundwork for use of cluster MSI...
- Initialize the MSI dataplane client, using the mock MSI RP/stub if
appropriate
- Initialize key vault store client (for MSI certificates; functionality
is implemented in MSI dataplane module)
- Create a cluster MSI certificate and store it in the key vault during
cluster bootstrap
- Instantiate an Azure SDK FederatedIdentityCredential client using the
cluster MSI certificate
- Delete the cluster MSI certificate as needed during cluster deletion
* Don't fail during cluster deletion if the cluster MSI certificate is
already gone from the key vault (or was potentially never created)
* Establish an RP-Config variable for the MSI RP endpoint
- Update doc comment for ensureClusterMsiCertificate
- Simplify conditional logic in MSI cert deletion
* Use pointer conversion functions that aren't deprecated
* Respond to PR comments (and fix some other things along the way)
- Move `clusterMsiResourceId` function to `OpenShiftCluster` type
- When persisting the MSI cert to KV, use the `NotAfter` returned by the MSI RP (for the stub, just use an arbitrary value)
- Move `getClientOptions` functionality to `AROEnvironment` type
- Move logic for determining cluster MSI key vault name to `pkg/env`
- Pull cloud name mapping stuff out to `AROEnvironment` type
- Update msi-dataplane module to include new changes and use `UserAssignedIdentities` type to get Azure credential in `pkg/cluster/clustermsi.go`
- Fix typo in https URL in comment in `pkg/cluster/delete.go`
- Implement suggestion to use `errors.As` instead of a type assertion in `pkg/cluster/delete.go`
* Update documentation with info about new feature flag
- Move new cluster MSI steps forward in bootstrap step order
- Move MSI dataplane client options stuff to pkg/env
- Explicitly check for a single cluster MSI in `ClusterMsiResourceId`
- Other small tweaks
* Vendor in msi-dataplane update that prevents a potential nil pointer dereference
* Add missing method to internal key vault client
* Make error messages more specific in ClusterMsiResourceId
* Add missing env vars to run-rp make target and uncomment dynamic validation bootstrap step
- In newly added Azure clients, return struct types instead of interface
types
- Move cluster MSI certificate deletion to be after Azure resource
deletion for safety just in case cx continues to use cluster that is
in Failed/Deleting provisioning state
* Add new env vars for MIWI to env.example for clarity/completeness
* Turn check for nonzero number of user assigned identities into a utility function
* Use existing constant for key vault dns suffix
* ARO-4376 Track2 authorization api addition for roledefinitions
* ARO-4376 add a stringutil funcs
* ARO-4376 use dbPlatformWorkloadIdentityRoleSets to get platform identity roles for cluster version
* ARO-4376 add dynamic validation for platformworkloadidentityprofile
* ARO-4376 resolve initial comments
* ARO-4376 refactor error messages and checkaccess action crosscheck
* ARO-4376 Add unit tests and comments resolution
* ARO-4376 add validation for upgradeableTo
* ARO-4376 Comment resoultion and additional unit tests
* ARO-4376 minor version comparison handling
* ARO-4376 update permission error messaging handling for MIWI
* ARO-4376 update constructors to return non-interface type
* ARO-4376 add unit tests for GroupsIntersect
* ARO-4376 update generate files to support bingo
* Update ci-go
* Update go toolset
* Update prepare shared rp dev
* Update prepare your dev
* More 1.21 updates
* more changes
* save work
* test
* tidy up
* Add license to typealker test
* move some repeated code into pkg/util/service/
* cleanups in cmd/aro
* update_ocp_versions does not need AEAD
* cache the authorisers rather than recreating them
* env mock updates
* move stuff around from review
Dumps the VM info + console logs on failure so that we don't need to run the Geneva Action or have the control plane still around to get it. Also refactors frontend and geneva action to make use of the same code path.
* Update openshift/api to release-4.12
* Add machinev1 resources to scheme
* Add CPMSDeactivatorEnabled flag
* Add CPMS Deactivator operator controller
* Add controlplanemachinesets to system:aro-sre ClusterRole
* Use better naming convention for CPMS controller flag
* Change debug log messages to info
* Make CPMS controller exit early if clusterversion < 4.12
* Only setup CPMS controller on clusters with machinev1 API
This is necessary in order to Watch the CPMS resource - this operation will fail on
clusters that do not support the Machine V1 API (OCP <= 4.11), causing controller
setup to fail. Since these clusters do not have a CPMS resource to manage, we can
safely skip running this controller on those clusters.
* Fix CPMS controller name
* Add Cosmos DB container for PlatformWorkloadIdentityRoleSets
* Revert change to AKS k8s version - committed by mistake
* Fix bug in converter
When I first wrote the converter, I thought Go would treat the the slice
we `make` few lines above these changes as a slice full of zero-value
structs, but it actually treats it as an empty slice, which led to
out-of-bound issues when I first tried to use this converter to work on
the API endpoints.
* Add the PlatformWorkloadIdentityRoleSetConverter to the API register
* Implement the change feed for role sets in the easiest, most naive way
* Implement the external API endpoint for listing role sets
* Fix a small oversight from earlier on
* Add unit tests for the list endpoint
* Add unit tests for changefeed changes
* Uncomment the static validator
* Fix more slice out of bounds bugs in the converters...
* Add converter and static validator to the admin API register
* Add list and put endpoints
* Fix name of function to match convention
* Fix bug in static validator
I originally wrote the code the way I did so that we could aggregate
errors so that we could provide a better UX in cases where there are
multiple similar errors in the request content. I found while writing
unit tests that aggregating the errors in this way and not wrapping them
in a CloudError causes the RP to return an internal server error instead
of a 400 bad request.
Is there a way we can aggregate the errors and still wrap them in a
CloudError? I'm not sure of the formatting requirements for the text of
CloudErrors.
* Add unit tests for new API endpoints
* Fix typo
* Appease the linter
* Appease the linter
* Add TODO comment re: the number of parameters
* Update static validator to return multiple validation issues at the same time where applicable for better UX
* Add a simple utility function to make semver comparisons of OpenShift minor version more readable
* Log error before returning 500 to user
* Log errors before returning 500 to user
* Improve naming of unit test cases
* Add additional unit test cases
* api: Avoid referencing DefaultInstallStream in tests
* frontend: Avoid referencing DefaultInstallStream
The frontend's OpenShiftVersions change feed handler will record
the current default version for the rest of the frontend to use.
* monitor: Remove latestGaMinorVersion metric
The RP no longer has this information internally, so the metric
is no longer relevant.
* update_ocp_versions: Read versions from an environment variable
Read OpenShift versions and pull specs from an OPENSHIFT_VERSIONS
environment variable containing a JSON object. This data includes
the default OpenShift version for new installs that don't specify
a version.
This moves us toward eliminating hard-coded OpenShift versions in
pkg/util/version/const.go.
* cache_fallback_discovery_client_test.go: Hard-code version
I'm not sure what to do with this test. Install stream data has
moved to RP-Config, so if the test is worth keeping then I guess
the oldest supported version will have to be hard-coded and kept
up-to-date. But it probably won't be.
* version: Remove DefaultInstallStreams
DefaultInstallStream will remain for now, but it's ONLY for use by
local development mode until we can come up with a better solution.
---------
Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>