Commit graph

4177 commits

Author SHA1 Message Date
Ankur Singh 91a82f3acf Adding 4.13.23 as default version 2024-04-17 10:57:12 -07:00
Ayato Tokubi d6fc9fcf63
add export in createapp.env (#3518) 2024-04-17 10:47:29 +10:00
Edison Cardenas 8878b0d460
ARO-5325: Propagate install error to user (#3509)
* ARO-5325: Propagate error to user and add unit test.

* Refactor: Revert double quote
2024-04-15 11:48:25 +05:30
Maitiú Ó Ciaráin d6082a5ecb
Use the correct MDSD image (#3517) 2024-04-11 13:09:56 +00:00
b-jhoreman 99838eacc4 adding retry logic on yum commands to avoid resource locking 2024-04-09 11:11:36 -07:00
b-jhoreman 220a14d390 adding retry logic on yum commands to avoid resource locking 2024-04-09 11:05:07 -07:00
b-jhoreman 55c8b9df29 adding retry logic on yum commands to avoid resource locking 2024-04-09 10:59:41 -07:00
b-jhoreman f2ae7c1488 adding retry logic on yum commands to avoid resource locking 2024-04-09 10:35:34 -07:00
Hilliary Lipsig df91955909
Merge pull request #3487 from Azure/tsatam/hotfix-mdsd-pullspec-allow-reset-to-default
Ensure Fluentbit/MDSD spec operatorflags assume default version when set to empty string
2024-04-08 10:20:35 -07:00
Hilliary Lipsig a9a3ffc5e0
Merge pull request #3504 from Azure/nwnt/remove-skiplinuxazsecpack-tag
Remove SkipLinuxAzSecPack from RP and Gateway VMSS
2024-04-08 10:14:40 -07:00
Caden Marchese adc4836520
Add new initial fields to v20240812preview (#3478)
* Add new initial fields to v20240812preview
* update openshiftcluster_example.go
* add new fields to converter

Co-authored-by: kimorris27 <kimorris@redhat.com>
2024-04-08 09:26:41 -04:00
Amber Brown 867b0d5979
Load the app/SP from the environment instead of automatically creating it (#3498)
* use multierror here, so it's more obvious if we're missing multiple keys

* Ignore the written out clusterapp.env

* move create/delete into separate commands, which write out a clusterapp.env file

* delete the app in the e2e.sh file

* update the docs
2024-04-08 08:06:53 +10:00
Amber Brown 27bc205e24
Remove portal v1 (#3465)
Portal V1, you have served us well.
2024-04-05 12:06:22 +11:00
Nont 8c8e8e4ed4 Remove SkipLinuxAzSecPack from RP and Gateway VMSS 2024-04-04 15:34:34 -05:00
Tanmay Satam fb41688ea3
MDM/MDSD/Fluentbit Image Bumps (#3493)
* Update mdm/mdsd/fluentbit coordinates to latest versions

* make generate
2024-04-04 10:55:09 +02:00
Tanmay Satam 222b5f24fc Ensure Fluentbit/MDSD spec operatorflags assume default version when empty string 2024-03-28 09:33:47 -04:00
Maitiú Ó Ciaráin be7b8157b0
Refactor the ResultType (#3443) 2024-03-28 13:09:07 +01:00
Caden Marchese 5cd28ea9ed
Consistent OIC storage account naming between dev and prod (#3468)
* Consistent naming between dev and prod
2024-03-27 12:49:03 +11:00
Ben Vesel 14e3c5a881
Merge pull request #3474 from Azure/ARO-6247/revendor-go-cosmosdb
Revendor go-cosmosdb
2024-03-21 16:44:11 -04:00
Maitiú Ó Ciaráin 07672dde9c
Switch to using the secondary key (#3427)
* Switch to using the secondary key

* Documentation update

* Log the name of the key used

* Pass log arg through

* Fix import ordering

* Shorten line
2024-03-21 11:28:58 -04:00
Tanmay Satam 19ea73917a Include generated client 2024-03-20 18:21:05 -04:00
LiniSusan c38242fea3
capturing Azure storage account throttling error (#3298)
reverting the runner changes

changes for more robust code- checking for httpstatuscode

simplifying the levels of nesting of the block of code
2024-03-19 09:23:30 +05:30
Jeff Yuan a215c370e9 add go test 2024-03-14 17:15:15 +13:00
Jeff Yuan 836c0eaa4f reload aead when encountering CIF chacha20poly1305 error 2024-03-13 17:15:34 +13:00
Roland Kunkel 38ebdaef10
[ARO-5445] DNS checker logging custom DNS server as an error (#3453) 2024-03-12 22:04:24 +01:00
Maitiú Ó Ciaráin f20df84717
Make TestAsyncOperationResultLog more robust (#3442)
* Check all logs

* Use better test pattern

* Remove unused property
2024-03-12 11:55:41 +01:00
Amber Brown 7a415b07de
Remove unneeded OpenShift pins & imports (#3430)
* Remove dependencies on console-operator and cluster-api-azure

* remove the forks that we don't use

* go mod updates

* go mod vendor

* stop relying on the providerspec being registered in tests

* cleanups

* update go sum

* test coverage fixes
2024-03-12 16:23:44 +11:00
Amber Brown e86100c9ae
Add clienthelper, a replacement for dynamichelper (#3345) 2024-03-12 12:44:53 +11:00
Tony Schneider cf8c569257
compare certs by serial number (#3435)
Compare mdm/mdsd certs by serial number
2024-03-11 15:48:16 +01:00
Caden Marchese cda87483e3
Fix typo introduced in #3404 (#3433) 2024-03-08 10:31:05 -05:00
Nont 6061732eb8
Update applens tenant (#3444)
* Update applens tenant

* Change the applens prod app ID as well
2024-03-08 08:35:47 -05:00
Maitiú Ó Ciaráin b7de9128e1
Revert "Refactor result type (#3410)" (#3441)
Revert "Refactor result type" (commit 6ed5f52)
2024-03-07 14:00:54 +01:00
Maitiú Ó Ciaráin 27ce730ac3
Remove ifreload workaround (#3437)
Drop ifReload workaround
2024-03-07 11:00:26 +01:00
Maitiú Ó Ciaráin 6ed5f52e0f
Refactor result type (#3410)
Refactor result type
2024-03-06 12:14:21 +01:00
Amber Brown ed94c28346
Update to Go 1.20 (#3429)
* Go 1.20 changes

* go 1.20 does not need the seed randomised by default

* go generate
2024-03-05 18:18:36 +11:00
Kipp Morris 8ee1b531ef
2024-08-12-preview API skeleton (#3419) 2024-02-29 12:37:28 -05:00
Caden Marchese 42f3708452
Create oic storage account in dev (#3404)
* create oic storage account in dev
* split oic resources into new template for reuse
* add roleassignment, dev script
* parameterize, add documentation
* create new cmd for full env, doc change
2024-02-27 13:15:35 -07:00
Tony Schneider af311a2d31
Back port v2023-11-22 Azure REST specs (#3420)
* ProvisioningState Cancelled -> Canceled
* add WorkerProfileStatuses to example openshift cluster
* add x-ms-enum header to ProvisioningState and PreconfiguredNSG
* make client
2024-02-26 15:15:33 -07:00
Ankur Singh 8b523d1d2f
Adding segregation of resultType for CloudError (#3396)
* Adding segregation of resultType for CloudError

* Fixing lint issues

* Adding test coverage

* Fixing lint issue

* Uncommit unwanted commit

* Fixing test cases
2024-02-22 08:41:14 +05:30
Amber Brown 730651374c
Merge pull request #3413 from hawkowl/hawkowl/ci-cleanup-feb24
Clean up some of the CI/linting/generate
2024-02-21 17:48:08 +11:00
Amber Brown 527fe77c7b comment fixes 2024-02-21 11:13:46 +11:00
Amber Brown 4181083fe7 fix 2024-02-21 10:59:54 +11:00
Amber Brown 0b31fe7560 fix local dev not building a cluster successfully because of network issues, plus fix a bug in the authorisation logging 2024-02-21 10:59:54 +11:00
Tanmay Satam 07135d82dc
Revert pkg/util/cluster creation to use 20230904 API (#3415)
* Revert pkg/util/cluster creation to use 20230904 API

* remove 20231122 client from cluster.go entirely
2024-02-20 14:58:10 -05:00
Tanmay Satam 6b8cca0078 PR nits 2024-02-20 10:57:07 -05:00
Nont 6f2b31bacb
Make NSG Monitor run every 10 minutes (#3343)
* Make NSG Monitor run every 10 minutes

* Fix rebase unit tests

* Address code review comments

* Add defer the NSG ticker's stop
2024-02-20 15:00:37 +01:00
Amber Brown 6b3e58e430 fix the generation from controller-tools 2024-02-20 14:01:56 +11:00
Amber Brown f0611becb1 fix linting errors that become noisier in go 1.20 2024-02-20 13:56:18 +11:00
Amber Brown 46976867c2 fix incorrect rm in generate 2024-02-20 13:55:56 +11:00
Ben Vesel 5ac65fd499
Don't add security context on <4.11 as OpenShift restricted SCCs do not tolerate it (#3401)
* Don't add security context on <4.11 as OpenShift restricted SCCs do not
tolerate it

* Update GetClusterVersion to return cv.Status.History[0] if no completed update exists
2024-02-19 09:26:52 -05:00
Tony Schneider f4d03fb440
add archv1 check to multiple ip static validation (#3406)
* add archv1 check to multiple ip static validation

* add archv1 check to preview api static validation

* use InstallArchitectureVersion constant in test
2024-02-19 09:12:13 -05:00
Nicolas Ontiveros 7f99749a4d
Don't accept TLS 1.0/1.1 Connections (#3378)
* Update dbtoken server

* Update frontend

* Update portal

* Update proxy

---------

Co-authored-by: Nicolas Ontiveros <nicolas.ontiveros@microsoft.com>
2024-02-16 12:40:28 -05:00
Nont 00a44ec345 Remove BYO NSG references and replace with Preconfigure NSG 2024-02-15 14:38:53 -05:00
Nont 34ed0d7a17 Adjust the NSG validation logic for GA 2024-02-15 14:38:53 -05:00
Amber Brown 94b690c08a
fix go 1.20 change around proxy and default user agents (#3408) 2024-02-14 08:43:32 -05:00
Tanmay Satam 08bf84c329 Only watch for generation (spec) changes on ARO cluster resource in most controllers 2024-02-09 11:13:41 -05:00
Tanmay Satam 397e898fd8 Extract duplicated predicates to common predicates pkg 2024-02-09 11:13:41 -05:00
Tanmay Satam 9eb8db1e4e
Update cluster lifecycle API fields to readOnly (#3380)
* Exclude Python AAZ client from license hack script
* Update clusterProfile url, apiserverProfile url and ip, ingressProfile ip, systemdata to be readonly in v20231122 API
* Update swagger json (generated)
* Update API static validation tests and converter to respect readOnly fields
* Update generated client SDKs
2024-02-08 10:35:17 -07:00
Maitiú Ó Ciaráin e7721a086a
Remove unused old code (#3383) 2024-02-07 12:12:20 +11:00
Daniel Holmes 2d7fc451bd
Merge pull request #3170 from hawkowl/hawkowl/denyassignment-insights
Allow metricalerts, actiongroups, and activitylogalerts for monitoring in the denyassignment
2024-02-07 09:55:26 +10:00
Srinivas Atmakuri 6e8e4e1870
HiveNamespace to use docID/clusterID for new Installs (#2992)
HiveNamespace currently uses aro-<uuid>, this change is an effort
to unify UUIDs across cluster doc instead of having multiple,
by pointing HiveNamespace to docID so this can be leveraged later.

More Details: https://redhat-internal.slack.com/archives/C02ULBRS68M/p1686806655273309
2024-02-05 12:45:28 +11:00
carvalhe 2ac8d655d2
Added FP Check for validation (#3335)
* Added FP Check for validation

* Cloud Err to surface level backend for SP issue
2024-02-02 11:23:57 -05:00
bennerv 7a40bfb343 Update MUO to use proper pod and container security context 2024-02-02 11:22:45 -05:00
bennerv d37167475c Add securityContext to ARO worker and master operator deployments as required by k8s v1.27.x 2024-02-02 11:22:45 -05:00
Tanmay Satam 9c6eab88f2
Do not exit NSG monitor when unable to get subnet (#3372) 2024-02-01 11:00:21 -07:00
Tanmay Satam 2a0ef0e1a1
Return error on certificateexpirationstatus monitor when IngressController is invalid rather than panic (#3371) 2024-01-31 13:41:55 +01:00
Nont 5d4995a8b1
Refactoring per comments in #3324 (#3358)
* Refactor to address the comments

* Remove InOrder tests
2024-01-24 14:42:17 -05:00
Nont 01cb9bbcfe Remove additional private preview logic for NSG. 2024-01-24 14:41:51 -05:00
Nont 88e12febf9 Delete the flag const 2024-01-24 14:41:51 -05:00
Nont 5d17746abf Delete AFEC flag behavior for preconfiguredNSG 2024-01-24 14:41:51 -05:00
Jory Horeman 3abcd30e5b
2023-11-22 stable API (porting 07-01-preview api) (#3300)
* adding 2023-11-22 stable

---------

Co-authored-by: b-jhoreman <b-jhoreman@microsoft.com>
2024-01-18 14:15:38 -07:00
Matthew Barnes 809041fe2e
Move default openshift version (#3094)
* api: Avoid referencing DefaultInstallStream in tests

* frontend: Avoid referencing DefaultInstallStream

The frontend's OpenShiftVersions change feed handler will record
the current default version for the rest of the frontend to use.

* monitor: Remove latestGaMinorVersion metric

The RP no longer has this information internally, so the metric
is no longer relevant.

* update_ocp_versions: Read versions from an environment variable

Read OpenShift versions and pull specs from an OPENSHIFT_VERSIONS
environment variable containing a JSON object. This data includes
the default OpenShift version for new installs that don't specify
a version.

This moves us toward eliminating hard-coded OpenShift versions in
pkg/util/version/const.go.

* cache_fallback_discovery_client_test.go: Hard-code version

I'm not sure what to do with this test.  Install stream data has
moved to RP-Config, so if the test is worth keeping then I guess
the oldest supported version will have to be hard-coded and kept
up-to-date.  But it probably won't be.

* version: Remove DefaultInstallStreams

DefaultInstallStream will remain for now, but it's ONLY for use by
local development mode until we can come up with a better solution.

---------

Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>
2024-01-18 13:20:03 -07:00
Amber Brown b4e8930830
Make env know which service component it is running (#3254)
* make env know which service component it is running

* regen mocks
2024-01-17 15:17:14 +11:00
Nont aee18fe570
Improve NSG monitoring by retrieving subnet only once (#3324)
* Improve NSG monitoring by retrieving just once

* Skip wp profiles if count is 0
2024-01-11 14:08:50 -07:00
Nont 5068e9d4e7
Add unit tests for ensureAccessTokenClaims (#3337) 2024-01-11 07:22:44 -07:00
Nicolas Ontiveros 477a0453c8
Add maintenance state for customer action needed (#3294) 2024-01-10 09:06:26 -07:00
Ankur Singh e7550b17ba
Fixing cert metric emitting condition param (#3284) 2024-01-10 08:22:03 -07:00
azoppiserpa 40e7987ef8
Fixing magic strings on operator flags (#3327)
* replacing usages of magic strings with flags from the subpackage

* removing the //todo comment regarding the magic strings

* replacing magic strings with operator constants

* move DefaultOperatorFlags to operator package, inject when needed
2024-01-04 15:59:24 +11:00
Jeff Yuan 5218c82775
Merge pull request #3319 from yjst2012/guardrails-add-user
added exempted user to guardrails
let the cleanup continue when failure occurs
iterate over all namespaces to find out if another gatekeeper is deployed
2024-01-04 10:13:13 +13:00
Caden Marchese 82f47281d4
Remove dnsmasq restart script for new clusters (#3313) 2024-01-03 10:00:30 -05:00
Daniel Ionel Bizau 4afe1d7339
disable gzip compression as a workaround (#3339) 2024-01-02 19:10:44 -05:00
Jeff Yuan 7bf277bd1d iterate over all namespaces to find out if another gatekeeper is deployed 2023-12-21 16:45:41 +13:00
kimorris27 3b2db9c121 Added space and fixed grammar in user-facing error messages 2023-12-20 15:56:59 -05:00
kimorris27 d7377e7d67 Capitalize `Azure` in customer-facing error message 2023-12-20 15:56:59 -05:00
kimorris27 adb1f0c4c8 Update attachNSGs cluster install step to retry if Azure complains that NSG is not yet ready 2023-12-20 15:56:59 -05:00
Amit af1f7ae2ac dynamic validation on CIDR 2023-12-19 11:24:41 -05:00
Tanmay Satam 23290e4f1b Include and serve static prometheus web UI in admin portal 2023-12-15 14:15:35 -05:00
Kipp Morris de1b399b6c
`az aro update` CredentialsRequest hotfix (#3325)
* If the CredentialsRequest isn't found, retry until timeout instead of immediately erroring out

* `ensureCredentialsRequest` upon every `az aro update`

* Add an E2E test for the `az aro update` scenario where the ARO
operator's CredentialsRequest has been deleted
2023-12-14 15:49:19 -05:00
Jeff Yuan 17ad4315a4 update test cases 2023-12-12 10:42:42 +13:00
Jeff Yuan 1b8adb22fd let the cleanup continue when failure occurs 2023-12-11 18:34:13 +13:00
Jeff Yuan 031270377d added exempted user to guardrails 2023-12-08 17:01:49 +13:00
Ben Vesel c3d127ad47
Use image digests for fluentbit images (#3316) 2023-12-07 09:08:31 -05:00
Andrew Denton 91fd9d00ac
Enable 4.13.23 as an install target instead of 4.13.16 (#3314) 2023-12-05 17:29:26 -05:00
Andrew Denton cb6ab10d16
Enable 4.13.16 as an install target (#3312) 2023-12-05 14:57:13 -05:00
Ben Vesel 232d3bc536
Move to correct mdsd image (#3308) 2023-12-01 16:11:41 -05:00
Steven Fairchild 86deb391a8
MDM Image Bump - ARO-4792 (#3297)
Update MDM image to 2.2023.1118.1225-d7e0d6-20231118t1338
Update MDSD image to mariner_20231109.1
2023-12-01 13:40:15 -05:00
Jeff Yuan 548ad5cb0c
Merge pull request #3232 from SrinivasAtmakuri/remove-IsLessThanMinimumDuration
etcd-renew - Add extra validation and remove IsLessThanMinDuration
2023-11-30 18:46:41 +13:00
Daniel J. Holmes (jaitaiwan) 6b15494718 fix(admin): Ensure check on certData slice 2023-11-30 13:51:18 +10:00
Tony Schneider 9b92b4f79b
Admin action to delete a cluster managed resource (#3286)
* add ResourceDeleteAndWait to azureactions

* add delete resource admin action and frontend routing

* add helper functions for lb config manipulation

* refactor azure actions
- moves resource delete code to separate file
- adds loadbalancer client to handle deleting FrontendIPConfiguration
- updates ResourceDeleteAndWait to handle deleting FrontendIPConfigurations
- adds DeleteByIDAndWait to features/resources client

* add e2e tests

* fix imports and add license headers

* cleanup / fix lint

* add command example to doc

* rename to "managed" resource id

* change query param to camel case

* use var group instead

* return error as adminReply already wraps in CloudError

* fix missed camelCase of query param

* use regex to match frontend ip configurations

* remove focus

* add deny list to prevent deleting PLS and Storage

* fix mixed import

* use fake pls name to prevent accidentally deleting e2e cluster pls

* fix test

* add PE to deny list
2023-11-29 17:09:56 -05:00
bennerv 72cad158a9 Remove version 4.11.26 as it's affected by installation bugs 2023-11-29 13:51:28 +00:00
Nicolas Ontiveros e4eea7f7a8
Use an enum for cluster maintenance states (#3230) 2023-11-28 16:25:49 -05:00
Kipp Morris 9a9edacf6b
Update ARO operator Azure auth scheme to use a DefaultAzureCredential (#3274)
* Update the cluster authorizer to use a DefaultAzureCredential

* Update the ARO operator to set and use DefaultAzureCredential via env vars

* Add a CredentialsRequest to the ARO operator deployment

* Restart the ARO operator upon `az aro update`

* Removed now unused AzCredentials function

* Changed ARO operator deployment wait time during `az aro update` from
  20 minutes -> 5 minutes

* Refactor CliWithApply to generalize to different object types

* Updated Restart in pkg/util/kubernetes to use server-side apply
* Updated Restart in pkg/operator/deploy to only return an error after
  at least attempting to restart all of the deployments passed in

* E2E test for ARO operator master deployment's restart upon cluster update

* Wait for the ARO operator's CredentialsRequest to be reconciled before
restarting
2023-11-28 10:45:00 -05:00
Srinivas Atmakuri d71061995a test case validateEtcdCertsRenewed func 2023-11-28 10:51:39 +05:30
Amber Brown afa28a3789
fix not returning arm template errors (#3288) 2023-11-27 13:52:49 -05:00
Matthew Barnes 966f7a176e graph: Allow gzip compression in requests
Microsoft fixed the Graph service regression that broke
kiota-http-go, so revert the workaround.
2023-11-27 10:30:47 -05:00
Amber Brown 4a1ea4074e
Remove journald fields that aren't needed (#3283) 2023-11-22 10:57:26 +11:00
Tanmay Satam 06f78b75ce
Watch MachineSets for worker subnet changes instead of Machines (#3280)
https://issues.redhat.com/browse/ARO-4632
2023-11-21 10:24:25 +11:00
Nont e7f514086d
Migrate documentdb client from sdk track 1 to track 2 client (#3255)
* Create documentdb track 2 client and mockgen

* Replace track 1 documentdb with track 2

* Refactor per comments

* Delete unused client

* Fix generated env mocks
2023-11-17 15:34:17 -05:00
Nicolas Ontiveros a85b8d011e
FedRAMP TLS 1.3 Compliance (#3285) 2023-11-17 14:10:13 -05:00
Tanmay Satam f421bac527 Rename installLogsForDeployment to indicate latest deployment 2023-11-17 11:31:04 -05:00
Tanmay Satam e9e3c7a89d Implement generic handling for Hive ProvisionFailed=True conditions w/ Azure policy errors 2023-11-17 11:31:04 -05:00
Tanmay Satam ae5c57b6a8 Add hack script to generate hive install-log-regexes.yaml from ARO definition 2023-11-17 11:31:04 -05:00
Steven Fairchild ac7cf2f35c
Fix CVE-2023-45857 npm vulnerability found in audit (#3279)
https://github.com/advisories/GHSA-wf5p-g6vw-rhxx
2023-11-14 14:10:52 -05:00
tschneid 19703fefc6 move default IP creation outside of newPublicLoadBalancer method 2023-11-13 12:49:40 -05:00
tschneid 3d7c789c13 remove outdated comment 2023-11-13 12:49:40 -05:00
tschneid d37aae372a add unit tests for multiple IP aro create 2023-11-13 12:49:40 -05:00
Daniel Ionel Bizau 0b7096dc43
add back cli-domain-from-installer-image (#3277)
Overriding the npm security alert to be able to hotfix the RP
2023-11-10 17:10:53 +01:00
Amber Brown 19d053d2fe
Move genevalogging operator controller files into separate .confs with goembed (#3276) 2023-11-10 14:28:43 +11:00
Matthew Barnes d25e7f29d9
graph: Add HTTP tracing option for MS Graph requests (#3269)
* graph: Add HTTP tracing option for MS Graph requests

Debug feature enabled by ARO_MSGRAPH_TRACE environment variable.

* graph: Temporarily disable gzip compression in requests

---------

Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>
2023-11-08 15:12:42 -05:00
Matthew Barnes 67d06bd655
Propagate errors of GenevaLogging controller (#3221)
* genevalogging: Use AROController as base type
* genevalogging: Split off business logic for uniform error handling
* genevalogging: Add condition for controller status
* genevalogging: Check status conditions in unit tests

---------

Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>
2023-11-08 09:37:03 -07:00
Amber Brown e278fd6891
Add some more golangci-lint linters and fix the issues they find (#3234) 2023-11-08 10:45:17 +11:00
tschneid c53b31c73d add cluster check alert 2023-11-07 11:15:37 -05:00
tschneid 590cc5297d add muo alerts 2023-11-07 11:15:37 -05:00
Ben Vesel 35a5f16464
Bump Hive Version + Minimal Install (#3260)
* fix: match existing hive-config with production hive-config

* bug: bump hive version to use minimal install version and resolve vulns

* Remove oc-cli domain annotation
2023-11-02 12:56:26 -04:00
Tanmay Satam a0cc0eec38
Patch Hive ClusterDeployment rather than update it (#3244) 2023-11-01 12:25:27 -04:00
Amber Brown 9100c81eb0
clean up some of the monitor emitting metrics for aro operator statuses and prometheus (#3235) 2023-11-01 10:10:44 -04:00
Goutham Muguluvalli Niranjan 0dd1ec9300
Add defaults and update k8s dev version (#3245)
* add defaults and update k8s dev version

* update default of outbound_type

* no default set for disk encryption

* nit: fix style

---------

Co-authored-by: gniranjan <gniranjan@microsoft.com>
2023-11-01 10:02:06 -04:00
Ben Vesel af5f8260e8
bug: Keep cli-domain-from-installer-image until we use minimal hive version (#3257) 2023-10-31 13:35:10 -04:00
Ben Vesel c473088c75
Hive Minimal Install Annotation on ClusterDeployment (#3243)
* Add annotation for hive minimal install

* Remove cli-domain annotation and use installAttemptsLimit instead of annotation
2023-10-30 11:39:16 -04:00
Amber Brown bb8097c92f
Skip failing pki test for now (#3249) 2023-10-30 13:59:56 +01:00
Srinivas Atmakuri 7e09f87c45 Add extra validation and remove IsLessThanMinDuration
Added extra validation to verify if etcd certificates are renewed
Also, removed isLessThanMinDuration code block so we can
even renew clusters whose expiry is more than 6 months duration
2023-10-30 12:53:55 +05:30
Ben Vesel debb5ae92b
Remove 4.10 as an installation target (#3238) 2023-10-27 10:38:23 -04:00
Nont a6ce8b6e94 Add additional metrics 2023-10-27 09:51:40 -04:00
Nont 54bfa2b1ee Fix azure client for multiple clouds 2023-10-27 09:51:40 -04:00
Nont effcd0beab Fix make generate 2023-10-27 09:51:40 -04:00
Nont 55ac567aee Fix unit tests 2023-10-27 09:51:40 -04:00
Nont ab05ba74c4 Add a new azureclient for subnet from az-sdk 2023-10-27 09:51:40 -04:00
Nont c6285803fa Make all monitors into go routines 2023-10-27 09:51:40 -04:00
Nont bfd8f6497d Consolidate into consistent monitor emitters 2023-10-27 09:51:40 -04:00
nont 9c30a16b1c Plug NSG monitoring to mon.WorkOne 2023-10-27 09:51:40 -04:00
nont ba330d8007 Add NSG monitoring 2023-10-27 09:51:40 -04:00
Nont 9bb9b518d8
Fix CheckAccess unmarshal bug (#3241) 2023-10-27 12:08:12 +02:00
Ben Vesel f44f544e9f
Default installation target to 4.12 (#3237) 2023-10-26 15:31:32 -04:00
Tanmay Satam 4c89c89171
Update MDM, MDSD, Fluentbit images (#3233)
* Update MDM,MDSD,Fluentbit versions to latest

* Update generated deployment artifacts
2023-10-26 14:06:54 -04:00
Caden Marchese 4e00e297ef
Return actionable service principal errors to the user (#3231)
* revert changes made in #3222 to ensureAccessTokenClaims

* after timeout, return any actionable errors to the user

* combine / improve error messages

* still log err in RP logs if not nil

* fix nil pointer
2023-10-25 15:29:25 -04:00
tschneid 24d7966271 ensure API Server public IP is added to DependsOn 2023-10-25 11:16:29 -04:00
tschneid e6816b8fa6 add managed ips to arm templates 2023-10-25 11:16:29 -04:00
Arris Li 8019a0ee5e
feat: add deny pull secret deletion policy (#3228) 2023-10-25 11:38:39 +11:00
Ben Vesel 4e8877545b
Fix panic on addressPrefix vs addressPrefixes on subnet (#3223) 2023-10-24 18:26:42 -04:00
Matthew Barnes 3273b70b33
BUG: Fix golang client PUT for GA (#3204)
* swagger: Use struct tags to specify read-only fields

* immutable: Handle `swagger:"readOnly"` tag during validation

The left-hand operand (v) should omit read-only struct fields;
i.e. the field should always be the zero-value for its type.

* api: Add ExternalNoReadOnly method to OpenShiftClusterConverter

ExternalNoReadOnly removes all read-only fields from the external
representation. This is necessary when patching a cluster document;
read-only fields must be omitted from the external representation
in order to pass static validation.

* e2e: Exercise using PUT to update managed outbound IPs

---------

Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>
2023-10-24 16:10:48 -06:00
Tanmay Satam 45a0899046 Remove requeueAfter during cluster upgrades as it is unnecessary 2023-10-23 15:56:45 -04:00
Tanmay Satam 4f8c591a5b Ensure DynamicHelper preserves Status on MachineHealthCheck updates 2023-10-23 15:56:45 -04:00
Tanmay Satam 276f924e09 Add k8s paused annotation to ARO MHC during upgrades 2023-10-23 15:56:45 -04:00
Caden Marchese cd81ce9073
Return errors from AuthorizationRetryingAction if timeout is reached (#3222) 2023-10-23 16:21:30 +02:00
Tanmay Satam 365413c357
Add routing to admin portal (#3164)
* Add React Router library

* Use React Router for search params

The existing functionality using this appears to be non-functional, but its behavior
is preserved.

* Use cluster resourceID in route for details modal

* Use URL routing to handle Cluster Details navigation

* Route all admin portal frontend subroutes to index.html

* Add handling to portal login redirect to preserve original path

* Update E2E tests for new admin portal routing

* Replace OverviewComponent with new implementation

- Use FluentUI DetailsList for contents
- Always display all properties, even if value is not present
- Modify E2E test to check each individual property

* Build frontend artifacts
2023-10-18 16:43:38 -04:00
Amit Arora 19f4e0697e
Additional Waiting for API Server to become ready (#3220)
* waiting for API Server to become ready
2023-10-18 16:02:07 -04:00
Mohammed Safwan Aslam Kazi 66c113a541
Propagate errors of ARO MachineHealthCheckController to ARO Operator (#3177)
* Propagate errors of ARO MachineHealthCheckController to ARO Operator
2023-10-17 16:26:31 -06:00
Ben Vesel 6cdf2c3281
Match AKS development version with production (#3217) 2023-10-17 16:03:44 -04:00
Tanmay Satam 4ae11b1c41
Enable Double (infrastructure) Encryption on ARO-provisioned storage accounts (#3216)
* Upgrade Microsoft.Storage API Version to 2019-06-01

* Explicitly set encryption Enabled=True on all storage account services

This is not strictly necessary, as the Storage API will default these to True.
This change is just to reconcile expected with actual.

* Update generated deployment assets
2023-10-17 11:26:50 -06:00
Tony Schneider 92ca0e4c1f
Add e2e tests for multiple ips per loadbalancer (#3183)
* add azureclient for 2023-07-01-preview
* add UpdateAndWait to oc azure client
* add e2e test for managed outbound IPs
2023-10-17 11:25:34 -06:00
Steven Fairchild 1844f2dc31 Revert workaround for mdsd bug
Remove workaround that reset log permissions to syslog.
Once Microsoft fixes race condition in https://dev.azure.com/msazure/One/_workitems/edit/12512148 this PR will be ready to merge.
2023-10-17 10:02:10 -04:00
Christoph Blecker f131c2564e
Merge pull request #3211 from bennerv/network-acl-fix-for-storageaccounts
Network acl fix for storageaccounts on Create or AdminUpdate
2023-10-14 14:48:35 -07:00
kimorris27 8cba7164c8 Fix some nits and verify service endpoint location before reconciling vnet rules 2023-10-12 18:09:10 -05:00
kimorris27 8abf6a445c Fix storage account reconciliation, take two
I removed the check of the gateway status because the attempt to add
vnet rules will fail if there are no service endpoints on the subnet
anyway.
2023-10-12 16:57:37 -04:00
kimorris27 b75bc878d1 Fix storage account controller vnet rule additions
Fixed such that we continue to add vnet rules for cluster subnets if Microsoft.Storage service endpoint present on subnet(s)

Some context for this fix:

"""
Unbeknownst to us, as of OCP 4.11, the default StorageClass object for azurefile-csi was automatically selecting the cluster Storage Account we use to store things like ignition config to also back all PVs created with this storage class, that is, customer data.

As of this RP Release / PUCM, ARO SRE added functionality to ARO to disallow requirements on Service Endpoints on cluster subnets (seen as an attack vector for some highly security-conscious customers), and in doing so, we added some NACL rules to not allow nodes to access the cluster storage account, because we figured from a service perspective it was unused.

These two things are now in conflict. We now have some customers who used the default storage class for Azure File, and can no longer mount volumes due to our new NACL configuration. We need to modify our stack to accommodate these customers (estimated to be a few dozen).
"""
2023-10-12 16:57:37 -04:00
bennerv d346dd760c Add NACLs for OCP Subnets iff service endpoints set on subnets 2023-10-12 16:57:36 -04:00
bennerv e76ad78a55 Helper function to return functions which have a specific service endpoint enabled 2023-10-12 15:52:14 -04:00
Matthew Barnes ca8a3a0c17
api: Add MissingFields to OpenShiftVersionProperties (#3208)
Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>
2023-10-12 08:50:03 -04:00
Matthew Barnes cf357559a8
api: Add Default field to OpenShiftVersionProperties (#3202)
Co-authored-by: Matthew Barnes <mbarnes@fedorapeople.org>
2023-10-11 10:34:55 -04:00
Tanmay Satam cfc546e798
Fail cluster deletion if RP is not authorized to delete managed resource group (#3207)
* Fail cluster deletion if RP is not authorized to perform resource group delete

* Refactor deleteResourcesAndResourceGroup to be simpler and more testable
2023-10-10 20:46:51 -04:00
Ben Vesel 2703a95f8a
Fix Construction of Monitor for Clusters with kube-apiservers Down (#3201) 2023-10-05 14:56:59 +02:00
gniranjan b1109af1cd Merge branch 'master' of https://github.com/gouthamMN/ARO-RP 2023-10-04 15:14:21 -05:00
gniranjan 29a7188502 validate workerProfilesStatus to be nil only during create 2023-10-04 15:13:51 -05:00
Christoph Blecker cef6b015a6
Add customer data regression tests for get/create/update/delete k8s admin api (#3198) 2023-10-03 20:14:59 -04:00
Ben Vesel 36df0152cf
Prevent memory leaks with DialContext usage and new k8s client creation (#3090) 2023-10-03 18:47:49 -04:00
Nicolas Ontiveros 082e9a5a0f
Set maintenance task to "" after PUCM pending operation (#3197)
* Fix pucm pending

* unit test

* unit test

* Fix response

* Fix maintenance task

* Fix comment

* Refactor

* undo refactor

* fix unit test

* Fix doc

---------

Co-authored-by: Nicolas Ontiveros <nicolas.ontiveros@microsoft.com>
2023-10-03 15:43:35 -04:00
Ben Vesel 54e90e8895
Convert the pucm pending boolean in the AdminAPI (#3196) 2023-10-02 12:52:15 -04:00
Jeff Yuan 2fa4d56574
Merge pull request #3191 from yjst2012/remove-openshift-config
removed openshift-config as protected namespace as suggested
2023-09-29 10:16:05 +13:00
Nicolas Ontiveros 85e9ab32de
Don't Put Cluster in Update State after PUCM Pending (#3194)
* Fix pucm pending

* unit test

* unit test

* Fix response

---------

Co-authored-by: Nicolas Ontiveros <nicolas.ontiveros@microsoft.com>
2023-09-28 15:23:19 -04:00
Spencer Amann 6a9a817e56
retry operations on pull-secret when receiving a conflict (#3195)
* retry operations on pull-secret when receiving a conflict

* retry the entire pull-secret rotation

* use apply instead of update

* uses aro-rp fieldManager and forces the apply

* refactors to reduce to only one apply for pull secret
2023-09-28 14:30:13 -04:00
Jeff Yuan 618366d94a removed host mount policy as suggested cont. 2023-09-27 18:50:36 +13:00
Jeff Yuan ca41d3201c removed protection under openshift-cluster-csi-drivers ns as suggested 2023-09-27 16:59:20 +13:00
Jeff Yuan 29de88449f removed host mount policy as suggested 2023-09-27 16:57:19 +13:00
Jeff Yuan 98fd988491 remove protection over machine obj 2023-09-27 11:36:34 +13:00
Jeff Yuan c6996d8cf3 removed openshift-config as protected namespace as suggested 2023-09-25 18:04:11 +13:00
Ben Vesel c45d6cc21f
Revendor go-cosmosdb to fix database.EmitMetrics() json decode error (#3190) 2023-09-24 09:31:09 -04:00
Hilliary Lipsig a7d026270d
Merge pull request #3189 from Azure/revert-3171-412-default
Revert "Default to 4.12 for new installations"
2023-09-21 16:35:22 -07:00
Hilliary Lipsig d29d6a1eb7 Revert "Default to 4.12 for new installations (#3171)"
This reverts commit 1d6d144493.
2023-09-21 14:28:52 -07:00
Spencer Amann 33de1ab921 fallback to new pull secret if we can't merge 2023-09-21 16:15:59 -04:00
Spencer Amann 6750ea5b36 attempts to merge original and new even when recreating 2023-09-21 16:15:59 -04:00
Spencer Amann aeea7eb6e0 continue forward if openshiftConfigSecret is missing 2023-09-21 16:15:59 -04:00
Spencer Amann ca6c4cd6af merge existing data with new token 2023-09-21 16:15:59 -04:00
Spencer Amann 9ea5c627ce adds retries to delete/create/update on the openshift-config pull secret 2023-09-21 16:15:59 -04:00
Spencer Amann 584f7735df refactor to reduce use of conditions 2023-09-21 16:15:59 -04:00
Spencer Amann 1f38c932c8 rotate the openshift-config pull-secret on acr token rotation 2023-09-21 16:15:59 -04:00
Ben Vesel b123a45f25
Fix WorkerProfilesStatus PUCM Bug (#3184) 2023-09-21 16:14:07 -04:00
Tony Schneider e23c4d3e94
remove UserDefinedRouting AFEC feature flag (#3180) 2023-09-20 16:55:08 -04:00
Ben Vesel 019c2fb11b
Fix bad logic comparison against an already existing gateway record (#3182) 2023-09-20 13:35:56 -04:00
Jory Horeman 2a16a3634c
upstreaming changes required by upstream CI (#3117)
Co-authored-by: b-jhoreman <b-jhoreman@microsoft.com>
2023-09-20 10:15:03 -06:00
Ben Vesel bfddd185ce
Return error if we fail to delete hive resources during deletion (#3179) 2023-09-19 16:54:40 -04:00