Commit Graph

522 Commits

Author SHA1 Message Date
Ramiro 666f36cfb8
CNS - Ensuring no stale NCs during NNC reconcile (#2192)
* ensuring no stale ncs during nnc reconcile

* only save state if mutated

* ensuring we only remove stale ncs if none of their ips are assigned
2023-09-07 20:39:08 -07:00
Evan Baker 9ace4b7593
change total IPs and add secondary IP metric (#2172)
* change total IPs and add secondary IP metric

Updates the Total IPs metrics to include the NC
Primary IP in the total. Adds a Secondary IPs
metric which holds the value that the Total IPs
previously held: NC Secondary IPs known to CNS
which could be used by Pods.

Signed-off-by: GitHub <noreply@github.com>

* update help wording on IPAM metrics

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* reword PrimaryIP metric help

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-08-28 12:45:36 -07:00
Evan Baker 1adf24adb5
Add MTPNC reconciler for cache population in Swift V2 (#2164)
add mtpnc watcher

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-08-28 18:50:01 +00:00
Quang Nguyen e767b150cd
feat: CNS writes SWIFT conflist (#2110)
* cns writes swift conflist

* gofumpted

* var naming

* add swift scenario to switch
2023-08-23 16:04:05 -04:00
Evan Baker 06e3877cdf
Upgrade controller-runtime (#2162)
* upgrade controller-runtime

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update build tools and regen crds

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix import conflict

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-08-23 08:58:52 -07:00
Evan Baker 1b2a04ab6a
feat: stub CNS Pod watcher (#2112)
* feat: cns watches pods

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* indirect pod reconcile for more dynamic behavior

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-08-21 09:56:43 -07:00
Evan Baker ba689eb932
add swift v2 config based on node label (#2144)
* add swift v2 config based on node label

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add tentative swiftv2 label

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-08-17 12:09:56 -07:00
rsagasthya 0603722598
Updated log line for IP Usage metrics. (#1961)
Added the pendingprogramming and pendingrelease count in the log line.
2023-08-09 19:22:26 +00:00
Ramiro c8005ed94b
CNS - consider multiple IPs for a pod in reconciliation after restart (#2079)
* modifying reconcile function to take multiple ncs instead of one. we need this because for reconciliation in dual stack cases both IPs for a pod which can come from distinct ncs must be added at the same time

* adding comments, and renaming functions and types, to make the intent clearer

* adding some dummy values to cns.NewPodInfo invocations in tests, instead of empty strings since we need distinct interface ids

* adding a basic test for dual stack reconciliation

* adding validation for multiple ips for the same ip family on the same pod, which is not allowed

* changing direct use of interface id to pod key, which is better for reconcile flows using information from kubernetes instead of cni

* fixing comments to use host network terminology instead of system pod

* improving error message on duplicate ip from cni found; improving readability of error handling on ip parsing

* only checking for pod ips that are actually set on a pod
2023-08-02 01:49:21 +00:00
Ramiro 986a1f097f
CNS - fix locking in AssignDesiredIPConfigs (#2080)
moving lock to before accessing data it is supposed to protect

Co-authored-by: rjdenney <105380463+rjdenney@users.noreply.github.com>
2023-07-27 14:35:55 -07:00
Matthew Long cc251ca58a
fix: assume invalid semver CNI has the required dump state command (#2078)
2023-07-26 09:52:27 -07:00
Evan Baker 97fdf81f89
fix: use cached ctrlruntime client in IPAM pool monitor (#2043)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-07-21 11:35:39 -05:00
ZetaoZhuang 6325924bf1
add EnableAZR flag to make home az goroutine as an opt-in feature (#2067)
add EnabledAZR flag to make home az goroutine as an opt-in feature
change PopulateHomeAzCacheRetryIntervalSecs default to 60

Co-authored-by: rjdenney <105380463+rjdenney@users.noreply.github.com>
2023-07-20 21:32:35 +00:00
rjdenney b1a63a9112
adding CNS fix for requesting IPs with 0 NCs (#2063)
* adding fix for requesting IPs with 0 NCs

* addressing comments

* error change

* fixing comments
2023-07-20 10:27:58 -05:00
Jaeryn 33890dd704
fix: skip reserved IPs (#2060)
Co-authored-by: Jaeryn <tsch@microsoft.com>
2023-07-17 20:05:59 +00:00
Jaeryn dc37cef19e
fix: return new error when no ncs found in nnc crd (#2061)
Co-authored-by: Jaeryn <tsch@microsoft.com>
2023-07-17 17:29:48 +00:00
Evan Baker c48c3e17a4
fix: CNS init must have an NC (#2030)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-06-23 00:47:38 +00:00
Evan Baker 21b5bbeaf5
fix: reconcile initial state from CRD regardless of existing podInfo (#2022)
* fix: reconcile initial state from CRD regardless of existing podInfo

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix: add test for no state and pending release in NNC

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix: add test for restoring state and pending release in NNC

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-06-22 14:11:57 -05:00
tamilmani1989 d4df70838e
[Fix][CNS] InterfaceID should be set to containerID instead of interfaceName if CNS managing state (#2023)
* InterfaceID should be set to containerID instead of interfaceName if CNS manage endpoint state is enabled

* add log in GetExistingIPConfig function to debug

* Fix UT (set containerid as pod interface id)
lint fix
2023-06-20 11:33:37 -07:00
Evan Baker 6fd545a5e6
fix: implement String for logging of PodInfo and IPConfig (#2020)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-06-19 11:55:47 -05:00
Rajvi fa2de6d579
set mellanox reg key (#1768)
2023-06-13 22:50:03 +00:00
Paul Johnston f687dee67e
'overlay' conflist generator for CNS (#2010)
2023-06-12 21:06:54 +00:00
Paul Yu a68d6382f4
add v6 port mapping policy to dualstack overlay (#1989)
* cosmic v6 port mapping policy fix
2023-06-08 03:27:56 +00:00
Paul Johnston 84063469ea
CNS to be able to generate dualstack overlay CNI conflist (#1981)
2023-06-06 16:44:32 -07:00
Saksham Mittal acfefd2efa
Initial getHomeAZ 404 changes (#1994)
* initial getHomeAZ 404 changes

* treat 404 as success

* address comments
2023-06-06 15:20:57 -07:00
aggarwal0009 ba18fc779c
[Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)
* Read secondary CIDRs from VnetScale NNC

* fix comment

* update comment

* For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows

* fix/add import

* address pr comments

* add comments

* address pr comments

* wrap error

* fix typo

* fix UT
2023-06-02 09:01:59 -07:00
Evan Baker 727f8df2b2
Always use 0 for NC version in Overlay (#1979)
always use 0 for NC version in overlay

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-06-01 21:05:22 +00:00
Matthew Long e620685c7d
fix: reserve 0th IP as gateway for overlay on Windows (#1968)
* fix: reserve 0th IP as gateway for overlay on Windows

* fix: allow gateway to be updated
2023-05-23 10:31:01 -05:00
Evan Baker 40022dab86
add retry to nnc update during scaledown (#1970)
* add retry to nnc update during scaledown

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test for panic in pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-05-22 10:01:09 -07:00
Evan Baker 92abf0306b
fix nnccli log message (#1955)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-05-18 13:14:35 -05:00
Paul Yu d711636bda
Ncid upper case hotfix (#1943)
* NCID uppercase hotfix
2023-05-17 16:34:04 -04:00
Paul Yu cd137eb0af
Cnsclientlog fix (#1960)
* cnsclient log fix
2023-05-17 13:01:48 -04:00
Ramiro a14eb1f769
CNS - Remove vnet join cache check during NC publish (#1946)
removing vnet join cache check during nc publish, such that join is always attempted during publish. this is the classic behavior circa cns v1.4.37
2023-05-05 13:08:45 -05:00
Matthew Long 7f223f2b49
fix: generate cni conflist if NCs already exist in state (#1940)
* fix: generate cni conflist if NCs already exist in state

* fix: lint

* fix: data race in test

* fix: lint in tests
2023-05-04 16:29:25 -07:00
bohuini b9861e2582
Add log to happy path (#1929)
Add log to happy path for NC refresh
2023-05-04 19:00:41 +00:00
rsagasthya 41f451a1e3
Added IP Usage metrics at Rest server. (#1932)
* Added IP Usage metrics at Rest server.
2023-05-02 20:21:55 +00:00
Paul Yu fec45f9e18
fix CNS client (#1920)
* case insensitive fix

* cnsclientfix Do()
---------

Co-authored-by: paulyu <paulyu@microsoft.com>
2023-04-18 18:28:48 +00:00
rjdenney cf5d99b5ec
CNS ipam and client changes to address comments (#1913)
* Update for previous comments

* adding tests

* fix linter issues
2023-04-14 19:33:13 +00:00
Camryn Lee e792ef5705
CNS writes Cilium Conflist (#1901)
* implement cilium conflist generator

* add cilium conflist for generator test

* update generator-windows

* cleaning up generator constants
2023-04-11 15:41:54 +00:00
ZetaoZhuang 5f89e378ba
add logs for home az monitor (#1899)
* add logs for home az monitor
2023-04-11 02:15:48 +00:00
ZetaoZhuang 76822f4de5
validating home az value (#1902)
validate home az value
change default PopulateHomeAzCacheRetryIntervalSecs value from 15 to 30
2023-04-10 22:07:34 +00:00
ZetaoZhuang aec0bf9d67
fix: print DeleteNetworkContainerRequestBody correctly in log (#1896)
* fix: print DeleteNetworkContainerRequestBody correctly in log

* align String() with others
2023-04-07 20:10:53 +00:00
Ramiro 6048ba2b97
Adding Pprof register method (#1885)
* adding pprof register method that can be invoked in multiple paths

* fixing lint issues
2023-04-05 15:40:15 +00:00
rjdenney 3e06a07ca2
Dualstack CNS Changes (#1773)
1. Enables CNS to handle multiple NCs in NNC.
2. Adds new APIs that allows multiple IPs to be requested and released.
3. This change is needed for dualstack overlay
---------

Co-authored-by: Tim Raymond <traymond@microsoft.com>
2023-04-04 09:00:02 -07:00
Evan Baker 18d4c70e55
timeout after 15 minutes waiting for the NNC reconciler to start (#1861)
* timeout after 15 minutes waiting for the NNC reconciler to start

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* log error and retry instead of crashing out

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add metric for nnc reconciler failed to start

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-03-31 11:11:26 -07:00
Evan Baker 4dfd97c274
update to go1.20 (#1781)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-03-29 10:53:21 -07:00
Evan Baker ae8a11c7c8
add metric for tracking failure to start the controller-runtime manager (#1860)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-03-21 09:20:26 -07:00
Evan Baker 1d390a1f75
remove vendor again (#1835)
* remove vendor (again)

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update mockgen usage for novendor

Signed-off-by: GitHub <noreply@github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Signed-off-by: GitHub <noreply@github.com>
2023-03-06 12:14:37 -06:00
Paul Yu 1c854b470d
Allow DNC to create NCs on a node with duplicate orchestrator context -- CNI & CNS (#1625)
2023-03-03 22:51:12 -05:00
Evan Baker 21d2355889
instantiate a default cns config if loading fails (#1829)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-03-02 19:32:16 +00:00
Saksham Mittal 2a9b132434
Include body for unpublishNC calls to support AZR (#1826)
* initial delete NC changes to include body

* initial delete NC changes to include body

* lint error

* initial delete NC changes to include body

* initial delete NC changes to include body

* lint error

* test change

* test change

* add comment
2023-03-01 13:44:32 -08:00
Evan Baker 358de20d2c
Add multiplat Windows 2019 and 2022 image support to tooling and pipelines, use for CNS/NPM (#1820)
* build: add ws2019 cns image to build pipeline

* add windows2019 build pool

* fix: npm pipelines

* fix: npm pipelines

* update multiplat build process for winver flavors

* optional buildx push for npm cyclonus

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Matthew Long <61910737+thatmattlong@users.noreply.github.com>
Co-authored-by: Matthew Long <Matthew.Long@microsoft.com>
2023-03-01 00:25:00 +00:00
Ramiro c8d2c1f576
CNS - Wireserver "proxy" (#1825)
* adding wireserver proxy

* using proxy for publish

* injecting dependency. refactoring unpublish nc

* default wireserver ip config

* setting content type

* better printing of requests and responses

* fixing unit tests
2023-02-28 12:20:07 -05:00
tamilmani1989 96835dcc76
CNS in-mem cache to use containerid as key if manageendpointstate is enabled (#1811)
update podInfo key as containerid instead of name and namespace
2023-02-21 12:04:47 -08:00
Timothy J. Raymond 48c7a149e5
Fix incorrect 200 for a 401 from NMAgent (#1799)
In scenarios where the subnet token does not match (leading to a 401
from NMAgent), CNS returns a 200 for the PublishStatusCode. This is
incorrect, and a 401 should be returned instead. This leads clients of
CNS to take incorrect action on, what they believe to be, a successful
response.
2023-02-14 09:48:35 -08:00
Matthew Long df431437b1
fix: multitenantnetwork reconciler should check errors correctly (#1791)
2023-02-10 00:26:21 +00:00
shchen da56b37a65
Fix the NC ID error when get container (#1767)
* Fix nc id mismatch

* Fix the nc ID prefix issue
2023-02-07 07:03:45 -08:00
Matthew Long 36757fefc2
feat: allow the CNI conflist generation settings to be configured via… (#1765)
* feat: allow the CNI conflist generation settings to be configured via cmd line args

* make env var opt-in
2023-02-07 01:30:22 +00:00
shchen ddb5954351
nmagent get nv version list api V2 refactor (#1744)
* tmp commit for onboarding nma v2

* Remove test output file

* Remove unnecessary code when CNS onboard get nc version list without
token

* tmp commit to fix getnc version tests when onboarding nc version api v2
from nmagent.

* Fix the unit test for nmagent v2 api change.

* Fix unit test TestGetNetworkContainerVersionStatus

* Revert back to GetNCVersionF test.

* Roll back to get nc version api v1 for test.

* Continue revert back and store nc version url

* Onboard nmagent get nc version api v2.

* Address pr feedback of returning early and remove comment out code.

* Remove unnecessary ncVersionURLs and NCVersionRequest.

* Remove unnecessary variables.

* Update nmagent get nc version api v2 to v2 url

* Remove comment out code.

* Update nmagent get nc version list.

* Address feedback and fix golint

* Fix lint issue.

* Fix the remaining 2 lint issues.

* Revert back test error generation to address feedback.
2023-01-19 17:10:23 -06:00
Timothy J. Raymond 45108ae514
Fix incorrect HTTP status from publish NC (#1757)
* Fix incorrect HTTP status from publish NC

CNS was responding with an HTTP status code of "0" from NMAgent.
Successes are supposed to be 200. The C-style var block at the beginning
of publishNetworkContainer was the reason for this. During refactoring,
the location where this status code was set to a successful value of 200
was accidentally removed. Because the var block declared the variable
and silently initialized it to 0, the compiler did not flag this bug as
it otherwise would have. The status code has been removed from this
block and explicitly defined and initialized to a correct value of 200.
Subsequent error handling will change this as necessary.

Also, despite consumers depending on this status, there were no tests to
verify that the status was set correctly. Tests have been added to
reflect this dependency.

* Ensure that NMAgent body is always set

DNC depends on the NMAgent body being set for its vestigial functions of
retrying failed requests. Since failed requests will now be retried
internally (to CNS) by the NMAgent client, this isn't really necessary
anymore. There are versions of DNC out there that depend on this body
though, so it needs to be present in order for NC publishing to actually
work.

* Fix missing NMAgent status for Unpublish

It was discovered that the Unpublish endpoints also omitted the status
codes and bodies expected by clients. This adds those and fixes the
associated tests to guarantee the expected behavior.

* Silence the linter

There were two instances where the linter was flagging dynamic errors,
but this is just in a test. It's perfectly fine to bend the rules there,
since we don't expect to re-use the errors (they really should be
t.Fatal / t.Error anyway, but due to legacy we're returning errors here
instead).
2023-01-19 04:17:46 +00:00
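The first item of #1757 above describes a variable that was declared in a leading var block and therefore started at its zero value. A minimal Go sketch of that failure mode follows. The names `publishBuggy`, `publishFixed`, and `errPublish` are hypothetical and stand in for the real handler, which is not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errPublish is a hypothetical error used to exercise the failure path.
var errPublish = errors.New("publish failed")

// publishBuggy mirrors the shape of the bug: statusCode is declared in a var
// block and starts at 0. If the success assignment is lost in a refactor, the
// code still compiles because the variable is declared and used.
func publishBuggy(call func() error) int {
	var (
		statusCode int
		err        error
	)
	if err = call(); err != nil {
		statusCode = http.StatusInternalServerError
	}
	return statusCode // still 0 when call succeeds
}

// publishFixed defines the status explicitly as 200 and lets later error
// handling override it when needed.
func publishFixed(call func() error) int {
	statusCode := http.StatusOK
	if err := call(); err != nil {
		statusCode = http.StatusInternalServerError
	}
	return statusCode
}

func main() {
	ok := func() error { return nil }
	fail := func() error { return errPublish }
	fmt.Println(publishBuggy(ok), publishFixed(ok), publishFixed(fail))
}
```

A test that asserts on the returned status, as the commit message says was added, would catch the first variant immediately.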
ZetaoZhuang 063fe58a41
refactor and move the nmagentConfig code from cns to nmagent package. (#1723)
* refactor and move the nmagentConfig code from cns to nmagent package.
2023-01-10 23:00:03 -08:00
dependabot[bot] e86c188d46
deps: bump sigs.k8s.io/controller-runtime from 0.12.3 to 0.14.1 (#1749)
2023-01-11 00:37:46 +00:00
Paul Johnston 4e5530cc65
Fix broken NC publishing due to type mismatch (#1740)
2023-01-09 19:46:05 +00:00
Behzad Mirkhanzadeh 7b647be285
fix: repair windows cni lock issue (#1712)
* Moving the lock from InitializeKeyValueStore() function to restore/save functions to improve cni performance on windows.

* fix: use defer function to unlock statefile.

* fix: fixing the IPAM lock and defer func

* fix: Optimizing cni file lock by moving SetSdnRemoteArpMacAddress() on startup for CRD and MultitenantCRD mode.

* adding store lock on telemetry service start to avoid race condition on windows.
2023-01-07 08:03:09 +00:00
ZetaoZhuang e6eadd3379
feat: expose getHomeAzInfo api in cns to retrieve node home az infos from NMAgent (#1642)
2022-12-08 08:23:41 -08:00
Evan Baker 353b7e01c6
fix: cns to use controller runtime (cached) clients in reconcilers (#1668)
* update crd clients to accept existing ctrlcli core clients

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix cns to use managed cached clients for reconcilers

Signed-off-by: GitHub <noreply@github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Signed-off-by: GitHub <noreply@github.com>
2022-12-05 17:55:45 -08:00
Timothy J. Raymond 57120ca41f
Fix failure to start CNS when setting WireserverIP (#1707)
There was insufficient coverage over cases involving different
permutations of the "WireserverIP" configuration option. Consequently,
there were instances where reasonable values for this option caused CNS
to fail to start.

This moves the logic for transforming the CNS configuration into
configuration suitable for the NMAgent client into a method off the
CNSConfig. It also permits adding coverage over different scenarios that
are likely to emerge.
2022-11-30 20:09:53 +00:00
Matthew Long 7b91752d10
feat: cns writes cni conflist (#1702)
* feat: add cni conflist generator for v4 overlay scenario

* feat: use atomic fs operations

* fix: use same directory as temp dir since /tmp is a tmpfs
2022-11-29 04:56:08 +00:00
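The "atomic fs operations" and "same directory as temp dir" items in #1702 above point at a common write-then-rename pattern. A minimal Go sketch under that assumption follows. `writeFileAtomic`, `demo`, and the file name `10-example.conflist` are hypothetical and are not taken from the CNS source.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper. It writes data to a temporary
// file in the destination's own directory and then renames it into place.
// Keeping both paths on one filesystem lets the rename replace the target in
// a single step; a temp file under a tmpfs /tmp would sit on another one.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return fmt.Errorf("creating temp file: %w", err)
	}
	// Clean up the temp file on failure; after a successful rename this
	// call finds nothing to remove and its error is ignored.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return fmt.Errorf("writing temp file: %w", err)
	}
	if err := tmp.Close(); err != nil {
		return fmt.Errorf("closing temp file: %w", err)
	}
	if err := os.Rename(tmp.Name(), path); err != nil {
		return fmt.Errorf("renaming temp file: %w", err)
	}
	return nil
}

// demo writes the same target twice in a scratch directory and returns the
// final contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "conflist-demo-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "10-example.conflist")
	for _, content := range []string{`{"name":"old"}`, `{"name":"new"}`} {
		if err := writeFileAtomic(target, []byte(content)); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(target)
	return string(got), err
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```

A reader of the target path sees either the old file or the new one, never a partially written file.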
Evan Baker 3a5c72b079
fix: hide echo banner in log (#1669)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-11-16 23:13:08 +00:00
Timothy J. Raymond 31a0906102
Fix inability to unmarshal nmagent request (#1695)
* Fix inability to unmarshal nmagent request

During the previous switchover to using the client from the
`nmagent` client (as opposed to the `cns/nmagent` client), an assumption
was made that "proxied" requests to NMAgent were provided by clients as
nested JSON. This assumption was wrong--they are Base64-encoded strings
of JSON. Even though they're ultimately similar, they're very different
from the perspective of the JSON unmarshaler. Consequently, this
restores the nested request body back to a []byte and performs the
second-stage decoding manually (similarly to how it was previously
done).

Fixes #1694

* Fix swallowed error when body is not JSON

In one instance, an error was accidentally swallowed because the
existing code does not return errors (it sets variables instead). This
makes controlling the flow of execution difficult. To fix this, the
offending code has been moved to a separate function where returns can
be used effectively.
2022-11-10 18:22:17 -06:00
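The fix in #1695 above depends on how Go's encoding/json handles a `[]byte` field: it is read from a Base64-encoded JSON string, not from a nested object. A minimal sketch of the two-stage decoding follows. The types `proxyRequest` and `innerBody` and their field names are hypothetical, not the actual CNS request types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// proxyRequest is a hypothetical outer request. encoding/json decodes a
// Base64-encoded JSON string into a []byte field.
type proxyRequest struct {
	NetworkID   string `json:"networkID"`
	RequestBody []byte `json:"requestBody"`
}

// innerBody is a hypothetical payload carried inside RequestBody.
type innerBody struct {
	Version string `json:"version"`
}

// decode performs both stages: the outer JSON first, then the inner JSON that
// was Base64-decoded into RequestBody. Errors are returned to the caller
// instead of being stored in a variable, so none can be swallowed.
func decode(raw []byte) (proxyRequest, innerBody, error) {
	var outer proxyRequest
	if err := json.Unmarshal(raw, &outer); err != nil {
		return proxyRequest{}, innerBody{}, fmt.Errorf("decoding outer request: %w", err)
	}
	var inner innerBody
	if err := json.Unmarshal(outer.RequestBody, &inner); err != nil {
		return proxyRequest{}, innerBody{}, fmt.Errorf("decoding inner body: %w", err)
	}
	return outer, inner, nil
}

func main() {
	// "eyJ2ZXJzaW9uIjoiMSJ9" is the Base64 encoding of {"version":"1"}.
	raw := []byte(`{"networkID":"net1","requestBody":"eyJ2ZXJzaW9uIjoiMSJ9"}`)
	outer, inner, err := decode(raw)
	fmt.Println(outer.NetworkID, inner.Version, err)
}
```

Passing a nested JSON object in `requestBody` makes the first stage fail, which matches the mismatch the commit message describes.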
Matthew Long 03b8508e5e
feat: allow the user to pass an aikey in config (#1680)
2022-10-27 11:39:56 -07:00
Timothy J. Raymond 9cc0b0ba3e
Replace the NMAgent client in CNS with the one from the nmagent package (#1643)
* Switch PublishNetworkContainer to nmagent package

This removes the PublishNetworkContainer method from the CNS client's
nmagent package in favor of using the one from the top-level nmagent
client package. This is to ensure that there's only one nmagent client
to maintain.

* Use Unpublish method from nmagent package

The existing unpublish endpoints within CNS used the older nmagent
client. This converts them to use the newer one.

* Use JoinNetwork from new nmagent client

The API in CNS was using the old nmagent client endpoints which we want
to phase out so that there is exactly one nmagent client across all
systems.

* Add SupportedAPIs method to nmagent and use it

The CNS client uses this SupportedAPIs method to proxy requests from DNC
and detect capabilities of the node.

* Add GetNCVersion to nmagent and use it in CNS

CNS previously used its own client method for accessing the GetNCVersion
endpoint of nmagent. In the interest of having a singular NMAgent
client, this adopts the behavior into the nmagent package and uses it
within CNS instead.

* Use test helpers for context and checking errs

Checking whether the error was present and should it be present was
annoying and repetitive, so there's a helper now to take care of that.
Similarly, contexts are necessary but it's nice to create them from the
test's deadline itself. There's a helper to do that now as well.

* Fix broken tests & improve the suite

There were some broken tests left over from implementing new NMAgent
interface methods. This makes those tests pass and also improves the
test suite by detangling mixed concerns of the utility functions within.
Many of them returned errors and made assertions, which made them
confusing to use. The utility functions now only do things you request,
and the tests themselves are the only ones making assertions (sometimes
through helpers that were added to make those assertions easier).

* Add GetNCVersionList endpoint to NMAgent client

This is the final endpoint that was being used in CNS without being
present in the "official" NMAgent client. This moves that
implementation, more-or-less, to the nmagent package and consumes it in
NMAgent through an interface.

* Remove incorrect error shadowing

There were a few places where errors were shadowed. Also removed the
C-ism of pre-declaring variables for the sake of pre-declaring
variables.

* Replace error type assertions with errors.As

In two instances, type assertions were used instead of errors.As. Apart
from being less obvious, there are apparently instances where this can
be incorrect with error wrapping.

Also, there was an instance where errors.As was mistakenly used in the
init clause of an if statement, instead of the predicate. This was
corrected to be a conjunctive predicate converting the error and then
making assertions using that error. This is safe because short-circuit
evaluation will prevent the second half of the predicate from being
evaluated if the error is not of that type.

* Use context for joinNetwork

The linter rightly pointed out that context could be trivially
propagated further than it was. This commit does exactly that.

* Add error wrapping to an otherwise-opaque error

The linter highlighted this error, showing that the error returned would
be confusing to a user of this function. This wraps the error indicating
that a join network request was issued at the point where the error was
produced, to aid in debugging.

* Remove error shadowing in the tests

This is really somewhat unnecessary because it's just a test. It
shouldn't impact correctness, since the errors were properly scoped to
their respective if statements. However, to prevent other people's
linters from complaining, this corrects the lint.

* Silence complaints about dynamic errors in tests

This is an unnecessary lint in a test, because it's useful to be able to
define errors on the fly when you need them in a test. However, the
linter demands it presently, so it must be silenced.

* Add missing return for fmt.Errorf

Surprisingly, a return was missing here. This was caught by the linter,
and was most certainly incorrect.

* Remove yet another shadowed err

There was nothing incorrect about this shadowed err variable, but
linters complain about it and fail CI.

* Finish wiring in the NMAgent Client

There were missing places where the nmagent client wasn't wired in
properly in main. This threads the client through completely and also
alters some tests to maintain compatibility.

* Add config for Wireserver to NMAgent Client

This was in reaction to a lint detecting a vestigial use of
"WireserverIP". This package variable is no longer in use, however, the
spirit of it still exists. The changes adapt the provided configuration
into an nmagent.Config for use with the NMAgent Client instead.

* Silence the linter

The linter complained about a shadowed err, which is fine since it's
scoped in the `if`. Also there was a duplicate import which resulted
from refactoring. This has been de-duped.

* Rename jnr -> req and nma -> nmagent

The "jnr" variable was confusing when "req" is far more revealing of
what it is. Also, the "nma" alias was left over from a prior iteration
when the legacy nmagent package and the new one co-existed.

* Rename NodeInquirer -> NodeInterrogator

"Interrogator" is more revealing about the set of behavior encapsulated
by the interface. Depending on that behavior allows a consumer to
interrogate nodes for various information (but not modify anything about
them).
2022-10-19 21:38:01 +00:00
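The "Replace error type assertions with errors.As" item in #1643 above can be illustrated with a short Go sketch. The `statusError` type and both helper functions are hypothetical; they only show why a type assertion misses a wrapped error and how the conjunctive predicate relies on short-circuit evaluation.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical error type carrying an HTTP-like status code.
type statusError struct{ Code int }

func (e *statusError) Error() string { return fmt.Sprintf("unexpected status %d", e.Code) }

// errWrapped wraps a statusError the way a caller adding context would.
var errWrapped = fmt.Errorf("join network: %w", &statusError{Code: 401})

// isUnauthorizedAssert uses a type assertion, which inspects only the
// outermost error and so misses a wrapped *statusError.
func isUnauthorizedAssert(err error) bool {
	se, ok := err.(*statusError)
	return ok && se.Code == 401
}

// isUnauthorizedAs uses errors.As inside the predicate. The Code check runs
// only when errors.As has found a *statusError somewhere in the chain.
func isUnauthorizedAs(err error) bool {
	var se *statusError
	return errors.As(err, &se) && se.Code == 401
}

func main() {
	fmt.Println(isUnauthorizedAssert(errWrapped), isUnauthorizedAs(errWrapped))
}
```

The two helpers agree on an unwrapped error and differ only once wrapping is involved.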
bohuini 48e523866b
Add two API in CNS client for GET and POST all NCs (#1632)
Added two api in client and added unit test
2022-10-13 12:25:54 -05:00
Evan Baker df4608253e
Emit a metric in CNS for NC Sync count by status (#1650)
* emit a metric for NC sync and an error if all NCs are not present

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rename variables in nc sync host version for clarity

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-10-12 20:00:21 -05:00
Ashish Nair 0a98133f82
Fix: Added a metric to monitor the subnet exhaustion state within the Ipam Monitor Pool (#1620)
* Added a metric to monitor the subnet exhaustion state within the Ipam Monitor Pool

* Fixed the PR comments

* Added a reconciler error metric

* Addressed code review comments

* Updating lint on code

* Addressed all code review comments and changed the reconciler metric to a counter metric and fixed linting issues

* Added a count metric for IPAM pool as well to count the number of switches between subnet exhaustion and reversal for each subnet

* Updated the makefile to be able to run linting with better garbage collection

* Updated the code with the PR review comments

* Updated the label values based on a discussion offline with Evan

Co-authored-by: asn <asn@microsoft.com>
2022-09-27 22:22:55 -07:00
Evan Baker b62c4f9221
vendor and use vendor in windows image builds (#1630)
to resolve network timeouts when pulling deps

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-27 12:29:24 -07:00
Timothy J. Raymond fc690d2274
Add path to CNS client for DeleteNC (#1619)
This path was missed in a prior iteration, but is necessary for the
client to function properly. Without it, the route cannot be found and
thus the request cannot be created.
2022-09-22 21:43:05 +00:00
bohuini 8e37321055
NC refresh optimization (#1544)
* NC refresh optimization: Added api and test

* NC refresh optimization: Address comments

* NC refresh optimization: Address comments

* NC refresh optimization: optimize createNetworkContainers()

* NC refresh optimization: Fixed log info
2022-09-16 05:33:14 +00:00
Evan Baker ee910869a8
Optimize subnet exhaustion handling (#1594)
compile crd schemes and improve subnet exhaustion handling

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-15 16:52:11 +00:00
Evan Baker bd7ee76d12
Align pool to batch during scale up (#1593)
align pool to batch during scale up

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-15 09:01:55 -07:00
Timothy J. Raymond a9a0a5f7c8
Add two more CNS endpoints to the CNS client (#1541)
* Add NumOfCPUCores method to CNS client

This is the last part of CNS's API surface that is used by DNC, but not
covered by a method from the CNS client.

* Remove unnecessary error wrapping lambda

The error handling here is simple enough that it doesn't really warrant
wrapping it with a lambda.

* Add NmAgentSupportedAPIs to the CNS Client

DNC uses this API to detect GRE key capabilities. Since it's not
supported by the client, it does this with direct HTTP requests
currently. In order to ensure that there's only one way of accessing
CNS, this implements the method so that the client can be used in DNC
instead.

* Add NMA supported APIs to client URL list

This was forgotten in a previous iteration.

* Handle non-200 status code by returning an error

The NMAgentSupportedAPIs method did not error when a non-2XX status code
was encountered. This leads to surprising behavior on the part of the
consumer.

* Use http.NoBody instead of nil

This was a linter suggestion, and a good one. http.NoBody is more
semantically meaningful than passing a nil.

* Create a FailedHTTPRequest error type

This is in response to a linter complaint that dynamic errors were being
used instead of a static error. Declaring an error type with var
wouldn't carry along necessary debugging information (namely the HTTP
status code), so it wasn't really appropriate. Rather than disable the
check entirely with a linter directive, this defines a reusable error
type so that HTTP errors can be communicated upward consistently.

* Use an alias for CNS client paths

These really should be a single set of paths, but there's this existing
block of constants to define a "contract with DNC." Whether or not
that's complete, the spirit of it is somewhat clear. However, it's not
great to be duplicating constants all over the place. This is somewhat
of a compromise by defining the newer constants in terms of the older
ones.

* Add missing context to outbound HTTP requests

This was a good suggestion from the linter.
2022-09-13 12:56:42 -04:00
Evan Baker 99ce6aea97
Move primary IP count to separate metric to not affect the IPAM math (#1579)
move primary IP count to separate metric to not affect the ipam math

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-07 23:18:08 -07:00
Evan Baker c3b13da8b3
fix level of no valid NC log in reconciler (#1567)
fix no valid NC log in reconciler

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-02 08:24:15 +00:00
Evan Baker 02c3d767bb
Update to Go 1.19 (#1505)
go 1.19

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-24 20:18:38 +00:00
Quang Nguyen 60e5a26565
feat: CNS manages endpoints state for delegate IPAM use case (#1500)
* rebase

* linting

* rebase

* missing if condition for releaseIPConfig

* update azure-cns.yaml and add UTs

* rebase

* update program iptables changes

* linting

* fix broken tests

* fix podinfoprovider returns error when key is not found

* log when no endpoint state exists when reconciling

* do not remove endpoint state file on failure to read in restserver.restoreState()

* addressed comments

* update acn tag

* go get on acn

* addressed comments

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-23 16:42:40 -07:00
Evan Baker 31b75cf44f
move reconciler-started log in to the once (#1538)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-23 00:22:29 +00:00
Matthew Long 07432787cc
feat: compare node IP with NNCs node IP (#1527)
* feat: compare node IP with NNCs node IP

* fix: don't block reconciler from starting if nnc doesn't have status
2022-08-17 10:47:20 -07:00
Quang Nguyen 46321fb423
feat: Add ifname to ipconfigrequest (#1530)
add ifname to ipconfigrequest
2022-08-16 17:06:23 -07:00
Quang Nguyen 877970022a
Fix CNS Program iptables for delegated IPAM (#1499)
* rebase

* adding snat iptables rules using coreos lib

* fix iptables cmd not running

* docs

* added conflist back

* change chain name from SWIFT to SWIFT-POSTROUTING

* update go.mod

* split internalapi into linux and windows

* add imports

* fix iptables programming login

* fix iptables programming logic

* change program iptables rules logic
2022-08-15 13:00:32 -07:00
Evan Baker 8b5e1672b9
Accept NNC with no NC as valid and consider pool monitor and reconciler started (#1526)
accept an nnc with no nc as valid and consider monitor and reconciler started

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-15 12:59:24 -07:00
Timothy J. Raymond 717b7a03e7
Add missing client paths to CNS Client (#1518)
These were forgotten during a previous effort to add endpoints to the
CNS client. Consequently, those endpoints don't actually work unless
these paths are present here.
2022-08-09 16:10:15 -07:00
Evan Baker 3ecb7fb15a
Watch ClusterSubnetState and drive subnet exhaustion batch changes (#1494)
* add clustersubnetstate reconciler and inject exhaustion in ipampool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add subnetscarcity config opt

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* dynamically adjust batch/minfree/maxfree when exhausted

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update poolmonitor tests with exhausted scenario

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-01 09:56:38 -07:00
Evan Baker 0e3ef75d3c
move singletenantcontroller to nested kubecontroller package and rename (#1493)
move singletenantcontroller to kubecontroller package and rename

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-28 11:33:05 -07:00
Evan Baker 9dc679e26b
Update poolmonitor to support batch size of 1 (#1492)
updates to poolmonitor to support batch size of 1

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-27 13:33:12 -07:00
Evan Baker 65ac6a8be8
add test to purge pendingDelete IPs in ipampoolmonitor (#1467)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-26 09:16:38 -07:00
Evan Baker 05868abe8f
only log full pool state on change (#1446)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-23 23:33:51 +00:00
Quang Nguyen c6cafd48f9
Program IPTables SNAT rules in CNS (#1484)
* iptables snat rules in cns

* linting issue

* add cns config option for running delegated ipam

* change cns config option
2022-07-21 23:34:10 +00:00
Evan Baker 1396676ed1
fix latency metrics that should be seconds (#1468)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-21 11:15:13 -07:00
Timothy J. Raymond 4e0faf0fa1
Port DeleteNetworkContainer and SetOrchestratorType from DNC (#1454)
* Add DeleteNetworkContainer method to CNS Client

DNC needs to delete network containers using CNS, and it currently does
so through raw HTTP requests. This instead makes this an official
operation within the CNS client so that it can be used there instead.

* Remove return value from DeleteNetworkContainer

The return value from DeleteNetworkContainer was basically unused in
DeleteNetworkContainer, since it only contains a CNSResponse type which
can only ever be a successful response. Given this, it's sufficient to
just return no error as a signifier of success.

* Add SetOrchestratorType endpoint to cns client

DNC uses SetOrchestratorType currently by making straight HTTP requests
to the CNS backend. Since we should have one client only, this moves
these endpoints into the CNS client so that it can be consumed in DNC.

* Fix two linter issues

In one instance the linter had a false positive on an error that doesn't
really need to be wrapped. The other was a good suggestion since it
helps readers of the test understand what is going on.

* Add CreateNetworkContainer endpoint to client

Turns out DNC uses a few more endpoints than it would seem. This adds a
CreateNetworkContainer method to encapsulate the
/network/createorupdatenetworkcontainer endpoint.

* Add a PublishNetworkContainer to CNS client

Publishing network containers via CNS is something that DNC does
directly through an HTTP client. Given how common this is, it makes
sense to adopt this into the CNS client so that DNC can use the client
instead.

* Fix placeholder error message

This was just used in testing and was forgotten.

* Add context to DeleteNetworkContainer

Everything involving the network should take a context parameter.

* Fix PublishNetworkContainer return type

The PublishNetworkContainer endpoint returns a wrapping type around the
cns.Response. This updates the tests and the endpoint to reflect that.

* Add UnpublishNC to CNS Client

Unpublishing NCs is accomplished by DNC currently by directly making
HTTP calls. This adds that functionality to the client so the client can
be used instead.
2022-07-11 12:32:28 -04:00
tamilmani1989 ba3bbe0f26
Remove azure-vnet-telemetry for windows multitenancy (#1430)
* Remove azure-vnet-telemetry for Windows multitenancy; the telemetry service for Windows multitenancy will be started from CNS.

* start telemetry service from cns

* lint and log fix

* minor change

* addressed comment
2022-07-01 15:09:39 -07:00
rsagasthya 618bf295da
Total IPs metric (#1342)
* Total IPs metric
Now includes Primary IP count
2022-06-22 17:08:48 -05:00
Ramiro 69f851a25a
adding a cert refresher construct (#1426)
2022-06-16 23:32:33 +00:00
zhangming870 85c0fe091b
change the field name to NumCores so that it can be correctly decoded on the DNC side (#1358)
Co-authored-by: Ming Zhang <zhangming@microsoft.com>
2022-06-16 11:24:55 -05:00
Ramiro 46d6b9db79
Refactoring CNSLogger (#1429)
* refactoring cns logger to make it easier to stub out. addressing data race with concurrent reads and writes to node id and orchestrator
2022-06-16 00:19:12 +00:00
Evan Baker deb0c0cf37
add readyz and pprof listeners (#1395)
* add readyz and pprof listeners

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add more pprof handlers

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-06-03 10:57:33 -05:00
Jaeryn 7219bb2dd9
fix: Service Account Mitigation for CNS on k8s Windows 2022 (#1367)
* Service Account Mitigation for CNS on k8s Windows 2022

* pick up Neha's bug fix

* addressing comments

* add node selector back

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
2022-05-20 10:03:10 -07:00
Evan Baker f35c92cb99
add independent healthserver for metrics and healthz (#1378)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-16 12:01:15 -05:00
neaggarwMS 661bd5bd46
Fix CNS Reconciling and AssignmentMode handling for AKS Swift (#1368)
* Fix CNS Reconciling and AssignmentMode handling for AKS Swift

* Fixed the handling of default mode == Dynamic

* Incorporate feedback

* golint fixes

* Fix for whyNoLint: include an explanation for nolint directive (gocritic)
2022-05-13 17:58:06 -05:00
Camryn Lee 281eb70929
Fix CNS logging (#1372)
* modify cns logging

* update to implement stringer interface

* fixing linting errors

* removing log metadata

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>

Co-authored-by: Camryn Lee <camrynlee@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-12 11:31:00 -07:00
Evan Baker a8cf7a243a
include 0th address when parsing overlay cidrs (#1365)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-11 19:49:16 +00:00
Evan Baker 8750b346ed
CNS Prometheus and Grafana examples (#1366)
* cns prometheus examples

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* grafana samples

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-11 12:34:47 -05:00
Jaeryn dbb4f68393
build: Add Windows CNS & NPM as Part of Multi-Arch Manifest (#1345)
* add windows cns manifest to multi arch image

* try to use generic windows template w/ containerize stage in pipeline

* try and use buildah to pull images

* update manifest build and push for buildah

* create manifest by referencing images instead of pulling to avoid OS mismatch error

* remove unused windows-image.yaml

* remove REGISTRY var and use IMAGE_REGISTRY from makefile

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-05 15:32:53 -07:00
Evan Baker bd37755a35
Revert "move metrics to service port, add healthz, pprof (#1318)" (#1362)
This reverts commit 5ec6a3f2a8.
2022-05-05 04:42:06 +00:00
Evan Baker 7122dc246d
pull images from mcr (#1347)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-26 17:42:45 -05:00
Evan Baker 95155ca9b4
Support NetworkContainers of Static or Dynamic types (#1325)
* add overlay static nnc to nc listener

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* inline err check

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* pass NodeNetworkConfigs to Update method by value

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rework nnc reconcile flow for NetworkContainer modes

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-18 22:07:08 -05:00
Evan Baker 704ba1adf5
fix: correctly return an error for non http 200 response from nmagent (#1333)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-18 21:50:03 +00:00
Evan Baker 31a3b27d7d
save local target spec in ipampoolmonitor instead of overwriting from NNC (#1328)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-14 08:57:34 -07:00
Evan Baker 3d5660cf14
create nnc listener concept and adapt existing poolmonitor and swift service to it (#1323)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-11 19:34:18 -05:00
Evan Baker 90e0999b89
support CIDR notation in the NC PrimaryIP (#1320)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-06 10:11:39 -07:00
Evan Baker 5ec6a3f2a8
move metrics to service port, add healthz, pprof (#1318)
* combine metrics and healthz

* use root mux and bind pprof routes on debug config opt

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Co-authored-by: Paul Miller <pmiller@microsoft.com>
2022-04-05 22:22:49 +00:00
Evan Baker 77dfdcf2af
update to go1.18 (#1281)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-04 16:49:49 -05:00
Jaeryn 257925e066
Set remote ARP MAC address when CNS is running in CRD mode (#1306)
Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
windows dualstack e2e failures not related to this PR
2022-04-01 16:43:02 -07:00
rsagasthya 65ce2f1d96
Added the Pod Net ARM ID as a Label. (#1260)
* Added the Pod Net ARM ID as a Label.

Added Pod Net Arm Id as labels to the CNS Metrics.

* CNS Metrics: ARM ID as Label

Necessary changes to read the ARM ID metadata coming from DNC.

* Subtle changes in NNC

Variable names had to be modified to meet lint standards.

* Made NNC arg a Pointer

The NNC passed by value to create the ARM ID was too heavy. Converted that argument to a pointer and passed a reference to the NNC into that method.

* [CNS] Metrics Prefix

Added the "cx_" prefix to prometheus metrics of CNS.

* Minor Changes

Minor edits with the Metric Names and Documentation.

* ConstLabels in Metrics

Made changes to the ConstLabels in the CNS Metrics.
2022-03-16 02:39:36 +00:00
Evan Baker b5e599a715
ignore pods that have not been assigned an IP yet during cns initialization (#1277)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-03-15 16:21:25 -07:00
Paul Miller e42e41fa1d
feat: wrap a histogram around syncHostNCVersion (#1273)
* wrap a histogram around syncHostNCVersion

* try and fix linter errors though dubious about one

* clean up

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* kick pr?

* use success to be consistent with dnc

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-03-12 01:11:12 +00:00
rsagasthya 569c946435
Subnet labels (#1238)
* 12622609: Dimensions to metrics.

* Modified the constants declaration.

* Made a slight change to make the observeIPPoolState() method inline and remove a whitespace.

* Modified label values to stay consistent with other prometheus labels.
2022-02-24 14:40:00 -08:00
Evan Baker 8f9d6b32d6
use fully-qualified image names in all Dockerfiles (#1248)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-24 12:08:13 -06:00
Paul Johnston 81fd321fef
Allow cns to register node if no NCs are present (#1232)
2022-02-17 12:25:36 -08:00
Evan Baker 09c06b01e1
Parameterize Makefile, parallelize platform builds, support buildah (#1102)
* templatize the container make targets

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add container push to npm conformance task

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update pipelines

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* default OS and ARCH to GOOS and GOARCH

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-15 19:00:04 -06:00
Evan Baker a77e2475b7
Don't send NNC to Pool Monitor before Reconciler starts (#1236)
* remove early nnc send to pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* order service starts better and add logging in initialization

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* wait for reconciler to start

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update comments and docs

Signed-off-by: GitHub <noreply@github.com>
2022-02-15 18:50:30 -06:00
Paul Miller 7b15cbd8f2
get return codes for ipconfig at least (#1231)
* get return codes for ipconfig at least

* fix lint

* valid label name
2022-02-15 12:35:19 -06:00
Evan Baker cadca2f764
Include PendingRelease when calculating IP pool for scale-down (#1237)
* fix scale down

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update pool state variable naming

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add test for pending release

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-15 11:21:08 -06:00
aegal d52ed96a20
guard against unstructured url forwarding in CNS API's (#1192)
* remove forwarding unstructured urls

* addressing feedback

* fix linting issues

* update getncversion test

* fix test issue where nmagent handler returns 500

* update for linter

* address comments

* small linter changes

* quick linting changes

* lint

* use gofumpt instead of gofmt

* ignore line count
2022-02-01 15:13:44 -08:00
Evan Baker 427a1e4200
fix initialize from kube pods flow (#1194)
* fix initialize from kube pods flow

Signed-off-by: Evan Baker <rbtr@users.noreply.github.

* move list pods inside closure

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-01-31 14:57:55 -08:00
Evan Baker 80e574b036
match on Node controller ref in NNC event predicate (#1190)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-01-28 17:37:20 -08:00
Evan Baker 1b68ce1ae9
Server-side filtering for NodeNetworkConfig, Node, and Pod objects (#1198)
* server-side filtering for nnc objects

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* yagni node and pod fields in cache builder

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-01-21 11:57:35 -08:00
Evan Baker 39f2bb9ae5
fix: use server-side apply in nnc client (#1152)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-12-07 15:10:04 -08:00
Evan Baker 9fff334a19
feat: shim in metric for ipconfigstatus state transition durations (#1080)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-12-06 18:32:51 -06:00
Vamsi Kalapala 83d161ec41
fix: [CNS] fixing a regex to extract hostip in getncversion req (#1140)
* fix: [CNS] fixing a regex to extract hostip in getncversion req

* taking advice and adding net/url package to extract host
2021-12-02 09:38:54 -08:00
Evan Baker a8ba90e608
Update IPAM pool monitor's pool state logic and language (#1126)
* move and rename ipstate and change allocated to assigned

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

typo

* build ip pool state and populate struct pre-reconcile

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* upcast to int64 and move scaler fields to metastate

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix tests

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* upcast usage of int to int64 in monitor fake tests

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update remaining usages of allocate to assign

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-12-01 18:02:27 -06:00
Evan Baker a664889867
fix IPs typos in CNS code and comments (#1111)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-11-29 15:48:31 -06:00
Evan Baker 22906eb9b2
fix ipampool scaling - only release IPs we have (#1110)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-11-22 12:22:30 -06:00
Paul Johnston 6d208e9762
Cns windows aks (#1059)
* chore: add in some functionality for CNS on windows host process pods
2021-11-22 09:23:01 -08:00
Paul Johnston b0b6edfcfe
Azure cns yaml (#1058)
* [feat] making azure-cns-windows yaml
2021-11-18 12:29:51 -08:00
Eng Zer Jun e812bc82b8
refactor: move from io/ioutil to io and os packages (#1096)
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-11-17 16:31:42 -06:00
Evan Baker b1d7723c3a
use correct duration for pool refresh (#1094)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-11-15 12:44:51 -06:00
tamilmani1989 cdab7d0241
Use Lockedfile api to acquire lock (#1070)
* added lockedfileapi support for CNI

* fixed interface changes

* addressed comments
fixed ut

* addressed comments

* fixed copy to buffer part in writer api

* keeping old code not changing it.
2021-11-09 08:19:44 -08:00
Paul Miller 27ac431a6e
Use avast to retry init cns and register node. (#1087)
* stupid simple retry

* go lint fixes

* missed one //

* avast retry

* try out avast

* vendor

* try and make linters happy

* fix wrap check
2021-11-05 13:20:26 -05:00
Evan Baker 4a4370b0be
chore: tidy up nmagent client for context timeouts (#1056)
* chore: tidy up nmagent client for context timeouts

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* address review feedback

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix tests

* change integration test timeout to 1h

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* guard against nil in wireserver util

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* read body

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* correct durations

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-21 13:14:03 -05:00
Evan Baker 17bd9425d0
chore: tidy up the wireserver client and usage (#1065)
* chore: tidy up IMDS client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rename from imds to wireserver

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-20 14:23:05 -05:00