Commit Graph

522 Commits

Author SHA1 Message Date
Saksham Mittal 2a9b132434
Include body for unpublishNC calls to support AZR (#1826)
* initial delete NC changes to include body

* initial delete NC changes to include body

* lint error

* initial delete NC changes to include body

* initial delete NC changes to include body

* lint error

* test change

* test change

* add comment
2023-03-01 13:44:32 -08:00
Evan Baker 358de20d2c
Add multiplat Windows 2019 and 2022 image support to tooling and pipelines, use for CNS/NPM (#1820)
* build: add ws2019 cns image to build pipeline

* add windows2019 build pool

* fix: npm pipelines

* fix: npm pipelines

* update multiplat build process for winver flavors

* optional buildx push for npm cyclonus

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Matthew Long <61910737+thatmattlong@users.noreply.github.com>
Co-authored-by: Matthew Long <Matthew.Long@microsoft.com>
2023-03-01 00:25:00 +00:00
Ramiro c8d2c1f576
CNS - Wireserver "proxy" (#1825)
* adding wireserver proxy

* using proxy for publish

* injecting dependency. refactoring unpublish nc

* default wireserver ip config

* setting content type

* better printing of requests and responses

* fixing unit tests
2023-02-28 12:20:07 -05:00
tamilmani1989 96835dcc76
CNS in-mem cache to use containerid as key if manageendpointstate is enabled (#1811)
update podInfo key as containerid instead of name and namespace
2023-02-21 12:04:47 -08:00
Timothy J. Raymond 48c7a149e5
Fix incorrect 200 for a 401 from NMAgent (#1799)
In scenarios where the subnet token does not match (leading to a 401
from NMAgent), CNS returns a 200 for the PublishStatusCode. This is
incorrect, and a 401 should be returned instead. This leads clients of
CNS to take incorrect action on, what they believe to be, a successful
response.
2023-02-14 09:48:35 -08:00
Matthew Long df431437b1
fix: multitenantnetwork reconciler should check errors correctly (#1791) 2023-02-10 00:26:21 +00:00
shchen da56b37a65
Fix the NC ID error when get container (#1767)
* Fix nc id mismatch

* Fix the nc ID prefix issue
2023-02-07 07:03:45 -08:00
Matthew Long 36757fefc2
feat: allow the CNI conflist generation settings to be configured via… (#1765)
* feat: allow the CNI conflist generation settings to be configured via cmd line args

* make env var opt-in
2023-02-07 01:30:22 +00:00
shchen ddb5954351
nmagent get nc version list api V2 refactor (#1744)
* tmp commit for onboarding nma v2

* Remove test output file

* Remove unnecessary code when CNS onboard get nc version list without
token

* tmp commit to fix getnc version tests when onboarding nc version api v2
from nmagent.

* Fix the unit test for nmagent v2 api change.

* Fix unit test TestGetNetworkContainerVersionStatus

* Revert back to GetNCVersionF test.

* Roll back to get nc version api v1 for test.

* Continue revert back and store nc version url

* Onboard nmagent get nc version api v2.

* Address PR feedback by returning early and remove commented-out code.

* Remove unnecessary ncVersionURLs and NCVersionRequest.

* Remove unnecessary variables.

* Update nmagent get nc version api v2 to v2 url

* Remove commented-out code.

* tmp commit for onboarding nma v2

* Remove test output file

* Remove unnecessary code when CNS onboard get nc version list without
token

* tmp commit to fix getnc version tests when onboarding nc version api v2
from nmagent.

* Fix the unit test for nmagent v2 api change.

* Fix unit test TestGetNetworkContainerVersionStatus

* Revert back to GetNCVersionF test.

* Roll back to get nc version api v1 for test.

* Continue revert back and store nc version url

* Onboard nmagent get nc version api v2.

* Address PR feedback by returning early and remove commented-out code.

* Remove unnecessary ncVersionURLs and NCVersionRequest.

* Remove unnecessary variables.

* Update nmagent get nc version api v2 to v2 url

* Remove commented-out code.

* Update nmagent get nc version list.

* Address feedback and fix golint

* Fix lint issue.

* Fix the remaining 2 lint issues.

* Revert back test error generation to address feedback.
2023-01-19 17:10:23 -06:00
Timothy J. Raymond 45108ae514
Fix incorrect HTTP status from publish NC (#1757)
* Fix incorrect HTTP status from publish NC

CNS was responding with an HTTP status code of "0" from NMAgent.
Successes are supposed to be 200. The C-style var block at the beginning
of publishNetworkContainer was the reason for this. During refactoring,
the location where this status code was set to a successful value of 200
was accidentally removed. Because the var block declared the variable
and silently initialized it to 0, the compiler did not flag this bug as
it otherwise would have. The status code has been removed from this
block and explicitly defined and initialized to a correct value of 200.
Subsequent error handling will change this as necessary.

Also, despite consumers depending on this status, there were no tests to
verify that the status was set correctly. Tests have been added to
reflect this dependency.

* Ensure that NMAgent body is always set

DNC depends on the NMAgent body being set for its vestigial functions of
retrying failed requests. Since failed requests will now be retried
internally (to CNS) by the NMAgent client, this isn't really necessary
anymore. There are versions of DNC out there that depend on this body
though, so it needs to be present in order for NC publishing to actually
work.

* Fix missing NMAgent status for Unpublish

It was discovered that the Unpublish endpoints also omitted the status
codes and bodies expected by clients. This adds those and fixes the
associated tests to guarantee the expected behavior.

* Silence the linter

There were two instances where the linter was flagging dynamic errors,
but this is just in a test. It's perfectly fine to bend the rules there,
since we don't expect to re-use the errors (they really should be
t.Fatal / t.Error anyway, but due to legacy we're returning errors here
instead).
2023-01-19 04:17:46 +00:00
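The zero-value pitfall described in #1757 above can be illustrated with a minimal sketch. This is not the CNS code: `publishBuggy` and `publishFixed` are hypothetical names, and the failure path is simulated with a boolean. It only shows how a status code declared in a var block quietly starts at 0, whereas an explicit initialization to 200 makes the success case visible.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// publishBuggy (hypothetical) declares its status code in a C-style var
// block. The variable is silently zero-initialized, so if no branch assigns
// it, the caller sees 0 rather than 200.
func publishBuggy(fail bool) int {
	var (
		statusCode int // zero value: 0
		err        error
	)
	if fail {
		err = errors.New("publish failed")
	}
	if err != nil {
		statusCode = http.StatusInternalServerError
	}
	// On the success path nothing ever set statusCode.
	return statusCode
}

// publishFixed (hypothetical) initializes the status explicitly to success
// and lets later error handling override it.
func publishFixed(fail bool) int {
	statusCode := http.StatusOK
	if fail {
		statusCode = http.StatusInternalServerError
	}
	return statusCode
}

func main() {
	fmt.Println(publishBuggy(false), publishFixed(false))
}
```

A test that asserts on the returned status for the success path, as the commit message says was added, would catch the first variant immediately.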
ZetaoZhuang 063fe58a41
refactor and move the nmagentConfig code from cns to nmagent package. (#1723)
* refactor and move the nmagentConfig code from cns to nmagent package.
2023-01-10 23:00:03 -08:00
dependabot[bot] e86c188d46
deps: bump sigs.k8s.io/controller-runtime from 0.12.3 to 0.14.1 (#1749) 2023-01-11 00:37:46 +00:00
Paul Johnston 4e5530cc65
Fix broken NC publishing due to type mismatch (#1740) 2023-01-09 19:46:05 +00:00
Behzad Mirkhanzadeh 7b647be285
fix: repair windows cni lock issue (#1712)
* Moving the lock from InitializeKeyValueStore() function to restore/save functions to improve cni performance on windows.

* fix: use defer function to unlock statefile.

* fix: fixing the IPAM lock and defer func

* fix: Optimizing cni file lock by moving SetSdnRemoteArpMacAddress() on startup for CRD and MultitenantCRD mode.

* adding store lock on telemetry service start to avoid race condition on windows.
2023-01-07 08:03:09 +00:00
ZetaoZhuang e6eadd3379
feat: expose getHomeAzInfo api in cns to retrieve node home az infos from NMAgent (#1642) 2022-12-08 08:23:41 -08:00
Evan Baker 353b7e01c6
fix: cns to use controller runtime (cached) clients in reconcilers (#1668)
* update crd clients to accept existing ctrlcli core clients

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix cns to use managed cached clients for reconcilers

Signed-off-by: GitHub <noreply@github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Signed-off-by: GitHub <noreply@github.com>
2022-12-05 17:55:45 -08:00
Timothy J. Raymond 57120ca41f
Fix failure to start CNS when setting WireserverIP (#1707)
There was insufficient coverage over cases involving different
permutations of the "WireserverIP" configuration option. Consequently,
there were instances where reasonable values for this option caused CNS
to fail to start.

This moves the logic for transforming the CNS configuration into
configuration suitable for the NMAgent client into a method off the
CNSConfig. It also permits adding coverage over different scenarios that
are likely to emerge.
2022-11-30 20:09:53 +00:00
Matthew Long 7b91752d10
feat: cns writes cni conflist (#1702)
* feat: add cni conflist generator for v4 overlay scenario

* feat: use atomic fs operations

* fix: use same directory as temp dir since /tmp is a tmpfs
2022-11-29 04:56:08 +00:00
Evan Baker 3a5c72b079
fix: hide echo banner in log (#1669)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-11-16 23:13:08 +00:00
Timothy J. Raymond 31a0906102
Fix inability to unmarshal nmagent request (#1695)
* Fix inability to unmarshal nmagent request

During the previous switchover to using the client from the
`nmagent` client (as opposed to the `cns/nmagent` client), an assumption
was made that "proxied" requests to NMAgent were provided by clients as
nested JSON. This assumption was wrong--they are Base64-encoded strings
of JSON. Even though they're ultimately similar, they're very different
from the perspective of the JSON unmarshaler. Consequently, this
restores the nested request body back to a []byte and performs the
second-stage decoding manually (similarly to how it was previously
done).

Fixes #1694

* Fix swallowed error when body is not JSON

In one instance, an error was accidentally swallowed because the
existing code does not return errors (it sets variables instead). This
makes controlling the flow of execution difficult. To fix this, the
offending code has been moved to a separate function where returns can
be used effectively.
2022-11-10 18:22:17 -06:00
Matthew Long 03b8508e5e
feat: allow the user to pass an aikey in config (#1680) 2022-10-27 11:39:56 -07:00
Timothy J. Raymond 9cc0b0ba3e
Replace the NMAgent client in CNS with the one from the nmagent package (#1643)
* Switch PublishNetworkContainer to nmagent package

This removes the PublishNetworkContainer method from the CNS client's
nmagent package in favor of using the one from the top-level nmagent
client package. This is to ensure that there's only one nmagent client
to maintain.

* Use Unpublish method from nmagent package

The existing unpublish endpoints within CNS used the older nmagent
client. This converts them to use the newer one.

* Use JoinNetwork from new nmagent client

The API in CNS was using the old nmagent client endpoints which we want
to phase out so that there is exactly one nmagent client across all
systems.

* Add SupportedAPIs method to nmagent and use it

The CNS client uses this SupportedAPIs method to proxy requests from DNC
and detect capabilities of the node.

* Add GetNCVersion to nmagent and use it in CNS

CNS previously used its own client method for accessing the GetNCVersion
endpoint of nmagent. In the interest of having a singular NMAgent
client, this adopts the behavior into the nmagent package and uses it
within CNS instead.

* Use test helpers for context and checking errs

Checking whether the error was present and should it be present was
annoying and repetitive, so there's a helper now to take care of that.
Similarly, contexts are necessary but it's nice to create them from the
test's deadline itself. There's a helper to do that now as well.

* Fix broken tests & improve the suite

There were some broken tests left over from implementing new NMAgent
interface methods. This makes those tests pass and also improves the
test suite by detangling mixed concerns of the utility functions within.
Many of them returned errors and made assertions, which made them
confusing to use. The utility functions now only do things you request,
and the tests themselves are the only ones making assertions (sometimes
through helpers that were added to make those assertions easier).

* Add GetNCVersionList endpoint to NMAgent client

This is the final endpoint that was being used in CNS without being
present in the "official" NMAgent client. This moves that
implementation, more-or-less, to the nmagent package and consumes it in
NMAgent through an interface.

* Remove incorrect error shadowing

There were a few places where errors were shadowed. Also removed the
C-ism of pre-declaring variables for the sake of pre-declaring
variables.

* Replace error type assertions with errors.As

In two instances, type assertions were used instead of errors.As. Apart
from being less obvious, there are apparently instances where this can
be incorrect with error wrapping.

Also, there was an instance where errors.As was mistakenly used in the
init clause of an if statement, instead of the predicate. This was
corrected to be a conjunctive predicate converting the error and then
making assertions using that error. This is safe because short-circuit
evaluation will prevent the second half of the predicate from being
evaluated if the error is not of that type.

* Use context for joinNetwork

The linter rightly pointed out that context could be trivially
propagated further than it was. This commit does exactly that.

* Add error wrapping to an otherwise-opaque error

The linter highlighted this error, showing that the error returned would
be confusing to a user of this function. This wraps the error indicating
that a join network request was issued at the point where the error was
produced, to aid in debugging.

* Remove error shadowing in the tests

This is really somewhat unnecessary because it's just a test. It
shouldn't impact correctness, since the errors were properly scoped to
their respective if statements. However, to prevent other people's
linters from complaining, this corrects the lint.

* Silence complaints about dynamic errors in tests

This is an unnecessary lint in a test, because it's useful to be able to
define errors on the fly when you need them in a test. However, the
linter demands it presently, so it must be silenced.

* Add missing return for fmt.Errorf

Surprisingly, a return was missing here. This was caught by the linter,
and was most certainly incorrect.

* Remove yet another shadowed err

There was nothing incorrect about this shadowed err variable, but
linters complain about it and fail CI.

* Finish wiring in the NMAgent Client

There were missing places where the nmagent client wasn't wired in
properly in main. This threads the client through completely and also
alters some tests to maintain compatibility.

* Add config for Wireserver to NMAgent Client

This was in reaction to a lint detecting a vestigial use of
"WireserverIP". This package variable is no longer in use, however, the
spirit of it still exists. The changes adapt the provided configuration
into an nmagent.Config for use with the NMAgent Client instead.

* Silence the linter

The linter complained about a shadowed err, which is fine since it's
scoped in the `if`. Also there was a duplicate import which resulted
from refactoring. This has been de-duped.

* Rename jnr -> req and nma -> nmagent

The "jnr" variable was confusing when "req" is far more revealing of
what it is. Also, the "nma" alias was left over from a prior iteration
when the legacy nmagent package and the new one co-existed.

* Rename NodeInquirer -> NodeInterrogator

"Interrogator" is more revealing about the set of behavior encapsulated
by the interface. Depending on that behavior allows a consumer to
interrogate nodes for various information (but not modify anything about
them).
2022-10-19 21:38:01 +00:00
bohuini 48e523866b
Add two APIs in CNS client for GET and POST all NCs (#1632)
Added two APIs in the client and added unit tests
2022-10-13 12:25:54 -05:00
Evan Baker df4608253e
Emit a metric in CNS for NC Sync count by status (#1650)
* emit a metric for NC sync and an error if all NCs are not present

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rename variables in nc sync host version for clarity

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-10-12 20:00:21 -05:00
Ashish Nair 0a98133f82
Fix: Added a metric to monitor the subnet exhaustion state within the Ipam Monitor Pool (#1620)
* Added a metric to monitor the subnet exhaustion state within the Ipam Monitor Pool

* Fixed the PR comments

* Added a reconciler error metric

* Addressed code review comments

* Updating lint on code

* Addressed all code review comments and changed the reconciler metric to a counter metric and fixed linting issues

* Added a count metric for IPAM pool as well to count the number of switches between subnet exhaustion and reversal for each subnet

* Updated the makefile to be able to run linting with better garbage collection

* Updated the code with the PR review comments

* Updated the label values based on a discussion offline with Evan

Co-authored-by: asn <asn@microsoft.com>
2022-09-27 22:22:55 -07:00
Evan Baker b62c4f9221
vendor and use vendor in windows image builds (#1630)
to resolve network timeouts when pulling deps

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-27 12:29:24 -07:00
Timothy J. Raymond fc690d2274
Add path to CNS client for DeleteNC (#1619)
This path was missed in a prior iteration, but is necessary for the
client to function properly. Without it, the route cannot be found and
thus the request cannot be created.
2022-09-22 21:43:05 +00:00
bohuini 8e37321055
NC refresh optimization (#1544)
* NC refresh optimization: Added api and test

* NC refresh optimization: Address comments

* NC refresh optimization: Address comments

* NC refresh optimization: optimize createNetworkContainers()

* NC refresh optimization: Fixed log info
2022-09-16 05:33:14 +00:00
Evan Baker ee910869a8
Optimize subnet exhaustion handling (#1594)
compile crd schemes and improve subnet exhaustion handling

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-15 16:52:11 +00:00
Evan Baker bd7ee76d12
Align pool to batch during scale up (#1593)
align pool to batch during scale up

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-15 09:01:55 -07:00
Timothy J. Raymond a9a0a5f7c8
Add two more CNS endpoints to the CNS client (#1541)
* Add NumOfCPUCores method to CNS client

This is the last part of CNS's API surface that is used by DNC, but not
covered by a method from the CNS client.

* Remove unnecessary error wrapping lambda

The error handling here is simple enough that it doesn't really warrant
wrapping it with a lambda.

* Add NmAgentSupportedAPIs to the CNS Client

DNC uses this API to detect GRE key capabilities. Since it's not
supported by the client, it does this with direct HTTP requests
currently. In order to ensure that there's only one way of accessing
CNS, this implements the method so that the client can be used in DNC
instead.

* Add NMA supported APIs to client URL list

This was forgotten in a previous iteration.

* Handle non-200 status code by returning an error

The NMAgentSupportedAPIs method did not error when a non-2XX status code
was encountered. This leads to surprising behavior on the part of the
consumer.

* Use http.NoBody instead of nil

This was a linter suggestion, and a good one. http.NoBody is more
semantically meaningful than passing a nil.

* Create a FailedHTTPRequest error type

This is in response to a linter complaint that dynamic errors were being
used instead of a static error. Declaring an error type with var
wouldn't carry along necessary debugging information (namely the HTTP
status code), so it wasn't really appropriate. Rather than disable the
check entirely with a linter directive, this defines a reusable error
type so that HTTP errors can be communicated upward consistently.

* Use an alias for CNS client paths

These really should be a single set of paths, but there's this existing
block of constants to define a "contract with DNC." Whether or not
that's complete, the spirit of it is somewhat clear. However, it's not
great to be duplicating constants all over the place. This is somewhat
of a compromise by defining the newer constants in terms of the older
ones.

* Add missing context to outbound HTTP requests

This was a good suggestion from the linter.
2022-09-13 12:56:42 -04:00
Evan Baker 99ce6aea97
Move primary IP count to separate metric to not affect the IPAM math (#1579)
move primary IP count to separate metric to not affect the ipam math

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-07 23:18:08 -07:00
Evan Baker c3b13da8b3
fix level of no valid NC log in reconciler (#1567)
fix no valid NC log in reconciler

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-02 08:24:15 +00:00
Evan Baker 02c3d767bb
Update to Go 1.19 (#1505)
go 1.19

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-24 20:18:38 +00:00
Quang Nguyen 60e5a26565
feat: CNS manages endpoints state for delegate IPAM use case (#1500)
* rebase

* linting

* rebase

* missing if condition for releaseIPConfig

* update azure-cns.yaml and add UTs

* rebase

* update program iptables changes

* linting

* fix broken tests

* fix podinfoprovider returns error when key is not found

* log when no endpoint state exists when reconciling

* not remove endpoint state file on failure to read in restserver.restoreState()

* addressed comments

* update acn tag

* go get on acn

* addressed comments

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-23 16:42:40 -07:00
Evan Baker 31b75cf44f
move reconciler-started log in to the once (#1538)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-23 00:22:29 +00:00
Matthew Long 07432787cc
feat: compare node IP with NNCs node IP (#1527)
* feat: compare node IP with NNCs node IP

* fix: don't block reconciler from starting if nnc doesn't have status
2022-08-17 10:47:20 -07:00
Quang Nguyen 46321fb423
feat : Add ifname to ipconfigrequest (#1530)
add ifname to ipconfigrequest
2022-08-16 17:06:23 -07:00
Quang Nguyen 877970022a
Fix CNS Program iptables for delegated IPAM (#1499)
* rebase

* rebase

* rebase

* adding snat iptables rules using coreos lib

* fix iptables cmd not running

* docs

* added conflist back

* change chain name from SWIFT to SWIFT-POSTROUTING

* update go.mod

* split internalapi into linux and windows

* add imports

* fix iptables programming logic

* fix iptables programming logic

* change program iptables rules logic
2022-08-15 13:00:32 -07:00
Evan Baker 8b5e1672b9
Accept NNC with no NC as valid and consider pool monitor and reconciler started (#1526)
accept an nnc with no nc as valid and consider monitor and reconciler started

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-15 12:59:24 -07:00
Timothy J. Raymond 717b7a03e7
Add missing client paths to CNS Client (#1518)
These were forgotten during a previous effort to add endpoints to the
CNS client. Consequently, those endpoints don't actually work unless
these paths are present here.
2022-08-09 16:10:15 -07:00
Evan Baker 3ecb7fb15a
Watch ClusterSubnetState and drive subnet exhaustion batch changes (#1494)
* add clustersubnetstate reconciler and inject exhaustion in ipampool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add subnetscarcity config opt

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* dynamically adjust batch/minfree/maxfree when exhausted

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update poolmonitor tests with exhausted scenario

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-08-01 09:56:38 -07:00
Evan Baker 0e3ef75d3c
move singletenantcontroller to nested kubecontroller package and rename (#1493)
move singletenantcontroller to kubecontroller package and rename

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-28 11:33:05 -07:00
Evan Baker 9dc679e26b
Update poolmonitor to support batch size of 1 (#1492)
updates to poolmonitor to support batch size of 1

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-27 13:33:12 -07:00
Evan Baker 65ac6a8be8
add test to purge pendingDelete IPs in ipampoolmonitor (#1467)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-26 09:16:38 -07:00
Evan Baker 05868abe8f
only log full pool state on change (#1446)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-23 23:33:51 +00:00
Quang Nguyen c6cafd48f9
Program IPTables SNAT rules in CNS (#1484)
* iptables snat rules in cns

* linting issue

* add cns config option for running delegated ipam

* change cns config option
2022-07-21 23:34:10 +00:00
Evan Baker 1396676ed1
fix latency metrics that should be seconds (#1468)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-07-21 11:15:13 -07:00
Timothy J. Raymond 4e0faf0fa1
Port DeleteNetworkContainer and SetOrchestratorType from DNC (#1454)
* Add DeleteNetworkContainer method to CNS Client

DNC needs to delete network containers using CNS, and it currently does
so through raw HTTP requests. This instead makes this an official
operation within the CNS client so that it can be used there instead.

* Remove return value from DeleteNetworkContainer

The return value from DeleteNetworkContainer was basically unused in
DeleteNetworkContainer, since it only contains a CNSResponse type which
can only ever be a successful response. Given this, it's sufficient to
just return no error as a signifier of success.

* Add SetOrchestratorType endpoint to cns client

DNC uses SetOrchestratorType currently by making straight HTTP requests
to the CNS backend. Since we should have one client only, this moves
these endpoints into the CNS client so that it can be consumed in DNC.

* Fix two linter issues

In one instance the linter had a false positive on an error that doesn't
really need to be wrapped. The other was a good suggestion since it
helps readers of the test understand what is going on.

* Add CreateNetworkContainer endpoint to client

Turns out DNC uses a few more endpoints than it would seem. This adds a
CreateNetworkContainer method to encapsulate the
/network/createorupdatenetworkcontainer endpoint.

* Add a PublishNetworkContainer to CNS client

Publishing network containers via CNS is something that DNC does
directly through an HTTP client. Given how common this is, it makes
sense to adopt this into the CNS client so that DNC can use the client
instead.

* Fix placeholder error message

This was just used in testing and was forgotten.

* Add context to DeleteNetworkContainer

Everything involving the network should take a context parameter.

* Fix PublishNetworkContainer return type

The PublishNetworkContainer endpoint returns a wrapping type around the
cns.Response. This updates the tests and the endpoint to reflect that.

* Add UnpublishNC to CNS Client

Unpublishing NCs is accomplished by DNC currently by directly making
HTTP calls. This adds that functionality to the client so the client can
be used instead.
2022-07-11 12:32:28 -04:00
tamilmani1989 ba3bbe0f26
Remove azure-vnet-telemetry for windows multitenancy (#1430)
* Remove azure-vnet-telemetry for windows multitenancy; the telemetry service for windows multitenancy will be started from cns.

* start telemetry service from cns

* lint and log fix

* minor change

* addressed comment
2022-07-01 15:09:39 -07:00
rsagasthya 618bf295da
Total IPs metric (#1342)
* Total IPs metric
Now includes Primary IP count
2022-06-22 17:08:48 -05:00
Ramiro 69f851a25a
adding a cert refresher construct (#1426) 2022-06-16 23:32:33 +00:00
zhangming870 85c0fe091b
change the field name to NumCores, so that it can be correctly decoded on the dnc side (#1358)
Co-authored-by: Ming Zhang <zhangming@microsoft.com>
2022-06-16 11:24:55 -05:00
Ramiro 46d6b9db79
Refactoring CNSLogger (#1429)
* refactoring cns logger to make it easier to stub out. addressing data race with concurrent reads and writes to node id and orchestrator
2022-06-16 00:19:12 +00:00
Evan Baker deb0c0cf37
add readyz and pprof listeners (#1395)
* add readyz and pprof listeners

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add more pprof handlers

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-06-03 10:57:33 -05:00
Jaeryn 7219bb2dd9
fix: Service Account Mitigation for CNS on k8s Windows 2022 (#1367)
* Service Account Mitigation for CNS on k8s Windows 2022

* pick up Neha's bug fix

* addressing comments

* add node selector back

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
2022-05-20 10:03:10 -07:00
Evan Baker f35c92cb99
add independent healthserver for metrics and healthz (#1378)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-16 12:01:15 -05:00
neaggarwMS 661bd5bd46
Fix CNS Reconciling and AssignmentMode handling for AKS Swift (#1368)
* Fix CNS Reconciling and AssignmentMode handling for AKS Swift

* Fixed the handling of default mode == Dynamic

* Incorporate feedback

* golint fixes

* Fix for whyNoLint: include an explanation for nolint directive (gocritic)
2022-05-13 17:58:06 -05:00
Camryn Lee 281eb70929
Fix CNS logging (#1372)
* modify cns logging

* update to implement stringer interface

* fixing linting errors

* removing log metadata

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>

Co-authored-by: Camryn Lee <camrynlee@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-12 11:31:00 -07:00
Evan Baker a8cf7a243a
include 0th address when parsing overlay cidrs (#1365)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-11 19:49:16 +00:00
Evan Baker 8750b346ed
CNS Prometheus and Grafana examples (#1366)
* cns prometheus examples

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* grafana samples

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-11 12:34:47 -05:00
Jaeryn dbb4f68393
build: Add Windows CNS & NPM as Part of Multi-Arch Manifest (#1345)
* add windows cns manifest to multi arch image

* try to use generic windows template w/ containerize stage in pipeline

* try and use buildah to pull images

* update manifest build and push for buildah

* create manifest by referencing images instead of pulling to avoid OS mismatch error

* remove unused windows-image.yaml

* remove REGISTRY var and use IMAGE_REGISTRY from makefile

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-05 15:32:53 -07:00
Evan Baker bd37755a35
Revert "move metrics to service port, add healthz, pprof (#1318)" (#1362)
This reverts commit 5ec6a3f2a8.
2022-05-05 04:42:06 +00:00
Evan Baker 7122dc246d
pull images from mcr (#1347)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-26 17:42:45 -05:00
Evan Baker 95155ca9b4
Support NetworkContainers of Static or Dynamic types (#1325)
* add overlay static nnc to nc listener

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* inline err check

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* pass NodeNetworkConfigs to Update method by value

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rework nnc reconcile flow for NetworkContainer modes

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-18 22:07:08 -05:00
Evan Baker 704ba1adf5
fix: correctly return an error for non http 200 response from nmagent (#1333)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-18 21:50:03 +00:00
Evan Baker 31a3b27d7d
save local target spec in ipampoolmonitor instead of overwriting from NNC (#1328)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-14 08:57:34 -07:00
Evan Baker 3d5660cf14
create nnc listener concept and adapt existing poolmonitor and swift service to it (#1323)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-11 19:34:18 -05:00
Evan Baker 90e0999b89
support CIDR notation in the NC PrimaryIP (#1320)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-06 10:11:39 -07:00
Evan Baker 5ec6a3f2a8
move metrics to service port, add healthz, pprof (#1318)
* combine metrics and healthz

* use root mux and bind pprof routes on debug config opt

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Co-authored-by: Paul Miller <pmiller@microsoft.com>
2022-04-05 22:22:49 +00:00
Evan Baker 77dfdcf2af
update to go1.18 (#1281)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-04 16:49:49 -05:00
Jaeryn 257925e066
Set remote ARP MAC address when CNS is running in CRD mode (#1306)
Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
windows dualstack e2e failures not related to this PR
2022-04-01 16:43:02 -07:00
rsagasthya 65ce2f1d96
Added the Pod Net ARM ID as a Label. (#1260)
* Added the Pod Net ARM ID as a Label.

Added Pod Net Arm Id as labels to the CNS Metrics.

* CNS Metrics: ARM ID as Label

Necessary changes to the read the ARM ID metadata coming from DNC.

* Subtle changes in NNC

Variable names had to be modified to meet lint standards.

* Subtle changes in NNC

Variable names had to be modified to meet lint standards.

* Made NNC arg a Pointer

NNC that was passed to create the ARM ID was too heavy. Converted that argument to a pointer and passed the reference of the NNC in that method.

* [CNS] Metrics Prefix

Added the "cx_" prefix to prometheus metrics of CNS.

* Minor Changes

Minor edits with the Metric Names and Documentation.

* ConstLabels in Metrics

Made changes to the ConstLabels in the CNS Metrics.

* ConstLabels in Metrics

Made changes to the ConstLabels in the CNS Metrics.
2022-03-16 02:39:36 +00:00
Evan Baker b5e599a715
ignore pods that have not been assigned an IP yet during cns initialization (#1277)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-03-15 16:21:25 -07:00
Paul Miller e42e41fa1d
feat: wrap a histogram around syncHostNCVersion (#1273)
* wrap a histogram around syncHostNCVersion

* try and fix linter errors though dubious about one

* clean up

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* kick pr?

* use success to be consistent with dnc

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-03-12 01:11:12 +00:00
rsagasthya 569c946435
Subnet labels (#1238)
* 12622609: Dimensions to metrics.

* Modified the constants declaration.

* Made a slight change to make the observeIPPoolState() method inline and remove a whitespace.

* Modified label values to stay consistent with other prometheus labels.
2022-02-24 14:40:00 -08:00
Evan Baker 8f9d6b32d6
use fully-qualified image names in all Dockerfiles (#1248)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-24 12:08:13 -06:00
Paul Johnston 81fd321fef
Allow cns to register node if no NCs are present (#1232) 2022-02-17 12:25:36 -08:00
Evan Baker 09c06b01e1
Parameterize Makefile, parallelize platform builds, support buildah (#1102)
* templatize the container make targets

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add container push to npm conformance task

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update pipelines

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* default OS and ARCH to GOOS and GOARCH

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-15 19:00:04 -06:00
Evan Baker a77e2475b7
Don't send NNC to Pool Monitor before Reconciler starts (#1236)
* remove early nnc send to pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* order service starts better and add logging in initialization

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* wait for reconciler to start

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update comments and docs

Signed-off-by: GitHub <noreply@github.com>
2022-02-15 18:50:30 -06:00
Paul Miller 7b15cbd8f2
get return codes for ipconfig at least (#1231)
* get return codes for ipconfig at least

* fix lint

* valid label name
2022-02-15 12:35:19 -06:00
Evan Baker cadca2f764
Include PendingRelease when calculating IP pool for scale-down (#1237)
* fix scale down

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update pool state variable naming

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add test for pending release

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-15 11:21:08 -06:00
aegal d52ed96a20
guard against unstructured url forwarding in CNS API's (#1192)
* remove forwarding unstructured urls

* remove forwarding unstructured urls

* addressing feedback

* fix linting issues

* update getncversion test

* fix test issue where nmagent handler returns 500

* update for linter

* address comments

* small linter changes

* quick linting changes

* lint

* use gofumpt instead of gofmt

* ignore line count
2022-02-01 15:13:44 -08:00
Evan Baker 427a1e4200
fix initialize from kube pods flow (#1194)
* fix initialize from kube pods flow

Signed-off-by: Evan Baker <rbtr@users.noreply.github.

* move list pods inside closure

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-01-31 14:57:55 -08:00
Evan Baker 80e574b036
match on Node controller ref in NNC event predicate (#1190)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-01-28 17:37:20 -08:00
Evan Baker 1b68ce1ae9
Server-side filtering for NodeNetworkConfig, Node, and Pod objects (#1198)
* server-side filtering for nnc objects

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* yagni node and pod fields in cache builder

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-01-21 11:57:35 -08:00
Evan Baker 39f2bb9ae5
fix: use server-side apply in nnc client (#1152)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-12-07 15:10:04 -08:00
Evan Baker 9fff334a19
feat: shim in metric for ipconfigstatus state transition durations (#1080)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-12-06 18:32:51 -06:00
Vamsi Kalapala 83d161ec41
fix: [CNS] fixing a regex to extract hostip in getncversion req (#1140)
* fix: [CNS] fixing a regex to extract hostip in getncversion req

* taking advice and adding net/url package to extract host
2021-12-02 09:38:54 -08:00
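
The switch from a regex to the net/url package described in #1140 can be illustrated with a minimal sketch. This is not the code from that commit: the helper name `hostFromURL` and the sample URL are hypothetical, and only standard-library calls (`url.Parse`, `(*url.URL).Hostname`) are used.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostFromURL is a hypothetical helper: it returns the host part of a raw
// URL (without the port) using net/url instead of a hand-written regex.
func hostFromURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse url %q: %w", raw, err)
	}
	host := u.Hostname() // strips any ":port" suffix
	if host == "" {
		return "", fmt.Errorf("no host in url %q", raw)
	}
	return host, nil
}

func main() {
	// The address and path below are made up for illustration.
	host, err := hostFromURL("http://10.0.0.4:8080/some/path")
	fmt.Println(host, err)
}
```

Parsing with net/url avoids having to handle ports, paths and query strings in a pattern by hand, which is presumably why the review suggested it.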
Evan Baker a8ba90e608
Update IPAM pool monitor's pool state logic and language (#1126)
* move and rename ipstate and change allocated to assigned

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

typo

* build ip pool state and populate struct pre-reconcile

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* upcast to int64 and move scaler fields to metastate

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix tests

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* upcast usage of int to int64 in monitor fake tests

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update remaining usages of allocate to assign

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-12-01 18:02:27 -06:00
Evan Baker a664889867
fix IPs typos in CNS code and comments (#1111)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-11-29 15:48:31 -06:00
Evan Baker 22906eb9b2
fix ipampool scaling - only release IPs we have (#1110)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-11-22 12:22:30 -06:00
Paul Johnston 6d208e9762
Cns windows aks (#1059)
* chore: add in some functionality for CNS on windows host process pods
2021-11-22 09:23:01 -08:00
Paul Johnston b0b6edfcfe
Azure cns yaml (#1058)
* [feat] making azure-cns-windows yaml
2021-11-18 12:29:51 -08:00
Eng Zer Jun e812bc82b8
refactor: move from io/ioutil to io and os packages (#1096)
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-11-17 16:31:42 -06:00
Evan Baker b1d7723c3a
use correct duration for pool refresh (#1094)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-11-15 12:44:51 -06:00
tamilmani1989 cdab7d0241
Use Lockedfile api to acquire lock (#1070)
* added lockedfileapi support for CNI

* fixed interface changes

* addressed comments
fixed ut

* addressed comments

* fixed copy to buffer part in writer api

* fixed copy to buffer part in writer api

* keeping old code not changing it.
2021-11-09 08:19:44 -08:00
Paul Miller 27ac431a6e
Use avast to retry init cns and register node. (#1087)
* stupid simple retry

* go lint fixes

* missed one //

* avast retry

* try out avast

* vendor

* try and make linters happy

* fix wrap check
2021-11-05 13:20:26 -05:00
Evan Baker 4a4370b0be
chore: tidy up nmagent client for context timeouts (#1056)
* chore: tidy up nmagent client for context timeouts

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* address review feedback

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix tests

* change integration test timeout to 1h

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* guard against nil in wireserver util

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* read body

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* correct durations

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-21 13:14:03 -05:00
Evan Baker 17bd9425d0
chore: tidy up the wireserver client and usage (#1065)
* chore: tidy up IMDS client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rename from imds to wireserver

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-20 14:23:05 -05:00
Evan Baker 943c1aecfd
fix: parse subnet cidr and calculate gateway (#1064) 2021-10-19 12:51:22 -05:00
tamilmani1989 279911c94a
Support for Dualstack transparent (#1046)
* ipv6 dualstack support transparent mode

* golint fixes

* fixed linter errors

* enable ipv6 setting

* dualstack transparent changes

* abstracted platform execute command

* lint fixes
fix compilation issues

* addressed comments

* fixed a bug
2021-10-15 14:28:37 -07:00
Paul Johnston c64c5ab04a
Azure cns windows dockerfile (#1043)
* cns windows dockerfile and make target
2021-10-14 15:42:56 -07:00
Evan Baker 1bcf6d80b5
fix: prevent ipampoolmonitor from reconciling before it receives a nodenetworkconfig (#1044)
fix ipampoolmonitor starting before receiving a nodenetworkconfig

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-13 12:14:50 -05:00
Evan Baker 060a4d8504
add histo metric for IP allocation latency per unique pod (#1027)
* add histo metric for IP allocation latency per unique pod

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix typo in ipam metrics

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* noop on non-existent key

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* create bounded, mapped heap to record pod-first-seen times during ip allocation

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-07 16:14:05 -05:00
Evan Baker c970d07e8f
Create NNC scoped client and isolate ipampoolmonitor and NNC reconciler (#1006)
* create nodenetworkconfig client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add envnodename to config

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rewrite and fully test nnc to nc conversion

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* use nnc clients to separate reconciler and ipampoolmonitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* inline reconciler event filters

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test env config

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* wrap errs

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test reconciler

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* address review comments

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rebase cleanup

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix test

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* address review feedback

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-10-07 11:55:09 -05:00
Paul Johnston 3f97a3040f
Arm64 docker (#1030)
* chore: making docker images arch agnostic through docker buildx
2021-09-28 14:17:25 -07:00
Paul Johnston 950da9a068
Remove cns http client which did nothing (#1019)
* chore: removing cns http client which did nothing
2021-09-21 17:02:03 -07:00
aegal 62172a4387
wrap hnsv2 calls and add a test with mocks (#1012)
* feat: update cns client (#992)

* fix debug commands

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix: update cns client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add ctx to debug calls

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* repackage cns client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add ctx to all methods and preinit all route urls

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* down-scope cns client interface and move to consumer packages

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* no unkeyed struct literals

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* trace updated client method signatures out through windows paths

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix windows build

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* Remove dead codes from telemetry package (#1004)

* Netlink package interfacing and adding a fake (#996)

* Initial pass at Netlink interface

* changing some netlink and epc

* Resolving all dependencies on netlink package

* first pass at adding a netlinkinterface

* windows working now

* feat: update cns client (#992)

* fix debug commands

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix: update cns client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add ctx to debug calls

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* repackage cns client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add ctx to all methods and preinit all route urls

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* down-scope cns client interface and move to consumer packages

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* no unkeyed struct literals

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* trace updated client method signatures out through windows paths

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix windows build

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* windows working now

* Some golints checks

* commenting a flaky NPM UT and adding some golint checks

* renaming fakenetlink to mocknetlink

* removing a mock netlink usage

* fixing more golints and a test fix

* fixing more go lints

* Adding in netlink from higher level as input

* adding netlinkinterface to windows endpoint impl

* removing netlink name confusion

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>

* wrap hns calls

* correct delete endpoint test

* tag with windows

* change uuid for fake

* change parameter name to be relevant

* update to add comment and fix build tag

* include build tag on wrapper

* correct merge conflict error

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: JungukCho <jungukcho@microsoft.com>
Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
2021-09-21 10:39:01 -06:00
Mathew Merrick 715b14a437
chore: Add mock prefix so generated code is not counted against coverage (#983)
* add mock prefix

* add ignore tag
2021-09-21 09:26:23 -07:00
Mathew Merrick f9b65e95d2
test: add multitenancy ut's (#1018)
* multitenancy minor refactor, and add ut's
2021-09-21 09:11:58 -07:00
Hunter Gregory fe23878507
Remove test coverage (#1007)
* removed test/ and testutil/ from code coverage

* remove promutil from coverage

* removed tools/ from code coverage

* removed crd/ from code coverage and updated multitenantnetworkcontainer's manifest

* switch to !ignore_NAME syntax for test and cli tags

* add coverage back to crd (besides autogenerated files)

* rename ignore_test and ignore_cli tags to ignore_uncovered

* make cns/fakes/ uncovered

* mark go files in crd api folders as uncovered again

* add main.go back for nnsmock server
2021-09-17 15:29:40 -07:00
Evan Baker c8f3f53137
test cns config and hns linux stub (#1005)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-09-16 20:25:40 -05:00
Paul Johnston 97f36db4dc
Client http refactor & testing (#1001)
* cns client refactor and unit tests
2021-09-16 10:52:25 -07:00
Evan Baker 69abf11d4c
feat: update cns client (#992)
* fix debug commands

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix: update cns client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add ctx to debug calls

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* repackage cns client

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add ctx to all methods and preinit all route urls

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* down-scope cns client interface and move to consumer packages

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* no unkeyed struct literals

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* trace updated client method signatures out through windows paths

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* fix windows build

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* delint

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-09-14 12:56:32 -05:00
Evan Baker c38865d5dc
simplify debug api handlers (#991)
* simplify debug api handlers

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* review comments

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-09-08 18:37:38 -05:00
Evan Baker 96bec09d41
chore: appease the linter (3/?), the big gofumpt (#987)
* gofumpt -w -s .

* small addtl cleanups after gofumpt

* rerun after rebase
2021-09-02 16:33:18 -05:00
Evan Baker ac0bede752
feat: crash cns when dupe ips are found during reconcile (#939) 2021-09-01 18:35:19 -05:00
Evan Baker 1087201b28
chore: appease the linter, pt 2 of ? (#925) 2021-09-01 18:28:17 -05:00
Evan Baker 26bd9aadc1
enable prometheus metrics in cns (#972)
* gofumpt

* fix unsafe map access

* add metric wrapper to restserver IP endpoints

* add metrics to ip churn

* add metrics for ip pool scaling latency

* address review comments

* make MetricsBindAddress configurable

* use ifaces and test alloc metrics

* rename alloc to pool and update usage of scaler metric

* address review comments
2021-08-27 15:38:17 -07:00
Evan Baker 642028dabc
chore: remove rename from nnc/v1alpha import (#980) 2021-08-25 11:57:14 -05:00
Evan Baker a94dee97d7
typed State and lazy reusable filters (#978)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-08-23 12:29:45 -05:00
Evan Baker b7a82caa19
Consolidate CRD packages (#975)
* relocate nnc to crd/ package and update all CRDs

* update usages of CRDs to crd package
2021-08-18 17:11:52 -05:00
Evan Baker 059b7b5d5c
chore: cleanups in ipampoolmonitor (#969) 2021-08-16 09:14:51 -07:00
Li Ma b85e611e46 fix notfound error for multitenantnetworkcontainer 2021-08-16 13:39:16 +08:00
Evan Baker 3bd237cbf1
fix: hysteresis in ipampool during rapid pool scaling (#967)
* Fix CNS Scale Up/down fix to avoid cleanup pending ToBeDeleted ips

* remove pendingRelease bool flag

* Incorporate feedback

* incorporate lint failures

* fix more lint issues

* nit fixes again for lint errors

* fixes for lint

* fix comment spacing

* fix final lints

* fix broken ipampoolmonitor test

Co-authored-by: neaggarw <neaggarw@microsoft.com>
Co-authored-by: Matthew Long <61910737+thatmattlong@users.noreply.github.com>
2021-08-09 19:34:34 -07:00
Matthew Long 35630c2651
fix: broken multitenant crd unit test (#963)
* fix: broken multitenant crd unit test

* fix: failing test
2021-08-06 16:51:33 -07:00
Evan Baker 212105f260
chore: typed response codes (#954)
* chore: typed error response code
2021-08-06 14:19:21 -07:00
Matthew Long 2d87089f89 check that multitenant info is set 2021-08-04 15:00:24 -07:00
Matthew Long c35a7b72da fix: pass vlan id and encap type to CNS from CRD 2021-08-04 15:00:24 -07:00
Evan Baker fc88940256
chore: update controller-runtime (#957)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-07-30 12:28:21 -05:00
Pengfei Ni 219e7efee4 feat: set primaryInterfaceIdentifier for cns.CreateNetworkContainerRequest 2021-07-26 17:57:54 +08:00
Pengfei Ni fa77d4348f fix: allow Kubernetes orchestrator for CreateOrUpdateNetworkContainer 2021-07-22 11:07:03 +00:00
Pengfei Ni 403f7b3916 fix: pass in orchestrator context when querying NC and add more logs 2021-07-22 06:34:31 +00:00
Evan Baker 617f898b12
fix: bump minimum cni to 1.4.7 for reconcile flow (#948)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2021-07-20 18:06:35 -05:00
Ashvin Deodhar 0bf6a7b162
update cns.yaml to version 1.4.7 (#943) 2021-07-18 10:23:25 -07:00
Evan Baker ef660042db
fix: mount cni bin dir (#937) 2021-07-15 19:02:43 -05:00
Evan Baker 87e4f91fdc
chore: update cns daemonset specs with new mounts (#936) 2021-07-15 17:46:10 -05:00
Evan Baker 8bf1a2418d
fix: add addtl log context when cns mutates cni statefile (#935) 2021-07-14 18:49:24 -05:00
Evan Baker e33035520d
write empty json object to empty cni statefile if exists (#933)
* write empty json object to empty cni statefile if exists

* update tests, address review comments
2021-07-14 16:01:35 -07:00
Mathew Merrick 9b24dbd95a
test: [NPM] Use fakeexec for ipsm and iptm tests (#868)
* iptmgr

* more iptm testing

* grep call

* progress

* progress

* ipsm

* ioshim

* update tests

* package restructure

* fix broken test and delint

* reduce scope of ioshim

* reduce scope of ioshim

* ioshim scope

* require no error, retrigger ci

* ut return multiple results

* fix tests from master changes

* unexport ioshim

* update ut

* fix tests

* vendor

* test fix

* go version

* go version

* pipeline fixes

* fix tests
2021-07-14 12:53:45 -07:00
Evan Baker 45f3668401
chore: appease the linter, pt 1 of ? (#922) 2021-07-08 13:30:59 -05:00
Evan Baker ac3024a442
fix: only use rootCtx in main() (#908) 2021-07-06 17:37:05 -05:00
gossion 3497bb8510
Merge pull request #915 from feiskyer/fix-multitenant-reconciler
fix: set correct orchestratorType and orchestratorContext
2021-07-01 09:10:28 +08:00
Mathew Merrick f2e763050d
fix: [CNI] handle getting endpoints when state file is empty (#916)
* handle empty state file

* update tests

* restore

* fix: add custom unmarshaller for struct with embedded custom interface type

* mkdir images

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2021-06-30 17:54:43 -07:00
Pengfei Ni c9d941f890 fix: set correct orchestratorType and orchestratorContext 2021-06-30 13:54:29 +08:00
Evan Baker 7d224bf3a6
feat: add flow to initialize CNS from CNI (#890)
* feat: add flow to initialize cns state from cni

* address review comments

* Rename the PodIp map

* fix test

* fix version check

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: neaggarw <neaggarw@microsoft.com>
2021-06-29 17:14:11 -05:00
Evan Baker 0bcb0d0da5
chore: migrate from exit/err channels to context (#900)
* chore: migrate from exit/err channels to context

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* pass context instead of storing

* rename controller packages

* comment init
2021-06-23 10:01:13 -07:00
Jaeryn 8acc37d6eb
Removing telemetry processed on HostNetAgent (#903)
Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
2021-06-21 13:22:44 -07:00
neaggarwMS fecf08dd3d
Avoid creating timer each time SyncHostVersion is called, instead use… (#897)
* Avoid creating timer each time SyncHostVersion is called, instead use timer channel

* incorporate feedback

* Use time.Tick instead of NewTicker

* Handle SyncNodeStatus too
2021-06-16 09:27:06 -07:00
aegal b2ea947e94
remove deleting network adapter when no endpoints (#883)
* remove deleting network adapter when no endpoints

* update feedback
2021-06-15 12:12:50 -07:00
Pengfei Ni 4168954200
feat: rename NetworkContainer to MultiTenantNetworkContainer (#892) 2021-06-14 10:17:31 -07:00
Mathew Merrick b09ca83ef7
[CNI] Add GET_ENDPOINT_STATE command to dump CNI state to stdout (#891)
* initial dump state and ipam interface update

* add reconcile command to CNI

* add integration test

* pass endpoint id on add

* address some feedback

* fix test path and linting

* address feedback and logging

* remove return and rename to PodEndpointID
2021-06-11 14:01:42 -07:00
Matthew Long cfa1f508e7
Use cns 1.4.0 in CNS Daemonset yaml (#895)
* update cns tag in yaml

* pipeline cleanup

* Use int64 for NC version in NNC status

* use CNS 1.4.0

Co-authored-by: Matthew Long <Matthew.Long@microsoft.com>
Co-authored-by: Matthew Long <matlong@microsoft.com>
2021-06-10 15:38:43 -07:00
Pengfei Ni de466f58e7
feat: add multi-tenant NetworkContainer controller (#876) 2021-06-03 20:49:00 -07:00
Mathew Merrick fda89d091a
[CNI] Update CNI to pass IP address on delete (#875)
* release with ip address

* CNI contract update

* update contract

* pass endpoint info to invoker delete

* update test

* fix merge artifacts
2021-06-02 00:20:19 -07:00
Mathew Merrick 1fa243e5f5
CI: Add golint-ci (#888)
* add golint-ci

* add gofmt

* enable linters

* uncap count

* fix linting/fmt issues
2021-06-01 16:58:56 -07:00
neaggarwMS d19e04f7c9
Fix to bubble up the error and restart in case it fails to create CRD… (#867)
* Fix to bubble up the error and restart in case it fails to create CRD controllers

* Fix returned error
2021-05-10 16:26:01 -07:00
Mathew Merrick 8c424b245a
[CNI] Specify CNSClient HTTPClient timeout (#862)
* cnsclient timeout

* update request timeout
2021-05-05 17:01:23 -07:00
Ashvin Deodhar 7e834d45bd
Update azure-cns.yaml 2021-05-05 11:02:37 -07:00
Paul Johnston 8449676b4b
feature: don't let CNS ask for more IPS than node limit (#843)
* feature: CNS to honor MaxIPCount in CRD for scaling events
2021-04-09 14:27:10 -07:00
Kshitija Murudi 9aa3781a0a Adding UpdatingIpsNotInUse and Cached NNC to Debug API
In-memory data API - adding 2 more fields to IPAMPoolMonitor
2021-04-07 11:25:35 -07:00
Kshitija Murudi 9947dba9c0 [CNS] Debug API to expose In-Memory Data HTTPRestService
This commit has following changes -
1. Removed AllocatedIPCount field from HTTPRestService
struct in restserver.go as it is not used in project.
2. Added debug command getInMemory and API to expose
fields-HTTPRestService and 2 fields from IPAMPoolMonitor.
Please review the naming of the command, handler and end
point.
3. Added Test function to test the new api response.
4. Added changes as per review comments - Get request
for Test endpoint.
2021-04-06 23:59:26 -07:00
Kshitija Murudi ef138df361 Debug command to expose inmemory struct fields 2021-04-06 23:23:59 -07:00
Kshitija Murudi 8513ed1671
Merge pull request #822 from kmurudi/CNS_debugApi_getPodContexts
CNS Debug API to expose PodIpIdByOrchrestratorContext
2021-03-26 17:32:47 -07:00
Kshitija Murudi 2d318a8d4a CNS Debug API for getting PodContexts
Added GET endpoint instead for API Handler
instead of POST.
Added separate test function.
Formatting on multiple files under /cns.
2021-03-26 14:10:20 -07:00
Mathew Merrick fe66f5b03e
[build] Makefile update to generate image tarballs in artifacts (#804)
Consolidate CNS docker files into 1 only, and output docker images to neat images directory as part of build for release purposes
2021-03-25 14:08:02 -07:00
shchen dc7ff48c78
Fix log which let log to show more readable info. (#803)
Address feedback.
2021-03-17 17:10:23 -07:00
shchen 580eb0c1ad
1. Change comparison in MarkIpsAsAvailableUntransacted from string to int to avoid bug when version 9 update to 10. (#821)
2. Add more log in SyncHostNCVersion.
3. Remove unnecessary check for MarkIpsAsAvailable.
4. Auto formatting.
2021-03-17 16:54:16 -07:00
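
The string-versus-int bug mentioned in item 1 of #821 can be reproduced in a few lines. This is a minimal sketch, not the actual `MarkIpsAsAvailableUntransacted` code: the helper `versionLess` is hypothetical and assumes versions are decimal integers stored as strings.

```go
package main

import (
	"fmt"
	"strconv"
)

// versionLess is a hypothetical helper: it reports whether version a is
// older than version b, comparing them as integers rather than as strings.
func versionLess(a, b string) (bool, error) {
	av, err := strconv.Atoi(a)
	if err != nil {
		return false, fmt.Errorf("parse version %q: %w", a, err)
	}
	bv, err := strconv.Atoi(b)
	if err != nil {
		return false, fmt.Errorf("parse version %q: %w", b, err)
	}
	return av < bv, nil
}

func main() {
	// Lexicographic comparison gets this wrong: '9' sorts after '1'.
	fmt.Println("9" < "10") // false

	// Numeric comparison gives the intended ordering.
	less, err := versionLess("9", "10")
	fmt.Println(less, err) // true <nil>
}
```

String comparison only agrees with numeric ordering while both versions have the same number of digits, so the bug stays hidden until the version goes from 9 to 10.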
Kshitija Murudi dbb4e129b8 CNS Debug API to expose PodIpIdByOrchrestratorContext
This API exposes the Httpservice struct field to display
map of OrchestratorContext->PodIpID (UUID) and includes
debug command getPodContexts for CLI use-case.
2021-03-17 12:26:38 -07:00
Kshitija Murudi 5a6c4846de Minor changes for the review comments
GetIPAddressStateResponse kept as original struct, and added
GetIPAddressStatusResponse
Minor changes to reflect GetIPAddressStatusResponse
2021-03-02 09:27:53 -08:00
Kshitija Murudi 199980cb42 CNS_Fixing debug API
The GetIPAddressesMatchingStates now returns IPConfigurationStatus
 type, which also includes PodInfo along with IP address and state
fix : minor formatting
2021-03-02 09:14:27 -08:00
neaggarwMS cab6615310
Azure CNS fix reconcile bug (#809)
Handle the race scenario where HTTP Listener gets started before initializing CNS state. During Initialization/Reconciliation, CNS gets the CRD and recreates the NC request with SecondaryIPs, then it gets list of pods and set those ips as Allocated. If HTTP listener is started then CNI will start requesting ips to CNS and CNS will start allocating right after they are set to Available, thus double allocations. Thus reordered the calls and ensured HTTP Listener is started after initializing CNS.

Also fixed:

* some logging
* avoid double release during failure in IPAM pool monitor
2021-03-02 09:01:49 -08:00
neaggarwMS c5f7dcd81a
Cns fixipampoolmonitor errorhandling (#806)
* temp changes

* Fixed error handling in IPAMPoolMonitor

* incorporate feedback

* Fixed the String() implementation for struct

* Fixed error message

* Fixed logging

* Fixed regressions
2021-03-01 18:38:52 -08:00
Paul Johnston f590717d6b
chore: consolidating logger package for cns aks swift scenario code (#805) 2021-02-26 14:59:03 -08:00
neaggarwMS c9d184183c
Remove Log noise from ipam pool monitor (#798)
* Remove Log noise from ipam pool monitor

* incorporate feedback

* Fix logging string

* build fix
2021-02-22 17:46:15 -08:00
tamilmani1989 658684092b
updated cns version and created cni installer for swift (#794)
no change in code,..so skipping build and test
2021-02-19 18:13:40 -08:00
shchen d4f1f4d4b2
Remove overriding host version when IP already exists to avoid pending programming IPs becoming available after CNS restarts and restores from previous state. (#781) 2021-02-11 15:40:01 -08:00
Ashvin Deodhar 4c4e5f5d6d
fix: Remove unreadable CNS state file upon CNS service start (#782)
This change removes unreadable state file ( azure-cns.json ) while starting the CNS service
2021-02-11 09:10:13 -08:00
Mathew Merrick 73ae4e0325
chore: update tools directory and deployments (#779) 2021-02-09 17:56:37 -08:00
shchen c1334728ce
Update CNS ubuntu version to 20.04 (#772) 2021-01-28 12:19:31 -08:00
Mathew Merrick 39c9ad3a72
test: Add Swift Testing Pipeline (#712)
* Add new Swift test scenarios
2021-01-20 11:00:32 -08:00
tamilmani1989 33a26bd830
changed blocking channel send to non-blocking (#766) 2021-01-15 20:44:14 -08:00
shchen 327b0b6415
Ensure pending programming IPs will be released before available IPs when scale down. (#750)
* Ensure pending programming IPs will be released first when scale down.

* Addressed feedbacks for testing and coding style.
2021-01-15 10:05:59 -08:00
tamilmani1989 98f838ef1b
Write to intermediate file before moving to state file (#755)
* write to temp file and move to state file

* fixed memleak and other issues

* call windows replace function with MOVEFILE_WRITE_THROUGH flag

* moved few functions to platform package

* moved test files to correct dir

* addressed comments
2021-01-07 17:43:33 -08:00
shchen 3c48a34df6
Add a go routine to update NC host version from NMAgent periodically. (#714)
* Add a go routine to update NC host version from NMAgent periodically.
If orchestrator type is CRD, update pending programming IPs as well.

* Update NC version in test from 0 to -1, which makes the default IP state Available instead of pending programming.

* Add secondary IP status update on reconcile.
Resolve conflicts manually.

Update unit test nc version value.

Update unit test nc version.

Add get nmagent default value back for integ testing purpose.
Unit tests can be broken by this change.

Update default new IP CNS status to available.

Assign value to host version if none exist in util.go

Addressed feedback and performed cluster integ test with NC version update every 1 sec.
Need to clean logNCSnapshots when sending out the PR.

Update nc version associate with secondary ip. Add new nmagent api test.

Add versionResponseWithoutToken.Containers log

Add containerId from our runner sub.
Add containerId from NMAgent team.

Addressed feedback and add real nmagent logic.

Add timeout when query nmagent for nc version.

* Update comments.

* Add context background with timeout function for syncing node nc version.

* Add 5 second force update CNS pending programming IP to available logic.

* Resolve merge conflict from master.

* Debugged; it passes all the tests.
This is the final version.

Change the HTTP GET request to carry a context. Change channel to unbuffered with same goroutine.
Found that it always falls into the ctx.Done() case.
Add channel close for get NC version list.
Add millisecond unit for timeout.

Testing with different context version.

* Resolve merge conflict.

* Remove the logic that force-updates pending programming IPs to available. Keep retrying if there is no response from NMAgent.
Release pending programming IPs when scaling down.

* Keep the VMVersion and HostVersion variable names as they are and use the Version inside CreateNetworkContainerRequest.

* Addressed team member feedback.
2020-12-11 13:54:17 -08:00
Matthew Long 2e2efa7a79
Update cns yaml (#743)
* update cns tag in yaml

* pipeline cleanup

* Use int64 for NC version in NNC status

* bump cns version to 1.2.0

Co-authored-by: Matthew Long <Matthew.Long@microsoft.com>
Co-authored-by: Matthew Long <matlong@microsoft.com>
2020-12-07 19:08:44 -05:00
aegal 169d7d7935
Updates for TLS: reading from encrypted PEM file & hostname fix (#742)
* tls fixes

* updating test

* update to support linux as well

* update to support linux and windows

* remove old test file

* pushing minor changes
2020-11-30 14:05:25 -08:00
Ashvin Deodhar 8e7c43edb4
Remove GetNetworkContainerStatus func (#741)
This change removes the unused / unsupported function exposed by CNS
2020-11-25 13:21:56 -08:00
Ashvin Deodhar 007e903a21
Prevent storing auth token in CNS state file (#739)
This change prevents the subnet auth token getting saved in the CNS state file.
2020-11-24 16:29:44 -08:00
Ramiro 1ece2a6d7f making endpoint configurable, attempt to extract host 2020-11-23 18:38:19 -08:00
Vamsi Kalapala a22a852381
Merge pull request #680 from Azure/vakr/cns_lb_mnat
mNAT + LB support for swift containers
2020-11-19 14:15:00 -08:00
vakr 4d3de91316 Addressing comments 2020-11-19 12:27:53 -08:00
vakr cf34795529 changing logic around noderegister 2020-11-19 11:48:02 -08:00
vakr aafc04124f Addressing comments 2020-11-17 19:47:21 -08:00
neaggarwMS 28207fc493
CNS: Fix IPAM scaler down to handle incremental decrease (#731) 2020-11-16 16:47:28 -08:00
vakr 2129a6f5b1 Adjusting logic on re-registering the Node in mDNC case 2020-11-16 10:08:34 -08:00
vakr 6717e485e7 Correcting a timing of the ticker 2020-11-12 13:13:13 -08:00
vakr b971ce473f Correcting a timing of the ticker 2020-11-12 13:03:16 -08:00
vakr 3eedd84c86 changing wireserver to Var, as tests are trying to change this value 2020-11-12 09:54:11 -08:00