Commit graph

1339 commits

Author SHA1 Message Date
aegal 251edbfd04
Alegal/add timeout to hns calls (#1369)
* initial implementation with timeout

* initial implementation with timeout hns

* modify test

* modify code slightly

* updating to read in timeout flag and settings

* updating to read in timeout settings

* remove extra space

* correct a typo

* timeout value greater than zero for detection

* add couple ut's and remove needless code

* including timeout in hnsv1

* wip

* address comments

* address comments

* suppress linter errors and update conflist

* fix linter and ensure we don't regress our tests

* updating with p.r feedback

* addressing comments

* updating linter warning

* update to address TM's comments

* fix lint error

* correct a linter spacing complaint

* remove fmt.sprintf
2022-05-25 14:18:46 -07:00
Hunter Gregory b1ae762c89
fix race (#1400) 2022-05-25 22:06:20 +05:30
Hunter Gregory ebd695e23f
fix: [NPM] fix windows intersection of IPs (#1385)
* fix windows intersection

* test intersection

* fix build

* separate out children of pod selector and add translate UTs

* address comments

* address named return comment

* optimize space/time for checking if an ip satisfies a pod selector
2022-05-23 14:05:17 -07:00
Mathew Merrick dfddb2092d
save image tarballs as pipeline artifacts (#1394)
* skopeo save
2022-05-23 11:44:06 -07:00
Keith Nguyen 81b84f6eea
Bugfix/14337706/remove state file (#1392)
* fix: remove state file to let CNI autorecover if CNI state file not present and RefCount not 0

* fix: move to invoker_azure.go

* fix: move to invoker_azure.go

* fix: revert error to private

* fix: unit test

* fix: move into delegate add/delete

Co-authored-by: Keith Nguyen <keithnguyen@microsoft.com>
2022-05-20 17:43:34 -05:00
Jaeryn 7219bb2dd9
fix: Service Account Mitigation for CNS on k8s Windows 2022 (#1367)
* Service Account Mitigation for CNS on k8s Windows 2022

* pick up Neha's bug fix

* addressing comments

* add node selector back

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
2022-05-20 10:03:10 -07:00
dependabot[bot] 4a599154ee
deps: bump github.com/hashicorp/go-version from 1.4.0 to 1.5.0 (#1393) 2022-05-19 20:45:10 +00:00
Mathew Merrick 1d5f682802
Add cleanup network script (#1371)
* add cleanup network script

* generalize

* generic cwd

* loop through with prefix

* simplify checks

* remove network option
2022-05-19 11:58:50 -07:00
tamilmani1989 75e1239132
Remove duplicate logs (#1375)
* removed cni read config log

* removed duplicated and spam logs

* addressed comment

* commit

* reverting back to old permission

* revert files back to original state

* addressing hunter comments
2022-05-19 10:00:52 -07:00
Hunter Gregory cb856749e2
fix translate bug (#1384) 2022-05-19 09:42:27 -07:00
dependabot[bot] 47268cca62
deps: bump sigs.k8s.io/controller-runtime from 0.12.0 to 0.12.1 (#1388)
Bumps [sigs.k8s.io/controller-runtime](https://github.com/kubernetes-sigs/controller-runtime) from 0.12.0 to 0.12.1.
- [Release notes](https://github.com/kubernetes-sigs/controller-runtime/releases)
- [Changelog](https://github.com/kubernetes-sigs/controller-runtime/blob/master/RELEASE.md)
- [Commits](https://github.com/kubernetes-sigs/controller-runtime/compare/v0.12.0...v0.12.1)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/controller-runtime
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-19 10:40:11 -05:00
Evan Baker 5d41d5dc8e
add build/tools to dependabot tracking (#1381)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-18 23:23:01 +00:00
Jaeryn bc4a72fab1
Use default build pool for AKS-Swift (#1387)
Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
2022-05-18 13:26:10 -05:00
rjdenney d417856b51
Fix the CNI conflist of multitenancy and add the correct ipam type (#1373)
* Fix the CNI conflist of multitenancy and add the correct ipam type

* Adding the change to windows
2022-05-18 09:45:08 -07:00
Evan Baker f35c92cb99
add independent healthserver for metrics and healthz (#1378)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-16 12:01:15 -05:00
Tahmid Alam 2d646ef97a
CNI plugin version has been updated to the latest version. (#1377) 2022-05-16 12:02:17 -04:00
neaggarwMS 661bd5bd46
Fix CNS Reconciling and AssignmentMode handling for AKS Swift (#1368)
* Fix CNS Reconciling and AssignmentMode handling for AKS Swift

* Fixed the handling of default mode == Dynamic

* Incorporate feedback

* golint fixes

* Fix for whyNoLint: include an explanation for nolint directive (gocritic)
2022-05-13 17:58:06 -05:00
dependabot[bot] 25a73e8c30
deps: bump sigs.k8s.io/controller-runtime from 0.11.2 to 0.12.0 (#1376) 2022-05-13 22:57:38 +00:00
Hunter Gregory a15630bc1b
fix/refactor: [NPM] eliminate struct fields to prevent errors & reduce memory (#1374)
* remove Name field from NPMNetworkPolicy

* wip moving acl policy id

* fix policy key typo for removing Name field

* fix lint and log

* temp debug logs

* another debug log

* update a couple UTs
2022-05-13 11:41:02 -07:00
Camryn Lee 281eb70929
Fix CNS logging (#1372)
* modify cns logging

* update to implement stringer interface

* fixing linting errors

* removing log metadata

Co-authored-by: Camryn Lee <camrynlee@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-12 11:31:00 -07:00
Evan Baker a8cf7a243a
include 0th address when parsing overlay cidrs (#1365)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-11 19:49:16 +00:00
Evan Baker 8750b346ed
CNS Prometheus and Grafana examples (#1366)
* cns prometheus examples

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* grafana samples

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-11 12:34:47 -05:00
Evan Baker 2c77774852
update dependabot (#1370)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-10 15:40:53 -05:00
Evan Baker d56fac6e4e
build overlay cni tgz (#1351)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-06 12:14:46 -05:00
dependabot[bot] f860b1660c
vendor: bump github.com/google/go-cmp from 0.5.7 to 0.5.8 (#1353) 2022-05-06 16:58:15 +00:00
Evan Baker 1e0602ba77
hack pods for debugging (#1364)
* add cx-access kubectl pod

Signed-off-by: GitHub <noreply@github.com>

* add additional manifests

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-05 16:35:46 -07:00
Timothy J. Raymond 94f73740e7
Add NMAgent Client (#1305)
* Add implementation for GetNetworkConfiguration

Previously the NMAgent client did not have support for the
GetNetworkConfiguration API call. This adds it and appropriate coverage.

* Refactor retry loops to use shared function

The cancellable retry was common enough that it made sense to extract it
to a separate BackoffRetry function in internal. This made its
functionality easier to test and reduced the number of tests necessary
for each new endpoint

* Slight re-org

The client had enough extra stuff in it that it made sense to start
separating things into different files

* Add retries for Unauthorized responses

In the original logic, unauthorized responses are treated as temporary
for a specific period of time. This makes the nmagent.Error consider
Unauthorized responses as temporary for a configurable time. Given that
BackoffRetry cares only whether or not an error is temporary, this
naturally causes them to be retried.

Additional coverage was added for these scenarios as well.

* Add a WireserverTransport

This deals with all the quirks of proxying requests to NMAgent through
Wireserver, without spreading that concern through the NMAgent client
itself.

* Reorganize the nmagent internal package

The wireserver transport became big enough to warrant its own file

* Use WireserverTransport

This required some changes to the test so that the WireserverTransport
middleware could take effect always

* Add PutNetworkContainer method to NMAgent client

This is another API that must be implemented

* Switch NMAgent client port to uint16

Ports are uint16s by definition.

* Add missing body close and context propagation

* Add DeleteNetworkContainer endpoint

* Move internal imports to another section

It's a bit clearer when internal imports are isolated into one section,
standard library imports in another, then finally external imports in
another section.

* Additional Validation / Retry improvements

This is a bit of a rollup commit, including some additional validation
logic for some nmagent requests and also some improvements to the
internal retry logic. The retry logic is now a struct, and the client
depends only on an interface for retrying. This is to accommodate the
existing retry package (which was unknown).

The internal Retrier was enhanced to add a configurable Cooldown
function with strategies for Fixed backoff, Exponential, and a Max
limitation.

* Move GetNetworkConfig request params to a struct

This follows the pattern established in other API calls. It moves
validation to the request itself and also leaves the responsibility for
constructing paths to the request.

* Add Validation and Path to put request

To be consistent with the other request types, this adds Validate and
Path methods to the PutNetworkContainerRequest

* Introduce Request and Option

Enough was common among building requests and validating them that it
made sense to formalize it as an interface of its own. This allowed
centralizing the construction of HTTP requests in the nmagent.Client. As
such, it made adding TLS disablement trivial. Since there is some
optional behavior that can be configured with the nmagent.Client,
nmagent.Option has been introduced to handle this in a clean manner.

* Add additional error documentation

The NMAgent documentation contains some additional documentation as to
the meaning of particular HTTP Status codes. Since we have this
information, it makes sense to enhance the nmagent.Error so it can
explain what the problem is.

* Fix issue with cooldown remembering state

Previously, cooldown functions were able to retain state across
invocations of the "Do" method of the retrier. This adds an additional
layer of functions to allow the Retrier to purge the accumulated state

* Move Validation to reflection-based helper

The validation logic for each struct was repetitive and it didn't help
answer the common question "what fields are required by this request
struct." This adds a "validate" struct tag that can be used to annotate
fields within the request struct and mark them as required.

It's still possible to do arbitrary validation within the Validate
method of each request, but the common things like "is this field a zero
value?" are abstracted into the internal helper. This also serves as
documentation to future readers, making it easier to use the package.

* Housekeeping: file renaming

nmagent.go was really focused on the nmagent.Error type, so it made
sense to rename the file to be more revealing. The same goes for
internal.go and internal_test.go. Both of those were focused on retry
logic.

Also added a quick note explaining why client_helpers_test.go exists,
since it can be a little subtle to those new to the language.

* Remove Vim fold markers

While this is nice for vim users, @ramiro-gamarra rightly pointed out
that this is a maintenance burden for non-vim users with little benefit.
Removing these to reduce the overhead.

* Set default scheme to http for nmagent client

In practice, most communication for the nmagent client occurs over HTTP
because it is intra-node traffic. While this is a useful option to have,
the default should be useful for the common use case.

* Change retry functions to return durations

It was somewhat limiting that cooldown functions themselves would block.
What was really interesting about them is that they calculated some
time.Duration. Since passing 0 to time.Sleep causes it to return
immediately, this has no impact on the AsFastAsPossible strategy

Also improved some documentation and added a few examples at the request
of @aegal

* Rename imports

The imports were incorrect because this client was moved from another
module.

* Duplicate the request in wireserver transport

Upon closer reading of the RoundTripper documentation, it's clear that
RoundTrippers should not modify the request. While effort has been made
to reset any mutations, this is still, technically, modifying the
request. Instead, this duplicates the request immediately and re-uses
the context that was provided to it.

* Drain and close http ResponseBodies

It's not entirely clear whether this is needed or not. The documentation
for http.(*Client).Do indicates that this is necessary, but
experimentation in the community has found that this is maybe not 100%
necessary (calling `Close` on the Body appears to be enough).

The only harm that can come from this is if Wireserver hands back
enormous responses, which is not the case--these responses are fairly
small.

* Capture unexpected content from Wireserver

During certain error cases, Wireserver may return XML. This XML is
useful in debugging, so we want to capture it in the error and surface
it appropriately. It's unclear whether Wireserver notes the
Content-Type, so we use Go's content type detection to figure out what
the type of the response is and clean it up to pass along to the NMAgent
Client. This also introduces a new ContentError which semantically
represents the situation where we were given a content type that we
didn't expect.

* Don't return a response with an error in RoundTrip

The http.Client complains if you return a non-nil response and an error
as well. This fixes one instance where that was happening.

* Remove extra vim folding marks

These were intended to be removed in another commit, but there were some
stragglers.

* Replace fmt.Errorf with errors.Wrap

Even though fmt.Errorf provides an official error-wrapping solution for
Go, we have made the decision to use errors.Wrap for its stack
collection support. This integrates well with Uber's Zap logger, which
we also plan to integrate.

* Use Config struct instead of functional Options

We determined that a Config struct would be more obvious than the
functional options in a debugging scenario.

* Remove validation struct tags

The validation struct tags were deemed too magical and thus removed in
favor of straight-line validation logic.

* Address Linter Feedback

The linter flagged many items here because it wasn't being run locally
during development. This addresses all of the feedback.

* Remove the UnauthorizedGracePeriod

NMAgent only defines 102 processing as a temporary status. It's up to
consumers of the client to determine whether an unauthorized status
means that it should be retried or not.

* Add error source to NMA error

One of the problems with using the WireserverTransport to modify the
http status code is that it obscures the source of those errors. Should
there be an issue with NMAgent or Wireserver, it will be difficult (or
impossible) to figure out which is which. The error itself should tell
you, and WireserverTransport knows which component is responsible. This
adds a header to the HTTP response and uses that to communicate the
responsible party. This is then wired into the outgoing error so that
clients can take appropriate action.

* Remove leftover unauthorizedGracePeriod

These blocks escaped notice when the rest of the UnauthorizedGracePeriod
logic was removed from the nmagent client.

* Remove extra validation tag

This validation tag wasn't noticed when the validation struct tags were
removed in a previous commit.

* Add the body to the nmagent.Error

When errors are returned, it's useful to have the body available for
inspection during debugging efforts. This captures the returned body and
makes it available in the nmagent.Error. It's also printed when the
error is converted to its string representation.

* Remove VirtualNetworkID

This was redundant, since VNetID covered the same key. It's actually
unclear what would happen in this circumstance if this remained, but
since it's incorrect this removes it.

* Add StatusCode to error

Clients still want to be able to communicate the status code in logs, so
this includes the StatusCode there as well.

* Add GreKey field to PutNetworkContainerRequest

In looking at usages, this `greKey` field is undocumented but critical
for certain use cases. This adds it so that it remains supported.

* Add periods at the end of all docstrings

Docstrings should have punctuation since they're documentation. This
adds punctuation to every docstring that is exported (and some that
aren't).

* Remove unused Option type

This was leftover from a previous cleanup commit.

* Change `error` to a function

The `nmagent.(*Client).error` method wasn't actually using any part of
`*Client`. Therefore it should be a function. Since we can't use `error`
as a function name because it's a reserved keyword, we're throwing back
to the Perl days and calling this one `die`.
2022-05-05 17:58:21 -05:00
Jaeryn dbb4f68393
build: Add Windows CNS & NPM as Part of Multi-Arch Manifest (#1345)
* add windows cns manifest to multi arch image

* try to use generic windows template w/ containerize stage in pipeline

* try and use buildah to pull images

* update manifest build and push for buildah

* create manifest by referencing images instead of pulling to avoid OS mismatch error

* remove unused windows-image.yaml

* remove REGISTRY var and use IMAGE_REGISTRY from makefile

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-05 15:32:53 -07:00
dependabot[bot] f9bb532fe3
vendor: bump google.golang.org/grpc from 1.45.0 to 1.46.0 (#1348) 2022-05-05 20:53:43 +00:00
dependabot[bot] 53c81536d1
vendor: bump k8s.io/apiextensions-apiserver from 0.23.5 to 0.24.0 (#1360) 2022-05-05 09:20:22 +00:00
Evan Baker bd37755a35
Revert "move metrics to service port, add healthz, pprof (#1318)" (#1362)
This reverts commit 5ec6a3f2a8.
2022-05-05 04:42:06 +00:00
dependabot[bot] d7a5b9eb57
vendor: bump k8s.io/client-go from 0.23.5 to 0.24.0 (#1361) 2022-05-05 02:51:40 +00:00
Evan Baker 6394cf6707
build multiplat manifests with buildah (#1356)
* build multiplat manifests with buildah

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* copy container into docker-daemon cache and re-enable trivy

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* set -e

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-05-04 18:18:26 -05:00
Evan Baker 11dcadda18
update missed docker image pull (#1354)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-27 16:38:47 -05:00
Evan Baker 7122dc246d
pull images from mcr (#1347)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-26 17:42:45 -05:00
Evan Baker dd91f431f7
remove workspace and add Make helpers to set it up (#1350)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-25 13:42:29 -05:00
Evan Baker 41479d8de3
make zapai a module (#1334)
* make zapai a module

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add build tools module to workspace

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-25 10:05:29 -07:00
Jaeryn bfe5d082d6
test: Add WS2022 Test to Pipeline (#1299)
* Adding ws2022 tests

* add eof newline

* bumping orchestrator release/version

* use containerd v1.6.2

* trying different orchestrator versions

* update pipeline template to pull aks-e from private release

* updating image reference

* Exposing ws2022 e2e gallery/image subscription and version as pipeline variables

* Enabling host process containers in api server config

Co-authored-by: Jaeryn <tsun.chu@microsoft.com>
2022-04-21 17:37:36 -07:00
msvik 9604d6e22f
CNI Linux: Add 10 seconds timeout for ExecuteCommand (#1336)
* CNI Linux: Add 10 seconds timeout for ExecuteCommand

* ran gofmt on changed file

* Fix lint error

* Added unit test for ExecuteCommand for linux

* Moved timeout to execlient. It has default timeout and method to override it now

* reduce timeout values in unit tests

Co-authored-by: msvik <msvik@users.noreply.github.com>
Co-authored-by: VK <misvik@users.noreply.github.com>
2022-04-21 10:00:50 -07:00
Evan Baker e4de3c1b74
remove vendor (#1338)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-19 23:54:29 +00:00
Cristina Kovacs 3df08d9419
test: added additional ingress tests (#1316)
* added additional tests

* fixed lint issue
2022-04-19 13:29:19 -07:00
Hunter Gregory 9bae90a728
fix: [NPM] clean up logs (#1332)
* wip

* fix lint

* change wording in logs

* address comment
2022-04-19 13:29:02 -07:00
Steven Nguyen 5b9cbd4e15
test: add tests for npm/util (#1303) 2022-04-19 13:27:00 -07:00
Evan Baker 95155ca9b4
Support NetworkContainers of Static or Dynamic types (#1325)
* add overlay static nnc to nc listener

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* inline err check

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* pass NodeNetworkConfigs to Update method by value

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* rework nnc reconcile flow for NetworkContainer modes

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-18 22:07:08 -05:00
Hunter Gregory 3b876d4b92
perf: [NPM] make apply ipsets faster (#1307)
* wip

* fix member update

* fix hashed name bug

* fix hashed name bug and bug for original members for delete cache

* fix delete bug

* print cache in logs again

* some UTs

* dirty cache UTs

* fix windows build problem

* dirty cache with members to add/delete & keep old apply with save file code

* fix windows

* fix windows 2

* fix windows 3

* UTs and dirty cache in same struct shared between each OS

* fix windows build

* change some error situations to valid situations

* log clean up and addressing comments
2022-04-18 18:03:52 -07:00
Evan Baker 704ba1adf5
fix: correctly return an error for non http 200 response from nmagent (#1333)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-18 21:50:03 +00:00
Tahmid Alam b13d75a99d
Return if execution mode is v4OnlyOverlay (#1326)
* Return if execution mode is v4OnlyOverlay

* adding Ipam mode and addressing comments

* Fixing tests after IPAM struct changes.

* variable name fix
2022-04-18 12:59:16 -04:00
dependabot[bot] 5b3ed19805
vendor: bump github.com/spf13/viper from 1.10.1 to 1.11.0 (#1330)
Bumps [github.com/spf13/viper](https://github.com/spf13/viper) from 1.10.1 to 1.11.0.
- [Release notes](https://github.com/spf13/viper/releases)
- [Commits](https://github.com/spf13/viper/compare/v1.10.1...v1.11.0)

---
updated-dependencies:
- dependency-name: github.com/spf13/viper
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-14 17:22:43 -05:00
Evan Baker c0d88234c5
move mode to NC and add type (#1331)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-04-14 17:21:39 -05:00
Paul Johnston 7fb08d9775
Make multitenant configs plumb-able in cni manager (#1301) 2022-04-14 17:18:47 -05:00