Commit Graph

300 Commits

Author SHA1 Message Date
Hunter Gregory 2688c59684
fix: [NPM] [Linux] race condition when editing NetPol with "except" CIDR (#2841)
* fix: syntax error when deleting nomatch CIDR ipset

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

* test: ut members with nomatch

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

---------

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2024-07-24 19:22:14 +00:00
Evan Baker 994ba651b9
combine linux and win Dockerfiles using build targets (#2559)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2024-06-17 21:53:44 +00:00
Evan Baker 0c598e3e68
fix: pin Windows images (#2742)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2024-05-16 23:45:44 +00:00
John Payne cc68e88585
ci: update NPM Windows daemonset for k8s version 1.28 (#2622)
* ci: add working directory for k8s 1.28

* ci: match workflows

* ci: use branch daemonset
2024-03-06 18:39:18 +00:00
rayaisaiah e89223e29f
chore: [NPM] Updated NPM to Not Use Privileged containers and to Mount Container's Root Filesystem as Read Only (#2598)
Resolved NPM vulnerabilities: do not use privileged containers, and mount the container's root filesystem as read-only
2024-02-29 02:56:16 +00:00
rayaisaiah 8d68e7527a
chore: [NPM] Updated NPM to Not Share Host's UTS Namespace + Image/Configmap Alignment with Prod (#2589)
* Added a security context for allowPrivilegeEscalation and readOnlyRootFilesystem

* Update npm linux to not share the host's UTS namespace and tested locally

* Updated image and configmap of npm to match prod/managed

* kept EnablePprof on for debugging

* Updating k8s version for kind for cyclonus tests

* test

* test

* updated cluster name

* Revert "updated cluster name"

This reverts commit 7715c91098.

* update name

* Updated k8s version

* updated k8s version

* changed k8s version to version of local cluster

* updated kind node version for control plane

* version update

* updated kind version

* updated worker images for kind
2024-02-15 22:28:26 +00:00
rayaisaiah ea7e733637
chore: [NPM] Remove TLS Certifications in NPM (#2561)
Remove certs created in npm images
2024-02-07 17:16:20 +00:00
Evan Baker ad10662314
chore: migrate from disallowed registries (#2455)
Signed-off-by: GitHub <noreply@github.com>
2024-01-03 18:07:28 +00:00
rayaisaiah a69bd6cd13
fix: NPM Changed Allow All Policy for Ingress and Egress (#2409)
Changed allow all policy for ingress and egress to allow all even with other rules in netpol and added tests.
2023-11-30 09:44:11 -08:00
Evan Baker 5cad713f31
chore: update to go1.21 (#2384)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-11-15 15:17:26 -06:00
rayaisaiah 74a67b306c
fix: Changed the SetPolicySetting struct value to correct name in Windows NPM (#2386)
Created a new PR in the non-forked ACN repo that changed the setPolicy struct field PolicyType to Type in Windows NPM.
2023-11-15 01:23:56 +00:00
Hunter Gregory 2e1601a9af
build: [NPM] update repo for ubuntu 20 (#2217)
* build: from ubuntu 20 (just adding whitespace)

* build: update repo

---------

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2023-10-30 14:12:05 -07:00
Hunter Gregory f4dd79c69a
test(capz): [WIN-NPM] support containerd 1.7 filesystem (#2267)
* style: whitespace in all NPM yamls

* fix(capz): first try for containerd 1.7 filesystem change

* fix(capz): remove kubeconfig arg and add comment
2023-10-05 14:20:25 +00:00
Hunter Gregory 0f3cd581b2
chore: bump azure-npm-capz.yaml (#2053)
* update azure-npm-capz.yaml

* style: whitespace
2023-07-19 16:13:12 -07:00
Hunter Gregory ebddca18bd
perf: [NPM] [LINUX] add NetPols in background (#1969)
* wip: apply dirty NetPols every 500ms in Linux

* only build npm linux image

* fix: check for empty cache

* feat: toggle for netpol interval. default 500 ms

* ci: remove stages "build binaries" and "run windows tests"

* wip: max batched netpols (toggle-specified)

* ci: remove manifest build/push for win npm

* wip: handle ipset deletion properly and max batch for delete too

* fix: correct remove policy

* fix: only remove policy if it was in kernel

* finalize toggles, allowing ability to turn off iptablesInBackground

* ci: conf + cyc use PR's configmaps

* fix: lints

* fix dp toggle: iptablesInBackground

* fix lock typo and config logging

* fix background thread. add comments. only add tmp ref when enabled

* copy pod selector list

* fix: removepolicy needs namespace too

* rename opInfo to event

* fix: fix references and prevent concurrent map read/write

* tmp: debug logging

* fix: missing set references by swap keys and values

* Revert "tmp: debug logging"

This reverts commit 70ed34c714ea4a6d009a1fe90a7168be4bedd5bf.

* fix: add podSelectorList to fake NetPol

* log: do not print error when failing to delete non-existent nft rule

* log: verbose iptables bootup

* log: use fmt.Errorf for clean logging

* log: never return error for iptables in background and fix some lints

* fix: activate/deactivate azure chain rules

* fix: correctly decrement netpols in kernel

* ci: run UTs again

* ci: update profiles. default to placefirst=false

* address comment: rename batch to pendingPolicy

* refactor: make dirty cache OS-specific

* test: UTs

* test: put UT cfg back to placefirst to not break things

* ci: update cyclonus workflows

* fmt: address comment & lint

* fmt: rename numInKernel to policiesInKernel

* log: switch to fmt.Errorf

* fmt: whitespace

* feat: resiliency to errors while reconciling dirty netpols

* log: temporarily print everything for ipset restore

* fix: remove nomatch from ipset -D for cidr blocks

* test: UTs for non-happy path

* test: fix hns fake

* fix: don't change windows. let it delete ipsets when removing policies

* fix windows lint

* fix: ignore chain doesn't exist errors for iptables -D

* feat: latency and failure metrics

* test: update exit code for UT

* metrics: new metrics should go in node-metrics path

* style: simplify nesting

* style: move identical windows & linux code to shared file

* ci: remove v1 conformance and cyclonus

* feat: add NetPols in background from the DP (revert background code in pMgr)

* style: remove "background" from iptables metrics

* revert changes in ipsetmanager, const.go, and dp.Remove/UpdatePolicy

* style: whitespace

* perf: use len() instead of creating slice from map

* remove verbosity for iptables bootup

* build: add return statement

* style: whitespace

* build: fix variable shadowing

* build: fix more import shadowing

* build: windows pointer issue and UT issue

* test: fix UT for iptables error code 2

* ci: enable linux scale test

* ci: revert to master pipeline.yaml

* revert changes to chain-management. do changes in PR #2012

* log: change wording

* test: UTs for netpol in background

* log: wording

* feat: apply ipsets for each netpol individually

* config: rearrange ConfigMap & update capz yaml

* fix: windows bootup phase logic for addpolicy

* feat: restrict netpol in background to linux + nftables

* test: skip nftables check for UT

* style: netpols[0] instead of loop

* log: address log comments

* style: lint for long line

---------

Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
2023-07-19 09:13:52 -07:00
Vamsi Kalapala e4691364ad
fix: disabling CGO for NPM (#2044)
2023-07-14 06:35:18 +00:00
Hunter Gregory 36b67b4cb2
fix: [WIN-NPM] race during bootup where we may not add one NetPol to a Pod (#2028)
* fix: lock while adding policy in bootup phase

* test: fix UT to model true pod controller behavior

* fix: prevent deadlock

* style: lint
2023-06-23 10:03:54 -07:00
Hunter Gregory 256dc890c7
log: [NPM] don't log benign nft errors at bootup (#2012)
log: clean up logs for nft
2023-06-19 12:02:34 -07:00
Hunter Gregory 96243c325e
fix: [WIN-NPM] fix units of new latency metrics (#2018)
* fix: new latency metrics use seconds instead of milliseconds

* chore: lint

* fix: lower buckets to start at 8 milliseconds
2023-06-16 23:22:27 +00:00
Hunter Gregory e6eeb5014a
feat: [NPM] metric for total Pod IPs (#1999)
* feat: remove metric for max members in ipset and add metric for total pod IPs in cluster

* feat: update total pod IP metric in pod controller

* log: make sure we log apply dp errors

* test: UTs for pod count metric

* style: rename metric to customer_pods

* test: revert changes from previous PR to ipsetMgr windows UTs

* style: rename metric to pods_watched

* test: try longer wait

* debug: tmp log

* Revert "debug: tmp log"

This reverts commit 71529a8643.

* style: fix whitespace
2023-06-09 09:59:27 -07:00
Hunter Gregory 6159781ed5
log/test: [WIN-NPM] log when we finish bootup phase & make UT more consistent (#2001)
* log: indicate when we finish bootup phase

* test: add extra apply DP for multi job UTs

* test: check error for applydp
2023-06-09 09:59:03 -07:00
Hunter Gregory 04f92857f2
feat: [WIN-NPM] metrics for latencies and failures (#1959)
* implement metrics

* add npm prefix

* rename windows files

* metrics pkg UTs

* allow reinitializing prometheus metrics

* fix: hns wrapper should not throw error for empty SetPolicy values

* test: metric UTs in dataplane

* fix: record list endpoint latency always

* remove flaky UT

* feat: metric for max ipset members

* fix lint

* fix lint 2

* fix build

* fix lint 3

* simplify conditionals and protect against maxMembers becoming negative

* remove bottom 4 histogram buckets. start at 16 ms

* reset metrics for ipset UTs

* style: don't check for windows dp in *_windows.go files

* build: remove unused import

* test: reset windows metrics in UT
2023-06-05 12:43:39 -07:00
Hunter Gregory 2066717ccb
perf: [WIN-NPM] fast bootup (#1900)
* wip

* wip2

* use other apply DP func

* address comment about if statement

* finish bootup for both DPs

* fix lint

* fix lint 2

* fix lint 3

* longer UT timeout and add missing UTs for apply in background
2023-06-02 14:48:08 -07:00
Hunter Gregory aaf639c50b
fix: [NPM] check if policy exists in case of nil pointer (#1974)
fix: check for nil first
2023-06-02 10:30:06 -07:00
Vamsi Kalapala 77c92b3422
[NPM] feat: use iptables nft for linux (#1937)
* [NPM] feat: use iptables nft for linux

* adding the capability to detect nft

* fixing an issue

* fixing same issue

* cleaning up old state

* fixing UTs

* correcting tests

* fixing windows lint
2023-05-02 23:25:27 -07:00
Hunter Gregory 61aae0371b
perf: [WIN-NPM] add all cached NetworkPolicies to a Pod at once (#1893)
* cherry-picking stuff from apply in background POC

* add all policies poc

* add debug prints

* fix deadlock

* fix other GetPolicy deadlock

* update whitespace in yamls

* properly merge

* properly merge 2

* add ACLs in batches

* cleanup errors

* lint and log

* persist state as we add

* refactor into function so we can do UTs on batching

* fix lint

* batch struct

* successful policies

* reduce batch limit to 30
2023-04-26 08:33:50 -07:00
Hunter Gregory aa163aad3f
perf: [WIN-NPM] apply ipsets in background (#1875)
* wip

* fix UT by applying dataplane immediately for RemovePolicy()

* configmap options for apply in background

* fix deadlocks

* better logging

* rename config variables, update default config, change shouldApply check

* update configmap values

* FIXME: remove tmp commit overriding applyDP config (using for pipeline tests)

* optimize applying ipsets for add policy

* cleanup code and finalize apply ipsets for netpols

* flip order of if statement

* UTs. address comments. fix netpol behavior by waiting to start pod controller

* all UTs except ones related to issue #1729

* remove bootup phase stuff

* fix lints and move applyinbackground to toggle

* fix lint

* don't check isWindows every time

* use diff var for applyinbackground

* fix lint
2023-04-19 13:46:18 -07:00
Hunter Gregory 026c201d0d
fix: [WIN-NPM] process updatePods in fifo order (#1856)
* process updatePods in fifo order

* fix lint

* better UT

* comments and better naming

* stop skipping UTs

* fix lint

* redesign

* dequeue returns nil when cache is empty

* Revert "dequeue returns nil when cache is empty"

This reverts commit 3e8d1872a8.

* requeue if node name has changed

* Revert "Revert "dequeue returns nil when cache is empty""

This reverts commit 3f5f99da1f.

* UT for nil result from dequeue
2023-04-12 09:52:31 -07:00
Hunter Gregory ddb3417cad
fix: [WIN-NPM] allow readiness probes (#1887)
* get node IP

* add allow-host-to-endpoint ACL

* update ACL ID to be equal to other ACLs in the netpol

* add node ip to acl

* UTs and make node IP a part of pMgr cfg

* fix skip test logic from #1857

* fix pMgr UTs and prom metrics

* fix lints and add comments

* fix UT and prom metrics for linux

* UT for getting node IP

* revert skipTest change

* error out if node IP is an empty string

* update logging for node ip and only get node ip for windows

---------

Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
2023-04-11 14:34:17 -07:00
Hunter Gregory 626582661c
fix: [WIN-NPM] filter HNS endpoints (#1848)
* filter endpoints

* UTs

* fix: fake wrapper never tracked Flags for remote endpoints

* fix lints and address comment

* fix lints

* fix lint

* golint timeout from 10 to 15m

* undo golint timeout change. needs to happen in master

* try to fix lints in hnsv2wraper

* wrap checks
2023-04-06 12:32:58 -07:00
Evan Baker 4dfd97c274
update to go1.20 (#1781)
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2023-03-29 10:53:21 -07:00
Hunter Gregory 0c9a726744
fix: [WIN-NPM] back-compat mitigation to PATH issue (#1867)
* fix: [WIN-NPM] update PATH so we can use debugging tools via NPM

* append to path instead of overwriting

* set original path

* Update azure-npm-capz.yaml

* add powershell to PATH in setkubeconfigpath.ps1

* remove pwsh call and undo path updates and change to setkubeconfig.ps1
2023-03-27 11:07:44 -07:00
Hunter Gregory 8fcc0849cc
test: [WIN-NPM] skip unnecessary UTs (#1857)
skip randomly failing UTs
2023-03-22 09:59:43 -07:00
Hunter Gregory f0440f3bda
fix: [WIN-NPM] revert breaking change to .exe filename (#1843)
revert .exe filename
2023-03-08 16:44:05 -08:00
Cristina Kovacs d7e4f6c984
fix: [NPM] add check for invalid /0 cidr (#1739)
* add check for invalid /0 cidr

* added UT for valid cidr

---------

Co-authored-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2023-03-08 12:50:07 -08:00
Hunter Gregory 60fcc5c17e
fix: [NPM-WIN] ability to reassign Pod associated with Endpoint (#1806)
* reset acls for pods marked for delete

* simplify: remove stalePodKey

* add log for associating pod with endpoint

* reorganize win dp uts

* UTs for issue 1729 (fail because podMarkedForDelete is not used)

* update controller mock to use deleted pod metadata

* rename podMarkedForDelete to deletedPod

* all UTs (noticing edge in ipsetmanager now)

* updatePod() works for all sequences

* work with updatePodCache and existing controlplane

* remove pod delete logic

* remove change incorrect comment

* fix lint

* fix gofumpt

* remove changes to dp.go logs/comments
2023-03-08 10:45:31 -08:00
Hunter Gregory 37684cbbbe
chore: [WIN-NPM] update mcr image to v1.4.45 for capz testing (#1837)
2023-03-07 13:54:21 -08:00
Hunter Gregory 70aafa54c1
fix: [WIN-NPM] change .exe name and set PATH for containerd issue (#1838)
2023-03-06 16:59:12 -08:00
Cristina Kovacs 2e211ef05e
fix: [NPM-WIN] do not refresh endpoints if pod cache is empty (#1831)
* do not refresh endpoints if pod cache is empty

* moved cache lock up

* reposition cache lock

* fixed incorrect logic
2023-03-06 12:55:41 -08:00
Hunter Gregory 2832b50375
feat: [NPM-WIN] support for CAPZ windows testing (#1752)
* set kubeconfig on capz

* update dockerfile

* test network name Calico

* add base acls

* add WindowsNetworkName toggle and revert hard coded Calico parts

* update base acls for calico and add UTs

* capitalize calico network name

* fix connectivity. try with host allow acls

* revert change to policy_windows.go

* more UTs and add base ACLs for other "new endpoint" scenario

* run all UTs

* update npm image to .42

* add log line

* allow traffic going inter-node

* Revert "allow traffic going inter-node"

This reverts commit e1014822d5.

* add long-runner pod for testing vfp tags in capz

* fix lints
2023-03-02 13:24:31 -08:00
Evan Baker 358de20d2c
Add multiplat Windows 2019 and 2022 image support to tooling and pipelines, use for CNS/NPM (#1820)
* build: add ws2019 cns image to build pipeline

* add windows2019 build pool

* fix: npm pipelines

* fix: npm pipelines

* update multiplat build process for winver flavors

* optional buildx push for npm cyclonus

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Matthew Long <61910737+thatmattlong@users.noreply.github.com>
Co-authored-by: Matthew Long <Matthew.Long@microsoft.com>
2023-03-01 00:25:00 +00:00
Hunter Gregory 69892152bb
log: [NPM-WIN] clean up some logs (#1807)
clean up some logs

Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
2023-02-24 14:15:59 -08:00
Cristina Kovacs 0d4c65f978
fix: [NPM] update endpointcache after remove policy (#1804)
* update endpointcache

* update delete logic

* fix lint issue

* added check for endpoint in cache

* add unit test

* fix type lint issue

* fix naming lint error

* updated endpoint cache lock

* moved cache lock down and added check for windows
2023-02-24 13:06:37 -08:00
Hunter Gregory 09cd371fb4
fix: [NPM] cleanup restarted pod stuck with no IP (#1503)
* print statements

* cleanup Running pod with empty IP

* add log line

* revert previous 3 commits

* enqueue updates with empty IPs and add prometheus metric

* fix lints

* handle pod assigned to wrong endpoint edge case

* log and update comment

* UTs and fixed named port + build

* reset entire endpoint regardless of cache

* remove comment in dp.go

* fix windows build issues

* skip refreshing endpoints and address comments

* only sync empty ip if pod running. add tmp log

* undo special pod delete logic

* reference GH issue

* fix Windows UTs

* remove prometheus metrics and a log

---------

Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
2023-02-15 13:38:47 -08:00
Cristina Kovacs 04a3d2904e
fix: [NPM] add check for valid IPV4 addresses in TranslatePolicy (#1738)
* add check for ipv6 addresses in translatePolicy

* fix static error lint issue

* updated static error name and moved IPV4 logic

* moved check to ipBlockRule

* updated UT

---------

Co-authored-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2023-02-08 04:55:55 +00:00
Hunter Gregory bd299fe727
fix: [NPM-WIN] ignore irrelevant errors from ipsetmanager (#1741)
* ignore certain errors

* remove changes to updatePod tracking (on/off-node)
2023-01-19 09:45:39 -08:00
Hunter Gregory 3e5915f802
perf: [NPM-WIN] only track updatepods on-node (#1743)
only track updatepods on-node
2023-01-19 09:44:24 -08:00
Hunter Gregory 8bbb99e2d9
fix: [NPM] add NetworkPolicy validation for matchExpression values (#1717)
* add validation for matchExpression values

* fix lint

* make regex identical to kubectl validation and include more test cases

* remove debugging lines
2023-01-19 09:43:55 -08:00
Hunter Gregory 1f1aacf8af
log: [NPM] warn instead of error on invalid Pod IPs (#1718)
warn instead of error on invalid Pod IPs
2022-12-14 09:55:38 -08:00
Vamsi Kalapala 8cd63a8d0d
fix: [NPM-Win] Get local endpoints for updatepod, but all endpoints for cleanup. (#1606)
* fix: [NPM-WIN] Get only local endpoints to apply ACLs on

* adding a const

* fix lints

* update UTs (TODO: uncomment multi-job test when fixed)

* resolve lint again

* true backwards compatibility

* resolve TODO by uncommenting UT

* fix lint

Co-authored-by: Hunter Gregory <hunterlgregory@gmail.com>
2022-12-12 16:07:12 -08:00