* Added a security context for allowPrivilegeEscalation and readOnlyRootFilesystem
* Updated npm linux to not share the host's UTS namespace; tested locally
* Updated image and configmap of npm to match prod/managed
* kept EnablePprof on for debugging
* Updating k8s version for kind for cyclonus tests
* test
* test
* updated cluster name
* Revert "updated cluster name"
This reverts commit 7715c91098.
* update name
* Updated k8s version
* updated k8s version
* changed k8s version to version of local cluster
* updated kind node version for control plane
* version update
* updated kind version
* updated worker images for kind
* wip: apply dirty NetPols every 500ms in Linux
* only build npm linux image
* fix: check for empty cache
* feat: toggle for netpol interval. default 500 ms
* ci: remove stages "build binaries" and "run windows tests"
* wip: max batched netpols (toggle-specified)
* ci: remove manifest build/push for win npm
* wip: handle ipset deletion properly and max batch for delete too
* fix: correct remove policy
* fix: only remove policy if it was in kernel
* finalize toggles, allowing iptablesInBackground to be turned off
* ci: conformance + cyclonus use PR's configmaps
* fix: lints
* fix dp toggle: iptablesInBackground
* fix lock typo and config logging
* fix background thread. add comments. only add tmp ref when enabled
* copy pod selector list
* fix: removepolicy needs namespace too
* rename opInfo to event
* fix: fix references and prevent concurrent map read/write
* tmp: debug logging
* fix: missing set references by swapping keys and values
* Revert "tmp: debug logging"
This reverts commit 70ed34c714ea4a6d009a1fe90a7168be4bedd5bf.
* fix: add podSelectorList to fake NetPol
* log: do not print error when failing to delete non-existent nft rule
* log: verbose iptables bootup
* log: use fmt.Errorf for clean logging
* log: never return error for iptables in background and fix some lints
* fix: activate/deactivate azure chain rules
* fix: correctly decrement netpols in kernel
* ci: run UTs again
* ci: update profiles. default to placefirst=false
* address comment: rename batch to pendingPolicy
* refactor: make dirty cache OS-specific
* test: UTs
* test: put UT cfg back to placefirst to not break things
* ci: update cyclonus workflows
* fmt: address comment & lint
* fmt: rename numInKernel to policiesInKernel
* log: switch to fmt.Errorf
* fmt: whitespace
* feat: resiliency to errors while reconciling dirty netpols
* log: temporarily print everything for ipset restore
* fix: remove nomatch from ipset -D for cidr blocks
* test: UTs for non-happy path
* test: fix hns fake
* fix: don't change windows. let it delete ipsets when removing policies
* fix windows lint
* fix: ignore chain doesn't exist errors for iptables -D
* feat: latency and failure metrics
* test: update exit code for UT
* metrics: new metrics should go in node-metrics path
* style: simplify nesting
* style: move identical windows & linux code to shared file
* ci: remove v1 conformance and cyclonus
* feat: add NetPols in background from the DP (revert background code in pMgr)
* style: remove "background" from iptables metrics
* revert changes in ipsetmanager, const.go, and dp.Remove/UpdatePolicy
* style: whitespace
* perf: use len() instead of creating slice from map
* remove verbosity for iptables bootup
* build: add return statement
* style: whitespace
* build: fix variable shadowing
* build: fix more import shadowing
* build: windows pointer issue and UT issue
* test: fix UT for iptables error code 2
* ci: enable linux scale test
* ci: revert to master pipeline.yaml
* revert changes to chain-management. do changes in PR #2012
* log: change wording
* test: UTs for netpol in background
* log: wording
* feat: apply ipsets for each netpol individually
* config: rearrange ConfigMap & update capz yaml
* fix: windows bootup phase logic for addpolicy
* feat: restrict netpol in background to linux + nftables
* test: skip nftables check for UT
* style: netpols[0] instead of loop
* log: address log comments
* style: lint for long line
---------
Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
* feat: remove metric for max members in ipset and add metric for total pod IPs in cluster
* feat: update total pod IP metric in pod controller
* log: make sure we log apply dp errors
* test: UTs for pod count metric
* style: rename metric to customer_pods
* test: revert changes from previous PR to ipsetMgr windows UTs
* style: rename metric to pods_watched
* test: try longer wait
* debug: tmp log
* Revert "debug: tmp log"
This reverts commit 71529a8643.
* style: fix whitespace
* wip
* wip2
* use other apply DP func
* address comment about if statement
* finish bootup for both DPs
* fix lint
* fix lint 2
* fix lint 3
* longer UT timeout and add missing UTs for apply in background
* [NPM] feat: use iptables nft for linux
* adding the capability to detect nft
* fixing an issue
* fixing same issue
* cleaning up old state
* fixing UTs
* correcting tests
* fixing windows lint
* cherry-picking stuff from apply in background POC
* add all policies poc
* add debug prints
* fix deadlock
* fix other GetPolicy deadlock
* update whitespace in yamls
* properly merge
* properly merge 2
* add ACLs in batches
* cleanup errors
* lint and log
* persist state as we add
* refactor into function so we can do UTs on batching
* fix lint
* batch struct
* successful policies
* reduce batch limit to 30
* wip
* fix UT by applying dataplane immediately for RemovePolicy()
* configmap options for apply in background
* fix deadlocks
* better logging
* rename config variables, update default config, change shouldApply check
* update configmap values
* FIXME: remove tmp commit overriding applyDP config (using for pipeline tests)
* optimize applying ipsets for add policy
* cleanup code and finalize apply ipsets for netpols
* flip order of if statement
* UTs. address comments. fix netpol behavior by waiting to start pod controller
* all UTs except ones related to issue #1729
* remove bootup phase stuff
* fix lints and move applyinbackground to toggle
* fix lint
* don't check isWindows every time
* use different var for applyinbackground
* fix lint
* process updatePods in fifo order
* fix lint
* better UT
* comments and better naming
* stop skipping UTs
* fix lint
* redesign
* dequeue returns nil when cache is empty
* Revert "dequeue returns nil when cache is empty"
This reverts commit 3e8d1872a8.
* requeue if node name has changed
* Revert "Revert "dequeue returns nil when cache is empty""
This reverts commit 3f5f99da1f.
* UT for nil result from dequeue
* get node IP
* add allow-host-to-endpoint ACL
* update ACL ID to be equal to other ACLs in the netpol
* add node ip to acl
* UTs and make node IP a part of pMgr cfg
* fix skip test logic from #1857
* fix pMgr UTs and prom metrics
* fix lints and add comments
* fix UT and prom metrics for linux
* UT for getting node IP
* revert skipTest change
* error out if node IP is an empty string
* update logging for node ip and only get node ip for windows
---------
Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
* fix: [WIN-NPM] update PATH so we can use debugging tools via NPM
* append to path instead of overwriting
* set original path
* Update azure-npm-capz.yaml
* add powershell to PATH in setkubeconfigpath.ps1
* remove pwsh call and undo path updates and change to setkubeconfig.ps1
* reset acls for pods marked for delete
* simplify: remove stalePodKey
* add log for associating pod with endpoint
* reorganize win dp uts
* UTs for issue #1729 (fail because podMarkedForDelete is not used)
* update controller mock to use deleted pod metadata
* rename podMarkedForDelete to deletedPod
* all UTs (noticing edge case in ipsetmanager now)
* updatePod() works for all sequences
* work with updatePodCache and existing controlplane
* remove pod delete logic
* remove change to incorrect comment
* fix lint
* fix gofumpt
* remove changes to dp.go logs/comments
* set kubeconfig on capz
* update dockerfile
* test network name Calico
* add base acls
* add WindowsNetworkName toggle and revert hard coded Calico parts
* update base acls for calico and add UTs
* capitalize calico network name
* fix connectivity. try with host allow acls
* revert change to policy_windows.go
* more UTs and add base ACLs for other "new endpoint" scenario
* run all UTs
* update npm image to .42
* add log line
* allow traffic going inter-node
* Revert "allow traffic going inter-node"
This reverts commit e1014822d5.
* add long-runner pod for testing vfp tags in capz
* fix lints
* update endpointcache
* update delete logic
* fix lint issue
* added check for endpoint in cache
* add unit test
* fix type lint issue
* fix naming lint error
* updated endpoint cache lock
* moved cache lock down and added check for windows
* print statements
* cleanup Running pod with empty IP
* add log line
* revert previous 3 commits
* enqueue updates with empty IPs and add prometheus metric
* fix lints
* handle pod assigned to wrong endpoint edge case
* log and update comment
* UTs and fixed named port + build
* reset entire endpoint regardless of cache
* remove comment in dp.go
* fix windows build issues
* skip refreshing endpoints and address comments
* only sync empty ip if pod running. add tmp log
* undo special pod delete logic
* reference GH issue
* fix Windows UTs
* remove prometheus metrics and a log
---------
Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
* add check for IPv6 addresses in translatePolicy
* fix static error lint issue
* updated static error name and moved IPv4 logic
* moved check to ipBlockRule
* updated UT
---------
Co-authored-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>