* ci: comment out windows cyclonus and windows scale test
Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
* ci: slim down windows conformance to 14 tests
Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
* ci: make sure conformance skips "Linux Only"
Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
* ci: conformance was not running test cases due to formatting
Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
---------
Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
* test(kwok): try standard tier for cluster
* Revert "test(kwok): try standard tier for cluster"
This reverts commit f76e50a559.
* test: run kwok as pod
* fix: add execute permission to sh files
* fix: allow scheduling on linux for kwok pod
* fix: wait timeouts and add retry logic
* fix: make sure to reapply kwok nodes if wait fails
* test: print out cluster state if wait fails
* test: prevent kwok from scheduling on windows node
* test: first wait for kwok pods (20 minutes)
* style: rearrange wait check
* fix: scale up kwok controller for reliability
* fix: typo in scaling kwok pods
* fix: check kwok pods running in test-connectivity instead of test-scale
* fix: wait for pods before adding NetPol
* fix: 7 second timeout for windows agnhost connect
* feat: get cluster state on failure
* debug: fake a failure to verify log capture
* fix: bugs in getting cluster state
* fix: remove newline instead of "n"
* Revert "debug: fake a failure to verify log capture"
This reverts commit 24ec927425.
* feat(win-debug): get prom metrics
* fix: leave timeout=5s for win
* style: remove new, unused --connect-timeout parameter
* style: comment
* feat: top node/pod
* ci: conditional run logic
* add: SCENARIO env var control
* ci: change RUN to CNI
* ci: move SCENARIO to job level
* fix: change env vars to unique values
* ci: condition logic and description
* wip: apply dirty NetPols every 500ms in Linux
* only build npm linux image
* fix: check for empty cache
* feat: toggle for netpol interval. default 500 ms
* ci: remove stages "build binaries" and "run windows tests"
* wip: max batched netpols (toggle-specified)
* ci: remove manifest build/push for win npm
* wip: handle ipset deletion properly and max batch for delete too
* fix: correct remove policy
* fix: only remove policy if it was in kernel
* finalize toggles, allowing ability to turn off iptablesInBackground
* ci: conf + cyc use PR's configmaps
* fix: lints
* fix dp toggle: iptablesInBackground
* fix lock typo and config logging
* fix background thread. add comments. only add tmp ref when enabled
* copy pod selector list
* fix: removepolicy needs namespace too
* rename opInfo to event
* fix: fix references and prevent concurrent map read/write
* tmp: debug logging
* fix: missing set references by swap keys and values
* Revert "tmp: debug logging"
This reverts commit 70ed34c714ea4a6d009a1fe90a7168be4bedd5bf.
* fix: add podSelectorList to fake NetPol
* log: do not print error when failing to delete non-existent nft rule
* log: verbose iptables bootup
* log: use fmt.Errorf for clean logging
* log: never return error for iptables in background and fix some lints
* fix: activate/deactivate azure chain rules
* fix: correctly decrement netpols in kernel
* ci: run UTs again
* ci: update profiles. default to placefirst=false
* address comment: rename batch to pendingPolicy
* refactor: make dirty cache OS-specific
* test: UTs
* test: put UT cfg back to placefirst to not break things
* ci: update cyclonus workflows
* fmt: address comment & lint
* fmt: rename numInKernel to policiesInKernel
* log: switch to fmt.Errorf
* fmt: whitespace
* feat: resiliency to errors while reconciling dirty netpols
* log: temporarily print everything for ipset restore
* fix: remove nomatch from ipset -D for cidr blocks
* test: UTs for non-happy path
* test: fix hns fake
* fix: don't change windows. let it delete ipsets when removing policies
* fix windows lint
* fix: ignore chain doesn't exist errors for iptables -D
* feat: latency and failure metrics
* test: update exit code for UT
* metrics: new metrics should go in node-metrics path
* style: simplify nesting
* style: move identical windows & linux code to shared file
* ci: remove v1 conformance and cyclonus
* feat: add NetPols in background from the DP (revert background code in pMgr)
* style: remove "background" from iptables metrics
* revert changes in ipsetmanager, const.go, and dp.Remove/UpdatePolicy
* style: whitespace
* perf: use len() instead of creating slice from map
* remove verbosity for iptables bootup
* build: add return statement
* style: whitespace
* build: fix variable shadowing
* build: fix more import shadowing
* build: windows pointer issue and UT issue
* test: fix UT for iptables error code 2
* ci: enable linux scale test
* ci: revert to master pipeline.yaml
* revert changes to chain-management. do changes in PR #2012
* log: change wording
* test: UTs for netpol in background
* log: wording
* feat: apply ipsets for each netpol individually
* config: rearrange ConfigMap & update capz yaml
* fix: windows bootup phase logic for addpolicy
* feat: restrict netpol in background to linux + nftables
* test: skip nftables check for UT
* style: netpols[0] instead of loop
* log: address log comments
* style: lint for long line
---------
Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
* ci: check if directory is empty before applying it
* ci: don't wait for pods if they weren't created
* docs: fix script name
* ci: wip for enabling linux scale test
* ci: parameters for linux vs windows
* ci: adjust params
* ci: fix bash typo
* ci: fix cp
* ci: fix npm url
* ci: increase max pods for linux nodepool
* ci: start building windows image again
* tmp: use apply netpol in background image
* Revert "tmp: use apply netpol in background image"
This reverts commit eff43c5439.
* refactor: use CLUSTER_NAME variable
* ci: require succeeded() for scale & conformance tests
* test: fix vars used in test-scale.sh checks
* ci: disable linux, reenable windows
* ci: increase sleep before waiting for NPM to start & log info when it doesn't
* ci: better log capture & remove command from other pipeline
* ci: do not get logs of npm on kwok nodes
* ci: do not get logs of npm on kwok nodes (part 2)
* wip
* temporarily disable most conf runs
* update readme
* back to raw yamls and clone the branch to run scale test
* fix raw yaml URLs
* fix inline script
* fix length of rg name
* uncomment all conf again
* comment out everything unnecessary for testing
* remove commented out dependencies
* use master branch for pipeline
* label nodes
* multiple nodes
* uncomment rest of conformance pipeline (originally commented for testing)
* fix print out for time taken in test-connectivity.sh
* fix: run kwok command in background
* mkdir for kwok log
* try azure cli 1 to fix login error
* Revert "try azure cli 1 to fix login error"
This reverts commit f1671e3939.
* move scale test to new pipeline yaml
* remove scale test from conformance pipeline yaml
* revert name change for cyclonus job
* remove unnecessary image build and variable
* error codes and display names
* change sleep and wait for npm logic
* look at directory
* use pre-cloned repo
* fix directory path
* install kubectl first
* FIXME: comment out succeeded condition
* kubectl binary arg
* kubectl for scale test
* fix label selector
* fix kubectl path
* fix kubectl binary arg
* fix kwok, more steps
* FIXME: temporarily use custom fast image
* fix kwok pid and add comment
* 10m timeout for connectivity after crud
* fix kwok command invocation
* bump up timeouts for testing
* higher memory limit
* add note to connectivity script
* fix sed
* no need to curl npm yaml
* tmp: comment things out to test final step
* only check if kwok pods are running, not necessarily ready
* Revert "tmp: comment things out to test final step"
This reverts commit 7b21125ab1.
* update registry keys to fix HNS reliability
* update regkey code
* sleep to let NPM restart in case of bootup failure
* adaptive wait timeout
* change some errors to warnings
* log date
* make sure all pods are labeled
* delete and readd labels after deleting pods
* tmp: skip large scale up and connectivity check for testing
* fix overwrite arg
* rename tasks and uncomment things
* update command for updating reg key
* make timeout logic simpler
* back to reg add command for regkeys
* official timeouts instead of test values
* delete task updating registry keys and stop hardcoding npm image
* increase sleep
* update linux conformance binary
* temporarily comment out test profiles until one works
* Revert "temporarily comment out test profiles until one works"
This reverts commit db623d3833.
* undo change to git checkout for windows
* prevent chance of overlap in resource groups
* print npm pod state and capture previous logs
* fail cyclonus if it doesn't complete
* redirect to /dev/null
* remove incorrect use of --kubeconfig
* [NPM][Windows] remove preview flag
* remove k8s version in one file
* remove k8s version in another file
* remove k8s version in third file
Co-authored-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
* pipelines: Adding a new stage for NPM continuous integration
* pipelines: Adding a new stage for NPM continuous integration
* pipelines: Adding a new stage for NPM continuous integration
* removing download phase
* removing download phase
* remove v1 default
* use submodule specific tags
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* support separate go submodule versions
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* move version and tag responsibilities to the makefile
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* update integration tests to use component tags
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* test logs
* wip
* add IS_STRESS_TEST to matrix
* finish
* Copy instead of move kubeconfig
* fix folder name
* Fix folder name again
* sleep and fix published folder
* should be done
* rearrange and exit properly
* test exit code
* finalized
* make num parallel jobs a global setting
* test: run conformance for npm profiles in parallel
* update dependsOn field
* FIXME updating trigger for pipeline run for now
* Revert "FIXME updating trigger for pipeline run for now"
This reverts commit dce5ac2fbd.
* fix yaml compile error
* shorten cluster name because it was too long
* remove test for v1-place-azure-chain-first profile
* remove always() condition for cleanup and publish kubeconfig artifact
* rename kubeconfig artifact name
* try putting kubeconfig in the published npmlogs folder
* ipset fail if set doesn't exist when attempting to add to list
* bail out on add to set when ip is empty
* check set type when adding to list
* revert checking nil value
* fix test build issues
* additional ip check
* ip check with parseip
* prometheus count
* delete from set checks
* delete from set checks
* log on skipping pod
* logging pipeline
* npm logs ci
* first pass at decoupling resource maps
* First pass on decoupling resource maps
* Adding telemetry capabilities to resource CRUD events
* Initializing new maps in nprMgr for tests
* Initializing new maps in nprMgr for tests
* Adding artifact for Npm logs
* Addressing comments
* Addressing comments
* First pass at implementing pod cache
* handling namedports in case of pod update
* Correcting print error
* Cleaning up pod cache update event. moving pod cache to nsMAP
* Correcting namespace prefix
* Adding in checks on portlists and Podips
* changing some variable names
* changing some variable names
* Adding resource version checks for Pod, NS and netpols
* fixing some tests
* changing ResourceVersion to uint64 and cleaning up oldpodobj references
* rearranging hostnetpol and correcting a UT failure
* Fixing the hostnet pod UT
* Addressing comments
* fixing UT
* Fixing UTs
* correcting pod delete failure bug
* Fixing clean up bug
* Handling hostnet pods in Delete pod
* Addressing comments and fixing a panic error