Commit Graph

51 Commits

Author SHA1 Message Date
John Payne 68ede66cfa
[FIX] ci: Update AzCLI task to v2 (#2922)
* ci: Update AzCLI task to v2

* ci: add scriptType: "bash"
2024-08-19 16:51:26 +00:00
John Payne 0bdb3ab76a
ci: [NPM] NPM Conformance Test improvements (#2875)
* ci: Always delete npm conformance clusters

* test: pin k8s suite version

* ci: comment out updated cyclonus

* chore: cleanup yaml
2024-07-24 19:03:36 +00:00
Hunter Gregory c561c895fe
ci: [NPM] speed up Windows conformance and disable Windows Cyclonus/Scale (#2874)
* ci: comment out windows cyclonus and windows scale test

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

* ci: slim down windows conformance to 14 tests

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

* ci: make sure conformance skips "Linux Only"

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

* ci: conformance was not running test cases due to formatting

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

---------

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2024-07-24 01:57:02 +00:00
John Payne e28500c0af
ci: Update triggers for NPM pipelines (#2867)
ci: add triggers to npm pipelines
2024-07-22 19:25:45 +00:00
John Payne 1a38ce0c18
ci: Updating k8s test suite for NPM to latest (#2707)
* test: latest k8s test suite

* ci: Update test suite
2024-04-24 22:35:17 +00:00
John Payne cc68e88585
ci: update NPM Windows daemonset for k8s version 1.28 (#2622)
* ci: add working directory for k8s 1.28

* ci: match workflows

* ci: use branch daemonset
2024-03-06 18:39:18 +00:00
John Payne ba5faa5895
ci: Improve CNI|NPM integration test (#2498)
* ci: increase timeout for CNI|NPM integration test

* add: retry to generate NPM logs
2024-01-11 02:21:36 +00:00
Hunter Gregory 7e90960ed0
test(scale): [NPM] fix flakes in kwok and capture kernel state on failure (#2249)
* test(kwok): try standard tier for cluster

* Revert "test(kwok): try standard tier for cluster"

This reverts commit f76e50a559.

* test: run kwok as pod

* fix: add execute permission to sh files

* fix: allow scheduling on linux for kwok pod

* fix: wait timeouts and add retry logic

* fix: make sure to reapply kwok nodes if wait fails

* test: print out cluster state if wait fails

* test: prevent kwok from scheduling on windows node

* test: first wait for kwok pods (20 minutes)

* style: rearrange wait check

* fix: scale up kwok controller for reliability

* fix: typo in scaling kwok pods

* fix: check kwok pods running in test-connectivity instead of test-scale

* fix: wait for pods before adding NetPol

* fix: 7 second timeout for windows agnhost connect

* feat: get cluster state on failure

* debug: fake a failure to verify log capture

* fix: bugs in getting cluster state

* fix: remove newline instead of "n"

* Revert "debug: fake a failure to verify log capture"

This reverts commit 24ec927425.

* feat(win-debug): get prom metrics

* fix: leave timeout=5s for win

* style: remove new, unused --connect-timeout parameter

* style: comment

* feat: top node/pod
2023-11-29 15:37:46 -08:00
John Payne 6ddc44c01d
ci: Add control through environment variables for CNI Load Test (#2277)
* ci: conditional run logic

* add: SCENARIO env var control

* ci: change RUN to CNI

* ci: move SCENARIO to job level

* fix: change env vars to unique values

* ci: condition logic and description
2023-10-12 16:57:44 -07:00
John Payne 64f01b2740
ci: Add log template to PR and Load Test Pipeline (#2264)
* Initial commit

* add: log paths and label to priv. daemonset

* add: log template

* add: log template to load-test yamls

* remove: kubeconfig calls

* add: capture failed pods on job failure

* add: Linux state files

* add: Windows state files

* style: change terminal output

* add: log template to PR pipeline

* fix: rebase

* style: add comments to log-template

* chore: Addressing Comments

* add: sub-directories

* ci: Only call log-template on fail for PR
2023-10-03 16:51:12 -07:00
John Payne 477200881d
ci: Add CNIv2 Linux to Load Test pipeline (#2141)
* Initial Commit

* Add sleep for swift cluster

* Change NPM|CNI Integration

* Addressing comments

* Add: NPM continueOnError

* Add: Generate logs for NPM

* Change NPM Linux branch - long sleep 10s

* refactor: linux validate

* fix: rebase

* Add: maxSkew for noop deployments

* Add: Capture improper node restart

* Add: Restart CNS case for Cilium

* Addressing Comments

* Add: Restart CNS template
2023-09-19 15:37:05 -04:00
Hunter Gregory e102891d98
test(cyclonus): [WIN-NPM] fix consistent failure from not sleeping long enough (#2174)
* test(cyclonus): [WIN-NPM] fix consistent failure from not sleeping long enough

* build: fix syntax

* fix: specify namespace in kubectl

* fix: wait timeout=5m

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>

---------

Signed-off-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2023-09-01 11:25:14 -07:00
John Payne a82e39ec9b
ci: [CNI] [NPM] Add NPM|CNI integration test to load-test pipeline. (#2105)
* ci: [CNI] [NPM] Add NPM|CNI integration test to load-test pipeline

* Change artifact name

* vmSize Increase

* Addressing comments
2023-08-09 18:02:50 -07:00
John Payne 451c691a57
ci:[CNI] Replace AKS-Engine Tests with k8s conformance tests (#2062)
* Initial Commit

* Add attempts to prevent flakiness

* Add taint for windows tests

* Add k8s e2e tests

* Testing vmSizes

* Artifact k8se2e binary

* Remove NPM E2E

* Add testing and increase processes

* Addressing comments
2023-07-21 19:42:13 +00:00
Hunter Gregory ebddca18bd
perf: [NPM] [LINUX] add NetPols in background (#1969)
* wip: apply dirty NetPols every 500ms in Linux

* only build npm linux image

* fix: check for empty cache

* feat: toggle for netpol interval. default 500 ms

* ci: remove stages "build binaries" and "run windows tests"

* wip: max batched netpols (toggle-specified)

* ci: remove manifest build/push for win npm

* wip: handle ipset deletion properly and max batch for delete too

* fix: correct remove policy

* fix: only remove policy if it was in kernel

* finalize toggles, allowing ability to turn off iptablesInBackground

* ci: conf + cyc use PR's configmaps

* fix: lints

* fix dp toggle: iptablesInBackground

* fix lock typo and config logging

* fix background thread. add comments. only add tmp ref when enabled

* copy pod selector list

* fix: removepolicy needs namespace too

* rename opInfo to event

* fix: fix references and prevent concurrent map read/write

* tmp: debug logging

* fix: missing set references by swap keys and values

* Revert "tmp: debug logging"

This reverts commit 70ed34c714ea4a6d009a1fe90a7168be4bedd5bf.

* fix: add podSelectorList to fake NetPol

* log: do not print error when failing to delete non-existent nft rule

* log: verbose iptables bootup

* log: use fmt.Errorf for clean logging

* log: never return error for iptables in background and fix some lints

* fix: activate/deactivate azure chain rules

* fix: correctly decrement netpols in kernel

* ci: run UTs again

* ci: update profiles. default to placefirst=false

* address comment: rename batch to pendingPolicy

* refactor: make dirty cache OS-specific

* test: UTs

* test: put UT cfg back to placefirst to not break things

* ci: update cyclonus workflows

* fmt: address comment & lint

* fmt: rename numInKernel to policiesInKernel

* log: switch to fmt.Errorf

* fmt: whitespace

* feat: resiliency to errors while reconciling dirty netpols

* log: temporarily print everything for ipset restore

* fix: remove nomatch from ipset -D for cidr blocks

* test: UTs for non-happy path

* test: fix hns fake

* fix: don't change windows. let it delete ipsets when removing policies

* fix windows lint

* fix: ignore chain doesn't exist errors for iptables -D

* feat: latency and failure metrics

* test: update exit code for UT

* metrics: new metrics should go in node-metrics path

* style: simplify nesting

* style: move identical windows & linux code to shared file

* ci: remove v1 conformance and cyclonus

* feat: add NetPols in background from the DP (revert background code in pMgr)

* style: remove "background" from iptables metrics

* revert changes in ipsetmanager, const.go, and dp.Remove/UpdatePolicy

* style: whitespace

* perf: use len() instead of creating slice from map

* remove verbosity for iptables bootup

* build: add return statement

* style: whitespace

* build: fix variable shadowing

* build: fix more import shadowing

* build: windows pointer issue and UT issue

* test: fix UT for iptables error code 2

* ci: enable linux scale test

* ci: revert to master pipeline.yaml

* revert changes to chain-management. do changes in PR #2012

* log: change wording

* test: UTs for netpol in background

* log: wording

* feat: apply ipsets for each netpol individually

* config: rearrange ConfigMap & update capz yaml

* fix: windows bootup phase logic for addpolicy

* feat: restrict netpol in background to linux + nftables

* test: skip nftables check for UT

* style: netpols[0] instead of loop

* log: address log comments

* style: lint for long line

---------

Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
2023-07-19 09:13:52 -07:00
Hunter Gregory 940a7a73d0
ci: [NPM] improve scale pipeline & fix edge case in scale script (#1975)
* ci: check if directory is empty before applying it

* ci: don't wait for pods if they weren't created

* docs: fix script name

* ci: wip for enabling linux scale test

* ci: parameters for linux vs windows

* ci: adjust params

* ci: fix bash typo

* ci: fix cp

* ci: fix npm url

* ci: increase max pods for linux nodepool

* ci: start building windows image again

* tmp: use apply netpol in background image

* Revert "tmp: use apply netpol in background image"

This reverts commit eff43c5439.

* refactor: use CLUSTER_NAME variable

* ci: require succeeded() for scale & conformance tests

* test: fix vars used in test-scale.sh checks

* ci: disable linux, reenable windows

* ci: increase sleep before waiting for NPM to start & log info when it doesn't

* ci: better log capture & remove command from other pipeline

* ci: do not get logs of npm on kwok nodes

* ci: do not get logs of npm on kwok nodes (part 2)
2023-06-09 10:00:03 -07:00
Hunter Gregory ce11a8da2b
ci: [NPM] scale test pipeline using KWOK (#1915)
* wip

* temporarily disable most conf runs

* update readme

* back to raw yamls and clone the branch to run scale test

* fix raw yaml URLs

* fix inline script

* fix length of rg name

* uncomment all conf again

* comment out everything unnecessary for testing

* remove commented out dependencies

* use master branch for pipeline

* label nodes

* multiple nodes

* uncomment rest of conformance pipeline (originally commented for testing)

* fix print out for time taken in test-connectivity.sh

* fix: run kwok command in background

* mkdir for kwok log

* try azure cli 1 to fix login error

* Revert "try azure cli 1 to fix login error"

This reverts commit f1671e3939.

* move scale test to new pipeline yaml

* remove scale test from conformance pipeline yaml

* revert name change for cyclonus job

* remove unnecessary image build and variable

* error codes and display names

* change sleep and wait for npm logic

* look at directory

* use pre-cloned repo

* fix directory path

* install kubectl first

* FIXME: comment out succeeded condition

* kubectl binary arg

* kubectl for scale test

* fix label selector

* fix kubectl path

* fix kubectl binary arg

* fix kwok, more steps

* FIXME: temporarily use custom fast image

* fix kwok pid and add comment

* 10m timeout for connectivity after crud

* fix kwok command invocation

* bump up timeouts for testing

* higher memory limit

* add note to connectivity script

* fix sed

* no need to curl npm yaml

* tmp: comment things out to test final step

* only check if kwok pods are running, not necessarily ready

* Revert "tmp: comment things out to test final step"

This reverts commit 7b21125ab1.

* update registry keys to fix HNS reliability

* update regkey code

* sleep to let NPM restart in case of bootup failure

* adaptive wait timeout

* change some errors to warnings

* log date

* make sure all pods are labeled

* delete and readd labels after deleting pods

* tmp: skip large scale up and connectivity check for testing

* fix overwrite arg

* rename tasks and uncomment things

* update command for updating reg key

* make timeout logic simpler

* back to reg add command for regkeys

* official timeouts instead of test values

* delete task updating registry keys and stop hardcoding npm image

* increase sleep
2023-05-18 09:57:21 -07:00
Hunter Gregory 62e7e1939e
ci: [WIN-NPM] mitigate Cyclonus infra issues (#1958)
2023-05-16 17:10:10 -07:00
Hunter Gregory 83104c74f9
ci: [WIN-NPM] fix download failures in Cyclonus Pipeline (#1903)
remove artifact download for cyclonus test
2023-04-10 09:34:30 -07:00
Evan Baker 358de20d2c
Add multiplat Windows 2019 and 2022 image support to tooling and pipelines, use for CNS/NPM (#1820)
* build: add ws2019 cns image to build pipeline

* add windows2019 build pool

* fix: npm pipelines

* fix: npm pipelines

* update multiplat build process for winver flavors

* optional buildx push for npm cyclonus

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Matthew Long <61910737+thatmattlong@users.noreply.github.com>
Co-authored-by: Matthew Long <Matthew.Long@microsoft.com>
2023-03-01 00:25:00 +00:00
Hunter Gregory 702196639a
test: [NPM-LINUX] update linux conformance binary (#1748)
* update linux conformance binary

* temporarily comment out test profiles until one works

* Revert "temporarily comment out test profiles until one works"

This reverts commit db623d3833.

* undo change to git checkout for windows
2023-01-17 11:40:27 -08:00
estebancams ef1bff6046
Migrate powershell docker scripts to docker@2 (#1666)
* feat: migrate powershell docker scripts to docker@2

* feat: added retry to docker task

* fix: added repo info for image push step

* fix: added missing parameters for windows template parents

* fix: removed debugging lines

Co-authored-by: Esteban Capillo <estebancams@microsoft.com>
2022-11-17 10:14:45 -06:00
Hunter Gregory 4d2331d960
test: [NPM] fail cyclonus if unfinished (#1681)
* prevent chance of overlap in resource groups

* print npm pod state and capture previous logs

* fail cyclonus if it doesn't complete

* redirect to /dev/null

* remove incorrect use of --kubeconfig
2022-10-31 13:00:42 -05:00
Vamsi Kalapala 062139e314
[NPM][Windows] remove Windows22 preview flag and k8s version (#1660)
* [NPM][Windows] remove preview flag

* remove k8s version in one file

* remove k8s version in another file

* remove k8s version in third file

Co-authored-by: Hunter Gregory <42728408+huntergregory@users.noreply.github.com>
2022-10-18 13:10:46 -07:00
Hunter Gregory eebbf83dbb
Revert "test: [NPM] rename resource group for win cyc latest to prevent cleanup (#1603)" (#1616)
This reverts commit 7d85ea4e0a.
2022-09-14 13:51:04 -07:00
Hunter Gregory 6e458a6ba4
test: [NPM] deploy hns-debugger in latest cyc pipeline (#1607)
deploy hns-debugger to capture traces
2022-09-12 13:22:59 -07:00
Hunter Gregory 7d85ea4e0a
test: [NPM] rename resource group for win cyc latest to prevent cleanup (#1603)
rename rg to prevent cleanup
2022-09-12 09:22:14 -07:00
Vamsi Kalapala 7d89a83041
fix: [NPM] cyclonus latest pipeline cleanup issue (#1600)
* fix: [NPM] cyclonus latest pipeline cleanup issue

* fix: [NPM] cyclonus latest pipeline cleanup issue

* fix: [NPM] cyclonus latest pipeline cleanup issue
2022-09-09 16:19:35 -07:00
Vamsi Kalapala 08ce35190d
Pipelines: Failfast on an error (#1592)
* fix: pipelines exit early on error

* fix: pipelines exit early on error

* adding fail check
2022-09-09 15:08:27 -07:00
Vamsi Kalapala ffc1c87da5
pipeline: New Cyclonus NPM windows only pipeline (#1595)
* pipeline: New Cyclonus NPM windows only pipeline

* pipeline: New Cyclonus NPM windows only pipeline
2022-09-09 11:40:34 -07:00
Vamsi Kalapala 1c43a5cf37
pipelines: Adding a new stage for NPM continuous integration (#1580)
* pipelines: Adding a new stage for NPM continuous integration

* pipelines: Adding a new stage for NPM continuous integration

* pipelines: Adding a new stage for NPM continuous integration

* removing download phase

* removing download phase

* remove v1 default
2022-09-07 14:19:19 -07:00
Mathew Merrick 312bf12bad
Update npm-conformance-tests.yaml (#1574)
* Update npm-conformance-tests.yaml

* push tag
2022-09-02 12:09:54 -07:00
Hunter Gregory e11b452096
test: [NPM] wait on containerizing windows for ws22 conformance (#1570)
need to wait on containerizing windows for ws22-default
2022-09-01 14:29:52 -07:00
Evan Baker 8888338c62
Pipeline support for Go submodules versioned independently of root repo (#1533)
* use submodule specific tags

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* support separate go submodule versions

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* move version and tag responsibilities to the makefile

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update integration tests to use component tags

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-09-01 18:35:01 +00:00
Vamsi Kalapala a07b422828
chore: Update k8s version for NPM conformance tests (#1534)
2022-08-19 10:23:28 -07:00
Mathew Merrick ef95e7d9f4
Add Cyclonus Windows job (#1363)
* add cyclonus windows job

* control plane jobs

* add cyclonus to conformance

* run cyclonus from pipeline vm

* containerize

* cluster name

* rg name

* shorter rg

* k8s version

* update wait timeout

* logging

* get kubeconfigs

* date

* kubeconfig

* shorter name

* job timeout

* bigger vm

* increase vm size

* kubernetes version

* feature preview register

* increase node size

* exit code

* failonstderr

* move windows specifics closer

* omit failstderr

* wait for npm pods to restart

* update conformance scope

* retry

* separate conformance for platforms

* revert branch

* hunter's branch
2022-08-08 10:43:44 -07:00
Hunter Gregory c44187c175
test: [NPM] Stress Testing & Fix Conformance Logs (#1264)
* test logs

* wip

* add IS_STRESS_TEST to matrix

* finish

* Copy instead of move kubeconfig

* fix folder name

* Fix folder name again

* sleep and fix published folder

* should be done

* rearrange and exit properly

* test exit code

* finalized

* make num parallel jobs a global setting
2022-03-08 10:19:23 -08:00
Evan Baker 09c06b01e1
Parameterize Makefile, parallelize platform builds, support buildah (#1102)
* templatize the container make targets

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* add container push to npm conformance task

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* update pipelines

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* default OS and ARCH to GOOS and GOARCH

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
2022-02-15 19:00:04 -06:00
Hunter Gregory 667743d79f
fix: [NPM] fix incorrect ipset create, fix 1-off prometheus logic, and add cache checking for UTs (#1202)
* wip

* dont touch v1 metrics code and fix lints

* add comment and comment code to resolve lint

* optimize looping and dirty cache updates

* address comments and change param type of modifyCacheForKernelMemberUpdate to reduce map lookups

* add exec time metrics

* UTs

* fix lints

* initialize metrics in policymanager tests

* fix bug in publishing npm logs
2022-02-01 14:47:20 -08:00
Vamsi Kalapala cfc1eecb0a
fix: [NPM] fix conformance pipeline NPM Logs name (#1171)
* fix: [NPM] fix conformance pipeline NPM Logs name

* dummy testing logs folder name

* fixing folder name
2021-12-13 12:54:47 -08:00
Hunter Gregory ed4aa371c3
test: [NPM] conformance tests for multiple profiles (#1138)
* test: run conformance for npm profiles in parallel

* update dependsOn field

* FIXME updating trigger for pipeline run for now

* Revert "FIXME updating trigger for pipeline run for now"

This reverts commit dce5ac2fbd.

* fix yaml compile error

* shorten cluster name because it was too long

* remove test for v1-place-azure-chain-first profile

* remove always() condition for cleanup and publish kubeconfig artifact

* rename kubeconfig artifact name

* try putting kubeconfig in the published npmlogs folder
2021-12-08 11:33:55 -08:00
Paul Johnston 3f97a3040f
Arm64 docker (#1030)
* chore: making docker images arch agnostic through docker buildx
2021-09-28 14:17:25 -07:00
Mathew Merrick 9b24dbd95a
test: [NPM] Use fakeexec for ipsm and iptm tests (#868)
* iptmgr

* more iptm testing

* grep call

* progress

* progress

* ipsm

* ioshim

* update tests

* package restructure

* fix broken test and delint

* reduce scope of ioshim

* reduce scope of ioshim

* ioshim scope

* require no error, retrigger ci

* ut return multiple results

* fix tests from master changes

* unexport ioshim

* update ut

* fix tests

* vendor

* test fix

* go version

* go version

* pipeline fixes

* fix tests
2021-07-14 12:53:45 -07:00
Mathew Merrick a736954e7a
chore: Remove CodeCov from CI (#853)
* Remove codecov and use service connections
2021-04-22 13:09:31 -07:00
Mathew Merrick d929d1acb0
chore: Specify CI build pool name (#841)
* Specify pool name
2021-04-13 11:00:49 -07:00
Mathew Merrick a0c36fe50b
[fix] IPset fail if set doesn't exist when attempting to add to list (#828)
* ipset fail if set doesn't exist when attempting to add to list

* bail out on add to set when ip is empty

* check set type when adding to list

* revert checking nil value

* fix test build issues

* additional ip check

* ip check with parseip

* prometheus count

* delete from set checks

* delete from set checks

* log on skipping pod

* logging pipeline

* npm logs ci
2021-03-24 10:50:52 -07:00
Vamsi Kalapala 455f5cb9f0
[NPM] Decoupling resource cache maps from NSmap (#820)
* first pass at decoupling resource maps

* First pass on decoupling resource maps

* Adding telemetry capabilities to resource CRUD events

* Initializing new maps in nprMgr for tests

* Initializing new maps in nprMgr for tests

* Adding artifact for Npm logs

* Addressing comments

* Addressing comments
2021-03-15 16:08:42 -07:00
Vamsi Kalapala ba8394046d
[NPM] Caching ability for Pods and NS (#814)
* First pass at implementing pod cache

* handling namedports in case of pod update

* Correcting print error

* Cleaning up pod cache update event. moving pod cache to nsMAP

* Correcting namespace prefix

* Adding in checks on protlists and Podips

* changing some variable names

* changing some variable names

* Adding resource versions checks for Pod, NS and netpols

* fixing some tests

* changing ResourceVersion to uint64 and cleaning up oldpodobj references

* rearranging hostnetpol and correcting a UT failure

* Fixing the hostnet pod UT

* Addressing comments

* fixing UT

* Fixing UTs

* correcting pod delete failure bug

* Fixing clean up bug

* Handling hostnet pods in Delete pod

* Addressing comments and fixing a panic error
2021-03-08 10:19:22 -08:00
Mathew Merrick 4a17414888
Update npm-conformance-tests.yaml (#800)
2021-02-24 12:47:51 -08:00
Mathew Merrick d78002d3c0
[build] Update pipeline and enable debug symbols in bins (#793)
* add symbols, add gcflags, update timeouts, only show azure core errors
2021-02-23 12:58:49 -08:00