* use ipset save to update members and update error handling logic for ipsets to skip previously run lines. Also update some logging for iptables chain management
* remove unused code
* add comment block describing high level ipset restore logic
* fix bug in piping to grep. need to add pipe errors for fexec UTs so that we don't revert on dataplane UTs
* grep for npm sets working for ipsets save, but this breaks DP UTs
* VerifyCalls method for mock ioshim
* add ability to unit test piped commands
* update logging
* UTs for piping a command to grep
* grep for npm sets in ipset save, verify number of calls in UTs, and update ApplyIPSets test calls for dataplane UTs
* update comments based on PR suggestions
* addressing comments
* remove out-of-scope policy changes for this PR
* rename restore file creator files
* FIXME: setting v2 controllers toggle to true to create an image in pipeline
* Revert "FIXME: setting v2 controllers toggle to true to create an image in pipeline"
This reverts commit 31148c3034.
* wrap errors
* Generalize function for ingress and egress
* Use function to support both ingress and egress and update UTs
* Simplify UTs and correct comments
* Add more comments and correct policies.MatchType
* Reorganize codes to better understand and simplify UTs
* Addresses comments (e.g., fix typo and correct wrong comments)
* Address comments (e.g., srcList and dstList) and correct MatchType values
* Address comments (fromRulesExists -> peerRuleExists)
* Delete unneeded codes for podSelector and update UTs
* Delete unneeded codes for nameSpaceSelector and UTs
* Delete unneeded codes in parseSelector
* Use nil slices instead of zero slices for TranslatedIPSet from namespaceSelector
* Use Variadic functions in NewTranslatedIPSet to use nil slice instead of empty slice and update UTs accordingly
* Use right settype for all-namespaces
* no export for flattenNameSpaceSelector function in parseSelector (vamsi's comments)
* Add comments for package and functions
* Handle namespaceSelector in simple way and remove unnecessary code for ipBlock field
* Translate podSelector in a simple way and update its UTs.
* Use parsedSelectors struct to reduce duplicated codes and update comments
* Address lint errors and add missing comments
* Correct comments
* Addressed comments
* Use correct settype for all-namespaces
* Address comments
* Addressed comments
* Support graceful shutdown in pod
* Update detailed comments and cleaning up codes
* Add Unit tests
* Address lint errors
* Address comments
* Addressed comment and add UTs
* adding a legacy build command
* Adding all v2 controller test files
* v2 podcontroller changes
* completing all pod v2 controllers uts
* Adding netpol v2 controller UTs
* Removing unused make file command
* Fixing lints and correcting a test case
* Fixing an error in expected values
* dealing with flaky tests
* Fixing an issue with HCN vendor, until we wait for the fix to be rolled out
* Addressing some comments
* Removing addPolicy call and relying on updatepolicy
* Saving only spec of netpol and not whole object
* changing name of rawNPMap to rawNPSpecMap
* changing name of rawNPMap to rawNPSpecMap
* Deep equal type for spec was not equal corrected the pointers
* Deep equal type for spec was not equal corrected the pointers
* Create generic translation struct and start using it networkPolicyController
* Update ipsm to get more information from NPMCache
* Working on translateIngress part and its UTs
* Done functions of Translate ingress rule (need to add UTs to test its functions and clean up codes)
* Use function for repeated codes
* Cleanup UTs
* Remove all unused codes in this PR except for ingress rules
* Create functions to make codes concise and reorganize ipsets for better readability
* Remove duplicated data for targetPod information in every ACL
* Move translation logics to /pkg/controlplane/translation dir
* Remove redundant codes and resolved some of lint errors
* Resolve lint errors and remove unused codes in parseSelector and its UTs
* Use unique id for acl policy among network policies and add UTs for port rules
* Addresses some comments (will resolve more later)
* Complete namespaceSelector UTs and correct some logics for handling namespaceSelector
* Use consistent variables and variable for flexibility in UTs
* Add more UTs for allowAll and defaultDrop rules and clean-up codes
* Remove unused codes
* Resolve lint errors
* Clean-up and reorganize codes
* Revert "Update ipsm to get more information from NPMCache"
This reverts commit 477bbaf43d56a6535f5cc035dfe15d5b6035647a.
* Address comments
* Resolve part of lint errors
* Add comments for todo things in next PR
* Delete unused file and clean-up code
* Fix Uts
* Remove unnecessary code
* feat: policy manager for linux
* remove composer from this PR (saving progress on my local machine)
* redesign iptables, rework/complete UTs
* use strings instead of constants in UTs
* fixed go lints
* fixed bug found in integration testing
* added integration tests for policymanager (should replace with dataplane interface calls later)
* rearrange UTs, add extra coverage for chain management, and fix bug for reporting error on chain destroy failures
* rename variable
* fixed lint (removed newline)
* update errors in policymanager and add an error in linux pMgr if we can't delete jump rules from ingress/egress chain to policy chain
* fix tiny lint
* address comments, update UTs in dataplane_test.go, update error wrapping, add windows UT files
* address more feedback and fix DP UTs by commenting out pMgr init/reset for now
* add comment
* [NPM] Windows Policy Manager changes for OS22
* Adding new NPM ACLSettings with ID
* first pass on both add and remove policies
* fixing a merge issue
* Working 1st level Setpolicy CRUD operations
* have NPMACl to HNSACL conversion logic ready
* updating policy endpoints only after adding policy to an endpoint
* updating policy endpoints only after adding policy to an endpoint
* fixing a build issue
* fixing issue in linux files
* Addressing some comments and also completing some integrations with V2 control plane
* Updating policy ID logic and update pod
* Updating policy ID logic and update pod
* Addressing some comments
* adding basic reset bits
* fixing build issue in linux
* Fixing the _linux_test.go build failures
* fix lints
* Addressing some comments and correcting windows logic to apply set policies in order
* cleaning up logic for calculating set policies
* Applying some feedback.
* fixing a failing test and panic
* add note in comment
* fix ipset metrics, add TODO comments, and add delete cache
* linux ipsetmanager
* Revert "add note in comment"
This reverts commit b20c486cfa.
* update comments and add stub for destroyNPMIPsets() func
* move restore file logic to external package
* remove existence check in updateDirtyCache()
* rearranging stuff and updates for old version of new ipsetmanager
* update ipsetmanager to version in master (from ipsetmanager-update PR)
* updates to ipsetmanager linux test
* revamped file creator for retry logic (for ipset and iptables restore)
* renaming variables and moving code around
* completed logic for retrying (generic and for ipset restore)
* addressing sectionID comment and including a call to errorHandler Callback
* remove obsolete struct fields
* Revert "remove obsolete struct fields"
This reverts commit b53af2c2d7.
* unit tests and some changes/reordering of code for file-creator
* file creator unit tests (forgot to add in last commit)
* refactored file creator for external testing of error handling
* added missing argument to function
* first pass at unit testing for ipsetmanager_linux
* fix file creator a bit (use ioshim, etc.) and finish basic UTs for ipsetmanager linux
* implemented ApplyAll mode
* update error messages
* full UT coverage for file creator
* update comments, var names, and remove debug code
* full UT coverage for ipsetmanager linux
* resolve golint problems with function literals
* resolve golint problems with function literals v2
* rename variable
* added reboot function to generic ipsetmanager
* added caveat about ApplyAll mode
* initialize metrics in dataplane constructor
* fix go lints and update error messages in file creator
* add basic integration testing file
* addressing comments
* fix unit tests
* fix lints in integration test file
* fixed bad function name in windows ipsetmanager
* [NPM] Adding prefixes to IPSets in dataplane
* Correcting a linting issue
* Using the correct case for metadata
* Adding IOShim for both windows and linux
* splitting ioshim for each os
* correcting a import error
* correcting some mistakes
* Adding tests for policies in Dp
* fixing a testname
* Updating the dataplane mock file
* removing dataplane mocks from dataplane tests as their scope is controllers
* Add uts for parseiptable.go
Co-authored-by: Hunter Gregory <hgregory@microsoft.com>
* test commit
* deleted file from test commit
* added a UT for convertiptable and moved shared UT functionality to a new file. also renamed some command constants to avoid confusion with real commands
* removing print statements from when I was debugging
* Add UTs for start.go
* Add simple UT for start.go
* make it clear that cache file and iptables save file need to be used together
* remove unnecessary wantEmptyOutput field in test struct
* Refactor cobra command and adjust unit tests
* UT for gettuples cmd
* comment out test without cache file and refactor args
* Delete unnecessary comments and commented codes
* Remove lint errors
* Use correct files and expected values in UTs
Co-authored-by: Hunter Gregory <hgregory@microsoft.com>
Co-authored-by: Hunter Gregory <hunterlgregory@gmail.com>
* removing some redundant info
* Adding some policy related changes
* initial pass on some update pod dependencies
* adding update pod logic
* cleaning up some unused code
* Adding some basic test cases
* Correcting some test cases
* fixing a testcase
* correcting some golints
* fixing a test
* Fixing some golints
* Addressing a comment
* add new npm errors
* add logic for adding/removing sets to kernel in ipsetmanager, update usage of prometheus metrics, and update dataplane API to not use IPSets
* update to a pointer return value for NewPolicyManager
* fix go lints
* renamed count to kernelReferCount
* fix a bug with kernel logic, rearrange code, rename things, and update comments
* rearrange functions
* removed checkIfExists, consolidated AddReference and DeleteReference with a special type, and fixed go lints
* moved logic for different reference types to ipset.go
* remove file with just notes on it
* remove redundant boolean calculation that is always true
* add clarifying comment
* fix ipset metrics to be for all of NPM (not necessarily in kernel), and write TODOs for kernel-based metrics
* update based on code review
* var name change
* update unit test to use DeleteIPSet
* moved internal functions to the bottom
* fixed bug in NumIPSetsIsPositive()
* moved code for getting metric values to a new file
* renamed file
* unit tests for prometheus metrics
* fix go lints
* use fexec for TestDestroyNpmIpsets()
* Initial pass at Netlink interface
* changing some netlink and epc
* Resolving all dependencies on netlink package
* first pass at adding a netlinkinterface
* windows working now
* feat: update cns client (#992)
* fix debug commands
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* fix: update cns client
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* add ctx to debug calls
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* repackage cns client
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* add ctx to all methods and preinit all route urls
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* down-scope cns client interface and move to consumer packages
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* no unkeyed struct literals
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* trace updated client method signatures out through windows paths
* delint
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* fix windows build
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* delint
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
* windows working now
* Some golints checks
* commenting a flaky NPM UT and adding some golint checks
* renaming fakenetlink to mocknetlink
* removing a mock netlink usage
* fixing more golints and a test fix
* fixing more go lints
* Adding in netlink from higher level as input
* adding netlinkinterface to windows endpoint impl
* removing netlink name confusion
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Vamsi Kalapala <vakr@microsoft.com>
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
* wip
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: JungukCho <jungukcho@microsoft.com>
* wip
Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: JungukCho <jungukcho@microsoft.com>
* Add unit tests
* Use correct parameters
* Check nil value of ipsMgr before calling marshal function and add UT
* Resolve lint errors
* Resolve all lint errors
* Define error
* Use a right file for UT and resolve lint error
* Use better directory name for managing testfiles
* removed test/ and testutil/ from code coverage
* remove promutil from coverage
* removed tools/ from code coverage
* removed crd/ from code coverage and updated multitenantnetworkcontainer's manifest
* switch to !ignore_NAME syntax for test and cli tags
* add coverage back to crd (besides autogenerated files)
* rename ignore_test and ignore_cli tags to ignore_uncovered
* make cns/fakes/ uncovered
* mark go files in crd api folders as uncovered again
* add main.go back for nnsmock server
* adding initial dataplane interface
* adding some skeleton code for crud functions
* adding dirty cache initial support for ipset manager
* correcting some go lints
* Adding some skeleton around os specific files
* removing interface and adding some network logic
* adding apply ipsets logic
* Addressing some comments and also adding comments to code
* Fixing some golints
* addressing some comments
* adding some golint checks
* Adding a new field setProperty and also adding structure for policies
* applying some comments
* correcting a condition
* Adding some comments
* Adding some test cases
* Addressing some comments
* Addressing more comments
* resolving some comments
* resolving some comments
* fixing some comments
* removing lock on policymap
* fixing some golints
* merging with master
* fixing some wrap checks
* fixing lints
* made prometheus exec time metrics for ipsets and iptables in line with those for network policies (exec time recorded even for failures). Also made prometheus timer variable names clearer.
* fixed faulty prometheus handler test looking for a node metric name when testing the cluster metric handler
* add clarity in comments related to the IPSetInventory metric
* Include prometheus metrics for lists and in DestroyNPMIpsets(). Only make metric updates when there's no error
* refactor prometheus testing and include metric tests for lists and NPMDestroyIpsets()
* better check for empty response to ipset list in DestroyNpmIpsets()
* remove unused clientset from controllers
* replace function for setting ipset inventory with function for removing ipset for better readability. updating comments too
* reset ipset inventory before each unit test
* added unit test for adding to set with pod cache
* remove unused cluster state function and clientset from np manager
* fix build problems: remove clientset from calls to npm.NewNetworkPolicyManager()
* fix logic for destroy ipsets for situation when destroy is called while num ipsets is 0
* delete commented out function
* encapsulated prometheus metrics, refactored prometheus testing for iptm and netpol controller, and removed clientset from controller creation in test files (fixing build error)
* update test for DestroyNpmIpsets() to always use a new Exec
* Locate key data structure in each controller and remove locks in each controller if possible and lower lock location into ipset manager
* Create npmNameSpaceCache to manage shared namespace objects and mutex
* Encapsulate listMap and setMap in ipset manager struct not to expose them to other packages.
Remove unnecessary codes and clean-up initialization codes.
* Encapsulate methods and members to avoid unintentional access to variables and manage better shared resource. Correct UTs and Clean-up codes
* Update expected values in UTs according to architectural change and clean-up code (remove unnecessary comment and duplicated logging)
* Add comments and clean-up codes (removing redundant codes, etc)
* Remove threadness variable to avoid unintentional increase in the number of workers in each controller without safe synchronization
* Handle errors and return values in right ways based on lint hints
* Remove error handling codes in initializing iptables and ipset when NPM starts
* Call a correct function to avoid UT failure
* Resolve comments
* Locate key data structure in each controller and remove locks in each controller if possible and lower lock location into ipset manager
* Create npmNameSpaceCache to manage shared namespace objects and mutex
* Encapsulate listMap and setMap in ipset manager struct not to expose them to other packages.
Remove unnecessary codes and clean-up initialization codes.
* Encapsulate methods and members to avoid unintentional access to variables and manage better shared resource. Correct UTs and Clean-up codes
* Update expected values in UTs according to architectural change and clean-up code (remove unnecessary comment and duplicated logging)
* Add comments and clean-up codes (removing redundant codes, etc)
* Remove threadness variable to avoid unintentional increase in the number of workers in each controller without safe synchronization
* Handle errors and return values in right ways based on lint hints
* Remove error handling codes in initializing iptables and ipset when NPM starts
* Call a correct function to avoid UT failure
* Resolve comments
* Correct lint's complaint
* Correct chain order
* resolved llc lint warnings and renamed variables
* Resolved lint warnings if possible
* Removed unnecessary variables and codes
* Locate key data structure in each controller and remove locks in each controller if possible and lower lock location into ipset manager
Encapsulate listMap and setMap in ipset manager struct not to expose them to other packages.
Remove unnecessary codes and clean-up initialization codes.
Encapsulate methods and members to avoid unintentional access to variables and manage better shared resource. Correct UTs and Clean-up codes
Update expected values in UTs according to architectural change and clean-up code (remove unnecessary comment and duplicated logging)
Add comments and clean-up codes (removing redundant codes, etc)
Handle errors and return values in right ways based on lint hints
Remove error handling codes in initializing iptables and ipset when NPM starts
Call a correct function to avoid UT failure
Resolve comments
Correct lint's complaint
Correct chain order
* Add custom encoding and decoding logic for NPMCache
* Revise UT case in server_test.go and resolve lint warning
* Add npm cache file which was revised based on custom encoding and revise corresponding UTs
* Add unit test for npmCache (need to remove redundancy in server_test.go)
* Resolve lint warnings
* CLI functions
* fix whitespace bug in CIDRmatch + go lint issue
* update main.go from master
* addressed CR comments
* addressed Matt's comments
* make config flag to be a root cmd flag only
* make config flag to be a root cmd flag only
* organized iptable parser code
* print functions for iptable object + comments and testing template for parser
* add converter package + code refactoring parser
* fix bug where the program throws an error when the length of an option's value is 1 in parser
* add tests for parseTarget and parseModule + code refactoring for parser
* add ConvertIptablesObject func
* tests for parser
* add converter UT
* experimenting with protobuf
* used constructors, getters and setters for iptables' struct
* export GrapIptableLock
* add parser for negation in npm rules + add SetInfo obj for converter + update protobuf
* move folder into npm + changes to converter
* change hack folder name + move within npm
* make changes to converter
* change gitignore
* temporarily remove http folder
* converter ut
* fix converter tests + partial tests for tupleProcessor
* fix go lint issue with json unmarshal
* changed npmcache.exec type to interface to pass tests in converter
* change back policy file
* add conditions to get npm cache and iptable-save from node
* Update const.go
* Update converter.go
* Update converter_test.go
* Changes to return error statements in converter.go
* Update converter_test.go
* Change import path
* Update iptables strings method
* Update parser.go
* Update parser_test.go
* Update networkTupleProcessor.go
* update tupleProcessor_test.go
* Delete main.go
* resolve golint issue
* fix returning errors in tupleProcessor
* changed unit tests so they are more aligned with guidelines + add cidrblocks set type placeholder
* pull updates from master
* move everything to the dataplane package
* refactoring code
* fix golint issues
* java style is not the way to Go ;)
* add more comments to the parseLine function
* fix more golint issues
* fix line length linting issues
* fix more linting issues
* add parse CIDR Block functionality
* minor bug fixes + more test coverage
* fix remaining lint issues
* minor linting issue
* fix the final linting issues this time for real
* for real
* remove todos
* addressed some CR reviews
* moved parser and iptables to their own package
* change package name
* minor comments
* change package name
* addressed more CR comments
* minor linting issue
* rename tupleProcessor to trafficAnalyzer
* remove a test that used exec
* fix parse iptable logic + re adding the previous test
* Fix save and restore function to work properly with real exec
* Delete unused save and restore codes
* Clean-up deadcodes (i.e., save and restore) for ipset
* Remove an unused variable
* basic scenario and investigation
* adding some basic increments to yaml
* First pass at adding 2nd level ipsets for multi value selector expressions
* resolving some references for drop match fields
* Adding some fixes around nil checks
* ignoring 1 value expr to be added to list mems
* not adding lists to the setMap
* correcting some UTs
* Correcting UTs.
* fixing all UTs
* Adding and cleaning some comments
* basic scenario and investigation
* adding some basic increments to yaml
* First pass at adding 2nd level ipsets for multi value selector expressions
* resolving some references for drop match fields
* Adding some fixes around nil checks
* ignoring 1 value expr to be added to list mems
* not adding lists to the setMap
* correcting some UTs
* Correcting UTs.
* fixing all UTs
* Adding and cleaning some comments
* fixing ns- prefixes for some lists and adding sort for comment
* fixing nsSelectors, flattening nsSelectors, since list of lists is not allowed in ipset
* correcting some UTs for nsselectors
* putting back removed testcase to resolve conflicts
* addressing some comments
* Addressing some comments regarding policy translation
* fixing some test after the merge
* addressing a potential nil dereference issue
* adding a test case for some corner cases in flattenNS
* adding a TODO to clean up stale code if not used in next iteration
* Fixing some golint
* Fixing a sed in github actions
* Returning in DROP chains
* adding a comment about future cleanup of chains
* Removing duplicate rules in PORT chains of TO/FROM and DROP chain jumps
* Adding an image of new chains behavior
* Addressing comments
* addressing some comments
* Correcting some UTs to not have the jump rules
* removing jump flag
* initial dump state and ipam interface update
* add reconcile command to CNI
* add integration test
* pass endpoint id on add
* address some feedback
* fix test path and linting
* address feedback and logging
* remove return and rename to PodEndpointID
* Use sharedInformer for only local storage to cache resource to make deterministic unit tests
* add comments in the codes
* add comments for future enhancement for unit tests
* fix incorrect NsMap local cache management between nameSpaceController and podcontroller
* Correct comment
* PodController adds list in ipset and npm-namespace into NsMap local cache
* [NPM] better error handling and cache building
* using const for label operations
* tests correction to discard non-exists delete op
* cleaning up some stale code
* fixing some port issues
* Addressing some comments
* Addressing some comments
* Addressing some comments
* fixing a missed ns- prefix while deleting from ns ipset
* fixing a missed ns- prefix while deleting from ns ipset
* reducing complexity when pod ip changes
* Adding some test validations
* Adding some test validations
* Adding some test validations
* removing more unused fields
* removing redundant checks
* Adding improvement of podinformer
* removing the deepcopy logic
* streamlining some redundant log messages
* moving a comment
* first version of network policy controller and its unit tests
* update reconcile and deleteNetworkPolicy function to correctly install and uninstall default Azure NPM chain.
* To explicitly manage default Azure NPM chain in deleteNetworkPolicy function
* correct comments and delete unused variable
* fix missed returning errors in codes
* Correct to check DeletionTimestamp and DeletionGracePeriodSeconds variables
* removed placeholder functions in network policy controller and added more test cases (e.g., update and adding multiple network policies)
* - applied comments (use explicit names, locating lock in a better place)
* add two methods to save and restore iptables in unit test
* comment out unused function
* early filter in updateNetworkPolicy function if they are the same network policies. Update unit tests to test more network policies events
* - start using klog package instead of log package
* remove unneeded defer for lock
* Locate of adding and deleting network policy object from our network policy cache in a right place. Correct prometheus metric code.
* use cached network policy key instead of network policy object as method parameter in cleanUpNetworkPolicy
* remove redundant check
* Remove ns- prefix as key in RawNpMap. Update UT to check prometheus metrics. Applied better naming and removed redundancy codes.
* minor update for variable names
* remove dependency between UT by re-initializing metrics. Correct message.
* Junguk cho pod controller support (#1)
* code layout to support pod controller in npm
* filter events if they do not need to handle and clean up business logic
* put lock when it needs to access shared resource
use namespace/name as key
clean up functions
* - use pod key instead of uid.
- remove unnecessary error check
* use namespace prefix in pod namespace. add log messages to know what events happens.
* move event logs with more contexts in needsync func
* Put returning an error in a right place
* Return if the RV of both pod obj are the same. Proactively start cleaning up pod when the pod is deleted.
* first version of podController UT
* add ipset management in pod controller unit test
* Make methods flexible for ipset store and restore operation
* clean up functions and variables
Co-authored-by: Junguk Cho <jungukcho@microsoft.com>
* correct and clean functions and error messages. Return errors from appendNamedPortIpsets function to retry syncPod operation
* Check npmPod exists in cleanUpDeletedPod function. Use GetIPSetListFromLabels in syncAddedPod and cleanUpDeletedPod functions. Correct error messages, functions, etc
* Clean up podController code. Make podController UT more flexible.
* clean up appendNamedPortIpsets to improve readability
* minor update (v1 -> corev1) for consistency
* clean up syncPod code to compare last applied states and new pod's states
* add validation for casting old pod. correct log message
* delete unneeded codes
* minor fix: correct comments and removed unneeded variables
* add pre-filter codes to avoid unnecessary reconcile process in updatePod event
* correct a comment
* check workqueue length to validate case where it does not need to reconcile in unit test
* Upgrading Namespace to use NSController
* Unit tests for namespace controller
* correcting testcases
* correcting testcases
* correcting testcases
* Adding in more testcases
* adding threadness to NS
* minor improvements wrt RV
* minor improvements wrt RV
* minor improvements wrt RV
* Cleaning up some stale code
* resolving comments
* Addressing some comments
* Addressing some comments
* Adding missing newcontroller
* Addressing more comments
* Addressing more comments
* ipset fail if set doesn't exist when attempting to add to list
* bail out on add to set when ip is empty
* check set type when adding to list
* revert checking nil value
* fix test build issues
* additional ip check
* ip check with parseip
* prometheus count
* delete from set checks
* delete from set checks
* log on skipping pod
* logging pipeline
* npm logs ci
* first pass at decoupling resource maps
* First pass on decoupling resource maps
* Adding telemetry capabilities to resource CRUD events
* Initializing new maps in nprMgr for tests
* Initializing new maps in nprMgr for tests
* Adding artifact for Npm logs
* Addressing comments
* Addressing comments
* First pass at implementing pod cache
* handling namedports in case of pod update
* Correcting print error
* Cleaning up pod cache update event. moving pod cache to nsMAP
* Correcting namespace prefix
* Adding in checks on portlists and Podips
* changing some variable names
* changing some variable names
* Adding resource versions checks for Pod, NS and netpols
* fixing some tests
* changing ResourceVersion to uint64 and cleaning up oldpodobj references
* rearranging hostneptol and correcting a UT failure
* Fixing the hostnet pod UT
* Addressing comments
* fixing UT
* Fixing UTs
* correcting pod delete failure bug
* Fixing clean up bug
* Handling hostnet pods in Delete pod
* Addressing comments and fixing a panic error
* First pass at implementing pod cache
* handling namedports in case of pod update
* Correcting print error
* Cleaning up pod cache update event. moving pod cache to nsMAP
* Correcting namespace prefix
* Adding in checks on portlists and Podips
* changing some variable names
* changing some variable names
* Adding resource versions checks for Pod, NS and netpols
* fixing some tests
* changing ResourceVersion to uint64 and cleaning up oldpodobj references
* rearranging hostneptol and correcting a UT failure
* Fixing the hostnet pod UT
* Addressing comments
* fixing UT
* Fixing UTs
* correcting pod delete failure bug
* merge conflict
* Adding in wait to mitigate defunct process
* Adding error handling and addressing comments
* Addressing some comments
* Adding a testcase for GetlineNumber
* Adding error message in failures
* Adding error message in failures
* correcting the name for Kubeservices chain
* Ignoring hostnetwork pods from being added into Ipsets
* generalizing the check on hostnetwork pod
* Adding tests for add, update and delete hostnetwork pods
This change will help evaluate both INGRESS and EGRESS rules before taking a decision on a packet. NPM will now MARK a packet for ingress/egress and RETURN the MARK'ed packet. The packet will then be accepted in the main chain after all the ingress and egress rules are processed.
* first pass trying to return instead of accept
* Adding initial marking capability
* Adding accept on ingress and egress marks
* Correcting an ingress marker
* Correcting unit test cases to show the appropriate markers
* Correcting a comment
* Addressing comments
* Adding port_ format for named ports
* Cleaning existing ipsets in case of an upgrade
* Changing the delimiter for port prefix
* Some basic formatting
* Adding fixes to testcases
* Addressing comments
* Adding mitigation for empty string in named port
* Changing port nil behavior. NPM does partial rule handling in port nil case
* removing exists check, as setmap is not accessible from this test
* Adding support to delete only azure-npm ipsets
* Addressing comments
* Changing azure-npm to const flag and cleaning up unwanted error log
* Changing the make entries to 0
* Accelerate metrics report from every 30 mins to every 5 mins.
* Add errCountTest metric.
* Refactor SendAiMetrics. AI initialization is in main routine while send metrics is in another go routine.
* Add aiMetadata config.
* Add SendErrorMetrics function in ai utils.
* Going to push error log to AI telemetry.
* Add error log to AI telemetry.
* Change error message format.
* Add error log and metrics to AI telemetry.
* Remove unnecessary const.
* Change heartbeat back to every 30 mins.
* Separate send log from SendErrorMetric function for better reuse.
* Change a unit test set name to avoid kernel conflict.
* Address comments. Make error log and metrics sending more generic.
* Fix typo.
* Fix indentation.
* Fix AI initialize issue.
* Remove unnecessary log.
* Use break in if condition.
* made ipset inventory metric more efficient for container insights scraping. Added metric for total ipset entries
* updated comment for GetVecValue
* changed prometheus metrics port number from 8000 to 10091 to be next to the node port used in CNS
* added cluster service for NPM Prometheus metrics (lets a scraper only scrape this service for node redundant metrics)
* separated node and cluster metrics into separate registries and HTTP endpoints
* separated functionality for getting IPSetInventory labels and made public
* updated initialization of IPSetInventory to have hash set label and changed ipsm tests to mirror this
* added two yaml options for configuring a prometheus server to scrape NPM efficiently. Removed generic prometheus annotations on NPM pod to prevent default scraping of NPM for a helm prometheus server, and added a specific annotation for the alternative prometheus server config
Co-authored-by: Hunter Gregory <t-hugreg@microsoft.com>
* prometheus additions to testmain (commented out right now)
* home of the npm prometheus metrics and tools for updating them, testing them
* add/remove policy metrics
* add/remove iptables rule metric measurements
* add/remove ipset metric measurements
* testing for gauges. want to soon remove the boolean for including prometheus in unit testing
* run http server that exposes prometheus from main
* cleaner test additions with less code
* removed incorrect instance of AddSet in the TestDeleteSet test
* added prometheus annotations to pod templates
* deleted unused file
* much more organized initialization of metrics now. now includes map from metric to metric name
* add ability to get summary count value. now getting gauge values and this new count value are done by passing the metric itself as a param instead of a string
* condenses prometheus testing code base by condensing all prometheus error messages into a function
* added testing for summary counts, condensed prometheus error handling code, and updated calls to use new form for getting metric values
* update based on variable spelling change in metrics package
* Added comments for functions and moved http handler code to the http file
* fixed problem of registering same metric name for different metrics, and passing in the wrong param type for testing
* made prometheus testing folder with interactive testing file. moved old random metric flux testing function over from ipsm_test
* moved testing around again
* fixed spelling mistake
* counting mistake in unit test
* handler variable was in wrong file. Changed stdout printing to logging
* fixed parameter errors and counting error in a test
* moved utilities for testing prometheus metrics to npm/util. Updated StartHTTP to have an additional parameter for waiting after starting the server
* updated uses of StartHTTP to have the extra parameter
* updated GetValue and GetCountValue uses to use the prometheus features of the util package, which is now moved to a promutil package within npm/metrics/
* removed unnecessary comments, removed print statement, and added quantiles to all summary metrics
* fixed problem of double registering metrics
* wait longer for http server to start
* moved tool in test-util.go to promutil/util.go
* fixed timer to be in milliseconds and updated metric descriptions to mention units
* removed unnecessary comments
* http server always started in a go routine now. Added comment justifying the use of an http server
* debugging http connection refused in pipeline
* fixed syntax error
* removed debugging wrapper around http service
* sleep so that the testing metrics endpoint can be pinged
* redesigned GetValue and GetCountValue so that they don't use http calls
* removed random but helpful testing file - will write about quick testing in a wiki page
* milliseconds were being truncated. now they have decimals
* use direct Prometheus metric commands instead of wrapping them
* removed code used when testing was done through http server. Moved registering to metric creation functions
* added createGaugeVec, updated comments, made all help strings constants
* added metric that counts number of entries in each ipset. still need to add tests
* fixed creation of GaugeVecs, and use explicit labeling instead of order-based labeling now
* updated GetVecValue method signature
* added set to metrics on creation and wrote unit tests for CreateSet, AddToSet, DeleteFromSet, DeleteSet
* use custom registry to limit content that Container Insights scrapes. Also log the start of http server
* wrote TODO item comments for Restore and Destroy (currently these functions are only used in testing)
* NPM won't crash if a Prometheus metric fails to register now (unlikely). Added logging for metric registration/creation, and explicit public function to initialize metrics so that we can finish log config first
* initialize metrics in unit tests
* renamed util.go to test-util.go
Co-authored-by: Hunter Gregory <t-hugreg@microsoft.com>
* Changes for caching pod ip
* Test fix for API changes
* added test
* Fixed merge conflicts
* Add tests for pod cache
* Add one more check to validate the cache
* Incorporated the comment
Co-authored-by: neaggarw <neaggarwMS@users.noreply.github.com>
* Move AZURE-NPM chain under KUBE-SERVICES chain; Move default allow ESTABLISHED/RELATED entry to the end of AZURE-NPM chain.
* Find index of KUBE-SERVICES chain.
* Initial changes to support named ports.
* add support for named ports via ipset ip+port hash
* fixing a couple of operational bugs
* adding simple test to validate named port parsing
* Remove old npm chains which were causing errors on uninit
* Utilize rawNpMap and refrain from updating policies with no change.
* redacted
* add added policy to processedNpMap
* The value of minor was incorrectly assumed to be e.g. 14.8-hotfix.20191113 instead of 14+
* adding Jonathan Chauncey's test
* addressing Robbie's comments
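The minor-version entry in the list above notes that the reported value can look like `14+` rather than a bare number. A minimal sketch of tolerant parsing follows, assuming only the leading digits matter; `parseMinor` is a hypothetical helper, not the function used in the repository.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseMinor extracts the leading digits of a minor version string such as
// "14+" and ignores any trailing suffix.
func parseMinor(minor string) (int, error) {
	end := 0
	for end < len(minor) && minor[end] >= '0' && minor[end] <= '9' {
		end++
	}
	if end == 0 {
		return 0, fmt.Errorf("no leading digits in minor version %q", minor)
	}
	return strconv.Atoi(minor[:end])
}

func main() {
	for _, s := range []string{"14", "14+"} {
		n, err := parseMinor(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %d\n", s, n)
	}
}
```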
log.SetTarget creates the log file under the log directory using the Go os package. Whenever code set the log directory, it also needed to call SetTarget to create the actual log file under that directory. In the recent logger changes, InitLogger set the log directory to the current folder by default, which created the log file in the current folder. The code then set the log directory to a different location without a subsequent call to log.SetTarget. As a result, the logger could not find the actual log file in the configured log directory.
This fix updates the logger's InitLogger function to accept the log directory so that the file is created in the correct directory. To avoid similar issues, this fix also combines the calls that set the log directory and set the target into a single function. This prevents out-of-order calls from causing the same problem.
* poll api-server version for a minute before panicking
* always add namespace set, when adding nw policy
* create the ns set in add pod, if add namespace has not been called yet
* give precedence to drop rules (over allow)
* Moving kube-system-chain above target-sets-chain
* Add drop entry at the end of Ingress-From and Egress-To chains when there are non Allow-All* entries
* write logs to stdout (and log file) so that we can see logs via kubectl
* removing kube-system chain and fixing tests
* removing telemetry buffer
* Retrieve and append the appropriate default drop entries based on policy type.
* Modifying translate_policy unit tests that use getDefaultDropEntries.
* Address Yongli's comments
* change telemetry to message queue and add npm
* remove [Azure-NPM] prefix
* remove npmreport url
* fair scheduling
* holds up to 1k reports for each type
* fix cap on reports
* Adding telemetry report functions for DNC.
* Addressing Yongli's suggestions.
* commit to switch branches
* Adding some changes to npm due to telemetry change.
* Modifying tests for interface reports...