This is required for maintaining specific endpoints that should not be exercised, and is a cleaner way
to do so than the existing path_regex switch.
Also:
- fixed a bug in path_regex.
Testing:
- manual testing on demo_server.
- updated unit tests
Partially implements #282. A new property will be added separately to completely exclude DELETEs from running.
1) The body could be declared as an object, and contain a constant in a restler_fuzzable_object.
2) The authentication token element should not be deleted, because currently it is used as a body delimiter.
3) Handle the case when the body is not json (e.g. a string).
This case is not currently supported in RESTler, but it should not crash.
- Added the primitives to 'get_original_blocks'.
- Added test.
Note: get_blocks was not modified - payload body checker support
will need to be addressed separately. This fix only addresses 'get_original_blocks',
which is used in the main algorithm to generate payloads.
- Fixed compiler inconsistency - handle restler_custom_payload_uuid4_suffix
in the same way as other primitives for quoting.
- Fixed example handling in main algorithm
- fixed several quoting bugs
This change is needed for the invalid values checker to work correctly.
Instead of generating constant examples in the grammar, generate fuzzable elements with example values. The engine has been updated to use only the constant example value in the main algorithm.
The checker can then find the fuzzable elements that need to be fuzzed.
Testing:
- manual testing
Currently, if a user wants to fuzz invalid values, this is done by adding
values to the main dictionary and testing them through the main algorithm.
This is not the correct approach, because RESTler tests all combinations of
the parameter values in a request, and testing all combinations of the
invalid values is not desirable. Today, users must re-run RESTler with
custom payloads in order to test large numbers of invalid values
for each parameter efficiently.
This 'invalidvalue' checker is introduced in order to have a separate way to specify the
invalid values. A static dictionary and/or custom value generators for
invalid values may be provided.
If both a static dictionary and custom value generators are specified,
static values are tested first, then the value generator is used for
the remaining combinations (up to max combinations).
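The ordering described above can be sketched as follows. This is a simplified illustration for a single parameter, not the checker's actual code; `invalid_values` and `counting_generator` are hypothetical names.

```python
from itertools import islice
from typing import Callable, Iterator, List


def invalid_values(static_values: List[str],
                   value_generator: Callable[[], Iterator[str]],
                   max_combinations: int) -> List[str]:
    """Return up to max_combinations values: static ones first, then generated."""
    selected = static_values[:max_combinations]
    remaining = max_combinations - len(selected)
    if remaining > 0:
        # Only consult the generator for the combinations that are left.
        selected.extend(islice(value_generator(), remaining))
    return selected


def counting_generator() -> Iterator[str]:
    # Hypothetical generator that yields an endless stream of invalid values.
    n = 0
    while True:
        yield f"generated_{n}"
        n += 1


values = invalid_values(["", "null"], counting_generator, max_combinations=5)
print(values)
```

With two static values and a maximum of five, the remaining three values come from the generator.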
Remaining work:
- Issue a warning message if there is an insufficient number of combinations to test
all of the parameters.
Settings file:
- max combinations is max combinations per request.
- One value per parameter is always fuzzed, so if there are 1000
parameters, this maximum will be exceeded.
"invalidvalue": {
"max_combinations": 10,
"custom_value_generators": "path/to/valuegen.py",
"custom_dictionary": "path/to/invalid/dict.json"
}
Test mode is modified as follows:
- instead of just testing the first schema, test all examples (if available), then
the request with all parameters from the specification, and then
the same request with optional parameters filtered out.
- 'max_combinations' is applied as usual: if there are 7 examples, and the schema has 10 value combinations, RESTler will test: 7 value combinations, 1 for each example; 10 value combinations for schema with all parameters; 3 value combinations for schema with required parameters
- Fixed the examples 'get_blocks' to return the constant rather than enum type definition.
Test changes:
- Modify quick start test baseline - now that more combinations are being explored in the 'Test' phase, coverage goes up to 6 / 6
```substitute_body``` did not handle the following 2 cases:
1) Optional body: when the first example does not have a body, RESTler could not plug in a generated body, so the body was never fuzzed
=> Fixed by appending the body to the end once this case is confirmed
2) Empty body or body specified inline: RESTler failed to find the body because there was no static string with the body start character.
=> fixed by searching for the body start character starting from the authentication token
Testing:
- manual testing
This change has two parts:
1) Remove extra slash in the compiler. This makes only one slash appear in the grammar.
2) Remove extra slash in the engine. This is not needed to fix the reported issue, but will handle the case where the user
dynamically specifies a basepath that ends in a slash in the settings.
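A minimal sketch of the slash handling, assuming the base path and endpoint are plain strings; `join_basepath` is a hypothetical helper, not a function from the engine.

```python
def join_basepath(basepath: str, endpoint: str) -> str:
    """Join a base path and an endpoint with exactly one slash between them."""
    if not basepath:
        return endpoint
    # Drop the trailing slash of the base path and the leading slash of the
    # endpoint, then join them with a single slash.
    return basepath.rstrip("/") + "/" + endpoint.lstrip("/")


print(join_basepath("/api/", "/blog/posts"))
print(join_basepath("/api", "/blog/posts"))
```

Both calls produce the same path, so a base path from the settings that ends in a slash no longer yields a double slash.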
Testing:
- manual testing
Closes #580
For debugging purposes, `cmd_result` needs to be logged before the `parse_authentication_tokens` function is called. Otherwise, if `parse_authentication_token` throws an exception, it is hard to know what happened.
The `metadata = ast.literal_eval(metadata)` call in `parse_authentication_token` throws easily if the token script emits unexpected output. For example, on some OS platforms, log lines are written by the system rather than by the script. In this case, an exception is thrown: `Error: invalid syntax (<unknown>, line 1)`
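The failure mode can be illustrated with a small sketch. It assumes, for illustration only, that the first line of the token command output is a Python dict literal; `parse_metadata` and the logger name are hypothetical and not the engine's actual code.

```python
import ast
import logging

logger = logging.getLogger("auth_sketch")  # hypothetical logger name


def parse_metadata(cmd_result: str) -> dict:
    """Parse the metadata line, logging the raw output before parsing."""
    # Log first, so the raw output is available even if parsing fails.
    logger.debug("Raw token command output: %r", cmd_result)
    lines = cmd_result.splitlines()
    first_line = lines[0] if lines else ""
    try:
        metadata = ast.literal_eval(first_line)
    except (SyntaxError, ValueError) as error:
        raise ValueError(
            f"Could not parse token metadata from {first_line!r}: {error}"
        ) from error
    if not isinstance(metadata, dict):
        raise ValueError(f"Expected a dict literal, got {first_line!r}")
    return metadata


print(parse_metadata("{'app1': {}}\ntoken-line"))
```

If an unexpected line such as a system warning precedes the metadata, `ast.literal_eval` raises `SyntaxError`, and the logged raw output shows what the script actually printed.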
Closes #506.
Testing:
- manual testing by adding the checker to the engine settings and invoking it in 'Test' mode by adding ```--enable_checkers demo``` on the command line.
```
"custom_checkers": ["C:\\git\\restler-fuzzer\\restler\\checkers\\demo_checker.py"]
```
This migration is being done in order to avoid recurring
flask-restplus related dependency breaks
and to resolve a recent vulnerability alert in a dependent component.
Note: coverage collection that was done on the flask-based
demo_server was not re-tested on the FastAPI-based one, and hence may not work.
This was only used for initial fuzzing experiments
(from which the graphs were generated for the first research paper),
hence it should not impact anyone actively using demo_server.
Testing: end to end tests that use demo_server passing in CI pipeline.
This change makes several fixes and enhancements to the schema parser and request
definition generator as well as adds a sanity test for a large portion
of the existing functionality.
Specific changes include:
- added 'get_original_blocks' function to each param object, which is supposed to generate
exactly the python grammar corresponding to the original json grammar.
The existing 'get_blocks' and 'get_fuzzing_blocks' do not satisfy this requirement.
They are kept for backwards compatibility for the payload body checker.
- fixed several bugs in schema parsing
- added missing elements to the data types
- added the ability to filter schema nodes using a filter function
(this will be used in a subsequent change)
Testing:
- new unit test
- manual real-world testing of Test, test_all_combinations, FuzzLean on several services
These settings can be used to more conveniently limit the total number of different schemas tested
per request.
Testing:
- manual testing.
- added settings unit test.
Header examples were already present in the grammar,
but were not yet implemented in the engine.
This change enables the following additional fuzzing with header examples:
- Test all combinations will use header examples if present
(previously, it would have used the default header schema)
- Test each header example with the examples checker (consistently with the
other examples coverage supported by this checker).
Note: 'useHeaderExamples' is currently 'False' by default - they are not
included during compilation unless this setting is added to config.json and
set to 'True'.
Testing:
- Manual test with demo_server with added header example.
- Modified existing tests to have header parameter examples in the grammar.
A single dictionary was used for per-request value generators, based on the index of the request definition block.
When multiple schemas are tested, each has its own request definition blocks, so per-schema value generators must be
maintained.
The case when multiple examples were present, and only some of them had a body example, was not handled in the checker.
Testing: manual testing with the original repro.
Closes #505.
The payload body checker filters out the dictionary values for restler_fuzzable_string when calculating coverage metrics based on (error code, error message). The issue above is that when a dynamic generator is specified instead of a static list, this filter can no longer be initialized once at the start of fuzzing. Moreover, the filter does not address every type of property (e.g. custom payloads), so it is incomplete.
As a workaround, simply skip the dynamic generator when initializing the static list of strings to exclude from analysis as above.
The root cause fix should be addressed by either implementing a better bucketization scheme (similar to results analyzer), or by having the analysis exclude all fuzzables by traversing the initialized python grammar elements (similar to invalid value checker).
1. Fixed replay logic, which was broken after shared connection perf improvements.
2. Added end to end test for FuzzLean and Replay that caught the original regression.
3. After adding the two new tests, 2 other issues were found and fixed.
a) The demo_server did not consume all the bytes of the request on all failure
paths. This was only caught during FuzzLean when invalid payloads were specified
in the path and body. The fix was to provide a teardown function that consumed
the entire request. The following issue has more details:
https://github.com/pallets/flask/issues/2188.
b) test-quick-start hung in CI because the process output from demo_server was not consumed.
This was not caught previously because there is more output in Fuzz modes.
Empty request ids (endpoint = "/") were not handled correctly.
This was caught by a user observing the error message
"Request from grammar does not exist in the Request Collection!"
The impact of this was that the requests were not fuzzed
with any checkers that use grammar.json (e.g. the payload body checker).
Testing: manual testing with simple repro.
This change enables values to be generated dynamically for fuzzable
and custom primitives in RESTler.
For every combination (up to 'max_combinations'), if a custom generator
is available for a fuzzable or custom primitive, RESTler will use it to
generate a new value instead of using values from the static dictionary.
The user can add value generators in python corresponding to fuzzable and
custom primitives (see documentation for the list of supported primitives).
RESTler fetches the generators from a dictionary with the same structure
as the custom values dictionary. The generator dictionary must be defined in a
python file and specified in the engine settings (see documentation).
A template file with placeholders for all possible generators corresponding
to the dictionary is generated when compiling the spec.
**Implementation notes:**
When RESTler generates sequences of requests, the prefix of the sequence
(all requests up to the last one) contains requests that have been
previously successfully executed. In order to send the request with
the same set of values in the presence of dynamic generators, the
fuzzable values previously sent are cached and simply plugged in
when a sequence is being re-rendered.
There was some complexity required in integrating with 'render_iter',
which to date has been stateless - if skip=N was specified, it simply
re-generated all the combinations in order to get to N.
1) The same value needs to be used for re-rendering sequences.
This is solved as noted above, by saving the previously rendered values.
2) Since it's not clear how generated values should be combined with others for a given schema and
primitive type, a simple solution was chosen to fetch a new value any time it is needed.
This ensures that all the static values will be combined as usual up to 'max_combinations',
and the necessary generated values will be plugged in as needed.
For example, if there are two fuzzable strings in the request,
and 3 'restler_fuzzable_string' values in the static dictionary,
a total of 9 combinations will be generated.
If these same 3 values are returned in a value generator instead,
only 3 combinations will be tested ((v1,v1), (v2,v2), and (v3, v3)).
3) Before testing, RESTler iterates to see how many total combinations there are.
The state of any cached values, including the value generators, must be reset after
this initial count to make sure all of the values intended to be tested are used.
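The combination counts from item 2 can be reproduced with a small sketch. It assumes one generator instance per fuzzable primitive, each consulted once per combination; this models the described behavior and is not the engine's implementation.

```python
from itertools import product

static_values = ["v1", "v2", "v3"]

# Static dictionary: each of the two fuzzable strings ranges over all values,
# so the full cross product is explored.
static_combinations = list(product(static_values, repeat=2))


def value_generator():
    # Hypothetical generator returning the same three values, one per call.
    yield from static_values


# Value generators: a new value is fetched each time one is needed, so both
# primitives advance together and no cross product is formed.
gen_a, gen_b = value_generator(), value_generator()
generated_combinations = list(zip(gen_a, gen_b))

print(len(static_combinations))
print(generated_combinations)
```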
Because requests are deep copied when saving rendered sequences,
special logic needed to be added to exclude the generators and associated data from
being copied.
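One way to keep generators out of a deep copy is a custom `__deepcopy__`, sketched below. `RenderedRequest` is a hypothetical stand-in for the engine's request type, and sharing the generators by reference is an assumption about what "excluding" means here.

```python
import copy


class RenderedRequest:
    """Hypothetical stand-in for a request that carries value generators."""

    def __init__(self, values, value_generators):
        self.values = values
        self.value_generators = value_generators

    def __deepcopy__(self, memo):
        clone = RenderedRequest.__new__(RenderedRequest)
        memo[id(self)] = clone
        # Rendered values are copied so the clone can be modified safely.
        clone.values = copy.deepcopy(self.values, memo)
        # Generators cannot be deep copied; share them by reference instead.
        clone.value_generators = self.value_generators
        return clone


request = RenderedRequest(["a"], {"restler_fuzzable_string": (c for c in "abc")})
request_copy = copy.deepcopy(request)
print(request_copy.values, request_copy.value_generators is request.value_generators)
```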
Testing:
- Manual testing with demo_server
- New compiler baseline test for the generated dictionary
- Engine unit test to confirm sequence re-rendering works
correctly with the same values when values are dynamically generated
- Engine test to confirm that the number of combinations is correct
Changes include:
- Implement reconnect logic in all parts of RESTler.
- Improve logging.
- The old behavior (new connection on every request) can be restored with the
reconnect_on_every_request setting in the engine settings.
- Fixes so all parts of RESTler are using the same throttling logic.
Testing:
- Manual testing on large services with throttling. Confirmed all objects are GCed.
- Manual testing of perf improvement.
The tests started failing due to a break in demo_server dependencies.
This change implements the following:
1) Resolve flask import issue - pin dependency to an earlier version.
2) Add more debugging output to the quick start script to be able to
diagnose similar failures from build logs.
Added new engine option as follows:
```
"cache_prefix_sequence_settings": {
"per_request_settings": [
{
"methods": [ "GET", "HEAD" ],
"endpoints": "*",
"reset_after_success": false
}
]
}
```
Testing:
- demo_server manual testing
- added unit test
- manual testing with large services to confirm all the resources are cleaned up
Closes #469
The condition did not account for the arguments switching to a tuple.
This regressed in #467 (the previous change).
This was not caught by existing tests because there are no unit tests for the
dynamic objects cache, and this only caused the GC to have to
go through more entries.
Before this change, only 'restler_custom_payload_uuid4_suffix' could have
an associated input-only producer.
Now, any parameter that can be annotated may be associated with an
input-only producer.
Testing:
- Manual testing: modified demo_server to have a writer.
- Added unit test
After this change, producer-consumer dependencies may be defined in the grammar
with input variables for any fuzzable or custom payload primitive (not constants).
Closes #448.
Added new primitive 'restler_basepath' and a new engine setting 'basepath'.
If the engine settings do not specify the base path, the one from the
specification is used.
Updated test baselines.
* fix: avoid making substitutions in auth token, plus add new option to skip_uuid_substitution
Co-authored-by: Marina Polishchuk <marinapo@microsoft.com>
Closes issue "Conflict should not be always re-tried" (#416).
For backwards compatibility, kept a hard-coded default value with the original re-try for which the
unconditional 409 was added.
Also added a custom re-try interval setting.
Logs may be turned off via a new option, "disable_logging".
This can be useful for long fuzzing runs when it
is not desired to log every request for performance reasons.
This causes perf issues when fuzzing a large number of strings.
Printing all of the array elements is not necessary - printing these one time at the top
of the network log is sufficient.
Closes #266
Using RESTler in CI/CD, it was reported that runs occasionally take 10 minutes while usually finishing in a few seconds.
This was determined to be due to DELETE requests that non-deterministically hung,
and the RESTler GC not using the timeout in the settings, but the default parameter value timeout of 600s.
As part of this fix, the default value of 600 is removed to avoid such issues in the future.
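Removing the default can be sketched as a keyword-only parameter without a default value, so a caller that forgets to pass the configured timeout fails immediately instead of silently waiting 600s. `send_delete` and its `send` callback are hypothetical, not the engine's API.

```python
from typing import Callable


def send_delete(send: Callable[[str, float], str], endpoint: str, *,
                timeout_sec: float) -> str:
    """Send a DELETE request; timeout_sec is required and has no default."""
    if timeout_sec <= 0:
        raise ValueError("timeout_sec must be positive")
    return send(endpoint, timeout_sec)


def fake_send(endpoint: str, timeout_sec: float) -> str:
    # Stand-in transport that only reports what it was asked to do.
    return f"DELETE {endpoint} (timeout={timeout_sec}s)"


print(send_delete(fake_send, "/blog/posts/1", timeout_sec=20))
```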
Fix #1: The main parsing logic did not support restler_custom_payload_header and restler_custom_payload_query.
Fix #2: The ad-hoc grammar.py parsing for header parameters that should not be fuzzed
did not handle the case when they are on multiple lines, which is the case for Content-Type after recent changes.
Fix #3: The 'DictionaryCustomPayload' payload type should be included in the list of schema parameters, since these are
the additional parameters injected by the user.
These issues were found testing parameter combinations with demo_server in the presence of the above grammar elements.
Testing: manual testing.
(Lack of automated tests for the schema parser is a known test gap that will be addressed soon.)
Previously, if async polling from a PUT response in a certain format was received, that was used as the final
response from which to parse dynamic objects, otherwise the GET request was used. This did not always work -
in some cases, there were properties missing from the response or an empty body was returned.
After this change, the first response on which the response parser will be invoked will be the one from the GET
request. If the GET fails or its response cannot be parsed, the response from the PUT will still be tried as before.
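The new order of precedence can be sketched as follows, assuming responses are JSON strings and that a parser raises on missing properties; `parse_dynamic_objects` and `parse_id` are hypothetical names.

```python
import json
from typing import Callable, Optional


def parse_dynamic_objects(get_response: Optional[str],
                          put_response: Optional[str],
                          parser: Callable[[str], dict]) -> Optional[dict]:
    """Try the GET response first; fall back to the PUT response."""
    for response in (get_response, put_response):
        if response is None:
            continue  # The request failed, so there is nothing to parse.
        try:
            return parser(response)
        except (ValueError, KeyError):
            continue  # Parsing failed; try the next candidate.
    return None


def parse_id(response: str) -> dict:
    # Hypothetical parser: requires an 'id' property in a JSON body.
    return {"id": json.loads(response)["id"]}


print(parse_dynamic_objects('{"id": "from-get"}', '{"id": "from-put"}', parse_id))
print(parse_dynamic_objects(None, '{"id": "from-put"}', parse_id))
```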
Testing: ad-hoc testing using original repro (in progress).
When specified in the 'pre_send' section of the grammar, reader variables
are added to the _consumer list, in the same way as when they are
part of the payload.
Testing: tested manually by adding the following annotations to demo_server,
and confirming that every sequence starts with the GET request.
{"x-restler-global-annotations":
[{
"producer_endpoint": "/blog/posts",
"producer_method": "GET",
"consumer_endpoint": "/blog/posts",
"consumer_method": "POST"
}]
}
This change fixes a regression due to a missed update to the grammar.json schema parsing in the engine.
This caused custom payloads that were not strings (e.g., Uuid suffix and Query) to be incorrectly generated.
Do not add dates by default; add an engine setting that may be specified to keep the current behavior.
(Note: a further fix to the current behavior is needed to generate dates in the format corresponding to the
ones specified in the spec or dictionary.)
- Also fix flaky engine test.
Today, when valid payloads are sent, async polling is not done and the objects
are leaked.
This fix makes the checker behave the same way as everywhere else - async wait
until the dynamic objects can be parsed out of the response
and garbage collected to avoid quota issues.
Note: this is a quick fix to attempt to resolve the observed quota issues.
It may be desirable to improve this in the future, for example, to avoid
waiting for async polling if all the dynamic objects are available from
names in the request (to avoid waiting to parse the response before deleting it).
The above may not be useful in all cases, since some resources may need to be
fully created before they can be referenced in order to be deleted.
Recently, RESTler was changed to return rendering information for sequences with a prefix that failed to re-render (for improved root cause analysis in speccov.json).
However, the checkers should not be executed in this case. This was missed, and was only discovered when it triggered a crash in the leakage rule checker.
The fix is to simply not run the checkers in this case (it used to be covered by the check that the rendered sequence is None).
This was not caught because there are only 'smoketest' tests for failures to re-render, not covering the checkers.
1) Compiler changes:
- add kwargs and parse out headers from the parser arguments
- add ability to infer header dependencies using the header names, similar to
query parameters
- update test baselines
- add new tests
2) Engine changes:
- parse the headers and invoke the parser with the additional named argument
The bug was due to a missing parenthesis. Fixed this and another location to use a list instead.
Testing: we have no unit test covering this code path, so this needs to be confirmed with the
original repro. Automated testing in CI for async should be added, but this will not be
a part of this change.
When the example lists were changed from sets to lists that may contain 'None' values,
the examples checker should have been updated to filter them.
This was not covered by existing examples checker tests.
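A minimal sketch of the filtering, assuming examples are held in a plain list in which 'None' marks a missing example; `usable_examples` is a hypothetical helper.

```python
from typing import List, Optional


def usable_examples(examples: List[Optional[str]]) -> List[str]:
    """Drop 'None' placeholders while preserving the order of real examples."""
    return [example for example in examples if example is not None]


print(usable_examples(['{"id": 1}', None, '{"id": 2}']))
```

The check uses `is not None` rather than truthiness, so an empty-string example is still kept.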
Testing: manual testing.
This was intended in #320, but not currently working due to a bug.
For schema combinations: the local max_combinations take precedence over the global value.
However, for overall combinations (schema + values), the usual global setting of 'max_combinations' is used.
If the settings are 10 parameter combinations, and 100 schema+value combinations, multiple schema combinations will only be tested if there are fewer than 100 schema+value combinations per schema.
Testing: manual testing with engine settings where max_combinations is not specified for header combinations.
**BREAKING CHANGE** the settings for test combinations now have an object instead of a single value.
This additional setting allows configuring different RESTler runs with different maximum values for
combinations. For example, shorter CI/CD invocations may be configured with fewer combinations than
longer or one-off regression tests.
A response of the HEAD method does not have a body,
but returns the content-length of the body that would have been returned.
RESTler needs to handle this when reading the response.
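The rule can be sketched as a helper that decides how many body bytes to read. `expected_body_length` is a hypothetical name, and the sketch ignores chunked transfer encoding.

```python
def expected_body_length(method: str, headers: dict) -> int:
    """Return the number of body bytes to read for a response."""
    if method.upper() == "HEAD":
        # A HEAD response carries Content-Length but never a body.
        return 0
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return int(normalized.get("content-length", 0))


print(expected_body_length("HEAD", {"Content-Length": "1024"}))
print(expected_body_length("GET", {"Content-Length": "1024"}))
```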
Closes #317.
Currently, if a response returns a boolean, it is read as follows:
data = json.loads(response)
boolean_var = str(data["propertyName"])
This causes the json value 'true' to be converted to Python 'True', which then incorrectly initializes
the dynamic object to 'True' instead of 'true'.
The previous fix for this was incorrect - the correct fix is to only convert the value, if boolean, at
the time it is parsed from the response.
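The problem and a possible conversion at parse time can be sketched as follows; `to_payload_value` is a hypothetical helper and the property name is illustrative.

```python
import json


def to_payload_value(value) -> str:
    """Convert a parsed JSON value to the string used in later payloads."""
    if isinstance(value, bool):
        # str(True) is 'True', which is not valid JSON; emit 'true'/'false'.
        return "true" if value else "false"
    return str(value)


data = json.loads('{"propertyName": true}')
print(str(data["propertyName"]))               # the incorrect conversion
print(to_payload_value(data["propertyName"]))  # the JSON-compatible form
```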
The default was previously 20s, which users reported to be too small; it is also desirable
to avoid having to configure individual resource timing delays for outliers.
Increasing the time to 4 minutes to address this issue.
Before, the schema parser simply used the dynamic object identifier
as a string (so any payload containing a dynamic object was invalid).
This change fixes this by getting the dynamic object value.
Testing: manual testing with real-world example.
Also contains a small change to improve exception handling in restler.py.
This works similarly to testing all header combinations, using the following engine setting:
```json
"test_combinations_settings": {
"query_param_combinations" : "all"
}
```
Closes #278
Testing: manual testing with demo_server.
This change adds support for producers that are not returned in the response.
Currently, this is supported via annotations only (which are specified in the same
way as usual). For example, if PUT /A/{aId} does not return 'aId' in the
response, prior to this change it was not possible for GET /A/{aId} to use
the same ID if they are uniquely generated. This is enabled by this change
by adding an annotation:
{
"producer_endpoint": "/A/{aId}",
"producer_method": "PUT",
"producer_resource_name": "aId",
"consumer_param": "aId"
}
In the future, scenarios such as the one above will also be enabled without
having to provide annotations.
See #114
Use restler_custom_payload_query to inject query params not in the spec.
These parameters will be injected at the end of the query string for every request.
An example use case for restler_custom_payload_query is an application ID or authentication token that needs to be passed in the query.
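Appending the injected parameters can be sketched as below, assuming they are available as a name-to-value mapping; `append_custom_query` and the `appId` parameter are hypothetical.

```python
from urllib.parse import urlencode


def append_custom_query(url: str, custom_params: dict) -> str:
    """Append injected query parameters at the end of the query string."""
    if not custom_params:
        return url
    # Start a query string if the URL has none, otherwise extend it.
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(custom_params)


print(append_custom_query("/blog/posts?page=1", {"appId": "abc"}))
print(append_custom_query("/blog/posts", {"appId": "abc"}))
```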
Closes #287
Today, RESTler only supports producer-consumer dependencies by
parsing producers out of the response.
This enables a producer to be specified only as an input parameter.
In the grammar, a new named parameter called 'writer' is added to
the custom_payload_uuid4_suffix parameters. The generated value
is then written similarly to how values would be parsed out of the response,
to be used in subsequent requests.
(Currently, only uuid4_suffix custom payloads are supported, but we may
add support for other custom payloads if that is a useful scenario.)
This partially implements #114
Bug 1: for requests without bodies, a placeholder needs to be added to correctly represent what is in the grammar, because
the examples are currently tied together via the index only.
Bug 2: incorrect property name
(Note: this means there is no test coverage for having both query and body examples in unit tests.
This needs to be addressed; a separate issue will be opened.)
To use this in 'Test' mode:
--test_all_combinations must be specified.
in the engine settings, also specify this property:
"test_combinations_settings": {
"example_payloads": "all"
},
* Add support for testing multiple header combinations.
See SettingsFile.md for documentation.
* Fixes for the payload body checker.
* Fix quoting bug.
* Fix bug where parameter required/optional were not passed through in the compiler correctly.
* Make it possible to analyze flaky sequence failures without parsing full network logs.
This update logs a sample request for sequence failures in the spec coverage file.
The seq.render() code path has been updated to return a meaningful error in case of all failures, and return the full sequence information in case of sequence failures. This data can now be added to the spec coverage log.
For large-scale testing, it is useful to have the specific values of
parameters fuzzed in the spec coverage file, in order to facilitate
root cause failure analysis.
This change adds an option to the compiler that produces a grammar which
contains the parameter names to track for every fuzzable value.
Testing:
- manual testing
- added/fixed engine tests to make sure new code path is exercised
- Note: the payload body checker code is not modified in this change because
it does not require this feature.
* Track parameter combinations in the spec coverage file.
This change adds a new option to track parameter combinations to the engine.
When the fuzzing mode is 'test-all-combinations',
a new property 'tracked_parameters' will appear in speccov.json.
This property contains key-value pairs of all of the parameters
for which more than one combination is being tested.
In this commit, the parameters for which tracking is supported are:
- enums
- custom payloads
For example, an enum 'per_page' with several values will appear as:
"tracked_parameters": {
"per_page_14": "2"
}
The suffix '14' is used to disambiguate between several primitives
of the same name appearing in the payload, and is the position of the argument
in the request definition.
Full support for tracking fuzzable primitives will be enabled in a future
update. For fuzzable primitives, the name of the parameter or property in the payload
is not yet included, and a default 'tracked_param' name is used instead.
For example:
"tracked_parameters": {
"tracked_param_14": "1"
}
* Fix failing unit test - update needed to payload body checker.
* update doc
* New test mode for exhaustive regression testing.
Customer request: in 'Test' mode, instead of stopping at the first
valid request, allow testing all available parameter combinations.
This change implements the following:
- Integrate directed smoke test implementation into the main
fuzzing loop, because the code is now largely shared
- Incrementally log spec coverage (also a customer request).
Testing:
- demo_server in both modes
- unit tests
- manual diffing of before & after on various scenarios
* Add rendering cache.
- Now prefixes are cached, and no longer re-rendered from scratch.
- All prefixes of a sequence are also no longer tested each time - only
the generations required by the "goal sequence".
- ABC test and ABC test with "invalid B" added
- Also fixed a long-standing bug in the renderings output for the smoke test mode.
fixes
* Address more PR feedback.
* More PR fixes.
* more logging and workaround for bug hit during examples checker run
* fix crash due to the examplesChecker injecting a None/null value
Co-authored-by: Marina Polishchuk <marinapo@microsoft.com>
In preparation for merging #233, we need to have a simple unit test with which we can review
the behavior change. This commit adds such a unit test.
The unit test has 5 requests: A, B, C, D, and E. A and B do not have any dependencies, while C and D
both depend on both A and B, and E depends on D.
The baseline 'abc_smoke_test_testing_log.txt' captures the current behavior with respect to this grammar.
In the current implementation, all sequences for A, B, C, D will be
rendered "from scratch", but the sequence for E will reuse the 'D' prefix.
* Working CBA changes
* working changes
* Changes for CBA
* remove cert password arg
* refactor to new user args method
* revert changes on create_connection
* Revert pythonPath local change
* Update SettingsFile.md for new setting
* Additional spacing
* tooltip update
* add tests for client_certificate_path
* forgot comma
* escape slashes, forward slash to back slash
* matching urls
* Revisions/refactors
* missing quote, revision
* Option type the value parse
Co-authored-by: landenblackmsft <landenblackmsft@users.noreply.github.com>
* Support inline examples as fuzzable values
This change introduces the following behavior:
- If example values are specified in the Swagger specification, they
are plugged into default values in the grammar (instead of 'fuzzstring', etc.).
These do not affect the schema.
- If example payloads are specified outside Swagger, these are used
in the same way as before - both to determine the schema and to select values.
The two are mutually exclusive. For example, if 'useQueryExamples' is specified,
and a query example is found in external payload examples, they are used and
the Swagger spec values are ignored.
* Also support multiple examples in the schema.
* another fix
* Remove the previous code to ignore values in the grammar, since
they may be example values.
Add logic to eliminate duplicates, which handles the default generated grammar and dictionary,
since they will be initialized to the same values.
* Refactoring to use the same default values in the grammar and dictionary.
* Fix unit tests.
* Fixes
* Fixes
restler_custom_payload_header should be treated like any other custom payload.
Remove the special case handling in the engine, which causes an extra header
name to be printed.
Closes #192.
* Error: 'utf-8' codec can't decode byte 0x9d in position 681: invalid start byte
If decoding fails and ignore_decoding_failures is set - then try again to decode with "ignore" setting
* addressing pr comments
* update documentation and tests
Co-authored-by: stas <statis@microsoft.com>
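The retry described in the decoding fix above can be sketched as follows. `decode_response` is a hypothetical helper; only the `ignore_decoding_failures` setting name comes from the change description.

```python
def decode_response(raw: bytes, ignore_decoding_failures: bool) -> str:
    """Decode as UTF-8; optionally retry while dropping undecodable bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if not ignore_decoding_failures:
            raise
        # Second attempt: skip bytes that are not valid UTF-8.
        return raw.decode("utf-8", "ignore")


print(decode_response(b"ok\x9d!", ignore_decoding_failures=True))
```

With the setting disabled, the same input raises `UnicodeDecodeError`, as it did before the change.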
* Refactor the Parameter type in the RESTler grammar.
- make it a record instead of tuple and add serialization property.
This refactoring is done in preparation for allowing different style values.
See issue #175.
* Fix unit tests