* Omitting authentication tokens from the request text logged to the trace database when the no_tokens_in_logs setting is set to true
* Fixing the code to address unit test failures related to the changes
* Passing the no_tokens_in_logs attribute explicitly, since the setting is not honored without it
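A minimal sketch of the redaction idea, assuming the token text is available as a string at logging time. The function name `redact_tokens` and the placeholder string are illustrative assumptions, not the engine's actual code:

```python
def redact_tokens(request_text: str, token_text: str, no_tokens_in_logs: bool) -> str:
    """Return the request text to log, with the token removed when requested."""
    # Nothing to do if redaction is disabled or there is no token.
    if not no_tokens_in_logs or not token_text:
        return request_text
    # Replace every occurrence of the token with a fixed placeholder (assumed name).
    return request_text.replace(token_text, "_OMITTED_AUTH_TOKEN_")


request = "GET /api HTTP/1.1\r\nAuthorization: Bearer abc123\r\n\r\n"
print(redact_tokens(request, "Authorization: Bearer abc123", True))
```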
---------
Co-authored-by: Swamy Nallamalli <swamyn@ntdev.microsoft.com>
For use cases when a trace database is saved and re-played,
this change allows using only replay blocks without the rendered requests.
This is being added for scenarios where only replay blocks are saved during
custom serialization.
The rendered requests continue to be logged by default, since they are
useful for debugging.
Testing:
- manual testing
- new unit test that replays from ndjson with vs. without the request text
This change enables a new 'replay' mode option that replays requests from the trace database.
Changes include:
1. Refactor driver to consume sequences from different sources (with the trace database as another supported source of
specific sequences, similar to the existing smoke test mode)
2. Generate 'replay blocks' as part of payload rendering. Replay blocks contain dynamic objects, but do not support
custom payloads (this is future work).
3. Enables replay to be based on replay blocks
With this change, RESTler generates sequences from replay blocks if available,
or from raw request/responses (similar to the existing replay mode) when they are not available.
The current version of replay is based on the rendered data sent at the time of
original sequence rendering. This has several limitations, such as not being able to
garbage-collect resources or plugging in custom payloads.
In particular, when a bug is reproduced in RESTler today, the same replay mechanism is used, which
means that GC does not collect resources created while reproducing the bug. Moreover, re-running
the same sequence without GC will not work if a resource-generating request has a unique ID parameter
(e.g. PUT /resource/{resourceId}) and the resource has not been deleted, causing false negatives in which
real bugs are reported as non-reproducible.
This change addresses this issue by implementing replay blocks-based replay, which is invoked when reproducing bugs.
Note: the existing replay based on .replay.txt files will also be able to use this new mechanism if a grammar is provided,
but this is not implemented as part of this change.
4. Added a setting to filter the origin during DB replay.
To only replay specific origins, add the following to the settings file:
"replay": {
"include_origins": ["main_driver", "InvalidValueChecker"]
}
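The origin filter can be sketched roughly as follows, assuming each trace database record carries its origin under `tags.origin` (as in the ndjson format). `filter_by_origin` is a hypothetical helper, and treating an empty list as "replay everything" is an assumption:

```python
import json


def filter_by_origin(ndjson_lines, include_origins):
    """Yield trace records whose 'origin' tag is in include_origins."""
    allowed = set(include_origins)
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        origin = (record.get("tags") or {}).get("origin")
        # Assumption: an empty include list means no filtering.
        if not allowed or origin in allowed:
            yield record


sample_lines = [
    '{"request": "GET /a", "tags": {"origin": "main_driver"}}',
    '{"request": "GET /b", "tags": {"origin": "UseAfterFreeChecker"}}',
]
for record in filter_by_origin(sample_lines, ["main_driver", "InvalidValueChecker"]):
    print(record["request"])
```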
Testing:
- manual testing of demo_server replay as follows:
1) Run 'test' task and generate trace database
>restler.exe test --grammar_file .\Compile\grammar.py --dictionary_file .\Compile\dict.json --host localhost --target_port 8888 --settings .\compile\engine_settings.json --no_ssl
Engine settings:
{
"use_trace_database": true,
"trace_database": {
"root_dir": "d:\\demo_server\\trace_databases"
}
}
2) Run 'replay' task from above trace database
>restler.exe replay --replay_log ./trace_data.ndjson --grammar_file .\Compile\grammar.py --dictionary_file .\Compile\dict.json --host localhost --target_port 8888 --settings .\compile\engine_settings.json --no_ssl
- updated unit tests
- Also bump sqlmodel version
- A few app issues needed to be fixed as part of this upgrade:
1) Use union type to be able to return None and trigger one of the planted bugs
2) Work around OverflowError: Python int too large to convert to SQLite INTEGER
3) adjust baseline since 500 is no longer getting returned after pydantic upgrade
The payload {"body":0} now generates a well-formed error response before reaching the demo_server implementation
4) Update baseline for use after free checker
This checker now returns a 500
* driver: set origin when performing replay
* test_basic_functionality_end_to_end: validate that origin is set
* test_basic_functionality_end_to_end: add replay file trace db test
* unit_tests: rename test_replay.txt to replay_sanity_test_for_tracedb.replay.txt
* driver: clear origin after sequence replay
* unit tests: add additional logging
* restler.py: wait for trace_db thread when replaying file
---------
Co-authored-by: William Baker <William.Baker@microsoft.com>
Add a setting that, when true, adds a unique ID for every request sent.
This setting is enabled by default.
Testing:
- manual testing
Note: the baseline diffs ignore headers, which is why baselines did not need to be updated.
This change adds structured logging for all of the network traffic that is currently
logged to the network.*.txt log. The data is written to a 'trace database'.
The default format is newline-delimited json, but custom logging formats
are supported via a module specified in the engine settings (similar to checkers).
Below is an example for demo_server:
```
{"sent_timestamp": "2023-11-02T10:10:59.544824+00:00", "received_timestamp": null, "request": "GET /api/blog/posts?page=1&per_page=1 HTTP/1.1\r\nAccept: application/json\r\nHost: localhost\r\nContent-Length: 0\r\nr\n\r\n", "response": null, "request_json": "null", "response_json": "null", "tags": {"request_id": "27f9653431313fdc3fecc4a890b72b80b4ce1e59", "sequence_id": "fe2172ab-151c-419f-b918-f3e7483b3230", "combination_id": "1950cbddab7726489624c3d346d3426561c921ad_1", "hex_definition": "6daf8d22c7a6b3472fc83c9f08f290b3507c3ff3", "origin": "main_driver"}}
{"sent_timestamp": null, "received_timestamp": "2023-11-02T10:10:59.597613+00:00", "request": null, "response": "HTTP/1.1 400 Bad Request\r\ndate: Thu, 02 Nov 2023 10:10:58 GMT\r\nserver: uvicorn\r\ncontent-length: 41\r\ncontent-type: application/json\r\n\r\n{\"detail\":\"per_page must be at least 2.\"}", "request_json": "null", "response_json": "null", "tags": {"request_id": "27f9653431313fdc3fecc4a890b72b80b4ce1e59", "sequence_id": "fe2172ab-151c-419f-b918-f3e7483b3230", "combination_id": "1950cbddab7726489624c3d346d3426561c921ad_1", "hex_definition": "6daf8d22c7a6b3472fc83c9f08f290b3507c3ff3", "origin": "main_driver"}}
```
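As a rough illustration of consuming this format, the sketch below pairs each request with its response, assuming (as in the example above) that both records share the same `request_id` tag. `pair_by_request_id` is a hypothetical helper, not part of the engine:

```python
import json
from collections import defaultdict


def pair_by_request_id(ndjson_lines):
    """Group trace records by request_id, merging request and response text."""
    pairs = defaultdict(lambda: {"request": None, "response": None})
    for line in ndjson_lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        request_id = record["tags"]["request_id"]
        # Each record holds either the request or the response; keep whichever is set.
        for field in ("request", "response"):
            if record.get(field) is not None:
                pairs[request_id][field] = record[field]
    return dict(pairs)


trace_lines = [
    '{"request": "GET /api/blog/posts HTTP/1.1", "response": null, "tags": {"request_id": "27f9"}}',
    '{"request": null, "response": "HTTP/1.1 400 Bad Request", "tags": {"request_id": "27f9"}}',
]
print(pair_by_request_id(trace_lines))
```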
Testing:
- demo_server
- added new unit test
- manual testing for multiple db files
This bug was found when specifying a `producer_timing_delay` for a resource that was
also specified as `create_once`. Here, the length of the sequence was 1, and RESTler
proceeded fuzzing with the `create_once` resource in a not-ready state.
By default, change the behavior to always wait after the resource creation.
This change adds two new settings, as specified in #780.
- A specific `random_seed` may now be set in the engine settings, which will be used everywhere random values are generated, except where a different random_seed is specified to checkers through the checker settings.
- A `generate_random_seed` option to generate a new random seed, which is helpful for CI/CD runs in `random-walk` mode
that should exercise different sequences on every run
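One way the two settings could interact is sketched below. The precedence (a generated seed overrides a fixed one), the default value, and the helper name `resolve_random_seed` are assumptions for illustration only:

```python
import random
import time


def resolve_random_seed(settings: dict, default_seed: int = 12345) -> int:
    """Pick the seed to use for a run from the engine settings dictionary."""
    # Assumption: generate_random_seed takes precedence over a fixed random_seed.
    if settings.get("generate_random_seed", False):
        return time.time_ns() % (2 ** 32)
    return settings.get("random_seed", default_seed)


seed = resolve_random_seed({"random_seed": 7})
rng = random.Random(seed)  # a dedicated generator keeps runs reproducible
print(seed, rng.randint(0, 100))
```

Logging the resolved seed makes a run with a generated seed reproducible later by setting `random_seed` to the logged value.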
Testing:
- added new test
When one of the requests (e.g. DELETE) does not have an example payload, and testing all example payloads is configured in the engine settings, RESTler crashes [1].
The fix is to only mark combinations tested after at least one combination was found.
Exception in thread Garbage Collector:
Traceback (most recent call last):
  File "C:\Users\marinapo\AppData\Local\Programs\Python\Python39\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "D:\restlerdrop\main\engine\engine\dependencies.py", line 627, in run
    self._garbage_collector.run()
  File "D:\restlerdrop\main\engine\engine\dependencies.py", line 339, in run
    self.do_garbage_collection()
  File "D:\restlerdrop\main\engine\engine\dependencies.py", line 429, in do_garbage_collection
    self.apply_destructors(destructors)
  File "D:\restlerdrop\main\engine\engine\dependencies.py", line 562, in apply_destructors
    deleted_list = process_overflowing()
  File "D:\restlerdrop\main\engine\engine\dependencies.py", line 514, in process_overflowing
    rendered_data, _ , _, _ = destructor.\
  File "D:\restlerdrop\main\engine\engine\core\requests.py", line 1276, in render_current
    return next(self.render_iter(candidate_values_pool,
Closes #787: restler_custom_payload is not correctly plugged into examples
Also fix incorrectly passed 'quoted' parameters - this was caught by manually inspecting the logs and seeing incorrect quoting for several primitives when using the generated schema only.
Updated test baselines for invalidvalue after the bug fix.
Testing:
- confirmed the manual repro from #787 now works correctly
The checker was logging tokens to its checker log. This change removes
the statement that logged the payload separately in this checker log,
which is not necessary. The payloads are already logged to the network log,
which can be used for debugging the sent requests.
Testing:
- manual testing
Log the values (excluding tokens) to help pinpoint unexpected types of values in the request block.
Testing:
- manual testing with invalid dictionary ('restler_fuzzable_int' containing integer instead of string)
This change adds a script to help diff network logs when updating test baselines.
Usage:
a) Add the Python root directory (<repo_root_dir>\restler) to PYTHONPATH when invoking the script from outside the repo
b)
python D:\git\restler-fuzzer\restler\utils\network_logs\diff_network_logs.py --left_file .\logs_before\invalidvalue_testing_log.txt --right_file .\logs_after\invalidvalue_testing_log.txt >diff_out.txt
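For a sense of the kind of output to expect, here is a generic line-based diff using Python's standard `difflib`. This is a stand-in illustration, not the logic of `diff_network_logs.py`, which parses requests and checkers before comparing; `diff_log_lines` is a hypothetical name:

```python
import difflib


def diff_log_lines(left_lines, right_lines, left_name="left", right_name="right"):
    """Return unified-diff lines describing how right_lines differ from left_lines."""
    return list(
        difflib.unified_diff(
            left_lines, right_lines, fromfile=left_name, tofile=right_name, lineterm=""
        )
    )


before = ["GET /api/blog/posts?per_page=1", "HTTP/1.1 500 Internal Server Error"]
after = ["GET /api/blog/posts?per_page=1", "HTTP/1.1 400 Bad Request"]
print("\n".join(diff_log_lines(before, after, "logs_before", "logs_after")))
```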
- improved log parsing to separate out requests and checkers.
- minor clean-up to log parsing
The dependency variable name was being extracted incorrectly in the case of lists.
This code was only being used by the invalid value checker.
Testing:
- manual testing with demo_server
- updated baselines
Since bug report files in json format have been added, add the suffix to the replay file to make it clear which
of the files can be used to replay the bug.
Also includes minor formatting changes in code related to bug logging/replay.
* Changes to improve the output format for RESTler bugs.
1. Added functionality to publish JSON formatted bug details file for each bug reported in addition to the txt file.
2. Added functionality to publish a Bugs.Json file containing the information on bugs found in a RESTler run. This will be an index to the individual bug details files.
* Resolved the PR comments.
1. fixed the naming convention issues.
2. resolved comments on code reusability.
* Changes to fix the PR comments
1. added tests to validate sequences that contain multiple requests.
2. removed request order from the bug details.
3. refactored code to pass error_code as part of bug bucket class.
* Fixed the build error in the last PR as glob command was returning the files in a different order and the test was trying to replay from json files
---------
Co-authored-by: Anand Nooli <annooli@microsoft.com>
Today, it can be slow and painful to figure out some basic questions during coverage investigations by reading the RESTler output files, without writing additional scripts. For example:
1) Which request, if fixed, would unblock the highest number of requests? In other words, which request should I work on fixing first?
2) What are all the failed requests that were actually tried (not skipped because of a dependency)?
3) What is the sequence of concrete requests that were sent for each failure?
Today, this needs to be gathered by navigating and copy-pasting in speccov.json, or by finding the same sequence in the network logs (which is not easy to correlate with the speccov file, since request/sequence IDs are not printed there).
This change includes the following improvements to address these issues:
1. The engine writes a second file in addition to speccov.json, speccov-min.json, which includes the raw request and response.
This leaves the speccov.json file unmodified, since speccov.json format is optimized for directly uploading to a database for further
analysis, whereas speccov-min is generated to provide a quick way to generate a report that is more user-friendly to inspect directly.
2. The engine now prints all combinations to the speccov.json file (up to a maximum), then post-processes this file to determine the data for each request, and provides the first request as the sample request. This allows failures for user-specified examples to be quickly investigated - the last request, on the contrary, is usually a RESTler-generated request (e.g. with 'fuzzstring' plugged in for all string parameters).
Closes #637
3. This new 'speccov-min.json' is post-processed by the RESTler driver to produce a new text file, which contains the failed requests, prioritized by the number of dependent requests. For each failed request, RESTler includes the full sequence of requests and responses up to the failure.
4. Fixed several bugs and missing information in the spec coverage file.
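The prioritization in item 3 can be sketched as below: count, for each failed request, how many requests depend on it directly or transitively, then sort by that count. The input shape (a map from request to the requests it depends on) and the name `rank_failed_requests` are assumptions, not the driver's actual data model:

```python
def rank_failed_requests(depends_on: dict, failed: set) -> list:
    """Rank failed requests by how many requests transitively depend on them."""
    # Invert the dependency map: producer -> direct consumers.
    consumers = {}
    for request, producers in depends_on.items():
        for producer in producers:
            consumers.setdefault(producer, set()).add(request)

    def blocked_by(request):
        # Depth-first walk over everything downstream of the failed request.
        seen, stack = set(), [request]
        while stack:
            for consumer in consumers.get(stack.pop(), ()):
                if consumer not in seen:
                    seen.add(consumer)
                    stack.append(consumer)
        return seen

    ranked = [(name, len(blocked_by(name))) for name in failed]
    # Highest number of blocked requests first; ties broken by name.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


dependencies = {
    "GET /posts/{id}": ["POST /posts"],
    "DELETE /posts/{id}": ["POST /posts"],
    "GET /users": [],
}
print(rank_failed_requests(dependencies, {"POST /posts", "GET /users"}))
```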
When x-ms-paths are used, the endpoint method removes the query part.
This code should use 'endpoint_no_dynamic_objects' instead, since this corresponds to
the endpoint in the Swagger spec (which could include query parameters for x-ms-paths)
- add the case where writers have to be extracted from the request definition
- added coverage to create_once test
- fix bug in writer_variables tracking for invalid value fuzzing (found after
updating the regression test)
Closes #671
* add basic unit tests for new auth settings
* add new settings to restler settings and restler.py
* add support for module and location authentication to restler
* Remove OneOf from settings parsing
* Remove OneOf from unit tests
* Inclusion checks rather than exception handling for loading configuration
* Add authentication test files folder and sample token
* Move unit_test_server_auth.py to authentication_test_files
* Update checkers log with new auth module path
* Add e2e tests for token location auth
* Implementation for token location auth
* Add e2e test for module authentication
* PR feedback, remove extra space
* Add e2e tests for token refresh cmd
* Update description for authentication settings
* Update failure message in cmd unit test
* Update comment in test cmd auth
* Exception handling for new auth mechanisms in request_utilities
* Revert changes to client cert path and client key path for now
* Initial module logging implementation
* Remove old comment
* Add OneOf validation to auth, split auth validation to function
* Add basic string matching for auth validation exception unit tests
* Add function descriptions and use import_utilities
* Update description for validate_auth_tokens
* Update retry handler with copyright and docstrings
* Add space between operator for readability
* Whitespace formatting
* Update unit tests to use "token_refresh_cmd" instead of "cmd"
* Split inclusion checks over multiple lines
* Rename token_module_method to token_module_function, remove extra space
* Make data an optional parameter to token function
* Rename token_module_method to token_module_function
* Update method to function in settings file
* Add certificate to authentication settings
* Simplify token interval and cmd parsing
* Add documentation for new auth features
* Add enum for token auth methods
* Add InvalidTokenAuthMethodException
* Add back existing documentation for client certificates
* Set custom_value_generators to blank
* Add license to unit_test_server_auth_module.py
* Split token formatting across multiple lines
* Make data and log required arguments to token function
* Use run_abc_smoke_test in auth unit tests
* Add exception, enums to retry handler. Add unit tests
* Remove colon from comment in execute_token_refresh
* Split module unit tests into data and no data
* Update Authentication.md for readability
* Reduce nesting for auth documentation
* Update formatting for clarity
* Remove unnecessary file path, update commas in json
* Formatting, add reference to SettingsFile.md
* Use load_module in import_attrs to load module from absolute path
* Remove unnecessary load_module call
* Revert newline addition
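A hypothetical shape for a module-based token function, based only on the commit titles above (a function that takes `data` and `log` as required arguments). The function name, the keys inside `data`, and the returned header format are all assumptions; Authentication.md is the authority on the real contract:

```python
def acquire_token(data, log):
    """Hypothetical token function: 'data' comes from the settings, 'log' is a logging callable."""
    log(f"Acquiring token for client {data.get('client_id', '<unknown>')}")
    # A real module would call an identity provider here; this value is a placeholder.
    token = "placeholder-token"
    return f"Authorization: Bearer {token}"


messages = []
header = acquire_token({"client_id": "demo"}, messages.append)
print(header)
```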
- Adapt existing async polling
- Add GC summary
Testing:
- manual testing with demo_server as-is and
temporarily modified to return location header on delete
- added unit test
'render_iter' should only generate combinations - it should not have any side effects. Garbage collection was being triggered in the presence of writer variables even if the request was never executed, because dynamic objects were added to the cache in render(). This was missed in the original implementation of writer variables.
Instead, writer variables should only be updated after the request has succeeded (returned a valid code), indicating that the dynamic object exists.
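The intended ordering can be sketched as follows, assuming a "valid code" means a 2xx status and that the dynamic object cache behaves like a dictionary. `update_writer_variables` is a hypothetical helper, not the engine's function:

```python
def update_writer_variables(status_code, pending_values, dynamic_object_cache):
    """Commit writer variable values to the cache only if the request succeeded."""
    # Assumption: any 2xx status counts as a valid code.
    if 200 <= status_code < 300:
        dynamic_object_cache.update(pending_values)
        return True
    # On failure nothing is cached, so GC never sees objects that do not exist.
    return False


cache = {}
update_writer_variables(404, {"postId": "42"}, cache)
print(cache)
update_writer_variables(201, {"postId": "42"}, cache)
print(cache)
```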
Testing: existing unit tests and manual inspection of GC logs.
- add max objects per resource type setting
- add logic to stop fuzzing if the specified quota is exceeded.
The GC will exit with an exception, and the testing summary will not be available.
- fix GC hang
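A minimal sketch of a per-resource-type quota check, assuming live objects are counted per type and an exception stops the run when the limit is exceeded. The class and exception names are hypothetical:

```python
from collections import Counter


class ObjectQuotaExceeded(Exception):
    """Raised when too many live objects of one resource type exist."""


class ObjectTracker:
    def __init__(self, max_objects_per_resource_type: int):
        self._max = max_objects_per_resource_type
        self._live = Counter()

    def add(self, resource_type: str) -> None:
        """Record a newly created object and enforce the quota."""
        self._live[resource_type] += 1
        if self._live[resource_type] > self._max:
            raise ObjectQuotaExceeded(
                f"{resource_type}: {self._live[resource_type]} live objects exceed {self._max}"
            )

    def remove(self, resource_type: str) -> None:
        """Record that an object was deleted (e.g. by the garbage collector)."""
        if self._live[resource_type] > 0:
            self._live[resource_type] -= 1


tracker = ObjectTracker(max_objects_per_resource_type=2)
tracker.add("post")
tracker.add("post")
tracker.remove("post")
tracker.add("post")  # still within the quota after the deletion
```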
Testing:
- manual testing with demo_server
The functions that generate the python grammar for different schema combinations use
ad-hoc parsing to determine the boundaries of body, header, and query parameters.
This is error prone and will be refactored to use grammar delimiters in the future.
As a quick fix, make `substitute_headers` keep the original location of the authentication token,
which `substitute_body` is using as a delimiter for currently unsupported body content types.
Closes #650.
Testing:
- added regression test
This was also a regression due to the readonly parameter refactoring.
Unfortunately, this case did not cause failures, even though it was covered
in the quick start. This may be due to nondeterminism in the payload body
checker, which will be addressed separately.
The schema parsers for arrays and internal objects were not correctly updated
after introducing readonly parameters, causing a crash.
Found while reproducing #627.
Testing:
- confirmed original repro is fixed
- added unit test
Parameters were not being url-encoded. This is now fixed in the main driver.
Checkers that generate fuzzed data (e.g. the invalid value checker)
were not updated as part of this change - this additional change
will be tracked by a separate item.
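For reference, percent-encoding a parameter value with Python's standard library looks like the sketch below. Whether the driver uses this exact call and which characters it treats as safe are assumptions; `encode_parameter` is a hypothetical wrapper:

```python
from urllib.parse import quote


def encode_parameter(value: str) -> str:
    """Percent-encode a path or query parameter value, leaving no reserved characters raw."""
    # safe="" also encodes '/', which quote() would otherwise leave untouched.
    return quote(value, safe="")


print(encode_parameter("a b/c&d=e"))
```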
Testing:
- manual testing with demo server and original repro
- Exclude them when generating the python grammar from the compiler
- Add new property to engine schema
- also exclude them in engine grammar generation
- added unit test
- also fixed a bug with required parameter filtering in the body
The special case parameter type that was added in order to allow fuzzing the content type was
missed in example schema parsing in the engine. This change adds it to all of the examples when generating
the request grammar.
The checker was using the request definition of the last request. This did not correspond to the current rendering, which may have a different schema. The fix is to save the last rendered schema for use by checkers.
Testing: demo_server test
When `add_fuzzable_dates` is specified, two additional dates are generated
for every date type near the current date, which handles common API date constraints
(e.g. expiration dates). Example payloads also have the same issue, but this option did
not previously apply to examples.
This change adds support for modifying example dates to be 7 days in the future for
common date formats.
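The date adjustment can be sketched as follows, assuming "7 days in the future" is measured from the current time and that the value keeps its original format. The list of formats and the name `refresh_example_date` are illustrative assumptions:

```python
from datetime import datetime, timedelta

# Assumed set of "common" formats; the engine's actual list may differ.
DATE_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d", "%m/%d/%Y")


def refresh_example_date(value: str, now: datetime, days_ahead: int = 7) -> str:
    """Replace a recognized example date with a date days_ahead after 'now'."""
    target = now + timedelta(days=days_ahead)
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)  # only checks that the format matches
        except ValueError:
            continue
        return target.strftime(fmt)
    # Values that are not recognized dates are left unchanged.
    return value


reference_time = datetime(2023, 11, 2, 10, 0, 0)
print(refresh_example_date("2020-01-15", reference_time))
```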
Testing:
- added unit test
In some resource-constrained environments, it is required to
clean up any resource created to test a sequence as soon as it is no longer
needed.
This change implements a new engine configuration to support this:
`run_gc_after_every_sequence`
- Also factored out garbage collector into a separate class
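A simplified sketch of the control flow this setting implies. The real garbage collector runs in its own thread, so the single final cleanup in the default branch is a simplification, and `fuzz_sequences` with its callbacks is a hypothetical name:

```python
def fuzz_sequences(sequences, run_sequence, collect_garbage, run_gc_after_every_sequence=False):
    """Run each sequence, cleaning up after each one when the setting is enabled."""
    for sequence in sequences:
        run_sequence(sequence)
        if run_gc_after_every_sequence:
            # Delete resources as soon as the sequence that created them is done.
            collect_garbage()
    if not run_gc_after_every_sequence:
        # Simplification: clean up once at the end instead of periodically.
        collect_garbage()


events = []
fuzz_sequences(
    ["seq1", "seq2"],
    run_sequence=events.append,
    collect_garbage=lambda: events.append("gc"),
    run_gc_after_every_sequence=True,
)
print(events)
```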
Testing:
- manual testing
- add a check for equality of GCed objects to the checker test.