Recorded Call Graph Metrics
also known as call graph tracing.
Execute a Python program and, for each call being made, record the call and callee. This lets us compare call graph resolution from static analysis with actual data, that is, check whether static analysis correctly determines the target of each actual call.
Using the call graph tracer takes a heavy toll on performance. Expect the program to take roughly 10x longer to execute.
The number of calls recorded varies a little from run to run. I have not been able to pinpoint why.
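To make the idea concrete, here is a minimal sketch of recording call/callee pairs with sys.setprofile. It is not the cg-trace implementation; the names trace_calls, helper and main are made up for illustration, and the real tracer records more detail than a function name and line number.

```python
import sys


def trace_calls(func):
    """Run func() and return a list of (call site, callee name) pairs."""
    recorded = []

    def profiler(frame, event, arg):
        if event == "call":
            # For Python functions, `frame` is the callee's frame;
            # the call site lives in the parent frame.
            caller = frame.f_back
            site = (caller.f_code.co_name, caller.f_lineno) if caller else None
            recorded.append((site, frame.f_code.co_name))
        elif event == "c_call":
            # For C functions, `frame` is the calling frame and `arg`
            # is the C function object being called.
            site = (frame.f_code.co_name, frame.f_lineno)
            recorded.append((site, getattr(arg, "__name__", repr(arg))))

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    return recorded


def helper():
    return len("abc")


def main():
    return helper()


calls = trace_calls(main)
callees = [callee for _, callee in calls]
for site, callee in calls:
    print(site, "->", callee)
```

Every recorded pair can then be compared against what static analysis predicted for the same call site.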
Running against real projects
Currently it's possible to gather metrics from traced runs of the standard test suite of a few projects (defined in projects.json): youtube-dl, wcwidth, and flask.
To run against all projects, use
$ ./helper.sh all $(./helper.sh projects)
To view the results, use
$ head -n 100 projects/*/Metrics.txt
Expanding set of projects
It should be fairly straightforward to expand the set of projects. Most projects use tox for running their tests against multiple python versions. I didn't look into any kind of integration, but have manually picked out the instructions required to get going.
As an example, compare the tox.ini file from flask with the configuration
"flask": {
"repo": "https://github.com/pallets/flask.git",
"sha": "21c3df31de4bc2f838c945bd37d185210d9bab1a",
"module_command": "pytest -c /dev/null tests examples",
"setup": [
"pip install -r requirements/tests.txt",
"pip install -q -e examples/tutorial[test]",
"pip install -q -e examples/javascript[test]"
]
}
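As a rough illustration of how such an entry can be read, the sketch below parses the flask configuration and splits module_command into a module name and its arguments. This is only an assumption about how the fields might be consumed; helper.sh is the source of truth, and the function name summarize_project is hypothetical.

```python
import json
import shlex

# The flask entry from above, wrapped in an object as it would
# appear in a projects.json-style file.
PROJECTS_JSON = """
{
  "flask": {
    "repo": "https://github.com/pallets/flask.git",
    "sha": "21c3df31de4bc2f838c945bd37d185210d9bab1a",
    "module_command": "pytest -c /dev/null tests examples",
    "setup": [
      "pip install -r requirements/tests.txt",
      "pip install -q -e examples/tutorial[test]",
      "pip install -q -e examples/javascript[test]"
    ]
  }
}
"""


def summarize_project(config):
    """Split one project entry into the pieces a runner would need."""
    module, *args = shlex.split(config["module_command"])
    return {
        "repo": config["repo"],
        "sha": config["sha"],
        "setup": list(config.get("setup", [])),
        "module": module,  # module whose execution gets traced
        "args": args,      # arguments passed on to that module
    }


projects = json.loads(PROJECTS_JSON)
plan = summarize_project(projects["flask"])
print(plan["module"], plan["args"])
```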
Local development
Setup
- Ensure you have at least Python 3.7
- Create a virtual environment and activate it
  python3 -m venv venv
- Install dependencies
  pip install --upgrade -r requirements.txt
- Install this codebase as an editable package
  pip install -e .
- Set up your editor. If you're using VS Code, create a new project for this folder, and use these settings for correct autoformatting of code on save:
{
"python.pythonPath": "venv/bin/python",
"python.linting.enabled": true,
"python.linting.flake8Enabled": true,
"python.formatting.provider": "black",
"editor.formatOnSave": true,
"[python]": {
"editor.codeActionsOnSave": {
"source.organizeImports": true
}
},
"python.autoComplete.extraPaths": [
"src"
]
}
- Enjoy writing code, and being able to run cg-trace on your command line 🎉
Using it
After following setup instructions above, you should be able to reproduce the example trace by running
cg-trace --xml example/simple.xml example/simple.py
You can also run traces for all tests and build a database by running tests/create-test-db.sh. Then run the queries inside the ql/ directory.
Tracing Limitations
Multi-threading
Tracing multi-threaded code should be possible by using threading.setprofile, but that hasn't been done yet.
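A minimal sketch of what that could look like, assuming the same profile function is installed for the current thread (sys.setprofile) and for threads started afterwards (threading.setprofile). This is not part of cg-trace; the names profiler, worker and run_traced are illustrative only.

```python
import sys
import threading

recorded = []


def profiler(frame, event, arg):
    if event == "call":
        # Tag each call with the thread it happened on.
        recorded.append((threading.current_thread().name, frame.f_code.co_name))


def worker():
    return sum([1, 2, 3])


def run_traced():
    threading.setprofile(profiler)  # threads started after this point
    sys.setprofile(profiler)        # the current thread
    try:
        t = threading.Thread(target=worker, name="worker-thread")
        t.start()
        t.join()
    finally:
        sys.setprofile(None)
        threading.setprofile(None)


run_traced()
worker_calls = [name for thread, name in recorded if thread == "worker-thread"]
print(worker_calls)
```

Note that threading.setprofile only affects threads created through the threading module after it has been called.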
Code that uses sys.setprofile
Since that is our mechanism for recording calls, any code that uses sys.setprofile will not work together with the call-graph tracer.
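The conflict comes from there being only one profile function per thread: a later sys.setprofile call replaces the earlier one. The sketch below shows this with two stand-in profile functions (tracer and other_profiler are hypothetical names, not cg-trace code).

```python
import sys

tracer_seen = []
other_seen = []


def tracer(frame, event, arg):
    # Stand-in for the call-graph tracer.
    if event == "call":
        tracer_seen.append(frame.f_code.co_name)


def other_profiler(frame, event, arg):
    # Stand-in for a profiler installed by the traced program itself.
    if event == "call":
        other_seen.append(frame.f_code.co_name)


def before():
    pass


def after():
    pass


def program_under_test():
    before()
    sys.setprofile(other_profiler)  # silently replaces `tracer`
    after()


sys.setprofile(tracer)
try:
    program_under_test()
finally:
    sys.setprofile(None)

print("tracer saw:", tracer_seen)
print("other profiler saw:", other_seen)
```

Once the program installs its own profile function, the tracer stops receiving events, so the call to after() is missing from its record.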
Class instantiation
Class instantiation does not always fire an event in the sys.setprofile function (nor in sys.settrace), so it is not always recorded. Example:
r = range(10)
when disassembled (python -m dis <file>):
9 48 LOAD_NAME 7 (range)
50 LOAD_CONST 5 (10)
52 CALL_FUNCTION 1
54 STORE_NAME 8 (r)
but no event 😞
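This can be checked with a small experiment, sketched below under the assumption of a CPython interpreter: calling the built-in function len produces a c_call event, while instantiating the built-in class range produces none. The helper record_events is a made-up name, and the tracer's own sys.setprofile(None) call is filtered out of the results.

```python
import sys


def record_events(func):
    """Run func() under sys.setprofile and return (event, target name) pairs."""
    events = []

    def profiler(frame, event, arg):
        if event == "call":
            events.append((event, frame.f_code.co_name))
        elif event == "c_call":
            events.append((event, getattr(arg, "__name__", repr(arg))))

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    # Drop the event caused by switching the profiler off again.
    return [e for e in events if e[1] != "setprofile"]


def make_range():
    r = range(10)  # class instantiation of a built-in type
    return r


def call_len():
    return len([1, 2, 3])  # call to a built-in function


range_events = record_events(make_range)
len_events = record_events(call_len)
print("range:", range_events)
print("len:", len_events)
```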