Граф коммитов

4587 Коммитов

Автор SHA1 Сообщение Дата
CodeQL CI 5054d5b555
Merge pull request #7420 from RasmusWL/ssrf-new
Approved by yoff
2021-12-17 15:20:49 +00:00
Rasmus Wriedt Larsen 83f87f0272 Python: Adjust .expected based on new comment
That was changed in 9866214
2021-12-17 15:29:41 +01:00
Rasmus Wriedt Larsen 626009ea60 Python: Fix typo 2021-12-17 14:29:38 +01:00
yoff 9866214ebe
Update python/ql/test/query-tests/Security/CWE-918-ServerSideRequestForgery/full_partial_test.py 2021-12-17 14:26:43 +01:00
Rasmus Wriedt Larsen 83f1b2ca5d Python: Add SSRF qhelp
I included examples of both types in the qhelp of both queries, to
provide context of what each of them actually are.
2021-12-17 11:48:26 +01:00
Rasmus Wriedt Larsen e7abe43e3e Python: Add SSRF change-note 2021-12-17 10:04:55 +01:00
Rasmus Wriedt Larsen e309d8227c Python: Remove debug predicate
Accidentally committed :|
2021-12-17 09:44:35 +01:00
Rasmus Wriedt Larsen 1d00730753 Python: Allow `http[s]://` prefix for SSRF 2021-12-17 00:27:18 +01:00
Rasmus Wriedt Larsen 8d9a797b75 Python: Add tricky .format SSRF tests 2021-12-17 00:24:51 +01:00
Rasmus Wriedt Larsen 6f297f4e9c Python: Fix SSRF sanitizer tests
They were very misleading before, because a sanitizer that happened
early, would remove taint from the rest of the cases by use-use flow :|
2021-12-16 23:24:08 +01:00
Rasmus Wriedt Larsen 4b5599fe17 Python: Improve full/partial SSRF split
Now full-ssrf will only alert if **all** URL parts are fully
user-controlled.
2021-12-16 22:48:51 +01:00
Rasmus Wriedt Larsen cb934e17b1 Python: Adjust SSRF location to request call
Since that might not be the same place where the vulnerable URL part is.
2021-12-16 22:48:51 +01:00
Rasmus Wriedt Larsen b1bca85162 Python: Add interesting test-case 2021-12-16 22:48:51 +01:00
Rasmus Wriedt Larsen 5a7efd0fee Python: Minor adjustments to QLDoc of `HTTP::Client::Request` 2021-12-16 22:48:51 +01:00
Rasmus Wriedt Larsen 6ce1524192
Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-12-16 15:19:37 +01:00
Rasmus Wriedt Larsen 1cc5e54357 Python: Add SSRF queries
I've added 2 queries:

- one that detects full SSRF, where an attacker can control the full URL,
  which is always bad
- and one for partial SSRF, where an attacker can control parts of an
  URL (such as the path, query parameters, or fragment), which is not a
  big problem in many cases (but might still be exploitable)

full SSRF should run by default, and partial SSRF should not (but makes
it easy to see the other results).

Some elements of the full SSRF queries needs a bit more polishing, like
being able to detect `"https://" + user_input` is in fact controlling
the full URL.
2021-12-16 01:48:34 +01:00
Rasmus Wriedt Larsen 579de0c3f0 Python: Remove `getResponse` and do manual taint steps 2021-12-15 21:55:04 +01:00
Rasmus Wriedt Larsen f8fc583af3 Python: client request: getUrl => getAUrlPart
I think `getUrl` is a bit too misleading, since from the name, I would
only ever expect ONE result for one request being made.

`getAUrlPart` captures that there could be multiple results, and that
they might not constitute a whole URl.

Which is the same naming I used when I tried to model this a long time ago
a80860cdc6/python/ql/lib/semmle/python/web/Http.qll (L102-L111)
2021-12-15 21:55:04 +01:00
Rasmus Wriedt Larsen 6f81685f48 Python: Add modeling of `http.client.HTTPResponse` 2021-12-15 21:55:04 +01:00
Rasmus Wriedt Larsen a5bae30d81 Python: Add tests of `http.client.HTTPResponse` 2021-12-15 20:39:46 +01:00
Andrew Eisenberg 0669ef505e Fix semver for upgrades references
Ensure the version range is flexible enough to handle
future version changes.
2021-12-13 09:03:33 -08:00
Rasmus Wriedt Larsen cf2ee0672f Python: Model `requests` Responses 2021-12-13 15:09:46 +01:00
Rasmus Wriedt Larsen 35cba17642 Python: Consider taint of client http requests 2021-12-13 14:56:16 +01:00
Rasmus Wriedt Larsen b68d280129 Python: Add modeling of `requests` 2021-12-13 14:56:16 +01:00
Rasmus Wriedt Larsen 1ff56d5143 Python: Add tests of `requests`
Also adjusts test slightly. Writing
`clientRequestDisablesCertValidation=False` to mean that certificate
validation was disabled by the `False` expression is just confusing, as
it easily reads as _certificate validate was NOT disabled_ :|

The new one ties to each request that is being made, which seems like
the right setup.
2021-12-13 14:07:32 +01:00
Rasmus Wriedt Larsen 7bf285a52e Python: Alter `disablesCertificateValidation` to fit our needs
For the snippet below, our current query is able to show _why_ we
consider `var` to be a falsey value that would disable SSL/TLS
verification. I'm not sure we're going to need the part that Ruby did,
for being able to specify _where_ the verification was removed, but
we'll see.

```
requests.get(url, verify=var)
```
2021-12-13 11:37:12 +01:00
Rasmus Wriedt Larsen 08f6d1ab80 Python: Clearer sourceType for client response body 2021-12-13 11:24:38 +01:00
Rasmus Wriedt Larsen 5de79b4ffe Python: Add HTTP::Client::Request concept
Taken from Ruby, except that `getURL` member predicate was changed to
`getUrl` to keep consistency with the rest of our concepts, and stick
to our naming convention.
2021-12-13 11:09:09 +01:00
Andrew Eisenberg 66c1629974
Merge pull request #7285 from github/post-release-prep-2.7.3-ddd4ccbb
Post-release preparation 2.7.3
2021-12-10 09:59:45 -08:00
yoff d8857c7ce8
Merge pull request #7246 from tausbn/python/import-star-flow
Python: Support flow through `import *`
2021-12-10 16:34:32 +01:00
Rasmus Wriedt Larsen bd9b96e154
Merge pull request #7331 from tausbn/python-fix-bad-callsite-points-to-join
Python: Fix bad `callsite_points_to` join
2021-12-10 15:39:49 +01:00
Rasmus Wriedt Larsen 8ee020f79c
Merge pull request #7332 from tausbn/python-fix-bad-scope-entry-points-to-join
Python: Fix bad `scope_entry_points_to` join
2021-12-10 15:33:13 +01:00
Taus 6d247bfdf9
Merge pull request #7330 from tausbn/python-fix-bad-adjacentuseuse-join
Python: Fix bad join in SSA
2021-12-09 20:55:45 +01:00
yoff 8e11c2c476
Merge pull request #7259 from RasmusWL/even-more-path-injection-sinks
Python: Add more path-injection sinks from `os` and `tempfile` modules
2021-12-09 14:46:41 +01:00
Taus b871342e83 Python: A small further performance improvement
Unrolling the transitive closure had slightly better performance here.

Also, we exclude names of builtins, since those will be handled by a
separate case of `isDefinedLocally`.
2021-12-09 10:29:55 +00:00
Taus 8517eff0f7 Python: Fix bad performance
A few changes, all bundled together:

- We were getting a lot of magic applied to the predicates in the
  `ImportStar` module, and this was causing needless re-evaluation.
  To address this, the easiest solution was to simply cache the entire
  module.
- In order to separate this from the dataflow analysis and make it
  dependent only on control flow, `potentialImportStarBase` was changed
  to return a `ControlFlowNode`.
- `isDefinedLocally` was defined on control flow nodes, which meant we
  were duplicating a lot of tuples due to control flow splitting, to no
  actual benefit.

Finally, there was a really bad join in `isDefinedLocally` that was
fixed by separating out a helper predicate. This is a case where we
could use a three-way join, since the join between the `Scope`, the
`name` string and the `Name` is big no matter what.

If we join `scope_defines_name` with `n.getId()`, we'll get `Name`s
belonging to irrelevant scopes.

If we join `scope_defines_name` with the enclosing scope of the `Name`
`n`, then we'll get this also for `Name`s that don't share their `getId`
with the local variable defined in the scope.

If we join `n.getId()` with `n.getScope()...` then we'll get all
enclosing scopes for each `Name`.

The last of these is what we currently have. It's not terrible, but not
great either. (Though thankfully it's rare to have lots of enclosing
scopes.)
2021-12-08 22:53:45 +00:00
Anders Schack-Mulligen 38d0bb4a60
Merge pull request #7260 from hvitved/dataflow/argument-parameter-matching
Data flow: Introduce `ParameterPosition` and `ArgumentPosition`
2021-12-08 12:49:08 +01:00
Tom Hvitved 283173ad02 Address review comments 2021-12-08 11:26:44 +01:00
Tom Hvitved 490872173a Data flow: Sync files 2021-12-07 20:29:18 +01:00
Taus e7c298d903 Python: Fix bad `scope_entry_points_to` join
From `pritomrajkhowa/LoopBound`:

```
Definitions.ql-7:PointsTo::PointsToInternal::scope_entry_points_to#ffff#antijoin_rhs#2 ........... 55.1s
```

specifically

```
(443s) Tuple counts for PointsTo::PointsToInternal::scope_entry_points_to#ffff#antijoin_rhs#2/3@74a7cart after 55.1s:
184070    ~0%        {3} r1 = JOIN PointsTo::PointsToInternal::scope_entry_points_to#ffff#shared#1 WITH Variables::GlobalVariable#class#f ON FIRST 1 OUTPUT Lhs.0 'arg2', Lhs.1 'arg0', Lhs.2 'arg1'
184070    ~0%        {3} r2 = STREAM DEDUP r1
919966523 ~2%        {4} r3 = JOIN r2 WITH Essa::EssaDefinition::getSourceVariable_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'arg0', Lhs.2 'arg1', Lhs.0 'arg2'
4281779   ~2293%     {3} r4 = JOIN r3 WITH Essa::EssaVariable::getScope_dispred#ff ON FIRST 2 OUTPUT Lhs.1 'arg0', Lhs.2 'arg1', Lhs.3 'arg2'
                    return r4
```

First, this is an `antijoin`, so there's likely some negation involved.
Also, there's mention of `GlobalVariable`, `getScope`, and
`getSourceVariable`, none of which appear in `scope_entry_points_to`, so
it's likely that something got inlined.

Taking a closer look at the predicates mentioned in the body, we spot
`undefined_variable` as a likely culprit.

Evaluating this predicate in isolation reveals that it's not terribly
big, so we could try just marking it with `pragma[noinline]` (I opted
for the slightly more solid `nomagic`) and see how that fares. I also
checked that `builtin_not_in_outer_scope` was similarly small, and
made that one un-inlineable as well.

The result? Well, I can't even show you. Both `scope_entry_points_to`
and `undefined_variable` are so fast that they don't appear in the
clause timing report (so they can at most take 3.5s each to evaluate, as
that is the smallest timing in the list).
2021-12-07 18:51:44 +00:00
Taus b502ca1ea7 Python: Fix bad `callsite_points_to` join
From `pritomrajkhowa/LoopBound`:

```
Definitions.ql-7:PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#join_rhs#3 ........... 5m53s
```

specifically

```
(767s) Tuple counts for PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#join_rhs#3/3@f8f86764 after 5m53s:
832806293 ~0%     {4} r1 = JOIN PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared#1 WITH PointsTo::InterProceduralPointsTo::var_at_exit#fff ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'arg1', Rhs.1 'arg2', Rhs.2 'arg0'
832806293 ~0%     {3} r2 = JOIN r1 WITH Essa::TEssaNodeRefinement#ffff_03#join_rhs ON FIRST 2 OUTPUT Lhs.3 'arg0', Lhs.1 'arg1', Lhs.2 'arg2'
                return r2
```

This one is a bit tricky to unpack. Where is this `shared#1` defined?

```
EVALUATE NONRECURSIVE RELATION:
SYNTHETIC PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared#1(int arg0, numbered_tuple arg1) :-
    SENTINEL PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared
    SENTINEL Definitions::EscapingAssignmentGlobalVariable#class#f
    SENTINEL Essa::TEssaNodeRefinement#ffff_03#join_rhs
    {2} r1 = JOIN PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared WITH Definitions::EscapingAssignmentGlobalVariable#class#f ON FIRST 1 OUTPUT Lhs.0 'arg0', Lhs.1 'arg1'
    {2} r2 = STREAM DEDUP r1
    {2} r3 = JOIN r2 WITH Essa::TEssaNodeRefinement#ffff_03#join_rhs ON FIRST 2 OUTPUT Lhs.0 'arg0', Lhs.1 'arg1'
    {2} r4 = STREAM DEDUP r3
    return r4
```

Looking at `callsite_points_to`, we see a likely candidate in `srcvar`.
It is guarded with an `instanceof` check for
`EscapingAssignmentGlobalVariable` (which lines up nicely with the
sentinel on its charpred) and `getSourceVariable` is just a projection
of `TEssaNodeRefinement`.

So let's try unbinding `srcvar` to prevent an early join.

The timing is now:

```
Definitions.ql-7:PointsTo::InterProceduralPointsTo::callsite_points_to#ffff ...................... 31.3s (2554 evaluations with max 101ms in PointsTo::InterProceduralPointsTo::callsite_points_to#ffff/4@i516#581fap5w)
```
(Showing the tuple counts doesn't make sense here, since all of the
`shared` and `join_rhs` predicates have been smooshed around.)
2021-12-07 18:25:53 +00:00
Taus a716482c1f Python: Fix bad join in SSA
On `pritomrajkhowa/LoopBound`:

```
Definitions.ql-3:SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUse#ff ................. 4m35s
```

specifically

```
(376s) Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUse#ff/2@be04e9kp after 4m58s:
388843     ~0%     {4} r1 = JOIN Essa::TPhiFunction#fff_2#join_rhs WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#ffff ON FIRST 1 OUTPUT Rhs.1, Lhs.0, Rhs.2, Rhs.3
3629812090 ~1%     {7} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::variableUse#ffff ON FIRST 1 OUTPUT Lhs.0, Rhs.2, Rhs.3, Lhs.2, Lhs.3, Lhs.1, Rhs.1 'use1'
0          ~0%     {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#fffff ON FIRST 5 OUTPUT Lhs.5, Lhs.6 'use1'
0          ~0%     {2} r4 = JOIN r3 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#ff ON FIRST 1 OUTPUT Lhs.1 'use1', Rhs.1 'use2'

897141     ~0%     {2} r5 = SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUseSameVar#ff UNION r4
                    return r5
```

Clearly we do not want to join on the variable so soon. So we unbind it
and get

```
(78s) Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUse#ff/2@40e0e6uv after 434ms:
3377959 ~2%     {4} r1 = SCAN SsaCompute::SsaComputeImpl::variableUse#ffff OUTPUT In.0, In.2, In.3, In.1 'use1'
1026855 ~2%     {4} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#fffff ON FIRST 3 OUTPUT Lhs.0, Rhs.3, Rhs.4, Lhs.3 'use1'
129484  ~0%     {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#ffff_1230#join_rhs ON FIRST 3 OUTPUT Rhs.3, Lhs.3 'use1'
0       ~0%     {2} r4 = JOIN r3 WITH Essa::TPhiFunction#fff_2#join_rhs ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'use1'
0       ~0%     {2} r5 = JOIN r4 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#ff ON FIRST 1 OUTPUT Lhs.1 'use1', Rhs.1 'use2'

897141  ~0%     {2} r6 = SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUseSameVar#ff UNION r5
                return r6
```
2021-12-07 18:19:47 +00:00
Taus 59bac04d8f Python: Fix Python 2 failures 2021-12-07 18:00:46 +00:00
Taus ffc858e34d Python: Add missing file 2021-12-07 17:29:35 +00:00
Taus 7437cd4d85 Python: Fix syntax error locations 2021-12-07 16:51:33 +00:00
Erik Krogh Kristensen 3c59aa319e
Merge pull request #7245 from erik-krogh/explicit-this-all-the-places
All langs: apply the explicit-this patch to all remaining code
2021-12-07 10:40:26 +01:00
Taus 7cd9369d91 Python: Autoformat 2021-12-07 09:29:24 +00:00
Taus 33a9f86f54 Python: Change integer in `trois.py` 2021-12-07 08:54:07 +00:00
Taus dd33f4f4d2
Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-12-07 09:48:53 +01:00
Taus 7f44cebed7 Python: Add missing hidden flow
The easiest way to implement this was to change the definition of
`module_export` to account for chains of `import *`. We reuse the
machinery from `ImportStar.qll` for this, naturally.
2021-12-02 17:11:56 +00:00