C#: Update all security queries to path-problems by calumgrant · Pull Request #367 · github/codeql

calumgrant · 2018-10-25T14:38:36Z

Converts all of the security queries to type path-problem and updates the expected qltest output.

In most cases, the change simply replaces DataFlow::Node with DataFlow::PathNode. A few queries still use DataFlow::Node, for example where switching would change the semantics, or where the sources and sinks have additional member predicates. In those cases, the predicate Node::getPathNode is used instead.

Do not merge this yet - we still need to validate the performance.

hvitved

Many tests still need to have their expected output updated. I generally prefer the style used by e.g. https://lgtm.com/query/rule:1506493930005/lang:cpp/ (see comment), but let me know what you think.

hvitved · 2018-10-26T09:13:02Z

 where c.hasFlow(source, sink)
-select sink, "$@ flows to here and is used in a path.", source, "User-provided value"
+select sink, source.getPathNode(c), sink.getPathNode(c),
+  "$@ flows to here and is used in a path.", source, "User-provided value"


I think I would prefer

from TaintTrackingConfiguration c, DataFlow::PathNode source, DataFlow::PathNode sink
where c.hasFlowPath(source, sink)
select sink, source, sink,
  "$@ flows to here and is used in a path.", source, "User-provided value"

Or actually I think this might be slightly better:

from TaintTrackingConfiguration c, DataFlow::PathNode source, DataFlow::PathNode sink
where c.hasFlowPath(source, sink)
select sink.getNode(), source, sink,
  "$@ flows to here and is used in a path.", source.getNode(), "User-provided value"

Why? Aren't toString() and getLocation() on DataFlow::Node and DataFlow::PathNode the same?

toString() is different when you have field flow, as the PathNode will include the access path.

It seems like there's a choice between

from Configuration c, DataFlow::Node source, DataFlow::Node sink
where c.hasFlow(source, sink)
select sink, source.getPathNode(c), sink.getPathNode(c),
  "$@ could flow here.", source, "User input"

vs

from Configuration c, DataFlow::PathNode source, DataFlow::PathNode sink
where c.hasFlowPath(source, sink)
select sink.getNode(), source, sink,
  "$@ could flow here.", source.getNode(), "User input"

The second is actually longer. Either way, this mapping back and forth seems somewhat unsatisfactory. I have a slight preference for the first version, because:

The query is actually dealing with dataflow nodes. Paths are an implementation detail.

You often use a specific type of source and sink, not DataFlow::Node. Often the source and sink have additional predicates relating to information about the source and the sink. PathNodes don't capture any of this.

You could forget to call getNode().

Anyway, this sounds like something to be discussed offline in the CPH office.

I think those are some convincing arguments, @calumgrant!

You're missing an important argument for using PathNode. There are potentially many PathNodes corresponding to a single Node (e.g. if there are multiple configurations in scope with overlapping sources or sinks, which can easily happen), and in that case using getPathNode to get back to the PathNodes becomes semantically shaky, as you'll risk multiplying your results.

Yes, but getPathNode(c) ensures that only nodes of configuration c are used in the path. Therefore I think the risk of multiplication -- which only occurs in the edges relation -- is the same in either case.

Still, you may get a pair of path nodes for which there isn't actually flow, whereas c.hasFlowPath() guarantees that there is. So, unless the engine filters out those results of the select predicate for which the edges predicate produces no path from the second column to the third, I think we need to use the other approach.
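The concern about unconnected pairs can be sketched with a toy model in Python. Everything in it is hypothetical: path nodes are modelled as (node, call context) pairs, and `PATH_NODES`, `EDGES`, `has_flow_path`, `has_flow` and `get_path_nodes` are illustrative stand-ins rather than the QL library's API. Whether the real library can produce this situation depends on how its path nodes are constructed. The sketch only shows that mapping each end of a node-level result back to path nodes separately can select pairs with no connecting path.

```python
from itertools import product

# Hypothetical toy model: a path node is a (node, call_context) pair.
# The node "src" and the node "snk" each have two path nodes.
PATH_NODES = {("src", "ctx1"), ("src", "ctx2"), ("snk", "ctx1"), ("snk", "ctx2")}

# Edges only connect path nodes that share a call context.
EDGES = {
    (("src", "ctx1"), ("snk", "ctx1")),
    (("src", "ctx2"), ("snk", "ctx2")),
}


def has_flow_path(source, sink):
    """Stand-in for c.hasFlowPath: reachability over EDGES in zero or more steps."""
    seen, todo = set(), [source]
    while todo:
        current = todo.pop()
        if current == sink:
            return True
        if current in seen:
            continue
        seen.add(current)
        todo.extend(b for (a, b) in EDGES if a == current)
    return False


def has_flow(source_node, sink_node):
    """Stand-in for c.hasFlow: node-level flow, ignoring call contexts."""
    return any(
        has_flow_path(s, t)
        for s, t in product(PATH_NODES, repeat=2)
        if s[0] == source_node and t[0] == sink_node
    )


def get_path_nodes(node):
    """Stand-in for Node::getPathNode(c): every path node wrapping `node`."""
    return {p for p in PATH_NODES if p[0] == node}


# Style 1: query over nodes, then map each end back to path nodes separately.
via_nodes = {
    (s, t)
    for s, t in product(get_path_nodes("src"), get_path_nodes("snk"))
    if has_flow("src", "snk")
}

# Style 2: query over path nodes directly.
via_paths = {
    (s, t)
    for s, t in product(PATH_NODES, repeat=2)
    if s[0] == "src" and t[0] == "snk" and has_flow_path(s, t)
}

# Pairs selected by style 1 that have no actual path between them.
spurious = via_nodes - via_paths
```

In this model style 1 selects all four combinations, while style 2 selects only the two pairs that share a call context.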

hvitved · 2018-11-09T12:56:02Z

+from TaintTrackingConfiguration c, DataFlow::PathNode source, DataFlow::PathNode sink
+where c.hasFlowPath(source, sink)
+select sink.getNode(), source, sink,
+  "Private data returned by $@ is written to an external location.", source.getNode(), source.toString()


Should be source.getNode().toString().

hvitved · 2018-11-09T12:57:48Z

+where
+  source = sourcePath.getNode() and
+  sink = sinkPath.getNode() and
+  c.hasFlow(source, sink) and


Use c.hasFlowPath(sourcePath, sinkPath) instead.

No, in this case it gives different results due to hackery around sources and sinks. The configuration's hasFlow has been changed, and hasFlow isn't final so I guess it's allowed.

It's probably best to avoid such hacks, as it's not very clear what's going on. It would be much clearer to define the negation body from the overridden hasFlow as a separate member predicate, which could then be explicitly anti-joined here.

xiemaisi · 2018-11-12T14:54:10Z

Drive-by comment: I was reminded today that our analysis toolchain assumes that each node selected by a path query appears in the edges relation unless an explicit nodes relation is provided. So in particular if you have an empty edges relation, any query results (which might well exist, since the path from source to sink could be empty) are discarded.

Are you guarding against that case somehow? (I think the easiest workaround is to define a nodes query relation in PathGraph.)
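The behaviour described here can be sketched as a toy Python model. The toolchain's actual logic is not shown in this thread, so `kept_results` is a hypothetical function that only encodes the stated assumption: a result survives if its path nodes are known, and without an explicit nodes relation the known nodes are exactly those mentioned in edges.

```python
def kept_results(results, edges, nodes=None):
    """Toy model: keep a (source, sink) result only if both ends are known.

    Without an explicit `nodes` relation, the known nodes are those that
    appear in `edges`.
    """
    if nodes is None:
        known = {n for edge in edges for n in edge}
    else:
        known = set(nodes)
    return [(s, t) for (s, t) in results if s in known and t in known]


# A source that is also a sink: the path is empty, so there are no edges.
results = [("x", "x")]
edges = set()

dropped = kept_results(results, edges)                # no nodes relation
retained = kept_results(results, edges, nodes={"x"})  # explicit nodes relation
```

Under that assumption the result is discarded when only the empty edges relation is available, and kept once the node is listed explicitly.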

aschackmull · 2018-11-12T15:46:31Z

I don't know the exact structure of PathNodes in C#, but for Java the PathNodes corresponding to sinks are special, so if a source is also a sink, such that the Node path is the empty path, then the corresponding PathNode path in the edges relation will have length 1.
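A rough Python sketch of why a separate sink path node gives a path of length 1. This is a guess at the shape being described, not the Java library's implementation: the `"mid"` and `"sink"` tags and the `path_edges` helper are hypothetical.

```python
def path_edges(node_path):
    """Toy model: wrap each node on the path as an ordinary path node, then
    append a separate 'sink' path node for the last node."""
    wrapped = [(n, "mid") for n in node_path]
    wrapped.append((node_path[-1], "sink"))
    # Consecutive path nodes form the edges relation for this path.
    return list(zip(wrapped, wrapped[1:]))


# Source and sink are the same node: the node-level path has no steps,
# but the path-node path still has one edge into the sink path node.
single = path_edges(["x"])
longer = path_edges(["a", "b"])
```

Here `single` contains exactly one edge, from the ordinary path node for `x` to its sink path node, so `x` still appears in the edges relation.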

xiemaisi · 2018-11-12T15:53:06Z

OK, glad to hear that you're already handling this for Java. Does the workaround of creating paths of length 1 have any additional benefits over explicitly specifying nodes (e.g., are there other parts of the toolchain that assume paths have at least length one)?

aschackmull · 2018-11-12T16:01:23Z

The main reason for modelling sinks explicitly was AFAIR to throw away the call context on the final node in a path in order to avoid potential issues with duplicated results (if a sink was reachable from multiple call contexts). I'm not sure about what the tool chain assumptions re. path length are, if any.

hvitved · 2018-11-21T12:32:06Z

@calumgrant: Happy to merge after you have added the nodes predicate suggested by @xiemaisi and fixed the tests.

calumgrant added the WIP label ("This is a work-in-progress, do not merge yet!") Oct 25, 2018

calumgrant requested a review from a team as a code owner October 25, 2018 14:38

calumgrant closed this Oct 25, 2018

calumgrant reopened this Oct 25, 2018

hvitved requested changes Oct 26, 2018

calumgrant force-pushed the cs/path-problems branch 2 times, most recently from f3df87d to 42fdddb Compare October 26, 2018 11:52

calumgrant added the C# label Oct 26, 2018

calumgrant force-pushed the cs/path-problems branch from 1ef20e5 to 793eaee Compare October 29, 2018 17:25

calumgrant removed the WIP label ("This is a work-in-progress, do not merge yet!") Nov 6, 2018

hvitved requested changes Nov 9, 2018

xiemaisi mentioned this pull request Nov 14, 2018

JavaScript: Convert security queries to path queries where applicable. #462

Merged

calumgrant added 3 commits November 16, 2018 10:31

C#: Convert security queries to path-problem and update qltest expect…

eddc528

…ed output.

C#: Always use PathNode in a path-problem query.

e908b09

C#: Address review comments - adding .getNode() where appropriate.

cf4b04a

calumgrant force-pushed the cs/path-problems branch from 793eaee to cf4b04a Compare November 16, 2018 11:52

C#: Fix ReDoS query.

8c753d7

C#: Add nodes predicate to all path queries.

69ab1ed

hvitved previously approved these changes Nov 21, 2018

C#: Update test outputs.

3eae1cd

calumgrant dismissed hvitved’s stale review via 3eae1cd November 21, 2018 17:28

hvitved approved these changes Nov 21, 2018

hvitved merged commit 201f64e into github:master Nov 22, 2018

aibaars added a commit that referenced this pull request Oct 19, 2021

Merge pull request #367 from github/shati-patel/queriesxml

9b88bbd

Add a queries.xml file (for CWE coverage docs)

smowton pushed a commit to smowton/codeql that referenced this pull request Apr 16, 2022

Merge pull request github#367 from github/kotlin-version-1-6-20

b41f8dd

Add 1.6.20 support
