-
Notifications
You must be signed in to change notification settings - Fork 2k
C++: IR-based taint tracking #966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
158 changes: 158 additions & 0 deletions
158
cpp/ql/src/semmle/code/cpp/ir/dataflow/TaintTracking.qll
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,158 @@ | ||
| /** | ||
| * Provides classes for performing local (intra-procedural) and | ||
| * global (inter-procedural) taint-tracking analyses. | ||
| * | ||
| * We define _taint propagation_ informally to mean that a substantial part of | ||
| * the information from the source is preserved at the sink. For example, taint | ||
| * propagates from `x` to `x + 100`, but it does not propagate from `x` to `x > | ||
| * 100` since we consider a single bit of information to be too little. | ||
| */ | ||
|
|
||
| import semmle.code.cpp.ir.dataflow.DataFlow | ||
| import semmle.code.cpp.ir.dataflow.DataFlow2 | ||
| private import semmle.code.cpp.ir.IR | ||
|
|
||
| module TaintTracking { | ||
| /** | ||
| * A configuration of interprocedural taint tracking analysis. This defines | ||
| * sources, sinks, and any other configurable aspect of the analysis. Each | ||
| * use of the taint tracking library must define its own unique extension of | ||
| * this abstract class. | ||
| * | ||
| * A taint-tracking configuration is a special data flow configuration | ||
| * (`DataFlow::Configuration`) that allows for flow through nodes that do not | ||
| * necessarily preserve values but are still relevant from a taint-tracking | ||
| * perspective. (For example, string concatenation, where one of the operands | ||
| * is tainted.) | ||
| * | ||
| * To create a configuration, extend this class with a subclass whose | ||
| * characteristic predicate is a unique singleton string. For example, write | ||
| * | ||
| * ``` | ||
| * class MyAnalysisConfiguration extends TaintTracking::Configuration { | ||
| * MyAnalysisConfiguration() { this = "MyAnalysisConfiguration" } | ||
| * // Override `isSource` and `isSink`. | ||
| * // Optionally override `isSanitizer`. | ||
| * // Optionally override `isAdditionalTaintStep`. | ||
| * } | ||
| * ``` | ||
| * | ||
| * Then, to query whether there is flow between some `source` and `sink`, | ||
| * write | ||
| * | ||
| * ``` | ||
| * exists(MyAnalysisConfiguration cfg | cfg.hasFlow(source, sink)) | ||
| * ``` | ||
| * | ||
| * Multiple configurations can coexist, but it is unsupported to depend on a | ||
| * `TaintTracking::Configuration` or a `DataFlow::Configuration` in the | ||
| * overridden predicates that define sources, sinks, or additional steps. | ||
| * Instead, the dependency should go to a `TaintTracking::Configuration2` or | ||
| * a `DataFlow{2,3,4}::Configuration`. | ||
| */ | ||
| abstract class Configuration extends DataFlow::Configuration { | ||
| bindingset[this] | ||
| Configuration() { any() } | ||
|
|
||
| /** Holds if `source` is a taint source. */ | ||
| // overridden to provide taint-tracking specific qldoc | ||
| abstract override predicate isSource(DataFlow::Node source); | ||
|
|
||
| /** Holds if `sink` is a taint sink. */ | ||
| // overridden to provide taint-tracking specific qldoc | ||
| abstract override predicate isSink(DataFlow::Node sink); | ||
|
|
||
| /** | ||
| * Holds if taint should not flow into `node`. | ||
| */ | ||
| predicate isSanitizer(DataFlow::Node node) { none() } | ||
|
|
||
| /** | ||
| * Holds if the additional taint propagation step | ||
| * from `source` to `target` must be taken into account in the analysis. | ||
| * This step will only be followed if `target` is not in the `isSanitizer` | ||
| * predicate. | ||
| */ | ||
| predicate isAdditionalTaintStep(DataFlow::Node source, DataFlow::Node target) { none() } | ||
|
|
||
| final override predicate isBarrier(DataFlow::Node node) { isSanitizer(node) } | ||
|
|
||
| final override predicate isAdditionalFlowStep(DataFlow::Node source, DataFlow::Node target) { | ||
| this.isAdditionalTaintStep(source, target) | ||
| or | ||
| localTaintStep(source, target) | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * A taint-tracking configuration that is backed by the `DataFlow2` library | ||
| * instead of `DataFlow`. Use this class when taint-tracking configurations | ||
| * or data-flow configurations must depend on each other. | ||
| * | ||
| * See `TaintTracking::Configuration` for the full documentation. | ||
| */ | ||
| abstract class Configuration2 extends DataFlow2::Configuration { | ||
| bindingset[this] | ||
| Configuration2() { any() } | ||
|
|
||
| /** Holds if `source` is a taint source. */ | ||
| // overridden to provide taint-tracking specific qldoc | ||
| abstract override predicate isSource(DataFlow::Node source); | ||
|
|
||
| /** Holds if `sink` is a taint sink. */ | ||
| // overridden to provide taint-tracking specific qldoc | ||
| abstract override predicate isSink(DataFlow::Node sink); | ||
|
|
||
| /** | ||
| * Holds if taint should not flow into `node`. | ||
| */ | ||
| predicate isSanitizer(DataFlow::Node node) { none() } | ||
|
|
||
| /** | ||
| * Holds if the additional taint propagation step | ||
| * from `source` to `target` must be taken into account in the analysis. | ||
| * This step will only be followed if `target` is not in the `isSanitizer` | ||
| * predicate. | ||
| */ | ||
| predicate isAdditionalTaintStep(DataFlow::Node source, DataFlow::Node target) { none() } | ||
|
|
||
| final override predicate isBarrier(DataFlow::Node node) { isSanitizer(node) } | ||
|
|
||
| final override predicate isAdditionalFlowStep(DataFlow::Node source, DataFlow::Node target) { | ||
| this.isAdditionalTaintStep(source, target) | ||
| or | ||
| localTaintStep(source, target) | ||
| } | ||
| } | ||
|
|
||
| /** | ||
| * Holds if taint propagates from `nodeFrom` to `nodeTo` in exactly one local | ||
| * (intra-procedural) step. | ||
| */ | ||
| predicate localTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) { | ||
| // Taint can flow into using ordinary data flow. | ||
| DataFlow::localFlowStep(nodeFrom, nodeTo) | ||
| or | ||
| // Taint can flow through expressions that alter the value but preserve | ||
| // more than one bit of it _or_ expressions that follow data through | ||
| // pointer indirections. | ||
| nodeTo.getAnOperand().getDefinitionInstruction() = nodeFrom and | ||
| ( | ||
| nodeTo instanceof ArithmeticInstruction | ||
| or | ||
| nodeTo instanceof BitwiseInstruction | ||
| or | ||
| nodeTo instanceof PointerArithmeticInstruction | ||
| or | ||
| nodeTo instanceof FieldAddressInstruction | ||
| ) | ||
| or | ||
| nodeTo.(LoadInstruction).getSourceAddress() = nodeFrom | ||
| } | ||
|
|
||
| /** | ||
| * Holds if taint may propagate from `source` to `sink` in zero or more local | ||
| * (intra-procedural) steps. | ||
| */ | ||
| predicate localTaint(DataFlow::Node source, DataFlow::Node sink) { localTaintStep*(source, sink) } | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should block taint propagation at bitwise and by default. It really requires both operands to be tainted before the result of the and is tainted. Anyway, let's defer that until later.