Skip to content

Treat this as a parameter in IR generation#3571

Merged
rdmarsh2 merged 13 commits into
github:masterfrom
rdmarsh2:ir-this-parameter
May 28, 2020
Merged

Treat this as a parameter in IR generation#3571
rdmarsh2 merged 13 commits into
github:masterfrom
rdmarsh2:ir-this-parameter

Conversation

@rdmarsh2

Copy link
Copy Markdown
Contributor

This stops generating InitalizeThis instructions for C++, in favor of treating this as a parameter, with corresponding InitializeParameter and InitializeIndirection instructions. This makes alias analysis handle the this pointer accurately. C# IR generation is not currently affected.

This closes https://github.com/github/codeql-c-analysis-team/issues/50

@rdmarsh2 rdmarsh2 added the C++ label May 26, 2020
@rdmarsh2 rdmarsh2 requested review from a team as code owners May 26, 2020 18:43
@dbartol dbartol self-assigned this May 26, 2020

@dbartol dbartol left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM. A few suggested tweaks.

Does this really not affect any dataflow results (yet)?

private newtype TAllocation =
TVariableAllocation(IRVariable var) or
TIndirectParameterAllocation(IRAutomaticUserVariable var) {
TIndirectParameterAllocation(IRVariable var) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
TIndirectParameterAllocation(IRVariable var) {
TIndirectParameterAllocation(IRAutomaticVariable var) {


class IndirectParameterAllocation extends Allocation, TIndirectParameterAllocation {
IRAutomaticUserVariable var;
IRVariable var;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IRVariable var;
IRAutomaticVariable var;


newtype TInstructionTag =
OnlyInstructionTag() or // Single instruction (not including implicit Load)
InitializeThisAddressTag() or

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could probably just reuse the tags we already use when generating other InitializeParameter sequences. Not a big deal either way, though.

@rdmarsh2

Copy link
Copy Markdown
Contributor Author

It does affect one result in dataflow, but this doesn't add the output side. I think it should affect a few more tests, though...

@MathiasVP

MathiasVP commented May 27, 2020

Copy link
Copy Markdown
Contributor

Does this really not affect any dataflow results (yet)?

Apparently the reason we're not getting a lot of dataflow changes is because a setter like this:

struct A {
  int x; int y;
  void set_x(int val) {
    this->x = val;
  }
};

is not getting a ReturnIndirection from which the partial definition of this gets propagated out to the caller. With this PR I would expect that setter to behave as the following code, though:

void set_x(A* a, int val) {
  a->x = val;
}

But the IR for the first setter is:

#  118| void A::set_x(int)
#  118|   Block 0
#  118|     v118_1(void)           = EnterFunction                :
#  118|     m118_2(unknown)        = AliasedDefinition            :
#  118|     m118_3(unknown)        = InitializeNonLocal           :
#  118|     m118_4(unknown)        = Chi                          : total:m118_2, partial:m118_3
#  118|     r118_5(glval<unknown>) = VariableAddress[#this]       :
#  118|     m118_6(glval<A>)       = InitializeParameter[#this]   : &:r118_5
#  118|     r118_7(glval<A>)       = Load                         : &:r118_5, m118_6
#  118|     m118_8(A)              = InitializeIndirection[#this] : &:r118_7
#  118|     r118_9(glval<int>)     = VariableAddress[val]         :
#  118|     m118_10(int)           = InitializeParameter[val]     : &:r118_9
#  119|     r119_1(glval<int>)     = VariableAddress[val]         :
#  119|     r119_2(int)            = Load                         : &:r119_1, m118_10
#  119|     r119_3(glval<unknown>) = VariableAddress[#this]       :
#  119|     r119_4(A *)            = Load                         : &:r119_3, m118_6
#  119|     r119_5(glval<int>)     = FieldAddress[x]              : r119_4
#  119|     m119_6(int)            = Store                        : &:r119_5, r119_2
#  119|     m119_7(unknown)        = Chi                          : total:m118_8, partial:m119_6
#  120|     v120_1(void)           = NoOp                         :
#  118|     v118_11(void)          = ReturnVoid                   :
#  118|     v118_12(void)          = AliasedUse                   : m118_3
#  118|     v118_13(void)          = ExitFunction                 :

and for the second one it's:

#  124| void set_x(A*, int)
#  124|   Block 0
#  124|     v124_1(void)       = EnterFunction            :
#  124|     m124_2(unknown)    = AliasedDefinition        :
#  124|     m124_3(unknown)    = InitializeNonLocal       :
#  124|     m124_4(unknown)    = Chi                      : total:m124_2, partial:m124_3
#  124|     r124_5(glval<A *>) = VariableAddress[a]       :
#  124|     m124_6(A *)        = InitializeParameter[a]   : &:r124_5
#  124|     r124_7(A *)        = Load                     : &:r124_5, m124_6
#  124|     m124_8(unknown)    = InitializeIndirection[a] : &:r124_7
#  124|     r124_9(glval<int>) = VariableAddress[val]     :
#  124|     m124_10(int)       = InitializeParameter[val] : &:r124_9
#  125|     r125_1(glval<int>) = VariableAddress[val]     :
#  125|     r125_2(int)        = Load                     : &:r125_1, m124_10
#  125|     r125_3(glval<A *>) = VariableAddress[a]       :
#  125|     r125_4(A *)        = Load                     : &:r125_3, m124_6
#  125|     r125_5(glval<int>) = FieldAddress[x]          : r125_4
#  125|     m125_6(int)        = Store                    : &:r125_5, r125_2
#  125|     m125_7(unknown)    = Chi                      : total:m124_8, partial:m125_6
#  126|     v126_1(void)       = NoOp                     :
#  124|     v124_11(void)      = ReturnIndirection[a]     : &:r124_7, m125_7
#  124|     v124_12(void)      = ReturnVoid               :
#  124|     v124_13(void)      = AliasedUse               : m124_3
#  124|     v124_14(void)      = ExitFunction             :

@rdmarsh2 is that intentional?

@rdmarsh2

Copy link
Copy Markdown
Contributor Author

Yes. I was planning to add the ReturnIndirection in a followup PR.

dbartol
dbartol previously approved these changes May 28, 2020
Robert Marsh added 2 commits May 28, 2020 08:32
Bring in new tests so their output can be fixed

@dbartol dbartol left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Take a quick look at the new consistency failure in fields/dataflow-ir-consistency.expected, but if it's just a dupe of the other two existing failures, feel free to merge.

@rdmarsh2

Copy link
Copy Markdown
Contributor Author

It looks like the same issue as the existing two failures (Chi node associated with a call argument that's a pointer to a field having both the pointer type and an unknown type)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants