C++: Optimize SubBasicBlocks library#1824
Merged
Merged
Conversation
geoffw0
reviewed
Aug 26, 2019
| pragma[nomagic] | ||
| private int outerPosToInnerPos(BasicBlock bb, int posInBB) { | ||
| posInBB = result + this.getIndexInBasicBlock(bb) and | ||
| result = [ 0 .. this.getNumberOfNodes() - 1 ] |
Contributor
There was a problem hiding this comment.
How confident are we that this isn't quadratic? Naively, it seems that for any particular node bb, posInBB, every possible this in bb must be checked.
Other than this question, FWIW, these changes LGTM also.
Contributor
Author
There was a problem hiding this comment.
This is the DIL we generate for this predicate:
noinline nomagic
SubBasicBlocks::SubBasicBlock::outerPosToInnerPos#ffff(/* SubBasicBlocks::SubBasicBlock */ cached entity this,
/* BasicBlocks::BasicBlock */ cached entity bb,
int posInBB, int result) :-
exists(cached int call_result |
exists(int range#high |
exists(int Sub#lhs |
SubBasicBlocks::SubBasicBlock::getIndexInBasicBlock#fff(this, bb,
call_result),
SubBasicBlocks::SubBasicBlock::getNumberOfNodes#ff(this, Sub#lhs),
Minus(Sub#lhs, 1, range#high)
),
range(0, range#high, result)
),
PlusInt(result, call_result, posInBB)
)
.
It first joins in the two getters (functional relations), so the tuple count will be the number of SubBasicBlock. It then joins with the range, which multiplies that tuple count by the average number of nodes per SubBasicBlock. All together, this predicate should have the same number of tuples as there are ControlFlowNodes.
Contributor
There was a problem hiding this comment.
Great, that gives me a bit more confidence.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Some predicates in this library had performance that was quadratic in the number of sub-basic-blocks per basic block. After field flow was added, that has become a problem on two projects: intel/mkl-dnn and laurentkneip/opengv. This PR should fix the join order and make these predicates faster on all projects.