[Compiler Refactor 6] Refactor Amber Workflow to use the new PhysicalPlan implementation #1807
Merged
shengquan-ni
requested changes
Jan 25, 2023
shengquan-ni
left a comment
Contributor
Left some comments.
Xiao-zhen-Liu
pushed a commit
that referenced
this pull request
Jan 27, 2023
…1812) This PR fixes an issue introduced in #1793. In some scenarios, the `inputPortMapping` is not correctly passed into the data processor because of an initialization order issue with a lazily evaluated variable. As a result, the join operator produces no results because it wrongly treats all input data as coming from the build side. This PR is a temporary hot-fix; the issue will be completely solved once #1807 is merged into master.
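The failure mode described in this commit message can be illustrated with a minimal sketch. It is not Texera's actual code: `DataProcessorSketch`, `probePort`, `isBuildSide`, and the port names are hypothetical, and the only assumption carried over from the message is that a lazily evaluated value was read before `inputPortMapping` had been set.

```scala
// Hypothetical sketch of a lazy-val initialization order pitfall.
// None of these names come from the Texera code base.
final class DataProcessorSketch {
  // Assigned by the caller some time after construction.
  var inputPortMapping: Map[String, Int] = Map.empty

  // A lazy val is computed once, on first access, and cached afterwards.
  lazy val probePort: Option[Int] = inputPortMapping.get("probe")

  // If probePort was cached as None, every port looks like the build side.
  def isBuildSide(port: Int): Boolean = !probePort.contains(port)
}

object LazyInitOrderDemo {
  // Returns isBuildSide for ports 0 and 1, optionally forcing the lazy val
  // before the mapping is assigned.
  def buildSideFlags(readBeforeInit: Boolean): List[Boolean] = {
    val dp = new DataProcessorSketch
    // The early read sees the empty mapping and caches None permanently.
    if (readBeforeInit) require(dp.probePort.isEmpty)
    dp.inputPortMapping = Map("build" -> 0, "probe" -> 1)
    List(0, 1).map(dp.isBuildSide)
  }
}
```

In this sketch, `LazyInitOrderDemo.buildSideFlags(readBeforeInit = true)` classifies both ports as build side, while `readBeforeInit = false` classifies only port 0 that way.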
…onfig class (#1794) This PR is a follow-up to #1791. In #1791, a new `OpExecConfig` class was introduced. This PR updates the simple operators that are mostly one-to-one to use the new class. Most changes are one-line changes that map directly from the old API to the new API. Changes to more complicated operators (join, aggregate, etc.) will be completed in subsequent PRs.
shengquan-ni
approved these changes
Feb 2, 2023
yangzhang75
pushed a commit
to yangzhang75/texera
that referenced
this pull request
Jun 22, 2026
…pache#1812) This PR fixes an issue introduced in apache#1793. In some scenarios, the `inputPortMapping` is not correctly passed into the data processor because of an initialization order issue with a lazily evaluated variable. As a result, the join operator produces no results because it wrongly treats all input data as coming from the build side. This PR is a temporary hot-fix; the issue will be completely solved once apache#1807 is merged into master.
yangzhang75
pushed a commit
to yangzhang75/texera
that referenced
this pull request
Jun 22, 2026
…Plan implementation (apache#1807) This PR makes the following changes: 1. Refactors the `Workflow` class to use the new `PhysicalPlan` class. After this change, all helper functions related to traversing the physical plan DAG are inside `PhysicalPlan`. 2. In all classes dealing with the actual physical plan, uses `LayerID` (physical operator ID) instead of `OperatorID` (logical operator ID) to identify an operator. Specifically, `OperatorID` consists of `(workflowID, operatorID)` (logical operator) and `LayerID` consists of `(workflowID, operatorID, layerID)`. One logical operator can correspond to multiple physical operators (such as aggregate and visualization). 3. Changes from apache#1794 and apache#1817 are also merged into master through this branch. Those two PRs were previously merged into this branch as their base branch, and they updated all operators to adopt the new `OpExecConfig` API. 4. Now that all operators adopt the new `OpExecConfig` API, the old code is cleaned up. Specifically, many old `XxxOpExecConfig` classes are no longer needed and are unified into the new `OpExecConfig` API. 5. Refactors the `WorkflowPipelinedBuilder` class to use the new `PhysicalPlan` API. Separates the logic of deciding a region and adding a materialization operator into a new `MaterializationRewriter` class. 6. Refactors the compilation phase into a logical plan building phase and a physical plan building phase. In the physical plan building phase, adds a new `PartitionEnforcer` to decide the shuffle policy of each link. The old ad-hoc way of deciding shuffle policies (`DeploymentFilter`) is removed.
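The `OperatorID`/`LayerID` distinction and the idea of keeping DAG traversal helpers inside the plan class could be sketched roughly as follows. The field names of the two IDs follow the description above; `PhysicalPlanSketch`, its methods, the `String` field types, and the layer names `localAgg`/`globalAgg` are assumptions made for illustration, not the API introduced by this PR.

```scala
import scala.annotation.tailrec

// ID fields follow the PR description; the String types are an assumption.
final case class OperatorID(workflowID: String, operatorID: String)
final case class LayerID(workflowID: String, operatorID: String, layerID: String) {
  // The logical operator this physical operator belongs to.
  def logical: OperatorID = OperatorID(workflowID, operatorID)
}

// Hypothetical stand-in for a physical plan: a DAG keyed by LayerID.
final case class PhysicalPlanSketch(layers: Set[LayerID], links: Set[(LayerID, LayerID)]) {
  def upstream(id: LayerID): Set[LayerID] = links.collect { case (from, `id`) => from }
  def downstream(id: LayerID): Set[LayerID] = links.collect { case (`id`, to) => to }

  // One logical operator may map to several physical operators.
  def layersOf(op: OperatorID): Set[LayerID] = layers.filter(_.logical == op)

  // Kahn-style topological order; ties are broken by ID for determinism.
  def topologicalOrder: List[LayerID] = {
    @tailrec
    def loop(remaining: Set[LayerID], done: List[LayerID]): List[LayerID] =
      if (remaining.isEmpty) done.reverse
      else {
        val ready = remaining.filter(id => upstream(id).forall(u => !remaining.contains(u)))
        require(ready.nonEmpty, "plan contains a cycle")
        val sorted = ready.toList.sortBy(l => (l.operatorID, l.layerID))
        loop(remaining -- ready, sorted.reverse ::: done)
      }
    loop(layers, Nil)
  }
}

object PhysicalPlanDemo {
  val scan = LayerID("wf", "scan", "main")
  // A single logical aggregate split into two physical layers (names are made up).
  val aggPartial = LayerID("wf", "agg", "localAgg")
  val aggFinal = LayerID("wf", "agg", "globalAgg")
  val plan = PhysicalPlanSketch(
    Set(scan, aggPartial, aggFinal),
    Set(scan -> aggPartial, aggPartial -> aggFinal)
  )
}
```

Here `PhysicalPlanDemo.plan.layersOf(OperatorID("wf", "agg"))` returns both aggregate layers, which is the case an `OperatorID` alone cannot tell apart.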
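This PR makes the following changes:

1. Refactors the `Workflow` class to use the new `PhysicalPlan` class. After this change, all helper functions related to traversing the physical plan DAG are inside `PhysicalPlan`.
2. In all classes dealing with the actual physical plan, uses `LayerID` (physical operator ID) instead of `OperatorID` (logical operator ID) to identify an operator. Specifically, `OperatorID` consists of `(workflowID, operatorID)` (logical operator) and `LayerID` consists of `(workflowID, operatorID, layerID)`. One logical operator can correspond to multiple physical operators (such as aggregate and visualization).
3. Changes from #1794 and #1817 are also merged into master through this branch. Those two PRs were previously merged into this branch as their base branch, and they updated all operators to adopt the new `OpExecConfig` API.
4. Now that all operators adopt the new `OpExecConfig` API, the old code is cleaned up. Specifically, many old `XxxOpExecConfig` classes are no longer needed and are unified into the new `OpExecConfig` API.
5. Refactors the `WorkflowPipelinedBuilder` class to use the new `PhysicalPlan` API. Separates the logic of deciding a region and adding a materialization operator into a new `MaterializationRewriter` class.
6. Refactors the compilation phase into a logical plan building phase and a physical plan building phase. In the physical plan building phase, adds a new `PartitionEnforcer` to decide the shuffle policy of each link. The old ad-hoc way of deciding shuffle policies (`DeploymentFilter`) is removed.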
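The description does not spell out the rules the `PartitionEnforcer` applies, so the following is only a sketch of what deciding a shuffle policy per link might look like. Every type, policy name, and rule here is an assumption made for illustration, not the behavior implemented in this PR.

```scala
// Hypothetical partitioning requirement declared by the downstream operator.
sealed trait PartitionReq
case object AnyPartition extends PartitionReq
case object SinglePartition extends PartitionReq
final case class HashPartition(keys: List[String]) extends PartitionReq

// Hypothetical shuffle policies that could be attached to a link.
sealed trait ShufflePolicy
case object OneToOne extends ShufflePolicy
case object RoundRobin extends ShufflePolicy
case object AllToOne extends ShufflePolicy
final case class HashBased(keys: List[String]) extends ShufflePolicy

object PartitionEnforcerSketch {
  // Assumed rules: the downstream requirement wins; otherwise keep data
  // in place when worker counts match and rebalance when they differ.
  def decide(required: PartitionReq, upstreamWorkers: Int, downstreamWorkers: Int): ShufflePolicy =
    required match {
      case HashPartition(keys) => HashBased(keys)
      case SinglePartition => AllToOne
      case AnyPartition if upstreamWorkers == downstreamWorkers => OneToOne
      case AnyPartition => RoundRobin
    }
}
```

Centralizing the decision in one function over declared requirements, instead of per-operator special cases, is the general design idea this sketch is meant to convey.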