[Compiler Refactor 8] Migrate all operators to use the new OpExecConfig class #1817

Merged
zuozhiw merged 4 commits into zuozhi-migrate-opexec from zuozhi-migrate-opexec-2
Jan 31, 2023

@zuozhiw zuozhiw commented Jan 30, 2023

Contributor

This PR is a continuation of #1794. It migrates all remaining operators to the new OpExecConfig class, including more complicated operators such as sources, aggregations, and joins.

@zuozhiw zuozhiw changed the base branch from master to zuozhi-migrate-opexec January 30, 2023 21:46
@zuozhiw zuozhiw requested a review from shengquan-ni January 30, 2023 21:46
@zuozhiw zuozhiw added the engine, refactor (Refactor the code), and java labels Jan 30, 2023

@shengquan-ni shengquan-ni left a comment

Contributor

LGTM

@zuozhiw zuozhiw merged commit 30e68fa into zuozhi-migrate-opexec Jan 31, 2023
@zuozhiw zuozhiw deleted the zuozhi-migrate-opexec-2 branch January 31, 2023 10:08
zuozhiw added a commit that referenced this pull request Feb 2, 2023
…Plan implementation (#1807)

This PR makes the following changes:
1. Refactors the `Workflow` class to use the new `PhysicalPlan` class.
After this change, all helper functions related to traversing the
physical plan DAG are inside `PhysicalPlan`.
2. In all classes dealing with the actual physical plan, uses `LayerID`
(physical operator ID) instead of `OperatorID` (logical operator ID) to
identify an operator. Specifically, `OperatorID` consists of
`(workflowID, operatorID)` and identifies a logical operator, while
`LayerID` consists of `(workflowID, operatorID, layerID)`. One logical
operator can correspond to multiple physical operators (such as
aggregate and visualization).
3. Changes from #1794 and #1817 are also merged into master in this
branch. Previously, these two PRs were merged into this base branch. In
those two PRs, all operators were updated to adopt the new
`OpExecConfig` API.
4. After all operators adopt the new `OpExecConfig` API, the old code
is cleaned up. Specifically, many old `XxxOpExecConfig` classes are
no longer needed, and they are all unified into the new `OpExecConfig`
API.
5. Refactors the `WorkflowPipelinedBuilder` class to use the new
`PhysicalPlan` API. Separates the logic of deciding a region from the
logic of adding a materialization operator, using a new
`MaterializationRewriter` class.
6. Refactors the compilation phase into a logical plan building phase
and a physical plan building phase. In the physical plan building phase, adds
a new `PartitionEnforcer` to decide the shuffle policies of each link.
The old ad-hoc way to decide shuffle policies (`DeploymentFilter`) is
removed.
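
The relationship between the two identifiers in item 2 can be sketched as follows. This is only an illustration written for this thread, not code from the Texera repository: the record names `OperatorId` and `LayerId`, the helper `groupByLogicalOperator`, and the layer names `localAgg` and `globalAgg` are all assumptions chosen for the example. It shows how a physical identifier extends the logical one with a layer component, so several physical operators can map back to the same logical operator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PlanIds {
    // Hypothetical stand-in for the logical operator ID: (workflowID, operatorID).
    record OperatorId(String workflowId, String operatorId) {}

    // Hypothetical stand-in for the physical operator ID:
    // (workflowID, operatorID, layerID).
    record LayerId(String workflowId, String operatorId, String layerId) {
        // Dropping the layer component recovers the logical operator.
        OperatorId logicalOperator() {
            return new OperatorId(workflowId, operatorId);
        }
    }

    // Groups physical operators under the logical operator they belong to,
    // preserving the order in which logical operators are first seen.
    static Map<OperatorId, List<LayerId>> groupByLogicalOperator(List<LayerId> layers) {
        return layers.stream().collect(
            Collectors.groupingBy(LayerId::logicalOperator, LinkedHashMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        // Layer names here are made up for the example.
        List<LayerId> layers = List.of(
            new LayerId("wf1", "aggregate", "localAgg"),
            new LayerId("wf1", "aggregate", "globalAgg"),
            new LayerId("wf1", "filter", "main"));
        groupByLogicalOperator(layers).forEach((op, physical) ->
            System.out.println(op.operatorId() + " -> " + physical.size() + " physical operator(s)"));
    }
}
```

Keying plan-level structures by the three-part identifier keeps the two physical operators of one aggregate distinct, while the two-part identifier remains available when a lookup by logical operator is needed.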
yangzhang75 pushed a commit to yangzhang75/texera that referenced this pull request Jun 22, 2026
…Plan implementation (apache#1807)