-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GLUTEN-8479][CORE][Part-3] Split backend configs to its corresponding modules #8586
base: main
Are you sure you want to change the base?
Conversation
@jackylee-ch @baibaichen Would you please take a look, thanks! |
Run Gluten Clickhouse CI on x86 |
It seems that the current PR is about splitting config with |
@jackylee-ch Move the configuration of a specific backend to its own module. The backendType is used to identify the configuration type and has nothing to do with where the configuration is located. It's not certain whether dividing into multiple configuration classes makes sense. |
Prefer putting the configs in GlutenConfigs and using |
I am inclined to placing them in separated modules. But also would like to hear more views here. |
@jackylee-ch @zhztheplayer If separated, the final state seems quite good. The backend-related configurations will only be used by something like VeloxBackendApi, making it clearer. I tend to separate and see if anyone else has any ideas. |
Run Gluten Clickhouse CI on x86 |
BTW, Is our goal to split the configuration of all modules or just the backends? I only see backend config splitting in this PR. |
Sorry for the confusion caused to you. the purpose of this PR is to split the backend configuration to its corresponding module. I have also modified the PR title and description. |
Run Gluten Clickhouse CI on x86 |
import org.apache.gluten.execution.ColumnarNativeIterator | ||
import org.apache.gluten.memory.CHThreadGroup | ||
import org.apache.gluten.vectorized._ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line should not be removed
import org.apache.spark._ | ||
import org.apache.spark.scheduler.MapStatus | ||
import org.apache.spark.shuffle.celeborn.CelebornShuffleHandle | ||
import org.apache.spark.sql.vectorized.ColumnarBatch | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
Run Gluten Clickhouse CI on x86 |
@@ -80,10 +79,10 @@ class CHCelebornColumnarShuffleWriter[K, V]( | |||
nativeBufferSize, | |||
capitalizedCompressionCodec, | |||
compressionLevel, | |||
GlutenConfig.get.chColumnarShuffleSpillThreshold, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the use of GlutenConfig.get
after the refactor? I prefer a uniform API. It requires either velox conf or CH conf to be covered by GlutenConfig at runtime somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that is to say, the definition of the configuration is independent, but all the acquisitions are from GlutenConfig?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can make VeloxConfig
inherit from GlutenConfig
? Then VeloxConfig
can be used uniformly in Velox backend for both Velox / Common configurations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation is an inheritance relationship.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, then I feel it's already convenient to developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that is to say, the definition of the configuration is independent, but all the acquisitions are from GlutenConfig?
@yikf, yes, it's not a mature idea. Maybe, firstly keep using your current changes.
@baibaichen, please take a look. |
@yikf There are some code error for ch backend, could you fix it? |
Run Gluten Clickhouse CI on x86 |
Run Gluten Clickhouse CI on x86 |
1 similar comment
Run Gluten Clickhouse CI on x86 |
Run Gluten Clickhouse CI on x86 |
What changes were proposed in this pull request?
Fix #8479, This PR aims to split the backend configurations into individual modules that belong to each other.
This PR mainly accomplishes two things:
How was this patch tested?
GA.