Release Notes - Gluten version 1.3.0
Highlights
- Spark 3.2.2/3.3.1/3.4.3 (upgraded)/3.5.2 (upgraded)
- 268+ Spark functions supported, including JSON functions
- Update OAP's Velox codebase to 2025/01/07
- Join: Sort Merge Join support
- Shuffle: Sort based Shuffle(Row)
- Query Plan: RAS Optimization
- Data lake: Hudi 0.15.0/Iceberg 1.5.0/Delta 3.2.0 support
- RSS: Celeborn 0.5.2/Uniffle 0.9.1
- File Format: CSV support via Arrow
- JVM libhdfs with viewfs/kerberos support
- Partial Project(UDF) support
- Mix backend refactor
- Bucket write in partitioned Hive table
- CI/Nightly Package Tools Update
- Build & Compile Tools Update (vcpkg with static build is recommended)
- Fix several result mismatch issues
- Fix OOM/YARN kill stability issues
What's Changed
- [VL] Make velox writer queue size configurable @yikf github.com//pull/6341
- [VL] Remove useless ctx variable @gaoyangxiaozhu github.com//pull/6348
- [1632][CH]Daily Update Clickhouse Version (20240706) @kyligence-git github.com//pull/6359
- [VL] fix build bundle package @zhouyuan github.com//pull/6364
- [VL] Fix process_setup_alinux3 arrow CMakeLists.txt path @liujiayi771 github.com//pull/6363
- [VL] Daily Update Velox Version (2024_07_08) @GlutenPerfBot github.com//pull/6366
- [6262][CH]Json input format ignore key case @KevinyhZou github.com//pull/6263
- [6285][VL] Add debian10 vcpkg depends @wenwj0 github.com//pull/6286
- [CELEBORN] CelebornShuffleManager#stop should stop non-null _vanillaCelebornShuffleManager @SteNicholas github.com//pull/6371
- [VL] Update ubuntu docker to use cmake 3.28 @boneanxs github.com//pull/6373
- [6304][CH]Support array_join @KevinyhZou github.com//pull/6305
- [VL] Daily Update Velox Version (2024_07_09) @GlutenPerfBot github.com//pull/6376
- [6378][CH] Support delta count optimizer for the MergeTree format @zzcclp github.com//pull/6379
- [6345][CH] Deprecate SCALAR_FUNCTIONS SerializedPlanParser @lgbo-ustc github.com//pull/6347
- [TEST] Use project version rather than Gluten version in Gluten it @ulysses-you github.com//pull/6385
- [6377][CH] Support window function percent_rank @lgbo-ustc github.com//pull/6386
- [VL] Minor refactor for ValueStream node construction and usage @Yohahaha github.com//pull/6382
- [VL] Enable levenshtein function @zhli1142015 github.com//pull/6389
- [VL] Daily Update Velox Version (2024_07_10) @GlutenPerfBot github.com//pull/6384
- [1632][CH]Daily Update Clickhouse Version (20240710) @kyligence-git github.com//pull/6383
- Test input_file_name, input_file_block_start & input_file_block_length when scan falls back @gaoyangxiaozhu github.com//pull/6318
- [6394][VL] Fix the vcpkg package script @weixiuli github.com//pull/6395
- [6288][CH] Support BroadcastNestedLoopJoinExec [Part one] @loneylee github.com//pull/6290
- [CELEBORN] Rename CelebornHashBasedColumnarShuffleWriter to CelebornColumnarShuffleWriter @kerwin-zk github.com//pull/6391
- [VL] Fix E function fallback issue in some conditions @gaoyangxiaozhu github.com//pull/6397
- [CI] Fix centos7 failure @marin-ma github.com//pull/6404
- [1632][CH]Daily Update Clickhouse Version (20240711) @kyligence-git github.com//pull/6399
- [CELEBORN] Add compression for row-based shuffle @kerwin-zk github.com//pull/6380
- [VL] Daily Update Velox Version (2024_07_11) @GlutenPerfBot github.com//pull/6400
- [CORE] Remove local sort for TopNRowNumber @ulysses-you github.com//pull/6381
- [VL] Spark assert_true function support @gaoyangxiaozhu github.com//pull/6329
- [VL] Add schema validation for all operators @zhli1142015 github.com//pull/6406
- [CORE] Minor code cleanups against fallback tagging @zhztheplayer github.com//pull/6320
- [VL] Try to find arrow libs from velox bundled path firstly @PHILO-HE github.com//pull/6413
- [VL] disable tpch benchmarks on comment/merge @zhouyuan github.com//pull/6402
- [UT] Add a tool to validate any unary expression with all its accepted types @PHILO-HE github.com//pull/6392
- [CH] Fix a source file name typo @zhztheplayer github.com//pull/6412
- [VL] Fix Pi function fallback issue in some conditions @gaoyangxiaozhu github.com//pull/6408
- [CELEBORN] VeloxCelebornColumnarBatchSerializer uses the key and default value of SHUFFLE_COMPRESS to check whether to compress shuffle output @SteNicholas github.com//pull/6414
- [VL] Quick fix for commit conflicts @zhztheplayer github.com//pull/6418
- [Doc] Update new supported spark functions @gaoyangxiaozhu github.com//pull/6423
- [VL] Add a test to validate substring_index @boneanxs github.com//pull/6393
- [VL] Fix shuffle spill triggered by evicting buffers during stop @marin-ma github.com//pull/6422
- [VL] Enable repeat function @zhli1142015 github.com//pull/6419
- [VL] Accelerate Arrow compile @jinchengchenghh github.com//pull/6426
- [CI][VL] Update docker image for CI @zhouyuan github.com//pull/6401
- [VL] Daily Update Velox Version (2024_07_12) @GlutenPerfBot github.com//pull/6417
- [VL] Daily Update Velox Version (2024_07_13) @GlutenPerfBot github.com//pull/6436
- [VL] Daily Update Velox Version (2024_07_14) @GlutenPerfBot github.com//pull/6441
- [VL] Set Arrow_SOURCE to AUTO to allow using system arrow libs @PHILO-HE github.com//pull/6325
- [CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas github.com//pull/6432
- [VL] Make sure the same thrift lib bundled in arrow build is used for building Velox @zhztheplayer github.com//pull/6431
- [CORE] Make SparkSession transient in HiveTableScanExecTransformer @yikf github.com//pull/6410
- [6176][CH] Add tpcds suite from decimal table schema @loneylee github.com//pull/6369
- [VL] Move dependencies setup ahead @PHILO-HE github.com//pull/6444
- [CH][CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas github.com//pull/6454
- [VL] Enable right and anti join smj @JkSelf github.com//pull/6449
- [CH][CELEBORN] CHCelebornColumnarBatchSerializer uses AtomicBoolean to identify whether to call close() to avoid calling close() twice situation @SteNicholas github.com//pull/6455
- [CI][VL] Re-enable a build job running on clean dockers weekly @PHILO-HE github.com//pull/6424
- [CORE] Update LICENSE, NOTICE, LICENSE-binary, NOTICE-binary @weiting-chen github.com//pull/6443
- [CORE] Change DISCLAIMER to DISCLAIMER-WIP @weiting-chen github.com//pull/6442
- [VL] RAS: Minor code cleanup for offloading project @zhztheplayer github.com//pull/6452
- [VL] Add a way to create static build with docker container and gluten-te @zhztheplayer github.com//pull/6457
- [6467][CH] Minor Fix Build @baibaichen github.com//pull/6468
- [VL] Minor improvements and fixes for gluten-it and gluten-te @zhztheplayer github.com//pull/6471
- [CORE] Fix fallback for spark sequence function with literal array data as input @gaoyangxiaozhu github.com//pull/6433
- [VL] Fix offload input_file_name assert error @zml1206 github.com//pull/6390
- [VL] update docker image for cache-native-lib job @yma11 github.com//pull/6466
- [BUILD] Fix unbound variable @zml1206 github.com//pull/6474
- [VL] Daily Update Velox Version (2024_07_16) @GlutenPerfBot github.com//pull/6460
- [6437][BUILD] Fix vcpkg setup-build-dependens.sh for centos @wecharyu github.com//pull/6438
- [6470][CH]Fix Task not serializable error when inserting mergetree data @zzcclp github.com//pull/6473
- [6425][CH] Support day time interval @lgbo-ustc github.com//pull/6456
- [VL] remove redundant code in parquet datasource to avoid memory leakage PR6430 @liujp github.com//pull/6462
- [Core] Spark version function support @gaoyangxiaozhu github.com//pull/6469
- [VL] Daily Update Velox Version (2024_07_17) @GlutenPerfBot github.com//pull/6479
- [VL] Minor improvements on gluten-it / gluten-te toolchains @zhztheplayer github.com//pull/6476
- [CH] Support merge MergeTree files @liuneng1994 github.com//pull/6472
- [6463][CH]refactor the code of parsing join parameters @lgbo-ustc github.com//pull/6485
- [1632][CH]Daily Update Clickhouse Version (20240718) @kyligence-git github.com//pull/6491
- [VL] Daily Update Velox Version (2024_07_18) @GlutenPerfBot github.com//pull/6492
- [6495][VL] Fix build issue: --build_arrow=ON wipes --build_type= setting silently @PHILO-HE github.com//pull/6498
- [VL] RAS: Make default rough cost model exhaustively offload computations @zhztheplayer github.com//pull/6493
- [VL] Print exception early when raised from ManagedReservationListener#unreserve @zhztheplayer github.com//pull/6504
- [VL] Fix broken GHA CI cache, add cache for Centos 8 build @zhztheplayer github.com//pull/6497
- [CORE] Prevent CH backend from referencing arrow-gluten Jars @zhztheplayer github.com//pull/6494
- [VL] Oops, a minor follow-up to #6497 @zhztheplayer github.com//pull/6516
- [1632][CH]Daily Update Clickhouse Version (20240719) @kyligence-git github.com//pull/6511
- [6067][VL] [Part 3-1] Refactor: Rename VeloxColumnarWriteFilesExec to ColumnarWriteFilesExec @baibaichen github.com//pull/6403
- [VL] Daily Update Velox Version (2024_07_19) @GlutenPerfBot github.com//pull/6512
- [DOC] Update documents @PHILO-HE github.com//pull/6344
- [VL] Ensure sed pattern can be matched when modifying velox setup scripts @PHILO-HE github.com//pull/6505
- [6463][CH] Enable cartesian product @lgbo-ustc github.com//pull/6510
- [6523][VL] fix: Remove fix for stringop-overflow warning alinux3 @majetideepak github.com//pull/6522
- [6501][VL] Fix the missing fileReadProperties when constructing a LocalFilesNode @kecookier github.com//pull/6503
- [VL][DOC] Add uniffle doc @summaryzb github.com//pull/6533
- [6529] Fix build error on macOS caused by ConfigArrow.cmake @xumingming github.com//pull/6530
- [6509] enable read iceberg table with timestamptz as partitioned column. @j7nhai github.com//pull/6508
- [VL] Add thread_safe to several VeloxRuntime classes @FelixYBW github.com//pull/6526
- [VL] Fix weekly build job @PHILO-HE github.com//pull/6543
- [MINOR] Add ep/_ep/ to .gitignore @wForget github.com//pull/6547
- [6534] [VL] Fix ObjectStore::stores initialized twice issue @xumingming github.com//pull/6549
- [6499][CH] Support soft affinity for mergetree @loneylee github.com//pull/6545
- [6535] Make helper scripts executable @xumingming github.com//pull/6536
- [VL] Daily Update Velox Version (2024_07_23) @zhztheplayer github.com//pull/6552
- [VL] Fix for centos9 build of Gluten @deepashreeraghu github.com//pull/6183
- [MINOR][VL] Remove duplicate mvn packages @wForget github.com//pull/6560
- [VL] Following #6526, minor fixes and improvements @zhztheplayer github.com//pull/6554
- [VL] Row based sort shuffle implementation @marin-ma github.com//pull/6475
- [6562][VL] Decouple BUILD_BENCHMARKS and BUILD_TESTS build options @NEUpanning github.com//pull/6563
- [VL] Daily Update Velox Version (2024_07_24) @GlutenPerfBot github.com//pull/6567
- [VL] Add config for show velox task metrics when finished @Yohahaha github.com//pull/6573
- [6477][VL] Fix occasional dead lock during spilling @zhztheplayer github.com//pull/6515
- [VL] Move setup-centos7.sh & setup-centos8.sh into Gluten and clean up some script code @PHILO-HE github.com//pull/6559
- [VL] Daily Update Velox Version (2024_07_25) @zhztheplayer github.com//pull/6582
- [VL] Update to add centos9 for weekly build @deepashreeraghu github.com//pull/6580
- [VL] Update Velox Version (2024_07_25-1) @zhztheplayer github.com//pull/6584
- [VL] Enable timestamp parquet write @JkSelf github.com//pull/6428
- [6067][CH] [Part 3-2] Basic support for Native Write Spark 3.5 @baibaichen github.com//pull/6586
- [VL] Gluten-it: --data-gen-strategy=once to skip generating data when it already exists @zhztheplayer github.com//pull/6587
- [6589][CH] Mergetree supported spark.sql.caseSensitive @loneylee github.com//pull/6592
- [Minor] Move a test from spark-3.2 module to a common test module @PHILO-HE github.com//pull/6585
- [6604][CH] Fix mergetree partition with whitespace error @loneylee github.com//pull/6605
- [6195][VL]Add unit tests for udf @NEUpanning github.com//pull/6603
- [VL] Minor: Remove deprecated GHA jobs @PHILO-HE github.com//pull/6606
- [VL] Daily Update Velox Version (2024_07_26) @GlutenPerfBot github.com//pull/6597
- [1632][CH]Daily Update Clickhouse Version (20240727) @kyligence-git github.com//pull/6611
- [VL] Fix std::min params type mismatch in Apple clang 15 @zml1206 github.com//pull/6593
- [6544][CH] Support existence join @lgbo-ustc github.com//pull/6548
- [VL] Improve package scripts @wForget github.com//pull/6569
- [VL] Enable timestamp and binary type for HLL agg function @zhli1142015 github.com//pull/6619
- [VL] Expose API SparkMemoryUtil.dumpMemoryManagerStats(tmm: TaskMemoryManager) for debugging purpose @zhztheplayer github.com//pull/6617
- [VL] Row based sort follow-up @marin-ma github.com//pull/6579
- [VL] Daily Update Velox Version (2024_07_29) @zhztheplayer github.com//pull/6616
- [VL] Daily Update Velox Version (2024_07_30) @GlutenPerfBot github.com//pull/6626
- [6583][CH] Fix a bug serializing aggregating keys which are complicated types @lgbo-ustc github.com//pull/6624
- [VL] Row-based sort shuffle follow-up (minor) @marin-ma github.com//pull/6628
- [MINOR] Reduce unnecessary dependencies @wForget github.com//pull/6608
- [VL] Enable collect_set, min, max for complex types @zhli1142015 github.com//pull/6629
- [VL] Spark mask function support @gaoyangxiaozhu github.com//pull/6271
- [CH] Refactor off heap memory management, clean shuffle write code @liuneng1994 github.com//pull/6558
- [6561][CH] Fix incompatible type exception thrown in capture function while processing array literal with transform @taiyang-li github.com//pull/6601
- [6632] Bump Celeborn 0.4.2 and 0.5.1 @SteNicholas github.com//pull/6633
- [1632][CH]Daily Update Clickhouse Version (20240730) @kyligence-git github.com//pull/6640
- [VL] Daily Update Velox Version (2024_07_31) @GlutenPerfBot github.com//pull/6643
- [VL] Reduce spill in sort-based shuffle @marin-ma github.com//pull/6639
- [6612] Fix ParquetFileFormat issue caused by the setting of local property isNativeApplicable @PHILO-HE github.com//pull/6627
- [VL] Support Sum(Literal)/Count(Literal) with empty input schema @zml1206 github.com//pull/6631
- [6067][CH][MINOR][UT] Pass backends-clickhouse ut Spark 3.5 @baibaichen github.com//pull/6623
- [VL] Support row type and fix subfield filter push-down @rui-mo github.com//pull/6618
- [6610] Update clickhouse.md due to upgrade to clang-18 @lwz9103 github.com//pull/6654
- [6590][CH] Support compact mergetree file on s3 @lwz9103 github.com//pull/6591
- [VL] Daily Update Velox Version (2024_08_01) @GlutenPerfBot github.com//pull/6664
- [VL] Gluten-it: --auto-cluster-resource to automatically set up CPU cores and memory sizes for local cluster @zhztheplayer github.com//pull/6655
- [6600][VL] Support date type window range frame @zml1206 github.com//pull/6653
- [CH] Fix some test cases too slow @liuneng1994 github.com//pull/6659
- [6656][CELEBORN] Fix CelebornColumnarShuffleWriter assertion failed @exmy github.com//pull/6657
- [1632][CH]Daily Update Clickhouse Version (20240801) @kyligence-git github.com//pull/6665
- [CORE] Propagate SQLConf to code block of TaskResources.runUnsafe @zhztheplayer github.com//pull/6658
- [INFRA] Support automatically label new pull requests @ulysses-you github.com//pull/6668
- [6483] Support Uniffle 0.9.0 @SteNicholas github.com//pull/6484
- [CH] Hotfix a configuration bug in shuffle writer @liuneng1994 github.com//pull/6677
- [VL] Set default validation log level to WARN @yma11 github.com//pull/6676
- [6483][VL][DOC] Upgrade Uniffle version to 0.9.0 in Velox.md @SteNicholas github.com//pull/6680
- [Core] Make collectQueryExecutionFallbackSummary as a public util method @wForget github.com//pull/6679
- [6557][CH] Try to replace sort merge join with hash join when cannot offload it @lgbo-ustc github.com//pull/6570
- [VL] Daily Update Velox Version (2024_08_02) @GlutenPerfBot github.com//pull/6684
- [6645][VL] Remove VeloxWriteQueue which may introduce deadlock @WangGuangxin github.com//pull/6646
- [VL] Recover broken memory-trace option spark.gluten.backtrace.allocation @zhztheplayer github.com//pull/6635
- [VL] Allow specifying maximum batch size for batch resizing @zhztheplayer github.com//pull/6670
- [VL] Eliminate pre local sort after offload date type range frame window @zml1206 github.com//pull/6667
- [6701][CH] fix: Performance regression at 20240802 daily build @baibaichen github.com//pull/6702
- [1632][CH]Daily Update Clickhouse Version (20240803) @kyligence-git github.com//pull/6700
- [CH] Support CACHE DATA command for MergeTree table @liuneng1994 github.com//pull/6621
- [6695][CH] Introduce shuffleWallTime CHMetricsApi to calculate the overall shuffle write time @SteNicholas github.com//pull/6696
- [6588][CH] Cast columns if necessary before finally writing to ORC/Parquet files during native inserting @taiyang-li github.com//pull/6691
- [VL] Remove redundant hash function substrait function validation @jinchengchenghh github.com//pull/6690
- [6656][UNIFFLE] VeloxUniffleColumnarShuffleWriter should send commit for all ColumnBatch with empty rows @SteNicholas github.com//pull/6698
- [CH] Fix debug building error @taiyang-li github.com//pull/6710
- [VL] Daily Update Velox Version (2024_08_05) @GlutenPerfBot github.com//pull/6708
- [VL] Fix out-of-date centos 7 image in velox_docker_cache.yml @zhztheplayer github.com//pull/6719
- [VL] Daily Update Velox Version (2024_08_06) @GlutenPerfBot github.com//pull/6717
- [CORE] Bump version to 1.3.0-SNAPSHOT @PHILO-HE github.com//pull/6607
- [6669][CH] Fix diff of cast string to boolean @exmy github.com//pull/6711
- [6531] Minor polish for metrics code @xumingming github.com//pull/6532
- [VL] Use conf to control C2R occupied memory @XinShuoWang github.com//pull/5952
- [1632][CH]Daily Update Clickhouse Version (20240806) @kyligence-git github.com//pull/6718
- [6686][CH] Disable percent_rank @lgbo-ustc github.com//pull/6687
- [VL] Add reader process to shuffle benchmark @marin-ma github.com//pull/6682
- [VL] Minor follow-ups for PRs @zhztheplayer github.com//pull/6693
- [VL] Allow udf type conversion @marin-ma github.com//pull/6660
- [VL] Use the scripts dir in the current path as SCRIPTDIR @liujiayi771 github.com//pull/6729
- [6561][CH] Fix exception when mapFromArrays accepts its first argument with type Array(Nullable(T)) @taiyang-li github.com//pull/6721
- [VL] Doc: Update RSS docs @wForget github.com//pull/6692
- [6589][CH] Mergetree supported spark.sql.caseSensitive[Part.2] @loneylee github.com//pull/6733
- [VL] Skip UTF-8 validation in JSON parsing @PHILO-HE github.com//pull/6661
- [CH] Fix memory config spill_mem_ratio always zero @liuneng1994 github.com//pull/6743
- [VL] Fix arrow lib conflict on centos-9 @PHILO-HE github.com//pull/6742
- [VL] Reduce memory waste in sort based shuffle @marin-ma github.com//pull/6727
- [CORE] Fix schema mismatch between ReadRelNode and LocalFilesNode @jiangjiangtian github.com//pull/6746
- [6736] Phase 1: Use task-shared lock in ManagedReservationListener @zhztheplayer github.com//pull/6741
- [6569][FOLLOWUP][VL] Delete unnecessary gcc9 enable of package script @wForget github.com//pull/6730
- [VL] Hot fix - mistakenly changed debug log @marin-ma github.com//pull/6751
- [6681] [CH]fix array(decimal32) CH columnar to row @loudongfeng github.com//pull/6722
- [6705] [CORE] [Part 1] Avoid adding c2r for ColumnarWriteFilesExec, since it neither output Columnar batch data nor InternalRow @baibaichen github.com//pull/6745
- [VL] Use Velox's monolithic build @PHILO-HE github.com//pull/6731
- [CELEBORN][FOLLOWUP] Add compression for row-based shuffle @kerwin-zk github.com//pull/6739
- [1632][CH]Daily Update Clickhouse Version (20240808) @kyligence-git github.com//pull/6755
- [VL] Daily update velox version 08-08 @jinchengchenghh github.com//pull/6752
- [VL] Daily Update Velox Version (2024_08_09) @GlutenPerfBot github.com//pull/6762
- [VL] Minor class name / package name clean-ups @zhztheplayer github.com//pull/6720
- [VL] Add a new test case for FlushableHashAggregateRule's coverage @zhztheplayer github.com//pull/6757
- [Minor] Clean up useless code in ParquetFileFormat/OrcFileFormat @PHILO-HE github.com//pull/6663
- [6737]package delta into bundle jar when specify delta profile @dcoliversun github.com//pull/6738
- [6705][CORE][VL][CH] [Part-2] Rework ColumnarWriteFilesExec @baibaichen github.com//pull/6761
- [VL] Fix shuffle spill not reported to spark metric @marin-ma github.com//pull/6740
- [6750][CH] Fix optimize error if file mappings not loaded @lwz9103 github.com//pull/6753
- [MINOR][CH] Rename package of some extension rules @zml1206 github.com//pull/6747
- [MINOR] update repository first setup-ubuntu envs @wecharyu github.com//pull/6749
- [VL] RAS: Renew validator instance for each rule applier call @zhztheplayer github.com//pull/6766
- [1632][CH]Daily Update Clickhouse Version (20240809) @kyligence-git github.com//pull/6764
- [6388][CH] Support function format @taiyang-li github.com//pull/6716
- [VL] RAS: Add a new built-in cost model that avoids offloading trivial projects if its neighbor nodes fell back @zml1206 github.com//pull/6756
- [6768][CH] Clear mixed join condition to avoid unnecessary data copy @lgbo-ustc github.com//pull/6769
- [VL] Fix High Precision Rounding @ArnavBalyan github.com//pull/6707
- [6705][CH] Basic Support Delta write @baibaichen github.com//pull/6767
- [VL] Add Scala 2.13 support @Preetesh2110 github.com//pull/6326
- [VL] Daily Update Velox Version (2024_08_10) @GlutenPerfBot github.com//pull/6771
- [VL] Daily Update Velox Version (2024_08_11) @GlutenPerfBot github.com//pull/6775
- [VL] Fall back scan if file scheme is not supported by registered file systems @zhli1142015 github.com//pull/6672
- [6778][CH] Enable percent_rank again @lgbo-ustc github.com//pull/6779
- [6724][CH] Shuffle writer supports compression level configuration for CompressionCodecFactory @SteNicholas github.com//pull/6725
- [6506][CH]Fix ORC read wrong timestamp value @KevinyhZou github.com//pull/6507
- [VL] Update a dockerfile used for CI vcpkg build @PHILO-HE github.com//pull/6781
- [6736][VL] Phase 2: Minimize lock scope in ListenableArbitrator @zhztheplayer github.com//pull/6783
- [6674][CH] Support sort merge join metrics @SteNicholas github.com//pull/6774
- [CORE] Following #6745, append some minor code cleanups @zhztheplayer github.com//pull/6788
- [VL] Daily update velox version(2024_08_13) @jinchengchenghh github.com//pull/6794
- [VL] Fix arrow dataset csv scan IncompatibleClassChangeError @jinchengchenghh github.com//pull/6785
- [VL] Fix parquet write sort spill OOM @jinchengchenghh github.com//pull/6480
- [6148][CORE] Simplify JniLibLoader loading mechanism for native libraries @ArnavBalyan github.com//pull/6791
- [VL] Daily Update Velox Version (2024_08_14) @GlutenPerfBot github.com//pull/6821
- [VL] Enable an integration test case CI OOM tests @zhztheplayer github.com//pull/6804
- [VL] Add shuffle writer type to ColumnarExchange display string @marin-ma github.com//pull/6799
- [6768][CH] Try to reorder hash join tables based on AQE statistics @lgbo-ustc github.com//pull/6770
- [6600]Fix NPE issue when running window sql @JkSelf github.com//pull/6803
- [CORE] Remove fixed 1.8 Java compiler version in module gluten-ut-common (#6825) @zhztheplayer github.com//pull/6825
- [VL] Remove lz4 change in modify_velox.patch @jinchengchenghh github.com//pull/6824
- [VL] Validate binary expressions with their accepted types @PHILO-HE github.com//pull/6521
- [6819][CH] Refactor source from java iter && make casting happen before materializing @taiyang-li github.com//pull/6830
- [VL] Enable full functionality of split function @rui-mo github.com//pull/4752
- [VL] No need to obtain old shrunken memory @boneanxs github.com//pull/6847
- [6834][CORE] Remove unused DDL plan that doesn't correspond to Substrait spec @EpsilonPrime github.com//pull/6833
- [CORE] Remove an unused binary file @zhztheplayer github.com//pull/6838
- [6860][CH] Minor refactors on expand operator @taiyang-li github.com//pull/6861
- [VL] Verify empty2null is offloaded when v1writer fallback @Yohahaha github.com//pull/6859
- [VL] Update document for split and mask functions @gaoyangxiaozhu github.com//pull/6858
- [1632][CH]Daily Update Clickhouse Version (20240815) @kyligence-git github.com//pull/6848
- [VL] Add a docker build job and reuse pre-built arrow libs @PHILO-HE github.com//pull/6826
- [VL] Daily Update Velox Version(2024_08_15) @jinchengchenghh github.com//pull/6851
- [VL] Remove suspend section when spilling Velox task @zhztheplayer github.com//pull/6875
- [6768][CH] Try to use multi join on clauses instead of inequal join condition @lgbo-ustc github.com//pull/6787
- [6822][VL] Fix wrong maxRowsToInsert and sort time metrics @marin-ma github.com//pull/6832
- [VL] Fix warning when spark.gluten.sql.columnarToRowMemoryThreshold is not set @zhztheplayer github.com//pull/6866
- [VL] Fix Arrow ColumnarBatch cannot revoke rowIterator correctly @jinchengchenghh github.com//pull/6797
- [6768][CH] Refactor reordering shuffle hash join tables @lgbo-ustc github.com//pull/6854
- [6819][CH] HOTFIX variable shadow in source from java iter @taiyang-li github.com//pull/6885
- [6879][CH] Fix partition value diff when it contains blank spaces @taiyang-li github.com//pull/6880
- [6878][CH] Avoid name collisions naming aggregate result @lgbo-ustc github.com//pull/6886
- [6887][VL] Daily Update Velox Version (2024_08_16) @GlutenPerfBot github.com//pull/6872
- [6067][CH][MINOR][UT] Followup 6623, fix backends-clickhouse ut issue in CI @baibaichen github.com//pull/6891
- [1632][CH]Daily Update Clickhouse Version (20240817) @kyligence-git github.com//pull/6903
- [6827][VL] Add a new test case for Round's coverage @jiangjiangtian github.com//pull/6884
- [6849][VL] Call static initializers once in Spark local mode / when session is renewed @zhztheplayer github.com//pull/6855
- [6889][VL] Rename test class TestOperator to MiscOperatorSuite @zhztheplayer github.com//pull/6890
- [6915][MISC]Fix workflow permission issue. @weiting-chen github.com//pull/6911
- [VL][CI] Change to use push event to trigger docker build workflow @PHILO-HE github.com//pull/6918
- [6368] Redact sensitive configs when calling gluten::printConfig @ArnavBalyan github.com//pull/6793
- [6887][VL] Daily Update Velox Version (2024_08_19) @GlutenPerfBot github.com//pull/6910
- [CH] A simple job scheduler for merge tree cache sync load @liuneng1994 github.com//pull/6842
- [6915][CORE]Fix listComments TypeError @weiting-chen github.com//pull/6919
- [VL] Add helper function ColumnarBatches.toString and InternalRow toString @jinchengchenghh github.com//pull/6458
- [6887][VL] Daily Update Velox Version (2024_08_20) @GlutenPerfBot github.com//pull/6928
- [CH]duplicate column name case support broadcast join #6926 @loudongfeng github.com//pull/6927
- [3582][CH] Fix bug for decimal and float type @baibaichen github.com//pull/6925
- [6915][CH] Follow VL, fix github issue comment @lwz9103 github.com//pull/6922
- [1632][CH]Daily Update Clickhouse Version (20240820) @kyligence-git github.com//pull/6929
- [6902][VL]fix: Update to copy new LICENSE, NOTICE into jar @weiting-chen github.com//pull/6901
- [6882][CORE] Move Spark / columnar rule list to backend code @zhztheplayer github.com//pull/6931
- [6864][VL] Set a Velox gflag to allow growing buffer created in another Velox task @zhztheplayer github.com//pull/6932
- [6893][VL] Change to using native libs generated by vcpkg build in Gluten scala tests @PHILO-HE github.com//pull/6894
- [6935][CH]query fails when set session level join_algorithm to… @loudongfeng github.com//pull/6944
- [6887][VL] Daily Update Velox Version (2024_08_21) @GlutenPerfBot github.com//pull/6946
- [CH] Added cleanup logic for expiration mergetree part cache @liuneng1994 github.com//pull/6955
- [6840][CH] Enable cache files for hdfs @loneylee github.com//pull/6841
- [VL] Print memory statistics during task ending when leak is found @zhztheplayer github.com//pull/6959
- [6923][CH] total_bytes_written is not updated in celeborn partition writers @lgbo-ustc github.com//pull/6939
- [6950][CORE] Move specific rules into backend modules @zhztheplayer github.com//pull/6953
- [6908][VL] Fix error when getting output from a Velox task that is under spilling by background thread @zhztheplayer github.com//pull/6934
- [VL] Malformed CI job name @zhztheplayer github.com//pull/6956
- [6893][VL] Fix wrong github workflows path for hashing and minor code refactor @PHILO-HE github.com//pull/6952
- [5936][VL] Add more types function type validation and document Cast function @PHILO-HE github.com//pull/6963
- [CH] Ignore cache file with hdfs suite @loneylee github.com//pull/6969
- [6938][CH] Fix core dump when range partition include literal @baibaichen github.com//pull/6964
- [6957][VL] Fix missing mvn when CI cache is hit @PHILO-HE github.com//pull/6966
- [6957][VL] Fix mvn not found in cache job @PHILO-HE github.com//pull/6974
- [6887][VL] Daily Update Velox Version (2024_08_22) @GlutenPerfBot github.com//pull/6967
- [1632][CH]Daily Update Clickhouse Version (20240822) @kyligence-git github.com//pull/6968
- [6980][CORE] shim poms, use Scala Maven compiler configuration inherited from parent pom @zhztheplayer github.com//pull/6972
- [VL] Following #6959, leak memory dump is not correctly printed @zhztheplayer github.com//pull/6985
- [6987][DOC] fix: add shell newline character after spark-shell @dcoliversun github.com//pull/6986
- [1632][CH]Daily Update Clickhouse Version (20240823) @kyligence-git github.com//pull/6984
- [VL] Add wallnanos for WriteFiles @Yohahaha github.com//pull/6976
- [VL] Remove config a.g.s.c.extended.columnar.transform.rules and a.g.s.c.extended.columnar.post.rules from Velox backend @zhztheplayer github.com//pull/6991
- [6887][VL] Daily Update Velox Version (2024_08_23) @GlutenPerfBot github.com//pull/6983
- [6981][CH]Not supported operator TakeOrderedAndProjectExecTransformer for BroadcastRelation @loudongfeng github.com//pull/6982
- [6877][CH] Support anti/semi join with inequal join condition @lgbo-ustc github.com//pull/6913
- [VL] Support create temporary function for native hive udf @marin-ma github.com//pull/6829
- [6877][CH][UT] HotFix: Exclude unstable merge join q72 @baibaichen github.com//pull/7006
- [7008][VL] Report spill metrics from Velox operators to Spark task @zhztheplayer github.com//pull/7009
- [6960][VL] Limit Velox untracked global memory manager's usage @zhztheplayer github.com//pull/6988
- [6951][CORE][CH] Move CustomerExpressionTransformer to CH backend @zhztheplayer github.com//pull/6993
- [VL] Following #6988, move a warning from core to Velox backend @zhztheplayer github.com//pull/7010
- [VL][uniffle] Correct the write wait duration log @zuston github.com//pull/6994
- [6887][VL] Daily Update Velox Version (2024_08_26) @GlutenPerfBot github.com//pull/7002
- [6997][VL] Ignore a test: cleanup file if job failed @PHILO-HE github.com//pull/6965
- [6887][VL] Daily Update Velox Version (2024_08_27) @GlutenPerfBot github.com//pull/7018
- [CORE] Rename OASPackageBridges @zhztheplayer github.com//pull/7022
- [CH] Enable more uts in GlutenOrcV1SchemaPruningSuite @taiyang-li github.com//pull/6895
- [VL] Add write IO metrics for WriteFiles @Yohahaha github.com//pull/7011
- [7035][VL] Use first line of ls-remote's output as build's target commit @wForget github.com//pull/7036
- [6977][CH] Remove concat function parser @taiyang-li github.com//pull/6978
- [6995][Core] Limit soft affinity duplicate reading detection max cache items @zhli1142015 github.com//pull/7003
- [5471][VL]feat: Support read Hudi COW table @yma11 github.com//pull/6049
- [CORE] Fix incorrect precision of decimal literal @jiangjiangtian github.com//pull/6954
- [VL] Set Spark memory overhead automatically according to off-heap size when it's not explicitly configured @zhztheplayer github.com//pull/7045
- [VL] Remove including xsimd headers coming from velox build path @PHILO-HE github.com//pull/7044
- [7037][VL] Add dwarf dependency to folly when building with vcpkg @Z1Wu github.com//pull/7038
- [7033][VL] Improve vcpkg docker file @wForget github.com//pull/7030
- [4724][CH] Support function array_except @taiyang-li github.com//pull/7039
- [1632][CH]Daily Update Clickhouse Version (20240828) @kyligence-git github.com//pull/7040
- [CORE] Fix a variable name typo @ychris78 github.com//pull/7053
- [7049][VL] Install lib stemmer through vcpkg @PHILO-HE github.com//pull/7050
- [6989][CH] Support RTrim with const source column @lwz9103 github.com//pull/6992
- [7024][VL] Skip call collectMetrics when the task does not call next() @kecookier github.com//pull/7025
- [7014][CH] Fix: different results from get_json_object @lgbo-ustc github.com//pull/7034
- [7031][CORE] Initialize new module structure gluten-core / gluten-substrait @zhztheplayer github.com//pull/7057
- [6961][VL][feat] Add decimal write support for ArrowWritableColumnVector @jinchengchenghh github.com//pull/6962
- [1632][CH]Daily Update Clickhouse Version (20240830) @kyligence-git github.com//pull/7062
- [CH] Add GlutenJsonExpressionsSuite @exmy github.com//pull/7064
- [6887][VL] Daily Update Velox Version (2024_08_28) @GlutenPerfBot github.com//pull/7041
- [6887][VL] Daily Update Velox Version (2024_08_31) @GlutenPerfBot github.com//pull/7070
- [6887][VL] Daily Update Velox Version (2024_09_01) @GlutenPerfBot github.com//pull/7073
- [341][CH] Support BHJ + isNullAwareAntiJoin for the CH backend @zzcclp github.com//pull/7072
- [6887][VL] Daily Update Velox Version (2024_09_03) @GlutenPerfBot github.com//pull/7085
- [VL] Gluten-it: Remove a IDE-generated maven module name @zhztheplayer github.com//pull/7091
- [7031] Move task lifecycle management / memory consumer facilities to gluten-core @zhztheplayer github.com//pull/7088
- [7090][VL] fix: Number of sorting keys must be greater than zero @dcoliversun github.com//pull/7089
- [7054][CH] Fix cse alias issues @taiyang-li github.com//pull/7084
- [7077][CH] have_compressed is lost in HashJoin::reuseJoinedData @lgbo-ustc github.com//pull/7083
- [VL] Remove a limit for BHJ stage fallback policy @PHILO-HE github.com//pull/7105
- [7031] Move iterator wrappers to gluten-core @zhztheplayer github.com//pull/7095
- [6589][CH] Fix alias name cause caseSensitive error on mergetree create @loneylee github.com//pull/7063
- [6809][CH] Support function unix_seconds/unix_date/unix_micros/unix_millis @taiyang-li github.com//pull/7094
- [6887][VL] Daily Update Velox Version (2024_09_04) @GlutenPerfBot github.com//pull/7106
- [VL] Remove Spark tokenizer @rui-mo github.com//pull/6713
- [7068][CORE] Fix issue updating leaf input metrics @ivoson github.com//pull/7067
- [7015][VL] Remove udf native registration @marin-ma github.com//pull/7016
- [VL] Remove complex type fallback for parquet @yma11 github.com//pull/6712
- [6887][VL] Daily Update Velox Version (2024_09_05) @GlutenPerfBot github.com//pull/7119
- [7118][VL] Fix duckdb target issue when vcpkg is enabled @PHILO-HE github.com//pull/7117
- [5880][CORE] Ignore fallback for ColumnarWriteFilesExec children @wForget github.com//pull/7113
- [6813][CH] Support soundex function @taiyang-li github.com//pull/7093
- [6748][CORE] Search stack trace to infer adaptive execution context @PHILO-HE github.com//pull/7121
- [6571][VL] Add platform and arch subdirectory for base lib package @wForget github.com//pull/6942
- [6863][VL] Pre-alloc and reuse compress buffer to avoid OOM spill @marin-ma github.com//pull/6869
- [7130][CORE] Skip command execution when collect qe fallback summary @wForget github.com//pull/7132
- [6887][VL] Daily Update Velox Version (2024_09_06) @GlutenPerfBot github.com//pull/7136
- [7031][CORE] Move JNI / exception utilities to gluten-core @zhztheplayer github.com//pull/7134
- [VL] CI: Run Q97 oom test but ignore the failure @zhztheplayer github.com//pull/7135
- [VL] New option to follow vanilla Spark's build side shuffled hash join @zhztheplayer github.com//pull/7133
- [VL] Minor follow-ups for #6942 @zhztheplayer github.com//pull/7129
- [CH] Add package with spark 3.5 @loneylee github.com//pull/7140
- [7028][CH][Part-1] Using PushingPipelineExecutor to write merge tree @baibaichen github.com//pull/7029
- [4039][VL] Support array insert function for spark 3.4+ @ivoson github.com//pull/7123
- [7032][CH] Fix incorrect result using timestamp in-filter @lwz9103 github.com//pull/7122
- [7004][CORE] Bump Spark version to 3.4.3 @Yohahaha github.com//pull/7115
- [VL] Fix function input_file_name() outputs empty string in certain query plan patterns @zml1206 github.com//pull/7124
- [1632][CH]Daily Update Clickhouse Version (20240906) @kyligence-git github.com//pull/7137
- [7144][VL][RAS] Spark input file function support @zml1206 github.com//pull/7146
- [6834][CORE] feat: add other join types from the official Substrait @EpsilonPrime github.com//pull/6835
- [7148][CORE] Remove meaningless plan change log on TransformPreOverrides rule @wecharyu github.com//pull/7150
- [7155][CH] Fix bucket table create error by mergetree @loneylee github.com//pull/7156
- [CH] Refactor: Move SerializedPlanParser::global_context to QueryContext @baibaichen github.com//pull/7147
- Revert "[6930][VL] Print memory statistics during task ending when leak is found" @zhztheplayer github.com//pull/7158
- [6887][VL] Daily Update Velox Version (2024_09_07) @GlutenPerfBot github.com//pull/7152
- [VL] CI: GHA CI, set timeout for Q97 OOM job @zhztheplayer github.com//pull/7162
- [7023][CH] Shade dependency jars @loudongfeng github.com//pull/7027
- [VL] Fix weekly build job failure @PHILO-HE github.com//pull/7163
- [7112][CH] Pushdown aggregation's pre-projection ahead expand node @lgbo-ustc github.com//pull/7142
- [CH] Minor, update package.sh @lwz9103 github.com//pull/7175
- [6808][CH] support function arrays_zip @taiyang-li github.com//pull/7048
- [6887][VL] Daily Update Velox Version (2024_09_10) @GlutenPerfBot github.com//pull/7172
- [7164][VL] Disable background IO threads by default @zhztheplayer github.com//pull/7165
- [VL][CI] Upgrade GHA upload/download artifacts @PHILO-HE github.com//pull/7182
- [6887][VL] Daily Update Velox Version (2024_09_11) @GlutenPerfBot github.com//pull/7190
- [VL][MINOR] Allow build_gluten_cpp read custom velox home @boneanxs github.com//pull/7184
- [7177][CH] Fix read hdfs performance issue @loneylee github.com//pull/7187
- [7180][CH] Fix ut "Eliminate NAAJ when BuildSide is HashedRelationWithAllNullKeys" for the CH backend when the aqe is on @zzcclp github.com//pull/7181
- [7179][CH] Fix infinite loop with parquet column index reader @lwz9103 github.com//pull/7185
- [CH] Shuffle writer connects to CH pipeline @liuneng1994 github.com//pull/6723
- [7100][CH] support function timestamp_seconds/timestamp_millis/timestamp_micros @taiyang-li github.com//pull/7102
- [MINOR][BUILD] Extract gcc version from libgluten.so @wecharyu github.com//pull/7189
- [CH] Fix load cache missing columns @liuneng1994 github.com//pull/7192
- [VL] Make conf option s.g.s.c.shuffledHashJoin.optimizeBuildSide work correctly with option s.g.s.c.forceShuffledHashJoin @zhztheplayer github.com//pull/7186
- [6887][VL] Daily Update Velox Version (2024_09_12) @GlutenPerfBot github.com//pull/7199
- [CORE] minor: Duplicated inheritance on SparkPlan @zml1206 github.com//pull/7197
- [VL] Add tests for Velox SMJ's coverage @zhztheplayer github.com//pull/7195
- [7087][CH] Support WindowGroupLimitExec @lgbo-ustc github.com//pull/7176
- [7202][CH] Fix: local executor cannot dump pipeline stats @lgbo-ustc github.com//pull/7204
- [7145][CH][PART]refactor for rel parsers @lgbo-ustc github.com//pull/7193
- [VL] Fix ioWaitTime metrics for scan @Yohahaha github.com//pull/7198
- [7205] [VL] Optimize row to column for scalar type @jinchengchenghh github.com//pull/7206
- [6816][CH] support function zip_with with some minor refactors @taiyang-li github.com//pull/7211
- [7208][VL]fix: loading libvelox.so failed when using static glog @JinHelin404 github.com//pull/7209
- [6887][VL] Daily Update Velox Version (2024_09_13) @rui-mo github.com//pull/7219
- [7031][VL] Minimized backend API @zhztheplayer github.com//pull/7218
- [7224][CH]Update doc for compiling ch backend @lgbo-ustc github.com//pull/7225
- [7224][CH] update doc for compiling ch backend @lgbo-ustc github.com//pull/7227
- [VL] Collapse trivial projects generated by rule PushDownInputFileExpression @zml1206 github.com//pull/7188
- [VL] Rename ShuffledHashJoinExecTransformer.scala to HashJoinExecTransformer.scala @PHILO-HE github.com//pull/7228
- [CORE] Minor code cleanup for package object of org.apache.gluten @zhztheplayer github.com//pull/7231
- [VL] RAS: Avoid adding R2C whose schema contains complex data types @zhztheplayer github.com//pull/7229
- [7222][CH] Fail to compile ch backend @lgbo-ustc github.com//pull/7226
- [7116] [CH] support outer explode @shuai-xu github.com//pull/7207
- [7224][CH]Update ClickHouse.md @lgbo-ustc github.com//pull/7230
- [6805][CH] support function array_remove/array_repeat @taiyang-li github.com//pull/7210
- [6887][VL] Daily Update Velox Version (2024_09_14) @GlutenPerfBot github.com//pull/7237
- [7241][VL] Correct loaded libname in SharedLibraryLoader @wForget github.com//pull/7245
- [CH] Support rocksdb disk metadata @liuneng1994 github.com//pull/7239
- [7243][VL] Fix Q97 cross-task spilling hangs @zhztheplayer github.com//pull/7244
- [7213][CORE] Make fallback reason for CheckOverflowInTableInsert clearer @wForget github.com//pull/7248
- [CH] Fix GlutenLiteralExpressionSuite and GlutenMathExpressionsSuite @taiyang-li github.com//pull/7235
- [6887][VL] Daily Update Velox Version (2024_09_18) @GlutenPerfBot github.com//pull/7259
- [7220][CH]Fix expand bug in grouping sets query @KevinyhZou github.com//pull/7221
- [7203][CORE] Make push down filter to scan as an individual rule @zml1206 github.com//pull/7215
- [1632][CH]Daily Update Clickhouse Version (20240918) @kyligence-git github.com//pull/7260
- [7264][CORE][VL] Reduce module dependencies of gluten-data @zhztheplayer github.com//pull/7265
- [7276][VL] Make fallback reason for GetStructField clearer @wForget github.com//pull/7277
- [6887][VL] Daily Update Velox Version (2024_09_19) @JkSelf github.com//pull/7272
- [7264][VL] Rename module gluten-data to gluten-arrow @zhztheplayer github.com//pull/7278
- [VL] Fix bug when setting Spark memory overhead automatically @leoluan2009 github.com//pull/7275
- [7262][CH] Fix cache file command run normal with config disabled @loneylee github.com//pull/7263
- [7028][CH][Part-2] Refactor: Move MergeTree related UT to mergetree module @baibaichen github.com//pull/7279
- [CORE][VL] Minor code cleanups @zhztheplayer github.com//pull/7280
- [VL] Fix columns added to outNames twice when building Substrait plan @zml1206 github.com//pull/7274
- [VL] Gluten-it: auto cluster mode, add option --off-heap-ratio for adjusting memory shares of off-heap and on-heap @zhztheplayer github.com//pull/7286
- [6768][CH] Clear unused configures after refactor reordering hash join tables @lgbo-ustc github.com//pull/7287
- [VL] CI: Q97 OOM test passed, stop ignoring its return code in CI @zhztheplayer github.com//pull/7294
- [1632][CH]Daily Update Clickhouse Version (20240920) @kyligence-git github.com//pull/7299
- [VL] Customize VCPKG build features according to user's build options @PHILO-HE github.com//pull/7052
- [VL] Remove unused config VELOX_FORCE_COMPLEX_TYPE_SCAN_FALLBACK @felipepessoto github.com//pull/7303
- [INFRA] Label gluten-hudi as DATA_LAKE @dcoliversun github.com//pull/7298
- [6887][VL] Daily Update Velox Version (2024_09_23) @GlutenPerfBot github.com//pull/7309
- [VL] Enhance spill log readability @Yohahaha github.com//pull/7300
- [6975][CH] Rewrite decimal arithmetic @loneylee github.com//pull/7196
- [VL] fix vcpkg package script #7052 @zhouyuan github.com//pull/7316
- [VL] Minor code cleanups @zhztheplayer github.com//pull/7312
- [VL] Minor follow-ups for #7052 @zhztheplayer github.com//pull/7315
- [7178][VL] Fix field not found error when struct field name contains upper case @zml1206 github.com//pull/7304
- [6887][VL] Daily Update Velox Version (2024_09_24) @GlutenPerfBot github.com//pull/7321
- [VL] Add VeloxTransitionSuite @zhztheplayer github.com//pull/7324
- [CORE] Remove unused allPushDownFilters param @zml1206 github.com//pull/7317
- [7096] [CH] fix exception when same names group by @shuai-xu github.com//pull/7101
- [VL] Fix vcpkg binary caching docker image @PHILO-HE github.com//pull/7331
- [7327][INFRA] Publish Velox Backend Test Result Report Github Action @dcoliversun github.com//pull/7328
- [MINOR][DOCS] Improve the configuration document @beliefer github.com//pull/7334
- [VL] Fix field name parsing in Subfield @rui-mo github.com//pull/7330
- [6975][CH] Fix decimal cast overflow exception @loneylee github.com//pull/7335
- [VL] Follow-ups for #7304 @zhztheplayer github.com//pull/7340
- [7323][VL] Always round negative decimals for integral types @surnaik github.com//pull/7337
- [7344][CH] Fix the error default database name and table name for the mergetree file format when using path based @zzcclp github.com//pull/7346
- [7348][CH] Move function calculateColumnAndSecondaryIndexSizesImpl outside @loneylee github.com//pull/7349
- [6887][VL] Daily Update Velox Version (2024_09_25) @GlutenPerfBot github.com//pull/7338
- [7028][CH][Part-3] Refactor: Move mergetree related codes to backends-clickhouse @baibaichen github.com//pull/7234
- [7313][VL] Explicit Arrow transitions, part 1: add LoadArrowDataExec / OffloadArrowDataExec @zhztheplayer github.com//pull/7343
- [VL] Minor: Fix documentation of flushable partial aggregate @surnaik github.com//pull/7353
- [7351][CORE] Code cleanup for Gluten session extensions @beliefer github.com//pull/7352
- [VL] Override nodename for IcebergScanTransformer @leoluan2009 github.com//pull/7345
- [7283][CORE] Support DynamicPruningExpression conversion @wForget github.com//pull/7284
- [VL] Enable AtLeastNNonNulls function @zhli1142015 github.com//pull/7326
- [6887][VL] Daily Update Velox Version (2024_09_26) @JkSelf github.com//pull/7355
- [7356][CORE] Make GlutenConfig.GLUTEN_CONFIG_PREFIX private @baibaichen github.com//pull/7357
- [VL] Use new krb5 download url when enable vcpkg @leoluan2009 github.com//pull/7347
- [7367][CH] Revert github.com//pull/7101 @baibaichen github.com//pull/7368
- [6887][VL] Daily Update Velox Version (2024_09_27) @GlutenPerfBot github.com//pull/7370
- [7358][CH] Optimize the strategy of the partition split according to the files count @zzcclp github.com//pull/7361
- [1632][CH]Daily Update Clickhouse Version (20240928) @kyligence-git github.com//pull/7379
- [7313][VL] Explicit Arrow transitions, part 2: new algorithm to find optimal transition @zhztheplayer github.com//pull/7372
- [7313][VL] Explicit Arrow transitions, part 3: code cleanups @zhztheplayer github.com//pull/7383
- [6887][VL] Daily Update Velox Version (2024_09_29) @GlutenPerfBot github.com//pull/7381
- [7385][CH] Add some config parameters to control the cache size for the mergetree parts @zzcclp github.com//pull/7386
- [7364][CORE] Simplify the RuleInjector @beliefer github.com//pull/7365
- [7376][ICEBERG]Avoid retrieving the partition schema of Iceberg @lyy-pineapple github.com//pull/7377
- [VL] pass phase recursive invocation of spillTree @waitinfuture github.com//pull/7388
- [7028][CH][Part-4] Refactor DeltaMergeTreeFileFormat to read table configuration from deltalog's metadata @baibaichen github.com//pull/7170
- [6887][VL] Daily Update Velox Version (2024_10_01) @GlutenPerfBot github.com//pull/7398
- [7307][VL] Update openssl version in velox setup for centos9 @pratham76 github.com//pull/7308
- [6887][VL] Daily Update Velox Version (2024_10_02) @GlutenPerfBot github.com//pull/7404
- [1632][CH]Daily Update Clickhouse Version (20241003) @kyligence-git github.com//pull/7407
- [7394][CH]Reduce the times of the calling listFiles when executing query from the parquet file format @zzcclp github.com//pull/7417
- [7313][VL] Explicit Arrow transitions, part 4: explicit Arrow-to-Velox transition @zhztheplayer github.com//pull/7392
- [CORE] Minor: Fix warnings and rename event handler better @surnaik github.com//pull/7412
- [7427][CH]Revert "fix (#7349)" @baibaichen github.com//pull/7428
- [6887][VL] Daily Update Velox Version (2024_10_03) @GlutenPerfBot github.com//pull/7406
- [6887][VL] Daily Update Velox Version (2024_10_04) @GlutenPerfBot github.com//pull/7408
- [7437][CH]Revert "Auxiliary commit to revert individual files from #7170" @baibaichen github.com//pull/7438
- [7418][VL] Add checks for allocation failures and initialize variables @majetideepak github.com//pull/7419
- [7400][CORE] Scala code style clean up for Backend.scala @beliefer github.com//pull/7401
- [CORE] Infra: Do not dismiss stale reviews @zhztheplayer github.com//pull/7430
- [CORE] Infra: Forward GitHub discussions to Apache mailing list @zhztheplayer github.com//pull/7429
- [6856][CH]Support arrays_overlap and fix array_join diff @KevinyhZou github.com//pull/6857
- [7313][VL] Explicit Arrow transitions, part 5: extra code cleanups @zhztheplayer github.com//pull/7436
- [6887][VL] Daily Update Velox Version (2024_10_08) @GlutenPerfBot github.com//pull/7422
- [VL] Adapt setup-centos8.sh to latest velox helper functions @liujiayi771 github.com//pull/7442
- Minor fix for info.sh @PHILO-HE github.com//pull/7444
- [7028][CH][Part-5] Refactor: add NativeOutputWriter to unify CHDatasourceJniWrapper @baibaichen github.com//pull/7395
- [7325] [CH] add a config to enable turn off read json @shuai-xu github.com//pull/7333
- [VL] Fix dependencies setup @PHILO-HE github.com//pull/7443
- [7402][CORE] Code cleanup for GlutenPlugin @beliefer github.com//pull/7403
- [CORE] Update AllocationListener usage after successfully releasing memory @wForget github.com//pull/7396
- [VL] Prepare shim API for breaking change SPARK-48610 @zhztheplayer github.com//pull/7445
- [6887][VL] Daily Update Velox Version (2024_10_09) @GlutenPerfBot github.com//pull/7447
- [CORE] Fix GH security issues @zhouyuan github.com//pull/7448
- [1632][CH]Daily Update Clickhouse Version (20241010) @kyligence-git github.com//pull/7454
- [VL] Remove VELOX_BUILD_PATH from include directories if build test is disabled @PHILO-HE github.com//pull/7449
- [7426][CH] Fixed: json path contains spaces @lgbo-ustc github.com//pull/7435
- [Core] Fix duplicated column names in DS test @FelixYBW github.com//pull/7457
- [7440][VL] Enable unit tests on missing all struct fields @rui-mo github.com//pull/7456
- [6784][6828][VL] Add tests for weekOfYear and cast string as date @zml1206 github.com//pull/6888
- [7459][VL] Move 3.2 / 3.3 Velox native file writer code to backend-velox / cpp/velox @zhztheplayer github.com//pull/7461
- [7410][CORE] Add test args to run spark-ut with Java 17 @CodenameGHOST007 github.com//pull/7411
- [7465][CH] Fix compile error 'Unable to locate class corresponding to inner class entry for BuilderParent owner com.google.protobuf.AbstractMessage' @zzcclp github.com//pull/7466
- [7240][CH] Fix all failed uts in GlutenComplexTypeSuite @taiyang-li github.com//pull/7242
- [6887][VL] Daily Update Velox Version (2024_10_11) @GlutenPerfBot github.com//pull/7468
- [VL] Following #7461, add a minor fix @zhztheplayer github.com//pull/7471
- [7373][DOC][VL] Add document for profiling gluten with velox @wForget github.com//pull/7374
- [7480][VL] Clean up some code for protobuf @PHILO-HE github.com//pull/7473
- [7480][VL] Build centos-8 docker image for GHA workflow @PHILO-HE github.com//pull/7481
- [VL] Follow-up for #7481 to fix docker build error @PHILO-HE github.com//pull/7491
- [7489][INFRA] fix: allow to fetch artifacts from workflow "Velox backend Github Runner" in upstream repo @dcoliversun github.com//pull/7490
- [7291][DELTA] fix: push down input_file_name expression to transformer scan delta @dcoliversun github.com//pull/7483
- [6887][VL] Code clean for hasUnsupportedColumns function @zhli1142015 github.com//pull/7477
- [7495][BUILD] Fix macOS does not support version-script @zml1206 github.com//pull/7497
- [6887][VL] Daily Update Velox Version (2024_10_12) @GlutenPerfBot github.com//pull/7487
- [6887][VL] Daily Update Velox Version (2024_10_13) @GlutenPerfBot github.com//pull/7504
- [7493][VL] Update Velox.md to clarify dependency deployment @PHILO-HE github.com//pull/7492
- [VL] Enable Spark legacy date formatter if spark.sql.legacy.timeParserPolicy is set to 'LEGACY' @NEUpanning github.com//pull/7375
- [7243][VL] Fix hanging by cross-task spilling @zhztheplayer github.com//pull/7479
- [7484][CH]Fix element_at diff @KevinyhZou github.com//pull/7485
- [7496][DOCS] Add View the Surefire reports of velox test in NewToGluten.md @dcoliversun github.com//pull/7501
- [7389] [CH] fix cast map to string diff with spark @shuai-xu github.com//pull/7393
- [7510][VL][CI] Change centos-8 docker image to accelerate GHA workflow @PHILO-HE github.com//pull/7511
- [7509][VL] Memory management: Release all native memory managers after all Velox tasks were released @zhztheplayer github.com//pull/7478
- [7518][VL] Clean up some code used for building protobuf @PHILO-HE github.com//pull/7522
- [7509][VL] Register release hooks as well as factories for Runtime and MemoryManager @zhztheplayer github.com//pull/7516
- [7432][CH] Exception when the result of get_json_object is an array @lgbo-ustc github.com//pull/7513
- [7110][VL][DELTA] support IncrementMetric in gluten @dcoliversun github.com//pull/7111
- [7526][VL] Scala code style for VeloxCollect @beliefer github.com//pull/7527
- [VL] Lower default spill run size to reduce overhead memory usage @zhztheplayer github.com//pull/7463
- [7524][VL][UNIFFLE] Reset rss.row.based configuration of uniffle @wForget github.com//pull/7525
- [7311][CH] Support grace aggregate algorithm in partial aggregating stages @lgbo-ustc github.com//pull/7322
- [6876] Support Spark-352 @zhouyuan github.com//pull/7138
- [6887][VL] Daily Update Velox Version (2024_10_15) @GlutenPerfBot github.com//pull/7530
- [7517][CH] Support build gluten package with scala213 @lwz9103 github.com//pull/7520
- [1632][CH]Daily Update Clickhouse Version (20241015) @kyligence-git github.com//pull/7529
- [7145][CH] Decouple SerializedPlanParser from other parser modules @lgbo-ustc github.com//pull/7250
- [7539][VL] Remove some unnecessary Velox code changes from modify_velox.patch @PHILO-HE github.com//pull/7540
- [6887][VL] Daily Update Velox Version (2024_10_16) @GlutenPerfBot github.com//pull/7549
- [7535][VL] Containerized build within CentOS 7 image @zhztheplayer github.com//pull/7538
- [7514][VL] Reorganize Dockerfiles and document how to build gluten docker @PHILO-HE github.com//pull/7515
- [7420][VL] Fix GCS configuration @majetideepak github.com//pull/7421
- [7550][CH] Rewrite get_json_object singular_or_list @lgbo-ustc github.com//pull/7551
- [7535][VL] CentOS 7 containerized build: Fix for automake version error @zhztheplayer github.com//pull/7555
- [7542][CH] Fix cache not refresh @loneylee github.com//pull/7547
- [7499][VL][CI] Enable ccache GHA job @zhouyuan github.com//pull/7546
- [1632][CH]Daily Update Clickhouse Version (20241016) @kyligence-git github.com//pull/7558
- [6887][VL] Daily Update Velox Version (2024_10_17) @GlutenPerfBot github.com//pull/7566
- [7499][VL] CI: Remove GHA binary cache @zhztheplayer github.com//pull/7554
- [7522][CH] Improve jsonpath support in get_json_object @lgbo-ustc github.com//pull/7556
- [7482][CH] Remove redundant head object operation of s3 @loneylee github.com//pull/7565
- [7359][VL] feat: Support columnar partial project for UDF @jinchengchenghh github.com//pull/7360
- [7563][CH] Fix failure on too large double number @lgbo-ustc github.com//pull/7570
- [1632][CH]Daily Update Clickhouse Version (20241017) @kyligence-git github.com//pull/7567
- [7559][VL][UNIFFLE] Set rss.enabled to true in UniffleShuffleManager @wForget github.com//pull/7560
- [7450][VL] Improve CollectRewriteRule for Velox @beliefer github.com//pull/7451
- [7359][VL] Enable partial project in RAS @zhztheplayer github.com//pull/7574
- [7572][CORE] Check if iterator has been closed @wForget github.com//pull/7573
- [VL] Remove unnecessary vanilla Spark compatibility code for VeloxCollectSet function @zhztheplayer github.com//pull/7590
- [1632][CH]Daily Update Clickhouse Version (20241018) @kyligence-git github.com//pull/7588
- [7541][VL] Improve HLLRewriteRule for Velox @beliefer github.com//pull/7543
- [6887][VL] Daily Update Velox Version (2024_10_18) @GlutenPerfBot github.com//pull/7587
- [6887][VL] Daily Update Velox Version (2024_10_19) @GlutenPerfBot github.com//pull/7607
- [6887][VL] Daily Update Velox Version (2024_10_21) @GlutenPerfBot github.com//pull/7617
- [7591][CH]Fix: fail to normalize json text with empty object in it @lgbo-ustc github.com//pull/7595
- [7596][CH] Fix bnlj empty join error @loneylee github.com//pull/7597
- [7600][VL] Prepare test case for the removal of workaround code for empty schema batches @zhztheplayer github.com//pull/7601
- [7615][CORE] Introduce GlutenFormatFactory @baibaichen github.com//pull/7616
- [7609][CORE] Fix the bug that Gluten cannot change logging level @beliefer github.com//pull/7610
- [7455][CH] Add spark_modulo for compatibility @lgbo-ustc github.com//pull/7619
- [7621][CH] Fix repeat function reports an error when times is a negative number @loneylee github.com//pull/7622
- [7577][VL] Add pattern match for extension rules @zml1206 github.com//pull/7584
- [7585][VL] Fix S3 and GCS configs @majetideepak github.com//pull/7586
- [7604][CORE] Code refactors against ColumnarRuleApplier.Executor @beliefer github.com//pull/7606
- [7581][CORE] Code cleanup for GlutenColumnarRule @beliefer github.com//pull/7582
- [6887][VL] Daily Update Velox Version (2024_10_22) @GlutenPerfBot github.com//pull/7626
- [7545][CH] Fix regexp_replace group catching syntax diff @zhanglistar github.com//pull/7603
- [7623][CH] Fix running cache command with error when executor add and removed. @loneylee github.com//pull/7625
- [7600][VL] Remove EmptySchemaWorkaround @zhztheplayer github.com//pull/7620
- [7336][CORE] Bump Spark version to v3.5.3 @Yohahaha github.com//pull/7537
- [CORE] Move scala file to scala package and fix minor typo comment @leoluan2009 github.com//pull/7635
- [7028][CH][Part-6] Introduce MergeTreeDelayedCommitProtocol @baibaichen github.com//pull/7506
- [VL] Fix missing VELOX_HOME in builddeps-veloxbe.sh @liujiayi771 github.com//pull/7629
- [5525][FOLLOWUP] Fix mvn versions:set does not work for shim submodules @JinHelin404 github.com//pull/7593
- [6887][VL] Daily Update Velox Version (2024_10_23) @GlutenPerfBot github.com//pull/7643
- [7600][VL] Simplify offload rules in RAS @zhztheplayer github.com//pull/7646
- [VL] Move Scala file to Scala package @surnaik github.com//pull/7653
- [VL] Code cleanup for Arrow CSV UTs @zhztheplayer github.com//pull/7651
- [7458][VL] Upgrade to GCC-11 for centos-7/8 and ubuntu-20.04 @PHILO-HE github.com//pull/7578
- [6887][VL] Daily Update Velox Version (2024_10_24) @GlutenPerfBot github.com//pull/7662
- [CORE][CH] Remove ValidatorApi.doSparkPlanValidate @zhztheplayer github.com//pull/7668
- [VL] Follow-up fix for gcc upgrade PR @PHILO-HE github.com//pull/7667
- [7475][VL] Add a config to control whether add trim node when CAST from varchar @Henry2SS github.com//pull/7476
- [BUILD][DOCS][VL] Change BUILD_JEMALLOC to ENABLE_JEMALLOC_STATS @surnaik github.com//pull/7650
- [VL] Upgrade FB_OS_VERSION to v2024.07.01.00 @PHILO-HE github.com//pull/7671
- [7673][CH] Fix substrait infinite loop @loneylee github.com//pull/7674
- [7665] Remove duplicated tpch/tpcds queries resources @marin-ma github.com//pull/7666
- [5103][VL] Use jvm libhdfs replace c++ libhdfs3 @JkSelf github.com//pull/6172
- Revert "[5103][VL] Use jvm libhdfs replace c++ libhdfs3" @marin-ma github.com//pull/7683
- [7659][CH] Implement splittable bzip2 decompression @taiyang-li github.com//pull/7638
- [5103][VL] Use jvm libhdfs replace c++ libhdfs3 @JkSelf github.com//pull/7684
- [CORE] Rework the implementation of spark.gluten.enabled @zhztheplayer github.com//pull/7672
- [6887][VL] Daily Update Velox Version (2024_10_26) @GlutenPerfBot github.com//pull/7688
- [1632][CH]Daily Update Clickhouse Version (20241026) @kyligence-git github.com//pull/7689
- [7359][VL] Optimize string partial project @jinchengchenghh github.com//pull/7592
- [7681][CH][ARM] Fix compile issue for SparkFunctionFloor @loudongfeng github.com//pull/7682
- [7661][VL] Fix validate native IfThen expr @wForget github.com//pull/7669
- [6887][VL] Daily Update Velox Version (2024_10_28) @marin-ma github.com//pull/7694
- [7665] Remove tpch-queries-velox @marin-ma github.com//pull/7705
- [VL] Code clean for BasicPhysicalOperatorTransformer @zml1206 github.com//pull/7695
- [7143][VL] CI: Add GHA job for running all UTs with RAS=ON @zhztheplayer github.com//pull/7702
- [6887][VL] Daily Update Velox Version (2024_10_29) @GlutenPerfBot github.com//pull/7708
- [7657][CH]Fix to_unix_timestamp when input parameter is timestamp type @KevinyhZou github.com//pull/7660
- [6387][CH] support percentile function @taiyang-li github.com//pull/6396
- [7685][VL][RAS] Add new cost model to avoid costly r2c @zml1206 github.com//pull/7686
- [7143][VL] Fix several UTs for RAS @zhztheplayer github.com//pull/7701
- [7713][CH] Fix page index reader failed with or logical operator @baibaichen github.com//pull/7716
- [VL] CI: One OOM GHA job has passed, stop ignoring its result @zhztheplayer github.com//pull/7712
- [7709][CH] Rule constructor simplifications @beliefer github.com//pull/7710
- [7717][CH] [ARM]fix compile issue for SparkFunctionRoundHalfUp @loudongfeng github.com//pull/7718
- [7143][VL] RAS: Fix test case "test ignore row to columnar" when RAS=ON @zhztheplayer github.com//pull/7725
- [1632][CH]Daily Update Clickhouse Version (20241030) @kyligence-git github.com//pull/7720
- [6887][VL] Daily Update Velox Version (2024_10_30) @marin-ma github.com//pull/7722
- [6887][VL] Daily Update Velox Version (2024_10_31) @GlutenPerfBot github.com//pull/7739
- [VL] RAS: Fix fallen back plan nodes are not tagged with meaningful fallback reasons @zhztheplayer github.com//pull/7731
- [VL] Enhance write parquet with compression codec test @wecharyu github.com//pull/7737
- [7714][CH]Fix issue caused by incomplete line if there is only one line in last bzip2 block @taiyang-li github.com//pull/7715
- [VL] Add metric to indicate aggregation pushdown @zhli1142015 github.com//pull/7729
- [7703][VL] ColumnarBuildSideRelation transform support multiple key columns @yikf github.com//pull/7704
- [6887][VL] Daily Update Velox Version (2024_11_01) @GlutenPerfBot github.com//pull/7761
- [7670][CH] Fix enable 'files.per.partition.threshold' bug @loneylee github.com//pull/7758
- [7143][VL] RAS: Fix failed UTs in GlutenSQLQueryTestSuite @zhztheplayer github.com//pull/7754
- [7747][CH] Fix murmur3hash on arm @lwz9103 github.com//pull/7757
- [7143][VL] RAS: Revert #7731, disable the relevant test cases since RAS doesn't report fallback details due to #7763 @zhztheplayer github.com//pull/7764
- [7143][VL] RAS: Catch exceptions thrown from rewrite rules @zhztheplayer github.com//pull/7767
- [7771][CH]Fix crc32 failure in bzip2 @taiyang-li github.com//pull/7772
- [7765][CH] Support CACHE META command for MergeTree table @loneylee github.com//pull/7774
- [1632][CH]Daily Update Clickhouse Version (20241101) @kyligence-git github.com//pull/7762
- [7775][CORE] Make sure the softaffinity hash executor list is in order @zzcclp github.com//pull/7776
- [6887][VL] Daily Update Velox Version (2024_11_02) @GlutenPerfBot github.com//pull/7784
- [7753][CORE] Do not replace literals of expand's projects in PullOutPreProject @lgbo-ustc github.com//pull/7756
- [7446][BUILD] build third party libs using jar from JAVA_HOME @Zand100 github.com//pull/7736
- [7174][VL] Force fallback scan operator when spark.sql.parquet.mergeSchema enabled @Yohahaha github.com//pull/7634
- [6887][VL] Daily Update Velox Version (2024_11_03) @GlutenPerfBot github.com//pull/7786
- [7782] Fix profile Darwin-x86 os.arch error @zml1206 github.com//pull/7783
- [7780][CH] Fix split diff @taiyang-li github.com//pull/7781
- [7792][CH] Set default minio ak/sk to minioadmin @lwz9103 github.com//pull/7793
- [DOC] Update release plan for Velox backend @zhouyuan github.com//pull/7744
- [7727][CORE] Unify the variable name of GlutenConfig with glutenConf @beliefer github.com//pull/7728
- [7700][CH] Fix issue when partition values contain space @exmy github.com//pull/7719
- [7741][VL] refine build package tool @zhouyuan github.com//pull/7742
- [7797][VL] Fix lacking icu lib on centos-7 @PHILO-HE github.com//pull/7798
- [VL] Remove a duplicated Maven dependency, and some follow-ups for #7764 @zhztheplayer github.com//pull/7773
- [7143][VL] RAS: Enable the RAS UT jobs in GHA CI @zhztheplayer github.com//pull/7770
- [6887][VL] Daily Update Velox Version (2024_11_05) @GlutenPerfBot github.com//pull/7808
- [VL] ColumnarBatchSerializerJniWrapper_serialize, check if the byte array is constructed successfully @NEUpanning github.com//pull/7733
- [7814][CH] Support trigger Gluten ClickHouse CI on ARM @lwz9103 github.com//pull/7815
- [7243][VL] Suspend the Velox task while reading an input Java iterator to make the task spillable @zhztheplayer github.com//pull/7748
- [7654][CH] Fix round on arm @lwz9103 github.com//pull/7794
- [1632][CH]Daily Update Clickhouse Version (20241105) @kyligence-git github.com//pull/7809
- [CH] Ignore unstable uts and add more message when failed. @baibaichen github.com//pull/7821
- [7812][CH] Fix the query failed for the mergetree format when the 'spark.databricks.delta.stats.skipping' is off @zzcclp github.com//pull/7813
- [6887][VL] Daily Update Velox Version (2024_11_06) @GlutenPerfBot github.com//pull/7822
- [VL] Remove load shared libhdfs @Yohahaha github.com//pull/7818
- [7749][VL] Trim ISOControl characters string for casting to integral type @wForget github.com//pull/7806
- [CH] Rename Mergetree part file name to avoid duplicated file name @liuneng1994 github.com//pull/7769
- [7807] Transform relation bound attr using the name if attr's exprId not found. @yikf github.com//pull/7819
- [1632][CH]Daily Update Clickhouse Version (20241106) @kyligence-git github.com//pull/7824
- [7795][CH] Add backend task id log @loneylee github.com//pull/7801
- [7647][CH] Lazy expand for aggregation @lgbo-ustc github.com//pull/7649
- [VL] Remove one legacy Velox config used for Spark collect_list function @PHILO-HE github.com//pull/7826
- [CORE] Remove unused dependencies of gluten-substrait @zml1206 github.com//pull/7833
- [7079][VL] Fix metrics for InputIteratorTransformer of broadcast exchange @ivoson github.com//pull/7167
- [7800][VL] Add config for max reclaim wait time to avoid dead lock when memory arbitration @Yohahaha github.com//pull/7799
- [7829][CH] Fix read csv file with datetime field not equals spark @loneylee github.com//pull/7832
- [7796][CH] Fix diff while casting bool to string @taiyang-li github.com//pull/7804
- [7778][CH] Make aggregation output schema same as CH native @lgbo-ustc github.com//pull/7811
- Revert "[7800][VL] Add config for max reclaim wait time to avoid dead lock when memory arbitration" @zhztheplayer github.com//pull/7836
- [VL] Sort shuffle writer use vectorized c2r @marin-ma github.com//pull/6782
- [6887][VL] Daily Update Velox Version (2024_11_07) @GlutenPerfBot github.com//pull/7834
- [7675][VL] Support parquet write with complex data type (e.g. MAP, ARRAY) @weixiuli github.com//pull/7676
- [7759][CH]Fix pre project push down aggregate @KevinyhZou github.com//pull/7779
- [7795][CH] Remove duplicate log object @loneylee github.com//pull/7839
- [1632][CH]Daily Update Clickhouse Version (20241107) @kyligence-git github.com//pull/7835
- [VL] Fix ccache installation in docker @PHILO-HE github.com//pull/7848
- [CORE] Minor: Rename LimitTransformer to LimitExecTransformer @zhztheplayer github.com//pull/7843
- [VL] Follow-up fix for PR #7848 to install ccache @PHILO-HE github.com//pull/7858
- [7458][VL] Upgrade GCC to version 11 in gluten-te's ubuntu dockerfile @zhztheplayer github.com//pull/7859
- [CORE] Remove member TransformContext#inputAttributes as unused @zhztheplayer github.com//pull/7844
- [VL] Re-enable background IO threads by default @zhztheplayer github.com//pull/7845
- [7850][VL] Native writer support CreateHiveTableAsSelectCommand @yikf github.com//pull/7851
- [7862][CH]fix pre-projection aggregate not take effect @KevinyhZou github.com//pull/7863
- [CH][Doc] Add Gluten CH Debug docs. @lwz9103 github.com//pull/7846
- [VL] Clean up some legacy code and correct minimum GCC version @PHILO-HE github.com//pull/7865
- [VL] Do not use --version-script link option on Darwin @PHILO-HE github.com//pull/7820
- [7647][CH] Enable lazy expand for avg and sum(decimal) @lgbo-ustc github.com//pull/7840
- [MINOR][INFRA] Exclude metastore_db from git @yikf github.com//pull/7871
- [7760] Fix udf implicit cast & update doc @marin-ma github.com//pull/7852
- [6887][VL] Daily Update Velox Version (2024_11_08) @GlutenPerfBot github.com//pull/7854
- [6887][VL] Daily Update Velox Version (2024_11_09) @GlutenPerfBot github.com//pull/7875
- [7028][CH][Part-7] Support one pipeline write for mergetree @baibaichen github.com//pull/7788
- [VL] Fix weekly scheduled GHA job @PHILO-HE github.com//pull/7888
- [VL] Enable array test for GlutenParquetIOSuite @zml1206 github.com//pull/7841
- [Doc] Show gluten icon when using IDEA @liuneng1994 github.com//pull/7894
- [1632][CH]Daily Update Clickhouse Version (20241111) @kyligence-git github.com//pull/7884
- [6887][VL] Daily Update Velox Version (2024_11_10) @GlutenPerfBot github.com//pull/7881
- [6887][VL] Daily Update Velox Version (2024_11_11) @GlutenPerfBot github.com//pull/7883
- [7847][CORE] Distinguish between native scan and vanilla spark scan plan tree string @zml1206 github.com//pull/7877
- [7886][VL] Fix broken Ubuntu 20.04 + VCPKG + GCS + ABFS build @zhztheplayer github.com//pull/7906
- [7890][UI] Optimize cleanup gluten sql executions ui data @zml1206 github.com//pull/7891
- [7907][CH] Fixed data race in ExpresionParser::getUniqueName @lgbo-ustc github.com//pull/7908
- [VL] CI: Fix out-of-date module name in labeler.yml @zhztheplayer github.com//pull/7915
- [7078][CORE] The fallback check for Scan should not be skipped when DPP is present @wang-zhun github.com//pull/7080
- [7868][CH] Nested column pruning for Project(Filter(Generate)) @taiyang-li github.com//pull/7869
- [CORE] Consolidate RewriteSparkPlanRulesManager, AddFallbackTagRule, TransformPreOverrides into a single rule @zhztheplayer github.com//pull/7918
- [7823] Revert "read data from orc file format - ignore reading except date32" @baibaichen github.com//pull/7917
- [6887][VL] Daily Update Velox Version (2024_11_12) @zhouyuan github.com//pull/7899
- [7028][CH][Part-8] Support one pipeline write for partition mergetree @baibaichen github.com//pull/7924
- [Minor] Fix a typo in Gluten config @PHILO-HE github.com//pull/7931
- [VL] Clean up unused variables in cpp source files @rui-mo github.com//pull/7929
- [CH] Fix SIGSEGV on jstring2string @liuneng1994 github.com//pull/7928
- [7647][CH] Fixed a bug finding attributes replacement map @lgbo-ustc github.com//pull/7927
- [CORE] Revert Spark version from v353 to v352 @Yohahaha github.com//pull/7930
- [VL][CI] Fix back upload golden files @zml1206 github.com//pull/7880
- [6887][VL] Daily Update Velox Version (2024_11_13) @zhouyuan github.com//pull/7926
- [7243][VL] A follow-up fix for #7748 @zhztheplayer github.com//pull/7935
- [6666][VL] Use custom SparkExprToSubfieldFilterParser @rui-mo github.com//pull/6754
- [7641][VL] Add Gluten benchmark scripts @marin-ma github.com//pull/7642
- [7647][CH] Remove duplicated columns agg results @lgbo-ustc github.com//pull/7937
- [7856][CORE] Ensure correct enabling of GlutenCostEvaluator @weixiuli github.com//pull/7857
- [VL] Fix wrong lib suffix for google_cloud_cpp_storage @PHILO-HE github.com//pull/7933
- [VL] Add test for scan operator with filter on decimal/timestamp/binary field @rui-mo github.com//pull/7945
- [7362][VL] Add test for 'IN' and 'OR' filter Scan @zml1206 github.com//pull/7363
- [7387][CH] Allow parallel downloading scan operator for hive text/json table when the whole compressed (not bzip2) file is a single file split @taiyang-li github.com//pull/7598
- [6887][VL] Daily Update Velox Version (2024_11_14) @GlutenPerfBot github.com//pull/7942
- [7647][CH] Drop literals aggregation results @lgbo-ustc github.com//pull/7951
- [CH] Fix issues due to github.com/ClickHouse/ClickHouse/pull/71539 @baibaichen github.com//pull/7952
- [6896] Add buffered read for hash/sort shuffle reader @marin-ma github.com//pull/7897
- [7837][VL] Spark driver should not initialize cache if not local mode @leoluan2009 github.com//pull/7853
- [7267][CORE][CH] Support nested column pruning for HiveTableScan json/parquet/orc format @KevinyhZou github.com//pull/7268
- [7499][VL][CI] Print ccache statistics for tracking its efficacy @PHILO-HE github.com//pull/7957
- [7594] [CH] support cast const map to string @shuai-xu github.com//pull/7599
- [7947][CORE] Add buildSide info for BroadcastNestedLoopJoinExecTransformer simpleStringWithNodeId @zml1206 github.com//pull/7948
- [6887][VL] Daily Update Velox Version (2024_11_15) @GlutenPerfBot github.com//pull/7954
- [6887][VL] Daily Update Velox Version (2024_11_16) @GlutenPerfBot github.com//pull/7958
- [6887][VL] Daily Update Velox Version (2024_11_17) @GlutenPerfBot github.com//pull/7961
- Add config to support viewfs in Gluten. @JkSelf github.com//pull/7892
- [7959][CH] AdvancedExpandStep generates fewer rows than expected @lgbo-ustc github.com//pull/7960
- [7962][CH] A friendly API to build aggregator params @lgbo-ustc github.com//pull/7963
- [VL] Clean up some legacy code related to USE_AVX512 @PHILO-HE github.com//pull/7956
- [6887][VL] Daily Update Velox Version (2024_11_18) @GlutenPerfBot github.com//pull/7965
- [7910][CORE][VL] Flip dependency direction for gluten-iceberg @zhztheplayer github.com//pull/7967
- [7983][CH] Fix NPE when disable spark.shuffle.compress @exmy github.com//pull/7984
- [7887][VL][DOC] Add usage doc about dynamic load jvm libhdfs and native libhdfs3 @JkSelf github.com//pull/7982
- [6853][CORE] Move more general query planner APIs from gluten-substrait to gluten-core @zhztheplayer github.com//pull/7972
- [6887][VL] Daily Update Velox Version (2024_11_19) @GlutenPerfBot github.com//pull/7978
- [7969][VL] Enable spill to multiple directories for micro benchmark @marin-ma github.com//pull/7970
- [CORE] Avoid formatted comments from being messed by non-spotless linters (especially IDE linters) @zhztheplayer github.com//pull/7989
- [7751][VL] Merge two consecutive aggregates to one complete mode @yikf github.com//pull/7752
- [7800][VL] Add config for max reclaim wait time to avoid dead lock when memory arbitration @Yohahaha github.com//pull/7990
- [7986][CH] Improve lazy expand for high cardinality aggregation @lgbo-ustc github.com//pull/7995
- [CORE] Minor: Use lower case for Maven profile names @zhztheplayer github.com//pull/8001
- [CORE] Query planner: A more explicit practice to register columnar batch types @zhztheplayer github.com//pull/8002
- [7979][CH] Fix exception caused by one child of UnionExec outputs Array(Nothing) while the other outputs Array(String) @taiyang-li github.com//pull/7980
- [7971][CH] Support using left side as the build table for the left anti/semi join @zzcclp github.com//pull/7981
- [6887][VL] Daily Update Velox Version (2024_11_20) @GlutenPerfBot github.com//pull/7997
- [7267][CORE][CH] Move schema pruning optimization of HiveTableScan to an individual post-transform rule @zhztheplayer github.com//pull/8008
- [8005][VL] Add MergeTwoPhasesHashBaseAggregate to injectRas list @yikf github.com//pull/8006
- [VL] fallback unsupported orc write for spark32 and spark33 @jackylee-ch github.com//pull/7996
- [CORE][CH] Remove API BackendSettingsApi#supportShuffleWithProject @zhztheplayer github.com//pull/8009
- [7999][VL] Add compression codec extension to velox written parquet file @liujiayi771 github.com//pull/8000
- [7028][CH][Part-9] Collecting Delta stats for parquet @baibaichen github.com//pull/7993
- [6887][VL] Daily Update Velox Version (2024_11_21) @GlutenPerfBot github.com//pull/8012
- [VL] Link shared jemalloc lib to work with LD_PRELOAD @PHILO-HE github.com//pull/7369
- [6887][VL] Daily Update Velox Version (2024_11_22) @GlutenPerfBot github.com//pull/8019
- [6887][VL] Daily Update Velox Version (2024_11_24) @GlutenPerfBot github.com//pull/8028
- [7953][VL] Fetch and dump all inputs for micro benchmark on middle stage begin @marin-ma github.com//pull/7998
- [7950][VL] Keep Core module's build flag consistent with Velox @surnaik github.com//pull/8027
- [VL] RAS: Remove alternative constraint sets passing to RAS planner @zhztheplayer github.com//pull/8033
- [6920][CORE] Move API Backend#defaultBatchType down to BackendSettingsApi in module gluten-substrait @zhztheplayer github.com//pull/8016
- [8010][CORE] Don't generate native metrics if transformer doesn't generate relNode @zml1206 github.com//pull/8011
- [VL] Bump jemalloc version and update relevant documents @PHILO-HE github.com//pull/8035
- [MISC] Velox maintainers as triage member(collaborators) @zhouyuan github.com//pull/8037
- [VL] Clean up duplicate CMake code for setting CMAKE_CXX_FLAGS @surnaik github.com//pull/8034
- [7741][VL] Fix deprecated actions/upload-artifact version issue when building bundle package @wangyum github.com//pull/8017
- [VL] vcpkg: Broken libelf mirror @zhztheplayer github.com//pull/8047
- [6887][VL] Daily Update Velox Version (2024_11_26) @GlutenPerfBot github.com//pull/8042
- [7896][CH]Fix to_date diff for time parser policy config @KevinyhZou github.com//pull/7923
- [CH]Daily Update Clickhouse Version (20241118) @liuneng1994 github.com//pull/7968
- [8046][VL] CI: fix velox cache/bundle package script @zhouyuan github.com//pull/8051
- [7631][VL] Fall back lead/lag if input is foldable @zml1206 github.com//pull/8038
- [6920][CORE] Redesign and move trait GlutenPlan to gluten-core @zhztheplayer github.com//pull/8036
- [3839][CH] Extend nested column pruning in vanilla spark @taiyang-li github.com//pull/7992
- [8039][VL] Native writer should respect table properties @yikf github.com//pull/8040
- [VL] Enable locate function test @rui-mo github.com//pull/4791
- [6920][VL] Following #8036, append some code cleanups @zhztheplayer github.com//pull/8058
- [7977][VL] Include cstdint header explicitly @yabinma github.com//pull/8030
- [8061][VL] Fall back nth_value if input is foldable @zml1206 github.com//pull/8062
- [8046][VL]CI: fix ccache path @zhouyuan github.com//pull/8064
- [7905][CH] Implement window's topk by aggregation @lgbo-ustc github.com//pull/7976
- [6887][VL] Daily Update Velox Version (2024_11_27) @GlutenPerfBot github.com//pull/8057
- [8073][CH] Replace some deprecated methods about sort @lgbo-ustc github.com//pull/8079
- [7860][CORE] shuffle writer, replace MemoryMappedFile to avoid OOM @ccat3z github.com//pull/7861
- [6887][VL] Daily Update Velox Version (2024_11_28) @GlutenPerfBot github.com//pull/8067
- [6887][VL] Daily Update Velox Version (2024_11_29) @GlutenPerfBot github.com//pull/8086
- [8094][CH][Part-1] Support reading data from the iceberg with CH backend @zzcclp github.com//pull/8095
- [8074][CH] Fix adjust output constant column @lwz9103 github.com//pull/8076
- [8096][CH] Invalid header for disk tmp file @lgbo-ustc github.com//pull/8100
- [8021][CH] Fix ORC read/write mismatch and parquet read failure when column with complex types contains null @taiyang-li github.com//pull/8023
- [8095][CH] package with iceberg profile @lwz9103 github.com//pull/8106
- [8103][DOC] Fix TPC-H/DS queries link @merrily01 github.com//pull/8104
- [1632][CH]Daily Update Clickhouse Version (20241129) @kyligence-git github.com//pull/8087
- [CH] Support separate debug symbols from so file @liuneng1994 github.com//pull/8083
- [1632][CH]Daily Update Clickhouse Version (20241130) @kyligence-git github.com//pull/8112
- [6887][VL] Daily Update Velox Version (2024_11_30) @GlutenPerfBot github.com//pull/8111
- [8080][CH]Support function transform_keys/transform_values @taiyang-li github.com//pull/8085
- Revert "[8080][CH]Support function transform_keys/transform_values" @taiyang-li github.com//pull/8121
- [8119][CH] Disable max_bytes_ratio_before_external_group_by @lgbo-ustc github.com//pull/8120
- [8060][CORE] GlutenShuffleManager as a registry of shuffle managers @zhztheplayer github.com//pull/8084
- [7745][VL] Incorporate SQL Union operator into Velox execution pipeline @zhztheplayer github.com//pull/7842
- [8090][CH]Remove sparkLeast and sparkGreatest functions @KevinyhZou github.com//pull/8091
- [VL] Enable GlutenJsonExpressionsSuite @zhli1142015 github.com//pull/8099
- [7028][CH][Part-11] Support write parquet files with bucket @lwz9103 github.com//pull/8052
- [1632][CH]Daily Update Clickhouse Version (20241203) @kyligence-git github.com//pull/8125
- [8046][VL] Fix GHA checkout issue on centos-7 for weekly build job @PHILO-HE github.com//pull/8129
- [6887][VL] Daily Update Velox Version (2024_12_03) @GlutenPerfBot github.com//pull/8124
- [8130][CH] Use the actual user instead of yarn user to read hdfs file @exmy github.com//pull/8131
- [VL] Enhance VeloxHashShuffleWriter partition buffer size estimation by incorporating complex type columns @kecookier github.com//pull/8089
- [7518][FOLLOWUP] Remove build_protobuf parameter from build-guide @wForget github.com//pull/8140
- [6887][VL] Daily Update Velox Version (2024_12_04) @GlutenPerfBot github.com//pull/8137
- [1632][CH]Daily Update Clickhouse Version (20241204) @kyligence-git github.com//pull/8135
- [CORE] Add Gluten Project Improvement Proposals (GPIP) doc @yikf github.com//pull/8133
- [VL] Add back RAII style Velox driver suspension into RowVectorStream @zhztheplayer github.com//pull/8149
- [VL] Change C style casts to C++ style @rui-mo github.com//pull/8153
- [DOC] Fix typos in documentation @rui-mo github.com//pull/8155
- [7143][VL] RAS: Remove experimental flags for RAS @zhztheplayer github.com//pull/8154
- [8148][CH] Fix corr with NaN @loneylee github.com//pull/8150
- [1632][CH]Daily Update Clickhouse Version (20241205) @kyligence-git github.com//pull/8152
- [6887][VL] Daily Update Velox Version (2024_12_05) @GlutenPerfBot github.com//pull/8147
- [VL] Minor fix for cpp code style (part 1) @rui-mo github.com//pull/8157
- [CORE] Simplify code of offload scan @zml1206 github.com//pull/8144
- [8159][CH]Remove SparkFunctionDecimalDivide @KevinyhZou github.com//pull/8160
- [7900][VL] Enable prefix sort config in spill @jinchengchenghh github.com//pull/7904
- [CORE] Bump celeborn to 0.5.2 @yikf github.com//pull/8054
- [6920][CORE][VL] New APIs and refactors to allow different backends / components to be registered and used @zhztheplayer github.com//pull/8143
- [8142][CH] Duplicated columns group by @lgbo-ustc github.com//pull/8164
- [6887][VL] Daily Update Velox Version (2024_12_06) @GlutenPerfBot github.com//pull/8162
- [6887][VL] Daily Update Velox Version (2024_12_07) @GlutenPerfBot github.com//pull/8171
- [CORE] Add nativeFilters info for simpleString of scan @zml1206 github.com//pull/8169
- [CORE][UNIFFLE] Bump uniffle 0.9.1 @wForget github.com//pull/8166
- [8115][CORE] Refine the BuildSideRelation transform to support all scenarios @yikf github.com//pull/8116
- [6887][VL] Daily Update Velox Version (2024_12_08) @GlutenPerfBot github.com//pull/8174
- [CORE][MIRROR] Fix performance issue when allScanPartitions are very large @WangGuangxin github.com//pull/8126
- [CORE] Query planner: Simplify validator FallbackByNativeValidation @zhztheplayer github.com//pull/8177
- [VL] Change loadQuantum default value to 8MB from 256MB @yikf github.com//pull/8186
- [6887][VL] Daily Update Velox Version (2024_12_09) @GlutenPerfBot github.com//pull/8178
- [VL] Fix upload arrow path of build bundle package gha @wForget github.com//pull/8193
- [1632][CH]Daily Update Clickhouse Version (20241210) @kyligence-git github.com//pull/8191
- [8356][CORE][VL][CH] Make Iceberg code implement component API @zhztheplayer github.com//pull/8192
- Disable scheduled GHA jobs for forked repos @wForget github.com//pull/8189
- [6887][VL] Daily Update Velox Version (2024_12_10) @GlutenPerfBot github.com//pull/8190
- [8202][ICEBERG] Fix get iceberg index error @lyy-pineapple github.com//pull/8199
- [CORE] Following #8192, amend a quick fix for build info message @zhztheplayer github.com//pull/8205
- [7261][CORE] Support offloading partial filters to native scan @zml1206 github.com//pull/8082
- [6887][VL] Daily Update Velox Version (2024_12_11) @GlutenPerfBot github.com//pull/8200
- [8043] Use spark.shuffle.spill.diskWriteBufferSize in sort-based shuffle @marin-ma github.com//pull/8203
- [8168] Add pre-projections for join condition @lgbo-ustc github.com//pull/8185
- [7755] [CH] translate support args with unequal length @shuai-xu github.com//pull/7768
- [VL] Minor fix for cpp code style (part 2) @rui-mo github.com//pull/8210
- [VL] Change the loadQuantum config if velox cache is enabled. @yikf github.com//pull/8197
- [7912][VL] Flip dependency direction for gluten-delta @zhztheplayer github.com//pull/8218
- [1632][CH]Daily Update Clickhouse Version (20241212) @kyligence-git github.com//pull/8213
- [6887][VL] Daily Update Velox Version (2024_12_12) @GlutenPerfBot github.com//pull/8211
- [8187][VL] Support velox cache metrics @yikf github.com//pull/8188
- [8206][VL] Support collect_set window @WangGuangxin github.com//pull/8220
- [VL] Fix sort based shuffle oom spill when compress was disabled @clay4megtr github.com//pull/7553
- [1632][CH]Daily Update Clickhouse Version (20241213) @kyligence-git github.com//pull/8224
- [8229][VL] Don't rewrite collect_list/collect_set window @zml1206 github.com//pull/8230
- [6887][VL] Daily Update Velox Version (2024_12_13) @GlutenPerfBot github.com//pull/8223
- [8025][VL] Respect config kSpillReadBufferSize and add spill compression codec @jinchengchenghh github.com//pull/8045
- [8216][CH] Fix OOM when cartesian product with empty data @lwz9103 github.com//pull/8219
- [6887][VL] Daily Update Velox Version (2024_12_14) @GlutenPerfBot github.com//pull/8233
- [8208][CORE] A new unified approach of source folder isolation for iceberg / hudi / delta with Maven @zhztheplayer github.com//pull/8198
- [VL] Fix crash when there are unreleased memory pools during terminating a Velox task @zhztheplayer github.com//pull/8243
- [6887][VL] Daily Update Velox Version (2024_12_15) @GlutenPerfBot github.com//pull/8235
- [MINOR] [VL] Enhance the gluten timer to support seconds, milliseconds, and microseconds @kecookier github.com//pull/8231
- [7914][CORE] Flip dependency direction for gluten-celeborn @zhztheplayer github.com//pull/8241
- [7911][CORE] Flip dependency direction for gluten-hudi @zhztheplayer github.com//pull/8240
- [CORE] Minor: OverTarget is required only with sufficient memory and doesn't spill due to zero used bytes post-borrow @kecookier github.com//pull/8247
- [VL] Support concat_ws function @PHILO-HE github.com//pull/8101
- [CH] Minor, add delta profile to package.sh @lwz9103 github.com//pull/8250
- [7028][CH][Part-12] Add Local SortExec for Partition Write one pipeline mode @baibaichen github.com//pull/8237
- [7913][CORE] Flip dependency direction for gluten-uniffle @zhztheplayer github.com//pull/8242
- [8128][VL] Retry borrowing when granted size is less than requested in multi-slot and shared mode @kecookier github.com//pull/8132
- [8215][VL] Support cast timestamp to date @zml1206 github.com//pull/8212
- [8018][CORE] Introduce ApplyResourceProfileExec to apply resource profile for query stage @zjuwangg github.com//pull/8195
- [CH] Hotfix to #8212 @baibaichen github.com//pull/8259
- [DOC] Update HowTo.md to fix outdated link and test script location @zjuwangg github.com//pull/8255
- [6887][VL] Daily Update Velox Version (2024_12_16) @GlutenPerfBot github.com//pull/8238
- [VL] Allow shared dependencies for lib GCS @PHILO-HE github.com//pull/8251
- [6887][VL] Daily Update Velox Version (2024_12_17) @GlutenPerfBot github.com//pull/8248
- [CORE] Minor code cleanups for TreeMemoryConsumer @zhztheplayer github.com//pull/8254
- minor, remove deprecated gluten-clickhouse-celeborn jar @lwz9103 github.com//pull/8263
- [VL] Fix RetryOnOomMemoryTarget only spills one single consumer on retrying @zhztheplayer github.com//pull/8262
- [6887][VL] Daily Update Velox Version (2024_12_18) @GlutenPerfBot github.com//pull/8260
- [CORE][VL][CH] GHA: Update pull request paths triggering CI @zhztheplayer github.com//pull/8264
- [1632][CH]Daily Update Clickhouse Version (20241218) @kyligence-git github.com//pull/8261
- [7903][VL] move local velox patch to oap/velox @zhouyuan github.com//pull/8265
- [8268][CORE] Remove preconditions in OverAcquire.repay @kecookier github.com//pull/8269
- [6887][VL] Daily Update Velox Version (2024_12_19) @GlutenPerfBot github.com//pull/8270
- [8257][CORE] Make IDEA support IssueNavigationLink @merrily01 github.com//pull/8258
- [CORE] Use component file to discover components @zhztheplayer github.com//pull/8271
- [8108][VL] Correct the logic of null on failure behavior for try cast @acvictor github.com//pull/8107
- [1632][CH]Daily Update Clickhouse Version (20241219) @kyligence-git github.com//pull/8274
- Revert Revert "[8080][CH]Support function transform_keys/transform_values" @taiyang-li github.com//pull/8277
- [7028][CH][Part-10] Collecting Delta stats for MergeTree @baibaichen github.com//pull/8029
- [6887][VL] Daily Update Velox Version (2024_12_20) @GlutenPerfBot github.com//pull/8286
- [8356][VL] Delta support / Hudi support as Gluten components @zhztheplayer github.com//pull/8282
- [8266][VL][CI] Pre-install spark sources docker image @PHILO-HE github.com//pull/8290
- [7641][VL] Add perf analysis scripts for TPCH workload @marin-ma github.com//pull/8065
- [8266][VL] Use pre-installed resources for Spark/Celeborn @PHILO-HE github.com//pull/8294
- [6887][VL] Daily Update Velox Version (2024_12_21) @GlutenPerfBot github.com//pull/8297
- [8279][CH] Fix concat diff while single argument with array type is input @taiyang-li github.com//pull/8280
- [VL] RAS: A couple of minor fixes for RAS @zhztheplayer github.com//pull/8292
- [CH] Refactor: don't using namespace DB in header @baibaichen github.com//pull/8300
- [8244][CORE] Softaffinity use consistent hash schedule @yikf github.com//pull/8245
- [6887][VL] Daily Update Velox Version (2024_12_23) @GlutenPerfBot github.com//pull/8301
- [7641][VL] Fix security issue for perf analysis scripts @marin-ma github.com//pull/8309
- [VL] Simplify code for PartialProjectRule @zml1206 github.com//pull/8273
- [8253][CH] Fix cast failed when in-filter with tuple values @lwz9103 github.com//pull/8256
- [8050][VL] Add viewfs support in scan validation @JkSelf github.com//pull/8049
- [1632][CH]Daily Update Clickhouse Version (20241224) @kyligence-git github.com//pull/8312
- [6887][VL] Daily Update Velox Version (2024_12_25) @GlutenPerfBot github.com//pull/8334
- [CH] Disable gluten arm ci @lwz9103 github.com//pull/8337
- [INFRA][MINOR] Change the issueRegexp to (?:#|GLUTEN-)(\d+) from the vcs.xml @yikf github.com//pull/8329
- [VL] RAS: A benchmark suite for performance of query optimization @zhztheplayer github.com//pull/8339
- [Shims] Fix the code style issue of prepareWrite @rui-mo github.com//pull/8316
- [8341][CH] Fix code style and respect max_read_buffer_size for bzip2 read buffer @taiyang-li github.com//pull/8342
- [8330][VL] Improve convert the viewfs path to hdfs path @wangyum github.com//pull/8331
- [8325][CH] Fix mismatched result for $ and . in reg expression @lgbo-ustc github.com//pull/8345
- [VL] Remove compile option --enable_ep_cache @zhztheplayer github.com//pull/8350
- [8352][ICEBERG] Fix read error when partition column was dropped @lyy-pineapple github.com//pull/8353
- [CH] [MINOR] Configure Log4j2 to print logs of org.apache.iceberg for tracing ClickHouseIcebergSuite aborted issues @baibaichen github.com//pull/8361
- [7028][CH][Part-14] Refactor Case Sensitive Support for MergeTree @baibaichen github.com//pull/8346
- [8060][CORE][VL] Various fixes for the experimental GlutenShuffleManager @zhztheplayer github.com//pull/8355
- [MINOR] Avoid duplicate comment symbol in setup scripts @Zouxxyy github.com//pull/8360
- [7534][CH] Refactor and optimize sparkDecimalXXX functions @taiyang-li github.com//pull/8105
- [1632][CH]Daily Update Clickhouse Version (20241228) @kyligence-git github.com//pull/8368
- [7028][CH][Part-15] [MINOR] Fix UTs @baibaichen github.com//pull/8364
- [8327][CORE] Introduce the ConfigEntry to make the config definition more flexible @yikf github.com//pull/8328
- [6887][VL] Daily Update Velox Version (2024_12_30) @GlutenPerfBot github.com//pull/8371
- [8354][CH] Fix cse issue aggregate[Part2] @loneylee github.com//pull/8376
- [VL] Add some fixes following #8355 @zhztheplayer github.com//pull/8373
- [8375][CH] split decimal binary arithmetic functions into files @lgbo-ustc github.com//pull/8378
- [8283][CH] Eliminate CSE via ExpressionParser @lgbo-ustc github.com//pull/8284
- [7964][VL] Support S3 Bucket Config @majetideepak github.com//pull/8123
- Revert "[8327][CORE] Introduce the ConfigEntry to make the config definition more flexible (#8328)" @baibaichen github.com//pull/8382
- [VL] Various fixes for gluten-it @zhztheplayer github.com//pull/8396
- [7750][VL] Move ColumnarBuildSideRelation's memory occupation to Spark off-heap @zjuwangg github.com//pull/8127
- [6887][VL] Daily Update Velox Version (2025_01_02) @zhouyuan github.com//pull/8388
- [VL] Allow shared dependencies for s3 and abfs libs @majetideepak github.com//pull/8402
- [6887][VL] Daily Update Velox Version (2025_01_03) @GlutenPerfBot github.com//pull/8405
- [8393][VL] Fix the smj result mismatch issue @JkSelf github.com//pull/8394
- [8327][CORE][Part-1] Rename GlutenConfig.getConf to GlutenConfig.get @yikf github.com//pull/8395
- [VL] Support loading dependency libs for Oracle linux @xinghuayu007 github.com//pull/8391
- [1632][CH]Daily Update Clickhouse Version (20250103) @kyligence-git github.com//pull/8407
- [VL] Plumb ABFS config @majetideepak github.com//pull/8403
- [8398] Bump Celeborn to 0.4.3 and 0.5.2 @SteNicholas github.com//pull/8399
- [6887][VL] Daily Update Velox Version (2025_01_04) @GlutenPerfBot github.com//pull/8420
- [7602][CH] Add spark cast array to string @zhanglistar github.com//pull/8392
- [8327][CORE][Part-2] Move GlutenConfig.scala to org.apache.gluten.config package dir @yikf github.com//pull/8426
- [6887][VL] Daily Update Velox Version (2025_01_05) @GlutenPerfBot github.com//pull/8424
- [VL] Make enabling orc scan dynamically configurable @LoseYSelf github.com//pull/8433
- [TEST] Fix gluten test util getExecutedPlan @j7nhai github.com//pull/8374
- [VL] Support casting timestamp type to varchar type @PHILO-HE github.com//pull/8338
- [7502][CH]Fix orc write time zone diff @KevinyhZou github.com//pull/7523
- [8397][CH][Part-1]: Disable hdfs while compiling clickhouse backend on macOS @yxheartipp github.com//pull/8400
- [6887][VL] Daily Update Velox Version (2025_01_06) @GlutenPerfBot github.com//pull/8427
- [8304][CORE] Add an optimization rule to collapse nested get_json_object functions @KevinyhZou github.com//pull/8305
- [8183][CORE] Prune unused column project operator @liujiayi771 github.com//pull/8295
- [8408][CH] Fix compile failures on ARM @loudongfeng github.com//pull/8413
- [1632][CH]Daily Update Clickhouse Version (20250107) @kyligence-git github.com//pull/8440
- [8307][VL] Support Int64 Timestamp parquet reader @zml1206 github.com//pull/8308
- [6887][VL] Daily Update Velox Version (2025_01_07) @GlutenPerfBot github.com//pull/8439
- [7641][VL] Minor fixup for the perf analysis script @marin-ma github.com//pull/8430
- [VL] Gluten-it: New option --collect-sql-metrics=execution-time to collect slowest plan nodes into benchmark report @zhztheplayer github.com//pull/8445
- [8375][CH][MINOR] Fix USE_EMBEDDED_COMPILER @baibaichen github.com//pull/8441
- [VL] Remove an out-of-date warning message @zhztheplayer github.com//pull/8447
- [1.3][VL]update oap velox to gluten-1.3.0 @weiting-chen github.com//pull/8519
- [1.3][DOC] update spark 3.5.2 doc (#8543) @zhouyuan github.com//pull/8547
- [1.3] Preparing for Gluten v1.3.0-rc0 @weiting-chen github.com//pull/8552
New Contributors
- @wenwj0 made their first contribution github.com//pull/6286
- @wecharyu made their first contribution github.com//pull/6438
- @liujp made their first contribution github.com//pull/6462
- @majetideepak made their first contribution github.com//pull/6522
- @jiangjiangtian made their first contribution github.com//pull/6746
- @Preetesh2110 made their first contribution github.com//pull/6326
- @EpsilonPrime made their first contribution github.com//pull/6833
- @zuston made their first contribution github.com//pull/6994
- @Z1Wu made their first contribution github.com//pull/7038
- @JinHelin404 made their first contribution github.com//pull/7209
- @beliefer made their first contribution github.com//pull/7334
- @pratham76 made their first contribution github.com//pull/7308
- @CodenameGHOST007 made their first contribution github.com//pull/7411
- @Henry2SS made their first contribution github.com//pull/7476
- @Zand100 made their first contribution github.com//pull/7736
- @yabinma made their first contribution github.com//pull/8030
- @merrily01 made their first contribution github.com//pull/8104
- @clay4megtr made their first contribution github.com//pull/7553
- @xinghuayu007 made their first contribution github.com//pull/8391
- @LoseYSelf made their first contribution github.com//pull/8433
Full Changelog: github.com/apache/incubator-gluten/compare/v1.2.0...v1.3.0-rc0