Tuning openLooKeng

The default openLooKeng settings should work well for most workloads. The following information may help you if your cluster is facing a specific performance problem.

Config Properties

See Properties Reference.

JVM Settings

The following can be helpful for diagnosing GC issues:


Performance Tuning Notes

Below are few notes on configurations which help with optimization and tuning query execution performance.

Common Table Expression(CTE) Optimization

Common Table expressions are common sub-plans under the query plan which are used more than ones within the query. openLooKeng engine assesses the usage of such CTE plan nodes for feasibility of reuse based optimization. This optimization is applied in 2 scenarios:

Reuse Execution Pipeline

Given the configuration optimizer.cte-reuse-enabled is enabled: Same CTE nodes in a given query are arranged in the stage pipeline such that only 1 executes and other reuse the output from the first node wiz. deemed as producer and reset as consumers. Its important to note that CTE pipeline cannot be used if CTE is used in Self join with the same CTE. Pipeline mode execution reduces the duplicate scanning and processing of the operations, thereby reducing the round trip time for the query.

Materialize and Reused

Given the configuration enable-cte-result-cache is enabled: CTE results are materialized to user’s choice of storage(refer data-cache configuration). This approach caches the output of the CTE node which is read and reused for subsequent queries after materialization succeeds.

Plan optimizations

  • Use exact partitioning: When enabled this forces data repartitioning unless the partitioning of upstream stage matches exactly what downstream stage expects (refer: exact partitioning).

  • Adaptive aggregations: This feature engages adaptive partial aggregation; it is governed by following 3 configurations:

  1. Enable Adaptive Partial Aggregation: enable the feature.
  2. Minimum rows threshold: Minimum number of processed rows before partial aggregation might be adaptively turned off.
  3. Unique rows ratio threshold: Ratio between aggregation output and input rows above which partial aggregation might be adaptively turned off.
  • Join Independence assumptions for Selectivity estimates:
  1. Multi-Clause join Independence factor: Scales the strength of independence assumption for selectivity estimates of multi-clause joins.
  2. Filter Conjunction Independence factor: Scales the strength of independence assumption for selectivity estimates of the conjunction of multiple filters.
  • Execution policy for better filtering: Specifies the execution policy enforced by the scheduler. One of following set of execution policies can be configured (refer execution policy specification):
  1. all-at-once: This policy makes available all stages for scheduler to process and start.
  2. phased: This policy follows the stage dependency based on the producer source, and schedule all independent stages together.
  3. prioritize-utilization: This policy follows the stage dependency in addition to producer source, it also looks at dynamic filters producers for dependent paths.






● 拼写,格式,无效链接等错误;

● 技术原理、功能、规格等描述和软件不一致,存在错误;

● 原理图、架构图等存在错误;

● 版本号不匹配:文档版本或内容描述和实际软件不一致;

● 对重要数据或系统存在风险的操作,缺少安全提示;

● 排版不美观,影响阅读;


● 描述存在歧义;

● 图形、表格、文字等晦涩难懂;

● 逻辑不清晰,该分类、分项、分步骤的没有给出;


● 很难通过搜索引擎,openLooKeng官网,相关博客找到所需内容;


● 命令、命令参数等错误;

● 命令无法执行或无法完成对应功能;


● 关键步骤错误或缺失,无法指导用户完成任务,比如安装、配置、部署等;

● 逻辑不清晰,该分类、分项、分步骤的没有给出

● 图形、表格、文字等晦涩难懂

● 缺少必要的前提条件、注意事项等;

● 描述存在歧义