Usage

Indexes can be managed with any of the supported clients, such as hetu-cli, which is located in the bin directory of the installation.

CREATE

To create an index, run a SQL statement of the form:

CREATE INDEX [ IF NOT EXISTS ] index_name
USING [ BITMAP | BLOOM | BTREE | MINMAX ]
ON tbl_name (col_name)
WITH ( "level" = 'STRIPE' | 'PARTITION', "autoload" = true, "bloom.fpp" = '0.001', "bloom.mmapEnabled" = false [, …] )
WHERE predicate;
  • WHERE predicate can be used to create the index on selected partition(s) only.
  • WITH can be used to specify index properties or the index level. See the documentation of each index type for the properties it supports.
  • "level" defaults to 'STRIPE' if not specified.
  • "autoload" controls whether the index is automatically loaded into cache after it is created or updated. It overrides the default value hetu.heuristicindex.filter.cache.autoload-default in config.properties. If false, the index is loaded as needed, which means the first few queries may not benefit from the index while it is being loaded into cache. Setting it to true may result in high memory usage but gives the best results.

If the table is partitioned, you can specify a single partition to create an index on, or an in-predicate to specify multiple partitions:

CREATE INDEX index_name USING bloom ON hive.schema.table (column1);
CREATE INDEX index_name USING bloom ON hive.schema.table (column1) WITH ("bloom.fpp"="0.01") WHERE p=part1;
CREATE INDEX index_name USING bloom ON hive.schema.table (column1) WHERE p in (part1, part2, part3);
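
Index properties can also be combined in a single WITH clause. The following is a minimal sketch that reuses the placeholder names from the examples above (index_name, hive.schema.table, column1) and the property values shown in the syntax. It is an illustration only, not a tuning recommendation:

```sql
-- Bloom index with an explicit false-positive probability that is
-- loaded into cache automatically after it is created or updated.
-- All names are placeholders; replace them with your own.
CREATE INDEX IF NOT EXISTS index_name USING bloom ON hive.schema.table (column1)
WITH ("bloom.fpp" = '0.001', "autoload" = true);
```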

Note: If the table is partitioned on multiple levels (for example, by colA and then colB), BTree index creation is supported only on the first level (colA). Bloom, Bitmap and MinMax index creation is supported on either level (colA or colB).

SHOW

To show information about all indexes, or about a specific index by index_name, use SHOW INDEX. The information includes the index name, user, table name, index columns, index type, index status, etc.

SHOW INDEX;
SHOW INDEX index_name;

UPDATE

Update an existing index if the source table has been modified. You can check the status of the index with SHOW INDEX index_name.

UPDATE INDEX index_name;
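
A refresh after the source table has changed might look like the following sketch. The index name is a placeholder, and the exact status values reported by SHOW INDEX are not listed here, so check the output on your own deployment:

```sql
-- Inspect the index; the output includes its current status.
SHOW INDEX index_name;

-- Update the index after the source table has been modified.
UPDATE INDEX index_name;

-- Check the status again once the update has run.
SHOW INDEX index_name;
```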

DROP

To delete an index by name:

DROP INDEX index_name
WHERE predicate;
  • WHERE predicate can be used to delete the index for specific partition(s). However, if the index was initially created on the entire table, it is not possible to delete the index for a single partition.

DROP INDEX index_name WHERE p=part1;

Note: A dropped index is removed from cache after a few seconds, so the next few queries may still use it. An index is dropped automatically if its source table is dropped.

Notes on resource usage

Disk usage

Heuristic index uses the local temporary directory (by default /tmp on Linux) while creating and processing indexes, so the temporary directory should have sufficient free space. To change the temporary directory, set the following property in etc/jvm.config:

-Djava.io.tmpdir=/path/to/another/dir
