Adding your own Index Type

Basic ideas

ID

Each index type must have an ID: a unique identifier that serves as the canonical name of the index type in the CLI, server configuration, file paths, and so on.

The getID() method in the Index interface returns the ID of this index type to any infrastructure that uses it.

Level

A heuristic index stores additional, usually partial, information about a dataset in a compact form to speed up lookups. Each index therefore has a domain to which it applies. For instance, if an index records the max value of a data set, we must know the extent of that data set when we define the “max” value (i.e. it can be the max of a group of rows, a data partition, or even a whole table). A new index type must implement the method Set<Level> getSupportedIndexLevels(); which returns the data set levels it supports. The levels are defined as an enum in the Index interface.
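As a rough illustration of the ID and level methods, the sketch below defines a simplified stand-in for the Index interface and one hypothetical index type. The names SketchIndex, MaxValueIndex, the ID string "MAXVALUE", and the Level constants are assumptions made for this example; the real constants are defined in the Index interface.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified stand-in for the real Index interface. Only the two methods
// discussed above are modeled, and the Level constants are assumptions.
interface SketchIndex {
    enum Level { ROW_GROUP, PARTITION, TABLE }

    // Canonical name of the index type.
    String getID();

    // Data set levels on which this index type can be built.
    Set<Level> getSupportedIndexLevels();
}

// Hypothetical index type that records the max value of a column.
class MaxValueIndex implements SketchIndex {
    // Hypothetical ID; it must be unique among all index types.
    static final String ID = "MAXVALUE";

    @Override
    public String getID() {
        return ID;
    }

    @Override
    public Set<Level> getSupportedIndexLevels() {
        // This sketch supports a group of rows or a partition, but not a whole table.
        return EnumSet.of(Level.ROW_GROUP, Level.PARTITION);
    }
}
```

Returning an EnumSet keeps the supported levels explicit and cheap to query with contains().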

Interface overview

Indexing methods

Apart from the methods mentioned above, this section gives a quick guide to the most important methods needed to create a new index type. For complete documentation of the Index interface, please refer to the Javadoc in the source code.

The Index interface provides two main lookup methods:

boolean matches(Object expression) throws UnsupportedOperationException;

<I> Iterator<I> lookUp(Object expression) throws UnsupportedOperationException;

The first method, matches(), takes an expression and returns whether a predicate (the expression) can hold on a specific Level of data. For example, if an index records the max value of an integer column, it can easily tell whether col_val > 5 can hold by comparing 5 to the max value.

The second method, lookUp(), is optional. Instead of returning only a boolean indicating whether a predicate can hold, it returns an iterator over all the possible positions where the data can be found. For the same expression, matches() and lookUp().hasNext() should always return the same result.
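The relationship between the two methods can be sketched as follows. This is a minimal illustration, not the real implementation: GreaterThan is a hypothetical stand-in for the expression object, ChunkMaxIndex and addChunk() are invented for the example, and lookUp() returns Iterator<Integer> here rather than the generic <I> Iterator<I> of the interface.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical predicate "col_val > bound"; the real expression type is not modeled here.
final class GreaterThan {
    final long bound;

    GreaterThan(long bound) {
        this.bound = bound;
    }
}

// Sketch of an index that remembers the max value of each chunk of an integer column.
class ChunkMaxIndex {
    private final List<Long> chunkMax = new ArrayList<>();

    // Records one chunk; its position is the order in which it was added.
    void addChunk(long... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("a chunk needs at least one value");
        }
        long max = values[0];
        for (long v : values) {
            max = Math.max(max, v);
        }
        chunkMax.add(max);
    }

    // Returns the positions of all chunks that may contain a matching value.
    Iterator<Integer> lookUp(Object expression) throws UnsupportedOperationException {
        if (!(expression instanceof GreaterThan)) {
            throw new UnsupportedOperationException("only GreaterThan is supported in this sketch");
        }
        long bound = ((GreaterThan) expression).bound;
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < chunkMax.size(); i++) {
            // col_val > bound can only hold in a chunk whose max exceeds the bound.
            if (chunkMax.get(i) > bound) {
                positions.add(i);
            }
        }
        return positions.iterator();
    }

    // Delegating to lookUp() keeps matches() and lookUp().hasNext() consistent.
    boolean matches(Object expression) throws UnsupportedOperationException {
        return lookUp(expression).hasNext();
    }
}
```

Deriving matches() from lookUp() is one simple way to guarantee that the two never disagree; an index that does not implement lookUp() would compute the boolean directly instead.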

Adding and persisting values

The following methods are used to add values to the index and to persist index objects to disk:

boolean addValues(Map<String, List<Object>> values) throws IOException;

Index deserialize(InputStream in) throws IOException;

void serialize(OutputStream out) throws IOException;

Their usage is fairly straightforward. A good example is the source code of MinMaxIndex, where adding values simply updates the max and min variables according to the input number; the same class also shows how serialize() and deserialize() can be implemented.
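For readers without the source at hand, the sketch below shows one way these three methods could fit together for a min/max style index. It is not the actual MinMaxIndex code: the class name MinMaxSketch, the two-long on-disk layout, the accessors, and the meaning of the boolean returned by addValues() are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.Map;

// Hypothetical min/max index over numeric values; not the real MinMaxIndex.
class MinMaxSketch {
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    // Folds every numeric value of every column into the running min and max.
    // Returning true to signal success is an assumption of this sketch.
    boolean addValues(Map<String, List<Object>> values) throws IOException {
        for (List<Object> columnValues : values.values()) {
            for (Object v : columnValues) {
                if (!(v instanceof Number)) {
                    continue; // skip nulls and non-numeric values
                }
                long n = ((Number) v).longValue();
                min = Math.min(min, n);
                max = Math.max(max, n);
            }
        }
        return true;
    }

    // Writes the state as two big-endian longs: min, then max (assumed layout).
    void serialize(OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeLong(min);
        data.writeLong(max);
        data.flush();
    }

    // Restores the state written by serialize() and returns this index.
    MinMaxSketch deserialize(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        min = data.readLong();
        max = data.readLong();
        return this;
    }

    long getMin() {
        return min;
    }

    long getMax() {
        return max;
    }
}
```

Whatever layout an index chooses, deserialize() has to read the fields in exactly the order serialize() wrote them, so the two methods are best written and tested as a pair.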
