Filesystem Access Utilities

Overview

openLooKeng project includes a set of filesystem client utilities to help access and modifying files. Currently, two sets of filesystems are supported: HDFS and local filesystem. A HetuFileSystemClient interface is provided in the SPI, which defines the common file operations to be used in the project. The goal of this client is to provide unified interface, behaviors and exceptions across different filesystems. Therefore client code can easily reuse codes and transfer their logic without having to change the code.

A utility class FileBasedLock located in SPI can provide exclusive access on a given filesystem. It takes advantage of the unified filesystem client interface thus works for different filesystems.

The filesystem clients are implemented as openLooKeng Plugins. The implementations are placed in the hetu-filesystem-client module, where one factory extending HetuFileSystemClientFactory must be implemented and registered in HetuFileSystemClientPlugin.

Filesystem Profiles

A filesystem profile must contain a type field:

fs.client.type=<filesystem type>

Additional configs used by the corresponding filesystem client can be provided. For example, hdfs filesystem needs paths to config resource files core-site.xml and hdfs-site.xml. If the filesystem enables authentication, such as KERBEROS, credentials should be specified in the profile as well.

Multiple profiles defining filesystem clients can be placed in etc/filesystem. Similar to catalogs, they will be loaded upon server start by a filesystem manager which also produce filesystem clients to users in presto-main. Client modules can read profiles in this folder by themselves, but it is highly recommended that client modules obtain a client which is provided by the main module in order to prevent dependency issues.

A typical use case of having multiple profiles is when multiple hdfs are used and preset. In this case create a property file for each cluster and include their own authentication information in different profiles, for example hdfs1.properties and hdfs2.properties. Client code will be able to access them by specifying the profile name (file name), such as getFileSystemClient("hdfs1", <rootPath>) from the FileSystemClientManager.

For each local and hdfs filesystem client, it requires a root path to which the access to filesystem is limited. All access beyond this root dir will be denied. It is suggested to use the deepest directory that meets the need as the client root, in order to avoid the risk of modifying (or deleting) additional files by accident.

Filesystem profile properties

Property NameMandatoryDescriptionDefault Value
fs.client.typeYESThe type of filesystem profile. Accepted values: local, hdfs.
hdfs.config.resourcesNOPath to hdfs resource files (e.g. core-site.xml, hdfs-site.xml)Use Local hdfs
hdfs.authentication.typeYEShdfs authentication Accepted values: KERBEROS, NONE
hdfs.krb5.conf.pathYES if auth type set to KERBEROSPath to the krb5 config file
hdfs.krb5.keytab.pathYES if auth type set to KERBEROSPath to the kerberos keytab file
hdfs.krb5.principalYES if auth type set to KERBEROSPrincipal of kerberos authentication
fs.hdfs.impl.disable.cacheNODisable cache in hdfs.false

openLooKengFileSystemClient

The unified HetuFileSystemClient sets the standard of filesystem access for:

  • Method signatures
  • Behaviors
  • Exceptions

The ultimate goal of doing all these unification is to increase the reusability of code that needs to perform file operations on multiple filesystem.

Method signatures and behaviors

The methods follow the same signature as those in java.nio.file.spi.FileSystemProvider and java.nio.file.Files in order to maximize compatibility. There are also additional methods that are not available in java.nio package to supply useful functionalities (e.g. deleteRecursively()).

Exceptions

Similar to method signatures, exception patterns strictly follow the same as those produced in Files. For other filesystems, the implementation translates exceptions to Java local java.nio.file.FileSystemException which is most semantically equivalent. This allows the client code to:

  1. handle IOException without worrying about the underlying filesystem.
  2. decouple from extra dependencies, such as hadoop

Here are examples of exceptions that are translated from hdfs:

  • org.apache.hadoop.fs.FileAlreadyExistsException -> java.nio.file.FileAlreadyExistsException
  • org.apache.hadoop.fs.PathIsNotEmptyDirectoryException -> java.nio.file.DirectoryNotEmptyException

FileBasedLock

A FileBasedLock utility is provided in io.prestosql.spi.filesystem to be used together with HetuFileSystemClient. This lock is file-based, and follows “try-and-back-off-on-exception” pattern to serialize operations in concurrent scenarios.

The lock design has a two-step file based lock schema: .lockFile to mark lock status and .lockInfo to record lock ownership:

  1. Before a client obtains the lock, it first checks if a valid (non-expire) .lockFile exists.

  2. If not it tries to access the information stored in .lockInfo to see if the lock is held by itself, or try to write its uuid into the .lockInfo file if a valid file does not exist.

  3. After all these it starts a background thread to keep updating .lockFile to inform others that the filesystem has been locked and renew it.

  4. If any of the check above indicates that the lock has been acquired by others, or any exception occurs during the process, the current process/thread gives up (back-off) locking and tries again in the next loop.

有奖捉虫

“有虫”文档片段

0/500

存在的问题

文档存在风险与错误

● 拼写,格式,无效链接等错误;

● 技术原理、功能、规格等描述和软件不一致,存在错误;

● 原理图、架构图等存在错误;

● 版本号不匹配:文档版本或内容描述和实际软件不一致;

● 对重要数据或系统存在风险的操作,缺少安全提示;

● 排版不美观,影响阅读;

内容描述不清晰

● 描述存在歧义;

● 图形、表格、文字等晦涩难懂;

● 逻辑不清晰,该分类、分项、分步骤的没有给出;

内容获取有困难

● 很难通过搜索引擎,openLooKeng官网,相关博客找到所需内容;

示例代码有错误

● 命令、命令参数等错误;

● 命令无法执行或无法完成对应功能;

内容有缺失

● 关键步骤错误或缺失,无法指导用户完成任务,比如安装、配置、部署等;

● 逻辑不清晰,该分类、分项、分步骤的没有给出

● 图形、表格、文字等晦涩难懂

● 缺少必要的前提条件、注意事项等;

● 描述存在歧义

0/500

您对文档的总体满意度

非常不满意
非常满意

请问是什么原因让您参与到这个问题中

您的邮箱

创Issue赢奖品
根据您的反馈,会自动生成issue模板。您只需点击按钮,创建issue即可。
有奖捉虫