Regular Expression Functions

All of the regular expression functions use the Java pattern syntax, with a few notable exceptions:

  • When using multi-line mode (enabled via the (?m) flag), only \n is recognized as a line terminator. Additionally, the (?d) flag is not supported and must not be used.
  • Case-insensitive matching (enabled via the (?i) flag) is always performed in a Unicode-aware manner. However, context-sensitive and locale-sensitive matching is not supported. Additionally, the (?u) flag is not supported and must not be used.
  • Surrogate pairs are not supported. For example, \uD800\uDC00 is not treated as U+10000 and must be specified as \x{10000}. Boundaries (\b) are incorrectly handled for a non-spacing mark without a base character.
  • \Q and \E are not supported in character classes (such as [A-Z123]) and are instead treated as literals.
  • Unicode character classes (\p{prop}) are supported with the following differences:
    • All underscores in names must be removed. For example, use OldItalic instead of Old_Italic.
    • Scripts must be specified directly, without the Is, script= or sc= prefixes. Example: \p{Hiragana}
    • Blocks must be specified with the In prefix. The block= and blk= prefixes are not supported. Example: \p{InMongolian}
    • Categories must be specified directly, without the Is, general_category= or gc= prefixes. Example: \p{L}
    • Binary properties must be specified directly, without the Is prefix. Example: \p{NoncharacterCodePoint}
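
The flag and character-class rules above can be tried with a few short queries. This is a minimal sketch: the input strings are illustrative only, and chr(10) is used on the assumption that it is available to build a string containing a \n line terminator.

```sql
-- (?i): case-insensitive matching
SELECT regexp_like('HELLO world', '(?i)hello'); -- true

-- (?m): ^ matches after each \n, the only recognized line terminator
SELECT regexp_extract_all('a1' || chr(10) || 'b2', '(?m)^[a-z]'); -- [a, b]

-- Script and category names are given directly, without prefixes
SELECT regexp_like('ひらがな', '\p{Hiragana}'); -- true
SELECT regexp_extract_all('ab12cd', '\p{L}+'); -- [ab, cd]
```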

regexp_extract_all(string, pattern) -> array(varchar)

Returns the substring(s) matched by the regular expression pattern in string:

SELECT regexp_extract_all('1a 2b 14m', '\d+'); -- [1, 2, 14]

regexp_extract_all(string, pattern, group) -> array(varchar)

Finds all occurrences of the regular expression pattern in string and returns the capturing group number group:

SELECT regexp_extract_all('1a 2b 14m', '(\d+)([a-z]+)', 2); -- ['a', 'b', 'm']

regexp_extract(string, pattern) -> varchar

Returns the first substring matched by the regular expression pattern in string:

SELECT regexp_extract('1a 2b 14m', '\d+'); -- 1

regexp_extract(string, pattern, group) -> varchar

Finds the first occurrence of the regular expression pattern in string and returns the capturing group number group:

SELECT regexp_extract('1a 2b 14m', '(\d+)([a-z]+)', 2); -- 'a'
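
For comparison, the group argument can select the first capturing group instead of the second. A brief sketch reusing the same input and pattern as the examples above:

```sql
-- Group 1 is the run of digits, group 2 is the run of letters
SELECT regexp_extract('1a 2b 14m', '(\d+)([a-z]+)', 1); -- '1'
SELECT regexp_extract_all('1a 2b 14m', '(\d+)([a-z]+)', 1); -- [1, 2, 14]
```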

regexp_like(string, pattern) -> boolean

Evaluates the regular expression pattern and determines if it is contained within string.

This function is similar to the LIKE operator, except that the pattern only needs to be contained within string, rather than needing to match all of string. In other words, this performs a contains operation rather than a match operation. You can match the entire string by anchoring the pattern using ^ and $. The unanchored pattern below matches because 2b occurs within string:

SELECT regexp_like('1a 2b 14m', '\d+b'); -- true
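
To require a match of the entire string, the same pattern can be anchored with ^ and $. A small sketch of the difference, where the second input string is chosen purely for illustration:

```sql
-- Anchored: the whole string must consist of digits followed by b
SELECT regexp_like('1a 2b 14m', '^\d+b$'); -- false
SELECT regexp_like('2b', '^\d+b$'); -- true
```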

regexp_replace(string, pattern) -> varchar

Removes every instance of the substring matched by the regular expression pattern from string:

SELECT regexp_replace('1a 2b 14m', '\d+[ab] '); -- '14m'

regexp_replace(string, pattern, replacement) -> varchar

Replaces every instance of the substring matched by the regular expression pattern in string with replacement. Capturing groups can be referenced in replacement using $g for a numbered group or ${name} for a named group. A dollar sign ($) may be included in the replacement by escaping it with a backslash (\$):

SELECT regexp_replace('1a 2b 14m', '(\d+)([ab]) ', '3c$2 '); -- '3ca 3cb 14m'
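
The named-group and escaped-dollar forms can be sketched as follows. The group name num and the input strings are hypothetical, and the example assumes that named groups are declared with the Java-style (?<name>...) syntax:

```sql
-- ${num} refers to the named group "num"
SELECT regexp_replace('1a 2b 14m', '(?<num>\d+)[ab] ', '${num} '); -- '1 2 14m'

-- \$ yields a literal dollar sign; $1 is the first capturing group
SELECT regexp_replace('cost 5', '(\d+)', '\$$1'); -- 'cost $5'
```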

regexp_replace(string, pattern, function) -> varchar

Replaces every instance of the substring matched by the regular expression pattern in string using function. The lambda expression function is invoked for each match, with the capturing groups passed as an array. Capturing group numbers start at one; there is no group for the entire match (if you need this, surround the entire expression with parentheses):

SELECT regexp_replace('new york', '(\w)(\w*)', x -> upper(x[1]) || lower(x[2])); -- 'New York'
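
When the lambda needs the entire match, the whole pattern can be wrapped in parentheses so that the match arrives as x[1]. A minimal sketch with illustrative inputs; the casts in the second query are only there to show arithmetic on a captured value:

```sql
-- The outer parentheses make the full match available as group 1
SELECT regexp_replace('new york', '(\w+)', x -> upper(x[1])); -- 'NEW YORK'

-- Double every number found in the string
SELECT regexp_replace('1a 2b', '(\d+)', x -> cast(cast(x[1] AS integer) * 2 AS varchar)); -- '2a 4b'
```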

regexp_split(string, pattern) -> array(varchar)

Splits string using the regular expression pattern and returns an array. Trailing empty strings are preserved:

SELECT regexp_split('1a 2b 14m', '\s*[a-z]+\s*'); -- [1, 2, 14, ]
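
Because trailing empty strings are kept, the result can contain more elements than expected. The sketch below counts them and then drops the empty ones, assuming the array functions cardinality and filter are available in your deployment. The input string is illustrative:

```sql
-- 'a,b,,' splits into 'a', 'b', '' and ''
SELECT cardinality(regexp_split('a,b,,', ',')); -- 4

-- Remove the empty strings if they are not wanted
SELECT filter(regexp_split('a,b,,', ','), x -> x <> ''); -- [a, b]
```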
