Title: | 'Python' Style Regular Expression Functions |
---|---|
Description: | A comprehensive set of regular expression functions based on those found in 'Python' without relying on 'reticulate'. It provides functions that intend to (1) make it easier for users familiar with 'Python' to work with regular expressions, (2) reduce the complexity often associated with regular expressions code, (3) and enable users to write more readable and maintainable code that relies on regular expression-based pattern matching. |
Authors: | Garrett Shipley [aut, cre] |
Maintainer: | Garrett Shipley <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-11-02 05:12:53 UTC |
Source: | https://github.com/pythonicr/re |
re_compile
compiles a regular expression pattern with specified flags. This function allows setting
various flags akin to regex modifiers in other programming languages like Python. The flags control various aspects
of pattern matching. This function is really just a way to set flag arguments with a constant variable.
re_compile(pattern, IGNORECASE, I, MULTILINE, M, DOTALL, S, VERBOSE, X, NOFLAG)
re_compile(pattern, IGNORECASE, I, MULTILINE, M, DOTALL, S, VERBOSE, X, NOFLAG)
pattern |
The regular expression pattern to be compiled. |
IGNORECASE |
Flag to indicate case-insensitive matching. |
I |
Abbreviation for IGNORECASE. |
MULTILINE |
Flag to indicate multi-line matching, where |
M |
Abbreviation for MULTILINE. |
DOTALL |
Flag to indicate that |
S |
Abbreviation for DOTALL |
VERBOSE |
Flag to allow a more verbose regex syntax, which can include comments and whitespace for readability. |
X |
Abbreviation for VERBOSE |
NOFLAG |
Flag to indicate that no flags should be set. |
An object of class "Pattern" representing the compiled regular expression with the specified flags.
Python re.compile() documentation
pattern <- re_compile("^abc", IGNORECASE) pattern <- re_compile("end$", M = TRUE) pattern <- re_compile("a.b", DOTALL = TRUE)
pattern <- re_compile("^abc", IGNORECASE) pattern <- re_compile("end$", M = TRUE) pattern <- re_compile("a.b", DOTALL = TRUE)
re_contains
checks whether a specified pattern (regular expression) is found within each element of a
character vector. If the provided pattern is not already a compiled pattern object, it compiles it using
re_compile
.
re_contains(pattern, string, ...)
re_contains(pattern, string, ...)
pattern |
A regular expression pattern or a compiled pattern object. |
string |
A character vector where each element is a string to be checked against the pattern. |
... |
Arguments passed on to
|
A logical vector of the same length as string
, indicating whether each element contains a match for the
pattern.
pattern <- re_compile("^abc", IGNORECASE) re_contains(pattern, "Abcdef") re_contains("xyz$", "hello world xyz")
pattern <- re_compile("^abc", IGNORECASE) re_contains(pattern, "Abcdef") re_contains("xyz$", "hello world xyz")
re_escape
escapes all special characters in a string. This function is useful when you want to treat a
string literally in a regular expression context, escaping characters that would otherwise be interpreted as special
regex operators.
re_escape(pattern)
re_escape(pattern)
pattern |
A character vector where each element is a string in which special regex characters are to be escaped. |
A character vector of the same length as pattern
.
Python re.escape() documentation
re_escape("a[bc].*d?") re_escape(".^$|*+?{}[]()")
re_escape("a[bc].*d?") re_escape(".^$|*+?{}[]()")
re_findall
extracts all occurrences of a specified pattern (regular expression) from each element of a
character vector. If the provided pattern is not already a compiled pattern object, it compiles it using
re_compile
.
re_findall(pattern, string, ...)
re_findall(pattern, string, ...)
pattern |
A regular expression pattern or a compiled pattern object. |
string |
A character vector where each element is a string from which to extract matches of the pattern. |
... |
Arguments passed on to
|
A list of character vectors, where each vector contains all the matches found in the corresponding element of
string
.
Python re.findall() documentation
pattern <- re_compile("\\b\\w+\\b") re_findall(pattern, "This is a test.") # Extracts all words re_findall("\\d+", "123 abc 456")
pattern <- re_compile("\\b\\w+\\b") re_findall(pattern, "This is a test.") # Extracts all words re_findall("\\d+", "123 abc 456")
re_fullmatch
checks whether each element of a character vector fully matches a specified pattern
(regular expression). If the provided pattern is not already a compiled pattern object, it compiles it using
re_compile
. The function ensures that the entire string matches the pattern from start to end.
re_fullmatch(pattern, string, ...)
re_fullmatch(pattern, string, ...)
pattern |
A regular expression pattern or a compiled pattern object. |
string |
A character vector where each element is a string to be matched against the pattern. |
... |
Arguments passed on to
|
A list where each element is a character vector containing the full match for the corresponding element of
string
, or character(0)
if there is no match.
Python re.fullmatch() documentation
pattern <- re_compile("\\d{3}-\\d{2}-\\d{4}") re_fullmatch(pattern, "123-45-6789") # Full match re_fullmatch("123-45-6789", "123-45-6789 and more") # No full match
pattern <- re_compile("\\d{3}-\\d{2}-\\d{4}") re_fullmatch(pattern, "123-45-6789") # Full match re_fullmatch("123-45-6789", "123-45-6789 and more") # No full match
re_match
checks whether each element of a character vector matches a specified pattern (regular
expression) at the start. If the provided pattern is not already a compiled pattern object, it compiles it using
re_compile
. The function ensures that the matching occurs at the beginning of the string.
re_match(pattern, string, ...)
re_match(pattern, string, ...)
pattern |
A regular expression pattern or a compiled pattern object. |
string |
A character vector where each element is a string to be matched against the pattern at the beginning. |
... |
Arguments passed on to
|
A list where each element is a character vector containing the match found at the start of the corresponding
element of string
, or character(0)
if there is no match at the start.
Python re.match() equivalent functionality documentation
pattern <- re_compile("\\d{3}") re_match(pattern, "123abc") re_match("abc", "xyzabc")
pattern <- re_compile("\\d{3}") re_match(pattern, "123abc") re_match("abc", "xyzabc")
re_search
searches for occurrences of a specified pattern (regular expression) within each element of
a character vector. If the provided pattern is not already a compiled pattern object, it compiles it using
re_compile
.
re_search(pattern, string, ...)
re_search(pattern, string, ...)
pattern |
A regular expression pattern or a compiled pattern object. |
string |
A character vector where each element is a string in which to search for the pattern. |
... |
Arguments passed on to
|
A list where each element is a character vector containing all matches found in the corresponding element of
string
. If no matches are found, the element will be character(0)
.
Python re.search() documentation
pattern <- re_compile("\\d+") re_search(pattern, "abc 123 xyz") # Finds "123" re_search("\\bword\\b", "A sentence with the word.") # Finds "word"
pattern <- re_compile("\\d+") re_search(pattern, "abc 123 xyz") # Finds "123" re_search("\\bword\\b", "A sentence with the word.") # Finds "word"
re_split
splits each element of a character vector into substrings based on a specified pattern
(regular expression). If the provided pattern is not already a compiled pattern object, it compiles it using
re_compile
. The function allows for controlling the maximum number of splits performed.
re_split(pattern, string, ..., maxsplit = -1L)
re_split(pattern, string, ..., maxsplit = -1L)
pattern |
A regular expression pattern or a compiled pattern object. |
string |
A character vector where each element is a string to be split. |
... |
Arguments passed on to
|
maxsplit |
The maximum number of splits to perform on each string. If -1L (default), all possible splits are performed. |
A list of character vectors, where each vector contains the substrings resulting from splitting the
corresponding element of string
.
Python re.split() documentation
pattern <- re_compile("\\s+") re_split(pattern, "Split this string") # Splits on whitespace re_split("\\W+", "Split,with!punctuation.morestuff", maxsplit = 2)
pattern <- re_compile("\\s+") re_split(pattern, "Split this string") # Splits on whitespace re_split("\\W+", "Split,with!punctuation.morestuff", maxsplit = 2)
re_sub
replaces all occurrences of a specified pattern (regular expression) in each element of a
character vector with a replacement string. If the provided pattern is not already a compiled pattern object, it
compiles it using re_compile
.
re_sub(pattern, replacement, string, ...)
re_sub(pattern, replacement, string, ...)
pattern |
A regular expression pattern or a compiled pattern object. |
replacement |
The replacement string. |
string |
A character vector where each element is a string in which the pattern will be replaced. |
... |
Arguments passed on to
|
A character vector of the same length as string
, with all occurrences of the pattern replaced by
replacement
in each element.
pattern <- re_compile("\\d+") re_sub(pattern, "number", "Replace 123 with text.") # Replaces "123" with "number" re_sub("\\s+", "-", "Split and join") # Replaces spaces with hyphens
pattern <- re_compile("\\d+") re_sub(pattern, "number", "Replace 123 with text.") # Replaces "123" with "number" re_sub("\\s+", "-", "Split and join") # Replaces spaces with hyphens