Applies pattern-based string replacements to one or more files. Expects a series of regular-expression-replacement pairs that are applied one-by-one in the given order. By default, all performed replacements are displayed on the console (verbose = TRUE) without actually changing any file content (run_dry = TRUE).
Usage
str_replace_file(
  path,
  pattern,
  process_line_by_line = FALSE,
  eol = c("LF", "CRLF", "CR", "LFCR"),
  verbose = TRUE,
  n_context_chrs = 20L,
  show_rel_path = TRUE,
  run_dry = TRUE
)
Arguments
- path
Paths to the text files. A character vector.
- pattern
A named character vector with patterns as names and replacements as values (c(pattern1 = replacement1)). Patterns are interpreted as regular expressions as described in stringi::stringi-search-regex(). Replacements are interpreted as-is, except that references of the form \1, \2, etc. will be replaced with the contents of the respective matched group (created in patterns using ()). Pattern-replacement pairs are processed in the order given, meaning that pairs listed earlier are applied before those listed later.
- process_line_by_line
Whether each line in a file should be treated as a separate string or the whole file as one single string. While the latter is more performant, you probably want the former if you're using "^" or "$" in your patterns.
- eol
End of line (EOL) control character sequence. Only relevant if process_line_by_line = TRUE. One of
  - "LF" for the line feed (LF) character ("\n"). The standard on Unix and Unix-like systems (Linux, macOS, *BSD, etc.) and the default.
  - "CRLF" for the carriage return + line feed (CR+LF) character sequence ("\r\n"). The standard on Microsoft Windows, DOS and some other systems.
  - "CR" for the carriage return (CR) character ("\r"). The standard on classic Mac OS and some other antiquated systems.
  - "LFCR" for the line feed + carriage return (LF+CR) character sequence ("\n\r"). The standard on RISC OS and some other exotic systems.
- verbose
Whether or not to display replacements on the console.
- n_context_chrs
The (maximum) number of characters displayed around the actual string and its replacement. The number refers to a single side of string/replacement, so the total number of context characters is at most 2 * n_context_chrs. Only relevant if verbose = TRUE.
- show_rel_path
Whether or not to display file paths as relative to the current working directory. If FALSE, absolute paths are displayed. Only relevant if verbose = TRUE.
- run_dry
Whether or not to show replacements on the console only, without actually modifying any files. Implies verbose = TRUE.
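The ordering, back-reference and line-by-line behaviour described for pattern and process_line_by_line can be sketched in base R. This is only an illustration under stated assumptions: apply_rules() is a hypothetical helper, and base R's gsub(perl = TRUE) stands in for the stringi regex engine that the function itself relies on.

```r
# Hypothetical helper (not part of the package): apply each
# pattern/replacement pair in the order given.
apply_rules <- function(x, rules) {
  for (i in seq_along(rules)) {
    x <- gsub(names(rules)[i], rules[[i]], x, perl = TRUE)
  }
  x
}

# Order matters: the second pair sees the output of the first.
rules <- c("foo" = "bar", "bar" = "baz")
in_order <- apply_rules("foo bar", rules)
# With the pairs reversed, "bar" is handled before "foo" is turned into "bar".
reversed <- apply_rules("foo bar", rev(rules))

# Back-references \1 and \2 (written "\\1" and "\\2" in R strings)
# insert the contents of the matched groups.
swapped <- apply_rules("key=value", c("(\\w+)=(\\w+)" = "\\2=\\1"))

# "^" anchors at the start of the string. Treating the whole file as one
# string therefore only touches the first line, while treating each line
# as a separate string touches every line.
whole_file <- "# a\n# b"
by_line <- strsplit(whole_file, "\n", fixed = TRUE)[[1]]
res_whole <- apply_rules(whole_file, c("^# " = ""))
res_lines <- apply_rules(by_line, c("^# " = ""))
```

Here in_order is "baz baz" and reversed is "bar baz", which is why the order of the pairs in pattern deserves attention when one replacement can produce text that another pattern matches.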
Details
Note that process_line_by_line = TRUE requires the line ending standard (EOL) of the input files to be correctly set via eol. It always defaults to "LF" (Unix standard) since the line ending cannot be reliably detected without complex heuristics (and even then not unambiguously in all edge cases). Deriving the default from the host OS (i.e. "LF" on Unix systems like Linux and macOS and "CRLF" on Windows) would be a poor choice with regard to cross-system collaboration (files shared via Git etc.), so this was deliberately avoided.
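Why a mismatched eol matters can be illustrated with a small base R sketch. It assumes that line-by-line processing boils down to splitting the file content on the EOL sequence, which is an assumption about the mechanism, not a description of the function's actual internals.

```r
# A two-line text with Windows line endings (CR+LF).
txt <- "first\r\nsecond"

# Splitting on the wrong EOL ("LF") leaves a stray "\r" at the end of the
# first line, which patterns relying on "$" would have to account for.
wrong <- strsplit(txt, "\n", fixed = TRUE)[[1]]
has_stray_cr <- endsWith(wrong[1], "\r")

# Splitting on the matching EOL ("CRLF") yields clean lines.
right <- strsplit(txt, "\r\n", fixed = TRUE)[[1]]
```

With the matching sequence, right is c("first", "second"), whereas wrong[1] still carries the carriage return.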
The text files are assumed to be UTF-8 encoded; other encodings are not supported.
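Files in another encoding would need to be converted beforehand. The following base R sketch shows one possible way to do so, assuming the source encoding is known to be Latin-1; the file paths are throwaway temporary files created only for illustration.

```r
# A Latin-1 encoded file: "café" with "é" stored as the single byte 0xE9.
src <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xE9, 0x0A)), src)

# Read the lines, declaring their actual encoding, and convert to UTF-8.
lines <- readLines(src, encoding = "latin1")
lines_utf8 <- iconv(lines, from = "latin1", to = "UTF-8")

# Write the converted text as-is (useBytes = TRUE avoids re-encoding
# into the native encoding of the current locale).
dst <- tempfile(fileext = ".txt")
writeLines(lines_utf8, dst, useBytes = TRUE)
```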
See also
Other string functions: str_normalize(), str_normalize_file(), str_replace_verbose()
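Examples
A minimal sketch of a typical workflow, not taken from the package's own examples: preview the replacements in a dry run first, then rerun with run_dry = FALSE once the report looks right. It assumes the package providing str_replace_file() is attached; the exists() guard merely keeps the sketch runnable when it is not. The temporary file and the rules vector are made up for illustration.

```r
# Create a throwaway text file to experiment on.
tmp <- tempfile(fileext = ".txt")
writeLines(c("colour = red", "colour = blue"), tmp)

# Patterns are the names, replacements the values; pairs run in the order given.
rules <- c("colour" = "color",
           "(\\w+) = (\\w+)" = "\\2 = \\1")

# Assumption: str_replace_file() is available in the current session.
if (exists("str_replace_file")) {
  # Dry run (the default): replacements are reported on the console,
  # the file itself is left untouched.
  str_replace_file(path = tmp,
                   pattern = rules,
                   process_line_by_line = TRUE)

  # After checking the report, the changes could be applied for real:
  # str_replace_file(path = tmp, pattern = rules,
  #                  process_line_by_line = TRUE, run_dry = FALSE)
}

# The dry run does not modify the file.
unchanged <- readLines(tmp)
```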