Applies pattern-based string replacements to one or more files. Expects a series of regular-expression-replacement pairs that are applied one-by-one in the given order. By default, all performed replacements are displayed on the console (verbose = TRUE) without actually changing any file content (run_dry = TRUE).

Usage

str_replace_file(
  path,
  pattern,
  process_line_by_line = FALSE,
  eol = c("LF", "CRLF", "CR", "LFCR"),
  verbose = TRUE,
  n_context_chrs = 20L,
  show_rel_path = TRUE,
  run_dry = TRUE
)

Arguments

path

Paths to the text files. A character vector.

pattern

A named character vector with patterns as names and replacements as values (c(pattern1 = replacement1)). Patterns are interpreted as regular expressions as described in the stringi::stringi-search-regex help topic. Replacements are interpreted as-is, except that references of the form \1, \2, etc. are replaced with the contents of the respective matched group (created in patterns using ()). Pattern-replacement pairs are processed in the order given, so pairs listed earlier are applied before those listed later.
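The ordered, pair-by-pair semantics described above can be sketched as follows. This is only an illustration: apply_in_order() is a hypothetical helper, and it uses base R gsub() with PCRE as a stand-in for the stringi engine that the function itself relies on, so regex details may differ.

```r
# Names are the patterns, values are the replacements.
pattern <- c(
  "(\\d+)-(\\d+)" = "\\2-\\1",  # swap the two captured groups
  "colou?r"       = "color"
)

# Hypothetical helper: applies each pair to the result of the previous one.
# Assumption: base R gsub() (PCRE) stands in for the stringi regex engine.
apply_in_order <- function(x, pattern) {
  for (i in seq_along(pattern)) {
    x <- gsub(names(pattern)[i], pattern[[i]], x, perl = TRUE)
  }
  x
}

swapped <- apply_in_order("colour 12-34", pattern)
print(swapped)  # "color 34-12"

# Order matters: the same two pairs give different results when reversed.
forward  <- apply_in_order("a", c("a" = "b", "b" = "c"))  # "a" -> "b" -> "c"
backward <- apply_in_order("a", c("b" = "c", "a" = "b"))  # "a" -> "a" -> "b"
print(c(forward, backward))
```

Because each pair sees the output of the previous one, a later pattern can match text that an earlier replacement produced.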

process_line_by_line

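Whether each line in a file should be treated as a separate string or the whole file as one single string. While the latter is faster, you probably want the former if you're using "^" or "$" in your patterns, since by default these anchors refer to the start and end of the string being processed rather than to the start and end of every line.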
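A minimal sketch of the difference, again using base R gsub() as an assumed stand-in for the actual replacement engine. The variable names are made up for illustration.

```r
file_content <- "#a\n#b"

# Whole file as one string: "^" only matches at the very start.
whole <- gsub("^#", "//", file_content, perl = TRUE)

# Line by line: split on the EOL sequence, replace per line, join again.
lines   <- strsplit(file_content, "\n", fixed = TRUE)[[1]]
by_line <- paste(gsub("^#", "//", lines, perl = TRUE), collapse = "\n")

print(whole)    # "//a\n#b"
print(by_line)  # "//a\n//b"
```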

eol

End of line (EOL) control character sequence. Only relevant if process_line_by_line = TRUE. One of

  • "LF" for the line feed (LF) character ("\n"). The standard on Unix and Unix-like systems (Linux, macOS, *BSD, etc.) and the default.

  • "CRLF" for the carriage return + line feed (CR+LF) character sequence ("\r\n"). The standard on Microsoft Windows, DOS and some other systems.

  • "CR" for the carriage return (CR) character ("\r"). The standard on classic Mac OS and some other antiquated systems.

  • "LFCR" for the line feed + carriage return (LF+CR) character sequence ("\n\r"). The standard on RISC OS and some other exotic systems.
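The four options map to the following character sequences. The sketch below (with the made-up names eol_seqs, right and wrong) shows what happens when a CRLF file is split with the matching and with a non-matching sequence. It illustrates the concept in base R and is not the function's internal code.

```r
# The EOL names from the list above and their character sequences.
eol_seqs <- c(LF = "\n", CRLF = "\r\n", CR = "\r", LFCR = "\n\r")

content <- "first\r\nsecond\r\n"  # a file with Windows line endings

# Splitting with the matching sequence yields clean lines.
right <- strsplit(content, eol_seqs[["CRLF"]], fixed = TRUE)[[1]]

# Splitting with "LF" leaves a stray carriage return at the end of each line,
# which can keep patterns anchored with "$" from matching as intended.
wrong <- strsplit(content, eol_seqs[["LF"]], fixed = TRUE)[[1]]

print(right)                  # "first" "second"
print(endsWith(wrong, "\r"))  # TRUE TRUE
```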

verbose

Whether or not to display replacements on the console.

n_context_chrs

The (maximum) number of characters displayed around the actual string and its replacement. The number refers to a single side of string/replacement, so the total number of context characters is at the maximum 2 * n_context_chrs. Only relevant if verbose = TRUE.
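To make the "per side" counting concrete, here is a rough sketch of how up to n_context_chrs characters on each side of a match could be extracted. context_around() and its bracketed output format are hypothetical and say nothing about how the function actually renders its console output.

```r
# Hypothetical helper: returns the first match of `rx` in `x`, wrapped in
# brackets, with at most `n_context_chrs` characters of context on each side.
context_around <- function(x, rx, n_context_chrs = 20L) {
  m <- regexpr(rx, x, perl = TRUE)
  if (m == -1L) return(NA_character_)
  start  <- as.integer(m)
  end    <- start + attr(m, "match.length") - 1L
  before <- substr(x, max(1L, start - n_context_chrs), start - 1L)
  after  <- substr(x, end + 1L, min(nchar(x), end + n_context_chrs))
  paste0(before, "[", substr(x, start, end), "]", after)
}

ctx_mid   <- context_around("The quick brown fox jumps", "brown", 3L)
ctx_start <- context_around("brown fox", "brown", 3L)

print(ctx_mid)    # "ck [brown] fo"
print(ctx_start)  # "[brown] fo" (less context is available on the left)
```

The second call shows why the documented number is a maximum: near the start or end of the text, fewer characters exist on that side.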

show_rel_path

Whether or not to display file paths as relative from the current working directory. If FALSE, absolute paths are displayed. Only relevant if verbose = TRUE.

run_dry

Whether or not to show replacements on the console only, without actually modifying any files. Implies verbose = TRUE.

Value

path invisibly.
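A typical workflow is to preview the replacements with a dry run first and only then modify the file. The sketch below assumes that the package providing str_replace_file() is attached. The guard simply skips the calls otherwise, and the temporary file and pattern are made up for illustration.

```r
# Create a throwaway text file to experiment on.
tmp <- tempfile(fileext = ".txt")
writeLines(c("colour: red", "colour: blue"), tmp)

# Assumption: str_replace_file() is available on the search path.
if (exists("str_replace_file", mode = "function")) {
  # Dry run (the default): replacements are shown on the console only.
  str_replace_file(path = tmp, pattern = c("colour" = "color"))

  # Same call with run_dry = FALSE to actually modify the file.
  str_replace_file(path = tmp,
                   pattern = c("colour" = "color"),
                   run_dry = FALSE)
}

# Inspect the file content afterwards.
print(readLines(tmp))
```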

Details

Note that process_line_by_line = TRUE requires the line ending standard (EOL) of the input files to be set correctly via eol. eol always defaults to "LF" (the Unix standard) because the line ending cannot be reliably detected without complex heuristics (and even then not unambiguously in all edge cases). Deriving the default from the host OS (i.e. "LF" on Unix systems like Linux and macOS and "CRLF" on Windows) would be a bad idea with regard to cross-system collaboration (files shared via Git etc.), so this was deliberately not done.
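When the line endings of a file are unknown, it can help to inspect them manually before choosing eol. The following base R sketch uses a hypothetical helper, count_eols(), to count CRLF pairs, lone LFs and lone CRs in a file's content. It is a simple aid for a human decision, not a reliable detector, and it cannot tell "LFCR" apart from mixed endings.

```r
# Hypothetical helper: counts line-ending sequences in a string.
count_eols <- function(text) {
  n_matches <- function(rx) {
    m <- gregexpr(rx, text, perl = TRUE)[[1]]
    sum(m > 0L)  # gregexpr() returns -1 when there is no match
  }
  c(
    CRLF = n_matches("\r\n"),
    LF   = n_matches("(?<!\r)\n"),  # LF not preceded by CR
    CR   = n_matches("\r(?!\n)")    # CR not followed by LF
  )
}

# Write a small file with mostly Windows line endings, byte for byte.
tmp_eol <- tempfile(fileext = ".txt")
writeBin(charToRaw("a\r\nb\r\nc\n"), tmp_eol)

# Read the raw content back without any line-ending translation.
content <- readChar(tmp_eol, file.size(tmp_eol), useBytes = TRUE)
counts  <- count_eols(content)
print(counts)
```

For this file the counts are 2 for CRLF, 1 for LF and 0 for CR, which suggests eol = "CRLF" but also reveals one inconsistent line ending.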

The text files are assumed to be UTF-8 encoded; other encodings are not supported.

See also

Other string functions: str_normalize(), str_normalize_file(), str_replace_verbose()