Regular expression patterns and replacements for text normalization
Source: R/yay.gen.R
Format
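A tibble with the columns id, category, purpose, pattern and replacement. A rule can carry more than one pattern, which is why the example unnests the pattern column.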
See also
String normalization functions: str_normalize(), str_normalize_file()
Other regular expression rules: regex_file_normalization
Examples
# unnest the pattern column
tidyr::unnest_longer(data = yay::regex_text_normalization,
                     col = pattern)
#> # A tibble: 9 × 5
#> id category purpose pattern replacement
#> <chr> <chr> <chr> <chr> <chr>
#> 1 uniform_quotation_marks harmonize_punctuation "use typewriter double quotes (`\"`) as quotation marks" "[“”„‟… "\""
#> 2 uniform_apostrophes harmonize_punctuation "use typewriter single quotes (`'`) as apostrophes" "[’‘‚‛… "'"
#> 3 no_break_percentages prettify_punctuation "use narrow non-breaking space between numbers and percentage signs" "\\b(\… "\\1 \\2"
#> 4 no_break_abbreviations_german prettify_punctuation "use narrow non-breaking space between characters of common German abbreviations" "(?i)\… "\\1 \\2"
#> 5 no_break_abbreviations_german prettify_punctuation "use narrow non-breaking space between characters of common German abbreviations" "(?i)\… "\\1 \\2"
#> 6 no_break_abbreviations_german prettify_punctuation "use narrow non-breaking space between characters of common German abbreviations" "(?i)\… "\\1 \\2"
#> 7 no_break_abbreviations_german prettify_punctuation "use narrow non-breaking space between characters of common German abbreviations" "(?i)\… "\\1 \\2"
#> 8 no_break_equals_sign prettify_punctuation "use narrow non-breaking space before and after certain assignments and equality comp… "(?<= … " \\1 "
#> 9 en_dash_value_ranges prettify_punctuation "use [en dash](https://www.thepunctuationguide.com/en-dash.html) instead of hyphen in… "(?<!-… "\\1–\\2"
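
The following is a minimal sketch of how pattern/replacement pairs like the ones above could be applied to a string one after another. It uses only base R and does not call yay itself. The `rules` data frame and the `apply_rules()` helper are hypothetical, and the three patterns are simplified stand-ins written for illustration, not the (truncated) patterns shipped in `regex_text_normalization`.

```r
# Hypothetical rule table mirroring the pattern/replacement columns above.
# These patterns are illustrative assumptions, NOT the ones shipped in yay.
rules <- data.frame(
  id = c("uniform_quotation_marks", "uniform_apostrophes", "no_break_percentages"),
  pattern = c(
    "[\u201c\u201d\u201e\u201f]",  # curly and low double quotation marks
    "[\u2019\u2018\u201a\u201b]",  # curly and low single quotation marks
    "(\\d) ?(%)"                   # digit, optional space, percent sign
  ),
  replacement = c(
    "\"",
    "'",
    "\\1\u202f\\2"                 # backreferences joined by U+202F (narrow no-break space)
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: apply every rule in order, feeding each result
# into the next substitution.
apply_rules <- function(x, rules) {
  for (i in seq_len(nrow(rules))) {
    x <- gsub(rules$pattern[i], rules$replacement[i], x, perl = TRUE)
  }
  x
}

normalized <- apply_rules("50 % of \u201eusers\u201c say it\u2019s fine", rules)
```

Because the rules run sequentially, their order can matter whenever one rule's output could be matched by a later pattern. For real use, str_normalize() is presumably the better entry point than a hand-rolled loop like this one.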