From xradiograph

Emacs: Regular Expressions

Regexps look weird in Emacs because of Lisp: in code, a regexp is written inside a Lisp string, so its backslashes have to be doubled.

EmacsWiki:CategoryRegexp
EmacsWiki:EmacsCrashRegexp
Emacs in 24 hours, Regexp chapter
Steve Yegge’s Effective Emacs Regex section
(somebody’s) guide to Emacs’ Regexes

 

 

Why all the brackets and backslashes, sometimes?

e.g.: "\\(\\[\\[#.*\\]\\]\\)\n!!!"
Literal brackets [ and ] need escaping in a search regexp (an unescaped [ opens a character class), but not in the replacement string, where only the backslash is special.

The reason for needing to escape all of the control characters in an Emacs regexp is that the Emacs Lisp interpreter sees them prior to the regular expression engine. --Chris Smith
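
A minimal sketch of those two layers, using the example regexp from above. The sample string is made up for illustration; everything else is stock Emacs Lisp.

```elisp
;; In source code the regexp is a Lisp string, so the Lisp reader consumes
;; one level of backslashes before the regexp engine sees anything.
(length "\\(")          ; 2: a backslash and an open paren, i.e. the regexp \(
(length "\\[\\[")       ; 4: the regexp \[\[, which matches a literal [[

;; The example regexp, run against a made-up sample string.
(let ((sample "[[#anchor]]\n!!!"))      ; sample text is an assumption
  (when (string-match "\\(\\[\\[#.*\\]\\]\\)\n!!!" sample)
    (match-string 1 sample)))           ; "[[#anchor]]"
```

Typed interactively, the same regexp would need only the single backslashes, since no Lisp string is involved.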

 

Newline in Emacs regexp

  1. in minibuffer: C-q C-j (see below)
  2. in code: \n

 

[...] insert a ^J character, which Emacs uses to represent newlines in functions and commands. At the point in the regexp or replacement where you need to insert a newline, hit Ctrl-q followed by Ctrl-j. Ctrl-q is Emacs’s “quote” command: rather than executing the following keystroke, Emacs will insert the key into the current buffer or the minibuffer. (Yegge)
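
For the "in code" half of the list above, a small sketch: inside a Lisp string, \n is already the ^J character, so it can go straight into the regexp. The sample text is made up.

```elisp
;; "\n" in a Lisp string is the same single character that C-q C-j inserts.
(string= "\n" (string ?\C-j))           ; t

(with-temp-buffer
  (insert "first\nsecond\nthird")       ; sample text, made up
  (goto-char (point-min))
  ;; Replace every newline with a comma and a space.
  (while (re-search-forward "\n" nil t)
    (replace-match ", "))
  (buffer-string))                      ; "first, second, third"
```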

 

Differences between interactive and programmatic

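The escapes above differ between interactive and programmatic use. Typed interactively, a regexp needs only its own backslashes.

 

In Lisp code the regexp sits inside a string literal, so every backslash needs to be escaped again (doubled), and everything gets WORSE.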

 

M-x replace-regexp RET \(fw.getfieldsdata(.*)\{1\}\) RET trim$(\1)

 

(replace-regexp "\\(fw.getfieldsdata(.*)\\{1\\}\\)" "trim$(\\1)")
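
The replace-regexp docstring itself says the command is meant for interactive use, and points Lisp code at a re-search-forward / replace-match loop instead. Here is a sketch of the same replacement written that way. The function name is hypothetical, and passing t for FIXEDCASE is my own choice so that the replacement text is inserted exactly as written.

```elisp
(defun my/wrap-getfieldsdata-in-trim ()
  "Wrap each fw.getfieldsdata(...) call in the current buffer in trim$().
Hypothetical helper, same regexp as the replace-regexp call above."
  (goto-char (point-min))
  (while (re-search-forward "\\(fw.getfieldsdata(.*)\\{1\\}\\)" nil t)
    ;; \\1 is the text matched by the first \\( ... \\) group.
    ;; The second argument t (FIXEDCASE) stops Emacs from adjusting case.
    (replace-match "trim$(\\1)" t)))

;; Usage on made-up sample text:
(with-temp-buffer
  (insert "x = fw.getfieldsdata(1)")
  (my/wrap-getfieldsdata-in-trim)
  (buffer-string))                      ; "x = trim$(fw.getfieldsdata(1))"
```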

 

 

practical applications

using regexes to keep/remove matching lines

Sometimes, it just takes too long to find the right regex.
It would probably pay off in the long run to get it right, but deadlines are deadlines.
In these cases, I replace the offending party with something unique -- say ##-##.
Match on that as required, then replace it with the original value.
This only works when the offending text is a fixed string; it won’t work for variations.

 

Specific case -- I was trying to remove all lines containing a period. No matter what I did, passing in a regex just didn’t work. When I converted periods to XXX and removed all of those, it worked. BLEAUGH.
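
I can't reconstruct what went wrong back then, but one likely culprit is that an unescaped period matches any character, so a literal period has to be written \. interactively (M-x flush-lines RET \. RET) or "\\." in code. A sketch under that assumption; the helper name and sample text are made up. flush-lines removes matching lines and keep-lines is its opposite.

```elisp
(defun my/flush-period-lines (text)
  "Return TEXT without the lines that contain a literal period.
Hypothetical helper for illustration."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; "\\." is the regexp \. (a literal period). A bare "." would match
    ;; any character and so delete every non-empty line.
    (flush-lines "\\.")
    (buffer-string)))

(my/flush-period-lines "keep me\ndrop. me\nalso keep\nv1.2\n")
;; "keep me\nalso keep\n"
```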

 

 

replace all runs of two or more whitespace characters: \s-\{2,\}

 

;; Delete every run of two or more whitespace characters, from point onward.
(while (re-search-forward "\\s-\\{2,\\}" nil t)
  (replace-match ""))
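
The loop above replaces each run with the empty string, which glues the neighbouring words together. If the goal is to collapse each run to a single space instead (an assumption about intent), only the replacement changes. The function name and sample text are hypothetical. Note that \s- means "whitespace according to the current syntax table", so whether a newline counts depends on the mode.

```elisp
(defun my/collapse-whitespace ()
  "Collapse each run of 2+ whitespace characters in the buffer to one space.
Hypothetical helper for illustration."
  (goto-char (point-min))
  (while (re-search-forward "\\s-\\{2,\\}" nil t)
    (replace-match " ")))

;; Usage on made-up sample text:
(with-temp-buffer
  (insert "a   b  c d")
  (my/collapse-whitespace)
  (buffer-string))                      ; "a b c d"
```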

 

Find all 4-digit numbers (interactive)

\([0-9]\{4\}\)
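
A programmatic sketch of the same search, collecting the matches from a string. The helper name and sample text are made up. One thing to be aware of: the pattern as written also matches the first four digits of a longer number, so a variant with \b word boundaries is shown too.

```elisp
(defun my/collect-matches (regexp string)
  "Return a list of all non-overlapping matches of REGEXP in STRING.
Hypothetical helper. Assumes REGEXP never matches the empty string."
  (let ((start 0)
        (matches '()))
    (while (string-match regexp string start)
      (push (match-string 0 string) matches)
      (setq start (match-end 0)))       ; continue after this match
    (nreverse matches)))

(my/collect-matches "[0-9]\\{4\\}" "in 1999, 2014 and 123456")
;; ("1999" "2014" "1234")

;; \\b is a word boundary; it depends on the current buffer's syntax table.
(my/collect-matches "\\b[0-9]\\{4\\}\\b" "in 1999, 2014 and 123456")
;; ("1999" "2014")
```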

 

Find all sequences of 3 or 4 uppercase letters

\([[:upper:]]\{3,4\}\)
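
One catch worth knowing: per the Emacs Lisp manual, [:upper:] also matches lowercase letters when case-fold-search is non-nil, which is the default. A sketch that binds it to nil for the duration of the search; the sample text is made up.

```elisp
(let ((case-fold-search nil)            ; make [:upper:] mean uppercase only
      (sample "see the USA and nato docs"))
  (when (string-match "[[:upper:]]\\{3,4\\}" sample)
    (match-string 0 sample)))           ; "USA"
```

Interactively, the equivalent is to turn case folding off first, for instance with M-x toggle-case-fold-search.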

 

convert XML tags to UPPERCASE

I know there’s a better way of handling underscores.
But this here is today’s dirty-laundry....

 

This uses the replace-regexp evaluation feature (\,(...) in the replacement) introduced in Emacs 22, which I read about waaaay back when in Steve Yegge’s Shiny and New: Emacs 22, and never seriously used before today. And I’m on Emacs 24. O tempora o mores!

 

There are a lot more examples at EmacsWiki:ReplaceRegexp

 

search: \(</?\w+_?\w+?_?\w+?>\)
replace: \,(upcase \1)

 

<locate>
        <request>
                <id>1</id>
        </request>
        <response>
                <constituent>
                        <id>1</id>
                        <mailing_name>Mr. Test Name</mailing_name>
                        <alpha_sort_name>NAME      TEST</alpha_sort_name>
                        <address>
                                <addr_sub_type>NADR</addr_sub_type>
                                <line_1_text>123 Main St.</line_1_text>
                                <line_2_text></line_2_text>
                                <line_3_text></line_3_text>
                                <line_4_text></line_4_text>
                                <city_name>Great Town</city_name>
                                <stae_code>MD</stae_code>
                                <prov_code></prov_code>
                                <zip_code>99999</zip_code>
                                <intl_postal_code></intl_postal_code>
                                <cntr_code>USA</cntr_code>
                        </address>
                </constituent>
                <status>0</status>
                <error>No Error</error>
        </response>
</locate>

 


<LOCATE>
        <REQUEST>
                <id>1</id>
        </REQUEST>
        <RESPONSE>
                <CONSTITUENT>
                        <id>1</id>
                        <MAILING_NAME>Mr. Test Name</MAILING_NAME>
                        <ALPHA_SORT_NAME>NAME      TEST</ALPHA_SORT_NAME>
                        <ADDRESS>
                                <ADDR_SUB_TYPE>NADR</ADDR_SUB_TYPE>
                                <LINE_1_TEXT>123 Main St.</LINE_1_TEXT>
                                <LINE_2_TEXT></LINE_2_TEXT>
                                <LINE_3_TEXT></LINE_3_TEXT>
                                <LINE_4_TEXT></LINE_4_TEXT>
                                <CITY_NAME>Great Town</CITY_NAME>
                                <STAE_CODE>MD</STAE_CODE>
                                <PROV_CODE></PROV_CODE>
                                <ZIP_CODE>99999</ZIP_CODE>
                                <INTL_POSTAL_CODE></INTL_POSTAL_CODE>
                                <CNTR_CODE>USA</CNTR_CODE>
                        </ADDRESS>
                </CONSTITUENT>
                <STATUS>0</STATUS>
                <ERROR>No Error</ERROR>
        </RESPONSE>
</LOCATE>
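
The interactive pattern needs at least three word characters and allows at most two underscores, which is why <id> stays lowercase in the result above. As a possible "better way" for the underscores, here is a sketch of the same conversion in Lisp using a character class. The function name is hypothetical, and the pattern is my assumption rather than the one used above: it upcases <id> as well, and it deliberately ignores tags with attributes.

```elisp
(defun my/upcase-xml-tags ()
  "Upcase every attribute-less XML tag in the current buffer.
Hypothetical helper for illustration."
  (save-excursion
    (goto-char (point-min))
    ;; [[:alnum:]_]+ covers letters, digits and any number of underscores.
    (while (re-search-forward "</?[[:alnum:]_]+>" nil t)
      ;; FIXEDCASE and LITERAL are both t: insert the upcased text verbatim.
      (replace-match (upcase (match-string 0)) t t))))

;; Usage on a made-up fragment:
(with-temp-buffer
  (insert "<zip_code>99999</zip_code><id>1</id>")
  (my/upcase-xml-tags)
  (buffer-string))
;; "<ZIP_CODE>99999</ZIP_CODE><ID>1</ID>"
```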

 

See Also

Programming.Regular Expression

 

Tags

Emacs regexps RegularExpressions regexes

Retrieved from http://www.xradiograph.com/Emacs/RegularExpressions
Page last modified on February 26, 2014, at 04:12 PM