\& = \1 + \2. Or, baby steps in Lisp regexp
Between writing news items, I sometimes twiddle with little pieces of Lisp.
Bob Wiley: Baby steps?
Dr. Leo Marvin: It means setting small, reasonable goals.
Here is a tiny function that I’m playing with, eventually to become part of a larger program.
(defun gijs-subhead ()
  "html tags for subheadings and headlines"
  (interactive)
  (goto-char (point-min))
  (while (re-search-forward "\\(.+?[A-Z0-9a-z]\\)\\([^\.\\|\"]$\\)" nil t)
    (replace-match "<h5>\\&</h5>" t nil)))
This goes over a text, finds all the bits that don’t end in a . (full stop) or a " (quote mark) and puts these bits in html tags.
It took me a while to understand why my earlier incantation was always eating the last character. For example, a subheading in a text would be ‘Tax evasion’ (without the single quote marks) and my function would change it into Tax evasio.
My error was in how I understood replace-match.
I first used this:
(replace-match "<h5>\\1</h5>" t nil)
but the correct version is:
(replace-match "<h5>\\&</h5>" t nil)
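The two replacement strings can be tried side by side without touching a real file. This is only a sketch: my-wrap-subheads is a made-up helper (it is not part of gijs-subhead) that runs the same search over a string in a temporary buffer, and the sample text is invented for the test.

```elisp
;; Hypothetical helper, for experimenting only: apply the regexp from
;; gijs-subhead to TEXT and replace each match using REPLACEMENT.
(defun my-wrap-subheads (text replacement)
  "Return TEXT with every match of the subheading regexp replaced by REPLACEMENT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (while (re-search-forward "\\(.+?[A-Z0-9a-z]\\)\\([^\.\\|\"]$\\)" nil t)
      (replace-match replacement t nil))
    (buffer-string)))

;; First attempt: only group 1 is put back, so the last character is lost.
(my-wrap-subheads "Tax evasion\nThe minister denied it.\n" "<h5>\\1</h5>")
;; => "<h5>Tax evasio</h5>\nThe minister denied it.\n"

;; Corrected version: the whole match is put back.
(my-wrap-subheads "Tax evasion\nThe minister denied it.\n" "<h5>\\&</h5>")
;; => "<h5>Tax evasion</h5>\nThe minister denied it.\n"
```

The sentence ending in a full stop is left alone in both cases; only the subheading is touched.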
To understand the difference, look at the re-search-forward string.
(re-search-forward "\\(.+?[A-Z0-9a-z]\\)\\([^\.\\|\"]$\\)" nil t)
The re-search-forward string consists of two parts, 1 and 2, each wrapped in () parentheses, and these are escaped by double backslashes \\. And there is, of course, the whole match, \\&. So, actually, as far as the replace-match is concerned, there are three parts.
When the replace-match takes only part ‘\\1’, it omits ‘\\2’. And that second part matches the last character of the line, whatever it is, as long as it is not a full stop or a quote mark. Hence, \\1 = \\& - \\2.
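To see the three parts with your own eyes, match-string can be asked for each of them right after the search. Again just a sketch: my-match-parts is a made-up name, and it assumes the line is tested on its own in a temporary buffer.

```elisp
;; Hypothetical helper: show what \&, \1 and \2 stand for on one line.
(defun my-match-parts (line)
  "Return a list of the whole match, group 1 and group 2 for LINE, or nil."
  (with-temp-buffer
    (insert line "\n")
    (goto-char (point-min))
    (when (re-search-forward "\\(.+?[A-Z0-9a-z]\\)\\([^\.\\|\"]$\\)" nil t)
      (list (match-string 0)      ; the whole match, \& in replace-match
            (match-string 1)      ; first group, \1
            (match-string 2)))))  ; second group, \2

(my-match-parts "Tax evasion")
;; => ("Tax evasion" "Tax evasio" "n")

(my-match-parts "The minister denied it.")
;; => nil
```

Group 2 is exactly the one character that went missing, which is the title of this post in a nutshell.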
Thanks for your patience, Cecil.