Regex – How to Interpret the “@” Symbol Before Regex in Awk on Linux

awk linux regex

This is taken from the "awk-exercises".

For the input file patterns.txt, filter lines containing three or more
occurrences of "ar" and replace the third-to-last "ar" with "X":

par car tar far Cart

part cart mart

Expected output

par car tX far Cart

pXt cart mart

awk 'BEGIN{r = @/(.*)ar((.*ar){2})/} $0~r{print gensub(r, "\\1X\\2", 1)}' patterns.txt

There is one thing I cannot understand: what does the "@" mean in the BEGIN block?

Best Answer

A little bit of background re: standard regex constants.

Without the @ prefix:

##### this:

r = /(.*)ar((.*ar){2})/

##### is comparable to this:

r = ($0 ~ /(.*)ar((.*ar){2})/)      # assign 'r' the value of the comparison, ie, r = 0 (false) or 1 (true)

NOTE: if r = /(.*)ar((.*ar){2})/ is performed in the BEGIN block (where no input has been read yet, so $0 is empty) you'll always end up with r = 0 (false)
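A quick way to see this for yourself is the sketch below. It uses a shorter pattern, /ar/, purely for illustration; the variable names in_begin and in_main are made up for this example. It should behave the same in any POSIX awk, not just GNU awk.

```shell
# r = /ar/ is shorthand for r = ($0 ~ /ar/), so r holds 0 or 1, not the pattern.

# In BEGIN no record has been read, $0 is empty, and the match fails.
in_begin=$(awk 'BEGIN { r = /ar/; print r }')

# In a main rule $0 is the current record, so the match can succeed.
in_main=$(printf 'par\n' | awk '{ r = /ar/; print r }')

echo "BEGIN block: r = $in_begin"
echo "main rule:   r = $in_main"
```

In both cases r ends up holding a number, so a later $0 ~ r would match against "0" or "1" rather than against "ar".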

The obvious objective of this line of code is to assign a regex pattern to the variable r for use later in the script.

In GNU awk there are a couple of approaches for assigning a regex pattern to a variable:

  1. dynamic regexps: r = "(.*)ar((.*ar){2})"
  2. strongly typed regex constant: r = @/(.*)ar((.*ar){2})/

So, to answer OP's question, r = @/.../ is one approach (strongly typed regex constant) available in GNU awk for assigning a regexp to a variable.