C – Meaning of Strange Question Marks (Trigraphs)

c++trigraphs

I came across some weird-looking code. It doesn't even look like C, yet to my surprise it compiles and runs on my C compiler. Is this some non-standard extension to the C language and if so, what is the reason for it?

??=include <stdio.h>

int main()
??<
  const char arr[] = 
  ??<
    0xF0 ??! 0x0F,
    ??-0x00,
    0xAA ??' 0x55
  ??>;

  for(int i=0; i<sizeof(arr)/sizeof(*arr); i++)
  ??<
    printf("%X??/n", (unsigned char)arr??(i??));
  ??>

  return 0;
??>

Output:

FF
FF
FF

Best Answer

The code conforms to the C standard from C99 through C17 (C90 does not allow the declaration of i inside the for statement). The ?? mechanism is called trigraphs and was introduced to C as an alternative way of writing certain symbols in source code. The program looks like it was written as a demonstration of the various trigraph sequences.
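As a rough sketch, here are the three initializer expressions from the question with each trigraph replaced by the character it stands for. The function name demo_values and the explicit casts are additions for illustration and are not part of the original program.

```c
/* The three initializer expressions from the question, written with
   ordinary punctuation instead of trigraphs.
   demo_values is a hypothetical helper name used only for this sketch. */
static void demo_values(unsigned char out[3])
{
    out[0] = (unsigned char)(0xF0 | 0x0F);  /* bitwise OR  of the two nibble masks */
    out[1] = (unsigned char)(~0x00);        /* bitwise NOT of zero, truncated to one byte */
    out[2] = (unsigned char)(0xAA ^ 0x55);  /* bitwise XOR of two complementary patterns */
}
```

Assuming 8-bit bytes, every expression has all eight low bits set, which is why the program prints FF three times.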

Back in the day, many computers and their keyboards were based on national variants of an old character set called ISO 646, which did not contain all the symbols used in the C language, such as \ { } [ ]. This made it impossible for programmers in some countries to even write C, because their national keyboard layout lacked the necessary symbols. Rather than remaking the keyboards and character sets, the C language was changed.

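Therefore trigraphs were introduced. Today they are considered completely obsolete and using them is not recommended.[1] GCC, for example, warns when it encounters them, and in its default GNU dialect mode it does not replace them unless -trigraphs or a strict ISO mode such as -std=c11 is given. They nevertheless remained in the C standard through C17 for backwards compatibility, and a compiler conforming to those versions must support them.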

The existing trigraph sequences are (C11 5.2.1.1 Trigraph sequences):

??=  #
??(  [
??/  \
??)  ]
??'  ^
??<  {
??!  |
??>  }
??-  ~

The left column is the trigraph sequence and the right column is its meaning.
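The table can also be read as a plain text substitution. The sketch below is a hypothetical helper, replace_trigraphs, that applies the mapping to a string. It only illustrates the idea: a real compiler performs the replacement on the whole source file in its first translation phase, before tokenization. The function name and the buffer contract are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src to dst, replacing each trigraph with the
   character it stands for. dst must have room for strlen(src) + 1 bytes.
   The lookup tables never contain two question marks in a row, so this
   file itself is free of trigraphs. */
static void replace_trigraphs(const char *src, char *dst)
{
    static const char third[]  = "=(/)'<!>-";   /* third character of each trigraph */
    static const char actual[] = "#[\\]^{|}~";  /* replacement character, same order */

    while (*src != '\0') {
        /* A trigraph is two question marks followed by one of the nine
           characters in the table; anything else is copied unchanged. */
        if (src[0] == '?' && src[1] == '?' && src[2] != '\0') {
            const char *hit = strchr(third, src[2]);
            if (hit != NULL) {
                *dst++ = actual[hit - third];
                src += 3;
                continue;
            }
        }
        *dst++ = *src++;
    }
    *dst = '\0';
}
```

Given the first line of the question's program as input, the helper produces #include <stdio.h>. When a string literal in real code needs two consecutive question marks, writing them with the \? escape sequence keeps them from being read as the start of a trigraph.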


EDIT: Those interested in the historical decisions can read about them in the C rationale v5.10, chapter 5.2.1.1.


[1]: C23 removed trigraphs from the language standard entirely.