Why does this code give the output C++Sucks
? What is the concept behind it?
#include <stdio.h>
double m[] = {7709179928849219.0, 771};
int main() {
m[1]--?m[0]*=2,main():printf((char*)m);
}
Test it here.
c++deobfuscation
Why does this code give the output C++Sucks
? What is the concept behind it?
#include <stdio.h>
double m[] = {7709179928849219.0, 771};
int main() {
m[1]--?m[0]*=2,main():printf((char*)m);
}
Test it here.
What does this program do?
main(_){write(read(0,&_,1)&&main());}
Before we analyze it, let's prettify it:
main(_) {
write ( read(0, &_, 1) && main() );
}
First, you should know that _
is a valid variable name, albeit an ugly one. Let's change it:
main(argc) {
write( read(0, &argc, 1) && main() );
}
Next, realize that the return type of a function, and the type of a parameter are optional in C (but not in C++):
int main(int argc) {
write( read(0, &argc, 1) && main() );
}
Next, understand how return values work. For certain CPU types, the return value is always stored in the same registers (EAX on x86, for example). Thus, if you omit a return
statement, the return value is likely going to be whatever the most recent function returned.
int main(int argc) {
int result = write( read(0, &argc, 1) && main() );
return result;
}
The call to read
is more-or-less evident: it reads from standard in (file descriptor 0), into the memory located at &argc
, for 1
byte. It returns 1
if the read was successful, and 0 otherwise.
&&
is the logical "and" operator. It evaluates its right-hand-side if and only if it's left-hand-side is "true" (technically, any non-zero value). The result of the &&
expression is an int
which is always 1 (for "true") or 0 (for false).
In this case, the right-hand-side invokes main
with no arguments. Calling main
with no arguments after declaring it with 1 argument is undefined behavior. Nevertheless, it often works, as long as you don't care about the initial value of the argc
parameter.
The result of the &&
is then passed to write()
. So, our code now looks like:
int main(int argc) {
int read_result = read(0, &argc, 1) && main();
int result = write(read_result);
return result;
}
Hmm. A quick look at the man pages reveals that write
takes three arguments, not one. Another case of undefined behavior. Just like calling main
with too few arguments, we cannot predict what write
will receive for its 2nd and 3rd arguments. On typical computers, they will get something, but we can't know for sure what. (On atypical computers, strange things can happen.) The author is relying upon write
receiving whatever was previously stored on the memory stack. And, he is relying upon that being the 2nd and 3rd arguments to read.
int main(int argc) {
int read_result = read(0, &argc, 1) && main();
int result = write(read_result, &argc, 1);
return result;
}
Fixing the invalid call to main
, and adding headers, and expanding the &&
we have:
#include <unistd.h>
int main(int argc, int argv) {
int result;
result = read(0, &argc, 1);
if(result) result = main(argc, argv);
result = write(result, &argc, 1);
return result;
}
This program won't work as expected on many computers. Even if you use the same computer as the original author, it might not work on a different operating system. Even if you use the same computer and same operating system, it won't work on many compilers. Even if you use the same computer compiler and operating system, it might not work if you change the compiler's command line flags.
As I said in the comments, the question does not have a valid answer. If you found a contest organizer or contest judge that says otherwise, don't invite them to your next contest.
Let's de-obfuscate it.
Indenting:
main(_) {
_^448 && main(-~_);
putchar(--_%64
? 32 | -~7[__TIME__-_/8%8][">'txiZ^(~z?"-48] >> ";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1
: 10);
}
Introducing variables to untangle this mess:
main(int i) {
if(i^448)
main(-~i);
if(--i % 64) {
char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
putchar(32 | (b & 1));
} else {
putchar(10); // newline
}
}
Note that -~i == i+1
because of twos-complement. Therefore, we have
main(int i) {
if(i != 448)
main(i+1);
i--;
if(i % 64 == 0) {
putchar('\n');
} else {
char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
putchar(32 | (b & 1));
}
}
Now, note that a[b]
is the same as b[a]
, and apply the -~ == 1+
change again:
main(int i) {
if(i != 448)
main(i+1);
i--;
if(i % 64 == 0) {
putchar('\n');
} else {
char a = (">'txiZ^(~z?"-48)[(__TIME__-i/8%8)[7]] + 1;
char b = a >> ";;;====~$::199"[(i*2&8)|i/64]/(i&2?1:8)%8;
putchar(32 | (b & 1));
}
}
Converting the recursion to a loop and sneaking in a bit more simplification:
// please don't pass any command-line arguments
main() {
int i;
for(i=447; i>=0; i--) {
if(i % 64 == 0) {
putchar('\n');
} else {
char t = __TIME__[7 - i/8%8];
char a = ">'txiZ^(~z?"[t - 48] + 1;
int shift = ";;;====~$::199"[(i*2&8) | (i/64)];
if((i & 2) == 0)
shift /= 8;
shift = shift % 8;
char b = a >> shift;
putchar(32 | (b & 1));
}
}
}
This outputs one character per iteration. Every 64th character, it outputs a newline. Otherwise, it uses a pair of data tables to figure out what to output, and puts either character 32 (a space) or character 33 (a !
). The first table (">'txiZ^(~z?"
) is a set of 10 bitmaps describing the appearance of each character, and the second table (";;;====~$::199"
) selects the appropriate bit to display from the bitmap.
Let's start by examining the second table, int shift = ";;;====~$::199"[(i*2&8) | (i/64)];
. i/64
is the line number (6 to 0) and i*2&8
is 8 iff i
is 4, 5, 6 or 7 mod 8.
if((i & 2) == 0) shift /= 8; shift = shift % 8
selects either the high octal digit (for i%8
= 0,1,4,5) or the low octal digit (for i%8
= 2,3,6,7) of the table value. The shift table ends up looking like this:
row col val
6 6-7 0
6 4-5 0
6 2-3 5
6 0-1 7
5 6-7 1
5 4-5 7
5 2-3 5
5 0-1 7
4 6-7 1
4 4-5 7
4 2-3 5
4 0-1 7
3 6-7 1
3 4-5 6
3 2-3 5
3 0-1 7
2 6-7 2
2 4-5 7
2 2-3 3
2 0-1 7
1 6-7 2
1 4-5 7
1 2-3 3
1 0-1 7
0 6-7 4
0 4-5 4
0 2-3 3
0 0-1 7
or in tabular form
00005577
11775577
11775577
11665577
22773377
22773377
44443377
Note that the author used the null terminator for the first two table entries (sneaky!).
This is designed after a seven-segment display, with 7
s as blanks. So, the entries in the first table must define the segments that get lit up.
__TIME__
is a special macro defined by the preprocessor. It expands to a string constant containing the time at which the preprocessor was run, in the form "HH:MM:SS"
. Observe that it contains exactly 8 characters. Note that 0-9 have ASCII values 48 through 57 and :
has ASCII value 58. The output is 64 characters per line, so that leaves 8 characters per character of __TIME__
.
7 - i/8%8
is thus the index of __TIME__
that is presently being output (the 7-
is needed because we are iterating i
downwards). So, t
is the character of __TIME__
being output.
a
ends up equalling the following in binary, depending on the input t
:
0 00111111
1 00101000
2 01110101
3 01111001
4 01101010
5 01011011
6 01011111
7 00101001
8 01111111
9 01111011
: 01000000
Each number is a bitmap describing the segments that are lit up in our seven-segment display. Since the characters are all 7-bit ASCII, the high bit is always cleared. Thus, 7
in the segment table always prints as a blank. The second table looks like this with the 7
s as blanks:
000055
11 55
11 55
116655
22 33
22 33
444433
So, for example, 4
is 01101010
(bits 1, 3, 5, and 6 set), which prints as
----!!--
!!--!!--
!!--!!--
!!!!!!--
----!!--
----!!--
----!!--
To show we really understand the code, let's adjust the output a bit with this table:
00
11 55
11 55
66
22 33
22 33
44
This is encoded as "?;;?==? '::799\x07"
. For artistic purposes, we'll add 64 to a few of the characters (since only the low 6 bits are used, this won't affect the output); this gives "?{{?}}?gg::799G"
(note that the 8th character is unused, so we can actually make it whatever we want). Putting our new table in the original code:
main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>"?{{?}}?gg::799G"[_*2&8|_/64]/(_&2?1:8)%8&1:10);}
we get
!! !! !!
!! !! !! !! !! !! !! !! !!
!! !! !! !! !! !! !! !! !!
!! !! !! !!
!! !! !! !! !! !! !! !! !!
!! !! !! !! !! !! !! !! !!
!! !! !!
just as we expected. It's not as solid-looking as the original, which explains why the author chose to use the table he did.
Best Answer
The number
7709179928849219.0
has the following binary representation as a 64-bitdouble
:+
shows the position of the sign;^
of the exponent, and-
of the mantissa (i.e. the value without the exponent).Since the representation uses binary exponent and mantissa, doubling the number increments the exponent by one. Your program does it precisely 771 times, so the exponent which started at 1075 (decimal representation of
10000110011
) becomes 1075 + 771 = 1846 at the end; binary representation of 1846 is11100110110
. The resultant pattern looks like this:This pattern corresponds to the string that you see printed, only backwards. At the same time, the second element of the array becomes zero, providing null terminator, making the string suitable for passing to
printf()
.