See the C FAQ, Question 1.32
Q: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].
A: A string literal (the formal term
for a double-quoted string in C
source) can be used in two slightly
different ways:
- As the initializer for an array of char, as in the declaration of char a[], it specifies the initial values of the characters in that array (and, if necessary, its size).
- Anywhere else, it turns into an unnamed, static array of characters. This unnamed array may be stored in read-only memory, and therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.
Some compilers have a switch
controlling whether string literals
are writable or not (for compiling old
code), and some may have options to
cause string literals to be formally
treated as arrays of const char (for
better error catching).
I'm declaring character array as char* string.
This is where your problems begin! Although pointers and arrays have some things in common, syntactically, they are not the same. What you are doing in the line copied below is declaring s as a pointer to a char and initializing that pointer with the address of the string literal you provide.
char* s = "random";
Because this is a string literal, the compiler is allowed to (though not obliged to) place that data in read-only memory; thus, when you later attempt to modify the character pointed to by (the address in) the s variable (or any other pointer, such as your t, which contains the same address), you will experience undefined behaviour. Some systems will cause your program to crash ("Segmentation fault"), others may silently allow you to 'get away' with it. Indeed, you may even get different results with the same code at different times.
To fix this, and to properly declare a character array, use the [] notation:
char a[] = "random";
This will declare a as a (modifiable) array of characters (whose size is determined, in this case, by the initial value you give it: 7 here, with the terminating nul character included); then, the compiler will initialize that array with a copy of the string literal's data. You are then free to use an expression like *a to refer to the first element of that array.
The following short program may be helpful:
#include <stdio.h>

int main(void)
{
    char* s = "random";
    *s = 'R'; // Undefined behaviour: could be ignored, could crash, could work!
    printf("%s\n", s);

    char a[] = "random";
    *a = 'R'; // Well-defined behaviour - this will change the first letter of the string.
    printf("%s\n", a);

    return 0;
}
(You may need to comment out the lines that use s to get the code to run to the other lines!)
Best Answer
You can look at a string literal as "a sequence of characters surrounded by double quotes". This string should be treated as read-only, and trying to modify its memory leads to undefined behavior. It is not necessarily stored in read-only memory, and its type is char[] and not const char[], but modifying it is still undefined behavior. The reason the type is not const is backwards compatibility: C didn't have the const qualifier in the beginning. In C++, string literals have the type const char[].
So how come you get a segmentation fault?
char *ptr = "string literal";
makes ptr point to the memory where your string literal is stored, which on many systems is read-only. So when you try to write to this memory with ptr[0] = 'X' (which is, by the way, equivalent to *(ptr + 0) = 'X'), the result is typically a memory access violation.
On the other hand:
char b[] = "string2";
allocates memory and copies the string "string2" into it, so modifying it is valid. Assuming b is a local variable, this memory is freed when b goes out of scope.
Have a look at Literal string initializer for a character array