C++ – Why size_t is an Unsigned Integer

c++language-designsize-tunsigned-integer

Bjarne Stroustrup wrote in The C++ Programming Language:

The unsigned integer types are ideal for uses that treat storage as a
bit array. Using an unsigned instead of an int to gain one more bit to
represent positive integers is almost never a good idea. Attempts to
ensure that some values are positive by declaring variables unsigned
will typically be defeated by the implicit conversion rules.

size_t seems to be unsigned "to gain one more bit to represent positive integers". So was this a mistake (or trade-off), and if so, should we minimize use of it in our own code?

Another relevant article is Signed and Unsigned Types in Interfaces by Scott Meyers. To summarize, he recommends not using unsigned integers in interfaces, regardless of whether the value is always positive or not. In other words, even if negative values make no sense, you shouldn't necessarily use unsigned.

Best Answer

size_t is unsigned for historical reasons.

On an architecture with 16 bit pointers, such as the "small" model DOS programming, it would be impractical to limit strings to 32 KB.

For this reason, the C standard requires (via required ranges) ptrdiff_t, the signed counterpart to size_t and the result type of pointer difference, to be effectively 17 bits.

Those reasons can still apply in parts of the embedded programming world.

However, they do not apply to modern 32-bit or 64-bit programming, where a much more important consideration is that the unfortunate implicit conversion rules of C and C++ make unsigned types into bug attractors, when they're used for numbers (and hence, arithmetical operations and magnitude comparisions). With 20-20 hindsight we can now see that the decision to adopt those particular conversion rules, where e.g. string( "Hi" ).length() < -3 is practically guaranteed, was rather silly and impractical. However, that decision means that in modern programming, adopting unsigned types for numbers has severe disadvantages and no advantages – except for satisfying the feelings of those who find unsigned to be a self-descriptive type name, and fail to think of typedef int MyType.

Summing up, it was not a mistake. It was a decision for then very rational, practical programming reasons. It had nothing to do with transferring expectations from bounds-checked languages like Pascal to C++ (which is a fallacy, but a very very common one, even if some of those who do it have never heard of Pascal).

Related Solutions

C++ – Unsigned Int vs. Size_t: Key Differences

The size_t type is the unsigned integer type that is the result of the sizeof operator (and the offsetof operator), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8Gb).

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.

You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.

C++ size_t – Is size_t Always Unsigned in C++?

Yes. It's usually defined as something like the following (on 32-bit systems):

typedef unsigned int size_t;

Reference:

C++ Standard Section 18.1 defines size_t is in <cstddef> which is described in C Standard as <stddef.h>.
C Standard Section 4.1.5 defines size_t as an unsigned integral type of the result of the sizeof operator

Best Answer

Related Solutions

C++ – Unsigned Int vs. Size_t: Key Differences

C++ size_t – Is size_t Always Unsigned in C++?

Related Question