Java – Unsigned Bytes in Java

Bytes in Java are signed by default. Other posts suggest a workaround for treating a byte as unsigned, along these lines: int num = (int) bite & 0xFF

Could someone please explain to me why this works and converts a signed byte to an unsigned byte and then its respective integer? ANDing a byte with 11111111 results in the same byte – right?

Best Answer

A typecast has higher precedence than the & operator. So the byte is first cast to an int, which sign-extends it, and the AND then masks out all the high-order bits that were set, including the "sign bit" of the two's complement notation that Java uses. That leaves you with just the positive value of the original byte. E.g.:

let byte x = 11111111 = -1
then (int) x = 11111111 11111111 11111111 11111111
and (int) x & 0xFF = 00000000 00000000 00000000 11111111 = 255

and you've effectively removed the sign from the original byte.
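
A minimal sketch of the steps above, in case you want to check the bit patterns yourself. The class name `SignExtensionDemo` and the helper `bits32` are made up for illustration; `Integer.toBinaryString` is the standard JDK method.

```java
class SignExtensionDemo {
    // Hypothetical helper: binary form of an int, left-padded to 32 digits.
    static String bits32(int value) {
        String s = Integer.toBinaryString(value);
        return String.format("%32s", s).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte x = (byte) 0b11111111;    // bit pattern 11111111, i.e. -1

        int widened = (int) x;         // the cast sign-extends to 32 bits
        int masked = (int) x & 0xFF;   // cast binds tighter than &, then the mask clears the top 24 bits

        System.out.println(x);                 // -1
        System.out.println(bits32(widened));   // 11111111111111111111111111111111
        System.out.println(bits32(masked));    // 00000000000000000000000011111111
        System.out.println(masked);            // 255
    }
}
```

The explicit `(int)` cast is not strictly needed, since `x & 0xFF` promotes the byte to int before the AND anyway.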

Related Solutions

Java Unsigned Numbers – Understanding and Using Unsigned Bytes

Java doesn't actually have unsigned primitives, apart from char, which is an unsigned 16-bit type.

The value 127 is actually represented by '01111111' the first bit being the sign (0 is positive).

An unsigned byte would be able to hold values 0 to 255, but 127 is the maximum for a signed byte, because a byte has 8 bits and the signed one uses one of them to hold the sign. So if you want to represent values larger than 127, you need a bigger type with more bits. The bigger type also reserves a bit for the sign, but it still has at least 8 bits left for the actual value, so it can represent 255.

That being said, you should probably avoid doing arithmetic in byte and short, because the operators promote their operands to int and return int. That is why the masked result has to be cast back if you want to store it in a short. It is usually simpler to stick to int and long in Java.

Edit: the AND operator makes it unsigned since the sign bit is the first bit of the short, and you copy the 8 bits holding the value of the byte to the last 8 bits of the short. So if you have a negative number, the first bit of the byte, which is 1 (meaning it's negative), actually becomes part of the value. And the short will always be positive, since its sign bit is at too high a power of two to be affected by the byte.

 byte:             10101101
                    ||||||| <- actual value
short:     0000000010101101
            ||||||||||||||| <- actual value
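
A sketch of the widening shown in the diagram, using the same bit pattern. The class `UnsignedByteInShort` and the helper `toUnsignedShort` are hypothetical names, not something from the JDK.

```java
class UnsignedByteInShort {
    // Hypothetical helper: copies the 8 bits of a byte into the low 8 bits of a short.
    static short toUnsignedShort(byte b) {
        // b & 0xFF is an int, so it has to be cast back down to short.
        // The parentheses matter: (short) b & 0xFF would cast first and still yield an int.
        return (short) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0b10101101;   // the byte from the diagram
        short s = toUnsignedShort(b);

        System.out.println(b);        // -83
        System.out.println(s);        // 173
    }
}
```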

Edit 2: Note that because negative values use two's complement representation, the result might not be what you expect. All the positive values remain the same.
But -128 = 0b10000000 will become 128
-127 = 0b10000001 will become 129
and so on until -1 = 0b11111111, which will become 255
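
The mapping above can be checked with a short sketch. The class `UnsignedMapping` and the method `toUnsigned` are names chosen for illustration; `Byte.toUnsignedInt` is part of the JDK from Java 8 onwards and should give the same result as the mask.

```java
class UnsignedMapping {
    // Hypothetical helper: reinterprets the 8 bits of a byte as a value in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] samples = { -128, -127, -1, 0, 127 };
        for (byte b : samples) {
            // Negative bytes come out as b + 256; non-negative bytes are unchanged.
            System.out.println(b + " -> " + toUnsigned(b)
                    + " (Byte.toUnsignedInt: " + Byte.toUnsignedInt(b) + ")");
        }
        // -128 -> 128 (Byte.toUnsignedInt: 128)
        // -127 -> 129 (Byte.toUnsignedInt: 129)
        // -1 -> 255 (Byte.toUnsignedInt: 255)
        // 0 -> 0 (Byte.toUnsignedInt: 0)
        // 127 -> 127 (Byte.toUnsignedInt: 127)
    }
}
```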

Java – Can We Make Unsigned Byte?

The fact that primitives are signed in Java is irrelevant to how they're represented in memory / transit - a byte is merely 8 bits and whether you interpret that as a signed range or not is up to you. There is no magic flag to say "this is signed" or "this is unsigned".

As byte is signed, the Java compiler will prevent you from assigning a literal higher than +127 (or lower than -128) to a byte. However, there's nothing to stop you downcasting an int (or short) in order to achieve this:

int i = 200; // 0000 0000 0000 0000 0000 0000 1100 1000 (200)
byte b = (byte) i; // 1100 1000 (-56 by Java specification, 200 by convention)

/*
 * Prints the negative int -56, because widening the byte to int performs
 * so-called "sign extension", which yields these bits:
 * 1111 1111 1111 1111 1111 1111 1100 1000 (-56)
 *
 * But you could still choose to interpret this as +200.
 */
System.out.println(b); // "-56"

/*
 * Will print a positive int 200 because bitwise AND with 0xFF will
 * zero all the 24 most significant bits that:
 * a) were added during upcasting to int which took place silently
 *    just before evaluating the bitwise AND operator.
 *    So the `b & 0xFF` is equivalent with `((int) b) & 0xFF`.
 * b) were set to 1s because of "sign extension" during the upcasting
 *
 * 1111 1111 1111 1111 1111 1111 1100 1000 (the int)
 * &
 * 0000 0000 0000 0000 0000 0000 1111 1111 (the 0xFF)
 * =======================================
 * 0000 0000 0000 0000 0000 0000 1100 1000 (200)
 */
System.out.println(b & 0xFF); // "200"

/*
 * You would typically do this *within* the method that expects an
 * unsigned byte. The advantage is that you apply `0xFF` only once
 * and then use the `unsignedByte` variable in all your bitwise
 * operations.
 *
 * You could use any integer type wider than `byte` for the `unsignedByte` variable,
 * i.e. `short`, `int`, `long` and even `char`, but during bitwise operations
 * anything narrower than `int` would get promoted to `int` anyway.
 */
 */
void printUnsignedByte(byte b) {
    int unsignedByte = b & 0xFF;
    System.out.println(unsignedByte); // "200"
}
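
As a sketch of the "apply the mask inside the method" idea, here is a method that adds up a buffer while treating each byte as 0..255. The class `UnsignedSum` and the method `sumUnsigned` are hypothetical names for illustration only.

```java
class UnsignedSum {
    // Hypothetical helper: sums the bytes of a buffer as unsigned values.
    static int sumUnsigned(byte[] data) {
        int total = 0;
        for (byte b : data) {
            total += b & 0xFF;   // mask once, at the point where the byte is read
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 200, (byte) 100, 1 };

        System.out.println(sumUnsigned(data));   // 301

        // Without the mask, 200 is read back as -56, so the sum differs.
        int signedTotal = 0;
        for (byte b : data) {
            signedTotal += b;
        }
        System.out.println(signedTotal);         // 45
    }
}
```

Callers keep passing plain byte arrays, and the unsigned interpretation stays in one place.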
