Operators and type casting
Operators and Assignments
C has a wide range of operators that make simple math easy to handle. The operators are described below, grouped into precedence levels from highest to lowest:
Primary expressions
Identifiers are names of things in C, and consist of either a letter or an underscore ( _ ) optionally followed by letters, digits, or underscores. An identifier (or variable name) is a primary expression, provided that it has been declared as designating an object (in which case it is an lvalue [a value that can be used as the left side of an assignment expression]) or a function (in which case it is a function designator).
A constant is a primary expression. Its type depends on its form and value. The types of constants are character constants (e.g. ' ' is a space), integer constants (e.g. 2), floating-point constants (e.g. 0.5), and enumerated constants that have been previously defined via enum.
A string literal is a primary expression. It consists of a string of characters within double quotes ( " ).
A parenthesized expression is a primary expression. It consists of an expression within parentheses ( ( ) ). Its type and value are those of the non-parenthesized expression within the parentheses.
In C11, an expression that starts with _Generic followed by (, an initial expression, a list of associations of the form type: expression where type is either a named type or the keyword default, and ) constitutes a primary expression. The result is the expression associated with the type of the initial expression, or the default expression if no listed type matches. The initial expression itself is not evaluated.
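As a rough sketch of a generic selection, the macro type_name below maps the type of its argument to a descriptive string. The macro name and the strings are illustrative assumptions, not part of any library, and a C11 compiler is required.

```c
/* type_name is a hypothetical helper macro, not a standard facility.
 * _Generic selects the expression whose type matches the type of (x);
 * the default association is used when no listed type matches. */
#define type_name(x) _Generic((x),      \
        int:     "int",                 \
        double:  "double",              \
        default: "some other type")
```

With this definition, type_name(42) selects "int", type_name(0.5) selects "double", and type_name(2.0f) falls through to the default association because float is not listed. A character constant such as 'a' has type int in C, so type_name('a') also selects "int".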
Postfix operators
First, a primary expression is also a postfix expression. The following expressions are also postfix expressions:
A postfix expression followed by a left square bracket ([), an expression, and a right square bracket (]) in sequence constitutes an invocation of the array subscript operator. One of the expressions shall have type "pointer to object T" and the other shall have an integer type; the result has type T. Successive array subscript operators designate an element of a multidimensional array.
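A minimal sketch of the subscript operator follows. The function names are hypothetical and exist only for illustration. It relies on the fact that a[i] is defined as *(a + i), which is also why the pointer and the integer may appear in either position.

```c
/* Three equivalent spellings of the same element access. */
int by_subscript(const int *a, int i)         { return a[i]; }
int by_pointer_math(const int *a, int i)      { return *(a + i); }
int by_swapped_subscript(const int *a, int i) { return i[a]; }

/* Successive subscripts select a row, then a column within that row. */
int grid_cell(void)
{
    int grid[2][3] = { {1, 2, 3}, {4, 5, 6} };
    return grid[1][2]; /* second row, third column */
}
```

The swapped form i[a] is legal but rarely used; it is shown here only to illustrate the rule.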
A postfix expression followed by a parenthesized argument list, which may be empty, constitutes an invocation of the function call operator. The value of the function call operator is the return value of the function called with the provided arguments. The arguments are passed by value, that is, the function receives copies (typically on the stack or in registers, although the compiler only has to act as if copies were made). If the function needs to modify an object belonging to the caller, the usual approach is to pass the address of that object by value; the called function can then access the object through the pointer. Many compilers push arguments onto the stack from right to left, but this is not universal, and the order in which the arguments are evaluated is unspecified.
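The difference between passing a value and passing the address of an object can be sketched as follows. Both function names are hypothetical.

```c
/* Receives a copy of the argument; the caller's variable is untouched. */
void set_by_value(int n)
{
    n = 99;      /* modifies only the local copy */
    (void)n;
}

/* Receives the address of the caller's variable and writes through it. */
void set_by_pointer(int *n)
{
    *n = 99;
}
```

After int a = 1; set_by_value(a); the variable a is still 1, whereas set_by_pointer(&a); changes it to 99.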
A postfix expression followed by a dot (.) followed by an identifier selects a member from a structure or union; a postfix expression followed by an arrow (->) followed by an identifier selects a member from the structure or union that is pointed to by the pointer on the left-hand side of the expression.
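As a small illustration, the hypothetical structure point below is accessed once directly and once through a pointer.

```c
/* A hypothetical structure used only for this illustration. */
struct point {
    int x;
    int y;
};

/* The dot operator is used when the operand is the structure itself. */
int sum_direct(struct point p)
{
    return p.x + p.y;
}

/* The arrow operator is used when the operand is a pointer to it;
 * p->x is equivalent to (*p).x. */
int sum_via_pointer(const struct point *p)
{
    return p->x + p->y;
}
```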
A postfix expression followed by the increment or decrement operator (++ or -- respectively) indicates that the operand is to be incremented or decremented as a side effect. The value of the expression is the value of the operand before the increment or decrement. These operators work on operands of integer, floating-point, or pointer type, and the operand must be a modifiable lvalue.
Unary expressions
First, a postfix expression is a unary expression. The following expressions are all unary expressions:
An increment or decrement operator followed by a unary expression is a unary expression. The value of the expression is the value of the operand after the increment or decrement. These operators work on operands of integer, floating-point, or pointer type.
The following operators followed by a cast expression are unary expressions:
Operator   Meaning
========   =======
&          Address-of; value is the location of the operand
*          Contents-of; value is what is stored at the location
-          Negation
+          Value-of operator
!          Logical negation ( (!E) is equivalent to (0==E) )
~          Bit-wise complement
The keyword sizeof followed by a unary expression is a unary expression. The value is the size of the type of the expression in bytes, and has the unsigned integer type size_t. The expression is not evaluated (unless its type is a variable-length array).
The keyword sizeof followed by a parenthesized type name is a unary expression. The value is the size of the type in bytes.
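Two common consequences of these rules can be sketched as follows. The function names are hypothetical.

```c
#include <stddef.h>

/* Number of elements in a local array: total size divided by element size. */
size_t element_count(void)
{
    double samples[10];
    return sizeof samples / sizeof samples[0];
}

/* The operand of sizeof is not evaluated, so i is never incremented here. */
int operand_is_not_evaluated(void)
{
    int i = 0;
    size_t n = sizeof(i++);
    (void)n;    /* silence "unused variable" warnings */
    return i;   /* still 0 */
}
```

The element-count idiom only works where the array itself is visible; applied to a pointer, sizeof yields the size of the pointer instead.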
Cast operators
A unary expression is also a cast expression.
A parenthesized type name followed by a cast expression, including a literal, is a cast expression. The parenthesized type name forces the operand to be converted to the type named in the parentheses. For arithmetic types, the value is unchanged if the new type can represent it. Otherwise the result depends on the types involved: converting a floating-point value to an integer type discards the fractional part, and converting an integer to a narrower unsigned type keeps only the low-order bits (converting an out-of-range value to a narrower signed type gives an implementation-defined result).
An example of casting an int to a float so that the division is carried out in floating point:
int i = 5;
printf("%f\n", (float) i / 2); // Will print out: 2.500000
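The conversions performed by a cast can be sketched with a few small functions. The names are hypothetical, and the byte example assumes that unsigned char is 8 bits wide, which is true on most systems.

```c
/* Converting a floating-point value to int discards the fractional part. */
int to_int(double d)
{
    return (int)d;
}

/* Converting to unsigned char keeps the value modulo UCHAR_MAX + 1
 * (256 when a byte has 8 bits). */
unsigned char to_byte(int v)
{
    return (unsigned char)v;
}

/* Without a cast, both operands are int and the division truncates. */
double half_without_cast(int i)
{
    return i / 2;
}

/* Casting one operand first makes the whole division floating point. */
double half_with_cast(int i)
{
    return (double)i / 2;
}
```

For example, to_int(3.9) gives 3, to_byte(300) gives 44 under the 8-bit assumption, half_without_cast(5) gives 2.0, and half_with_cast(5) gives 2.5.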
Multiplicative and additive operators
First, a cast expression is also a multiplicative expression, and a multiplicative expression is also an additive expression. This follows the precedence that multiplication happens before addition.
In C, simple math is very easy to handle. The following operators exist: + (addition), - (subtraction), * (multiplication), / (division), and % (modulus). You likely know all of them from your math classes, except perhaps modulus. It returns the remainder of a division (e.g. 5 % 2 = 1). (Modulus is not defined for floating-point numbers, but the math.h library has an fmod function.)
Care must be taken with the modulus, because it's not the equivalent of the mathematical modulus: (-5) % 2 is not 1, but -1. Division of integers will return an integer, and the division of a negative integer by a positive integer will round towards zero instead of rounding down (e.g. (-5) / 3 = -1 instead of -2). However, it is always true that for all integer a and nonzero integer b, ((a / b) * b) + (a % b) == a.
There is no inline operator to do exponentiation (e.g. 5 ^ 2 is not 25 [it is 7; ^ is the exclusive-or operator], and 5 ** 2 is an error), but there is a power function, pow, declared in math.h.
The mathematical order of operations does apply. For example (2 + 3) * 2 = 10 while 2 + 3 * 2 = 8. Multiplicative operators have precedence over additive operators.
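The arithmetic rules above can be checked with a few one-line functions. The function names are hypothetical, and truncation toward zero is assumed as guaranteed by C99 and later.

```c
/* Integer division truncates toward zero. */
int quotient(int a, int b)     { return a / b; }

/* The result of % takes the sign of the dividend. */
int remainder_of(int a, int b) { return a % b; }

/* Checks the identity (a / b) * b + a % b == a for nonzero b. */
int identity_holds(int a, int b)
{
    return (a / b) * b + a % b == a;
}

/* ^ is exclusive or, not exponentiation: 5 ^ 2 is 7. */
int xor_not_power(void) { return 5 ^ 2; }

/* Parentheses override the usual precedence. */
int grouped(void)   { return (2 + 3) * 2; }
int ungrouped(void) { return 2 + 3 * 2; }
```

Here quotient(-5, 3) is -1, remainder_of(-5, 2) is -1, grouped() is 10, and ungrouped() is 8.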
The following program shows the difference between the postfix and prefix increment operators:
#include <stdio.h>

int main(void)
{
    int i = 0, j = 0;

    /* while i is less than 5 AND j is less than 5, loop */
    while ((i < 5) && (j < 5))
    {
        /*
         * postfix increment, i++
         * the value of i is read and then incremented
         */
        printf("i: %d\t", i++);

        /*
         * prefix increment, ++j
         * the value of j is incremented and then read
         */
        printf("j: %d\n", ++j);
    }

    printf("At the end they have both equal values:\ni: %d\tj: %d\n", i, j);

    getchar(); /* pause */
    return 0;
}
This will display the following:
i: 0    j: 1
i: 1    j: 2
i: 2    j: 3
i: 3    j: 4
i: 4    j: 5
At the end they have both equal values:
i: 5    j: 5
The shift operators (which may be used to rotate bits)
An additive expression is also a shift expression (meaning that the shift operators have a precedence just below addition and subtraction).
Shift operations are often used in low-level I/O hardware interfacing. Shift and rotate operations are heavily used in cryptography and software floating-point emulation. Other than that, shifts can be used in place of division or multiplication by a power of two. Many processors have dedicated function blocks to make these operations fast (see Design/Shift and Rotate Blocks). On processors which have such blocks, most C compilers compile the shift operators, and the rotate idioms described later in this section, to a single assembly-language instruction (see Assembly/Shift and Rotate).
shift left
The << operator shifts the binary representation to the left, dropping the most significant bits and filling the vacated low-order positions with zero bits.
For unsigned integers, the result is equivalent to multiplying the integer by a power of two, as long as no set bits are shifted out.
unsigned shift right
Applied to an unsigned integer, the >> operator performs an unsigned shift right, sometimes also called a logical right shift.
It shifts the binary representation to the right, dropping the least significant bits and filling the vacated high-order positions with zeros.
For unsigned integers, the >> operator is equivalent to division by a power of two.
signed shift right
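Applied to a signed integer, the >> operator typically performs a signed shift right, sometimes also called an arithmetic right shift.
It shifts the binary representation to the right, dropping the least significant bits, but filling the vacated high-order positions with copies of the original sign bit. (The C standard leaves the result of right-shifting a negative value implementation-defined; most compilers perform an arithmetic shift.)
For negative values, the >> operator is not equivalent to division by a power of two: an arithmetic shift rounds toward negative infinity, whereas integer division rounds toward zero.
In C, the behavior of the >> operator depends on the data type it acts on.
Therefore, a signed and an unsigned right shift look exactly the same in the source code, but produce different results in some cases.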
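The relationship between shifting and multiplying or dividing by a power of two can be sketched as follows. The function names are hypothetical, and the fixed-width types from stdint.h are used so that the width is known to be 32 bits.

```c
#include <stdint.h>

/* Multiply by 2^n with a left shift; n must be less than 32, and bits
 * shifted past the top of the 32-bit value are lost. */
uint32_t times_pow2(uint32_t x, unsigned n) { return x << n; }

/* Divide an unsigned value by 2^n with a right shift; n must be less than 32. */
uint32_t div_pow2(uint32_t x, unsigned n)   { return x >> n; }

/* Ordinary signed division rounds toward zero: -7 / 2 is -3. */
int32_t half_by_division(int32_t x)         { return x / 2; }

/* Right-shifting a negative value is implementation-defined; compilers
 * that use an arithmetic shift give -4 for -7 >> 1, not -3. */
int32_t half_by_shift(int32_t x)            { return x >> 1; }
```

For non-negative arguments the last two functions agree; they can differ only when x is negative.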
rotate right
Contrary to popular belief, it is possible to write C code that compiles down to the "rotate" assembly language instruction (on CPUs that have such an instruction).
Most compilers recognize this idiom:
unsigned int x;
unsigned int y;
/* ... */
y = (x >> shift) | (x << (32 - shift));
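and compile it to a single 32-bit rotate instruction, assuming that unsigned int is 32 bits wide and that shift is between 1 and 31 (a shift of 0 would make the second term shift by the full width of the type, which is undefined). [1] [2]
On some systems, this may be "#define"ed as a macro or defined as an inline function called something like "rightrotate32" or "rotr32" or "ror32" in a header file like "bitops.h". [3]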
rotate left
Most compilers recognize this idiom:
unsigned int x;
unsigned int y;
/* ... */
y = (x << shift) | (x >> (32 - shift));
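and compile it to a single 32-bit rotate instruction, under the same assumptions as for the right rotate: unsigned int is 32 bits wide and shift is between 1 and 31.
On some systems, this may be "#define"ed as a macro or defined as an inline function called something like "leftrotate32" or "rotl32" in a header file like "bitops.h".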
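The idioms above shift by 32 - shift, which is undefined when shift is 0. A variant that masks the shift count avoids this. The sketch below is one common way to write it, not the definition used by any particular header; the names rotl32 and rotr32 are chosen here only for illustration.

```c
#include <stdint.h>

/* Rotate x left by n bits. Masking with 31 keeps every shift count in
 * the range 0..31, so n == 0 (or any multiple of 32) is handled safely. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> (-n & 31));
}

/* Rotate x right by n bits, using the same masking trick. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << (-n & 31));
}
```

Because n is unsigned, -n is computed modulo a power of two, so -n & 31 equals (32 - n) % 32. Whether a given compiler turns this form into a single rotate instruction depends on the compiler and target.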
Relational and equality operators
A shift expression is also a relational expression; a relational expression is also an equality expression.
The relational binary operators < (less than), > (greater than), <= (less than or equal), and >= (greater than or equal) return a value of 1 if the result of the operation is true, 0 if false. The result of these operators is type int.
The equality binary operators == (equals) and != (not equals) are similar to the relational operators except that their precedence is lower. They also return a value of 1 if the result of the operation is true and 0 if it is false.
A caution about floating-point numbers and the equality operators: because floating-point operations can produce approximations (e.g. 0.1 is a repeating fraction in binary, so 0.1 + 0.2 is generally not exactly equal to 0.3), it is unwise to use the == operator with floating-point numbers. Instead, if a and b are the numbers to compare, compare fabs(a - b) to a small tolerance chosen to suit the magnitude of the values involved.
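A minimal sketch of such a comparison follows. The function name nearly_equal and the idea of passing the tolerance as a parameter are assumptions made for illustration; a suitable tolerance depends on the application.

```c
#include <math.h>

/* Returns 1 if a and b differ by no more than tolerance, 0 otherwise.
 * The caller chooses a tolerance appropriate to the values compared. */
int nearly_equal(double a, double b, double tolerance)
{
    return fabs(a - b) <= tolerance;
}
```

For example, with double a = 0.1, b = 0.2; the call nearly_equal(a + b, 0.3, 1e-9) yields 1 even where a + b == 0.3 yields 0. A fixed absolute tolerance like this works poorly for very large or very small values, where a tolerance relative to the operands is usually preferred.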
Bitwise operators
The bitwise operators are & (and), ^ (exclusive or) and | (inclusive or). The & operator has higher precedence than ^, which has higher precedence than |.
The values being operated upon must be of integer type; the result is also of integer type.
One use for the bitwise operators is to emulate bit flags. These flags can be set with OR, tested with AND, flipped with XOR, and cleared with AND NOT. For example:
/* This code is a sample for bitwise operations. */
#define BITFLAG1 (1)
#define BITFLAG2 (2)
#define BITFLAG3 (4) /* They are powers of 2 */

unsigned bitbucket = 0U;   /* Clear all */
bitbucket |= BITFLAG1;     /* Set bit flag 1 */
bitbucket &= ~BITFLAG2;    /* Clear bit flag 2 */
bitbucket ^= BITFLAG3;     /* Flip the state of bit flag 3 from off to on or
                              vice versa */
if (bitbucket & BITFLAG3) {
    /* bit flag 3 is set */
} else {
    /* bit flag 3 is not set */
}
Logical operators
The logical operators are && (and), and || (or). Both of these operators produce 1 if the relationship is true and 0 for false. Both of these operators short-circuit; if the result of the expression can be determined from the first operand, the second is not evaluated. The && operator has higher precedence than the || operator.
&& evaluates its operands left to right, and returns 1 if both operands are true, 0 if either of them is false. If the first operand is false, the second is not evaluated.
int x = 7;
int y = 5;
if(x == 7 && y == 5) {
...
}
Here, the && operator checks the left-most expression, then the expression to its right. If there were more than two expressions chained (e.g. x && y && z), the operator would check x first, then y (if x is nonzero), then continue rightwards to z if neither x nor y is zero.
Since both expressions are true, the && operator returns true, and the code block is executed.
if(x == 5 && y == 5) {
...
}
The && operator checks in the same way as before, and finds that the first expression is false. The && operator stops evaluating as soon as it finds a statement to be false, and returns a false.
|| evaluates its operands left to right, and returns 1 if either of the expressions is true, 0 if both are false. If the first expression is true, the second expression is not evaluated.
/* Use the same variables as before. */
if(x == 2 || y == 5) { // the || statement checks both expressions, finds that the latter is true, and returns true
...
}
The || operator here checks the left-most expression, finds it false, but continues to evaluate the next expression.
It finds that the next expression is true, stops, and returns 1.
Much as the && operator stops when it finds an expression that is false, the || operator stops when it finds an expression that is true.
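It is worth noting that C did not originally have the Boolean type (with values true and false) commonly found in other languages. It instead interprets 0 as false and any nonzero value as true. Since C99, the header stdbool.h provides bool, true, and false, but these are still integer values underneath.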
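Short-circuit evaluation is often used to guard an operand that is only safe to evaluate when an earlier test has passed. The sketch below shows such a guard and also counts how many operands of || are actually evaluated. All names are hypothetical.

```c
#include <stddef.h>

/* The dereference *p is reached only when p is not NULL. */
int points_to_positive(const int *p)
{
    return p != NULL && *p > 0;
}

static int evaluations;   /* counts calls to noisy() */

/* Returns its argument and records that it was called. */
static int noisy(int value)
{
    evaluations++;
    return value;
}

/* Returns how many operands of || were evaluated for a given left operand. */
int operands_evaluated_by_or(int left)
{
    evaluations = 0;
    (void)(noisy(left) || noisy(1));
    return evaluations;
}
```

Calling operands_evaluated_by_or(1) returns 1 because the right operand is skipped, while operands_evaluated_by_or(0) returns 2.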
Conditional operators
The ternary ?: operator is the conditional operator. The expression (x ? y : z) has the value of y if x is nonzero, z otherwise.
Example:
int x = 0;
int y;
y = (x ? 10 : 6); /* The parentheses are technically not necessary as assignment
                     has a lower precedence than the conditional operator, but
                     they are there for clarity. */
The expression x evaluates to 0. The ternary operator then looks for the "if-false" value, which in this case is 6. It returns that, so y is equal to six. Had x been nonzero, the expression would have returned 10.
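Two further uses of the conditional operator are sketched below with hypothetical function names. The second relies on the fact that only the selected operand is evaluated.

```c
/* Returns the larger of two values. */
int larger(int a, int b)
{
    return a > b ? a : b;
}

/* Only the selected operand is evaluated, so the division is skipped
 * entirely when d is 0. Returning 0 in that case is an arbitrary choice. */
int safe_divide(int n, int d)
{
    return d != 0 ? n / d : 0;
}
```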
Assignment operators
The assignment operators are =, *=, /=, %=, +=, -=, <<=, >>=, &=, ^=, and |=. The = operator stores the value of the right operand into the location determined by the left operand, which must be a modifiable lvalue (an expression that designates an object, and therefore can be assigned to).
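For the others, x op= y is shorthand for x = x op (y), except that x is evaluated only once. Hence, the following pairs of expressions are equivalent:
1. x += y is equivalent to x = x + y
2. x -= y is equivalent to x = x - y
3. x *= y is equivalent to x = x * y
4. x /= y is equivalent to x = x / y
5. x %= y is equivalent to x = x % y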
The value of the assignment expression is the value of the left operand after the assignment. Because the assignment operators group from right to left, assignments can be chained; e.g. the statement a = b = c = 0; assigns the value zero to all three variables.
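The parentheses in x = x op (y) matter when the right operand is itself an expression, as the following sketch with hypothetical function names shows.

```c
/* x *= y + 1 multiplies by the whole right operand: x = x * (y + 1),
 * not x = x * y + 1. */
int scale(int x, int y)
{
    x *= y + 1;
    return x;
}

/* Chained assignment groups right to left: c = 7 first, then b, then a. */
int chained(void)
{
    int a, b, c;
    a = b = c = 7;
    return a + b + c;
}
```

For example, scale(2, 3) returns 8 rather than 7, and chained() returns 21.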
Comma operator
The operator with the lowest precedence is the comma operator. The expression x, y evaluates x first and then y, and yields the value of y.
This operator is useful for including multiple actions in one statement (e.g. within the clauses of a for loop).
Here is a small example of the comma operator:
int i, x; /* Declares two ints, i and x, in one declaration.
             Technically, this is not the comma operator. */

/* this loop initializes x and i to 0, then runs the loop */
for (x = 0, i = 0; i <= 6; i++) {
    printf("x = %d, and i = %d\n", x, i);
}
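A typical use of the comma operator is to advance two loop variables in the final clause of a for loop. The sketch below reverses an array in place; the function name is hypothetical.

```c
#include <stddef.h>

/* Reverses the first n elements of a in place. */
void reverse_ints(int *a, size_t n)
{
    size_t i, j;

    /* Both the first clause (i = 0, j = n) and the last clause
     * (i++, j--) use the comma operator to do two things at once. */
    for (i = 0, j = n; i < j; i++, j--) {
        int tmp = a[i];
        a[i] = a[j - 1];
        a[j - 1] = tmp;
    }
}
```

Here i walks forward from the start while j walks backward from one past the end, and the loop stops when they meet.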
References
[1] GCC: "Optimize common rotate constructs"
[2] "Cleanups in ROTL/ROTR DAG combiner code" (mentions that this code supports the "rotate" instruction in the CellSPU)
[3] "replace private copy of bit rotation routines" (recommends including "bitops.h" and using its rol32 and ror32 rather than copying and pasting them into a new program)