“Modern C”: Notes on chapter 5 “Basic values and data”
By Dmitry Kabanov
These are my notes taken while reading chapter 5 “Basic values and data” from the book “Modern C” by Jens Gustedt.
This chapter discusses values of different objects that are used in a C program, and how they are represented.
The table of contents for all notes on this book is available in that post.
C programs manipulate data values that have different representations in a computer. The programmer is abstracted away from the actual representations. The actual program runs on C's abstract state machine, which has objects that are represented with concrete types on a concrete computer, and values that change over time (hence, state changes).
5.1 Abstract state machine
A program passes through different states as the values it manipulates change. Values are observable when they are assigned to variables, and not observable when they are results of intermediate expressions. For example, in this statement
x = (x * 1.5) - y;
the subexpression (x * 1.5) is hidden, as we never assign it to a variable,
while the value of x may be observable, as x is a variable and hence is saved in addressable memory.
The C compiler is allowed to perform optimizations that remove some variables if it is clear that the end result does not change.
Takeaway 5.2. All values in C programs are numbers or translate to numbers.
A type is an additional property that is associated with a value.
In C programs all values have types that are statically determined.
Also, results of computations depend on the type: for example, if the type of a subexpression is unsigned, the result cannot be negative.
Also, the types are abstract in the sense that their concrete representation depends on the platform.
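To see how the type determines the result, here is a minimal sketch that evaluates the same subtraction with signed and unsigned operands. The helper names diff_signed, diff_unsigned, and check_type_determines_result are made up for these notes.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers: the same expression, evaluated with different types. */
int diff_signed(int a, int b) { return a - b; }
unsigned diff_unsigned(unsigned a, unsigned b) { return a - b; }

void check_type_determines_result(void) {
    assert(diff_signed(2, 3) == -1);          /* a signed result may be negative  */
    assert(diff_unsigned(2, 3) == UINT_MAX);  /* an unsigned result wraps around  */
}
```

Calling check_type_determines_result() should pass silently: the unsigned difference is never negative, it is the largest unsigned value instead.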
5.2 Basic types
Some of the basic types are built-in keywords, such as unsigned, int, and double.
Some other basic types are defined in header files, such as bool or size_t.
Actually, all basic types in C are numbers or can be treated as numbers. There are two principal classes of numbers: integers and floating-point numbers. Integers can be subclassed as signed and unsigned, while floating-point numbers are subclassed as real and complex.
There are narrow types for integers that are promoted to a wider type during computations.
For example, the narrow types bool, (un)signed char, and (un)signed short are promoted on most of today's platforms to signed int.
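A small sketch of integer promotion, assuming the common case where int can hold every unsigned char value. The helper names promoted_sum, narrow_sum, and check_promotion are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: a and b are promoted to int before the addition. */
int promoted_sum(unsigned char a, unsigned char b) {
    return a + b;
}

/* Same addition, but the result is converted back to the narrow type. */
unsigned char narrow_sum(unsigned char a, unsigned char b) {
    return (unsigned char)(a + b);   /* reduced modulo UCHAR_MAX + 1 */
}

void check_promotion(void) {
    unsigned char a = 200, b = 100;
    assert(sizeof(a + b) == sizeof(int));               /* the sum has type int */
    assert(promoted_sum(a, b) == 300);                  /* no wraparound here   */
    assert(narrow_sum(a, b) == 300 % (UCHAR_MAX + 1));  /* 44 for 8-bit char    */
}
```

The wraparound only appears when the int result is stored back into the narrow type, not during the computation itself.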
Floating-point types are float, double, and long double for reals,
and float complex, double complex, and long double complex for complex numbers.
The ranges (and, for floating-point types, the precision) of these types are not strictly defined,
that is, they depend on the platform.
However, the C standard constrains the types.
For example, char is no wider than short, short is no wider than int, and int is no wider than long.
On my Linux x86_64 machine, the GCC compiler says that int has a size of 4 bytes and long a size of 8 bytes, so the range of values for long is much bigger than for int on this particular platform.
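The ordering can be checked at compile time, and the sizes printed at run time. This is only a sketch: print_integer_sizes is a hypothetical helper, and the printed numbers depend on the platform.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdio.h>

/* Ordering of the ranges, checked at compile time. */
static_assert(SCHAR_MAX <= SHRT_MAX, "short covers at least the range of signed char");
static_assert(SHRT_MAX <= INT_MAX, "int covers at least the range of short");
static_assert(INT_MAX <= LONG_MAX, "long covers at least the range of int");

/* Hypothetical helper: prints the platform-specific sizes in bytes. */
void print_integer_sizes(void) {
    printf("char:  %zu\n", sizeof(char));
    printf("short: %zu\n", sizeof(short));
    printf("int:   %zu\n", sizeof(int));
    printf("long:  %zu\n", sizeof(long));
}
```

If one of the static assertions failed, the file would not compile, so a successful build already confirms the ordering on the current platform.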
The signed and unsigned versions of a type have the same size in bytes, so they
differ in the maximum value they can represent: for example, a typical 32-bit
int has maximum 2^31 - 1, which is about 2 billion, while
unsigned int has maximum 2^32 - 1, about 4 billion.
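The relation between the two maxima can be sketched with the macros from limits.h. The helper unsigned_has_extra_value_bit is hypothetical; the property it tests holds on typical platforms but is not guaranteed by the standard.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>

/* Guaranteed: int and unsigned occupy the same amount of storage. */
static_assert(sizeof(int) == sizeof(unsigned), "same storage size");

/* Hypothetical helper: returns 1 if UINT_MAX == 2 * INT_MAX + 1, that is,
   if the sign bit of int is used as one more value bit in unsigned. */
int unsigned_has_extra_value_bit(void) {
    return UINT_MAX == 2u * INT_MAX + 1u;
}
```

With 32-bit int this compares 4294967295 against 2 * 2147483647 + 1, which are equal.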
There are special semantic types. For example, stddef.h defines
the type size_t to represent sizes in programs and the type ptrdiff_t
to represent differences between positions, such as two pointers into the same array
(negative differences are allowed).
The header file stdint.h adds the types uintmax_t and intmax_t, which denote
the widest unsigned and signed integer types on the platform, respectively.
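A minimal sketch of these semantic types in use. The helpers distance and show_semantic_types are made up for illustration; the printf length modifiers z, t, and j are the standard ones for size_t, ptrdiff_t, and intmax_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: signed distance, in elements, between two positions
   in the same array. */
ptrdiff_t distance(double const *from, double const *to) {
    return to - from;
}

void show_semantic_types(void) {
    double a[10] = {0};
    size_t n = sizeof a / sizeof a[0];        /* number of elements */
    ptrdiff_t back = distance(&a[7], &a[2]);  /* negative distance  */
    assert(n == 10 && back == -5);
    printf("n = %zu, back = %td, INTMAX_MAX = %jd\n", n, back, INTMAX_MAX);
}
```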
5.3 Specifying values
Values can be specified as normal decimal integers,
hexadecimal integers such as 0x25ABB7F,
decimal floating-point numbers such as 3.14E0,
hexadecimal floating-point numbers such as 0x7.AFP10,
characters such as 'A',
and strings such as "Hello\b and Heaven", where the escape sequence \b is a backspace:
on a typical terminal it moves the cursor back one position, so the next character overwrites the previous one.
Integer literals can be given a specific type: for example, 3 is perfectly
representable as short, but we can prescribe it to be of type
unsigned long by adding a suffix: 3UL.
Floating-point literals by themselves are of type double but
can be specified as float or long double with the suffixes F and L.
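The types of literals can be inspected with C11's _Generic. This is a sketch; HAS_TYPE and check_literal_types are hypothetical names, not something from the book.

```c
#include <assert.h>

/* Hypothetical macro: 1 if expression X has exactly type T, else 0. */
#define HAS_TYPE(X, T) _Generic((X), T: 1, default: 0)

void check_literal_types(void) {
    assert(HAS_TYPE(3, int));                /* unsuffixed small integer       */
    assert(HAS_TYPE(3UL, unsigned long));    /* suffix UL                      */
    assert(HAS_TYPE('A', int));              /* character constants are int    */
    assert(HAS_TYPE(3.14E0, double));        /* unsuffixed floating-point      */
    assert(HAS_TYPE(3.14E0F, float));        /* suffix F                       */
    assert(HAS_TYPE(3.14E0L, long double));  /* suffix L                       */
}
```

Note that the unsuffixed literal 3 has type int, even though its value would also fit into a short.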
Complex numbers can be specified with the help of the macro I.
A value of type double complex is given by expressions like 2.5 + 0.3*I
and a value of type float complex by expressions like 0.5F + 0.3F*I.
5.4 Implicit conversions
The C compiler performs a lot of implicit type conversions.
For example, the expression -1U has type unsigned because the minus operator
does not change the type; its value wraps around to the largest unsigned value,
which is more than 2^31 - 1 and so does not fit into a typical 32-bit int.
The recommendations are basically to avoid narrowing conversions, that is, to avoid assigning the result of an expression to a variable of narrower type. Also, it is not recommended to mix signed and unsigned expressions. Last, use unsigned types when you can.
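The pitfalls behind these recommendations can be sketched as follows. All helper names here are hypothetical, and a compiler may warn about the mixed comparison, which is the point of the example.

```c
#include <assert.h>
#include <limits.h>

/* -1U: the literal 1U is unsigned and unary minus keeps that type. */
unsigned minus_one_unsigned(void) { return -1U; }

/* Mixed comparison: s is implicitly converted to unsigned before comparing. */
int signed_less_than_unsigned(int s, unsigned u) { return s < u; }

/* Narrowing conversion: the value is reduced to fit the narrower type. */
unsigned char narrow(unsigned wide) { return (unsigned char)wide; }

void check_conversions(void) {
    assert(minus_one_unsigned() == UINT_MAX);
    assert(signed_less_than_unsigned(-1, 1U) == 0);  /* -1 becomes UINT_MAX */
    assert(narrow(UCHAR_MAX + 1u) == 0);             /* high bits are lost  */
}
```

The second assertion is the surprising one: mathematically -1 is less than 1, but after conversion to unsigned the comparison is false.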
5.5 Initializers
In C, practically all variables should be initialized.
An array initializer can explicitly state the index of each element (a designated initializer), which is preferable:
double A[] = {7.8};
double B[] = {[0] = 2.5, [3] = 47.23};
Elements that are not initialized explicitly, such as B[1] and B[2] above, get the default value 0.
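The effect of these initializers can be checked directly. In this sketch the arrays are the ones from the text, made const and file-local; ARRAY_LEN and check_initializers are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

static double const A[] = {7.8};
static double const B[] = {[0] = 2.5, [3] = 47.23};

/* Number of elements of an array (must not be used on a pointer). */
#define ARRAY_LEN(X) (sizeof(X) / sizeof((X)[0]))

void check_initializers(void) {
    assert(ARRAY_LEN(A) == 1);
    assert(ARRAY_LEN(B) == 4);            /* the largest index is 3       */
    assert(B[1] == 0.0 && B[2] == 0.0);   /* unspecified elements are 0   */
    assert(B[0] == 2.5 && B[3] == 47.23);
}
```

The length of B is deduced from the largest designated index, so the array has four elements even though only two are listed.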
5.6 Named constants
Sometimes we have constant values, and instead of using them literally,
it is better to name them.
For semantic reasons, if the same value is used in the program
with different meanings, there should be several different named
constants with the same value.
C offers two ways to do this: using enum or using macros.
Constants should be distinguished from const-qualified objects.
The qualifier const makes a variable read-only after it is initialized.
For example, the type char const* const denotes a read-only pointer
to a read-only string.
C allows us to name small integers via enumerations:
enum corvid { magpie, raven, jay, n_corvids };
where magpie is initialized to zero, raven to one, and so on.
Note that here we use the idiom of adding, as the last element, a constant
that tells us the number of elements in the enumeration.
Enumeration constants are of type signed int and can be initialized
in more complex ways, for example,
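A sketch of how the counting constant is typically used: it sizes an array that is indexed by the enumeration. The array corvid_name and the helper name_of are hypothetical names chosen for these notes.

```c
#include <assert.h>   /* static_assert (C11) */

enum corvid { magpie, raven, jay, n_corvids };

static_assert(n_corvids == 3, "the last constant counts the others");

/* Read-only array of read-only strings, sized by the enumeration itself. */
static char const *const corvid_name[n_corvids] = {
    [magpie] = "magpie",
    [raven]  = "raven",
    [jay]    = "jay",
};

/* Hypothetical lookup helper with a bounds check. */
char const *name_of(enum corvid c) {
    return ((unsigned)c < (unsigned)n_corvids) ? corvid_name[c] : "unknown";
}
```

If a new bird is added before n_corvids, the array grows automatically and the static assertion reminds us to update everything that depends on the count.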
enum constants { p0 = 1, p1 = 7*p0, p2 = 2*p1 };
as long as the initialization values are integer constant expressions, that is, can be fully determined at compile time.
To declare constants of types other than signed int, the only way
is to use macros, which are handled by the C preprocessor.
Macros are defined like this:
#define M_PI 3.1415926
When the C preprocessor processes a source file, it replaces
all occurrences of the identifier M_PI with the actual value (which is a double literal
in this case).
It is usually a good idea to write macro names in all caps, LIKE_THIS,
although some names in the C library do not follow this convention
(for example, false).
5.7 Binary representations
This section is super dense and technical, so I only skimmed through it as I do not require the knowledge from it directly for my current project.
The most interesting bits that I have noticed are the following.
Unsigned integers wrap nicely (they form a ring in a mathematical sense), so their arithmetic never leads to problems.
Signed integers can trap, that is, lead to errors such as arithmetic overflow.
The header file stdint.h provides fixed-width integers, for example,
uint32_t for unsigned integers of width 32 bits, or int8_t
for signed integers of width 8 bits.
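The wrapping of unsigned types and one defensive way to deal with signed overflow can be sketched like this. The helpers next_u32, add_would_overflow, and check_wrapping are hypothetical, and the exact-width types are assumed to exist, as they do on common platforms.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned arithmetic wraps: for uint32_t it is arithmetic modulo 2^32. */
uint32_t next_u32(uint32_t x) { return x + 1u; }

/* Signed overflow is not defined, so test before adding (hypothetical helper). */
bool add_would_overflow(int a, int b) {
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

void check_wrapping(void) {
    assert(next_u32(UINT32_MAX) == 0);         /* wraps around to zero */
    assert((uint8_t)(UINT8_MAX + 1) == 0);     /* same for 8-bit width */
    assert(add_would_overflow(INT_MAX, 1));
    assert(!add_would_overflow(INT_MAX, -1));
}
```

The check in add_would_overflow is written so that none of its own subexpressions can overflow.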
Floating-point numbers such as floats and doubles represent a subset of the real numbers. Only values that can be written as a finite sum of powers of 2 can be represented exactly, for example 0.5,
while others cannot: for example, 0.3 has an infinitely repeating expansion in binary, much like 1/3 in decimal.
Also, floating-point operations do not obey the usual arithmetic laws (associative, commutative, distributive), which means that changing the order of operations can give a different result. Also, operations on numbers of very different magnitudes can give results that differ from exact mathematics. For example, adding a very small number to a very large number can yield just the large number.
Floating-point numbers must not be compared for equality. The only meaningful comparison is how close they are.
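These effects are easy to reproduce. The sketch below assumes IEEE 754 doubles evaluated in double precision (as on x86_64 without options like -ffast-math); sum_is_associative, nearly_equal, and the absolute tolerance are my own choices, not the book's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: does the grouping of the additions matter? */
bool sum_is_associative(double a, double b, double c) {
    return (a + b) + c == a + (b + c);
}

/* Hypothetical helper: closeness test with an absolute tolerance tol >= 0. */
bool nearly_equal(double a, double b, double tol) {
    double d = a - b;
    if (d < 0) d = -d;
    return d <= tol;
}

void check_floating_point(void) {
    double big = 1e20, small = 1.0;
    assert(big + small == big);                    /* small is absorbed          */
    assert(0.5 + 0.25 == 0.75);                    /* sums of powers of 2: exact */
    assert(!sum_is_associative(0.1, 0.2, 0.3));    /* grouping changes result    */
    assert(nearly_equal(0.1 + 0.2, 0.3, 1e-12));   /* close, though not equal    */
}
```

An absolute tolerance is the simplest choice; for values of very different magnitudes a tolerance relative to the operands is usually more appropriate.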
A complex number is a pair of real numbers. The header file tgmath.h
provides the type-generic macros creal and cimag, which return
the real and imaginary parts, respectively.
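A short sketch of the type-generic macros applied to both a double complex and a float complex value. The function check_complex_parts is hypothetical; the exact comparisons are safe here only because the parts are stored and read back without any arithmetic on them.

```c
#include <assert.h>
#include <complex.h>
#include <tgmath.h>

void check_complex_parts(void) {
    double complex z = 2.5 + 0.3*I;
    float complex  w = 0.5F + 0.3F*I;

    /* The same macro names work for both types. */
    assert(creal(z) == 2.5 && cimag(z) == 0.3);
    assert(creal(w) == 0.5F && cimag(w) == 0.3F);

    /* A complex value is stored as a pair of reals. */
    assert(sizeof z == 2 * sizeof(double));
    assert(sizeof w == 2 * sizeof(float));
}
```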