# “Modern C”: Notes on chapter 5 “Basic values and data”

By **Dmitry Kabanov**

These are my notes taken while reading chapter 5 “Basic values and data” from the book “Modern C” by Jens Gustedt.

This chapter discusses values of different objects that are used in a C program, and how they are represented.

The table of contents for all notes on this book is available in that post.

C programs manipulate data values, which have different representations in a computer. The programmer is abstracted away from the actual representations. The actual program is described by C’s abstract state machine, which has objects that are represented with concrete types on a concrete computer, and values that change over time (hence, the state changes).

## 5.1 Abstract state machine

A program passes through different states as the values it manipulates
change.
Values are **observable** when they are assigned to variables,
and not observable when they are results of intermediate expressions.
For example, in this statement

```
x = (x * 1.5) - y;
```

the subexpression `(x * 1.5)` is hidden as we never assign it to a variable,
while the value of `x` may be observable as it is a variable and hence
is saved in **addressable memory**.

The C compiler is allowed to do optimizations that remove some variables if it is clear that the end result does not change.

*Takeaway 5.2*. All values in C programs are numbers or translate to numbers.

A *type* is an additional property that is associated with a value.
In C programs all values have types that are statically determined.
Also, results of computations depend on the type: for example,
if the type of a subexpression is `unsigned`, the result cannot be negative.
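A minimal sketch of how the type decides the result. The function name `difference` is made up for illustration; only the standard header `limits.h` is assumed.

```c
#include <limits.h>

/* Subtraction on unsigned operands is performed modulo UINT_MAX + 1,
   so the result is never negative, even when b is larger than a. */
unsigned difference(unsigned a, unsigned b) {
    return a - b;
}
```

For instance, `difference(1u, 2u)` yields `UINT_MAX` rather than `-1`.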

The types are also abstract in the sense that their concrete representation depends on the actual platform.

## 5.2 Basic types

Some of the basic types are built-in keywords such as `unsigned`, `int`,
and `double`.
Some other basic types are defined in header files, such as `bool`
or `size_t`.

Actually, all basic types in C are numbers or can be treated as numbers. There are two principal classes of numbers: integers and floating-point numbers. Integers can be subclassed as signed and unsigned, while floating-point numbers are subclassed as real and complex.

There are *narrow types* for integers that are *promoted* to a wider type
during computations.
For example, the narrow types `bool`, `(un)signed char`, and `(un)signed short`
are promoted on most of today’s platforms to `signed int`.
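Integer promotion can be observed directly. The sketch below is only an illustration with made-up function names: it adds two `unsigned char` values and inspects the size of the type of their sum.

```c
#include <stddef.h>

/* Both operands are promoted before the addition, so the sum is
   computed in the promoted type and does not wrap at 255. */
int add_narrow(unsigned char a, unsigned char b) {
    return a + b;
}

/* The expression a + b has the promoted type, not unsigned char,
   so sizeof reports the size of the promoted type. */
size_t size_of_promoted_sum(void) {
    unsigned char a = 0, b = 0;
    return sizeof(a + b);
}
```

Here `add_narrow(200, 100)` gives `300`, a value that would not fit into an 8-bit `unsigned char`.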

Floating-point numbers are `float`, `double`, `long double` for reals,
and `float complex`, `double complex`, `long double complex` for complex
numbers.

The precision, that is, the range of values, of these types is not strictly
defined: it depends on the platform.
However, the C standard constrains the types.
For example, the range of `char` is no larger than that of `short`,
the range of `short` is no larger than that of `int`,
and the range of `int` is no larger than that of `long`.
On my Linux `x86_64` machine, the GCC compiler says that `int` has a size
of 4 bytes, while `long` has a size of 8 bytes, so the range of values for `long`
is much bigger than for `int` on this particular platform.
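The ordering constraints can be checked at compile time, and the actual widths queried at run time. This is a sketch that assumes a C11 compiler (for `static_assert`); the helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The standard guarantees that each type covers at least the range
   of the previous one; these checks fail at compile time otherwise. */
static_assert(SCHAR_MAX <= SHRT_MAX, "short covers the range of signed char");
static_assert(SHRT_MAX <= INT_MAX, "int covers the range of short");
static_assert(INT_MAX <= LONG_MAX, "long covers the range of int");

/* Size of the object representation in bits (padding bits included). */
size_t bits_of_int(void)  { return sizeof(int) * CHAR_BIT; }
size_t bits_of_long(void) { return sizeof(long) * CHAR_BIT; }
```

On a typical Linux `x86_64` system these helpers are expected to return 32 and 64, but only the lower bounds of 16 and 32 bits are guaranteed.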

Signed and unsigned versions of a type have the same size in bytes, but
they represent different maximum values: for example, a typical 32-bit
`int` has the maximum `2^31 - 1`, which is about 2 billion, while
`unsigned int` has the maximum `2^32 - 1`, about 4 billion.

There are special semantic types. For example, `stddef.h` defines
the type `size_t` to represent sizes in programs and the type `ptrdiff_t`
to represent differences between sizes or pointers (negative differences
are allowed).
The header file `stdint.h` adds the types `uintmax_t` and `intmax_t`, which denote
the widest unsigned and signed integer types on the platform,
respectively.
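A small sketch of the two semantic types from `stddef.h` in use. The function names `element_distance` and `bytes_needed` are assumptions made for this example.

```c
#include <stddef.h>

/* Number of elements between two pointers into the same array.
   The result is signed: it is negative when `to` precedes `from`. */
ptrdiff_t element_distance(double const* from, double const* to) {
    return to - from;
}

/* Sizes and element counts are naturally expressed as size_t. */
size_t bytes_needed(size_t count) {
    return count * sizeof(double);
}
```

With `double arr[5]`, the call `element_distance(&arr[4], &arr[1])` gives `-3`, which an unsigned `size_t` could not express.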

## 5.3 Specifying values

Values can be specified as normal **decimal** integers,
**hexadecimal** integers such as `0x25ABB7F`,
**decimal floating-point** numbers such as `3.14E0`,
**hexadecimal floating-point** numbers such as `0x7.AFP10`,
**characters** such as `'A'`,
and **strings** such as `"Hello\b and Heaven"`, where the escape sequence `\b`
is a backspace, which on a typical terminal moves the cursor back over the previous character.

Integer literals can be given an explicit type: for example, `3` is perfectly
representable as `short`, but by itself it has type `int`; we can prescribe
it to be of type `unsigned long` by adding a suffix: `3UL`.

Floating-point literals by themselves are of type `double` but
can be specified as `float` or `long double` with the suffixes `F` and `L`,
respectively.
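The type of a literal can be inspected with a C11 generic selection. The macro `HAS_TYPE` below is a hypothetical helper written for these notes, not something from the book.

```c
/* Evaluates to 1 if the expression x has exactly the type T,
   and to 0 otherwise (uses the C11 _Generic selection). */
#define HAS_TYPE(x, T) _Generic((x), T: 1, default: 0)
```

With this macro, `HAS_TYPE(3, int)`, `HAS_TYPE(3UL, unsigned long)`, `HAS_TYPE(3.14E0, double)`, `HAS_TYPE(3.14E0F, float)`, and `HAS_TYPE(3.14E0L, long double)` all evaluate to 1, while `HAS_TYPE(3, short)` evaluates to 0.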

Complex numbers can be specified with the help of the macro `I`.
A value of type `double complex` is given by expressions like `2.5 + 0.3*I`
and a value of type `float complex` by expressions like `0.5F + 0.3F*I`.
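As a sketch, here is how such values could be built and taken apart again. It assumes a platform that supports complex types (the header `complex.h` is optional since C11), and the function names are invented for the example.

```c
#include <complex.h>

/* Builds a double complex value from its parts using the macro I. */
double complex make_complex(double re, double im) {
    return re + im * I;
}

/* creal and cimag extract the real and imaginary parts. */
double real_part(double complex z) { return creal(z); }
double imag_part(double complex z) { return cimag(z); }
```

For example, `make_complex(2.5, 0.3)` corresponds to the expression `2.5 + 0.3*I`.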

## 5.4 Implicit conversions

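The C compiler performs a lot of implicit type conversions, and their
results can be surprising.
For example, the expression `-1U` has type `unsigned` because the minus operator
does not change the type, so its value is a large positive number rather than `-1`.
Likewise, an expression whose result is greater than `2^31 - 1`
may not fit into a typical 32-bit `int`.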
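Two of these surprises can be sketched in a few lines. The function names are illustrative only; the second function deliberately mixes signed and unsigned operands to show what the recommendations warn about.

```c
#include <limits.h>

/* The literal 1U is unsigned; negating it keeps the type unsigned,
   so the value wraps around to the largest unsigned value. */
unsigned minus_one_unsigned(void) {
    return -1U;
}

/* In a mixed comparison the int operand is converted to unsigned
   first, so a negative s turns into a very large value. */
int less_than_mixed(int s, unsigned u) {
    return s < u;
}
```

Here `minus_one_unsigned()` equals `UINT_MAX`, and `less_than_mixed(-1, 1u)` is 0 even though `-1 < 1` holds mathematically.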

The recommendations are basically to avoid narrowing conversions, that is, to avoid assigning the result of an expression to a variable of narrower type. Also, it is not recommended to mix signed and unsigned expressions. Last, use unsigned types when you can.

## 5.5 Initializers

In C, practically all variables should be initialized.

Initialization of arrays can explicitly state the index of the element, which is preferable:

```
double A[] = {7.8};
double B[] = {[0] = 2.5, [3] = 47.23};
```

The default initialization value is `0`.
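The following sketch shows what happens to the elements that an initializer does not mention. The array `Z` and the constant `B_LEN` are additions made for this illustration.

```c
/* Designated initializers: elements [1] and [2] are not mentioned,
   so they are initialized to 0. The array length is deduced as 4. */
static double const B[] = { [0] = 2.5, [3] = 47.23 };

/* { 0 } is a common way to set every element of an array to zero. */
static double const Z[3] = { 0 };

/* Number of elements of B, computed at compile time. */
enum { B_LEN = sizeof B / sizeof B[0] };
```

So `B` holds the values `2.5, 0.0, 0.0, 47.23`, and `B_LEN` is 4.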

## 5.6 Named constants

Sometimes we have constant values, and instead of using them literally,
it is better to name them.
For semantic reasons, even if the same value is used in the program
with different meanings, there should be several different named
constants with that value.
C offers two ways to do this: using `enum` or using macros.

Constants should be distinguished from `const`-qualified objects.
The qualifier `const` makes a variable read-only after it is initialized.

For example, the type `char const* const` denotes a read-only pointer
to a read-only string.

C allows naming small integers via enumerations:

```
enum corvid { magpie, raven, jay, n_corvids };
```

where `magpie` is set to zero, `raven` to one, and so on.
Note that here we use the idiom of adding a last constant
that tells us the number of elements in the enumeration.
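The counting idiom pays off when the enumeration indexes an array. The sketch below uses a hypothetical array `corvid_name` for illustration.

```c
enum corvid { magpie, raven, jay, n_corvids };

/* n_corvids is 3, so it sizes an array with one entry per bird.
   char const* const makes both the pointers and the characters
   they point to read-only. */
static char const* const corvid_name[n_corvids] = {
    [magpie] = "magpie",
    [raven]  = "raven",
    [jay]    = "jay",
};
```

If a new bird is added before `n_corvids`, the array grows automatically, and the designated initializers keep names and constants in sync.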

Enumeration constants are of type `signed int` and can be initialized
in more complex ways, for example,

```
enum constants { p0 = 1, p1 = 7*p0, p2 = 2*p1 };
```

as long as the initialization values are integer constant expressions, that is, can be fully determined at compile time.

To declare constants of types other than `signed int`, the only way
is to use macros, which are handled by the C preprocessor.
Macros are defined like this:

```
#define M_PI 3.1415926
```

When the C preprocessor *preprocesses* a source file, it replaces
every occurrence of the identifier `M_PI` with the actual value
(which is a `double` literal in this case).

It is usually a good idea to write macro names in all caps, `LIKE_THIS`,
although in the C library some names do not follow this convention
(for example, `false`).

## 5.7 Binary representations

This section is super dense and technical, so I only skimmed through it as I do not require the knowledge from it directly for my current project.

The most interesting bits that I have noticed are the following.

Unsigned integers wrap around nicely (they form a ring in the mathematical sense) and never lead to problems.

Signed integers can *trap*, that is, lead to errors such as arithmetic
overflow.
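Because signed overflow must be avoided rather than relied upon, a common approach is to test before adding. This is a minimal sketch with a hypothetical helper name, not a technique taken from the book.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would leave the range of int.
   The test itself never performs the overflowing addition. */
bool add_would_overflow(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}
```

For example, `add_would_overflow(INT_MAX, 1)` is true, while the corresponding unsigned addition would simply wrap around to 0.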

The header file `stdint.h` provides fixed-width integers, for example,
`uint32_t` for unsigned integers of width 32 bits, or `int8_t`
for signed integers of width 8 bits.
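A short sketch of fixed-width types in action, assuming a platform that provides the exact-width types (they are optional in the standard). The function names are made up.

```c
#include <stdint.h>

/* The operands are promoted for the addition; converting the result
   back to uint8_t reduces it modulo 2^8 = 256. */
uint8_t add_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(a + b);
}

/* uint32_t has exactly 32 bits, so its maximum is 2^32 - 1. */
uint32_t max_u32(void) {
    return UINT32_MAX;
}
```

Here `add_u8(250, 10)` gives `4`, since 260 modulo 256 is 4.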

Floating-point numbers such as `float`s and `double`s represent a subset of real numbers. Only values that can be expanded in a finite sum of powers of 2 can be represented exactly, for example `0.5`, while others cannot: for example, `0.3`
has an infinitely repeating expansion in binary representation.
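The effect of inexact representation shows up in simple sums. The sketch below assumes that `double` is an IEEE 754 64-bit type evaluated without extra precision, which holds on typical `x86_64` platforms; the function name is hypothetical.

```c
/* Returns 1 if a + b, rounded to double, compares equal to expected. */
int sum_equals(double a, double b, double expected) {
    double sum = a + b;
    return sum == expected;
}
```

Under these assumptions `sum_equals(0.5, 0.25, 0.75)` is 1 because all three values are exact in binary, while `sum_equals(0.1, 0.2, 0.3)` is 0.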

Also, floating-point operations do not obey the usual arithmetic laws (associative, commutative, distributive), which means that changing the order of operations can give a different result. Also, operations on numbers of very different magnitudes can give results that differ from exact mathematics. For example, adding a very small number to a very large number can return just the large number.
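Both effects can be reproduced with three numbers. This sketch again assumes IEEE 754 64-bit `double` arithmetic without extra precision or aggressive floating-point optimizations; the function names are invented.

```c
/* The same three operands, summed with different grouping. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }
```

With `a = 1e20`, `b = -1e20`, and `c = 1.0`, `sum_left` returns `1.0`, whereas `sum_right` returns `0.0` because `1.0` is absorbed when it is added to `-1e20`.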

Floating-point numbers must not be compared for equality. The only meaningful comparison is how close they are.
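One simple way to compare closeness is an absolute tolerance. This is a sketch under that assumption: the name `nearly_equal` and the choice of tolerance are mine, and a relative tolerance is often the better choice when magnitudes vary widely.

```c
#include <stdbool.h>

/* True if a and b differ by at most abs_tol (absolute tolerance). */
bool nearly_equal(double a, double b, double abs_tol) {
    double diff = a - b;
    if (diff < 0.0) diff = -diff;
    return diff <= abs_tol;
}
```

For example, `nearly_equal(0.1 + 0.2, 0.3, 1e-12)` is true even though the two values are not bitwise equal.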

Complex numbers are pairs of real numbers. The header file `tgmath.h`
provides the type-generic macros `creal` and `cimag`, which return
the real and imaginary parts, respectively.