Chapter 7
Polar Coordinate Systems
First of all, we must note that the universe is spherical.
— Nicolaus Copernicus (1473–1543)
The Cartesian coordinate system isn't the only system for mapping out space and defining
locations precisely. An alternative to the Cartesian system is the polar coordinate
system, which is the subject of this chapter. If you're not very familiar with polar
coordinates, it might seem like an esoteric or advanced topic (especially because of the trig),
and you might be tempted to gloss over it. Please don't make this mistake. There are many very
practical problems in areas such as AI and camera control whose solutions (and inherent
difficulties!) can be readily understood in the framework of polar coordinates.
This chapter is organized into the following sections:

Section 7.1 describes 2D polar coordinates.

Section 7.2 gives some examples where polar coordinates are
preferable to Cartesian coordinates.

Section 7.3 shows how polar space works in 3D and introduces
cylindrical and spherical coordinates.

Finally, Section 7.4 makes it clear that polar space
can be used to describe vectors as well as positions.
7.1 2D Polar Space
This section introduces the basic idea behind polar coordinates, using two dimensions to get us
warmed up. Section 7.1.1 shows how to use polar coordinates to describe
position. Section 7.1.2 discusses aliasing of polar coordinates.
Section 7.1.3 shows how to convert between polar and Cartesian coordinates
in 2D.
7.1.1 Locating Points by Using 2D Polar Coordinates
Remember that a 2D Cartesian coordinate space has an origin, which establishes the position of
the coordinate space, and two axes that pass through the origin, which establish the orientation
of the space. A 2D polar coordinate space also has an origin (known as the pole), which
has the same basic purpose—it defines the “center” of the coordinate space. A polar
coordinate space has only one axis, however, sometimes called the polar axis, which
is usually depicted as a ray from the origin. It is customary in math literature for the polar
axis to point to the right in diagrams, and thus it corresponds to the
$+x$
axis in a Cartesian
system, as shown in Figure 7.1.
It's often convenient to use different conventions than this, as
shown in
Section 7.3.3. Until then, our
discussion adopts the traditional conventions of the math
literature.
In the Cartesian coordinate system, we described a 2D point using two signed distances,
$x$
and
$y$
. The polar coordinate system uses one distance and one angle. By convention, the
distance is usually assigned to the variable
$r$
(which is short for “radius”) and the angle is
usually called
$\theta $
.
The polar coordinate pair
$(r,\theta )$
specifies a point in 2D space as follows:
Locating the point described by 2D polar
coordinates
$\mathbf{(}\mathit{r}\mathbf{,}\mathit{\theta}\mathbf{)}$
Step 1. Start at the origin, facing in the direction of the polar
axis, and rotate by the angle
$\theta $
. Positive values of
$\theta $
are usually interpreted to mean counterclockwise rotation;
negative values mean clockwise rotation.
Step 2. Now move forward from the origin a distance of
$r$
units. You have arrived at the point described by the
polar coordinates
$(r,\theta )$
.
This process is shown in Figure 7.2.
In summary,
$r$
defines the distance from the point to the origin, and
$\theta $
defines the
direction of the point from the origin. Figure 7.3 shows several points
and their polar coordinates. You should study this figure until you are convinced that you know
how it works.
You might have noticed that the diagrams of polar coordinate spaces contain grid lines, but that
these grid lines are slightly different from the grid lines used in diagrams of Cartesian
coordinate systems. Each grid line in a Cartesian coordinate system is composed of points with
the same value for one of the coordinates. A vertical line is composed of points that all have
the same
$x$
coordinate, and a horizontal line is composed of points that all have the same
$y$
coordinate. The grid lines in a polar coordinate system are similar:

The “grid circles” show lines of constant
$r$
. This makes sense; after all, the
definition of a circle is the set of all points equidistant from its center. That's
why the letter
$r$
is the customary variable to hold this distance, because it is a
radial distance.

The straight grid lines that pass through the origin show
lines of constant
$\theta $
, consisting of points that are the
same direction from the origin.
One note regarding angle measurements. With Cartesian coordinates, the unit of measure wasn't
really significant. We could interpret diagrams using feet, meters, miles, yards, light-years,
beard-seconds, or picas, and it didn't really matter. If you take some Cartesian coordinate data, interpreting that data using different
physical units just makes whatever you're looking at get bigger or smaller, but it's
proportionally the same shape. However, interpreting the angular component of polar coordinates
using different angular units can produce drastically distorted results.
It really doesn't matter whether you use degrees or radians (or grads, mils, minutes, signs,
sextants, or Furmans), as long as you keep it straight. In the text of this book, we almost
always give specific angular measurements in degrees and use the
${}^{\mathrm{o}}$
symbol after the
number. We do this because we are human beings, and most humans who are not math professors find
it easier to deal with whole numbers rather than fractions of
$\pi $
. Indeed, the choice of the
number 360 was specifically designed to make fractions avoidable in many common cases. However,
computing machines prefer to work with angles expressed using
radians, and so the code snippets in this book use radians rather than degrees.
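Because the text speaks in degrees and the code in radians, a pair of conversion helpers is handy. The following is only a minimal sketch; the names deg_to_rad, rad_to_deg, and kPi are our own, chosen for illustration, and do not appear in any listing in this book.

```c
/* Pi spelled out as a float literal so the sketch does not rely on the
   nonstandard M_PI macro. The name kPi is hypothetical. */
static const float kPi = 3.14159265358979f;

/* Convert an angle from degrees to radians. */
static float deg_to_rad(float degrees) {
    return degrees * (kPi / 180.0f);
}

/* Convert an angle from radians to degrees. */
static float rad_to_deg(float radians) {
    return radians * (180.0f / kPi);
}
```

A reasonable habit is to convert once at the boundary (user input, debug output) and keep every stored angle in radians, so that no function has to guess which unit it was handed.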
7.1.2 Aliasing
Hopefully you're starting to get a good feel for how polar coordinates work and what polar
coordinate space looks like. But there may be some nagging thoughts in the back of your head.
Consciously or subconsciously, you may have noticed a fundamental difference between Cartesian
and polar space. Perhaps you imagined a 2D Cartesian space as a perfectly even continuum of
space, like a flawless sheet of Jell-O, spanning infinitely in all directions, each infinitely
thin bite identical to all the others. Sure, there are some “special” places, like the origin,
and the axes, but those are just like marks on the bottom of the pan; the Jell-O itself is the
same there as everywhere else. But when you imagined the fabric of polar coordinate space,
something was different. Polar coordinate space has some “seams” in it, some discontinuities
where things are a bit “patched together.” In the infinitely large circular pan of Jell-O,
there are multiple sheets of Jell-O stacked on top of each other. When you put your spoon down
at a particular place to get a bite, you often end up with multiple bites! There's a piece of hair
in the block of Jell-O, a singularity that requires special precautions.
Whether your mental image of polar space was of Jell-O, or some other yummy dessert, you were
probably pondering some of these questions:

Can the radial distance
$r$
ever be negative?

Can
$\theta $
ever go outside the interval
$[-180{}^{\mathrm{o}},+180{}^{\mathrm{o}}]$
?

The value of the angle
$\theta $
directly “west” of
the origin (i.e., for points where
$x<0$
and
$y=0$
using
Cartesian coordinates) is ambiguous. You may have noticed that none of
these points are labeled in Figure 7.3.
Is
$\theta $
equal to
$+180{}^{\mathrm{o}}$
or
$-180{}^{\mathrm{o}}$
for these
points?

The polar coordinates for the origin itself are also ambiguous. Clearly
$r=0$
, but what value of
$\theta $
should we use? Wouldn't
any value work?
The answer to all of these questions is “yes.” In fact, we must face
a rather harsh reality about polar space.
For any given point, there are
infinitely many polar coordinate pairs that can be used to
describe that point.
This phenomenon is known as aliasing. Two coordinate pairs are said to be aliases
of each other if they have different numeric values but refer to the same point in space. Notice
that aliasing doesn't happen in Cartesian space—each point in space is assigned exactly one
$(x,y)$
coordinate pair; the mapping of points to coordinate pairs is one-to-one.
Before we discuss some of the difficulties created by aliasing,
let's be clear about one task for which aliasing does not
pose any problems: interpreting a particular polar coordinate pair
$(r,\theta )$
and locating the point in space referred to by those
coordinates. No matter what the values of
$r$
and
$\theta $
, we can
come up with a sensible interpretation.
When
$r<0$
, it is interpreted as “backward” movement—displacement in the opposite direction
that we would move if
$r$
were positive. If
$\theta $
is outside the range
$[-180{}^{\mathrm{o}},+180{}^{\mathrm{o}}]$
, that's not a cause for panic; we can still determine the resulting
direction. In other words, although there may
be some “unusual” polar coordinates, there's no such thing as “invalid” polar coordinates. A
given point in space corresponds to many coordinate pairs, but a coordinate pair unambiguously
designates exactly one point in space.
One way to create an alias for a point
$(r,\theta )$
is to add a
multiple of
$360{}^{\mathrm{o}}$
to
$\theta $
. This adds one or more whole
“revolutions,” but doesn't change the resulting direction defined
by
$\theta $
. Thus
$(r,\theta )$
and
$(r,\theta +k360{}^{\mathrm{o}})$
describe the same point, where
$k$
is an integer. We can also
generate an alias by adding
$180{}^{\mathrm{o}}$
to
$\theta $
and negating
$r$
, which means we face the other direction, but we displace by the
opposite amount.
In general, for any point
$(r,\theta )$
other than the origin, all of
the polar coordinates that are aliases for
$(r,\theta )$
can be
expressed as
$$((-1)^{k}r,\theta +k180{}^{\mathrm{o}}),$$
where
$k$
is any integer.
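As a quick worked example (the point is our own choice, not one taken from a figure), here are a few aliases of
$(5,30{}^{\mathrm{o}})$
obtained by negating
$r$
once for every
$180{}^{\mathrm{o}}$
added to
$\theta $
:

$$\begin{array}{rll}k=-1:& ((-1)^{-1}\cdot 5,\ 30{}^{\mathrm{o}}-180{}^{\mathrm{o}})& =(-5,-150{}^{\mathrm{o}}),\\ k=0:& ((-1)^{0}\cdot 5,\ 30{}^{\mathrm{o}})& =(5,30{}^{\mathrm{o}}),\\ k=1:& ((-1)^{1}\cdot 5,\ 30{}^{\mathrm{o}}+180{}^{\mathrm{o}})& =(-5,210{}^{\mathrm{o}}),\\ k=2:& ((-1)^{2}\cdot 5,\ 30{}^{\mathrm{o}}+360{}^{\mathrm{o}})& =(5,390{}^{\mathrm{o}}).\end{array}$$

In the odd cases we face the opposite direction and walk backward, so all four pairs land on the same point.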
So, in spite of aliasing, we can all agree what point is described by the polar coordinates
$(r,\theta )$
, no matter what values of
$r$
and
$\theta $
are used. But what about the reverse
problem? Given an arbitrary point
$\mathbf{p}$
in space, can we all agree what polar
coordinates
$(r,\theta )$
should be used to describe
$\mathbf{p}$
? We've just said that there are
an infinite number of polar coordinate pairs that could be used to describe the location
$\mathbf{p}$
. Which do we use? The short answer is: “Any one that works is OK, but only
one is the preferred one to use.”
It's like reducing fractions. We all agree that
$13/26$
is a perfectly valid fraction, and
there's no dispute as to what the value of this fraction is. Even so,
$13/26$
is an “unusual”
fraction; most of us would prefer that this value be expressed as
$1/2$
, which is simpler and
easier to understand. A fraction is in the “preferred” format when it's expressed in lowest
terms, meaning there isn't an integer greater than 1 that evenly divides both the numerator and
denominator. We don't have to reduce
$13/26$
to
$1/2$
, but by convention we normally do.
A person's level of commitment to this convention is usually based on how many points their math
teacher counted off on their homework for not reducing fractions to lowest
terms.
For polar coordinates, the “preferred” way to describe any given point is known as the
canonical coordinates for that point. A 2D polar coordinate pair
$(r,\theta )$
is in the
canonical set if
$r$
is nonnegative and
$\theta $
is in the interval
$(-180{}^{\mathrm{o}},180{}^{\mathrm{o}}]$
. Notice that the interval is half open: for points directly “west” of the origin
(
$x<0,y=0)$
, we will use
$\theta =+180{}^{\mathrm{o}}$
. Also, if
$r=0$
(which is only true at the
origin), then we usually assign
$\theta =0$
. If you apply all these rules, then for any given
point in 2D space, there is exactly one way to represent that point using canonical polar
coordinates. We can summarize this succinctly with some math notation. A polar coordinate pair
$(r,\theta )$
is in the canonical set if all of the following are true:
Conditions satisfied by canonical coordinates
$$\begin{array}{rlrl}& r\ge 0& & \text{We don't measure distances ``backwards.''}\\ & -180{}^{\mathrm{o}}<\theta \le 180{}^{\mathrm{o}}& & \begin{array}{c}\text{The angle is limited to 1/2 revolution.}\\ \text{We use }+180{}^{\mathrm{o}}\text{ for ``west.''}\end{array}\\ & r=0\;\Rightarrow \;\theta =0& & \text{At the origin, set the angle to zero.}\end{array}$$
The following algorithm can be used to convert a polar coordinate
pair into its canonical form:
Converting a polar coordinate pair
$\mathbf{(}\mathit{r}\mathbf{,}\mathit{\theta}\mathbf{)}$
to canonical form

If
$r=0$
, then assign
$\theta =0$
.

If
$r<0$
, then negate
$r$
, and add
$180{}^{\mathrm{o}}$
to
$\theta $
.

If
$\theta \le -180{}^{\mathrm{o}}$
, then add
$360{}^{\mathrm{o}}$
to
$\theta $
until
$\theta >-180{}^{\mathrm{o}}$
.

If
$\theta >180{}^{\mathrm{o}}$
, then subtract
$360{}^{\mathrm{o}}$
from
$\theta $
until
$\theta \le 180{}^{\mathrm{o}}$
.
Listing 7.1 shows how it could be done in C. As discussed in
Section 7.1.1, our computer code will normally store angles using radians.
// Radial distance
float r;
// Angle in RADIANS
float theta;
// Declare a constant for 2*pi (360 degrees)
const float TWOPI = 2.0f*PI;
// Check if we are exactly at the origin
if (r == 0.0f) {
    // At the origin - slam theta to zero
    theta = 0.0f;
} else {
    // Handle negative distance
    if (r < 0.0f) {
        r = -r;
        theta += PI;
    }
    // Theta out of range? Note that this if() check is not
    // strictly necessary, but we try to avoid doing floating
    // point operations if they aren't necessary. Why
    // incur floating point precision loss if we don't
    // need to?
    if (fabs(theta) > PI) {
        // Offset by PI
        theta += PI;
        // Wrap in range 0...TWOPI
        theta -= floor(theta / TWOPI) * TWOPI;
        // Undo offset, shifting angle back in range -PI...PI
        theta -= PI;
    }
}
Picky readers may notice that while this code ensures that
$\theta $
is in the closed
interval
$[-\pi ,+\pi ]$
, it does not explicitly avoid the case where
$\theta =-\pi $
. The value
of
$\pi $
is not exactly representable in floating point. In fact, because
$\pi $
is an irrational
number, it can never be represented exactly in floating point, or with any finite number of
digits in any base, for that matter! The value of the constant PI in our code is not
exactly equal to
$\pi $
, it's the closest number to
$\pi $
that is representable by a
float. Using double-precision arithmetic can get us closer to the exact value, but
it is still not exact. So you can think of this function as returning a value from the
open interval
$(-\pi ,+\pi )$
.
7.1.3 Converting between Cartesian and Polar Coordinates in 2D
This section describes how to convert between the Cartesian and polar coordinate systems in 2D.
By the way, if you were wondering when we were going to make use of the trigonometry that we
reviewed in Section 1.4.5, this is it.
Figure 7.4 shows the geometry involved
in converting between polar and Cartesian coordinates in 2D.
Converting polar coordinates
$(r,\theta )$
to the corresponding Cartesian coordinates follows
almost immediately from the definitions of sine and cosine:
Converting 2D polar coordinates to Cartesian
$$x=r\cos \theta ;\qquad y=r\sin \theta .\tag{7.1}$$
Notice that aliasing is a nonissue; Equation (7.1) works even for
“weird” values of
$r$
and
$\theta $
.
Computing the polar coordinates
$(r,\theta )$
from the Cartesian coordinates
$(x,y)$
is the tricky
part. Due to aliasing, there isn't only one right answer; there are infinitely many
$(r,\theta )$
pairs that describe the point
$(x,y)$
. Usually, we want the canonical coordinates.
We can easily compute
$r$
by using the Pythagorean theorem,
$$r=\sqrt{{x}^{2}+{y}^{2}}.$$
Since the square root function always returns the nonnegative root, we
don't have to worry about
$r$
causing our computed polar coordinates
to be outside the canonical set.
Computing
$r$
was pretty easy, so now let's solve for
$\theta $
:
$$\begin{array}{rl}\frac{y}{x}& =\frac{r\mathrm{sin}\theta}{r\mathrm{cos}\theta},\\ \frac{y}{x}& =\frac{\mathrm{sin}\theta}{\mathrm{cos}\theta},\\ y/x& =\mathrm{tan}\theta ,\\ \theta & =\mathrm{arctan}(y/x).\end{array}$$
Unfortunately, there are two problems with this approach. The first is that if
$x=0$
, the
division is undefined. The second is that the
$\mathrm{arctan}$
function has a range of only
$[-90{}^{\mathrm{o}},+90{}^{\mathrm{o}}]$
. The basic problem is that the division
$y/x$
effectively
discards some useful information. Both
$x$
and
$y$
can either be positive or negative, resulting
in four different possibilities, corresponding to the four different quadrants that may contain
the point. But the division
$y/x$
results in a single value. If we negate both
$x$
and
$y$
, we
move to a different quadrant in the plane, but the ratio
$y/x$
doesn't change.
Because of these problems, the complete equation for conversion from Cartesian to polar coordinates
requires some “if statements” to handle each quadrant, and is a bit of a mess for “math
people.” Luckily, “computer people” have the atan2 function, which properly
computes the angle
$\theta $
for all
$x$
and
$y$
, except for the pesky case at the origin. Borrowing
this notation, let's define an
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
function we can use in this book in our math notation:
The
$\mathbf{a}\mathbf{t}\mathbf{a}\mathbf{n}\mathbf{2}$
function used in this book
$$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2(y,x)=\begin{cases}0,& x=0,\;y=0,\\ +90{}^{\mathrm{o}},& x=0,\;y>0,\\ -90{}^{\mathrm{o}},& x=0,\;y<0,\\ \mathrm{arctan}(y/x),& x>0,\\ \mathrm{arctan}(y/x)+180{}^{\mathrm{o}},& x<0,\;y\ge 0,\\ \mathrm{arctan}(y/x)-180{}^{\mathrm{o}},& x<0,\;y<0.\end{cases}\tag{7.2}$$
Let's make two key observations about Equation (7.2). First, following the convention of the
atan2 function found in the standard libraries of most computer languages, the
arguments are in the “reverse” order:
$y,x$
. You can either just remember that it's reversed,
or you might find it handy to remember the lexical similarity between
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2(y,x)$
and
$\mathrm{arctan}(y/x)$
. Or remember that
$\mathrm{tan}\theta =\mathrm{sin}\theta /\mathrm{cos}\theta $
, and
$\theta =\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2(\mathrm{sin}\theta ,\mathrm{cos}\theta )$
.
Second, in many software libraries, the atan2 function is undefined at the
origin, when
$x=y=0$
. The
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
function we are defining for use in our equations in
the text of this book is defined such that
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2(0,0)=0$
. In our code snippets, we use the
library function atan2 and explicitly handle the origin as a special case, but in
our equations, we use the abstract function
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
, which is defined at the origin. (Note the
difference in typeface.)
Back to the task at hand: computing the polar angle
$\theta $
from a
set of 2D Cartesian coordinates. Armed with the
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
function, we can easily convert 2D Cartesian coordinates to polar
form:
2D Cartesian to polar coordinate conversion
$$\begin{array}{rlrl}r& =\sqrt{{x}^{2}+{y}^{2}};& \theta & =\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2(y,x).\end{array}$$
The C code in Listing 7.2 shows how to convert a Cartesian
$(x,y)$
coordinate pair to the corresponding canonical polar
$(r,\theta )$
coordinates.
// Input: Cartesian coordinates
float x,y;
// Output: polar radial distance, and angle in RADIANS
float r, theta;
// Check if we are at the origin
if (x == 0.0f && y == 0.0f) {
    // At the origin - slam both polar coordinates to zero
    r = 0.0f;
    theta = 0.0f;
} else {
    // Compute values. Isn't the atan2 function great?
    r = sqrt(x*x + y*y);
    theta = atan2(y,x);
}
7.2 Why Would Anybody Use Polar Coordinates?
With all of the complications with aliasing, degrees and radians, and trig, why would anybody use
polar coordinates when Cartesian coordinates work just fine, without any hairs in the Jell-O?
Actually, you probably use polar coordinates more often than you do Cartesian coordinates. They
arise frequently in informal conversation.
For example, one author is from Alvarado, Texas. When people ask where Alvarado, Texas, is, he
tells them, “About 15 miles southeast of Burleson.” He's describing where Alvarado is by using
polar coordinates, specifying an origin (Burleson), a distance (15 miles), and an angle
(southeast). Of course, most people who aren't from Texas (and many people who are) don't know
where Burleson is, either, so it's more natural to switch to a different polar coordinate system
and say, “About 50 miles southwest of Dallas.” Luckily, even people from outside the United
States usually know where Dallas is. By the way, not everyone in Texas wears
a cowboy hat and boots. We do use the words
“y'all” and “fixin',” however.
In short, polar coordinates often arise because people naturally think about locations in terms
of distance and direction. (Of course, we often aren't very precise when using polar
coordinates, but precision is not really one of the brain's strong suits.) Cartesian coordinates
are just not our native language. The opposite is true of computers—in general, when using a
computer to solve geometric problems, it's easier to use Cartesian coordinates than polar
coordinates. We discuss this difference between humans and computers again in
Chapter 8 when we compare different methods for describing orientation in 3D.
Perhaps the reason for our affinity for polar coordinates is that each polar coordinate has
concrete meaning all by itself. One fighter pilot may say to another “Bogey, six
o'clock!” In the midst of a dogfight, these brave fighter pilots
are actually using polar coordinates. “Six o'clock” means “behind you” and is basically the
angle
$\theta $
that we've been studying. Notice that the pilot didn't need to specify a distance,
presumably because the other pilot could turn around and see for himself faster than the other
pilot could tell him. So one polar coordinate (in this case, a direction) is useful information
by itself. The same types of examples can be made for the other polar coordinate, distance (
$r$
).
Contrast that with the usefulness of a lone Cartesian coordinate. Imagine a fighter pilot saying,
“Bogey,
$x=1000\text{ft}$
!” This information is more difficult to process, and isn't as
useful.
In video games, one of the most common times that polar coordinates arise is when we want to aim
a camera, weapon, or something else at some target. This problem is easily handled by using a
Cartesian-to-polar coordinate conversion, since it's usually the angles we need. Even when
angular data can be avoided for such purposes (we might be able to completely use vector
operations, for example, if the orientation of the object is specified using a matrix), polar
coordinates are still useful. Usually, cameras and turrets and assassins' arms cannot move
instantaneously (no matter how good the assassin), but targets do move. In this
situation, we usually “chase” the target in some manner. This chasing (whatever type of
control system is used, whether a simple velocity limiter, a lag, or a second-order system) is
usually best done in polar space, rather than, say, interpolating a target position in 3D space.
Polar coordinates are also often encountered with physical data acquisition systems that provide
basic raw measurements in terms of distance and direction.
One final occasion worth mentioning when polar coordinates are more natural to use than Cartesian
coordinates is moving around on the surface of a sphere. When would anybody do that? You're
probably doing it right now. The latitude/longitude coordinates used to precisely describe
geographic locations are really not Cartesian coordinates, they are polar coordinates. (To be
more precise, they are a type of 3D polar coordinates known as spherical coordinates,
which we'll discuss in Section 7.3.2.) Of course, if you are looking at a
relatively small area compared to the size of the planet and you're not too far away from the
equator, you can use latitude and longitude as Cartesian coordinates without too many problems.
We do it all the time in Dallas.
7.3 3D Polar Space
Polar coordinates can be used in 3D as well as 2D. As you probably have already guessed, 3D
polar coordinates have three values. But is the third coordinate another linear distance
(like
$r$
) or is it another angle (like
$\theta $
)? Actually, we can choose to do either; there
are two different types of 3D polar coordinates. If we add a linear distance, we have
cylindrical coordinates, which are the subject of the next section. If we add another
angle instead, we have spherical coordinates, which are covered in the later sections.
Although cylindrical coordinates are less commonly used than spherical coordinates, we describe
them first because they are easier to understand.
Section 7.3.1 discusses one kind of 3D polar coordinates,
cylindrical coordinates, and Section 7.3.2 discusses the other kind
of 3D polar coordinates, spherical coordinates. Section 7.3.3 presents
some alternative polar coordinate conventions that are often more streamlined for use in video
game code. Section 7.3.4 describes the special types of aliasing that can
occur in spherical coordinate space. Section 7.3.5 shows how to
convert between spherical coordinates and 3D Cartesian coordinates.
7.3.1 Cylindrical Coordinates
To extend Cartesian coordinates into 3D, we start with the 2D system, used for working in the
plane, and add a third axis perpendicular to this plane. This is basically how cylindrical
coordinates work to extend polar coordinates into 3D. Let's call the third axis the
$z$
axis,
as we do with Cartesian coordinates. To locate the point described by the cylindrical
coordinates
$(r,\theta ,z)$
, we start by processing
$r$
and
$\theta $
just like we would for 2D
polar coordinates, and then move “up” or “down” according to the
$z$
coordinate.
Figure 7.5 shows how to locate a point
$(r,\theta ,z)$
by using
cylindrical coordinates.
Conversion between 3D Cartesian coordinates and cylindrical coordinates is straightforward. The
$z$
coordinate is the same in either representation, and we convert between
$(x,y)$
and
$(r,\theta )$
via the 2D techniques from Section 7.1.3.
We don't use cylindrical coordinates much in this book, but they are useful in some situations
when working in a cylinder-shaped environment or describing a cylinder-shaped object. In the same
way that people often use polar coordinates without knowing it (see Section 7.2),
people who don't know the term “cylindrical coordinates” may still use them. Be aware that
even when people do acknowledge that they are using cylindrical coordinates, notation and
conventions vary widely. For example, some people use the notation
$(\rho ,\varphi ,z)$
. Also, the
orientation of the axes and definition of positive rotation are set according to whatever is most
convenient for a given situation.
7.3.2 Spherical Coordinates
The more common kind of 3D polar coordinate system is a spherical coordinate system.
Whereas a set of cylindrical coordinates has two distances and one angle, a set of spherical
coordinates has two angles and one distance.
Let's review the essence of how polar coordinates work in 2D. A point is specified by giving a
direction (
$\theta $
) and a distance (
$r$
). Spherical coordinates also work by defining a
direction and distance; the only difference is that in 3D it takes two angles to define a
direction. There are also two polar axes in a 3D spherical space. The first axis is
“horizontal” and corresponds to the polar axis in 2D polar coordinates or
$+x$
in our 3D
Cartesian conventions. The other axis is vertical, corresponding to
$+y$
in our 3D Cartesian
conventions.
Different people use different conventions and notation for spherical coordinates, but most math
people have agreed that the two angles are named
$\theta $
and
$\varphi $
. Math people also are in general agreement about how these two angles are to
be interpreted to define a direction. The entire process works like this:
Locating points in 3D using polar coordinates
Step 1. Begin by standing at the origin, facing the direction
of the horizontal polar axis. The vertical axis points from
your feet to your head. Point your right arm straight up, in
the direction of the vertical polar axis.
Step 2. Rotate counterclockwise by the angle
$\theta $
(the same
way that we did for 2D polar coordinates).
Step 3. Rotate your arm downward by the angle
$\varphi $
. Your arm
now points in the direction specified by the polar angles
$\theta $
and
$\varphi $
.
Step 4. Displace from the origin along this direction by the distance
$r$
.
You've arrived at the point described by the spherical coordinates
$(r,\theta ,\varphi )$
.
Figure 7.6 shows how this works.
Other people use different notation. The convention in which
the symbols
$\theta $
and
$\varphi $
are reversed is frequently used,
especially in physics. Other authors, perhaps intent on replacing
all Roman letters with Greek, use
$\rho $
instead of
$r$
as the name
of the radial distance. We present some conventions
that are a bit more practical for video game purposes in
Section 7.3.3.
The horizontal angle
$\theta $
is known as the azimuth, and
$\varphi $
is the zenith.
Other terms that you've probably heard are longitude and latitude. Longitude
is basically the same as
$\theta $
, and latitude is the angle of inclination,
$90{}^{\mathrm{o}}-\varphi $
. So, you see, the latitude/longitude system for describing locations on planet Earth is
actually a type of spherical coordinate system. We're often interested only in describing points
on the planet's surface, and so the radial distance
$r$
, which would measure the distance to the
center of the Earth, isn't necessary. We can think of
$r$
as being roughly equivalent to
altitude, although the value is offset by Earth's radius in order to make either ground level or sea level equal to
zero, depending on exactly what is meant by “altitude.”
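A few landmark values make the relationship between zenith and latitude easy to remember. Taking latitude to be
$90{}^{\mathrm{o}}$
minus the zenith angle
$\varphi $
, with southern latitudes counted as negative:

$$\begin{array}{rlll}\varphi =0{}^{\mathrm{o}}& \Rightarrow & \text{latitude}=90{}^{\mathrm{o}}-0{}^{\mathrm{o}}=90{}^{\mathrm{o}}& \text{(north pole)},\\ \varphi =90{}^{\mathrm{o}}& \Rightarrow & \text{latitude}=90{}^{\mathrm{o}}-90{}^{\mathrm{o}}=0{}^{\mathrm{o}}& \text{(equator)},\\ \varphi =180{}^{\mathrm{o}}& \Rightarrow & \text{latitude}=90{}^{\mathrm{o}}-180{}^{\mathrm{o}}=-90{}^{\mathrm{o}}& \text{(south pole)}.\end{array}$$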
7.3.3 Some Polar Conventions Useful in 3D Virtual Worlds
The spherical coordinate system described in the previous section is the traditional right-handed
system used by math people, and the formulas for converting between Cartesian and spherical
coordinates are rather elegant under these assumptions. However, for most people in the video
game industry, this elegance is only a minor benefit to be weighed against the following
irritating disadvantages of the traditional conventions:

The default horizontal direction at
$\theta =0$
points in the direction of
$+x$
.
This is unfortunate, since for us,
$+x$
points “to the right” or “east,” neither
of which are the “default” directions in most people's mind. Similar to the way
that numbers on a clock start at the top, it would be nicer for us if the horizontal
polar axis pointed towards
$+z$
, which is “forward” or “north.”

The conventions for the angle
$\varphi $
are unfortunate in several respects.
It would be nicer if the 2D polar coordinates
$(r,\theta )$
were
extended into 3D simply by adding a third coordinate of zero, similar to how
we extend the Cartesian system from 2D to 3D. But the spherical coordinates
$(r,\theta ,0)$
don't correspond to the 2D polar coordinates
$(r,\theta )$
as we'd like. In fact, assigning
$\varphi =0$
puts us in the awkward situation of
gimbal lock, a singularity we describe in Section 7.3.4.
Instead, the points in the 2D plane are represented as
$(r,\theta ,90{}^{\mathrm{o}})$
. It might have been more intuitive to measure
latitude, rather than zenith. Most people think of the default as
“horizontal,” and “up” as the extreme case.

No offense to the Greeks, but
$\theta $
and
$\varphi $
take a little while to get used
to. The symbol
$r$
isn't so bad because at least it stands for something meaningful:
radial distance or radius. Wouldn't it be great if the symbols we used to
denote the angles were similarly short for English words, rather than
completely arbitrary Greek symbols?

It would be nice if the two angles for spherical coordinates were the same as the
first two angles we use for Euler angles, which are used to describe
orientation in 3D. We're not going to discuss Euler angles until
Section 8.3, so for now let us disagree with
Descartes
twice over by saying “It'd be nice because we told you so.”

It's a right-handed system, and we use a left-handed
system (in this book at least).
Let's describe some spherical coordinate conventions that are better suited for our purposes. We
have no complaints against the standard conventions for the radial distance
$r$
, and so we
preserve both the name and semantics of this coordinate. Our grievances are primarily concerning
the two angles, both of which we rename and repurpose.
The horizontal angle
$\theta $
is renamed
$h$
, which is short for heading and is similar to
a compass heading. A heading of zero indicates a direction of “forward” or “to the north,”
depending on the context. This matches standard aviation conventions. If we assume our 3D
Cartesian conventions described in Section 1.3.4, then a heading
of zero (and thus our primary polar axis) corresponds to
$+z$
. Also, since we prefer a
left-handed coordinate system, positive rotation will rotate clockwise when viewed from
above.
The vertical angle
$\varphi $
is renamed
$p$
, which is short for pitch and measures how much
we are looking up or down. The default pitch value of zero indicates a horizontal direction,
which is what most of us intuitively expect. Perhaps not so intuitively, positive pitch rotates
downward, which means that pitch actually measures the angle of declination. This
might seem to be a bad choice, but it is consistent with the left-hand rule (see
Figure 1.14). Later we see how consistency with the
left-hand rule bears fruit worth suffering this small measure of counterintuitiveness.
Figure 7.7 shows how heading
and pitch conspire to define a direction.
7.3.4Aliasing of Spherical Coordinates
Section 7.1.2 examined the bothersome phenomenon of aliasing of 2D polar
coordinates: different numerical coordinate pairs are aliases of each other when they
refer to the same point in space. Three basic types of aliasing were presented, which we review
here because they are also present in the 3D spherical coordinate system.
The first surefire way to generate an alias is to add a multiple of
$360{}^{\mathrm{o}}$
to either angle. This is really the most trivial form
of aliasing and is caused by the cyclic nature of angular
measurements.
The other two forms of aliasing are a bit more interesting because they are caused by the
interdependence of the coordinates. In other words, the meaning of one coordinate,
$r$
, depends
on the values of the other coordinate(s), the angles. This dependency creates a form of aliasing
and a singularity:

The aliasing in 2D polar space can be triggered by
negating the radial distance
$r$
and adjusting the angle so that the
opposite direction is indicated. We can do the same with spherical
coordinates. Using the heading and pitch conventions described in
Section 7.3.3, all we need to do is flip the heading by
adding an odd multiple of 180°, and then negate the pitch.

The singularity in 2D polar space occurs at the origin, because
the angular coordinate is irrelevant when
$r=0$
. With spherical coordinates,
both angles are irrelevant at the origin.
So spherical coordinates exhibit similar aliasing behavior because the meaning of
$r$
changes
depending on the values of the angles. However, spherical coordinates also suffer additional
forms of aliasing because the pitch angle rotates about an axis that varies depending on the
heading angle. This creates an additional form of aliasing and an additional singularity, which
are analogous to those caused by the dependence of
$r$
on the direction.

Different heading and pitch values can result in the same direction, even excluding
trivial aliasing of each individual angle. An alias of
$(h,p)$
can be
generated by
$(h\pm 180{}^{\mathrm{o}},180{}^{\mathrm{o}}-p)$
. For example, instead of turning
right 90° (facing “east”) and pitching down 45°, we could turn left
90° (facing “west”) and then pitch down 135°. Although we would be
upside down, we would still be looking in the same direction.

A singularity occurs when the pitch angle is set to
$\pm 90{}^{\mathrm{o}}$
(or
any alias of these values). In this situation, known as
Gimbal lock, the direction indicated is purely vertical (straight up or
straight down), and the heading angle is irrelevant. We have a great deal more to say
about Gimbal lock when we discuss Euler angles in Section 8.3.
Just as we did in 2D, we can define a set of canonical spherical coordinates such that any given
point in 3D space maps unambiguously to exactly one coordinate triple within the canonical set.
We place similar restrictions on
$r$
and
$h$
as we did for polar coordinates. Two additional
constraints are added related to the pitch angle. First, pitch is restricted to be on the
interval
$[-90{}^{\mathrm{o}},+90{}^{\mathrm{o}}]$
. Second, since the heading value is irrelevant when pitch
reaches the extreme values in the case of Gimbal lock, we force
$h=0$
in that case. The
conditions that are satisfied by the points in the canonical set are summarized by the criteria
below. (Note that these criteria assume our heading and pitch conventions, not the traditional
math conventions with
$\theta $
and
$\varphi $
.)
Conditions satisfied by canonical spherical coordinates, assuming the
conventions for spherical coordinates in this book
$$\begin{array}{rlrl}& r\ge 0& & \text{We don't measure distances ``backwards.''}\\ & -180{}^{\mathrm{o}}<h\le 180{}^{\mathrm{o}}& & \begin{array}{c}\text{Heading is limited to }\pm \text{1/2 revolution.}\\ \text{We use +180}{}^{\mathrm{o}}\text{ for ``south.''}\end{array}\\ & -90{}^{\mathrm{o}}\le p\le 90{}^{\mathrm{o}}& & \begin{array}{c}\text{Pitch limits are straight up and down.}\\ \text{We can't ``pitch over backwards.''}\end{array}\\ & r=0\quad \Rightarrow \quad h=p=0& & \text{At the origin, we set the angles to zero.}\\ & |p|=90{}^{\mathrm{o}}\quad \Rightarrow \quad h=0& & \begin{array}{c}\text{When looking directly up or down,}\\ \text{we set the heading to zero.}\end{array}\end{array}$$
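One way to read these criteria is as a predicate that accepts or rejects a triple. The sketch below is an illustration only: the function name `is_canonical_spherical` and the constant names are hypothetical, angles are in radians, and the comparisons are exact, whereas practical code would probably want a tolerance near the gimbal lock limits.

```c
#include <math.h>

/* Returns 1 if (r, heading, pitch) satisfies the canonical criteria,
   0 otherwise. Angles are in radians; all comparisons are exact. */
int is_canonical_spherical(float r, float heading, float pitch) {
    const float kPi = 3.14159265f;
    const float kPiOverTwo = kPi / 2.0f;

    /* r >= 0 */
    if (r < 0.0f) return 0;
    /* -180 degrees < h <= 180 degrees */
    if (heading <= -kPi || heading > kPi) return 0;
    /* -90 degrees <= p <= 90 degrees */
    if (pitch < -kPiOverTwo || pitch > kPiOverTwo) return 0;
    /* At the origin, both angles must be zero */
    if (r == 0.0f && (heading != 0.0f || pitch != 0.0f)) return 0;
    /* Looking straight up or down, heading must be zero */
    if (fabsf(pitch) == kPiOverTwo && heading != 0.0f) return 0;
    return 1;
}
```

A predicate like this is mostly useful in debug assertions, for instance to confirm that a conversion routine really did return a canonical triple.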
The following algorithm can be used to convert a spherical
coordinate triple into its canonical form:
Converting a spherical coordinate triple
$\mathbf{(}\mathit{r}\mathbf{,}\mathit{h}\mathbf{,}\mathit{p}\mathbf{)}$
to canonical form

If
$r=0$
, then assign
$h=p=0$
.

If
$r<0$
, then negate
$r$
, add
$180{}^{\mathrm{o}}$
to
$h$
, and
negate
$p$
.

If
$p<-90{}^{\mathrm{o}}$
, then add
$360{}^{\mathrm{o}}$
to
$p$
until
$p\ge -90{}^{\mathrm{o}}$
.

If
$p>270{}^{\mathrm{o}}$
, then subtract
$360{}^{\mathrm{o}}$
from
$p$
until
$p\le 270{}^{\mathrm{o}}$
.

If
$p>90{}^{\mathrm{o}}$
, then add
$180{}^{\mathrm{o}}$
to
$h$
and
set
$p=180{}^{\mathrm{o}}-p$
.

If
$h\le -180{}^{\mathrm{o}}$
, then add
$360{}^{\mathrm{o}}$
to
$h$
until
$h>-180{}^{\mathrm{o}}$
.

If
$h>180{}^{\mathrm{o}}$
, then subtract
$360{}^{\mathrm{o}}$
from
$h$
until
$h\le 180{}^{\mathrm{o}}$
.
Listing 7.3 shows how it could be done in C.
Remember that computers like radians.
// Assumes <math.h> is included and PI is defined elsewhere
// Radial distance
float r;
// Angles in radians
float heading, pitch;
// Declare a few constants
const float TWOPI = 2.0f*PI; // 360 degrees
const float PIOVERTWO = PI/2.0f; // 90 degrees
// Check if we are exactly at the origin
if (r == 0.0f) {
    // At the origin - slam angles to zero
    heading = pitch = 0.0f;
} else {
    // Handle negative distance
    if (r < 0.0f) {
        r = -r;
        heading += PI;
        pitch = -pitch;
    }
    // Pitch out of range?
    if (fabs(pitch) > PIOVERTWO) {
        // Offset by 90 degrees
        pitch += PIOVERTWO;
        // Wrap in range 0...TWOPI
        pitch -= floor(pitch / TWOPI) * TWOPI;
        // Out of range?
        if (pitch > PI) {
            // Flip heading
            heading += PI;
            // Undo offset and also set pitch = 180-pitch
            pitch = 3.0f*PI/2.0f - pitch; // p = 270 degrees - p
        } else {
            // Undo offset, shifting pitch in range
            // -90 degrees ... +90 degrees
            pitch -= PIOVERTWO;
        }
    }
    // Gimbal lock? Test using a relatively small tolerance
    // here, close to the limits of single precision.
    if (fabs(pitch) >= PIOVERTWO*0.9999) {
        heading = 0.0f;
    } else {
        // Wrap heading, avoiding math when possible
        // to preserve precision
        if (fabs(heading) > PI) {
            // Offset by PI
            heading += PI;
            // Wrap in range 0...TWOPI
            heading -= floor(heading / TWOPI) * TWOPI;
            // Undo offset, shifting angle back in range -PI...PI
            heading -= PI;
        }
    }
}
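For comparison, the numbered steps can also be transcribed almost literally, with simple loops in place of the `floor`-based wrapping. This is only a sketch under stated assumptions: it works in degrees, the function name `canonize_spherical_degrees` is hypothetical, and it adds the gimbal lock rule from the canonical criteria (which the numbered steps do not spell out) using an exact comparison rather than a tolerance.

```c
/* Canonicalize (r, h, p) in place. Angles are in degrees. */
void canonize_spherical_degrees(double *r, double *h, double *p) {
    /* Step 1: at the origin both angles are irrelevant. */
    if (*r == 0.0) {
        *h = 0.0;
        *p = 0.0;
        return;
    }
    /* Step 2: a negative distance means "point the opposite way." */
    if (*r < 0.0) {
        *r = -*r;
        *h += 180.0;
        *p = -*p;
    }
    /* Steps 3 and 4: bring pitch into the range -90...270. */
    while (*p < -90.0) *p += 360.0;
    while (*p > 270.0) *p -= 360.0;
    /* Step 5: pitched over backwards, so flip heading and mirror pitch. */
    if (*p > 90.0) {
        *h += 180.0;
        *p = 180.0 - *p;
    }
    /* Gimbal lock (from the canonical criteria): heading is irrelevant. */
    if (*p == 90.0 || *p == -90.0) {
        *h = 0.0;
        return;
    }
    /* Steps 6 and 7: bring heading into the range (-180, 180]. */
    while (*h <= -180.0) *h += 360.0;
    while (*h > 180.0) *h -= 360.0;
}
```

For example, the triple (−2, 30°, 20°) should come out as (2, −150°, −20°), and (1, 90°, 135°) as (1, −90°, 45°). The loops are easy to follow but take time proportional to the size of the input angle, which is one reason a production version might prefer the `floor` approach of the listing.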
7.3.5Converting between Spherical and Cartesian Coordinates
Let's see if we can convert spherical coordinates to 3D Cartesian coordinates. Examine
Figure 7.8, which shows both spherical and Cartesian
coordinates. We first develop the conversions using the traditional right-handed conventions for
both Cartesian and spherical spaces, and then we show conversions applicable to our left-handed
conventions.
Notice in Figure 7.8 that we've introduced a new variable
$d$
, which is the horizontal distance between the point and the vertical axis. From the right
triangle with hypotenuse
$r$
and legs
$d$
and
$z$
, we get
$$\begin{array}{rl}z/r& =\mathrm{cos}\varphi ,\\ z& =r\mathrm{cos}\varphi ,\end{array}$$
and so we're left to compute
$x$
and
$y$
.
Consider that if
$\varphi =90{}^{\mathrm{o}}$
, we basically have 2D polar
coordinates. Let's assign
${x}^{\prime}$
and
${y}^{\prime}$
to stand for the
$x$
and
$y$
coordinates that would result if
$\varphi =90{}^{\mathrm{o}}$
. From
Section 7.1.3, we have
$$\begin{array}{rlrl}{x}^{\prime}& =r\mathrm{cos}\theta ,& {y}^{\prime}& =r\mathrm{sin}\theta .\end{array}$$
Notice that when
$\varphi =90{}^{\mathrm{o}}$
,
$d=r$
. As
$\varphi $
decreases,
$d$
decreases, and by the
properties of similar triangles,
$x/{x}^{\prime}=y/{y}^{\prime}=d/r$
. Looking at
$\triangle drz$
again, we observe
that
$d/r=\mathrm{sin}\varphi $
. Putting all this together, we have
Converting spherical coordinates used by math people to 3D Cartesian coordinates
$$\begin{array}{rlrlrl}x& =r\mathrm{sin}\varphi \text{}\mathrm{cos}\theta ,& y& =r\mathrm{sin}\varphi \text{}\mathrm{sin}\theta ,& z& =r\mathrm{cos}\varphi .\end{array}$$
These equations are applicable for right-handed math people. If we adopt our conventions for both
the Cartesian (see Section 1.3.4) and spherical (see
Section 7.3.3) spaces, the following formulas should be used:
Spherical-to-Cartesian conversion for the conventions used in this book
$$\begin{array}{lrlrlrl}\text{(7.3)}& x& =r\mathrm{cos}p\text{}\mathrm{sin}h,& y& =-r\mathrm{sin}p,& z& =r\mathrm{cos}p\text{}\mathrm{cos}h.\end{array}$$
Converting from Cartesian coordinates to spherical coordinates is more complicated, due to
aliasing. We know that there are multiple sets of spherical coordinates that map to any given 3D
position; we want the canonical coordinates. The derivation that follows uses our preferred
aviation-inspired conventions in Equation (7.3) because those
conventions are the ones most commonly used in video games.
As with 2D polar coordinates, computing
$r$
is a straightforward application of the distance
formula:
$$r=\sqrt{{x}^{2}+{y}^{2}+{z}^{2}}.$$
As before, the singularity at the origin, where
$r=0$
, is handled as a special case.
The heading angle is surprisingly simple to compute using our
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
function:
$$h=\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2(x,z).$$
The trick works because
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
uses only the ratio of its arguments and their signs. By
examining Equation (7.3), we notice that the scale factor of
$r\mathrm{cos}p$
is common to both
$x$
and
$z$
. Furthermore, by using canonical coordinates, we are
assuming
$r>0$
and
$-90{}^{\mathrm{o}}\le p\le 90{}^{\mathrm{o}}$
; thus,
$\mathrm{cos}p\ge 0$
and the common scale
factor is always nonnegative. The Gimbal lock case is dealt with by our definition of
$\mathrm{a}\mathrm{t}\mathrm{a}\mathrm{n}2$
.
Finally, once we know
$r$
, we can solve for
$p$
from
$y$
:
$$\begin{array}{rl}y& =-r\mathrm{sin}p,\\ -y/r& =\mathrm{sin}p,\\ p& =\mathrm{arcsin}(-y/r).\end{array}$$
The
$\mathrm{arcsin}$
function has a range of
$[-90{}^{\mathrm{o}},90{}^{\mathrm{o}}]$
, which fortunately coincides with
the range for
$p$
within the canonical set.
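As a quick sanity check, consider the point $(x,y,z)=(2,-2,1)$, which is a hypothetical value chosen only for illustration. Using our conventions, in which $y=-r\sin p$:
$$\begin{array}{rl}r& =\sqrt{{2}^{2}+{(-2)}^{2}+{1}^{2}}=3,\\ h& =\mathrm{atan2}(2,1)\approx 63.4{}^{\mathrm{o}},\\ p& =\mathrm{arcsin}(-y/r)=\mathrm{arcsin}(2/3)\approx 41.8{}^{\mathrm{o}}.\end{array}$$
The pitch comes out positive because the point lies below the horizontal plane and positive pitch looks downward. Converting back gives $y=-3\sin p=-2$, and, since $\cos p=\sqrt{5}/3$, $x=3\cos p\sin h=2$ and $z=3\cos p\cos h=1$, as expected.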
Listing 7.4 illustrates the entire procedure.
// Assumes <math.h> is included and PI is defined elsewhere
// Input Cartesian coordinates
float x,y,z;
// Output radial distance
float r;
// Output angles in radians
float heading, pitch;
// Declare a few constants
const float TWOPI = 2.0f*PI; // 360 degrees
const float PIOVERTWO = PI/2.0f; // 90 degrees
// Compute radial distance
r = sqrt(x*x + y*y + z*z);
// Check if we are exactly at the origin
if (r > 0.0f) {
    // Compute pitch (positive pitch looks downward, hence the negation)
    pitch = asin(-y/r);
    // Check for gimbal lock, since the library atan2
    // function is undefined at the (2D) origin
    if (fabs(pitch) >= PIOVERTWO*0.9999) {
        heading = 0.0f;
    } else {
        heading = atan2(x,z);
    }
} else {
    // At the origin - slam angles to zero
    heading = pitch = 0.0f;
}
7.4Using Polar Coordinates to Specify Vectors
We've seen how to describe a point by using polar coordinates, and
how to describe a vector by using Cartesian coordinates. It's also possible to use polar form to
describe vectors. Actually, to say that we can “also” use polar form is sort of like saying
that a computer is controlled with a keyboard but it can “also” be controlled with the mouse.
Polar coordinates directly describe the two key properties of a vector—its direction and
length. In Cartesian form, these values are stored indirectly and obtained only through some
computations that essentially boil down to a conversion to polar form. This is why, as we
discussed in Section 7.2, polar coordinates are the local currency in everyday
conversation.
But it isn't just laymen who prefer polar form. It's interesting to notice that most physics
textbooks contain a brief introduction to vectors, and this introduction is carried out using a
framework of polar coordinates. This is done despite the fact that it makes the math
significantly more complicated.
As for the details of how polar vectors work, we've actually already covered them. Consider our
“algorithm” for locating a point described by 2D polar coordinates. If you take out the phrase “start at the
origin” and leave the rest intact, the instructions describe how to visualize the displacement
(vector) described by any given polar coordinates. This is the same idea from
Section 2.4: a vector is related to the point with the same
coordinates because it gives us the displacement from the origin to that point.
We've also already learned the math for converting vectors between Cartesian and polar form. The
methods discussed in Section 7.1.3 were presented in terms of points, but they
are equally valid for vectors.
Exercises

Plot and label the points with the following polar coordinates:
$$\begin{array}{rlrl}\mathbf{a}& =(2,60{}^{\mathrm{o}})& \mathbf{b}& =(5,195{}^{\mathrm{o}})\\ \mathbf{c}& =(3,45{}^{\mathrm{o}})& \mathbf{d}& =(2.75,300{}^{\mathrm{o}})\\ \mathbf{e}& =(4,\pi /6\text{}\mathrm{r}\mathrm{a}\mathrm{d})& \mathbf{f}& =(1,4\pi /3\text{}\mathrm{r}\mathrm{a}\mathrm{d})\\ \mathbf{g}& =(5/2,\pi /2\text{}\mathrm{r}\mathrm{a}\mathrm{d})\end{array}$$

Convert the following 2D polar coordinates to canonical
form:
 (a)
$(4,207{}^{\mathrm{o}})$
 (b)
$(5,720{}^{\mathrm{o}})$
 (c)
$(0,45.2{}^{\mathrm{o}})$
 (d)
$(12.6,11\pi /4\text{}\mathrm{r}\mathrm{a}\mathrm{d})$

Convert the following 2D polar coordinates to Cartesian
form:
 (a)
$(1,45{}^{\mathrm{o}})$
 (b)
$(3,0{}^{\mathrm{o}})$
 (c)
$(4,90{}^{\mathrm{o}})$
 (d)
$(10,30{}^{\mathrm{o}})$
 (e)
$(5.5,\pi \text{}\mathrm{r}\mathrm{a}\mathrm{d})$

Convert the polar coordinates in Exercise 2 to Cartesian form.

Convert the following 2D Cartesian coordinates to (canonical) polar form:
 (a)
$(10,20)$
 (b)
$(12,5)$

(c)
$(0,4.5)$
 (d)
$(3,4)$
 (e)
$(0,0)$

(f)
$(5280,0)$

Convert the following cylindrical coordinates to Cartesian form:
 (a)
$(4,120{}^{\mathrm{o}},5)$
 (b)
$(2,45{}^{\mathrm{o}},1)$
 (c)
$(6,\pi /6,3)$
 (d)
$(3,3\pi ,1)$

Convert the following 3D Cartesian coordinates to (canonical) cylindrical form:
 (a)
$(1,1,1)$
 (b)
$(0,5,2)$
 (c)
$(3,4,7)$
 (d)
$(0,0,3)$

Convert the following spherical coordinates
$(r,\theta ,\varphi )$
to Cartesian form
according to the standard mathematical convention:
 (a)
$(4,\pi /3,3\pi /4)$
 (b)
$(5,5\pi /6,\pi /3)$
 (c)
$(2,\pi /6,\pi )$
 (d)
$(8,9\pi /4,\pi /6)$

Interpret the spherical coordinates (a)–(d) from the previous exercise as
$(r,h,p)$
triples,
switching to our video game conventions.
 1. Convert to canonical
$(r,h,p)$
coordinates.

2. Use the canonical coordinates to convert to Cartesian form (using the video game
conventions).

Convert the following 3D Cartesian coordinates to (canonical) spherical form using our
modified convention:
 (a)
$(\sqrt{2},2\sqrt{3},\sqrt{2})$
 (b)
$(2\sqrt{3},6,4)$
 (c)
$(1,1,1)$
 (d)
$(2,2\sqrt{3},4)$
 (e)
$(\sqrt{3},\sqrt{3},2\sqrt{2})$
 (f)
$(3,4,12)$

What do the “grid lines” look like in spherical space? Assuming the spherical
conventions used in this book, describe the shape defined by the set of all points that
meet the following criteria. Do not restrict the coordinates to the canonical set.
 (a) A fixed radius
$r={r}_{0}$
, but any arbitrary values for
$h$
and
$p$
.
 (b) A fixed heading
$h={h}_{0}$
, but any arbitrary values for
$r$
and
$p$
.
 (c) A fixed pitch
$p={p}_{0}$
, but any arbitrary values for
$r$
and
$h$
.

During crunch time one evening, a game developer decided to get some fresh air and go for a
walk. The developer left the studio walking south and walked for 5 km. She then turned east and walked
another 5 km. Realizing that all the fresh air was making her lightheaded, she decided to
return to the studio. She turned north, walked 5 km and was back at the studio, ready to squash
the few remaining programming bugs left on her list. Unfortunately, waiting for her
at the door was a hungry bear, and she was eaten alive. What color was the bear?
For the execution of the voyage to the Indies,
I did not make use of intelligence, mathematics or maps.
— Christopher Columbus (1451–1506)