University of Virginia Library



6 Perceiving the window in order to see the world

The picture is both a scene and a surface, and the scene is
paradoxically seen behind the surface. This duality of
information is the reason the observer is never quite sure how
to answer the question, “What do you see?” For he can perfectly
well answer that he sees a wall or a piece of paper.

J. J. Gibson, from The Ecological Approach to Visual Perception
(Gibson, 1979, p. 281)


We have seen (in Chapter 4) that pictures drawn in perspective
suffer very little distortion when they are not seen
from the center of projection. Even though the Renaissance
artists did not write about the robustness of perspective,
they must have understood that paintings can look undistorted
from many vantage points. In fact, soon after the
introduction of linear perspective they began to experiment
most audaciously with the robustness of perspective. As
John White points out, Donatello's relief The Dance of Salome
(or The Feast of Herod), shown in Figure 6-1, in the
Siena Baptistery, “is less than two feet from the top step
leading to the font, and well below eye level even when
seen from the baptistery floor itself” (White, 1967, p. 192).[1]

In this chapter, we will explore the underpinnings of the
robustness of perspective, and we will see why the phenomenon
does not occur unless the surface of the picture


[ILLUSTRATION]

Figure 6-1. Donatello, The Feast
of Herod (ca. 1425). Gilded
bronze panel, baptismal font, Cathedral
of San Giovanni, Siena.

is perceptible. In other words, we will discover that the
Alberti window differs from all others in that it functions
properly only if it is not completely transparent: We must
perceive the window in order to see the world.

Look back at Figure 4-1 and imagine a geometer familiar
with Gothic arcades who has been asked to solve the inverse
perspective problem given that o as depicted in panel
95 is the most likely center of projection. Our geometer
can now do one of two things: accept the suggested center
of projection, in which case the solution will be a plan very
much like the one shown in panel 95, a plan such as no
Gothic architect would envisage in his most apocalyptic
nightmares, or assume that the arcade is in keeping with
all other Gothic architecture, with respectable right angles
and columns endowed with a rectangular cross section,
such as is shown in panel 97. The latter assumption implies
that the center of projection of the picture does not coincide
with the one suggested. Thus the observer is faced with a
dilemma: to ignore the rules of architecture, or abandon
the suggested center of projection and choose one in keeping
with the rules of architecture. This is the geometer's
dilemma of perspective, which the visual system too must
resolve.

The robustness of perspective shows that the visual system
does not assume that the center of projection coincides
with the viewer's vantage point. For if it did, every time
the viewer moved, the perceived scene would have to
change and perspective would not be robust. Indeed, the
robustness of perspective suggests that the visual system
infers the correct location of the center of projection. For
if it did not, the perceived scene would not contain right
angles where familiar objects do. We do not know how
the visual system does this. I will assume that it uses methods
similar to those a geometer might use. Such methods
require two hypotheses: (1) the hypothesis of rectangularity,
that is, to assume that such and such a pair of lines in the
picture represents lines that are perpendicular to each other
in the scene, and (2) the hypothesis of parallelism, that is, to
assume that such and such a pair of lines in the picture
represents lines that are parallel to each other in the scene.
For example, here is a geometric method that relies on the
identification of a drawing as a perspectival representation
of a rectangular parallelepiped (a box with six rectangular
faces).

[If the box shown in Figure 6-2 is assumed to be upright, i.e., its top
and bottom faces are assumed to be horizontal, then we must assume
a tilted picture plane (as if we were looking at the box from above).
Because the picture plane is neither parallel nor orthogonal to any of
the box's faces, there are three vanishing points. The two horizontal
vanishing points V′ and V′′ are conjugate, as is the vertical vanishing
point V′′′ with each of the other ones. Each pair of conjugate vanishing
points defines the diameter of a sphere that passes through the center
of projection, which we wish to find. (The diameters of the three spheres
form a triangle, V′ V′′ V′′′, and the intersection of each sphere with the
picture plane is a circle; in Figure 6-2 we show only half of each circle).
Because the three spheres pass through the center of projection, the
single point they share must be the center of projection we are looking
for. Or, to put it in somewhat different terms, the center of projection
must be at the point of intersection of the three circles formed by the
intersections of the three spheres with each other. But to find this point,


[ILLUSTRATION]

Figure 6-2. Perspective drawing of a
figure and determination of center of
projection

we need only determine the point of intersection of two of these circles.
First we note that these circles define planes perpendicular to the picture
plane. Thus the line of sight (that is, the principal ray) must be the
intersection of these two planes. The diameters of two of the sphere-intersect
circles are shown in Figure 6-2; the point at which the diameters
intersect is the foot of the line of sight (that is, the intersection of the
principal ray with the picture plane). To find the center of projection
we need only erect a perpendicular to the picture plane from the foot
of the line of sight. To find the distance of the center of projection
along this line, we draw one of the sphere-intersect circles, on its diameter
we mark the foot of the line of sight, and at that point we erect
a perpendicular to the diameter; the perpendicular intersects the circle
at the center of projection, at a distance equal to the distance of the
center of projection from the picture plane.]
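
The construction just described has a compact algebraic counterpart, sketched here under stated assumptions. If the three vanishing points are images of mutually perpendicular directions, the foot of the line of sight F is the orthocenter of the triangle V′ V′′ V′′′ (the point where its altitudes meet), and the viewing distance d satisfies d² = −(V′ − F)·(V′′ − F). The function name, the choice of the picture plane as the xy-plane, and the sample numbers are illustrative assumptions, not part of the original text.

```python
import math

def center_of_projection(v1, v2, v3):
    """Locate the center of projection from three vanishing points.

    v1, v2, v3 are (x, y) picture-plane coordinates of the vanishing points
    of three mutually perpendicular directions (an assumption of this sketch).
    Returns (foot, distance): the foot of the line of sight and the distance
    of the center of projection from the picture plane.
    """
    # The foot F lies on two altitudes of the triangle:
    # (F - v1) . (v2 - v3) = 0 and (F - v2) . (v3 - v1) = 0.
    a11, a12 = v2[0] - v3[0], v2[1] - v3[1]
    a21, a22 = v3[0] - v1[0], v3[1] - v1[1]
    b1 = a11 * v1[0] + a12 * v1[1]
    b2 = a21 * v2[0] + a22 * v2[1]
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("vanishing points are collinear")
    fx = (b1 * a22 - a12 * b2) / det
    fy = (a11 * b2 - a21 * b1) / det
    # Perpendicular sight lines to v1 and v2 give d^2 = -(v1 - F) . (v2 - F).
    d_squared = -((v1[0] - fx) * (v2[0] - fx) + (v1[1] - fy) * (v2[1] - fy))
    if d_squared <= 0:
        raise ValueError("points are not vanishing points of a rectangular box")
    return (fx, fy), math.sqrt(d_squared)

# Hypothetical vanishing points generated from a known setup
# (foot at (1, 3), viewing distance 6):
foot, distance = center_of_projection((-2, -3), (7, 6), (-11, 15))
print(foot, distance)  # foot is (1.0, 3.0), distance is 6.0
```

The perpendicular erected at the returned foot, followed for the returned distance, ends at the center of projection.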

We have just gone through the steps for finding the center
of projection of the most elaborate type of perspectival
arrangement, three-point perspective. In general, if one
wants to find the center of projection of perspectival pictures,
one always needs more than one pair of conjugate
vanishing points. For instance, in the case of the costruzione
legittima (Figure 1-12), often referred to as one-point
perspective, all the sides of box-like objects (interiors of
rooms or exteriors of buildings) are either parallel or orthogonal
to the picture plane. The vanishing point of the
orthogonals is the foot of the line of sight (it plays the role
of a pair of conjugate vanishing points). To determine the
distance of the center of projection, it is necessary to find
another pair of conjugate vanishing points, that is, the so-called
distance points (at which the diagonals of a checkerboard
pavement converge). Or, consider the somewhat
more complicated case of oblique perspective, sometimes
called two-point perspective, in which the tops and bottoms
of boxes are horizontal (or, more precisely, orthogonal
to the picture plane), but the other faces are neither
parallel nor orthogonal to the picture plane. To find the
center of projection in this case, we must have in the picture
at least two boxes whose orientations are different; that is,
their sides are not parallel. Then we have two pairs of
conjugate vanishing points with which we can find the
center of projection.[2]
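
Both cases reduce to short calculations, sketched here under assumptions that go beyond the text. In one-point perspective the viewing distance equals the gap between the central vanishing point and a distance point. In two-point perspective, each conjugate pair at horizon coordinates a and b satisfies (a − f)(b − f) = −d², where f is the horizon coordinate of the foot of the line of sight and d the viewing distance, so two pairs determine f and d. The function names and the one-dimensional horizon coordinates are hypothetical conveniences.

```python
import math

def distance_one_point(central_vp, distance_point):
    """Viewing distance in one-point perspective.

    The diagonals of a checkerboard pavement recede at 45 degrees, so their
    vanishing point lies as far from the central vanishing point as the eye
    lies from the picture plane.
    """
    return math.dist(central_vp, distance_point)

def center_two_point(pair_a, pair_b):
    """Foot and viewing distance from two conjugate pairs on the horizon.

    pair_a and pair_b hold the horizon coordinates of the vanishing points
    of two boxes with different orientations.
    """
    a1, a2 = pair_a
    b1, b2 = pair_b
    denom = (a1 + a2) - (b1 + b2)
    if abs(denom) < 1e-12:
        raise ValueError("the two boxes must have different orientations")
    # Equate (a1 - f)(a2 - f) with (b1 - f)(b2 - f) and solve for f.
    foot = (a1 * a2 - b1 * b2) / denom
    d_squared = -(a1 - foot) * (a2 - foot)
    if d_squared <= 0:
        raise ValueError("pairs are not conjugate vanishing points")
    return foot, math.sqrt(d_squared)

print(distance_one_point((0.0, 0.0), (2.5, 0.0)))   # 2.5
print(center_two_point((5.0, -3.0), (9.0, -1.0)))   # (1.0, 4.0)
```

A single pair leaves one equation in two unknowns, which is why a second box with a different orientation is needed.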

Let me remind the reader how the question of finding
the center of projection came up: We were inquiring why
the surface of the picture had to be perceptible for perspective
to be robust; in the geometric analysis just concluded,
we saw that to find the center of projection we
have to construct a perpendicular to the picture plane. Now
to erect a perpendicular to the surface of the picture, that
surface must be visible. If we assume that the visual system
performs an analysis that is analogous to these geometric
constructions, then we should not be surprised to observe
that when the surface is not visible, as in Pozzo's ceiling,
the robustness of perspective is lost. If you do not look at
the ceiling from the yellow disk that tells you where to
stand for your eye to be at the center of projection, the
painted architecture looks lopsided and about to tumble.
Pirenne summarizes this point in the following words:


[ILLUSTRATION]

Figure 6-3. If magic lantern will not
come to body's eye, mind's eye must
go to magic lantern. (a) When a
transparency is projected onto a
plane not parallel to plane of transparency,
it will look distorted from
all vantage points. (b) When a
transparency is projected onto a
plane parallel to plane of transparency,
it will not look distorted from
any vantage point (except very extreme
ones).

When the shape and the position of the picture surface can be
seen, an unconscious psychological process of compensation takes
place, which restores the correct view when the picture is viewed
from the wrong position. In the case of Pozzo's ceilings, on the
other hand, the painted surface is 'invisible' and striking deformations
are seen. (1970, p. 99)[3]

Further evidence on the crucial role of the perception of
the texture of the picture plane in making possible the
robustness of perspective can be obtained by carrying out
a very simple experiment. Suppose you want to show
slides to an audience, and you are forced to place the projector
on one side of the room. How should you place the
screen: Should you have the screen face the people in the
middle of the room, or should you set up the screen to
face the projector? The intuitive solution to this problem
is the former. We are uncomfortable in turning the screen
away from the spectators; we feel we are not giving them
the best possible chance to see the pictures, for, we think,
they will look distorted. However, the correct solution is
the nonintuitive one: Always set up the screen to be perpendicular
to the optical axis of the projector; otherwise the picture will look
distorted to everyone in the audience. The explanation for
this surprising rule of thumb is simple (see Figure 6-3): We
have argued that viewers normally feel that their mind's
eye is on a perpendicular to the picture plane, erected at
the foot of the principal ray. Let us assume, for the sake
of simplicity, that a photograph of a natural scene is being
projected. Under optimal viewing conditions, the screen
is at a right angle to the optical axis of the projector, and
the spectator is very close to the optical axis of the projector.
Because most slides are not cropped, the center of
the slide can be taken as the foot of the principal ray; on that
point, a line perpendicular to the picture plane is erected
and the viewer feels that his or her mind's eye is on that
line, which happens to coincide with the optical axis of
the projector, and hence no distortion is experienced. In
fact, as long as the optical axis of the projector remains at


[ILLUSTRATION]

Figure 6-4. Photograph of a photograph
(Time, March 29, 1968)

right angles to the screen, the mind's eye will fall on that
axis. However, if the screen is tilted relative to the optical
axis of the projector, the viewer will locate his or her
mind's eye at a point away from the optical axis of the
projector and will perceive a distorted picture.
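
The purely geometric side of this rule can be illustrated with a small sketch. It models the projector as a pinhole at the origin and the screen as a plane rotated about a vertical axis, and it reports how unequal the left and right edges of a rectangular slide become on the screen. It says nothing about the perceptual compensation discussed above; the function name, the default dimensions, and the pinhole model are assumptions made only for illustration.

```python
import math

def projected_edge_ratio(tilt_deg, half_width=0.2, distance=3.0):
    """Ratio of left-edge to right-edge length of a projected slide.

    The slide spans x in [-half_width, half_width] at unit distance from
    the pinhole. The screen passes through (0, 0, distance) and is rotated
    by tilt_deg about the vertical axis; 0 means the screen is at right
    angles to the optical axis.
    """
    phi = math.radians(tilt_deg)
    if math.cos(phi) - half_width * abs(math.sin(phi)) <= 0:
        raise ValueError("screen is tilted too far to catch the whole slide")

    def scale(x):
        # The ray through slide point (x, y, 1) meets the screen at
        # t * (x, y, 1); a vertical edge at x has length proportional to t.
        return distance * math.cos(phi) / (x * math.sin(phi) + math.cos(phi))

    return scale(-half_width) / scale(half_width)

for tilt in (0, 15, 30):
    print(tilt, round(projected_edge_ratio(tilt), 3))
```

A ratio of 1 means the rectangle stays a rectangle on the screen; any tilt makes one vertical edge longer than the other.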

The account I have given of our preference for the positioning
of projectors also holds for a phenomenon pointed
out by Pirenne (1970, pp. 96–9): If we look at a photograph
of a scene that has a photograph in it (such as Figure 6-4),
the scene will not appear to be distorted regardless of the
point from which we look at the photograph. But unless
the photograph in the scene is parallel to the picture plane,
it will appear to be flat and distorted from all points of
view. It will be seen only as a picture and it will not have
the vividness of depth that the scene it belongs to may
have. This is an example of the operation of a mechanism
of compensation for the viewer's position in space vis-à-vis
the picture's center of projection: It suggests that the
compensation requires the viewer to be able to perceive
the surface of the picture. But in what sense does one not
perceive the surface of the photograph in the photograph?
In Figure 6-4, we can immediately see that we would have
to move our viewpoint to the right in order to see the
poster of Nixon frontally. Thus, strictly speaking, we can
see the orientation of the surface of the distorted photograph.
Why then is it distorted? I believe there are two
reasons for this.

First, we can only compensate for one surface at a time.
Photocopy Figure 6-5 and fold the copy along the dotted
line to form a 90-degree angle and stand it on a surface in
front of you. Prop up an unfolded copy of Figure 6-6 next
to it. Now compare what happens to the two pictures as
you shake your head from side to side. The distortion
observed in the folded picture when we move in front of
it is striking, whereas there is practically none when we
move in front of the flat one. Why is this the case? Presumably,
because the folded picture consists of two planes
and the flat one consists of just one; and because we can
only compensate for one plane at a time. No research has
been done on the way we compensate for changes in viewing
position when we look at a folded version of Figure 6-5:
Do we compensate for one side of the diptych and
therefore see the distortion in the other? Or do we attempt
to perform a compromise compensation that cannot compensate
for the changes in our position vis-à-vis either
surface?

The second reason we perceive the distortion of the photograph
in the photograph is that we are not free to choose
which surface will control the process of compensation: In
this picture, there is a primary surface and a secondary
surface (perhaps unlike the example in Figure 6-5, in which
there may be two surfaces equally demanding of compensation).
Presumably, there are more cues that tell us that
the primary surface is a representation of a scene, such as
perceptibility of surface texture and of a frame, than exist
for the photograph represented in it.

Although we have made some progress in our inquiry
into the robustness of perspective, we have yet to understand
how the visual system identifies which angles in the
picture represent right angles in the scene, which is (as we
have seen earlier in this chapter) a precondition for locating
the center of projection. Because the image of a right angle
can run anywhere from 0 degrees to 180 degrees, drawings
of right angles have no particular signature, and therefore
they can be identified only by some more elaborate procedure.
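
A minimal sketch shows how widely the image of a right angle can range. For simplicity it assumes parallel projection rather than central projection. The right angle lies in a plane that is tilted either about the bisector of the angle, which narrows the image, or about the line perpendicular to the bisector, which widens it. The function name and the two tilt axes are illustrative assumptions.

```python
import math

def projected_right_angle(tilt_deg, axis="bisector"):
    """Image, in degrees, of a right angle whose plane is tilted by tilt_deg.

    In coordinates (along the bisector, across the bisector) the two arms
    of the right angle are (1, 1) and (1, -1). Tilting the plane shrinks
    one coordinate by cos(tilt) under parallel projection. Valid for tilts
    between 0 and 90 degrees.
    """
    c = math.cos(math.radians(tilt_deg))
    if axis == "bisector":
        # Arms become (1, c) and (1, -c): the angle closes toward 0.
        return math.degrees(2 * math.atan2(c, 1.0))
    if axis == "perpendicular":
        # Arms become (c, 1) and (c, -1): the angle opens toward 180.
        return math.degrees(2 * math.atan2(1.0, c))
    raise ValueError("axis must be 'bisector' or 'perpendicular'")

for tilt in (0, 30, 60, 85):
    print(tilt,
          round(projected_right_angle(tilt, "bisector"), 1),
          round(projected_right_angle(tilt, "perpendicular"), 1))
```

At zero tilt both images measure 90 degrees; as the tilt grows, one falls toward 0 and the other rises toward 180, so the measured angle alone cannot identify a right angle.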


[ILLUSTRATION]

Figure 6-6. This drawing corresponds
to what you can see in Figure 6-5
when picture is folded to
form a 90-degree angle and your eye
is on a bisector of that angle.

There are two views on the nature of this procedure.
According to the first view, right angles are identified
by first recognizing the objects in which they are embedded.
For instance, with respect to Figure 4-1, such an approach
would assume that the visual system first recognizes
that the picture represents a building and then identifies
the features likely to represent right angles. According to
the second view, right angles are recognized by first recognizing
rectangular corners (i.e., the concurrence of three
lines at a point so that all the angles formed are right angles)
in which they are embedded. This is possible because, as
we will presently see, rectangular corners do have a
signature.

The first view, the perception of right angles by an appeal
to the semantics of the represented scene, is exemplified
by the trapezoidal room created by Adalbert Ames,
Jr. This is a room whose plan is shown in Figure 6-7,
which looks like a rectangular room to those looking at it
through the peephole. Here there is no dilemma. There is
ambiguity, however: For an immobile viewer, the visible
features of the room are compatible with many possible
rooms, including the one the typical viewer reports seeing,
which is rectangular, and illusory. But now bring two
people into the room; they are at different distances from
the observer looking through the peephole and so subtend
different visual angles. Now we have a dilemma: If the
people are seen equal in height, they must be at different


[ILLUSTRATION]

Figure 6-7. Plan of Ames distorted
room

distances, and because their backs are against the rear wall,
the rear wall cannot be perpendicular to the side walls. On
the other horn of the dilemma, if the room is still mistakenly
seen as a normal rectangular room, then — so goes the
unconscious inference — the people must be at equal distances
from the observer; but because they subtend different
visual angles, they must differ in height. As may be
seen in Figure 6-8, when the viewer is faced with a choice
between seeing an oddly shaped room and seeing two adults
differ dramatically in height, the latter is chosen. We choose
to see grotesque differences in height rather than a distorted
room possibly because sizes of human beings vary more
in our experience than the angles of room corners. Such
an explanation tacitly assumes that the viewer first unconsciously
recognizes that the scene represents a room; and
because a room implies right angles, the viewer then unconsciously
resolves the dilemma of the Ames room by
choosing rectangularity over equal heights, which assumes
that the semantic interpretation of the scene as a room
precedes and determines the interpretation of its features.
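
The arithmetic behind the two horns of the dilemma can be sketched as follows. The heights and distances are hypothetical numbers chosen for illustration, not measurements of an actual Ames room, and the function names are assumptions of this sketch.

```python
import math

def visual_angle_deg(height, distance):
    """Visual angle subtended by an object of a given height at a distance."""
    return math.degrees(2 * math.atan(height / (2 * distance)))

def inferred_height(visual_angle, assumed_distance):
    """Height an object must have to subtend visual_angle at assumed_distance."""
    return 2 * assumed_distance * math.tan(math.radians(visual_angle) / 2)

# Two adults of equal height (1.8 m) stand at 3 m and 6 m from the peephole.
near = visual_angle_deg(1.8, 3.0)
far = visual_angle_deg(1.8, 6.0)

# If the rear wall is taken to be frontal, both people are assumed to be
# at the same distance, here 4.5 m, and their heights must then differ.
print(round(inferred_height(near, 4.5), 2))  # 2.7
print(round(inferred_height(far, 4.5), 2))   # 1.35
```

Under the rectangular-room assumption, the same two visual angles force a two-to-one difference in perceived height.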

In other words, our familiarity with an object depicted
in a picture may be sufficient to determine its perceived
shape. We do not know whether we perceive the Ames
room as we do because of our familiarity with rectangular
rooms, but Perkins and Cooper (1980) have provided us
with an elegant demonstration that leads us to conclude


[ILLUSTRATION]

Figure 6-9. Views of John Hancock
Tower, Boston, that satisfy (left)
and that do not satisfy (right) Perkins's
laws

that familiarity with the object is probably not critical in
perceiving rectangularity in real objects. In Figure 6-9, we
see two views of the John Hancock Tower in Boston, one
of which appears to have a rectangular cross section, the
other of which appears strangely distorted. This impression
is not confined to looking at pictures of the tower: One
gets the same impression by looking at it from the vantage
points of these pictures. The cross section of the building
is actually a parallelogram, and so the view that appears
distorted (because it does not fit our preconceptions about
the shapes of buildings) is in fact the more veridical one.
So we conclude that our knowledge of architecture does
not override the effect of purely optical changes in the
projection of an object.

Furthermore, as Perkins and Cooper (1980) have shown,
the hypothesis that the resolution of the dilemma of perspective
in favor of rectangularity is conditioned by semantics,
that is to say by object or scene recognition, is


[ILLUSTRATION]

Figure 6-10. Drawing of unfamiliar
object that we perceive to have right
angles

dealt a blow by the observation that the objects in Figures 6-10
and 6-11 appear to have right angles even though they
are not familiar; indeed, the object in Figure 6-11 is as
unfamiliar as an object can get — it is impossible. These
drawings show that the visual system need not and probably
does not appeal to semantics in order to resolve the
dilemma of perspective.

Thus we are led to the alternative to semantics: that an
angle in a picture is seen as a representation of a right angle
only when it is perceived as a part of a representation of
a rectangular corner. To understand how this can be done,
we must consider junctures, the local features that represent
the vertices of objects that have straight edges. Figure 6-12
is the drawing of a cube in which two of the junctures
have been labeled, for obvious reasons, fork and arrow junctures.
Perkins (1968, 1972, 1973) formulated the following
laws:

Perkins's first law. A fork juncture is perceived as the vertex
of a cube if and only if the measure of each of the three
angles is greater than 90 degrees.

Perkins's second law. An arrow juncture is perceived as the
vertex of a cube if and only if the measure of each of the
two angles is less than 90 degrees and the sum of their
measures is greater than 90 degrees.
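
The two laws can be written as simple predicates. This is a minimal sketch: the function names are hypothetical, angles are in degrees, and the fork check assumes its three angles have been measured around the full juncture so that they sum to 360.

```python
def fork_seen_as_cube_vertex(a, b, c):
    """Perkins's first law: each of the three fork angles exceeds 90 degrees."""
    if abs(a + b + c - 360) > 1e-6:
        raise ValueError("the three angles of a fork must sum to 360 degrees")
    return a > 90 and b > 90 and c > 90

def arrow_seen_as_cube_vertex(a, b):
    """Perkins's second law: each angle is below 90 and their sum exceeds 90."""
    return a < 90 and b < 90 and a + b > 90

print(fork_seen_as_cube_vertex(120, 120, 120))  # True
print(fork_seen_as_cube_vertex(80, 140, 140))   # False
print(arrow_seen_as_cube_vertex(40, 60))        # True
print(arrow_seen_as_cube_vertex(30, 40))        # False
```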


[ILLUSTRATION]

Figure 6-13. Drawing of three-dimensional
object that does not look
rectangular and does not obey Perkins's
laws

Figure 6-10 obeys Perkins's laws, whereas the form in
Figure 6-13 does not; the former looks rectangular, the
latter does not. Perkins's laws can be extended to junctures
that are not themselves rectangular, but that are part of
bodies that can be decomposed into two congruent bodies,
each of which has a rectangular juncture. Figure 6-14 shows
an object that is seen as a rectangular prism with a mirror-symmetric
irregular pentagonal cross section. We have
added auxiliary lines to the drawing to indicate the plane
of symmetry and a line joining two symmetric vertices of
the cross section. The arrow juncture obtained in the process
of drawing these auxiliary lines satisfies Perkins's second
law and is therefore seen as the vertex of a cube, which
implies the other perceived features. Figure 6-15, on the
other hand, does not look symmetric, and there is some
question regarding whether its upper surface looks orthogonal
to the sides.

These laws, with their emphases on 90-degree angles in
the picture, are related to a special case of central projection,
in which the center of projection is infinitely distant from
the picture plane. In this type of projection, called parallel
projection, there is no center of projection, the projecting
rays (Figure 1-2) are parallel, there is no horizon line on
which parallel lines converge at vanishing points, and parallel
lines in the scene are depicted as parallel lines in the
picture. It turns out that Perkins's laws are not just laws
of perception: They also state the possible parallel projections
of rectangular vertices. As we will see presently,
Perkins's laws are not generally applicable to central projection.
That is, certain geometrically correct central projections
give rise to pictures that violate Perkins's laws and
therefore do not look right.
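
The claim that Perkins's laws describe the parallel projections of rectangular vertices can be checked numerically. The sketch below draws random orientations of three mutually perpendicular edges, projects them by dropping the depth coordinate, and tests the resulting juncture against the two laws. All names, the sampling scheme, and the tolerance used to skip nearly degenerate views are assumptions of this sketch, and it checks only one direction of the claim.

```python
import math
import random

def random_corner(rng):
    """Three mutually perpendicular unit edges in a random orientation."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def unit(u):
        n = math.sqrt(dot(u, u))
        return [a / n for a in u]

    e1 = unit([rng.gauss(0, 1) for _ in range(3)])
    v = [rng.gauss(0, 1) for _ in range(3)]
    # Gram-Schmidt step: remove the part of v that lies along e1.
    e2 = unit([a - dot(v, e1) * b for a, b in zip(v, e1)])
    e3 = [e1[1] * e2[2] - e1[2] * e2[1],
          e1[2] * e2[0] - e1[0] * e2[2],
          e1[0] * e2[1] - e1[1] * e2[0]]
    return e1, e2, e3

def juncture_gaps(edges):
    """Angles between neighboring lines of the projected juncture (sum 360)."""
    dirs = sorted(math.degrees(math.atan2(e[1], e[0])) % 360 for e in edges)
    return [(dirs[(i + 1) % 3] - dirs[i]) % 360 for i in range(3)]

def obeys_perkins(gaps):
    """Apply the first law to forks and the second law to arrows."""
    if max(gaps) < 180:                  # fork: lines spread around the vertex
        return all(g > 90 for g in gaps)
    two = sorted(gaps)[:2]               # arrow: the two angles at the shaft
    return all(g < 90 for g in two) and sum(two) > 90

def check_random_views(trials=1000, seed=0):
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        edges = random_corner(rng)
        # Skip views in which an edge is almost parallel to the picture
        # plane, where the juncture degenerates into a tee.
        if min(abs(e[2]) for e in edges) < 1e-3:
            continue
        results.append(obeys_perkins(juncture_gaps(edges)))
    return results

print(all(check_random_views()))
```

The reason is short: two perpendicular edges project to vectors whose dot product is minus the product of their depth components, so a projected angle exceeds 90 degrees exactly when both edges lean the same way in depth.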

Perkins's laws were independently discovered by Roger
N. Shepard and Elizabeth Smith in an experiment[4] that
studied the perception of vertices of cubes, tetrahedrons
(in which three edges meet at 60-degree angles), and plane
patterns like the Mercedes-Benz equiangle trademark (in
which three lines in a plane meet at 120-degree angles).
Figure 6-16 shows the three objects studied. Shepard and

[ILLUSTRATION]

Figure 6-17. Upper right: If one radius, r1, is fixed, two
angles can specify orientation of the other two radii. θ1 is
measure of angle between r1 and r2; θ2 is measure of angle
between r2 and r3. Lower left: Half the forms (remaining forms
are mirror images of forms shown here) used in experiment.
Forks are forms for which θ1 + θ2 > 180°; tees are forms for
which θ1 + θ2 = 180°; arrows are forms for which
θ1 + θ2 < 180°; and ells are forms for which θ1 or θ2 = 0°
(angles labeled 0° here are small angles, measuring roughly
7.5°, so as to ensure that there will always be three lines in
each form).

Smith created 122 patterns, each of which was a circle with
three radii (see Figure 6-17). The orientation of one of the
radii was held fixed: It was always horizontal. Each pattern
differed from others in the disposition of the other two
radii; subjects were asked to say of each pattern whether
it was an acceptable drawing of each of the three objects
studied.
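
The four kinds of form named in the caption of Figure 6-17 follow directly from the two angles. The sketch below encodes that classification; the function name is hypothetical, angles are the labeled values in degrees, and checking for ells before the other cases is an assumption, since the caption does not say which rule takes precedence.

```python
def classify_form(theta1, theta2):
    """Classify a three-radius pattern by its labeled angles, in degrees.

    theta1 is the angle between r1 and r2; theta2 is the angle between
    r2 and r3, as in the caption of Figure 6-17.
    """
    if theta1 == 0 or theta2 == 0:
        return "ell"
    total = theta1 + theta2
    if total > 180:
        return "fork"
    if total == 180:
        return "tee"
    return "arrow"

print(classify_form(120, 120))  # fork
print(classify_form(90, 90))    # tee
print(classify_form(45, 60))    # arrow
print(classify_form(0, 90))     # ell
```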

Figure 6-18 shows the results of the experiment. The
data of greatest interest to us are those shown in panel B:
For all the stimuli that obeyed Perkins's laws, more than
50 percent of the subjects accepted the pattern as the representation
of the vertex of a cube; for all the stimuli that
violated Perkins's laws, almost no subjects accepted the
pattern as a representation of the vertex of a cube.

Let us recapitulate: A perceiver is faced with the dilemma
of perspective when a picture drawn in perspective is seen
from a vantage point other than the center of projection.


[ILLUSTRATION]

Figure 6-18. Proportion of subjects
accepting each pattern as representing
Mercedes-Benz equiangle (panel A),
vertex of cube (panel B), and vertex
of tetrahedron (panel C). Areas of
disks in three panels (whose organization
parallels that in
Figure 6-17)
represent proportion of subjects accepting
patterns as projections of
three types of objects. Dashed
boundaries in three panels delimit
stimuli that are possible parallel projections
of each type of object. In
panel B, these boundaries also delimit
stimuli that obey Perkins's
laws.

The perceiver must either assume that the center of projection
coincides with the perceiver's vantage point, in
which case the proper interpretation of the scene will change
with each change of vantage point, or the perceiver must
infer the location of the center of projection and reconstruct
the proper scene as it would be seen from that point. Because
we have seen that perspective is robust in the face
of changing vantage points, the latter must be the case.
Furthermore, because inferring the location of the center
of projection seems to require an assumption that the objects
represented are rectangular, we examined the question
of the perception of rectangularity of corners in pictures.
We discussed two possibilities: that the perception of rectangularity
is based on familiarity with the sorts of objects
represented and that rectangularity is based on geometric
rules that apply to the configuration of line junctures that
represent right-angled vertices. We concluded in favor of
the latter.

 
[1]

Chapter 13 of White's book discusses several other frescoes and reliefs
that have an inaccessible center of projection. No one has done the
inventory of Renaissance art with respect to this phenomenon. We will
return to the question of inaccessible centers of projection in the
next chapter.

[2]

La Gournerie (1884, Book VI, Chapter 1), Olmer (1949), and Adams
(1972) discuss such procedures.

[3]

This theory was developed by Pirenne on the basis of a suggestion made
by Albert Einstein in a letter written in 1955.

[4]

The Shepard and Smith experiment was carried out in 1971 and published
in 1972 (Shepard, 1981). Perkins (1972) published a similar experiment
in 1972. The major difference between these experiments is
that Shepard and Smith also studied the perception of vertices of nonrectangular
objects, whereas Perkins confined himself to rectangular
vertices. Other relevant research is Cooper (1977), Perkins (1973), and
Perkins and Cooper (1980).