1.2 Systems of Reference

Any one who will try to imagine the state of a mind conscious of knowing the absolute position of a point will ever after be content with our relative knowledge.

James Clerk Maxwell, 1877

There are many theories of relativity, each of which can be associated with some arbitrariness in our descriptions of events. For example, suppose we describe the spatial relations between stationary particles on a line by assigning a real-valued coordinate to each particle, such that the distance between any two particles equals the difference between their coordinates. There is a degree of arbitrariness in this description due to the fact that all the coordinates could be increased by some arbitrary constant without affecting any of the relations between the particles. Symbolically this translational relativity can be expressed by saying that if x is a suitable system of coordinates for describing the relations between the particles, then so is x + k for any constant k. Likewise if we describe the spatial relations between stationary particles on a plane by assigning an ordered pair of real-valued coordinates to each particle, such that the squared distance between any two particles equals the sum of the squares of the differences between their respective coordinates, then there is a degree of arbitrariness in the description (in addition to the translational relativity of each individual coordinate) due to the fact that we could rotate the coordinates of every particle by an arbitrary constant angle without affecting any of the relations between the particles. This relativity of orientation is expressed symbolically by saying that if (x,y) is a suitable system of coordinates for describing the positions of particles on a plane, then so is (ax–by, bx+ay) where a² + b² = 1.
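As a rough numerical illustration of these two relativities, the following Python sketch applies an arbitrary rotation and translation to a few particle positions on a plane and checks that the squared pairwise distances are unchanged. The particle positions, the angle, the shift constants k_x and k_y, and the function names are all assumptions chosen for the example.

```python
import math
from itertools import combinations

def squared_distance(p, q):
    """Sum of the squares of the coordinate differences."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def transform(point, a, b, k_x, k_y):
    """Rotate (x, y) to (a*x - b*y, b*x + a*y), then shift by (k_x, k_y)."""
    x, y = point
    return (a * x - b * y + k_x, b * x + a * y + k_y)

# Hypothetical particle positions on a plane.
particles = [(0.0, 0.0), (3.0, 4.0), (-2.0, 1.5), (7.25, -1.0)]

# Any a, b with a^2 + b^2 = 1 will do; here they come from an arbitrary angle.
angle = 0.7
a, b = math.cos(angle), math.sin(angle)
moved = [transform(p, a, b, k_x=5.0, k_y=-2.0) for p in particles]

before = [squared_distance(p, q) for p, q in combinations(particles, 2)]
after = [squared_distance(p, q) for p, q in combinations(moved, 2)]

# Compare with a tolerance, since floating-point rotation is not exact.
relations_preserved = all(
    math.isclose(u, v, rel_tol=1e-9, abs_tol=1e-9) for u, v in zip(before, after)
)
print(relations_preserved)
```

The comparison uses a tolerance rather than exact equality because the rotated coordinates are computed in floating point.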

These relativities are purely formal, in the sense that they are tautological consequences of the premises, regardless of whether they have any physical applicability. Our first premise was that it’s possible to assign a single real-valued coordinate to each particle on a line such that the distance between any two particles equals the difference between their coordinates. If this premise is satisfied, the invariance of relations under coordinate transformations from x to x + k follows trivially, but if the pairwise distances between three given particles were, say, 5, 3, and 12 units, then no three numbers could be assigned to the particles such that the pairwise differences equal the distances. This shows that the n(n–1)/2 pairwise distances between n particles cannot be independent of each other if those distances can be encoded unambiguously by just n coordinates in one dimension or, more generally, by kn coordinates in k dimensions. A suitable system of coordinates in one dimension exists only if the distances between particles satisfy a very restrictive condition. Letting d(A,B) denote the signed distance from A to B, the condition that must be satisfied is that for every three particles A,B,C we have d(A,B) + d(B,C) + d(C,A) = 0. Of course, this is essentially the definition of co-linearity, but we have no a priori reason to expect this definition to have any applicability in the world of physical objects. The fact that it has wide applicability is a non-trivial aspect of our experience, albeit one that we ordinarily take for granted.
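The restrictive condition can be tested directly for three particles. The sketch below, in Python, takes three unsigned pairwise distances and asks whether any assignment of signs makes the signed distances sum to zero, which is the condition stated above. The function name fits_on_line and the second set of distances (5, 3, 8) are assumptions introduced only for contrast with the 5, 3, 12 example.

```python
from itertools import product

def fits_on_line(d_ab, d_bc, d_ca, tol=1e-9):
    """True if some choice of signs gives d(A,B) + d(B,C) + d(C,A) = 0."""
    return any(
        abs(s1 * d_ab + s2 * d_bc + s3 * d_ca) <= tol
        for s1, s2, s3 in product((1, -1), repeat=3)
    )

# The distances from the text: no three coordinates reproduce them.
print(fits_on_line(5, 3, 12))  # False

# A hypothetical consistent case, e.g. coordinates 0, 5 and 8.
print(fits_on_line(5, 3, 8))   # True
```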

Likewise for particles in a region of three dimensional space the premise that we can assign three numbers to each particle such that the squared distance between any two particles equals the sum of the squares of the differences between their respective coordinates is true only under a very restrictive condition, because there are only 3n degrees of freedom in the n(n–1)/2 pairwise distances between n particles.
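The counting behind this restriction is easy to tabulate. The following Python sketch simply compares the number of pairwise distances, n(n–1)/2, with the 3n coordinates available in three dimensions; the function names and the sample values of n are assumptions made for illustration.

```python
def pair_count(n):
    """Number of pairwise distances between n particles."""
    return n * (n - 1) // 2

def coordinate_count(n, dims=3):
    """Number of coordinates needed for n particles in `dims` dimensions."""
    return dims * n

# Sample particle counts, chosen arbitrarily.
for n in (2, 4, 7, 8, 20, 100):
    print(n, pair_count(n), coordinate_count(n))
```

The two counts coincide at n = 7 (21 each), and for every larger n the distances outnumber the coordinates, so the distances cannot all be independent.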

Just as we found relativity of orientation for the pair of spatial coordinates x and y, we also find the same relativity for each of the pairs x,z and y,z in three dimensional space. Thus we have translational relativity for each of the four coordinates x,y,z,t, and we have rotational relativity for each pair of spatial coordinates (x,y), (x,z), and (y,z). This leaves the pairs of coordinates (x,t), (y,t) and (z,t). Not surprisingly we find that there is an analogous arbitrariness in these coordinate pairs, which can be expressed (for the x,t pair) by saying that the relations between the instances of particles on a line as a function of time are unaffected if we replace the x and t coordinates with ax – bt and –bx + at respectively, where a² – b² = 1. These transformations (rotations in the x,t plane through an imaginary angle), which characterize the theory of special relativity, are based on the premise that it is possible to assign pairs of values, x and t, to each instance of each particle on the x axis such that the squared spacetime distance equals the difference between the squares of the differences between the respective coordinates.
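The same kind of numerical check used for ordinary rotations can be applied to the x,t pair. In the Python sketch below, a and b are taken as the hyperbolic cosine and sine of an arbitrary parameter, which guarantees a² – b² = 1, and the squared spacetime distance is computed as the squared difference in x minus the squared difference in t. The events, the parameter value, and the function names are assumptions for the example, and units are assumed to be chosen so that x and t are directly comparable.

```python
import math
from itertools import combinations

def boost(event, a, b):
    """Replace (x, t) with (a*x - b*t, -b*x + a*t); requires a*a - b*b == 1."""
    x, t = event
    return (a * x - b * t, -b * x + a * t)

def squared_interval(e1, e2):
    """Squared x difference minus squared t difference."""
    dx = e1[0] - e2[0]
    dt = e1[1] - e2[1]
    return dx * dx - dt * dt

# Hypothetical events, each given as (x, t).
events = [(0.0, 0.0), (1.0, 2.0), (4.0, 1.0), (-3.0, 5.0)]

# cosh and sinh of any real parameter satisfy a^2 - b^2 = 1.
parameter = 0.9
a, b = math.cosh(parameter), math.sinh(parameter)
boosted = [boost(e, a, b) for e in events]

before = [squared_interval(p, q) for p, q in combinations(events, 2)]
after = [squared_interval(p, q) for p, q in combinations(boosted, 2)]

intervals_preserved = all(
    math.isclose(u, v, rel_tol=1e-9, abs_tol=1e-9) for u, v in zip(before, after)
)
print(intervals_preserved)
```

Expanding (aΔx – bΔt)² – (–bΔx + aΔt)² shows why this works: the cross terms cancel, leaving (a² – b²)(Δx² – Δt²).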

Each of the above examples represents an invariance of physically measurable relations under certain classes of linear transformations. Extending this idea, Einstein’s general theory of relativity shows how the laws of physics, suitably formulated, are invariant under an even larger class of transformations of space and time coordinates, including non-linear transformations, and how these transformations subsume the phenomena of gravity. In general relativity the metrical properties of space and time are not constant, so the simple premises on which we based the primitive relativities described above turn out not to be satisfied globally. However, it remains true that those simple premises are satisfied locally, i.e., over sufficiently small regions of space and time, so they continue to be of fundamental importance.

As mentioned previously, the relativities described above are purely formal and tautological, but it turns out that each of them is closely related to a non-trivial physical symmetry. There exists a large class of identifiable objects whose lengths maintain a fixed proportion to each other under the very same set of transformations that characterize the relativities of the coordinates. In other words, just as we can translate the coordinates on the x axis without affecting the length of any object, we also find a large class of objects that can be individually translated along the x axis without affecting their lengths. The same applies to rotations and boosts. Such changes are physically distinct from purely formal shifts of the entire coordinate system, because when we move individual objects we are actually changing the relations between objects, since we are moving only a subset of all the coordinated objects. (Also, moving an object from one stationary position to another requires acceleration.) Thus for each formal arbitrariness in the system of coordinates there exists a physical symmetry, i.e., a large class of entities whose extents remain in constant proportions to each other when subjected individually to the same transformations.

We refer to these relations as physical symmetries rather than physical invariances, because (for example) we have no basis for asserting that the length of a solid object or the duration of a physical process is invariant under changes in position, orientation or state of motion. We have no way of assessing the truth of such a statement, because our measures of length and duration are all comparative. We can say only that the spatial and temporal extents of all the “stable” physical entities and processes are affected (if at all) in exactly the same proportion by changes in position, orientation, and state of motion. Of course, given this empirical fact, it is often convenient to speak as if the spatial and temporal extents are invariant, but we shouldn’t forget that, from an epistemological standpoint, we can assert only symmetry, not invariance.

In his original presentation of special relativity in 1905 Einstein took measuring rods and clocks as primitive elements, even though he realized the weakness of this approach. He later wrote of the special theory:

It is striking that the theory introduces two kinds of physical things, i.e., (1) measuring rods and clocks, and (2) all other things, e.g., the electromagnetic field, the material point, etc. This, in a certain sense, is inconsistent; strictly speaking, measuring rods and clocks should emerge as solutions of the basic equations (objects consisting of moving atomic configurations), not, as it were, as theoretically self-sufficient entities. The procedure was justified, however, because it was clear from the very beginning that the postulates of the theory are not strong enough to deduce from them equations for physical events sufficiently complete and sufficiently free from arbitrariness to form the basis of a theory of measuring rods and clocks.

This is quite similar to the view he expressed many years earlier:

…the solid body and the clock do not in the conceptual edifice of physics play the part of irreducible elements, but that of composite structures, which may not play any independent part in theoretical physics. But it is my conviction that in the present stage of development of theoretical physics these ideas must still be employed as independent ideas; for we are still far from possessing such certain knowledge of theoretical principles as to be able to give exact theoretical constructions of solid bodies and clocks.

The first quote is from his Autobiographical Notes in 1949, whereas the second is from his essay on Geometry and Experience published in 1921. It’s interesting how little his views had changed during the intervening 28 years, despite the fact that those years saw the advent of quantum mechanics, which many would say provided the very theoretical principles underlying the construction of solid bodies and clocks that Einstein felt had been lacking. Whether or not the principles of quantum mechanics are adequate to justify our conceptions of reference lengths and time intervals, the characteristic spatial and temporal extents of quantum phenomena are used today as the basis for all such references.

Considering the arbitrariness of absolute coordinates, one might think our spatio-temporal descriptions could be better expressed in purely relational terms, such as by specifying only the mutual distances (minimum path lengths) between objects. Nevertheless, the most common method of description is to assign absolute coordinates (three spatial and one temporal) to each object, with reference to an established system of coordinates, while recognizing that the choice of coordinate systems is to some extent arbitrary. The relations between objects are then inferred from these absolute (though somewhat arbitrary) coordinates. This may seem to be a round-about process, but there are several reasons for using absolute coordinate systems to encode the relations between objects, rather than explicitly specifying the relations themselves.

One reason is that this approach enables us to take advantage of the efficiency made possible by the finite dimensionality of space. As discussed in Section 1.1, if there were no limit to the dimensionality of space, then we would expect a set of n particles to have n(n–1)/2 independent pairwise spatial relations, so to explicitly specify all the distances between particles would require n–1 numbers for each particle, representing the distances to each of the other particles. For a large number of particles (to say nothing of a potentially infinite number) this would be impractical. Fortunately the spatial relations between the objects of our experience are not mutually independent. The nth particle essentially adds only three (rather than n–1) degrees of freedom to the relational configuration. In physical terms this restriction can be clearly seen from the fact that the maximum number of mutually equidistant particles in D-dimensional space is D+1. Experience teaches us that in our physical space we can arrange four, but not five or more, particles such that they are all mutually equidistant, so we conclude that our space has three dimensions.
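The statement about mutually equidistant particles can be checked with a little linear algebra. The Python sketch below relies on a standard fact that is not stated in the text: if we take one of n points as origin, the vectors to the other n–1 points have a Gram matrix whose entries are (d0i² + d0j² – dij²)/2, and the rank of that matrix is the number of dimensions the configuration needs. For unit spacing every diagonal entry is 1 and every off-diagonal entry is 1/2. The function names are assumptions, and exact fractions are used to avoid rounding questions.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix of Fractions, by Gaussian elimination."""
    m = [list(r) for r in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

def dimension_needed(n):
    """Dimensions required to hold n mutually equidistant points."""
    size = n - 1
    gram = [
        [Fraction(1) if i == j else Fraction(1, 2) for j in range(size)]
        for i in range(size)
    ]
    return matrix_rank(gram)

for n in range(2, 7):
    print(n, dimension_needed(n))
```

Under these assumptions four equidistant points need three dimensions and five need four, in agreement with the rule that D-dimensional space admits at most D+1 mutually equidistant particles.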

Historically the use of absolute coordinates rather than explicit relations may also have been partly due to the fact that analytic geometry and Cartesian coordinates were invented (by Fermat, Descartes and others) at almost the same time that the new science of mechanics needed them, just as tensor analysis was invented, three hundred years later, at the very moment when it was needed to facilitate the development of general relativity. (Of course, such coincidences are not accidental; contrivances requiring new materials tend to be invented soon after the material becomes available.) The coordinate systems of Descartes were not merely efficient, they were also consistent with the ancient Aristotelian belief (also held by Descartes) that there is no such thing as empty space or vacuum, and that continuous substance permeates the universe. In this context we cannot even contemplate explicitly specifying each individual distance between substantial points, because space is regarded as a continuum of substance. For Aristotle and Descartes, every spatial extent is a measure of the length of some substance, not a pure distance between particles as contemplated by atomists. In this sense we can say that the continuous absolute coordinate systems inherited by modern science from Aristotle and Descartes are a remnant of the Cartesian natural philosophy.

Another, perhaps more compelling, reason for the adoption of abstract coordinate systems in the descriptions of physical phenomena was the need to account for acceleration. As Newton explained with the example of a “spinning pail”, the mutual relations between a set of material particles in an instant are not adequate to fully characterize a physical situation – at least not if we are considering only a small subset of all the particles in the universe. (Whether the mutual relations would be adequate if all the matter in the universe were taken into account is an open question.) In retrospect, there were other possible alternatives, such as characterizing not just the relations between particles at a specific instant, but over some temporal span of existence, but this would have required the unification of spatial and temporal measures, which did not occur until much later. Originally the motions of objects were represented simply by allowing the spatial coordinates of each persistent object to be continuous single-valued functions of one real variable, the time coordinate.

Incidentally, one consequence of the use of absolute coordinates is that it automatically entails a breaking of the alleged translational symmetry. We said previously that the coordinate system x could be replaced by x + k for any real number k, implying that every real value of k is in some sense equally suitable. However, from a strictly mathematical point of view there does not exist a uniform distribution over the real numbers, so this form of representation does not exactly entail the perfect symmetry of position in an infinite space, even if the space is completely empty.

The set of all combinations of values for the three spatial coordinates and one time coordinate is assumed to give a complete coordination not only of the spatial positions of each entity at each time, but of all possible spatial positions at all possible times. Any definite set of space and time coordinates constitutes a system of reference. There are infinitely many distinct ways in which such coordinates can be assigned, but they are not entirely arbitrary, because we limit the range of possibilities by requiring contiguous physical entities to be assigned contiguous coordinates. This imposes a definite structure on the system, so it is more than merely a set of labels; it represents the most primitive laws of physics.

One way of specifying an entire model of a world consisting of n (classical) particles would be to explicitly give the 3n functions x_j(t), y_j(t), z_j(t) for j = 1 to n. In this form, the un-occupied points of space would be irrelevant, since only the actual paths of actual physical entities have any meaning. In fact, it could be argued that only the intersections of these particles have physical significance, so the paths followed by the particles in between their mutual intersections could be regarded as merely hypothetical. Following this approach we might end up with a purely combinatorial specification of discrete interactions, with no need for the notion of a continuous physical space within which entities reside and move. However, the hypothesis that physical objects have continuous positions as functions of time with respect to a specified system of reference has proven to be extremely useful, especially for purposes of describing simple laws by which the observable interactions can be efficiently described and predicted.

An important class of physical laws that make use of the full spatio-temporal framework consists of laws that are expressed in terms of fields. A field is regarded as existing at each point within the system of coordinates, even those points that are not occupied by a material particle. Therefore, each continuous field existing throughout time has, potentially, far more degrees of freedom than does a discrete particle, or even infinitely many discrete particles. Arguably, we never actually observe fields; we merely observe effects attributed to fields. It’s ironic that we can simplify the descriptions of particles by introducing hypothetical entities (fields) with far more degrees of freedom, but the laws governing the behavior of these fields (e.g., Maxwell’s equations for the electromagnetic field) along with symmetries and simple boundary conditions suffice to constrain the fields so that they actually do provide a simplification. (Fields also provide a way of maintaining conservation laws for interactions “at a distance”.) Whether the usefulness of the concepts of continuous space, time, and fields suggests that they possess some ontological status is debatable, but the concepts are undeniably useful.

These systems of reference are more than simple labeling. The numerical values of the coordinates are intended to connote physical properties of order and measure. In fact, we might even suppose that the sequences of states of all particles are uniformly parameterized by the time coordinate of our system of reference, but therein lies an ambiguity, because it isn't clear how the temporal states of one particle are to be placed in correspondence with the temporal states of another. Here we must make an important decision about how our model of the world is to be constructed. We might choose to regard the totality of all entities as comprising a single element in a succession of universal temporal states, in which case the temporal correspondence between entities is unambiguous. In such a universe the temporal coordinate induces a total ordering of events, which is to say, if we let the symbol ≤ denote temporal precedence or equality, then for every three events a,b,c we have

(i) a ≤ a

(ii) if a ≤ b and b ≤ a, then a = b

(iii) if a ≤ b and b ≤ c, then a ≤ c

(iv) either a ≤ b or b ≤ a

However, this is not the only possible choice. We might choose instead to regard the temporal state of each individual particle as an independent quantity, bearing in mind that orderings of the elements of a set are not necessarily total. For example, consider the subsets of a flat plane, and the ordering induced by the inclusion relation ⊆. Obviously the first three axioms of a total ordering are satisfied, because for any three subsets a,b,c of the plane we have (i) a ⊆ a , (ii) if a ⊆ b and b ⊆ a, then a = b, and (iii) if a ⊆ b and b ⊆ c, then a ⊆ c. However, the fourth axiom is not satisfied, because it's entirely possible to have two sets neither of which is included in the other. An ordering of this type is called a partial ordering, and we should allow for the possibility that the temporal relations between events induce a partial rather than a total ordering. In fact, we have no a priori reason to expect that temporal relations induce even a partial ordering. It is safest to assume that each entity possesses its own temporal state, and let our observations teach us how those states are mutually related, if at all. (Similar caution should be applied when modeling the relations between the spatial states of particles.)
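The difference between the two kinds of ordering can be made concrete with a small check of the four axioms. In the Python sketch below, a list of numbers stands in for universal time coordinates, and a few finite sets stand in for subsets of the plane; both collections, and the function name check_axioms, are assumptions made for illustration. Python's <= operator on frozensets is the inclusion relation.

```python
from itertools import product

def check_axioms(elements, leq):
    """Report which of the four ordering axioms hold on a finite collection."""
    pairs = list(product(elements, repeat=2))
    triples = list(product(elements, repeat=3))
    return {
        "reflexive": all(leq(a, a) for a in elements),
        "antisymmetric": all(a == b for a, b in pairs if leq(a, b) and leq(b, a)),
        "transitive": all(leq(a, c) for a, b, c in triples if leq(a, b) and leq(b, c)),
        "total": all(leq(a, b) or leq(b, a) for a, b in pairs),
    }

def leq(a, b):
    """Numeric comparison for numbers, inclusion for frozensets."""
    return a <= b

# Hypothetical time coordinates of four events.
times = [0.0, 1.5, 2.0, 4.0]

# Finite stand-ins for subsets of the plane.
subsets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]

print(check_axioms(times, leq))
print(check_axioms(subsets, leq))
```

For the numbers all four axioms hold. For the sets the first three hold but the fourth fails, because neither of the sets {1} and {2} is included in the other.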

Given any system of space and time coordinates we can define infinitely many others such that speeds are preserved. This represents an equivalence relation, and we can then define a reference frame as an equivalence class of coordinate systems such that the speed of each object has the same value in terms of each coordinate system in that class. Thus within a reference frame we can speak of the speed of an object, without needing to specify any particular coordinate system. Of course, just as our coordinate systems are generally valid only locally, so too are the reference frames.

Purely kinematic relativity contains enough degrees of freedom that we can simply define our systems of reference (i.e., coordinate systems) to satisfy the additivity of velocity. In other words, we can adopt velocity additivity as a principle, and this is essentially what scientists had tacitly done since ancient times. The great insight of Galileo and his successors was that this principle is inadequate to single out the physically meaningful reference systems. A new principle was necessary, namely, the principle of inertia, to be discussed in the next section.
