Teaching Special Relativity

The mutual actions of two bodies upon each other are always equal.

Isaac Newton

When first introduced to the special theory of relativity many students are troubled by the apparent circularity in the proposition that the speed of light is invariant. They are told that the speed of light has the value c relative to any frame of reference – provided we define our coordinates in such a way that the speed of light has the value c. This makes the invariance of light speed seem tautological and devoid of physical content, a mere artifact of the freedom to synchronize clocks in an arbitrary manner for different frames of reference. To dispel this misconception, we should first remind the student that there is no a priori reason why any coherent synchronization of clocks would yield invariant light speed. For example, if two pulses of light take different amounts of time to traverse the distance between two fixed spatial locations relative to a given frame (as in a ballistic theory of light), then obviously no single synchronization of clocks would let us assign the same speed to all light pulses. The fact that we can define coherent clock settings (“free from contradictions”) to give invariant light speed is a highly non-trivial attribute of the physical world, one that has been ascertained empirically. Many students overlook this because they begin their studies with a (vague) pre-conception of light as a wave in a perfectly uniform and stationary medium with a characteristic speed relative to the rest frame of the medium, independent of the state of motion of the source. On this basis it does indeed follow that we can assign the same value to the speed of light in terms of all other relatively moving systems of reference by means of suitable coordinate scaling and clock synchronizations, but we must not forget that this is founded on the highly non-trivial empirical fact that the speed of all light is invariant in terms of at least one coherent system of space and time coordinates.
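The ballistic counterexample can be written out in one line. The symbols here are introduced only for illustration and do not appear in the original text: D is the distance between the two fixed locations A and B, T_1 and T_2 are the differences between the arrival reading on the clock at B and the departure reading on the clock at A for the two pulses under some initial clock setting, and δ is an arbitrary offset applied to the clock at B.

```latex
\[
v_1(\delta) = \frac{D}{T_1 + \delta}, \qquad
v_2(\delta) = \frac{D}{T_2 + \delta}, \qquad
v_1(\delta) = v_2(\delta) \iff T_1 = T_2 .
\]
```

Resetting the distant clock shifts both assigned transit times by the same δ, so no choice of δ can equalize the speeds of two pulses whose transit times differ.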

The more serious obstacle to understanding for most students – the one that typically leads to the most persistent confusion – is that they think the “suitable synchronizations and scalings” that make the speed of light invariant have no other physical significance, and many students suspect that the resulting space-time coordinate systems are arbitrary and cannot (or should not) be regarded as physically significant measures of space and time. For example, they imagine that we could just as well use sound waves (albeit in an unrealistic perfectly stationary and homogeneous medium) to establish simultaneity, but of course this is not true, since the recoverable kinetic energy of a particle does not approach infinity as the speed of the particle approaches the speed of sound. The root cause of the student’s misunderstanding is that most presentations of the subject follow Einstein’s 1905 paper in neglecting to clearly identify the operational definition of simultaneity that forms the basis of Newtonian physics, i.e., nearly all introductions to special relativity fail to mention that Newton’s third law already entails a definite synchronization of the time coordinates at spatially separate locations. For any frame of reference the isotropy and homogeneity of mechanical inertia implicit in Newton’s laws are sufficient to establish a unique system of coordinates for both space and time (up to trivial spatial re-orientations, translations and choice of units). To see this, notice that for any unaccelerated frame of reference Newton’s third law holds good only if we define the synchronization of clocks such that two identical particles initially adjacent and at rest in the frame and exerting mutual repulsive forces on each other will travel equal distances in equal times. (See the note What is an Inertial Coordinate System?) 
Einstein’s 1905 paper, like most subsequent presentations, takes great care to describe a different way of establishing a system of space and time coordinates, by means of light signals, but these presentations almost invariably fail even to mention the crucial empirical fact that the light-based coordinate systems of Poincare and Einstein are identical to the inertia-based coordinate systems (intrinsic to any given frame) of Galileo and Newton.

This is particularly unfortunate, because the fact that the laws of both electromagnetism and mechanics are homogeneous and isotropic in terms of the same class of coordinate systems is what makes those coordinate systems the most physically meaningful measures of space and time. Of course, the significance of these coordinates is strengthened even further by the fact that descriptions of the strong and weak nuclear forces – and the wave functions of quantum mechanics – all exhibit their most symmetrical forms when expressed in terms of the very same class of coordinate systems, which we call standard inertial coordinates. Physical phenomena with this property are said to be Lorentz invariant, and this has been found to characterize all known physical phenomena locally. (We leave aside gravitation, since it involves a generalization of the concept of the spacetime manifold.)

Of course, having said all this, the use of standard inertial coordinates, defined so that the laws of mechanics are homogeneous and isotropic, remains a convention. We can obviously define other kinds of space and time coordinate systems, with different measures of spatial position and different synchronizations of the time coordinates at spatially separate locations. Indeed, Einstein described one such alternate system of coordinates in his 1905 paper before introducing the “more practical” light-based coordinates. Naturally we are free to regard any particular system of coordinates as the “true” measures of space and time, but we must bear in mind that this designation has no significance in the absence of a meaningful definition of the word “true” in this context, and moreover that the laws of mechanics will not take their homogeneous and isotropic form in terms of any of those alternative systems of coordinates.

Regarding Einstein’s original 1905 paper, one might argue that he actually did establish the identity between the light-based and the inertia-based coordinate systems – noting that the preface of the paper says “the same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics are valid”, and the very first sentence of Part I of the paper says “Let us take a system of coordinates in which the equations of Newtonian mechanics hold good” (and Sommerfeld added “to the first approximation”). Taken literally this does indeed suffice to establish the inertia-based coordinates for both space and time, as discussed above. Unfortunately the remainder of this section of Einstein’s paper reveals that when he spoke of “frames” and “coordinate systems” he was referring to only the spatial coordinates. He says “if a material point is at rest relatively to this system of coordinates, its position can be defined [by] the methods of Euclidean geometry…”, and then he goes on to say that “if we wish to describe the motion of a material point, we give the values of its coordinates as functions of time”. Thus the “coordinates” he is referring to are just the spatial coordinates. He has not (at this stage) progressed to the point of treating time as a fourth coordinate. Now, strictly speaking, this is inconsistent with his opening sentence, in which he refers to the coordinates as being such that the laws of mechanics hold good. Mechanics concerns the motions of objects, not just their spatial positions, so we cannot claim that the laws of Newtonian mechanics (or more precisely the laws of mechanics, whatever they may be, in their simplest isotropic and homogeneous form) hold good in terms of any system of spatial coordinates without also specifying the time coordinate. 
As noted above, Newton’s third law holds good only for a specific synchronization between the time coordinates assigned to spatially separate locations, so without specifying this time coordinate we cannot claim that the laws of mechanics (which involve time) hold good in terms of any given system of space coordinates. The problem is compounded by the order in which the two foundational principles are enunciated at the beginning of section 2 of Einstein’s 1905 paper, where he places the relativity principle ahead of the effective specification of the class of coordinate systems in terms of which the relativity principle holds good. (To be fair, the same confusion exists in most presentations of Newtonian mechanics, where the “laws of motion” are enunciated first, and only later does the student learn that those laws are valid only in terms of a class of coordinate systems defined specifically so as to make those laws valid.)

After neglecting to mention that the requirement for the laws of mechanics to take their homogeneous and isotropic form already suffices to establish a complete system of space and time coordinates, Einstein goes on to describe a way – using light signals – to assign time coordinates (in modern terminology) to his system of space coordinates. He emphasizes that these coordinates are defined such that the speed of light is isotropic (i.e., the same in all directions), but he fails to mention that the standard inertial coordinates (intrinsic to each frame) of Galileo and Newton were likewise defined such that mechanical inertia is isotropic, and he fails to explicitly mention the crucial empirical fact that these two definitions yield the identical class of coordinate systems. Without pointing out the identity between the seemingly novel light-based coordinates and the traditional and intuitive inertia-based coordinates, we leave students with the impression that the coordinates in special relativity are totally arbitrary and have no physical significance – when in fact they are exactly the measures of space and time that the student implicitly takes for granted. This gives rise to the notion that the isotropy of the speed of light is purely conventional, even though the isotropy of light speed in terms of standard inertial (space-time) coordinates is actually an empirical fact.

Ironically, it is arguable that Lorentz came closer in 1904 than Einstein did in 1905 to explicitly identifying the standard inertial coordinate systems with the light-based coordinate systems. In his 1904 paper, Lorentz concluded section 12 describing his theorem of corresponding states by saying that

The proper relation between the forces and the accelerations will exist in the two cases, if we suppose that the masses of all particles are influenced by a translation to the same degree as the electromagnetic masses of the electrons.

Unfortunately, as Lorentz himself later admitted, he did not at that time grasp the significance of what he had said. He had essentially shown that, if we are to account for all the observations confirming the complete undetectability of absolute motion, the light-based coordinate systems related by Lorentz transformations must be identical to the inertia-based coordinates of Galileo and Newton, but he just didn’t make the connection in his mind. The connection was certainly implicit throughout Einstein’s 1905 paper, essentially contained within the relativity principle, but that paper never explicitly mentioned the simultaneity of inertia-based coordinates, and the order of presentation makes it easy for students to miss the connection.

Einstein’s review paper of 1907 did a slightly better job of explaining the situation, partly by reversing the order of the principles. He began by assuming that for any system of orthogonal unaccelerated space coordinates we can adjust a coherent set of standard ideal clocks in such a way that the speed of every pulse of light, in every direction, regardless of the state of motion of the source, equals a universal constant. He points out that (as we discussed above) “It is by no means self-evident that the assumption made here, which we will call ‘the principle of the constancy of the velocity of light,’ is actually realized in nature”. Indeed, we can imagine an emission theory of light (for example) in which it would not be possible by any coherent array of clocks adjusted in any manner to yield the invariance of light speed. It is possible if – but only if – light has an invariant speed with respect to at least one coherent system of reference. This is why Einstein says the assumption is made plausible by the experimental confirmations of Lorentz’s theory, in which light is conceived as propagating in a fixed homogeneous ether, so it has an invariant speed in terms of the rest frame coordinates of the ether. It then follows trivially that we can adjust clocks for any other (relatively moving) systems of unaccelerated space coordinates to give the same invariant light speed. But we still don’t have the connection with the inertia-based coordinates of mechanics.
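As a sketch of why the clock adjustment for moving systems is straightforward, suppose (in notation not used in the original) that light satisfies x = ±ct in the coordinates x, t of the one system in which its speed is invariant, and that a second system moves at speed v along the x axis. Rescaling and resynchronizing in the standard Lorentz form gives:

```latex
\[
x' = \gamma\,(x - vt), \qquad
t' = \gamma\left(t - \frac{v x}{c^{2}}\right), \qquad
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}},
\]
\[
x = \pm ct \;\Longrightarrow\;
\frac{x'}{t'} = \frac{\pm c - v}{1 \mp v/c} = \pm c .
\]
```

This shows only that the primed coordinates can be chosen to make light speed invariant. It says nothing about whether mechanical inertia is isotropic in those same coordinates, which is the missing connection noted above.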

Einstein then goes on (in the 1907 paper) to introduce the crucial empirical assertion, stating that, given the assignment of time coordinates using the light principle just described for each unaccelerated system of reference, “[all] physical laws are independent of the state of motion of the reference system”. This assertion includes the laws of mechanics in their homogeneous and isotropic form, and Einstein makes it clear that this assertion has empirical content by saying that it is suggested by the Michelson and Morley experiment. Just as Lorentz had done in 1904, Einstein realized that the null result for that experiment (combined with others) forces us to conclude that mechanical inertia itself must be Lorentz invariant (to use the modern terminology). Thus the intrinsic inertia-based coordinate systems of Galileo and Newton are identical to the light-based coordinates. Many years later (in 1949) when discussing the foundations of special relativity Einstein emphasized again the importance of the empirical basis for these foundational propositions, and tried to dispel the misconception (perhaps encouraged by his 1905 paper, and also by some comments in his 1916 popular account of relativity) that the invariance of light speed is entirely conventional: “With the given physical interpretation of coordinates and time, this is by no means a merely conventional step…[it] can be experimentally confirmed or disproved.”

When confronted with this, many students are bewildered – How can something be both a free convention (definition) and an empirical fact? The answer, of course, is that these characterizations actually refer to two different propositions. The isotropy of light speed is a convention, but the isotropy of light speed in terms of standard inertial coordinates is an empirical fact. This sounds paradoxical to students (at first) because they are so accustomed to taking standard inertial coordinate systems for granted as the most physically meaningful measures of space and time – and for good reason: They have been explicitly taken for granted since Galileo and Newton, and implicitly for all of human history. The student would never think of regarding as purely conventional the choice of defining the mechanical inertia of a resting object to be isotropic. They think “Well of course the inertia of a resting object is the same in all directions – why wouldn’t it be?”, without realizing that this depends on our choice of coordinate system, and that we freely choose to use coordinate systems in which inertia is isotropic in order for the laws of mechanics to take their simplest and most convenient form. (Needless to say, independence of extrinsic direction does not imply independence of velocity.)

It has often been pointed out that Newton’s so-called “laws of motion” are not purely laws of motion; they collectively constitute the definition of a class of coordinate systems, which we call standard inertial coordinates. We define these coordinates so as to make the laws of mechanics homogeneous and isotropic. (Note that it is necessary but not sufficient for these coordinates to be unaccelerated.) More precisely, we use particular instances of inertial phenomena to define a system of coordinates in terms of which those phenomena are homogeneous and isotropic. This is a circular step, merely amounting to a definition. The physical content of the laws is that, once we have established a coordinate system in which the specific instances of the phenomena satisfy those equations, we find that all other inertial phenomena also satisfy those equations. One reason it is so challenging to teach special relativity is that the subject can’t be properly understood without a thorough revision of the student’s understanding of Newtonian physics on a more sophisticated level, clarifying the definitional and empirical content of Newton’s laws, and the operational meaning of simultaneity based on inertial symmetry. We also need to carefully explain that the Galilean postulate for the transformation between inertial coordinate systems is empirically incompatible with the Galilean definition of inertial coordinates intrinsic to each frame.

Although it’s important to understand that we are not required to use coordinates in terms of which the equations of mechanics are homogeneous and isotropic, it is equally important to understand that we almost invariably choose to use such coordinates, and that they form the basis of our deepest intuitions of space and time. Only once the student has grasped this are they able to understand the significance of the empirical isotropy of the one-way speed of light in terms of these coordinates. It’s important to stress that the relationship between relatively moving systems of standard inertial coordinates (i.e., the Lorentz transformation), including the value of c, can be determined empirically based purely on the behavior of mechanical inertia, without any attention to the behavior of light.
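One way to picture how c could be extracted from purely mechanical data is sketched below. It assumes the standard relativistic relation between a particle's speed and its kinetic energy, K = mc²(γ − 1), and a hypothetical measurement of m, v and K; the function names and the bisection approach are illustrative choices, not something taken from the text.

```python
import math

def kinetic_energy(m, v, c):
    """Relativistic kinetic energy m*c^2*(gamma - 1), written in a form that
    avoids subtracting nearly equal numbers when v is much smaller than c."""
    s = math.sqrt(1.0 - (v / c) ** 2)   # s = 1/gamma
    return m * v * v / (s * (1.0 + s))

def estimate_c(m, v, k_measured, iterations=200):
    """Find the c for which kinetic_energy(m, v, c) matches k_measured.

    For fixed m and v the kinetic energy decreases monotonically as c grows,
    from arbitrarily large values (c just above v) toward m*v^2/2, so
    bisection on c is sufficient.
    """
    if k_measured <= 0.5 * m * v * v:
        raise ValueError("no finite c reproduces this kinetic energy")
    # Assumed bracket: c lies between v and 1e6 * v.
    lo, hi = v * (1.0 + 1e-12), v * 1e6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if kinetic_energy(m, v, mid) > k_measured:
            lo = mid   # energy too high means the trial c is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical "measurement": generate data with a known c, then recover it.
C_TRUE = 299_792_458.0          # m/s
v = 0.6 * C_TRUE
k = kinetic_energy(1.0, v, C_TRUE)
c_est = estimate_c(1.0, v, k)
```

No property of light enters the calculation: the only inputs are a mass, a speed, and an energy, and the recovered `c_est` agrees with the value used to generate the data.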

We should mention that advocates of the electromagnetic view of the world in the late 19th century argued that mechanical inertia itself would someday be understood as having a purely electromagnetic foundation (extrapolating from the electromagnetic mass), and therefore the identity of inertia-based and light-based coordinates would indeed be tautological. However, as Lorentz was careful to point out in his 1904 paper (quoted above) we are not justified in assuming that matter is ultimately just a manifestation of electromagnetism, and Einstein was even more explicit about this, pointing out that the forces responsible for the stable configurations of matter must oppose the electromagnetic forces. So we cannot claim that the Lorentz covariance of mechanical inertia follows constructively from the Lorentz covariance of electromagnetism.

Perhaps the most fundamental source of confusion among beginning students is the unexamined belief that it should be possible to measure some physical quantity without stipulating what would constitute a measurement of that quantity. They tend to think that named physical quantities have some kind of metaphysical values that can be grasped independently of any operational meanings, because they have never had occasion to reflect on the fact that every measurement is ultimately a comparison of one thing with another. For example, to measure the spatial length of an object we may place the object next to a ruler (i.e., some standard configuration of matter) and note the comparison. Strictly speaking we aren’t justified in claiming to have measured the “true length” of the object, because we have no a priori basis for claiming that a “ruler” represents “true length” – unless we simply define it to be so. (Obviously the word “true” is superfluous in this context.) We are able to measure length only by agreeing to define “length” as the result of comparison with a ruler, or by some other operational means (after defining what qualifies as a “ruler”). Any definition of “length” that doesn’t have an operational basis is purely metaphysical, and devoid of content when examined closely. Every quantification of length is necessarily based on some operational procedure that we freely choose to define as “length”. The same applies to any kind of measurement of any physical quantity.

Needless to say, we might devise other, physically distinct, ways of evaluating the “length” of an object (by making other kinds of comparisons), many of which may yield the same result, and one might think this would justify the claim that we have measured “true length”. But of course there are also infinitely many other measurement procedures that yield different results. All we can say is that we define useful quantities such as “length” to correlate with large sets of operational procedures that, in our experience, yield persistent correlations in a wide range of circumstances. (As Henri Poincare observed, there would be no such thing as geometry if no objects possessed persistent configurations – meaning that their configurations exhibit persistent correlations with the configurations of other objects.)

For another example, consider the two-way speed of light. This is understood to signify the ratio of two quantities, one being the spatial distance traveled by a pulse of light (in vacuum) as it moves from its emission point to a reflection point and back to its emission point where it is re-absorbed, and the other being the duration of time between the emission and the re-absorption of the pulse. Thus any measurement of the two-way speed of light involves measurements of the spatial distance traveled by the pulse, and of the temporal interval between emission and re-absorption. For the reason just explained, we are never justified in claiming to be able to empirically determine the “true” magnitudes of either of these intervals, because they are ultimately just comparisons with the intervals associated with other physical processes and entities, and we are free to define the precise operational procedures to be represented by any given term. Moreover, in this case we encounter additional ambiguities, such as the need to define the spatial distance between non-simultaneous events. If we perform the measurement on the surface of the Earth (for example), we know both the emitter and the mirror are moving between the emission, reflection, and re-absorption events – at least with respect to some systems of reference, so we face the non-trivial task of defining the distance between two moving objects. Clearly the relevant “distance” is the spatial distance between the simultaneous positions of the emitter and the mirror, but how are we to establish simultaneity? If the emitter and the mirror are at rest relative to each other, we might try to finesse this point, by simply stipulating that the “spatial distance” between two objects at rest relative to each other is the value given by comparison with a set of standard rulers that are at rest with respect to the emitter and mirror. 
But of course this is an arbitrary stipulation, and it entails a particular convention for simultaneity between separate points. The atoms comprising a ruler are held together by electromagnetic forces, resulting in equilibrium configurations that are aligned with the synchronization convention based on the stipulation of isotropic speed of electromagnetic waves. Using this same convention, someone in motion relative to the emitter and mirror would argue that our distance measurements are incorrect – because they would be using a different simultaneity. Similar considerations apply to the measurement of the temporal interval.

This point is often not appreciated by beginning students, who imagine that a measurement of the two-way speed of light (or anything else) is free of ambiguities, and so (they think) we can measure two-way speeds without invoking any free choices or operational definitions. Obviously that’s not the case. The requisite set of assumptions necessary to give a meaningful definition of “two-way speed” is essentially the same as for defining “one-way speed”. In both cases the necessary and sufficient requirement is the stipulation of a system of space and time coordinates. Speeds are then defined for any infinitesimal interval of a particle as the ratio of the (root sum square of the) change in the space coordinates divided by the change in the time coordinate. One might think that a two-way speed measurement avoids the need for a complete time coordination throughout space, since the time interval is measured only along the worldline of the emitter, but as explained above this is not the case, because we need simultaneity at distant locations to establish the distance traveled without ambiguity.
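The definition of speed as a ratio of coordinate changes can be sketched in a few lines. The function names, the units (c = 1), and the alternative time coordinate t' = t + kx are illustrative assumptions, not taken from the text.

```python
import math

def coordinate_speed(e1, e2):
    """Speed between two events e = (t, x, y, z) on a particle's path:
    root-sum-square of the space-coordinate changes over the time change."""
    dt = e2[0] - e1[0]
    dx, dy, dz = (e2[i] - e1[i] for i in (1, 2, 3))
    return math.sqrt(dx * dx + dy * dy + dz * dz) / dt

def resynchronize(event, k):
    """Hypothetical alternative time coordinate t' = t + k*x.
    The space coordinates are left unchanged."""
    t, x, y, z = event
    return (t + k * x, x, y, z)

# A light pulse in units with c = 1:
# emitted at x = 0, reflected at x = 1, re-absorbed at x = 0.
emit = (0.0, 0.0, 0.0, 0.0)
reflect = (1.0, 1.0, 0.0, 0.0)
absorb = (2.0, 0.0, 0.0, 0.0)

standard = (coordinate_speed(emit, reflect),
            coordinate_speed(reflect, absorb))

k = 0.5
shifted = (coordinate_speed(resynchronize(emit, k), resynchronize(reflect, k)),
           coordinate_speed(resynchronize(reflect, k), resynchronize(absorb, k)))
```

In the standard coordinates both legs have speed 1. With k = 0.5 the same three events give about 0.667 outbound and 2.0 inbound. The space coordinates are held fixed in this sketch, which is itself a stipulation of the kind discussed above.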

Part of the reason for confusion on this point can be traced back to how Einstein’s 1905 paper on the electrodynamics of moving bodies was structured. He presented the two-way speed of light as unproblematic, and then pointed out that the definition of the one-way speed depends on our synchronization of clocks at separate locations. But once this has been recognized, and the consequences are worked out in the remainder of the paper, it is clear that the definition of distance in the two-way speed measurement depends on the very same choice of synchronization. Why then did Einstein begin by treating the distance traveled in the two-way measurement as if it didn’t rely on the stipulation of simultaneity? One could ask the same question about the very first sentence in Part A of the paper, where (as discussed above) he begins by considering “a coordinate system in which Newton’s mechanical equations are valid”. The problem is that in the subsequent discussion we learn that Newton’s laws are not actually valid in terms of those coordinates. As mentioned above, in later printings of the paper Sommerfeld tried to untangle this by adding the footnote “to the first approximation”, but of course there are no relativistic effects to the first approximation, so it’s debatable whether this completely resolves the issue.

Are we to regard these examples as evidence of problems with Einstein’s presentation, or perhaps even with the foundations of special relativity? Certainly not the latter, since it’s quite possible to present the foundations of special relativity without becoming entangled in these “chicken or egg” conundrums. And yet these more rigorous presentations would not necessarily have been as effective in conveying the new point of view to Einstein’s contemporaries. One has to begin somewhere, using terms and concepts that are already understandable to one’s audience, and then proceed to develop the less familiar concepts, even though this may entail some inconsistency between the way in which terms are used at the start of the discussion and how they are used at the end. Einstein himself explicitly recognized the difficulty, which is really inherent in any effective and intelligible presentation of a fundamentally new way of thinking. In later years he criticized his own early presentation, noting that it

…introduces two kinds of things, i.e., (1) measuring rods and clocks, (2) all other things, e.g., the electromagnetic field, the material point, etc. This, in a certain sense, is inconsistent; strictly speaking, measuring rods and clocks should emerge as solutions of the basic equations (objects consisting of moving atomic configurations), not, as it were, as theoretically self-sufficient entities. The procedure justifies itself, however, because it was clear from the very beginning that the postulates of the theory are not strong enough to deduce from them equations for physical events sufficiently complete and sufficiently free from arbitrariness in order to base upon such a foundation a theory of measuring rods and clocks. If one did not wish to forego a physical interpretation of the coordinates in general (something that, in itself, would be possible), it was better to permit such inconsistency – with the obligation, however, of eliminating it at a later stage of the theory.

Incidentally this explains why presentations that invoke the concept of “slow clock transport” are not really useful, because a “clock” is a high-level entity that already entails the action of physical processes whose descriptions are far more complex than (and rely upon) the simple establishment of a suitable system of space and time coordinates in terms of which the basic laws of mechanics are optimally simple, i.e., standard inertial coordinates. Ultimately these are the coordinate systems we really have in mind when we think of distances and time intervals.

As discussed above, the clearest and simplest way of establishing the necessary and sufficient coordinate systems for a meaningful definition of “speed” is to use Galileo’s prescription. He and his successors (e.g., Newton) found that the motions of physical objects satisfy a set of optimally simple formulas (“the laws of motion”) if – but only if – the spatial locations and times are expressed in terms of a certain class of coordinate systems. These systems are fully defined by the requirement that mechanical inertia is the same everywhere and in all directions. In other words, we define these coordinates in such a way that mechanical inertia is homogeneous and isotropic. Note that this characterization applies to the laws of mechanics, both as formulated by Newton, and as amended in special relativity, so by introducing them this way we avoid Einstein’s “chicken or egg” conundrum and the need for Sommerfeld’s problematic footnote. This prescription, which is nothing other than the one given by Galileo, together with scale factors based on the invariance of internal energy, is sufficient to fully establish a physically meaningful system of space and time coordinates. Notice that only in terms of such coordinates are all three of Newton’s laws quasi-statically valid.

This highlights another source of confusion. Students are often told that an “inertial coordinate system” is one in terms of which Newton’s first law is valid, and then they are immediately told that Newton’s laws (all three of them) are valid in terms of inertial coordinates. This implication would follow under Galilean relativity (with its assumed unique absolute time coordinate), but it does not follow under special relativity. This is yet another example of how concepts are introduced at the beginning of presentations on special relativity in ways that turn out to be inconsistent with special relativity by the end of the presentations, and unfortunately the inconsistency is often not “eliminated at a later stage”, so the student is left in a state of confusion. The satisfaction of Newton’s first law is obviously not sufficient to ensure the satisfaction of Newton’s third law (by which we mean the isotropy of inertia). The “third law” implies that two identical particles, initially adjacent and at rest, exerting a mutual repulsive force on each other, will acquire equal speeds in equal times (equality of action and reaction). Hence if we place the two particles initially at the midpoint between two mutually stationary objects, we must consider their arrivals at those two objects as simultaneous – otherwise the equality of action and reaction is violated and mechanical inertia is not isotropic.

The crucial point is that this shows that space and time coordinates in terms of which Newton’s laws (by which we mean the homogeneity and isotropy of mechanical inertia) are valid already entail an operational definition of simultaneity. We can synchronize separate clocks on the basis of the isotropy of mechanical inertia, i.e., the equality of action and reaction. This isn’t the only possible way of defining simultaneity, but it is the only way consistent with coordinate systems in which the laws of mechanics take their simplest form, and it is surely the way most consistent with our intuitive concept of simultaneity. The novelty of special relativity is not in its basic definition of suitable coordinate systems, but rather in how two such systems (in relative motion) are related to each other. But this relationship is not a matter of convention, it is an empirical fact that coordinate systems in which mechanical inertia is homogeneous and isotropic are related to each other by Lorentz transformations.
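The agreement between the two definitions of simultaneity can be illustrated numerically. The sketch below assumes the relativistic velocity-composition law and length contraction, so it presupposes the empirical fact at issue and serves only as a consistency check, not a derivation. The function names and numerical values are hypothetical. Two objects rest in a frame S' moving at v relative to S, each at rest distance `half_length` from a launch point, and two identical carriers (material particles or light pulses) leave the launch point at +u and −u relative to S'. All times are computed in S coordinates.

```python
import math

def compose(u, v, c):
    """Relativistic composition: velocity in S of something moving at u in S',
    where S' moves at v relative to S (all along one axis)."""
    return (u + v) / (1.0 + u * v / c**2)

def arrival_gap(u, v, half_length, c):
    """S-time difference between the carriers' arrivals at the front and
    rear objects, which are at rest in S'."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    d = half_length / gamma            # contracted separation as described in S
    w_front = compose(+u, v, c)        # carrier heading toward the front object
    w_rear = compose(-u, v, c)         # carrier heading toward the rear object
    t_front = d / (w_front - v)        # front object recedes from its carrier
    t_rear = d / (v - w_rear)          # rear object closes on its carrier
    return t_front - t_rear

c, v, L = 1.0, 0.6, 1.0
gap_slow = arrival_gap(0.1, v, L, c)   # slow material particles
gap_fast = arrival_gap(0.9, v, L, c)   # fast material particles
gap_light = arrival_gap(c, v, L, c)    # light pulses
expected = 2.0 * L * v / (c**2 * math.sqrt(1.0 - (v / c) ** 2))
```

All three gaps agree with 2γLv/c² (1.5 in these units), independent of u. The pairs of events that the mutual-repulsion procedure treats as simultaneous in the moving frame are therefore the same pairs picked out by light signals.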

Having established the full meaning of “a coordinate system in which the equations of Newtonian mechanics hold”, it is now trivial to measure the speed of light (or the speed of anything else), for both one-way and two-way processes. Speed over any infinitesimal interval is simply, as mentioned before, the root-sum-square of the changes in the spatial coordinates divided by the change in the time coordinate. Why, then, did Einstein state in his popular exposition of relativity published in 1917 that the isotropy of the one-way speed of light “is in reality neither a supposition nor a hypothesis about the physical nature of light, but a stipulation which I can make of my own free will in order to arrive at a definition of simultaneity”? That statement is perfectly correct, and completely in accord with what we’ve said. We can certainly define simultaneity based on the isotropy of light speed, just as we can define simultaneity based on the isotropy of mechanical inertia. Each of these is merely a definition, but the proposition that these two definitions of simultaneity based on two apparently distinct classes of physical phenomena yield the same simultaneity is a testable hypothesis with physical content. This is why, as we mentioned above, Einstein wrote:

With the given physical interpretation of coordinates and time, this is by no means a merely conventional step but implies certain hypotheses concerning the actual behavior of moving measuring rods and clocks, which can be experimentally confirmed or disproved.

As also noted above, he later pointed out that it was inconsistent to even talk about measuring rods and clocks, since those are high-level entities. This highlights the uselessness of the concept of “slow clock transport” in such discussions, because the concept is meaningless in the absence of a definition of a “clock”. Obviously we would not refer to a sundial or a pendulum; we can only mean something approaching an “ideal clock”, but what exactly is this? There are only three viable options. One would be to stipulate a “light clock”, consisting of light bouncing back and forth between mirrors, but if this is our definition of a clock, any application to the measurement of the speed of light becomes circular. Another option, in modern times, is to refer to a clock based on the quantum mechanical decay of radioactive atoms, but this entails the postulation of quantum mechanics and particle physics, which is far from intuitive or self-evident. The only remaining option for defining a suitable “clock” is the isotropy and homogeneity of mechanical inertia, which is the basis of operation for most actual clocks. But this reduces to the basic definition of suitable measures of space and time as those in terms of which mechanical inertia is homogeneous and isotropic. Of course, the fact that quantum mechanical processes are perfectly Lorentz invariant is itself an independent and testable hypothesis – one that has (so far) been borne out by all observations, and this constitutes one of the strongest confirmations of the fundamental significance of the inertia-based measures of space and time that form the basis of both Newtonian mechanics and special relativity.

Needless to say, it remains true that the use of inertia-based coordinates to define secondary quantities such as “speed” is a convention – but it is by no means an arbitrary convention. The principle of inertia serves as an organizing principle, much like the conservation of energy. These are useful concepts because we find, empirically, that the descriptions of phenomena and their inter-relationships take their simplest and most perspicuous form when expressed in terms that maintain those symmetries.
