Special Relativity and Superluminal Travel

Even prior to 1905 the ideas underlying special relativity were associated with the recognition of the speed of light as a limiting speed. For example, Lorentz’s 1904 paper was entitled “Electromagnetic Phenomena in a System Moving With Any Velocity Smaller Than That of Light”, and in this paper he showed how the laws of physics take the same form when expressed in terms of any system of coordinates related to a standard inertial coordinate system by what is now called a Lorentz transformation. He noted that this applies only to systems moving at less than the speed of light relative to each other.

In his June 1905 paper Einstein addressed the question of whether an object or signal could propagate faster than light. In section 4 he notes that length contraction would result in a rod approaching zero length as its speed approaches c, and he comments that

…for superluminal velocities our considerations become meaningless; we shall see in the considerations that follow that in our theory the velocity of light plays the role of infinitely great velocities.

In section 10 he derives the expression for the kinetic energy of a particle, and notes that this becomes infinitely great as v approaches c. He concludes by commenting that

As in our previous results, superluminary velocities have no possibility of existence.

Despite this, one occasionally sees the claim that special relativity merely implies that the speed of a material object cannot ever equal c, but doesn’t rule out the notion that something might somehow transition from below c to above c (or vice versa). However, there are abundant reasons why (in the context of special relativity) no mass-energy or information could propagate faster than light. For example, inertial coordinate systems are related by Lorentz transformations, which would yield imaginary coordinates when transforming to a speed greater than light, and the elapsed proper time along any spacelike interval (strictly a contradiction in terms) would be imaginary. The notion also violates continuity of energy and momentum. Furthermore, it is easily shown that any hypothetical capability of superluminal signaling would imply causal paradoxes. Indeed, in a 1907 article (Jahrbuch) Einstein outlined this powerful argument against superluminal travel or signaling in the context of special relativity. In section 5 of this paper he says

The addition theorem of velocities also yields the interesting conclusion that there cannot exist an effect that can be used for arbitrary signaling and that is propagated faster than light in vacuum. For example, let there be a material strip stretched along the x-axis of S, relative to which a certain effect (viewed from the material strip) propagates with velocity W, and let there be two observers, one in the point x = 0 (point A) and one in the point x = X (point B) of the x-axis, who are at rest relative to S. Let the observer at A send a signal by means of the above-mentioned effect to the observer at B through the material strip, which shall not be at rest but shall be moving in the negative x-direction with velocity v (< c). As a consequence of the [velocity composition formula], the signal will then be transmitted from A to B with velocity (W−v)/(1−Wv/c²). The time T necessary for this is then

T = X(1−Wv/c²)/(W−v).

The velocity v can assume any value smaller than c. Hence, if, as we have assumed, W > c, one can always choose v such that T < 0. This result means that we would have to consider as possible a transfer mechanism whereby the achieved effect would precede the cause. Even though this result, in my opinion, does not contain any contradiction from a purely logical point of view, it conflicts with the character of all our experience to such an extent that this seems sufficient to prove the impossibility of the assumption W > c.

Pictorially the situation is as shown in the figure below:

The material strip is at rest in terms of the t′ and x′ axes, and the putative signal proceeds from A to B at the superluminal speed W in terms of these coordinates. (The dashed diagonal is a light path.) Since the strip is moving to the left at the speed v in terms of the unprimed coordinates x,t, those coordinates are moving to the right with speed v in terms of the primed coordinates. As shown, event B has coordinates X,T where X is positive but T is negative, which signifies that B occurs prior to A in terms of these coordinates.
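As a rough numerical check of Einstein's expression, the following sketch evaluates T = X(1−Wv/c²)/(W−v) for a few strip speeds. The units (c = 1), the values X = 1 and W = 2, and the function names are assumptions chosen only for illustration; exact rational arithmetic is used so the sign change is unambiguous.

```python
from fractions import Fraction

def transit_time(X, W, v, c=Fraction(1)):
    """Einstein's 1907 expression T = X (1 - W v / c^2) / (W - v).

    X: distance from A to B in S
    W: signal speed relative to the material strip
    v: speed of the strip in the negative x-direction of S, with v < c
    """
    return X * (1 - W * v / c**2) / (W - v)

def critical_strip_speed(W, c=Fraction(1)):
    """Strip speed c^2 / W at which T = 0; any larger v (still below c) gives T < 0."""
    return c**2 / W

# Hypothetical illustration values: units with c = 1, X = 1, W = 2 (twice light speed).
X, W = Fraction(1), Fraction(2)
for v in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(f"v = {v}: T = {transit_time(X, W, v)}")
```

With these values T is positive for v = 1/4, zero at the critical speed v = c²/W = 1/2, and negative for v = 3/4. Since W > c implies c²/W < c, a suitable subluminal strip speed always exists, which is the content of Einstein's remark that "one can always choose v such that T < 0".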

In an exposition on relativity written in 1910, Einstein repeated the Jahrbuch argument almost verbatim (with slightly different labels for the variables), concluding again that

If we assume that W is larger than c, then one can always choose v in such a way that T is negative. There would have to exist a transmission phenomenon such that the signal would arrive at its goal before having been emitted: The effect would precede the cause. Even though such a result is not inadmissible from a logical point of view, it so contradicts all our empirical knowledge that we can consider the impossibility of having W > c has been demonstrated.

In another manuscript on relativity, composed around 1912, he repeated the argument again, but with some slight alterations in the form, along with a significant footnote:

We draw yet another interesting consequence from this addition theorem. If the vector q′ is parallel to the x′ axis of S′, then q = (q′ + v)/(1 + q′v/c²). If we assume that there is a physical effect that propagates from the place of its excitation with a velocity that is greater than c, then such a propagation must also be possible with respect to S′, in particular along the X′ axis of S′. Then it is possible to give to the translational velocity v a negative value that is so large, considered as an absolute value (< c), that 1 + q′v/c² becomes negative. Then, according to [the velocity composition formula], the propagation velocity q of the signal becomes negative with respect to S, i.e., there would exist effects at a distance that precede their cause. Since the existence of such effects is quite improbable,[68] then, according to the theory of relativity, one will have to consider it out of the question that there is a kind of signal (i.e., a propagation usable in principle for telegraphy) whose velocity exceeds c.

To vividly emphasize how “improbable” this situation would be, Einstein added footnote [68] which reads as follows:

By combining several such signaling devices, one could even achieve that the effect preceding its cause occurs at the location of the cause itself.
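The role of the factor 1 + q′v/c² in the 1912 argument can be illustrated with a small sketch. It assumes units with c = 1 and hypothetical values of q′ and v; the function names are introduced here only for illustration. It relies on the standard Lorentz transformation, under which an effect taking time dt′ > 0 in S′ takes time dt = γ·dt′·(1 + q′v/c²) in S.

```python
from fractions import Fraction

def compose(q_prime, v, c=Fraction(1)):
    """Addition theorem for collinear velocities: q = (q' + v) / (1 + q' v / c^2)."""
    return (q_prime + v) / (1 + q_prime * v / c**2)

def time_order_factor(q_prime, v, c=Fraction(1)):
    """The factor 1 + q' v / c^2.

    For an effect with dt' > 0 in S', the elapsed time in S is
    dt = gamma * dt' * (1 + q' v / c^2), so a negative factor means the
    effect arrives before it was emitted in terms of S.
    """
    return 1 + q_prime * v / c**2

v = Fraction(-3, 4)                      # hypothetical value, |v| < c
for q_prime in (Fraction(1, 2), Fraction(2)):
    print(q_prime, compose(q_prime, v), time_order_factor(q_prime, v))
```

For the subluminal case q′ = 1/2 the factor stays positive, so the negative q merely describes ordinary motion in the negative x direction. For the superluminal case q′ = 2 the factor itself is negative, so q is negative because the time order is reversed, which is the situation Einstein describes.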

This is the most commonly cited argument, pointing out that superluminal signaling in the context of special relativity would lead to causal loops, raising the so-called grandfather paradox. We could arrange for a signal to be sent if and only if it is not sent. It’s easy to see how this kind of loop could be constructed if superluminal signaling were possible, as illustrated in the figure below.

Given that the laws of physics take exactly the same form in terms of every system of inertial coordinates (principle of relativity), the putative ability to send any superluminal signal in terms of one such system would imply that we could do the same in terms of any other. Moreover, every directed spacelike interval goes in all possible spacelike directions – both forward and backward in coordinate time – in terms of some inertial system, so this would imply that signaling along every spacelike interval is possible. Hence it would be possible to send a signal from p to a spacelike-separated event q, and from q to an event r in the causal past of p, so someone at p could send a signal to his own causal past. As Einstein said, this conflicts with all our experience. (It would also permit closed energy loops, which would be unstable.)

Quantitatively, based on the premise that signals of speed u > 1 can be conveyed (in units with c = 1), we could send a signal from the origin p of a system S of inertial coordinates out to the event q with coordinates x_q = L, t_q = L/u, and from that event we could send a signal with speed −u in terms of another system S′ of inertial coordinates (with the same origin and moving at speed v relative to S) back to event r, where x_r = 0. The time coordinate of this return event is determined by the condition (x_r′ − x_q′)/(t_r′ − t_q′) = −u, from which it follows that t_r = [L/(u − v)][2 − v(u + 1/u)]. This is negative for any v greater than 2u/(1+u²), and this threshold is less than 1 for every u > 1, so a suitable subluminal v always exists. Hence if signals with |u| > 1 were possible, we could construct a causal loop. During the interval between r and p we could toss a coin, and at p we could transmit the result of that coin toss to q and then to r, prior to the toss. Likewise we could set up paradoxical situations, such as arranging for a message to be sent if and only if it is not sent. This suffices to convince most people that, as Einstein said, “superluminary velocities have no possibility of existence”. (One could still argue for a fully deterministic world with no freedom or randomness, in which any loops are regarded as part of a self-consistent “block universe” by fiat, but this goes contrary to ordinary scientific premises, and renders most considerations of “communication” meaningless.)
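The return time can be checked independently of the closed-form expression. The sketch below, assuming units with c = 1 and using hypothetical function names, converts the return leg (speed −u in S′) into S by the velocity composition law and compares the result with [L/(u − v)][2 − v(u + 1/u)].

```python
from fractions import Fraction

def return_time(L, u, v):
    """Time t_r in S at which the relayed signal is back at x = 0 (c = 1)."""
    t_q = L / u                          # outbound leg: speed u in terms of S
    # Return leg has velocity -u in S'. By velocity composition its velocity
    # in S is (v - u) / (1 - u v), so dt/dx = (1 - u v) / (v - u).
    dt_per_dx = (1 - u * v) / (v - u)
    return t_q + (0 - L) * dt_per_dx

def closed_form(L, u, v):
    """The expression [L / (u - v)] [2 - v (u + 1/u)]."""
    return L / (u - v) * (2 - v * (u + 1 / u))

def threshold(u):
    """Frame speed 2u / (1 + u^2) above which t_r is negative."""
    return 2 * u / (1 + u * u)

L, u = Fraction(4), Fraction(4)          # hypothetical values
for v in (Fraction(0), threshold(u), Fraction(1, 2)):
    print(v, return_time(L, u, v), closed_form(L, u, v))
```

For u = 4 the threshold is 8/17: at v = 0 the round trip takes 2 time units, at v = 8/17 it takes exactly zero, and at v = 1/2 the signal returns 1/7 of a time unit before it was sent.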

One misconception about this is the idea that the so-called “arrow of time” (a phrase coined by Eddington, although the concept had been discussed much earlier) implies a unique temporal foliation. Needless to say, the “arrow of time” is not (and doesn’t pertain to) a unique foliation. It distinguishes between the past and future light cones, but the foliations of all local inertial frames share the same causal “arrow of time” (past and future light cones), and all the laws of physics, including thermodynamics, take exactly the same form in terms of any of those inertial coordinate systems with their respective foliations (relativity of simultaneity). The Lorentz-invariant “arrow of time” does not single out any one inertial frame; it is equally compatible with all of them. People sometimes reason that, to create a causal paradox, a traveler would need to go from event B of the homebound person’s worldline to an earlier event A of that same worldline, that this would be in the reverse proper time direction of the homebound person, and that we could therefore exclude it merely by appealing to the “arrow of time”. However, that is not correct, because the traveler would not go directly from B to A in the reverse timelike direction. The whole point is that the traveler purportedly goes superluminally from B to a spacelike-separated event C, and from there he moves superluminally from C to A. Neither of those legs is a reverse timelike interval. They are both spacelike intervals. A measuring rod at rest in some inertial frame lies along such an interval, so it would make no sense to claim that time flows along the ruler merely because the ruler is not at rest in the frame of the Earth or in the isotropic CMBR frame. The putative traveler moving along such an interval is not at rest in any inertial frame, and the elapsed proper time along those intervals is imaginary.
Thus it simply makes no sense in the context of special relativity for a material entity (or even a signal) to propagate along a spacelike interval, and this precludes any superluminal propagation.

The above provides the answer to common questions, such as “If we create a superluminal space ship that can travel to Proxima Centauri (4 light years away) and back in just 2 years according to clocks on Earth, where is the causal paradox?” The answer is that, according to the principle of relativity, if we could construct a device that travels at 4c in terms of the Sun-ProxCen inertial rest frame, we could do the same in terms of any other inertial frame (because the laws of physics take the same form in all). Hence we could make a ship that travels to ProxCen at 4c in terms of the Sun-ProxCen frame S, and we could also make another ship that travels from ProxCen back to Earth at 4c in terms of any other inertial frame S′ moving at speed v relative to S. In terms of S let the departure, turnaround, and return events have (t,x) coordinates (0,0), (1,4), and (T,0) with units so c=1. In terms of S′ these events have coordinates (0,0), ((1−4v)γ,(4−v)γ), and (Tγ,−Tvγ) where γ = 1/√(1−v²). The return leg has velocity −4 in terms of S′, so we have

[−Tvγ − (4−v)γ] / [Tγ − (1−4v)γ] = −4

and hence T = (8−17v)/(4−v). This will be negative for any v greater than 8/17. Therefore, with the ship moving at −4c in terms of S′ on its return journey, given that S′ is moving at a speed greater than (8/17)c in the direction Sun-to-ProxCen, the ship would arrive back on Earth prior to its departure. The only way to deny this is to claim the laws of physics are different when expressed in terms of S than when expressed in terms of S′, which is a flat denial of the principle of relativity.

Since people sometimes confuse objects with frames, it can be helpful to consider a simple example of two identical spaceships A and B receding from each other at the relative speed of c/2. Suppose A can send a signal to B at the speed 4c in terms of the inertial coordinates in which A is at rest, and when the signal arrives, B sends the signal back to A at the speed 4c in terms of the inertial coordinates in which B is at rest. If B is 4 light-years from A (in terms of A’s frame) at the relay event, then setting v = 1/2 in the expression T = (8−17v)/(4−v) gives T = −1/7, so the return signal will arrive at A 1/7 of a year prior to the transmission of the original signal.
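The numbers in these two examples can be reproduced by working directly with event coordinates. The sketch below is a minimal illustration, assuming units with c = 1 (years and light-years) and hypothetical function names: it Lorentz-transforms the relay event into S′, imposes the return velocity −4 there, and solves for the arrival time T in S.

```python
import math

def lorentz(t, x, v):
    """Coordinates (t', x') in a frame moving at speed v along +x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

def arrival_time(v, speed=4.0, distance=4.0):
    """S-time T at which a return signal of velocity -speed in S' reaches x = 0."""
    t_q, x_q = distance / speed, distance       # relay event in S
    tq_p, xq_p = lorentz(t_q, x_q, v)           # relay event in S'
    g = 1.0 / math.sqrt(1.0 - v * v)
    # The return event (T, 0) has S' coordinates (g*T, -g*v*T). Imposing
    # (x_r' - x_q') / (t_r' - t_q') = -speed gives a linear equation for T:
    #   g*T*(speed - v) = speed*t_q' + x_q'
    return (speed * tq_p + xq_p) / (g * (speed - v))

for v in (0.0, 8.0 / 17.0, 0.5):
    print(f"v = {v:.4f}: T = {arrival_time(v):+.6f}")
```

At v = 0 the round trip takes 2 years, at v = 8/17 the arrival coincides with the departure, and at v = 1/2 (the two receding spaceships) the return signal arrives 1/7 of a year before the original transmission, in agreement with T = (8−17v)/(4−v).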

Incidentally, in the early 1920s, when popular fascination with relativity was at its highest, the humor magazine Punch published (without attribution) a now-famous limerick

There was a young lady named Bright

Who traveled far faster than light.

She set out one day

In a relative way

And returned on the previous night.

There is some evidence to suggest that it was composed by A. H. Reginald Buller, a professor of botany in Winnipeg.

Limericks aside, people sometimes notice that the interval from B to C in the above figure proceeds in the positive t coordinate direction, whereas the interval from C to A proceeds in the negative t coordinate direction, and they wonder if perhaps it might be possible to signal just along “forward-going” spacelike intervals, not along “backward-going” spacelike intervals, and thereby preclude causal loops while still allowing some superluminal travel. The problem with this idea is that there exist systems S′ of inertial coordinates (in terms of which all the laws of physics take exactly the same form as they do in terms of S) such that the interval BC proceeds in the negative t′ direction and the interval CA proceeds in the positive t′ direction. And there exist still other inertial coordinate systems S″ in terms of which both of the intervals BC and CA proceed in the negative t″ direction. So, we could assert an absolute distinction between forward and reverse directed spacelike intervals only by denying the principle of relativity and local Lorentz invariance (with its relativity of simultaneity), invoking a preferred frame in which the laws of physics take a detectably distinguished form.
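The frame dependence of "forward" and "backward" for spacelike intervals is easy to exhibit numerically. In the sketch below the coordinates of B, C, and A are hypothetical (the figure's actual values are not given in the text), units have c = 1, and dt_prime is a name introduced here for the time component of an interval in a frame moving at speed v.

```python
import math

def dt_prime(dt, dx, v):
    """Time component of the interval (dt, dx) in a frame moving at speed v (c = 1)."""
    return (dt - v * dx) / math.sqrt(1.0 - v * v)

# Hypothetical events in S: B = (t=0, x=0), C = (t=1, x=4), A = (t=-0.5, x=0).
BC = (1.0, 4.0)      # from B to C: forward in t, spacelike since |dx| > |dt|
CA = (-1.5, -4.0)    # from C to A: backward in t, also spacelike

for label, v in (("S", 0.0), ("S'", 0.5), ("S''", 0.3)):
    print(label, dt_prime(*BC, v), dt_prime(*CA, v))
```

With these values, BC is forward and CA backward in terms of S (v = 0), BC is backward and CA forward in terms of the frame with v = 0.5, and both are backward in terms of the frame with v = 0.3. No Lorentz-invariant rule can therefore label one of these spacelike legs as "forward-going".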

Of course, there are distinguished frames of reference in our universe, such as the local frame in terms of which the distribution of distant galaxies (and the CMBR) is maximally isotropic, and one might wonder if such a frame might serve to establish an absolute temporal foliation affecting local physics. However, this would constitute a clear violation of local Lorentz invariance and the locally Minkowskian metric of spacetime. The whole basis and motivation for the principle of relativity was the discovery that the local laws of physics take the very same homogeneous and isotropic form in terms of any system of inertial coordinates, regardless of how that system is moving relative to the “fixed stars” (to use the archaic phrase). If we hypothesize a preferred frame for distinguishing forward from backward spacelike intervals, and assert that signaling is possible along the former but not the latter, we would have a prima facie violation of the principle of relativity. It would imply that, in two identically constructed laboratories floating out in the vacuum of space, a hypothetical device capable of sending superluminal signals across the lab would work differently, depending on the motion of the lab relative to the putative absolute rest frame. If this were true, it would invalidate (or render meaningless) local Lorentz invariance.

People sometimes claim that entropy increases along some spacetime intervals and decreases along others, but that is incorrect. Entropy doesn’t increase or decrease along a spacelike interval. Spacelike-separated observables commute! Entropy increases in the positive direction along timelike intervals. In contrast, every spacelike interval is a locus of events of a stationary object in terms of some frame. For example, a ruler sitting stationary on a desk lies along a spacelike interval, but it would make no sense to claim that the entropy of one end of the ruler is different than the entropy of the other end, merely because the Earth is not at rest in terms of the Sun’s rest frame (or the isotropic CMBR frame). This would imply that the high entropy end alternates from one end to the other as the ruler is slowly rotated. Entropy progresses along timelike worldlines according to the same laws of physics in terms of every system of inertial coordinates, and at every event the same future and past light cones apply, and yet these systems do not share the same temporal foliation.

Note that it wouldn’t even make sense to declare one of these as the unique entropic foliation, because there is no single coherent entropic foliation. Consider several sealed capsules, each containing an identical closed thermodynamic system (such as a hot object next to a cold object), and suppose we transport each capsule along a different timelike path through spacetime. The entropy of each system is increasing (or not decreasing) along its timelike path, in proportion to the elapsed proper time. At some future time slice of any given inertial foliation the entropy of each capsule will have increased, but by different amounts, due to the differences in elapsed proper times along their paths, so in general their individual entropies are not synchronized on any time slice. Of course, we can also define the sum of the entropies of all the capsules on any given time slice of a given foliation, and this sum must also be non-decreasing with advancing coordinate time, but this is true for every inertial foliation, because the same laws of physics apply in terms of every such system.

Needless to say, local Lorentz invariance cannot even be formulated without assuming the existence of closed separable physical systems that can be placed into different states of motion relative to each other. Separability means that spacelike observables commute, and any influence that external material may have on the interior of the labs must be exerted by Lorentz invariant interactions, which means the influence could be replicated or shielded by entities in the lab. Hence, even by appealing to quasi-Machian effects of distant matter, the above reasoning cannot be circumvented without violating local Lorentz invariance. To make this more explicit, consider a region R of vacuum into which we introduce some matter to build a hypothetical apparatus (say, 10 meters long) that can send a superluminal signal from one end of the apparatus to the other. This apparatus (with suitable shielding, etc.) is at rest in terms of a particular inertial coordinate system S, but if we change its state of motion so the entire apparatus and shielding is at rest in the inertial coordinates S′, it will work exactly the same way in terms of S′ as it did in terms of S (principle of relativity). The vacuum has no preferred state of motion, so the only way one could argue that the apparatus would work differently when at rest in S′ than when at rest in S is by hypothesizing that its behavior is affected by the relative state of motion of some distant matter outside of the region R, but if we are restricted to ordinary physical effects that propagate causally according to special relativity these effects could be replicated or shielded by the apparatus within R. The only way to evade this is to hypothesize that the putative effects of the distant matter outside R are propagated in a non-Lorentz invariant way, which conflicts with special relativity.

As noted above, the very notion of an elapsed proper time along a spacelike interval is strictly a contradiction in terms. The proper time of an interval with inertial components dt and dx is √[(dt)² – (dx/c)²], which is imaginary for spacelike intervals. In view of this, what would be the elapsed proper time for a traveler moving along such an interval? Would his wristwatch indicate the passage of √−1 years? It’s also worth noting that the elapsed proper time for a subluminal traveler can be arbitrarily close to zero, so the elapsed proper time (if there were such a thing) for a hypothetical superluminal traveler couldn’t be any less than for the fastest subluminal traveler. Ironically, for a subluminal traveler as his proper trip time approaches zero the elapsed time in the home frame approaches L/c, whereas for a superluminal traveler as the elapsed time in the home frame approaches zero the magnitude of the elapsed proper trip time for the traveler approaches L/c (overlooking the imaginary unit).
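A short sketch makes the two limiting cases concrete. It evaluates √[(dt)² – (dx/c)²] with complex arithmetic so that spacelike intervals return an imaginary value instead of raising an error; the units (c = 1), the trip length L = 7, and the function name are assumptions made only for illustration.

```python
import cmath

def proper_time(dt, dx, c=1.0):
    """sqrt(dt^2 - (dx/c)^2) as a complex number.

    The result is real for timelike intervals and purely imaginary for
    spacelike intervals.
    """
    return cmath.sqrt(dt * dt - (dx / c) ** 2)

L = 7.0                                   # hypothetical trip length, c = 1
print(proper_time(5.0, 3.0))              # timelike interval: real value
print(proper_time(3.0, 5.0))              # spacelike interval: imaginary value
print(proper_time(L * (1 + 1e-12), L))    # subluminal, home-frame time barely above L/c
print(proper_time(0.0, L))                # superluminal limit, zero home-frame time
```

The subluminal limit gives a proper time near zero while the home-frame time is essentially L/c, whereas the superluminal limit with zero home-frame time gives an imaginary value of magnitude L/c, which is the reversal described above.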

Many people have odd ideas about travel in reverse time. For example, one internet blogger writes: “Imagine you have a particle that goes right to left backwards in time, what would it look like? It would look like a particle going left to right forward in time.” This is an odd statement, because in either case the particle is following a timelike (or, if you prefer, a reverse timelike) interval. Granted the fundamental laws of physics are (more or less) symmetrical between forward and reverse time, but they aren’t symmetrical between time and space. The particle will never be moving along a spacelike interval, regardless of which direction of time we choose. Swapping past and future light cones (advanced and retarded potentials, entropy increasing or decreasing, etc.) is one thing, but swapping timelike with spacelike intervals is quite another. If we etch the name of the winner of the Kentucky Derby on a particle after the race and send it leftward or rightward to some other location, in neither case will it arrive prior to the Derby. That would require either that the particle propagates along a spacelike interval (which is forward in coordinate time for some inertial frames and backward for others), or else a jitterbug along alternating forward and reverse timelike intervals as depicted below.

Either alternative leads to an effect preceding the cause, which “conflicts with the character of all our experience”.

The fact that particles of matter each have their own rest frames doesn't invalidate Einstein's argument in the context of special relativity. Of course, it's conceivable that special relativity is wrong and spacelike observables don't commute, etc., but the premise of Einstein's argument is that special relativity is correct, which entails the possibility of separable closed systems that can be at rest in different frames and that exhibit the symmetry of Lorentz invariance. The fact that our two inertial laboratories (discussed above) are in different states of motion relative to (say) the cosmic microwave background radiation does not provide any warrant for asserting that physics would behave detectably differently inside those labs. Any actual causal effect produced by the CMBR on the phenomena inside the labs could be replicated or shielded in the labs assuming special relativity, i.e. spacelike observables commute. If this were not true, the principle of relativity would have no operational meaning.

It may be worth emphasizing that the term “preferred frame” in discussions of the foundations of physics does not refer to the rest frame of any physical entity, but rather to a putative frame in terms of which the laws of physics supposedly take a special distinguished form. This is the kind of preferred frame that conflicts with special relativity, and this is the kind of preferred frame that people must invoke to argue that the laws of physics that govern superluminal propagation capability are isotropic in terms of just one particular frame, and non-isotropic in terms of all other frames. However, the two identically constructed labs in different states of inertial motion discussed above make it clear that the superluminal devices would work identically in terms of their respective frames. From this it follows that any superluminal propagation in the context of special relativity unavoidably leads to effects preceding their causes in a physically unacceptable sense, just as Einstein explained. Pauli repeated this argument in his 1921 encyclopedia article, saying

This would be tantamount to upsetting the concepts of cause and effect, and it can therefore be concluded that it is impossible to send out signals with a velocity greater than that of light.

R. C. Tolman’s text on relativity in 1917 repeated Einstein’s 1907 argument almost verbatim, but without attribution, so it is sometimes erroneously attributed to Tolman.

Occasionally one sees the claim that superluminal signaling can be reconciled with special relativity, without implying causal loops, by restricting the relative speed between transmitter and receiver. For example, it may be claimed that an essentially “instantaneous” signal can be sent from a material transmitter and received by a material receiver if and only if they are mutually at rest. However, it’s easy to see that this restriction doesn’t preclude causal loops as shown in the figure below.

The subluminal objects A and B are mutually at rest in the inertial coordinate system S, and the subluminal objects C and D are mutually at rest in the inertial coordinate system S′. According to the claim, it would therefore be possible for A to transmit a signal from event e₁ to the simultaneous (in S) event e₂ of B. Then B can transmit this subluminally to object C at event e₃, and C can transmit the signal to the simultaneous (in S′) event e₄ of D. But e₄ is in the causal past of event e₁, so D can send the signal subluminally to e₁, closing the loop.
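The loop can be checked with explicit coordinates. In the sketch below all numerical values (the speed of S′, the positions, and the one-unit handover delay from B to C) are hypothetical, units have c = 1, and the helper names are introduced only for illustration. It uses the fact that two events are simultaneous in a frame moving at speed v when their S coordinates satisfy dt = v·dx.

```python
from fractions import Fraction

def simultaneous_partner(event, x_target, v):
    """Event at position x_target (in S coordinates) that is simultaneous with
    `event` in a frame moving at speed v relative to S: dt' = 0 means dt = v * dx."""
    t, x = event
    return (t + v * (x_target - x), x_target)

def in_causal_past(e, of):
    """True if event e lies in or on the past light cone of event `of`."""
    dt, dx = of[0] - e[0], of[1] - e[1]
    return dt > 0 and abs(dx) <= dt

# Hypothetical values, chosen only for illustration.
v = Fraction(1, 2)                                         # speed of S' (C and D) relative to S
e1 = (Fraction(0), Fraction(0))                            # A transmits
e2 = simultaneous_partner(e1, Fraction(10), Fraction(0))   # B receives, simultaneous in S
e3 = (e2[0] + 1, e2[1])                                    # C, passing B, is handed the message
e4 = simultaneous_partner(e3, Fraction(1), v)              # D receives, simultaneous in S'

print(e2, e3, e4, in_causal_past(e4, e1))
```

With these values e₄ has S coordinates t = −7/2, x = 1, which lies inside the past light cone of e₁ at the origin, so D can relay the message subluminally to A before A sent it. Every superluminal leg connects objects that are mutually at rest, so the proposed restriction does not prevent the loop.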

Another source of confusion is the somewhat misleading use of the word “tachyon” in Lorentz invariant quantum field theory with (hypothetically) negative mass squared (i.e., imaginary mass). Because the group velocity for such a field is superluminal, some people at first mistakenly thought its excitations propagate faster than light. However, the superluminal group velocity is not the speed of propagation of any localized excitation; all localized excitations of the field propagate subluminally, so there is no possibility of superluminal signaling or conveyance of energy. Another way of expressing this is in terms of symmetrical virtual “particles” that are not actually directed signals, and permit swapping the emission and absorption events. Again, these cannot, even in principle, be used for signaling. The same applies to quantum entanglement, which can also be shown to not entail any ability to signal superluminally. This is all self-evident from the start, since quantum field theory is Lorentz invariant, and spacelike observables commute – but nevertheless such fields are sometimes called "tachyons" in the literature. This is not to be confused with the science fiction usage of the word tachyon referring to putative superluminal signaling, which would imply causal loops.

On a historical note, one may wonder why Einstein was careful to say (repeatedly) that an effect preceding its cause is not a logical impossibility, although it is conclusively ruled out based on the fact that it would conflict with the character of all our experience. Here we see the influence of David Hume, who famously wrote about the conventionality of causation. Hume’s skepticism and willingness to challenge the most fundamental notions made a strong impression on the young Einstein. (He later wrote that Hume had been more important in his thinking than Mach.) Some would contend that “conflict with the character of all our experience” is a weak basis for reaching a conclusion, but of course this is the basis of all empirical science. Fundamental physical principles, such as local conservation of energy and momentum, the principle of relativity, local Lorentz invariance, etc., are not logical necessities; they are simply expressions of some of the most consistent aspects of our experience of the world. Science consists of inferring the principles of natural phenomena by induction, and then applying those principles to deduce other phenomena. This is the inductive-deductive process. Scientific induction of general physical principles is always incomplete (how could it not be?), and it is useful only to the extent that natural phenomena exhibit strong patterns and fundamental symmetries. If the world were pure chaos, then induction would have no value… but we don’t live in such a world. It makes no sense to say we should refrain from formulating (or provisionally believing in) fundamental principles, such as the local conservation of energy and momentum or local Lorentz invariance, until they have been verified in every individual past, current, and future instance. That would be a purely anti-scientific (and even irrational) position. As Newton said

In experimental philosophy we are to look upon propositions inferred by general induction from phenomena as accurately or very nearly true, notwithstanding any contrary hypotheses that may be imagined, till such time as other phenomena occur, by which they may either be made more accurate or liable to exception. This rule we must follow, that the argument of induction may not be evaded by hypotheses.

Another objection that is sometimes raised is that as long as we do not have a complete and final theory of the universe (e.g., as long as there are unresolved issues involving dark matter, dark energy, inflation, and the reconciliation between general relativity and quantum mechanics) we cannot rule out superluminal propagation. This is the standard sophistry of science denialists, arguing that until we know everything we don’t know anything. Also, as noted above, it’s conceivable that special relativity is wrong and Lorentz invariance fails. It so happens that the empirical support for it is overwhelming, making it one of the most secure principles of physics, but that isn’t the topic we are discussing. Rather, we are considering here whether, under the assumption that special relativity and local Lorentz invariance are true, it rules out superluminal travel. The answer is yes, for all the reasons that have been described here (e.g., the proper time along every spacelike interval is imaginary). Causal paradoxes are just one of infinitely many absurd consequences that can be derived from the counterfactual premise of superluminal signaling or travel in the context of special relativity.
