## 1 Introduction

I discuss the thermal structure of the mantle as a guide to interpreting tomograms. That is, I ask what features in the Earth are likely to be detected tomographically. I presume the existence of mantle-wide convection with slabs and plumes. Crudely, the lower mantle consists of descending slabs and graveyards of slabs ponded near its base (Fig. 1). Romanowicz and Gung [30] call the hotter regions between slabs and slab graveyards superplumes. I retain this term as ‘superplume regions’ for lack of a better one. Hot spots preferentially occur above superplume regions, rather than above slab graveyards. The mantle at 250–500-km depth is also hotter than its surroundings above superplumes. This implies, as expected, that mantle plumes arise from places at the base of the mantle where the basal thermal boundary layer, D″, is thick.

The question arises as to whether deep superplume regions are hotter than the upper mantle (mid-oceanic ridge basalt, MORB) adiabat, rather than merely the surrounding slab graveyards at great depth. To address this issue, I begin with the average geothermal gradient in the mantle away from slabs and plumes in Section 2. I then discuss subduction with regard to the width of slabs in the lower mantle and the temperature within slab graveyards deep in the lower mantle in Section 3. I then discuss the persistence of lateral temperature anomalies associated with superplume regions in the deep mantle in Section 4. I finally discuss the ascent of starting plume heads as a feature that might be detected by tomography in Section 5 and return to the related problem of the persistence of superplume regions.

I compile and adapt dimensional and analytical methods in this regard. My intent is to make easily understood general conclusions that do not obscure gross observable features of convection with complex mathematics and numerical models. This work serves as a guide for interpreting numerical studies of convection in regard to tomography and vice versa. Tomographers and fluid dynamicists are well aware of the need to act together in a self-consistent manner (e.g., [16,17,22–24,30]).

## 2 Geothermal gradient at mid-mantle depths

Global tomography images the broad-scale structure of the Earth's mantle. One finds velocities and attenuation of seismic waves and, to some extent, the density and the anisotropy. These measurements constrain the thermal structure as seismic velocity decreases and attenuation increases with temperature. Current tomograms resolve slabs, slab graveyards, and superplume regions. Our lack of knowledge of mineral physics limits what can be directly inferred from these results.

Specifically, common methods of plotting tomograms are potentially misleading. To show these difficulties, I formally represent the shear wave velocity as a function of depth and temperature,

$$\beta =\beta ({\mathrm{Z}}_{\mathrm{ref}},{\mathrm{T}}_{\mathrm{ref}})+{\left(\frac{\partial \beta}{\partial \mathrm{T}}\right)}_{Z}(\mathrm{T}-{\mathrm{T}}_{\mathrm{ref}})+{\left(\frac{\partial \beta}{\partial \mathrm{Z}}\right)}_{S}(\mathrm{Z}-{\mathrm{Z}}_{\mathrm{ref}})$$ | (1) |

where T is the potential temperature, T_{ref} is a convenient reference potential temperature (like the MORB adiabat), Z is depth, Z_{ref} is a reference depth, and S is entropy. Similar formal expressions exist for density, P-wave velocity and the attenuation of both types of waves.

In practice, the depth derivatives are unknown, particularly for the lower mantle. Tomographers thus often explicitly center their plots on the mean value at a given depth. Tomographers are well aware that plots generated with respect to local depth averages need to be carefully interpreted if the actual average thermal gradient in the mantle is subadiabatic at mid-mantle depths as shown later in this section. For example, material hotter than the average at 2400-km depth may be cooler than the shallow MORB adiabat and not able to ascend buoyantly into upper mantle having a cooler color on the plot.

The temperature derivatives may also be unknown, especially if trace phases like partial melt affect seismic velocity and attenuation. To partly adjust for this problem and to show variations at all depths on the same plot, tomographers may explicitly weight color schemes to cover the observed range at each depth. For example, the same color range may represent a 10% variation at 50-km depth but only 1% at 2000-km depth.
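The two plotting conventions described above can be made concrete with a small sketch. The Python below is illustrative only: the function name `depth_normalized` and the velocity values are hypothetical and are not taken from any tomographic model. It removes the mean at each depth and then rescales each depth to its own observed range, which is why equal colors at different depths need not imply equal velocity (or temperature) anomalies.

```python
def depth_normalized(velocities_by_depth):
    """Return (percent anomalies, rescaled anomalies) for each depth.

    velocities_by_depth: list of lists, one list of shear velocities per depth.
    """
    percent, rescaled = [], []
    for row in velocities_by_depth:
        mean = sum(row) / len(row)
        # Anomaly relative to the local depth average, in percent.
        pct = [100.0 * (v - mean) / mean for v in row]
        # Largest absolute anomaly at this depth (guard against a flat row).
        span = max(abs(p) for p in pct) or 1.0
        percent.append(pct)
        rescaled.append([p / span for p in pct])
    return percent, rescaled


# Hypothetical shear velocities (km/s) at a shallow and a deep level.
shallow = [4.05, 4.50, 4.95]   # about +/-10% about the mean
deep = [7.00, 7.07, 7.14]      # about +/-1% about the mean
percent, rescaled = depth_normalized([shallow, deep])
```

After rescaling, the fastest point at each depth maps to the same end of the color scale even though the underlying anomalies differ by an order of magnitude.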

It is well known that the thermal gradient of a fluid heated from within is subadiabatic in the middle of the convecting region (e.g., [29]). Whole mantle convection behaves in this way as the major heat sources are radioactive heat generation and internal cooling (e.g., [8,12,13]). The core provides only a modest amount of heating from below [12,13,34]. The function of plate tectonics is to cool the mantle and that of plumes is to cool the core [13].

To estimate the mid-mantle thermal gradient, I begin with the heat flow equation,

$$\rho \phantom{\rule{1.69998pt}{0ex}}C\phantom{\rule{1.69998pt}{0ex}}\frac{\partial \mathrm{T}}{\partial \mathrm{t}}+\rho \phantom{\rule{1.69998pt}{0ex}}C\phantom{\rule{1.69998pt}{0ex}}\mathrm{V}\cdot \nabla \mathrm{T}=\mathrm{k}\phantom{\rule{1.69998pt}{0ex}}{\nabla}^{2}\mathrm{T}+\rho \phantom{\rule{1.69998pt}{0ex}}A$$ | (2) |

where ρ is density, C is specific heat, t is time, V is velocity, k is thermal conductivity, and A is the radioactive heat generation per unit mass.

I follow a batch of material that was once part of a slab to obtain a quantitative estimate of the mid-mantle geothermal gradient (Fig. 1). The slab material sinks (negatively) buoyantly into the deep mantle. Its temperature anomaly spreads out as the material ponds near the base of the mantle. Thereafter, the conduction term in Eq. (2) is small. The velocity term in Eq. (2) is zero by definition if one sets the coordinate system to follow the material. The heat flow equation (2) then reduces to static heating by radioactive heat generation,

$$\frac{\mathrm{DT}}{\mathrm{Dt}}=\frac{A}{C}$$ | (3) |

The interior of the actual Earth is likely to have cooled over geologic time [1,4,18]. In this case, cooling supplies heat to the surface. A subadiabatic gradient exists as with radioactive heating. The material now arriving near the surface has a higher potential temperature than deep material that will arrive say 1000 million years from now. Equivalently, the average temperature of the material that descended at say ∼2000 Myr and is now reaching the surface was higher than the average temperature of material that descended at ∼1000 Myr and will reach the surface in 1000 million years.

Combining the long-term cooling rate of the Earth's mantle, ∂T/∂t, and radioactive heating, the departure ΔT of the geotherm across the mantle from adiabatic is

$$\Delta \mathrm{T}=\left[-\frac{\partial \mathrm{T}}{\partial \mathrm{t}}+\frac{A}{C}\right]\Delta t$$ | (4) |

where Δt is the time for material to cycle through the mantle. The corresponding departure of the gradient from adiabatic is

$$\frac{\Delta T}{M}=\left[-\frac{\partial \mathrm{T}}{\partial \mathrm{t}}+\frac{A}{C}\right]\frac{\Delta t}{M}=\frac{1}{{V}_{Z}}\left[-\frac{\partial \mathrm{T}}{\partial \mathrm{t}}+\frac{A}{C}\right]$$ | (5) |

where M is the thickness of the mantle and V_{Z} is the vertical velocity.

One obtains the subadiabatic gradient (as ΔT/Δt) from Eqs. (4) and (5) from the surface heat flow without knowing the fraction of heat supplied by radioactivity because the sum of these effects is involved. In particular, the magnitude of the departure of the thermal gradient from adiabatic relates to the heat balance in the Earth's mantle. The average surface heat flow coming from the mantle is:

$${q}_{\mathrm{S}}=\rho \phantom{\rule{1.69998pt}{0ex}}C\phantom{\rule{1.69998pt}{0ex}}\Delta T\phantom{\rule{1.69998pt}{0ex}}{V}_{Z}$$ | (6) |

$$4\phantom{\rule{1.69998pt}{0ex}}\pi \phantom{\rule{1.69998pt}{0ex}}{r}_{\mathrm{E}}^{2}\phantom{\rule{1.69998pt}{0ex}}{q}_{\mathrm{S}}=4\phantom{\rule{1.69998pt}{0ex}}\pi \phantom{\rule{1.69998pt}{0ex}}{r}_{\mathrm{E}}^{2}\phantom{\rule{1.69998pt}{0ex}}M\phantom{\rule{1.69998pt}{0ex}}\rho \phantom{\rule{1.69998pt}{0ex}}C\phantom{\rule{1.69998pt}{0ex}}\Delta \mathrm{T}/\Delta t$$ | (7) |

where r_{E} is the radius of the Earth and $4\phantom{\rule{1.69998pt}{0ex}}\pi \phantom{\rule{1.69998pt}{0ex}}{r}_{\mathrm{E}}^{2}\phantom{\rule{1.69998pt}{0ex}}M\phantom{\rule{1.69998pt}{0ex}}\rho $ is the mass of the mantle.

Sclater et al. [33] have estimated the heat flow from the Earth. Following Davies [13], I quickly show the essence of the method. The Earth's surface is 0.6 ocean basins and 0.4 continents. The oceanic heat flow is approximately 500 mW m^{−2} divided by the square root of age in million years. Currently oceanic crust recycles at a rate of 3 km^{2} yr^{−1}, which would replace the ocean basins in 100 Myr. The average heat flow in the ocean basins is thus ∼100 mW m^{−2}. The continental heat flow (not associated with crustal radioactivity) is less well constrained to ∼25 mW m^{−2}. The global average is thus ∼70 mW m^{−2}. About 0.1 of this heat flow comes ultimately from detected plumes [12,34]. This leaves ∼63 mW m^{−2} from mantle radioactivity and mantle cooling. As the heat capacity of the mantle is ∼1.25 kJ kg^{−1} K^{−1} [36] and the mass of the mantle is 4×10^{24} kg, this is equivalent to a heat production of 8×10^{−12} W kg^{−1} or to a cooling rate of 0.2 K Myr^{−1} or a combination thereof.
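The arithmetic of this heat budget is easy to reproduce. The following Python is a minimal sketch, not part of the original analysis: the Earth radius and the seconds-per-Myr conversion are assumed round values, the ocean floor is assumed to have a uniform distribution of ages between 0 and 100 Myr, and the heat capacity is taken per unit mass, as the mass-based arithmetic requires.

```python
import math

# Values quoted in the text; EARTH_RADIUS_M and SECONDS_PER_MYR are assumed.
EARTH_RADIUS_M = 6.371e6
SECONDS_PER_MYR = 3.156e13
MANTLE_MASS_KG = 4.0e24
SPECIFIC_HEAT = 1250.0          # J kg^-1 K^-1


def mean_oceanic_flux(coeff, max_age_myr):
    """Average of coeff/sqrt(age) over ages spread uniformly on 0..max_age.

    The integral of age**-0.5 from 0 to T is 2*sqrt(T); dividing by T gives
    the mean 2*coeff/sqrt(T).
    """
    return 2.0 * coeff / math.sqrt(max_age_myr)


ocean_flux = mean_oceanic_flux(500.0, 100.0)           # mW m^-2
global_flux = 0.6 * ocean_flux + 0.4 * 25.0            # mW m^-2
mantle_flux = (1.0 - 0.1) * global_flux                # excluding plumes

# Convert the mantle flux to an equivalent heat production or cooling rate.
surface_area = 4.0 * math.pi * EARTH_RADIUS_M ** 2     # m^2
power = surface_area * mantle_flux * 1.0e-3            # W
heat_production = power / MANTLE_MASS_KG               # W kg^-1
cooling_rate = heat_production / SPECIFIC_HEAT * SECONDS_PER_MYR  # K Myr^-1
```

With these inputs the sketch returns the round numbers used in the text: about 100, 70 and 63 mW m^{−2}, about 8×10^{−12} W kg^{−1}, and about 0.2 K Myr^{−1}.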

I estimate the return time for the flow Δt from the geometry of plate tectonics. Assuming that the slab and the material entrained with it are 200-km thick, 600 km^{3} with a mass of 2×10^{15} kg is subducted each year. The mantle (4×10^{24} kg) cycles on a time scale Δt of 2000 Myr, implying that ΔT is ∼400 K (from Eq. (7), using q_{S}=63 mW m^{−2} and C=1.25 kJ kg^{−1} K^{−1}). That is, the thermal gradient in the mantle is likely to be significantly subadiabatic, and tomograms plotted with respect to the mean velocity at a given depth should not be construed as diagrams with respect to the MORB adiabat.
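The cycle time and the resulting departure from the adiabat follow from Eq. (7) by simple arithmetic. The sketch below is illustrative: the function names are hypothetical, the Earth radius is an assumed round value, and the density of 3300 kg m^{−3} is an assumption chosen because it reproduces the quoted 2×10^{15} kg for 600 km^{3}.

```python
import math

EARTH_RADIUS_M = 6.371e6        # assumed round value
MANTLE_MASS_KG = 4.0e24         # from the text
SPECIFIC_HEAT = 1250.0          # J kg^-1 K^-1, per unit mass
SECONDS_PER_YEAR = 3.156e7


def cycle_time_years(area_rate_km2_per_yr, thickness_km, density_kg_m3):
    """Time for the whole mantle mass to pass through subduction zones."""
    volume_rate_m3 = area_rate_km2_per_yr * thickness_km * 1.0e9  # km^3 -> m^3
    return MANTLE_MASS_KG / (volume_rate_m3 * density_kg_m3)


def subadiabatic_departure(q_s_w_m2, cycle_time_yr):
    """Delta T from Eq. (7): surface heat loss over one cycle / heat capacity."""
    surface_power = 4.0 * math.pi * EARTH_RADIUS_M ** 2 * q_s_w_m2
    heat_lost = surface_power * cycle_time_yr * SECONDS_PER_YEAR
    return heat_lost / (MANTLE_MASS_KG * SPECIFIC_HEAT)


cycle_time = cycle_time_years(3.0, 200.0, 3300.0)      # roughly 2e9 yr
delta_t_kelvin = subadiabatic_departure(0.063, 2.0e9)  # roughly 400 K
```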

To summarize, the average geotherm in the mantle (away from slabs and plumes) consists of a superadiabatic conductive geothermal gradient in the lithosphere, a superadiabatic thermal gradient within the D″ thermal boundary layer at the base of the mantle, and a subadiabatic thermal gradient within the intervening mantle (Fig. 2). A chemically denser layer may underlie the convecting part of D″, but is not shown for simplicity. The existence of hot spots associated with mantle plumes indicates that the potential temperature within the convecting part of D″ is hotter than the MORB adiabat.

## 3 Slab descent

The width of slabs as they descend is detectable by tomography. I quantify the width of slabs sinking into the lower mantle using simple heat and mass transfer methods. I constrain the temperature anomaly associated with slabs in this process.

### 3.1 Force balance and broadening of the slab with depth

Steinberger [37] dynamically models the descent of slabs into the lower mantle. I use dimensional scaling to represent the essence of this process in the case that viscosity increases strongly with depth. I begin with the velocity at which a slab sinks through the lower mantle. For simplicity, my slab sinks vertically (Fig. 3). The slab is able to deform so that its width increases as it sinks into more viscous material. The widening may occur by buckling and folding; I do not consider the mechanical details of widening, as they are not susceptible to simple dimensional analysis and are not likely to be resolved in tomograms. The shear traction on the sides of the slab balances the body force producing sinking, dimensionally:

$$\tau \approx \Delta \rho \phantom{\rule{1.69998pt}{0ex}}g\phantom{\rule{1.69998pt}{0ex}}W$$ | (8) |

$$\tau \approx \frac{\eta \phantom{\rule{1.69998pt}{0ex}}{V}_{\mathrm{S}}}{L}$$ | (9) |

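where τ is the shear traction, Δρ is the density contrast between the slab and its surroundings, g is the acceleration of gravity, W is the width of the slab, η is the viscosity, the sinking velocity of the slab is V_{S}, and L is the length scale for the flow. At steady state, the volume per length flux ${V}_{\mathrm{S}}\phantom{\rule{1.69998pt}{0ex}}W$ of slab material is constant with depth. Solving Eqs. (8) and (9) with this constraint (keeping the density contrast constant, as it is likely to change much less than viscosity) yields that the width of the slab increases as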

$$\mathrm{W}={\mathrm{W}}_{\mathrm{ref}}{\left[\frac{\eta}{{\eta}_{\mathrm{ref}}}\right]}^{1/2}$$ | (10) |

where W_{ref} is the width of the slab at a reference viscosity η_{ref}. The velocity is:

$${V}_{\mathrm{S}}={\mathrm{V}}_{\mathrm{ref}}{\left[\frac{\eta}{{\eta}_{\mathrm{ref}}}\right]}^{-1/2}=\Delta \rho \phantom{\rule{1.69998pt}{0ex}}g\phantom{\rule{1.69998pt}{0ex}}{W}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}\frac{L}{{\eta}^{1/2}\phantom{\rule{1.69998pt}{0ex}}{\eta}_{\mathrm{ref}}^{1/2}}$$ | (11) |

where V_{ref} is the velocity of the slab at a reference viscosity η_{ref}. The leading term $\Delta \rho \phantom{\rule{1.69998pt}{0ex}}g\phantom{\rule{1.69998pt}{0ex}}{W}_{\mathrm{ref}}$ is the (negative) buoyancy per surface area of the slab. This quantity scales with the square root of (subducted) plate age if the reference depth is taken near the top of the lower mantle where the slab is still behaving somewhat rigidly.

To obtain closed-form results, I assume that the viscosity is given by the simple function:

$$\eta ={\eta}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}\mathrm{exp}\left[\frac{\delta \mathrm{T}}{{T}_{\eta}}\right]\mathrm{exp}\left[\frac{Z}{{D}_{\eta}}\right]$$ | (12) |

where δT is the temperature deficit relative to the adiabat, T_{η} is the temperature scale for the viscosity, Z is depth below some reference, and D_{η} is the depth scale for the viscosity. The actual variation of viscosity with depth (as would be determined by a glacial rebound study) includes the effect of a subadiabatic gradient:

$$\delta \mathrm{T}=\frac{\Delta T\phantom{\rule{1.69998pt}{0ex}}Z}{M}$$ | (13) |

$$\eta ={\eta}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}\mathrm{exp}\left[\frac{\Delta T\phantom{\rule{1.69998pt}{0ex}}Z}{M\phantom{\rule{1.69998pt}{0ex}}{T}_{\eta}}\right]\mathrm{exp}\left[\frac{Z}{{D}_{\eta}}\right]\equiv {\eta}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}\mathrm{exp}\left[\frac{Z}{{D}_{\mathrm{app}}}\right]$$ | (14) |

where D_{app} is the apparent depth scale for viscosity in the mantle. The time for a slab to sink to some depth Z below the reference depth is:

$${t}_{\mathrm{sink}}=\underset{{Z}_{0}}{\stackrel{Z}{\int}}\frac{\mathrm{d}z}{V}\approx \frac{2\phantom{\rule{1.69998pt}{0ex}}{D}_{\mathrm{app}}}{{V}_{\mathrm{ref}}}\mathrm{exp}\left[\frac{Z}{2\phantom{\rule{1.69998pt}{0ex}}{D}_{\mathrm{app}}}\right]\approx \frac{2\phantom{\rule{1.69998pt}{0ex}}{D}_{\mathrm{app}}\phantom{\rule{1.69998pt}{0ex}}{\eta}_{\mathrm{ref}}}{\Delta \rho \phantom{\rule{1.69998pt}{0ex}}g\phantom{\rule{1.69998pt}{0ex}}{W}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}L}\mathrm{exp}\left[\frac{Z}{2\phantom{\rule{1.69998pt}{0ex}}{D}_{\mathrm{app}}}\right]$$ | (15) |

where Z_{0} is the starting depth. The two approximate equalities that apply to the time for a slab to sink into the deep mantle take the lower limit of the integral as −∞.

Eqs. (10) and (15) suffice to illustrate some basic points about the expected properties of slabs within the lower mantle. The increase of viscosity across the lower mantle is unknown but may be significant. For example, Lambeck and Johnson [21] give a constant viscosity of 10^{22} Pa s in the lower mantle while Forte and Mitrovica [17] give three orders of magnitude change with a maximum around 2000 km depth. Steinberger [37,38] gives an increase of a factor of 40 to 4×10^{22} Pa s.

To obtain estimates of sinking time and slab width, I assume that viscosity increases with depth as in Eq. (14). I consider increases of 2 and 3 orders of magnitude from 700 to 2700 km depth, to leave 200-km depth for D″. The depth scales D_{app} are 434 and 290 km, respectively. The sinking times from Eq. (15), assuming that the vertical plate velocity at ∼700 km, V_{ref}, is 50 km Myr^{−1}, are 173 and 367 Myr, respectively. The material spends 63% of this time in the bottom $2\phantom{\rule{1.69998pt}{0ex}}{D}_{\mathrm{app}}$ of the mantle, so the times to reach this ponding level are 63 and 135 Myr. Slab material sinks through the lower mantle in times that are short compared with the 2000-Myr cycle time of the mantle. The sinking times for the two-orders-of-magnitude contrast are reasonable. The sinking times for the three-orders-of-magnitude contrast are excessive in that tomograms correlate with the last 120–200 Myr of subduction [7]. The two viscosity contrasts imply factors of 10 and 30 broadening of the slab in Eq. (10), or increases in width from ∼200 to ∼2000 and ∼6000 km, respectively. The latter increase is large enough to grossly violate the assumption in the force balance that the slab is thin compared to the other length scales in the mantle. Rather, the slab would sink through the relatively low-viscosity regions in the uppermost lower mantle and then pond and spread out slowly in the lowermost lower mantle. The sinking time of 367 Myr is thus an overestimate, indicating that three orders of magnitude of increase are excessive. I thus use the two-orders-of-magnitude increase as a plausible rounded estimate for a significant change in viscosity in subsequent calculations. I return to the gradual spreading of deeply ponded slabs in Section 4.
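These estimates follow directly from Eqs. (10), (14) and (15). The Python below is a minimal sketch that reproduces them; the function names are hypothetical, and the only inputs are the 2000-km depth interval, the viscosity contrast, and the reference velocity of 50 km Myr^{−1} quoted above.

```python
import math


def depth_scale(interval_km, viscosity_ratio):
    """Apparent depth scale D_app for an exponential viscosity increase."""
    return interval_km / math.log(viscosity_ratio)


def sinking_time(depth_km, d_app_km, v_ref_km_per_myr):
    """Eq. (15) with the lower limit of the integral taken as -infinity."""
    return 2.0 * d_app_km / v_ref_km_per_myr * math.exp(depth_km / (2.0 * d_app_km))


def broadening(viscosity_ratio):
    """Eq. (10): slab width grows as the square root of viscosity."""
    return math.sqrt(viscosity_ratio)


results = {}
for ratio in (1.0e2, 1.0e3):
    d_app = depth_scale(2000.0, ratio)
    t_total = sinking_time(2000.0, d_app, 50.0)
    # Time to reach the top of the bottom 2*D_app, where the slab ponds;
    # the exponential in Eq. (15) makes this exp(-1) of the total time.
    t_pond = sinking_time(2000.0 - 2.0 * d_app, d_app, 50.0)
    results[ratio] = (d_app, t_total, t_pond, broadening(ratio))
```

Because the sinking time grows exponentially with depth, the fraction of time spent in the bottom 2 D_{app} is 1 − e^{−1} ≈ 63% regardless of the viscosity contrast.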

### 3.2 Heat conduction and the temperature of deep slabs

Conduction tends to spread out the thermal anomaly associated with the slab. I constrain the average thermal anomaly of dead slabs and hence the effective amount of material that is entrained with slabs. I begin with a slab (that started as a line source) that broadens only by conduction. Mathematically, the temperature is proportional to $\mathrm{exp}(-{\mathrm{x}}^{2}/4\phantom{\rule{1.69998pt}{0ex}}\kappa \phantom{\rule{1.69998pt}{0ex}}\mathrm{t})$, where $\kappa \equiv \mathrm{k}/\rho \phantom{\rule{1.69998pt}{0ex}}C$ is thermal diffusivity and x is the distance from the center of the slab (Eq. (4-163) of [43]). The half width of the anomaly is $\sim 2\phantom{\rule{1.69998pt}{0ex}}\sqrt{\kappa \phantom{\rule{1.69998pt}{0ex}}t}$. The full width of the slab at some time is then:

$$\mathrm{W}=\mathrm{a}\phantom{\rule{1.69998pt}{0ex}}\sqrt{\kappa \phantom{\rule{1.69998pt}{0ex}}t}$$ | (16) |

where a is a constant (a ≈ 4 from the half width above). The width then grows at the rate

$$\frac{\mathrm{DW}}{\mathrm{Dt}}=\frac{{a}^{2}\phantom{\rule{1.69998pt}{0ex}}\kappa}{2\phantom{\rule{1.69998pt}{0ex}}W}$$ | (17) |

As the product of the width and the average temperature anomaly T is conserved, the fractional changes of the two quantities are equal and opposite:

$$\frac{\mathrm{DW}}{W\phantom{\rule{1.69998pt}{0ex}}\mathrm{Dt}}=\frac{-\mathrm{DT}}{T\phantom{\rule{1.69998pt}{0ex}}\mathrm{Dt}}=\frac{{a}^{2}\phantom{\rule{1.69998pt}{0ex}}\kappa}{2\phantom{\rule{1.69998pt}{0ex}}{W}^{2}}$$ | (18) |

Time and depth along the sinking slab are related by

$$\frac{\partial \mathrm{t}}{\partial \mathrm{Z}}=\frac{1}{{V}_{S}}$$ | (19) |

Combining Eqs. (18) and (19) with the constant flux V W = V_{ref} W_{ref} gives

$$\frac{\partial \mathrm{W}}{W\phantom{\rule{1.69998pt}{0ex}}\partial \mathrm{Z}}=\frac{-\partial \mathrm{T}}{T\phantom{\rule{1.69998pt}{0ex}}\partial \mathrm{Z}}=\frac{{a}^{2}\phantom{\rule{1.69998pt}{0ex}}\kappa}{2\phantom{\rule{1.69998pt}{0ex}}\left(\mathrm{V}\phantom{\rule{1.69998pt}{0ex}}\mathrm{W}\right)\phantom{\rule{1.69998pt}{0ex}}W}=\frac{{a}^{2}\phantom{\rule{1.69998pt}{0ex}}\kappa}{2\phantom{\rule{1.69998pt}{0ex}}\left({\mathrm{V}}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}{W}_{\mathrm{ref}}\right)\phantom{\rule{1.69998pt}{0ex}}W}$$ | (20) |

Integrating over depth, with the width increasing with viscosity as in Eqs. (10) and (14), yields

$$\mathrm{ln}\left[\frac{{T}_{\mathrm{ref}}}{{T}_{\mathrm{final}}}\right]=\underset{0}{\stackrel{Z}{\int}}\frac{{a}^{2}\phantom{\rule{1.69998pt}{0ex}}\kappa}{2\phantom{\rule{1.69998pt}{0ex}}{V}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}{W}_{\mathrm{ref}}^{2}}\mathrm{exp}\left[\frac{-\mathrm{z}}{2\phantom{\rule{1.69998pt}{0ex}}{Z}_{\mathrm{app}}}\right]\phantom{\rule{1.69998pt}{0ex}}\mathrm{d}\mathrm{z}\approx \frac{{a}^{2}\phantom{\rule{1.69998pt}{0ex}}\kappa \phantom{\rule{1.69998pt}{0ex}}{Z}_{\mathrm{app}}}{{V}_{\mathrm{ref}}\phantom{\rule{1.69998pt}{0ex}}{W}_{\mathrm{ref}}^{2}}$$ | (21) |

where the temperature anomaly is T_{ref} at the reference depth 0, T_{final} is the final temperature anomaly at depth Z beneath the reference depth, and the upper limit of the integral is infinity in the final approximation. Little heat conduction occurs in the lower mantle if viscosity increases with depth. For example, I let the viscosity depth scale Z_{app} (D_{app} in Eq. (14)) be 434 km, the initial width of the slab W_{ref} be 200 km, the vertical plate velocity at ∼700 km V_{ref} be 50 km Myr^{−1}, and the thermal diffusivity be 10^{−6} m^{2} s^{−1}. The relative temperature change is 11%, a modest amount. That is, the average temperature anomaly broadens little from conduction between the time the slab enters the lower mantle and the time it ponds near the base of the mantle amongst slightly older slabs.
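The 11% figure can be checked from Eq. (21). The sketch below is illustrative: the function name is hypothetical, and the constant a = 4 is an assumption taken from the full width being twice the half width $2\sqrt{\kappa t}$ quoted above. The optional `depth_km` argument evaluates the integral to a finite depth rather than to infinity.

```python
import math

SECONDS_PER_MYR = 3.156e13   # assumed conversion factor


def log_temperature_ratio(kappa_m2_s, z_app_km, v_ref_km_myr, w_ref_km,
                          a=4.0, depth_km=None):
    """ln(T_ref / T_final) from Eq. (21).

    a = 4 is an assumed value; depth_km=None uses the infinite upper limit.
    """
    kappa = kappa_m2_s * 1.0e-6 * SECONDS_PER_MYR      # m^2/s -> km^2/Myr
    full = a ** 2 * kappa * z_app_km / (v_ref_km_myr * w_ref_km ** 2)
    if depth_km is None:
        return full
    # Exact integral of exp(-z / (2 Z_app)) from 0 to depth_km.
    return full * (1.0 - math.exp(-depth_km / (2.0 * z_app_km)))


log_ratio = log_temperature_ratio(1.0e-6, 434.0, 50.0, 200.0)
log_ratio_2000 = log_temperature_ratio(1.0e-6, 434.0, 50.0, 200.0,
                                       depth_km=2000.0)
fractional_drop = 1.0 - math.exp(-log_ratio)
```

The logarithmic ratio is about 0.11, so the fractional drop in the temperature anomaly is of the same order; truncating the integral at 2000 km reduces it only slightly.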

Viewed in another way, the breadth of the thermal anomaly of a slab and the volume of cool material within it result mostly from conduction while the slab was still oceanic lithosphere rather than conduction after subduction. The same feature exists for mantle plumes [15]. Starting plume heads and plume conduits entrain only minor amounts of material once these leave the boundary layer. This gives some confidence in ignoring entrainment during buckling and folding of slabs because starting plume heads deform greatly during their ascent.

The 400-K, 200-km thick average anomaly in slabs and the cycle time of Δt of 2000 Myr that I obtained in Section 2 are reasonable round estimates since significant entrainment does not occur at midmantle depths. A combination of 300 K, 1500 Myr, and 267 km would be equally good. Geochemical data and our knowledge of past plate tectonics are not now good enough to get a refined estimate.
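The trade-off between these round estimates follows from two proportionalities in Section 2: for a fixed surface heat flow, ΔT scales with Δt in Eq. (7), and for a fixed rate of plate consumption and mantle mass, the thickness of subducted material scales inversely with Δt. The following sketch (the function name is hypothetical) applies both scalings to the reference combination.

```python
def equivalent_combination(cycle_time_myr, ref_delta_t=400.0,
                           ref_thickness_km=200.0, ref_cycle_myr=2000.0):
    """Scale the reference (400 K, 200 km, 2000 Myr) set to a new cycle time.

    Delta T is proportional to the cycle time (fixed heat flow); the slab
    thickness is inversely proportional to it (fixed plate area rate).
    """
    scale = cycle_time_myr / ref_cycle_myr
    return ref_delta_t * scale, ref_thickness_km / scale


delta_t, thickness = equivalent_combination(1500.0)   # about 300 K and 267 km
```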

## 4 Slow spreading of slab graveyards

Broad regions of high seismic velocity and low attenuation are prominent features in the deep lower mantle in global tomograms. They occur where many slabs have descended over the last ∼180 Myr for which reliable reconstructions are available [6,26,28,37]. Conversely, superplume regions of low seismic velocity and high attenuation exist elsewhere where slabs have not impinged. Surface hot spots preferentially overlie the slab-free regions [9].

The first-order physics of the process are fairly simple. Newly subducted slab material ponds within recently arrived slabs near the base of the mantle. The viscosity of the lower mantle is on average a factor of ∼20 higher than that of the upper mantle, so that the slabs sink significantly slower than surface plate rates [6,44]. The viscosity of the slab graveyard and the surrounding mantle is sufficient to prevent it from quickly spreading laterally along the base of the mantle like dense oceanic bottom water. Rather, tomography detects dead slabs in the deep mantle as well as currently sinking slabs.

### 4.1 Time scale of slab graveyards

The existence of slab graveyards constrains the viscosity of the deep mantle. That is, a graveyard grows as a pile until lateral flow comes into balance with subduction [6] (Fig. 4). I represent the effect by denoting the local elevation of the top of the slab pile relative to its global mean as H. I denote a term of its spherical harmonic expansion of order m as H_{m}. Assuming a linear fluid where lateral viscosity variations are unimportant, a representative harmonic term evolves as

$$\frac{\partial {\mathrm{H}}_{m}}{\partial \mathrm{t}}=-\frac{{H}_{m}}{{t}_{m}}+{\mathrm{F}}_{m}$$ | (22) |

where t_{m} is the time constant and F_{m} is the flux (equivalent thickness per time) supplied by slabs for the harmonic. At steady state, the height of the harmonic is

$${\left[{\mathrm{H}}_{m}\right]}_{\mathrm{ss}}={\mathrm{F}}_{m}\phantom{\rule{1.69998pt}{0ex}}{\mathrm{t}}_{m}$$ | (23) |

The flux F_{m} is known from reconstructions over the last 200 Myr and the height H_{m} is known from tomography [28].
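For a constant flux, Eq. (22) has a simple closed-form solution that relaxes toward the steady state obtained by setting the time derivative to zero, H_{m} = F_{m} t_{m}. The sketch below illustrates this behavior; the function name and the numerical values (a flux of 1 km Myr^{−1} and a time constant of 150 Myr) are hypothetical and are chosen only to show the shape of the solution.

```python
import math


def pile_height(t_myr, flux_km_per_myr, t_m_myr, h0_km=0.0):
    """Solution of dH/dt = -H/t_m + F for constant F and H(0) = h0."""
    h_steady = flux_km_per_myr * t_m_myr      # steady state, H = F * t_m
    return h_steady + (h0_km - h_steady) * math.exp(-t_myr / t_m_myr)


# Hypothetical values for illustration only.
FLUX = 1.0       # km Myr^-1
T_M = 150.0      # Myr

h_after_one_time_constant = pile_height(T_M, FLUX, T_M)
h_late = pile_height(20.0 * T_M, FLUX, T_M)
```

A pile that starts from zero reaches 1 − e^{−1} ≈ 63% of its steady height after one time constant, so a pile supplied for much less than t_{m} is still growing.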

Low-order harmonics, degrees 1 to 3, are prominent on lower mantle tomograms. The harmonic of degree 2 dominates the geoid (the first harmonic of the geoid is by definition zero). The spectrum of hot spot distribution (as points on a sphere) peaks mildly at 6 and at low harmonics 1 to 3 [27]. The dominance of lower spherical harmonics in tomograms and the geoid is a consequence of both the forcing function F_{m} and the time scale t_{m}.

Beginning with the forcing function, the spherical harmonic expansion for subducted material expressed as predicted geoidal effect peaks at low harmonics [27]. This occurs because the positions of subduction zones have changed little over 180 Myr compared with the $\sim 10\phantom{\rule{1.69998pt}{0ex}}000$-km length scales of the lowest harmonics so the low-harmonic phase and magnitude of H_{m} have not changed much [28]. Conversely, plate boundaries have moved significantly relative to higher harmonics over this time although the actual change is not well resolved. Overall, the time scale for the low harmonics cannot be far greater than 180 Myr, because features before that major plate reorganization would be evident on tomograms and not correlate with more recent subduction. In any case, it cannot approach the mantle cycle time of 2000 Myr.

With regard to the time scale, slab piles are within the highest viscosity region of the mantle. The restoration of the stable interface of a slab pile is analogous to the growth of an instability on an unstable interface or to the restoration of a stable surface by glacial rebound. The presence of the core causes the time scale to be longer at low harmonics (than at intermediate ones) even for an isoviscous mantle [2,31,35]. However, a strong depth gradient of viscosity within the lower mantle is not necessary for tomograms to correlate with past subduction. For example, Bunge et al. [7] assume an isoviscous lower mantle, 4×10^{22} Pa s, and obtain a time scale of 150 Myr and a reasonable fit to observations.

Overall, the persistence of slab piles, the correlation of mid-mantle and deep-mantle slab anomalies with subduction over the last 200 Myr, and the increasing width of slab anomalies with depth in Section 3 all provide evidence that the viscosity of the mantle increases with depth. More numerical calculations like those of Bunge and Richards [6], Bunge et al. [7], and Steinberger [37] are needed along with better tomographic resolution. Our lack of knowledge of ancient plate motions may remain a difficulty.

### 4.2 Slab impingement in the deep mantle and its effect on hot spot motion

Flow in the mantle prevents hot spots from being fixed features. Rather, relief on the top of D″, including the cusps beneath plume conduits, and the plume conduits themselves advect with the flow [38,39]. The observed rates of hot spot motion are small compared to surface plate velocities because the viscosity in the mantle above D″ is high compared with that of the upper mantle [25]. Exceptions to the rule are interesting, as they are likely to be detected. I concentrate on an effect that may occur near the base of the mantle.

As noted already, tomograms indicate that the lower-mantle regions between slabs are relatively warm. I consider what happens when subduction begins above such a region. First, the slab sinks rapidly because the surrounding viscosity in Eq. (15) is relatively low. Once at the bottom, the slab material spreads rapidly, again because the surrounding viscosity is relatively low in Eq. (24). The process may cause rapid hot spot wander. The initiation of subduction may somewhat kink a track as the geometry of the return flow immediately adjusts to the new surface boundary condition. The plume cusp may also move as the slab graveyard impinges on it. That is, the cusp advects away from the slab graveyard and toward the adjacent superplume region. Plumes affected in this manner may move significantly relative to the average hot spot frame. However, the mechanism is unlikely to produce a sharp kink in a mid-plate track, like the Hawaii-Emperor bend. Once established, a slab graveyard and its buoyancy evolve slowly. There is also no obvious place for a slab graveyard near the Hawaiian hot spot to make the Hawaii-Emperor bend. I do not have a good example of plume deflection by a new slab graveyard.

Conversely, information may be gained by careful tomography. For example, new slabs are potentially distinct from older slabs that impinge on slab piles. New slabs sink into relatively warm, low-viscosity mantle. New slabs should sink rapidly in Eq. (15), widen more slowly in Eq. (10), and have a greater temperature contrast with their surroundings than old slabs. They also lack the underlying negative buoyancy of an established slab pile [37].

### 4.3 Kinematics of the return flow

The long-term kinematics of flow within the deep mantle are of interest to geochemistry. Crudely, plumes sample subducted material including sediments, oceanic basalt, oceanic gabbro, and depleted residuum with model ages of ∼0.7 to 1.9 Ga (e.g., [5,14]). Components of this age are compatible with the kinematics in Fig. 1 [15]. Old plume components, like ^{3}He, and ‘young’ slab components, indicated by Pb isotopes [10], require more explanation. It is not my purpose to second-guess the conclusions of geochemists; rather, I show that I expect a mixture of young, intermediate, and old components in plumes supplied by the low-viscosity D″ layer.

To begin with observations, tomograms constrain the gross flow associated with slabs: slabs sink deep into the mantle. As noted in Section 2, the return flow to ridges is on average gradual advection of this material back to the surface. The return flow rate (normalized as $4\phantom{\rule{1.69998pt}{0ex}}\pi \phantom{\rule{1.69998pt}{0ex}}{r}^{2}\phantom{\rule{1.69998pt}{0ex}}\rho \phantom{\rule{1.69998pt}{0ex}}{V}_{Z}$, where r is the local radius from the Earth's center and ρ is the local density) is more-or-less constant between the base of the lithosphere and the depth range where slabs pond. Conversely, plumes tap material from D″ and take it to shallow depths. The return flow into D″ displaces material downward throughout the (sublithospheric) mantle with a uniform normalized flow rate. The normalized flow rate is the sum of the slab and plume effects. It is downward within and immediately above D″ and upward in the rest of the mantle (Fig. 2).

A stagnation region forms a dynamic trap between the upward and downward flowing regions in the deep mantle. Material in this region undergoes a random walk from the vagaries of slab and plume activity. It eventually ends up either in plumes or at ridges. In the former case, it contributes an old component to plumes.

The young component in plumes arises because slabs made from old oceanic lithosphere sink more rapidly and deeply than average slabs [7]. That is, the larger initial width W_{ref} implies a smaller sinking time in Eq. (15). In addition, new slabs made of old lithosphere penetrate relatively low viscosity mantle and sink more deeply. In both cases, slabs impinge on the top of D″ and contribute some of their material to plumes.

## 5 Ascent of starting plume heads

Starting plume heads are tempting tomographic targets as they should be several hundred kilometers in diameter (Fig. 1). In contrast, plume conduits are narrow and difficult to detect in the lower mantle. Lower mantle ray paths through plume conduits can never be first arrivals and are missed by travel-time tomography. Seismologists cannot easily separate immediately late conduit arrivals from the rest of the coda. They require large arrays to reliably find very late low-amplitude scattered and diffracted arrivals.

### 5.1 Plume head size and ascent time

I constrain the probable depth range and size range of starting plume heads by modifying a well-known analytical theory reviewed by Schubert et al. [32]. I represent the plume head as a zero-viscosity buoyant sphere with a density contrast $\Delta \rho $ and a radius r_{h}. The Stokes-law velocity of the sphere is:

$${V}_{h}=\frac{\Delta \rho \,g\,{r}_{h}^{2}}{3\,\eta}$$ | (24) |

where g is the acceleration of gravity and η is the viscosity of the surrounding mantle. The plume tail feeds the head at a volume flux Q, so the radius of the head after time t is:

$${r}_{h}={\left[\frac{3\,t\,Q}{4\,\pi}\right]}^{1/3}$$ | (25) |

The ascent velocity is then:

$${V}_{h}=\frac{\Delta \rho \,g}{3\,\eta}{\left[\frac{3\,t\,Q}{4\,\pi}\right]}^{2/3}$$ | (26) |

Integrating the velocity over time gives the time to ascend a distance H:

$${t}_{H}={\left[\frac{5\,\eta \,H}{\Delta \rho \,g}\right]}^{3/5}{\left[\frac{4\,\pi}{3\,Q}\right]}^{2/5}$$ | (27) |

The radius of the head at that time is:

$${r}_{H}={\left[\frac{15\,Q\,H\,\eta}{4\,\pi \,\Delta \rho \,g}\right]}^{1/5}$$ | (28) |
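The step from Eq. (26) to Eqs. (27) and (28) is a single integration. As a sketch, assuming that the head starts with negligible radius at t = 0 and that Δρ, η, and Q stay constant during the ascent:

$$\begin{aligned}
H &= \int_{0}^{t_{H}} V_{h}\,\mathrm{d}t = \frac{\Delta \rho \,g}{3\,\eta}\left[\frac{3\,Q}{4\,\pi}\right]^{2/3}\frac{3}{5}\,t_{H}^{5/3},\\
t_{H}^{5/3} &= \frac{5\,\eta \,H}{\Delta \rho \,g}\left[\frac{4\,\pi}{3\,Q}\right]^{2/3},\\
r_{H}^{5} &= \left[\frac{3\,Q}{4\,\pi}\right]^{5/3}t_{H}^{5/3} = \frac{15\,Q\,H\,\eta}{4\,\pi \,\Delta \rho \,g}.
\end{aligned}$$

Raising the second line to the 3/5 power gives Eq. (27), and the fifth root of the third line gives Eq. (28).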

I compare the isoviscous situation with a lower mantle where viscosity is strongly dependent on depth. The velocity is then from Eq. (14):

$${V}_{h}=\frac{\Delta \rho \,g}{3\,{\eta}_{\mathrm{B}}}{\left[\frac{3\,t\,Q}{4\,\pi}\right]}^{2/3}\mathrm{exp}\left[\frac{H}{{D}_{\mathrm{app}}}\right]$$ | (29) |

where the viscosity is η_{B} near the base of the mantle. I obtain the ascent time by integrating

$$\underset{0}{\stackrel{H}{\int}}\mathrm{exp}\left[\frac{-h}{{D}_{\mathrm{app}}}\right]\,\mathrm{d}h=\left[\frac{\Delta \rho \,g}{3\,{\eta}_{\mathrm{B}}}\right]{\left[\frac{3\,Q}{4\,\pi}\right]}^{2/3}\underset{0}{\stackrel{{t}_{H}}{\int}}{t}^{2/3}\,\mathrm{d}t$$ | (30) |

In the limit that H is much greater than D_{app}, the ascent time and the final radius are:

$${t}_{\infty}={\left[\frac{5\,{\eta}_{\mathrm{B}}\,{D}_{\mathrm{app}}}{\Delta \rho \,g}\right]}^{3/5}{\left[\frac{4\,\pi}{3\,Q}\right]}^{2/5}$$ | (31) |

$${r}_{\infty}={\left[\frac{15\,Q\,{D}_{\mathrm{app}}\,{\eta}_{\mathrm{B}}}{4\,\pi \,\Delta \rho \,g}\right]}^{1/5}$$ | (32) |

That is, Eqs. (31) and (32) are Eqs. (27) and (28) with the viscosity η replaced by η_{B} and the ascent distance H replaced by the length scale D_{app}. The weak dependence of plume head radius on physical parameters in the isoviscous case (28) carries through to the case of a strong viscosity gradient.
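For reference, both integrals in Eq. (30) are elementary. A sketch of the evaluation, under the same assumption of constant Δρ and Q used above:

$$\begin{aligned}
D_{\mathrm{app}}\left[1-\exp\left(\frac{-H}{D_{\mathrm{app}}}\right)\right] &= \frac{\Delta \rho \,g}{5\,\eta_{\mathrm{B}}}\left[\frac{3\,Q}{4\,\pi}\right]^{2/3}t_{H}^{5/3},\\
D_{\mathrm{app}} &= \frac{\Delta \rho \,g}{5\,\eta_{\mathrm{B}}}\left[\frac{3\,Q}{4\,\pi}\right]^{2/3}t_{\infty}^{5/3}\qquad (H\to \infty),\\
\left[\frac{t_{H}}{t_{\infty}}\right]^{5/3} &= 1-\exp\left(\frac{-H}{D_{\mathrm{app}}}\right).
\end{aligned}$$

The second line rearranges to Eq. (31). The third line is the ratio of the first two and shows how the ascent time to a finite height H approaches t_{∞}.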

The likely position of the plume head differs markedly between a strong viscosity gradient and an isoviscous lower mantle. The time to ascend through a viscosity gradient to height H is from Eqs. (30) and (31):

$${t}_{H}={t}_{\infty}{\left[1-\mathrm{exp}\left(\frac{-H}{{D}_{\mathrm{app}}}\right)\right]}^{3/5}$$ | (33) |

From Eq. (33), the plume head spends ∼76% of its ascent time in the bottom D_{app} of the mantle and 92% of its time in the bottom $2\,{D}_{\mathrm{app}}$. It is quite near its final radius once it ascends above this basal region. That is, a random observation is most likely to find starting plume heads near the base of the mantle and unlikely to find them in shallower regions that they quickly traverse.
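The time fractions implied by Eq. (33) can be checked with a few lines of Python. This is a minimal sketch; the function name `ascent_time_fraction` is mine, and its only input is the height above the base expressed in units of D_{app}.

```python
import math


def ascent_time_fraction(height_over_d_app: float) -> float:
    """Return t_H / t_inf from Eq. (33) for a height H given as H / D_app."""
    return (1.0 - math.exp(-height_over_d_app)) ** (3.0 / 5.0)


# Fraction of the total ascent time already elapsed when the head
# reaches 1, 2, and 3 length scales above the base of the mantle.
for n in (1.0, 2.0, 3.0):
    percent = 100.0 * ascent_time_fraction(n)
    print(f"H = {n:.0f} D_app: {percent:.0f}% of the ascent time elapsed")
```

Because the head radius scales as the cube root of elapsed time in Eq. (25), the same function raised to the 1/3 power gives the radius as a fraction of r_{∞}.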

The standing crop of plume heads is proportional to the ascent time. I first obtain an estimate for an isoviscous (10^{22} Pa s) lower mantle. I let the temperature anomaly be 800 K (400 K relative to the MORB adiabat), the density be 5500 kg m^{−3}, the thermal expansion coefficient be 10^{−5} K^{−1}, the acceleration of gravity be 10.5 m s^{−2}, and the lower mantle ascent distance be 2000 km. I let the volume flux be 200 m^{3} s^{−1}, in the upper range of estimates for individual plumes tabulated by Sleep [34]. This yields an ascent time of 42 Myr and a final radius of 400 km. The ascent time t_{∞} for D_{app}=434 km and η_{B}=10^{23} Pa s is 68 Myr and the final radius is 469 km. There is no significant difference between the two cases given the uncertainty in the parameters. My rounded estimates are that the ascent time is ∼50 Myr and that the final head radius is ∼400 km (as shown in Fig. 1).
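The two estimates can be reproduced numerically from Eqs. (27), (28), (31), and (32). The sketch below is illustrative: the function and variable names are mine, and it assumes that the density contrast is the product of density, thermal expansion coefficient, and the 800-K temperature anomaly, and that 1 Myr is 3.15576×10^{13} s.

```python
import math

SECONDS_PER_MYR = 3.15576e13  # assumption: 1 Myr = 1e6 Julian years


def ascent_time(eta, length, delta_rho, g, flux):
    """Eq. (27) with length = H, or Eq. (31) with eta = eta_B, length = D_app (s)."""
    return (5.0 * eta * length / (delta_rho * g)) ** 0.6 * (
        4.0 * math.pi / (3.0 * flux)
    ) ** 0.4


def final_radius(eta, length, delta_rho, g, flux):
    """Eq. (28) with length = H, or Eq. (32) with eta = eta_B, length = D_app (m)."""
    return (15.0 * flux * length * eta / (4.0 * math.pi * delta_rho * g)) ** 0.2


# Parameters quoted in the text (SI units).
delta_rho = 5500.0 * 1e-5 * 800.0  # assumed rho * alpha * dT = 44 kg m^-3
g = 10.5                           # m s^-2
flux = 200.0                       # m^3 s^-1

# (viscosity in Pa s, length in m): H for isoviscous, D_app for the gradient case.
cases = {
    "isoviscous": (1e22, 2.0e6),
    "viscosity gradient": (1e23, 4.34e5),
}

results = {}
for name, (eta, length) in cases.items():
    t_myr = ascent_time(eta, length, delta_rho, g, flux) / SECONDS_PER_MYR
    r_km = final_radius(eta, length, delta_rho, g, flux) / 1e3
    results[name] = (t_myr, r_km)
    print(f"{name}: ascent time {t_myr:.1f} Myr, final radius {r_km:.0f} km")
```

The output should agree with the values quoted above to within rounding, which suggests that the 800-K anomaly, rather than the 400-K anomaly relative to the MORB adiabat, sets the buoyancy in these estimates.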

Current constraints on plume head ascent time are vague because these features are only indirectly detected through their flood basalts. The ascent time needs to be long enough that plume heads have a few 10^{7} years' worth of plume flux to produce widespread flood basalts. It cannot be much longer than 10^{8} years without destroying the negative correlation between plume location and slab graveyards.

The expected standing crop of plume heads within our current mantle is small. There have been ∼4 starting plume heads in the last 100 Myr (Yellowstone, East Africa, Réunion Island, and Iceland), or one per 25 Myr. Comparing this with the 50-Myr ascent time, the current standing crop is most likely 2±2. If there is a strong viscosity gradient with depth, all these starting plume heads are most likely near the base of the mantle.
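The standing-crop estimate is the ascent time divided by the mean interval between new plume heads. A minimal sketch of that arithmetic follows; the Poisson statistics in the last lines are my assumption for illustrating the scatter and are not part of the argument in the text.

```python
import math

n_heads = 4             # starting plume heads in the last 100 Myr (from the text)
window_myr = 100.0      # length of the counting window
ascent_time_myr = 50.0  # rounded ascent time from Section 5.1

interval_myr = window_myr / n_heads             # one new head per 25 Myr
expected_crop = ascent_time_myr / interval_myr  # heads in transit at any instant

# Assumption (not from the text): new heads start as a Poisson process,
# so the number in transit has a standard deviation of sqrt(mean).
poisson_spread = math.sqrt(expected_crop)
prob_no_heads = math.exp(-expected_crop)

print(f"expected standing crop: {expected_crop:.1f}")
print(f"Poisson standard deviation: {poisson_spread:.1f}")
print(f"probability of no head in transit: {prob_no_heads:.2f}")
```

Under that assumption the scatter is comparable to the mean, so a mantle with no starting plume head currently in transit is not improbable.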

### 5.2 Ascent of starting plumes from mid-mantle depths

It is conceivable that plumes start from mid-mantle depths, for example, above a thick layer of dense dregs [11,20]. This layer should have significant effects on the thermal and seismic velocity structure of the mantle [42].

I consider its effects on starting plume heads. Plumes should ascend from the peaks of the dregs layer. Such peaks are not resolved in tomograms and a thick layer of dregs may not exist [42]. In particular, there is no evidence that starting plume heads are now forming in such regions or that peaks in a dregs layer underlie hot spots.

A potentially testable implication is that plumes would ascend from different depths above peaks in a thick dregs layer, but from essentially the same depth if no (or only a thin) dregs layer exists [42]. In the case that the viscosity is not strongly depth dependent, different starting depths weakly affect the ascent time in Eq. (27) and the final radius in Eq. (28) through the ascent distance H. Modest variations in the volume flux Q would mask any real effect in the limited number of real cases. Conversely, the starting height in a mantle with strongly depth-dependent viscosity has strong effects through the viscosity at the starting depth η_{B}. For example, a factor-of-10 change in viscosity would change the final volume of the plume head by a factor of ∼4 in Eq. (32). One would need estimates of both the volume flux Q and the plume head volume to see if the correlation in Eqs. (28) and (32) occurs or if there are effects that might be attributed to variations in the depth to peaks of the dregs layer. At present, these quantities are not precisely constrained even for Iceland and Réunion Island.
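The sensitivities quoted in this paragraph follow from the exponents in Eqs. (27), (28), (31), and (32). The short sketch below makes them explicit; the function names are mine, and `factor` is the ratio by which the product of viscosity and ascent length (ηH or η_{B}D_{app}) changes with all other parameters held fixed.

```python
def time_factor(factor: float) -> float:
    """Change in ascent time: t scales as (eta * length)^(3/5)."""
    return factor ** 0.6


def radius_factor(factor: float) -> float:
    """Change in final radius: r scales as (eta * length)^(1/5)."""
    return factor ** 0.2


def volume_factor(factor: float) -> float:
    """Change in final head volume: r^3 scales as (eta * length)^(3/5)."""
    return radius_factor(factor) ** 3


# Strong viscosity gradient: a tenfold change in the starting viscosity eta_B.
print(f"volume change for 10x viscosity: {volume_factor(10.0):.2f}")

# Isoviscous mantle: a plume that starts halfway up (H halved).
print(f"radius change for H/2: {radius_factor(0.5):.2f}")
print(f"ascent-time change for H/2: {time_factor(0.5):.2f}")
```

Halving the ascent distance in an isoviscous mantle changes the final radius by only about 13%, which is why modest variations in Q would mask the effect.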

### 5.3 Ascent of superplume regions

The formalism in Section 5.1 is applicable to the ascent of superplume regions detected by tomography. Romanowicz and Gung [30] consider that these regions are buoyant relative to the upper mantle (MORB) adiabat, not just relative to their lateral surroundings. The observed sinking of slabs and ascent of plume heads through the mantle provide some constraint on whether superplume regions are in fact buoyant relative to the MORB adiabat.

The radius and the density contrast differ between superplume regions and plume heads in the Stokes-law ascent rate (Eq. (24)). The radius of the superplume regions is a significant part of the lower mantle, ∼1000 km, compared with ∼400 km for starting plume heads. This implies that the ascent rates would be equal if the density contrast was a factor of 6.25 greater in plume heads than in superplume regions relative to the mantle adiabat. The ascent rate of superplume regions needs to be a factor of a few slower than this so that the regions persist on the time scale that slabs form graveyards between them. This implies that the density contrast of superplume regions relative to the mantle adiabat cannot be more than a few percent of that of plume heads. That is, the temperature contrast of superplume regions with the MORB adiabat is less than ∼20 K compared with ∼400 K for plume heads. This situation creates a special-circumstances issue. From Section 2, a typical part of the lower mantle heats up 20 K relative to the evolving MORB adiabat in 100 Myr. It is unlikely, though not impossible, that we live at the time when much of the lower mantle is just about to become unstable after heating up over ∼2000 Myr. More likely, the superplume regions are cooler than the MORB adiabat, but hotter than their slab-dominated surroundings.
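The arithmetic behind this bound can be laid out explicitly. In the sketch below the variable names are mine, and the value of 3 used for "a factor of a few" is an illustrative assumption, not a number from the text.

```python
# Radii quoted in the text (km).
superplume_radius_km = 1000.0
plume_head_radius_km = 400.0

# Stokes-law velocity scales as density contrast times radius squared (Eq. (24)),
# so equal ascent rates require the density contrasts to differ by this ratio.
radius_sq_ratio = (superplume_radius_km / plume_head_radius_km) ** 2

plume_head_dt_k = 400.0  # plume-head excess over the MORB adiabat (from the text)
equal_rate_dt_k = plume_head_dt_k / radius_sq_ratio

# Assumption: "a factor of a few slower" is taken as 3 for illustration.
slowdown = 3.0
bound_dt_k = equal_rate_dt_k / slowdown

# Heating of the lower mantle relative to the evolving MORB adiabat (Section 2).
heating_rate_k_per_myr = 20.0 / 100.0
total_heating_k = heating_rate_k_per_myr * 2000.0

print(f"density-contrast ratio for equal ascent rates: {radius_sq_ratio:.2f}")
print(f"superplume excess for equal ascent rates: {equal_rate_dt_k:.0f} K")
print(f"excess for ascent {slowdown:.0f}x slower: {bound_dt_k:.0f} K")
print(f"heating accumulated over 2000 Myr: {total_heating_k:.0f} K")
```

With these inputs the permitted excess temperature is roughly 20 K, which the lower mantle gains in about 100 Myr, a small fraction of the ∼2000-Myr heating time.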

## 6 Conclusions

Present tomograms resolve the gross features of the lower mantle [3,19]. Resolved features are slabs and warmer regions between them. Unresolved features include starting plume heads and plume tail conduits. Given this situation, how does one proceed with geodynamics and tomography? I have compiled some simple but robust features of mantle-wide convection to aid in this effort.

Plate tectonics cools the mantle (Fig. 1). The heat comes from cooling of the Earth over time and from radioactive decay. The core supplies a small amount of heat that ascends in plumes. Surface plate tectonics does not efficiently stir the mantle because the viscosity of the lower mantle is much greater than that of the upper mantle. The stirring that does occur involves sinking and ponding of slabs in the deep mantle and the eventual return flow of this material to ridge axes on a time scale of 2000 Myr.

A basic feature of this geometry is that the geotherm is subadiabatic between the upper lithospheric boundary layer and the lower D″ boundary layer (Fig. 2). The total subadiabatic temperature change is ∼400 K. The deep mantle above D″ consists of graveyard piles of ponded slabs and somewhat warmer superplume regions between them. The superplume regions are likely to be cooler and negatively buoyant with respect to the upper mantle (MORB) adiabat. Tomograms plotted with respect to the mean velocity at each depth do not represent the temperature structure relative to an adiabat.

The viscosity may change by about two orders of magnitude in the lower mantle, a large variation compared with that of other physical parameters. An observable effect is that slabs should widen proportionally to the square root of viscosity and their descent rate should vary inversely with the square root of viscosity (Figs. 1 and 3). The temperature anomalies associated with slabs broaden little by conduction while they traverse the lower mantle. The temperature and seismic anomaly should become less distinct as slabs penetrate into (slightly longer ago) subducted material near the base of the mantle.

Graveyards of ponded slabs persist near the base of the mantle on a time scale comparable to that for plate motions to reorganize, ∼120–200 Myr (Fig. 4). Significantly shorter response times preclude slab piles while significantly longer ones would cause tomograms to correlate poorly with relatively recent subduction. The longest wavelengths are most evident because they persist longer and because the source of the material from slabs stays in phase. The viscosity of the lower mantle obtained from these considerations is ∼4×10^{22} Pa s.

Starting plume heads traverse the lower mantle on a time scale of ∼50 Myr. In the case that viscosity increases with depth, they spend most of their ascent time near the base of the mantle. They traverse an isoviscous mantle at a relatively constant rate. The size of starting plume heads depends weakly on physical parameters including the starting depth. At most a few starting plume heads exist in the present lower mantle.

My inferences provide hints on how to formulate convection calculations. The temperature and depth dependence of viscosity is important to the existence of plate tectonics and plumes in the first place. The widening of slabs with depth is a simple feature that correlates directly with viscosity. Plates need to be included with a ‘damage’ rheology so that weak plate boundaries form and evolve over time [40,41]. This lets plate geometries and the rate of plate turnover be strongly influenced by shallow processes. It lets the subadiabatic thermal gradient in the mid-mantle build up over time and its stable stratification affect the flow. Ideally the calculations should resolve D″ and plume heads and tails. The successful generic models then can be used to formulate models of modern plate motions.

Tomographic modeling may proceed with geodynamic constraints imposed. For example, deep slabs broaden mechanically. They should be relatively isothermal and have thin conductive boundaries. Buoyant regions and negatively buoyant regions not associated with slabs should be rare in the mid-mantle as they already should have ascended or sunk out of this region if they somehow formed [3]. Optimally the density, temperature, and viscosity structure should be consistent with tomography, geoid, long-term fluid dynamics, and short-term glacial rebound. Simple scaling relationships help one get started on this task. They also let one effectively use existing tomograms and dynamic models.

## Acknowledgements

This research was in part supported by NSF grants EAR-9902079 and EAR-0000747. I thank Barbara Romanowicz for a helpful review.