# Diffusing diffusivities, stochastic subordination

In this post, I’ll discuss an idea emerging in the statistical physics community that attempts to resolve some unexplained observations of motion of single particles.1

# Anomalous yet Brownian

The motivation is as follows: suppose we observe two particles and keep track of their trajectory, as seen in Figure 1.

Figure 1: Two (simulated) trajectories of particles undergoing what seems to be classical Brownian motion (diffusion).

Although it certainly looks like they are “diffusing”, we want a more precise statistical characterization of the motion of these two particles. The standard tool is mean squared displacement (MSD) analysis. Although the “mean” could be taken over an ensemble of trajectories, in the context of single particle tracking it usually refers to a time average along a single trajectory. Call $X_i$ the $i$th observation of the particle, which we assume we observe at regular snapshots of time $\Delta t$, so $X_i$ is observed at time $i \Delta t$.

The MSD is then an average squared displacement over all windows (lags) of a particular size $j$, $$$\text{MSD} := \langle X^2(j) \rangle = \frac{1}{n-j}\sum_{i=1}^{n-j} \left(X_{i+j} - X_{i}\right)^2, \tag{1}$$$ where, in (1), $n$ is our total number of observations. This quantity is nice because it distinguishes between a wide array of behaviors. Most notably, for classical diffusion in two dimensions, we have $$$\langle X^2(t) \rangle = 4Dt,$$$
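
As a concrete illustration, here is a minimal NumPy sketch of the time-averaged MSD in (1), applied to a simulated 2D Brownian path (the function name and parameters are my own, not from any referenced code):

```python
import numpy as np

def time_averaged_msd(traj, max_lag):
    """Time-averaged MSD of Eq. (1); traj has shape (n, d)."""
    msd = np.empty(max_lag)
    for j in range(1, max_lag + 1):
        disp = traj[j:] - traj[:-j]                   # all windows of lag j
        msd[j - 1] = np.mean(np.sum(disp**2, axis=1))
    return msd

# Sanity check on a simulated 2D Brownian path with D = 1, dt = 0.01:
rng = np.random.default_rng(0)
dt, D, n = 0.01, 1.0, 100_000
steps = rng.normal(0.0, np.sqrt(2 * D * dt), size=(n, 2))
traj = np.cumsum(steps, axis=0)
msd = time_averaged_msd(traj, 10)
# For classical 2D diffusion, MSD(j) ~ 4 * D * (j * dt): a line through the origin.
```

Plotting `msd` against the lag times should recover a straight line with slope $4D$.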

meaning that we expect the MSD to be a line, where the slope encodes the diffusion coefficient (diffusivity). Deviations from a straight line are associated with anomalous diffusion, either subdiffusion or superdiffusion, often due to interactions with complex environments. The MSD is therefore the first tool for diagnosing the behavior of our particles, and can be seen in Figure 2.

Figure 2: A comparison of the mean squared displacement statistic $\langle X^2(\tau)\rangle$ as a function of the lag (window) size $\tau$.

From Figure 2, we see that the MSD curves for the two particles are not only lines, but effectively indistinguishable. Can we conclude they are undergoing the same, classical diffusive motion? We can go a bit further. Classical diffusion has another incredibly useful property: Gaussian behavior. In some time interval of size $t$, the probability of finding a particle displaced distance $x$ is the Gaussian $$$p(x,t) = \frac{1}{\sqrt{4\pi Dt}} e^{-\frac{x^2}{4Dt}}. \tag{2}$$$

We can compute this pretty easily for our data and get the following result.

Figure 3: Probability density $p(x,t)$ of displacements $x$ for time window $\tau$. For classical diffusion, we expect this to be a Gaussian that widens with time. Note the $\log$ axis.

In Figure 3, we finally see a clear difference between our two particles. Particle 1 (in green) follows what we expect from (2): a Gaussian that spreads out with longer times $t$. However, particle 2 seems to be doing something completely different. Note that Figure 3 is on a log scale, and that the tails (large $x$ values) decay linearly, which translates to an exponential tail, one that scales as $p(x) \sim e^{-|x|/\lambda}$ for large $x$.
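
To see why linear tails on a log scale signal exponential rather than Gaussian decay, a quick numerical comparison of samples with equal variance is instructive (this is purely illustrative, not the post's actual data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
gauss = rng.normal(0.0, 1.0, n)                   # Gaussian displacements
laplace = rng.laplace(0.0, 1.0 / np.sqrt(2), n)   # same variance, exponential tails

# Tail weight: fraction of displacements beyond 4 standard deviations.
frac_gauss = np.mean(np.abs(gauss) > 4.0)
frac_laplace = np.mean(np.abs(laplace) > 4.0)
```

The Gaussian puts roughly $6\times10^{-5}$ of its mass beyond $4\sigma$, while the equal-variance Laplace puts about $3.5\times10^{-3}$ there: on a log-density plot the former looks parabolic, the latter linear.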

We’ve stumbled into an interesting question: what is particle 2 doing? It looks like classical diffusion in some ways (linear MSD), but distinctly different in others (non-Gaussian displacements). While I motivated this with simulated data, there is precedent from experiments for investigating it.

In Figure 4, we see the first experimental evidence motivating this. In this paper, the authors study two experimental setups, both involving beads interacting with cytoskeletal filaments, which produce linear MSDs but exponential tails in the displacement distribution. They do note that the exponential tails exist only at small times, but nonetheless, there is a clear deviation from classical diffusion.

# Diffusing diffusivities

From the motivation, it should be somewhat intuitive that we want a model that produces both i) a linear MSD and ii) non-Gaussian displacements. In the past few years, a number of papers have addressed this.

A few different interpretations and models have emerged, as seen in Figure 5. I will summarize the broad idea, although the details vary a bit from paper to paper. The main observation is that some sort of memory (correlation) is needed in the system, but where it lives is subtle. For instance, subdiffusion is a classical model of diffusion with memory, but it is not what we want, as it produces non-linear MSD curves. These authors identify that the correlation needs to be in the step distance, rather than the step direction. That is, if we call $\Delta X_i := X_i - X_{i-1}$, then $$$\langle \Delta X_i \Delta X_j \rangle = 0, \quad \text{but} \quad \langle\Delta X_i^2 \Delta X_j^2 \rangle \neq \langle \Delta X_i^2\rangle \langle \Delta X_j^2\rangle. \tag{3}$$$ In words, (3) encodes that the directions of steps are uncorrelated, but the magnitudes are not.

The justification is that in a complex environment, a particle changes its diffusivity slowly as it encounters different “parts” of the environment, meaning there is a memory to the diffusivity itself. We could then assume there is some probability distribution of diffusivities, denoted $\pi(D)$, and average the displacements over all possible values of $D$ to obtain $p(x,t)$: $$$p(x,t) = \int_{0}^{\infty} \pi(D) \frac{1}{\sqrt{4\pi D t}} e^{-\frac{x^2}{4Dt}} \, \mathrm{d}D. \tag{4}$$$ One possibility is that the distribution of diffusivities is exponential, $$$\pi(D) = \frac{1}{\langle D \rangle} e^{-D/\langle D \rangle}, \tag{5}$$$ in which case (4) becomes the Laplace distribution $$$p(x,t) = \frac{1}{\sqrt{4\langle D \rangle t}} e^{-\frac{|x|}{\sqrt{\langle D \rangle t}}}. \tag{6}$$$
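
The superstatistical average (4)-(6) is easy to check numerically: drawing $D$ from the exponential (5) and then a Gaussian displacement given $D$ should produce a Laplace distribution, whose kurtosis is 6 rather than the Gaussian 3. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
n, D_mean, t = 500_000, 1.0, 1.0
D = rng.exponential(D_mean, n)              # diffusivities drawn from Eq. (5)
x = rng.normal(0.0, np.sqrt(2 * D * t))     # Gaussian given D, averaged as in Eq. (4)

# The mixture should be Laplace, Eq. (6): kurtosis 6 instead of the Gaussian 3.
kurt = np.mean(x**4) / np.mean(x**2)**2
```

A histogram of `x` on a log scale would show the linear (exponential) tails of (6).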

(6) does indeed have the exponential tails we are hoping for, but it has them for all $t$, even though, experimentally, we see convergence to Gaussian behavior at large $t$.

To resolve this, we can put dynamics on the diffusivity, $D$, with the following set of stochastic differential equations \begin{align} \dot{X}(t) &= \sqrt{2D(t)} \, \xi(t)\\ D(t) &= Y^2(t)\\ \dot{Y}(t)&=-\frac{1}{\tau} Y + \sigma \eta(t), \tag{7} \end{align}

where $\eta, \xi$ are independent white-noise processes. This says that $Y(t)$ undergoes an Ornstein-Uhlenbeck process and the magnitude of $Y(t)$ dictates the diffusivity $D(t)$. At long times, the stationary density of $D$ is exactly of the form (5) with $\langle D \rangle = \sigma^2 \tau$. In fact, (7) is exactly what I simulated to obtain particle 2 in Figure 1, with parameters chosen so that $\langle D \rangle$ equals the constant $D$ of particle 1.
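
A minimal Euler-Maruyama sketch of (7) follows. I use a two-component $Y$ here (an assumption on my part, not spelled out above) so that the stationary $D = |Y|^2$ is exponentially distributed as in (5) with $\langle D \rangle = \sigma^2\tau$:

```python
import numpy as np

rng = np.random.default_rng(3)
dt, tau, sigma, nsteps = 0.01, 1.0, 1.0, 50_000

# Ornstein-Uhlenbeck process for Y; two components are used so that the
# stationary D = |Y|^2 is exponential, Eq. (5), with <D> = sigma^2 * tau.
Y = np.zeros((nsteps, 2))
Y[0] = rng.normal(0.0, sigma * np.sqrt(tau / 2), 2)     # start in stationarity
for i in range(1, nsteps):
    Y[i] = Y[i - 1] - (Y[i - 1] / tau) * dt + sigma * np.sqrt(dt) * rng.normal(size=2)

D = np.sum(Y**2, axis=1)                                # D(t) = |Y(t)|^2
X = np.cumsum(np.sqrt(2 * D * dt) * rng.normal(size=nsteps))  # dX = sqrt(2D) * xi

mean_D = D.mean()    # should be close to sigma^2 * tau = 1
```

The resulting `X` is the kind of trajectory shown for particle 2 in Figure 1.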

One interesting feature of (7) is worth pointing out right away. While we think of the particle $X(t) \in \mathbb{R}^d$ diffusing in $d=2,3$ dimensions, this formulation does not require that $Y(t)$ be evolving in the same number of dimensions. The authors provide commentary on this, but ultimately it seems to remain elusive. What would it mean for the diffusivity to evolve in a higher or lower dimension than the particle itself?

It’s also worth pointing out that (7) has actually been considered before in another context: finance. This process is closely related to the Cox-Ingersoll-Ross (CIR) model describing the evolution of interest rates.

# Stochastic subordination

Studying (7) further requires a cute mathematical trick called stochastic subordination. Effectively, this is a change of variables on time, where the new (internal) time advances at a random rate set by the process itself.

We can rewrite our system in Fokker-Planck form for the probability density $\tilde{p}(x,t)$ conditioned on some realization of $D(t)$, so $$$\tilde{p}(x,t) = p\left(x,t | D(t)\right),$$$ which just gives the standard diffusion equation $$$\partial_t \tilde{p}(x,t) = D(t) \nabla^2 \tilde{p}(x,t). \tag{8}$$$ With this in mind, we now consider the change of variables $$$\frac{\mathrm{d}}{\mathrm{d}\tau} x(\tau) = \sqrt{2} \xi(\tau), \qquad \frac{\mathrm{d}}{\mathrm{d}t} \tau(t) = D(t). \tag{9}$$$ The change of variables (9) is convenient because we can now write the solution to (8) as $$$\tilde{p}(x,\tau) = \frac{1}{\sqrt{4\pi \tau}} e^{-\frac{x^2}{4\tau}}, \tag{10}$$$ where $\tau$ in (10) must be interpreted as a random quantity, but one known from (9) and (7) to be the stochastic integral $$$\tau(t) = \int_0^t D(s) \, \mathrm{d} s = \int_0^t Y^2(s) \, \mathrm{d} s. \tag{11}$$$ Let $T(\tau,t)$ be the probability density of $\tau$, which evolves by (11); then the true displacement density is found by integrating over all possibilities of $\tau$: $$$p(x,t) = \int_0^\infty T(\tau,t) \tilde{p}(x,\tau) \, \mathrm{d} \tau. \tag{12}$$$

I will leave the details out here, but we did all this work because the Fourier transform of $p(x,t)$, described by the integral (12), can be computed explicitly, which is somewhat surprising.
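
Though I am skipping the Fourier-transform computation, the subordination identity (10)-(12) itself is easy to verify by Monte Carlo: integrate $\tau(t)$ along realizations of $Y$, draw the displacement from the Gaussian (10), and check that $\langle x^2(t)\rangle = 2\langle D\rangle t$. A sketch (again with a two-component $Y$, my assumption, so that $\langle D\rangle = \sigma^2\tau_c = 1$):

```python
import numpy as np

rng = np.random.default_rng(4)
dt, tau_c, sigma = 0.01, 1.0, 1.0
nsteps, nreal = 1_000, 2_000                # t = nsteps * dt = 10 per realization

# Subordination: accumulate tau(t) = int_0^t |Y|^2 ds for many realizations,
# then draw the displacement from the Gaussian (10) with variance 2 * tau.
Y = rng.normal(0.0, sigma * np.sqrt(tau_c / 2), size=(nreal, 2))  # stationary start
tau = np.zeros(nreal)
for _ in range(nsteps):
    tau += np.sum(Y**2, axis=1) * dt        # Eq. (11)
    Y += -(Y / tau_c) * dt + sigma * np.sqrt(dt) * rng.normal(size=(nreal, 2))

x = rng.normal(0.0, np.sqrt(2 * tau))       # Eq. (10) with random tau
msd_estimate = np.mean(x**2)                # should approach 2 <D> t = 20
```

The same samples of `tau` are a Monte Carlo stand-in for the density $T(\tau, t)$ in (12).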

One nice way to quantify deviations from Gaussianity is the kurtosis, defined to be $$$K := \frac{\langle x^4(t)\rangle}{\langle x^2(t)\rangle^2}.$$$ Recall that a 1D Gaussian has kurtosis $K=3$, so values above $K=3$ measure how non-Gaussian a distribution is. By solving (12) using the Fourier transform, the authors find $$$K \sim \begin{cases} 9 & \text{for small }t \\ 3 & \text{for large }t, \end{cases} \tag{13}$$$

where I have written the results for 1D, but the authors derive the analogous formula in any dimension. The kurtosis result (13) has exactly the behavior observed experimentally: large deviations from Gaussian displacements (large $K$) for small $t$, but close to Gaussian behavior ($K \approx 3$) for large $t$.
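
The crossover in (13) can be reproduced numerically. The sketch below simulates (7) with a one-component $Y$ (matching the 1D result quoted above, for which the short-time kurtosis is 9) and estimates the kurtosis at short and long times; the parameter choices are mine:

```python
import numpy as np

rng = np.random.default_rng(5)
dt, tau_c, sigma, nreal = 0.01, 1.0, 1.0, 20_000

def kurtosis_at(nsteps):
    """Simulate Eq. (7) with a one-component Y; return the kurtosis of X(t)."""
    Y = rng.normal(0.0, sigma * np.sqrt(tau_c / 2), nreal)  # stationary start
    X = np.zeros(nreal)
    for _ in range(nsteps):
        X += np.sqrt(2 * Y**2 * dt) * rng.normal(size=nreal)  # dX = sqrt(2D) xi
        Y += -(Y / tau_c) * dt + sigma * np.sqrt(dt) * rng.normal(size=nreal)
    return np.mean(X**4) / np.mean(X**2)**2

K_short = kurtosis_at(5)       # t = 0.05 << tau_c: strongly non-Gaussian, K near 9
K_long = kurtosis_at(2_000)    # t = 20 >> tau_c: K has decayed back toward 3
```

At intermediate times $K$ interpolates between the two limits of (13).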

# Testing the hypothesis

While the diffusing-diffusivity model seems to match experiments quite well, it isn’t really mechanistic in its description. Therefore, we should be curious: can we make predictions from (or “verify”) this very coarse-grained model?

In Figure 6, the authors seek to answer this question. The idea behind this setup is to let the particle diffuse in an environment where the changes in $D$ are explicitly known. To this end, the authors create an apparatus to observe the motion of a bead diffusing near a wall. The wall influences diffusion in the perpendicular direction, but not the horizontal, meaning the two components can be decomposed and compared. It is well established, both theoretically and experimentally, that the diffusivity decreases with the distance $z$ to the surface due to hydrodynamic effects, and is given by $$$D_\perp \sim D_0 \left(\frac{6z^2+2az}{6z^2+9az+2a^2} \right),$$$ where $D_0$ is the diffusivity with no confinement and $a$ is the particle radius. Diffusion in the direction parallel to the wall is also affected, but it fluctuates much less than in the perpendicular direction. Moreover, at interactions of this scale, both gravity and electrostatic forces must be accounted for, which we will just bake into some potential $U(z)$. With this known, we can write down a (discretized) SDE for the change in the height $z$: $$$\Delta z \approx \left \{\partial_z D_\perp(z) - D_\perp(z)U'(z)/k_B T \right\}\Delta t + \sqrt{2D_\perp(z)\Delta t}\,\xi,$$$

where $\xi$ is a standard Gaussian random variable (the discretized white noise), the first term, involving $\partial_z D_\perp$, is the drift induced by the space-dependent diffusivity, and the second accounts for the gravitational and electrostatic forces.
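
A hedged sketch of this Euler scheme: `D_perp` implements the hindered-diffusivity formula above, while the form of $U'(z)$ below (constant gravity plus an exponential repulsion) is a hypothetical placeholder of my own, not the potential fitted in the paper:

```python
import numpy as np

def D_perp(z, D0, a):
    """Hindered perpendicular diffusivity at height z above the wall."""
    return D0 * (6 * z**2 + 2 * a * z) / (6 * z**2 + 9 * a * z + 2 * a**2)

def step_z(z, dt, D0, a, dU_dz, kBT, rng):
    """One Euler step of the discretized overdamped SDE for the height z."""
    h = 1e-6 * a                                   # finite difference for dD/dz
    dD = (D_perp(z + h, D0, a) - D_perp(z - h, D0, a)) / (2 * h)
    drift = dD - D_perp(z, D0, a) * dU_dz(z) / kBT
    return z + drift * dt + np.sqrt(2 * D_perp(z, D0, a) * dt) * rng.normal()

# Hypothetical U'(z): constant gravity plus a short-ranged electrostatic repulsion.
rng = np.random.default_rng(6)
dU_dz = lambda z: 0.1 - 2.0 * np.exp(-z)
z = 2.0
for _ in range(1_000):
    z = step_z(z, 1e-3, 1.0, 1.0, dU_dz, 1.0, rng)
```

Note that `D_perp` vanishes at the wall ($z = 0$) and recovers the bulk value $D_0$ far from it, so the bead samples an explicitly position-dependent diffusivity.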

Ultimately, from their test, they conclude that if samples are taken faster than some critical time interval, so $\Delta t < \Delta t_c$, then fluctuations in the diffusivity $D$ dominate, and the predictions of the diffusing diffusivity hypothesis are seen: the displacements are non-Gaussian, yet the MSD grows linearly. However, over longer times $\Delta t > \Delta t_c$, the potential (describing the gravitational and electrostatic forces) dominates, and the fluctuations in $D$ get washed out. Therefore, these experiments provide evidence that, at least in certain scenarios, the diffusing diffusivity model does indeed explain the observed statistics.

# Conclusion

The idea that particles in complex media don’t behave the way classical stochastic processes predict is a modern research paradigm, to which this issue of non-Gaussian diffusion is central. These mathematical approaches elegantly pin down coarse-grained descriptions of possible behaviors that match experiments nicely, and the theories are further strengthened by the predictions confirmed in the near-wall experiments. However, I’m still a little dissatisfied with the state of this question and think a lot of work will be done in this direction in the future. For one, there is no real mechanism for how the environment would dictate $D(t)$. Of course, this is easier said than done for a complex environment, but until then, these are models that broadly summarize the motion rather than precisely describe it. Moreover, I suspect that these subtleties exist because of how we measure motion. The MSD is a classical tool that has been applied to a wide variety of problems, but it is problematic for this very reason: many processes produce the same MSD curve. I am curious whether more advanced single particle statistics (say, fancy Bayesian approaches) will make this research somewhat obsolete by doing a better job of distinguishing different motions through fundamentally different statistics.

1. Much of the content in this post is taken from (great) slides by Mykyta Chubynsky.

2. B. Wang et al., PNAS (2009)

3. A. V. Chechkin et al., Phys. Rev. X (2017); M. V. Chubynsky and G. W. Slater, Phys. Rev. Lett. (2014); Y. Lanoiselée and D. S. Grebenkov, arXiv:1711.09588

4. M. Matse et al., Phys. Rev. E (2017)