Let $V$ be an $n$-dimensional vector space, with basis
$$ B = \{ v_1, \ldots, v_n \}. $$
Now consider an arbitrary linear function $f$ that maps each vector $v \in V$ to a scalar in, say, $\mathbb{R}$. If we express $v$ as a linear combination of the basis vectors, then
$$ \begin{align*}
f(v)
&= f(c_1v_1 + \cdots + c_nv_n) \\
&= c_1 f(v_1) + \cdots + c_n f(v_n).
\end{align*} $$
This means that if we specify $f(v_1)$ through $f(v_n)$, we’ve completely characterized $f$. In other words, if you tell me what $f(v_1)$ through $f(v_n)$ equal, I know what $f(v)$ equals for any vector $v$.
So we can specify any linear function that maps from $V$ to $\mathbb{R}$ (called a “linear functional”) as a list of $n$ scalars. This makes it clear that a linear functional is an $n$-dimensional vector itself, and so linear functionals form an $n$-dimensional vector space as well:
$$V^* = \{ f_{[a_1, \ldots, a_n]} \mid a_1, \ldots, a_n \in \mathbb{R} \}.$$ This is the dual space of $V$, where I’ve used $f_{[a_1, \ldots, a_n]}$ to denote the linear functional such that $f(v_1) = a_1$, etc.
Since $V$ and its dual space $V^*$ have the same dimension, each element in $V$ has a corresponding element in $V^*$. Consider that
$$
f_{[a_1, \ldots, a_n]}(v) = a_1 c_1 + \cdots + a_n c_n.
$$Look at the right-hand side — if the vectors we’re dealing with are Euclidean vectors, then we could write $\vecb{a} = (a_1, \ldots, a_n)$ and express any functional as a dot product:
$$
f_{\vecb{a}}(\vecb{v}) = \vecb{a} \cdot \vecb{v}.
$$ We can extend this idea to any inner product space. If $V$ has an inner product, then we can very naturally associate a vector $w \in V$ with its corresponding functional in $V^*$, defined as $$f(v) = \langle v, w \rangle.$$
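To make this concrete, here’s a tiny numeric sketch (the function name is mine, purely illustrative): a functional on $\mathbb{R}^3$ stored as its list of values on the basis vectors, evaluated as a dot product with a vector’s coordinates.

```javascript
// A linear functional on R^3, stored as its values on the basis vectors.
// (makeFunctional is an illustrative name, not a standard API.)
function makeFunctional(a) {
  // Evaluate on a vector given by its coordinates c1..cn in the same basis:
  // f(v) = a1*c1 + ... + an*cn, i.e. a dot product.
  return (v) => a.reduce((sum, ai, i) => sum + ai * v[i], 0);
}

const f = makeFunctional([2, -1, 3]); // f(v1) = 2, f(v2) = -1, f(v3) = 3
console.log(f([1, 0, 0])); // 2 (the value on the first basis vector)
console.log(f([4, 5, 6])); // 2*4 - 1*5 + 3*6 = 21
```

Specifying the three values on the basis really does pin down the whole functional.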
Now let’s throw linear transformations into the mix. Let $T$ be a linear transformation from $U$ to $W$, and let $f \in W^*$ be a linear functional on $W$.
Then if $u \in U$, observe that $f(T(u))$ gives us a scalar. Since $f$ composed with $T$ maps an element in $U$ to a scalar, the composition $f \circ T$ is a linear functional itself, an element of $U^*$!
In this way, given a functional in $W^*$ and a transformation $T$, we can generate functionals in $U^*$. So we can define another linear transformation from $W^*$ to $U^*$ that gives us this functional: $$ T^\intercal(f) = f \circ T.$$
$T^\intercal$ is called the transpose of $T$, and it’s no coincidence that, with respect to the dual bases, the matrix of $T^\intercal$ is the transpose of the matrix of $T$.
Finally, let’s talk about inner product spaces again. If we have inner products for $U$ and $W$, we can, like above, associate vectors $u \in U$ and $w \in W$ with functionals in $U^*$ and $W^*$ respectively:
$$\begin{align*}
u \quad&\longleftrightarrow\quad f_u(v) = \langle v, u \rangle \\
w \quad&\longleftrightarrow\quad f_w(v) = \langle v, w \rangle.
\end{align*}$$
Recall that $T^\intercal$ takes a functional in $W^*$ and returns a functional in $U^*$. It’d be nice if we could deal with vectors in $W$ and $U$ instead of functionals in $W^*$ and $U^*$. Can we build a transformation that takes a vector in $W$, converts it into its corresponding functional in $W^*$, applies $T^\intercal$ to get a functional in $U^*$, and finally converts that back into its corresponding vector in $U$?
It turns out we can: it’s called the adjoint of $T$, denoted $T^*$.
To see how $T^*$ must be defined, let’s take the definition of the transpose and substitute $f \in W^*$ with its inner-product form $f = \langle \cdot\,, w\rangle$, where $w$ is the vector in $W$ associated with $f$. Applying the resulting functional $T^\intercal(f) \in U^*$ to an argument $v \in U$:
$$\begin{align*}
(T^\intercal(f))(v)
&= (f \circ T)(v) \\
&= f(T(v)) \\
&= \langle T(v), w\rangle.
\end{align*}$$
Then asking what vector in $U$ corresponds to this functional amounts to asking what vector satisfies
$$ \langle T(v), w\rangle = \langle v, ?\rangle. $$
Thus, $T^*$ is defined to be the linear transformation that makes the following equality hold for all $v \in U$ and $w \in W$:
$$ \langle T(v), w\rangle = \langle v, T^*(w)\rangle. $$
Once we do that, we can go on to define important operators like normal, unitary, and Hermitian operators!
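As a quick numeric sanity check (an illustrative sketch with an arbitrary made-up matrix): for real matrices under the standard dot product, the adjoint is just the matrix transpose, so we can verify the defining equality $\langle T(v), w\rangle = \langle v, T^*(w)\rangle$ directly.

```javascript
// For real matrices with the standard dot product, the adjoint is just the
// matrix transpose. The matrix and vectors here are arbitrary examples.
const dot = (a, b) => a.reduce((s, ai, i) => s + ai * b[i], 0);
const apply = (M, v) => M.map((row) => dot(row, v));
const transpose = (M) => M[0].map((_, j) => M.map((row) => row[j]));

const T = [[1, 2], [3, 4]];
const v = [5, -1];
const w = [2, 7];

console.log(dot(apply(T, v), w));            // <T(v), w>  = 83
console.log(dot(v, apply(transpose(T), w))); // <v, T*(w)> = 83
```

Both sides agree, as the defining equality demands.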
Separation of variables says that we can simply “split” the derivative $\dod{y}{x}$ and write
$$f(y) \dif{y} = g(x) \dif{x}$$
Then, we can integrate both sides: $$\begin{gather*}
\int f(y) \dif{y} = \int g(x) \dif{x} \\
F(y) = G(x) + C
\end{gather*}$$ where $F(y)$ and $G(x)$ are antiderivatives of $f(y)$ and $g(x)$ respectively. And voila, we can use algebra to solve the differential equation.
But the “splitting” part always bugged me. Now my professor has explained why this is okay to do. Consider the antiderivative of $f(y)$, and differentiate it with respect to $x$.
We know that
$$\begin{align*}
\dod{}{x}F(y) &= \dod{}{y}F(y) \cdot \dod{y}{x} \text{ by the chain rule} \\
&= f(y) \dod{y}{x}
\end{align*}$$because $F(y)$ is the antiderivative of $f(y)$, so the derivative of $F(y)$ must be the function $f(y)$ itself.
Now, substituting into the differential equation for $f(y) \dod{y}{x}$, we have
$$\dod{}{x}F(y) = g(x)$$
Now, if we integrate the function on the left ($\dod{}{x}F(y)$) and the function on the right ($g(x)$) with respect to $x$, we should get the same antiderivative, up to a constant, so
$$ \int \left[\dod{}{x}F(y)\right] \dif{x} = G(x) + C$$
The antiderivative of a derivative of a function is just the function itself (up to a constant), so
$$ F(y) = G(x) + C$$ which is the result we get from doing it the “intuitive” way.
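Here’s a quick worked example of the whole procedure (my own example, not one from the post): solve $\dod{y}{x} = xy$, so that $f(y) = \frac{1}{y}$ and $g(x) = x$.

```latex
\begin{gather*}
\frac{1}{y} \dif{y} = x \dif{x} \\
\int \frac{1}{y} \dif{y} = \int x \dif{x} \\
\ln|y| = \frac{x^2}{2} + C \\
y = A e^{x^2/2} \quad (A = \pm e^C)
\end{gather*}
```

Differentiating $A e^{x^2/2}$ gives $A x e^{x^2/2} = xy$, so the solution checks out.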
Let’s say we have a linear transformation $T$, and we want to find $T(\vecb{x})$. But what if transforming $\vecb{x}$ directly is difficult? Then we’ll write $T(\vecb{x}) = A\vecb{x}$, where $A$ is the associated matrix, and find its eigenvectors and eigenvalues: the $\vecb{v}$’s and $\lambda$’s such that $A\vecb{v} = \lambda\vecb{v}$.
If these eigenvectors happen to form a basis, then $\vecb{x}$ can be written as a linear combination of the eigenvectors, $c_1\vecb{v}_1 + \ldots + c_n\vecb{v}_n$, and then we can plug it back into our transformation.
$$\begin{align}
T(\vecb{x}) &= A\vecb{x} \\
&= A(c_1\vecb{v}_1 + \ldots + c_n\vecb{v}_n) \\
&= c_1A\vecb{v}_1 + \ldots + c_nA\vecb{v}_n \\
&= c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n \\
\end{align}$$
Depending on the transformation we’re working with, this new calculation could turn out to be a lot easier!
Update: But that’s kind of clunky. Let’s try to express the sum more simply:
$$ T(\vecb{x}) = c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n$$
Matrices were made to deal with large sets of data, so let’s define $P$ to be a matrix of eigenvectors (as column vectors) and $D$ to be a matrix of eigenvalues along the main diagonal (and zeros elsewhere):
$$
P = \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix} \qquad
D = \begin{bmatrix}
\lambda_1 & \\
& \ddots \\
&& \lambda_n
\end{bmatrix}$$
If we multiply the matrix $P$, the matrix $D$, and the coordinate vector $\vecb{c} = (c_1, \ldots, c_n)$, notice that we get
$$
\begin{align*}
PD\vecb{c} &= \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix}
\begin{bmatrix}
\lambda_1 & \\
& \ddots \\
&& \lambda_n
\end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\
&= \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix}\begin{bmatrix} \lambda_1 c_1 \\ \vdots \\ \lambda_n c_n \end{bmatrix} \\
&=c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n \\
\end{align*}
$$
which is the original sum.
But we still have those $c_k$’s in there. We know that
$$
\begin{align*}
\vecb{x} &= \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\
&= P\vecb{c}
\end{align*}
$$
so $\vecb{c} = P^{-1}\vecb{x}$ (provided $P$ is invertible).
Cool! Since we know that $T(\vecb{x}) = PD\vecb{c}$, we have our final expression:
$$ T(\vecb{x}) = A\vecb{x} = PDP^{-1} \vecb{x} $$
This is the diagonalization of $A$, where $D$ is the diagonal matrix of eigenvalues and $P$ is the change-of-basis matrix. Why would we do this? Working with diagonal matrices tends to be a lot easier than working with non-diagonal matrices. Again, depending on the transformation, this new calculation could be a lot easier than simply applying the original transformation.
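Here’s a small numeric sketch of this (the matrix is an arbitrary example, and the eigenpairs are precomputed by hand; the code doesn’t find them): we rebuild $A$ from $PDP^{-1}$, and see that powers of $A$ become cheap because $D^k$ just raises each diagonal entry.

```javascript
// Rebuilding A from P, D, and P^{-1} for a small, hand-picked example.
// For A = [[2, 1], [1, 2]], the eigenvalues are 3 and 1 with eigenvectors
// [1, 1] and [1, -1] (precomputed by hand; this sketch doesn't find them).
const matMul = (A, B) =>
  A.map((row) => B[0].map((_, j) => row.reduce((s, a, k) => s + a * B[k][j], 0)));

const P = [[1, 1], [1, -1]];
const D = [[3, 0], [0, 1]];
const Pinv = [[0.5, 0.5], [0.5, -0.5]]; // inverse of P, computed by hand

const A = matMul(matMul(P, D), Pinv);
console.log(A); // [[2, 1], [1, 2]]

// Powers become cheap: A^k = P D^k P^{-1}, and D^k just raises each
// diagonal entry to the k-th power.
const D5 = [[Math.pow(3, 5), 0], [0, Math.pow(1, 5)]];
const A5 = matMul(matMul(P, D5), Pinv);
console.log(A5); // [[122, 121], [121, 122]], i.e. A^5
```

Computing $A^5$ this way takes two matrix multiplications instead of five.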
So let’s say we’re working with forces in physics. If you look at some common forces, like gravity and the electromagnetic force, you’ll see that they tend to depend on position:
$$
\begin{gather}
\vecb{F}(\vecb{r}) = -\frac{GMm}{r^2} \hatb{r} \\
\vecb{F}(\vecb{r}) = \frac{kq_1q_2}{r^2} \hatb{r}
\end{gather}
$$
But force is defined using a time derivative of velocity:
$$ \vecb{F}(t) = m\vecb{a} = m\dod{\mathbf{v}}{t} $$
This is kind of inconvenient. The definition of force depends on time, but most forces are expressed with respect to position. Meanwhile, the positions of particles tend to vary with time (i.e., they move), making things more complicated. In order to work with moving particles experiencing forces, we would need to convert all of our times into positions or vice-versa.
It turns out that there’s an easier way. Someone clever decided to integrate both sides of Newton’s second law along the position to get these line integrals along a path $C$:
$$\begin{align}
\int_C \vecb{F} \cdot \dif{\vecb{r}} &= m \int_C \dod{\vecb{v}}{t} \cdot \dif{\vecb{r}} \\
&= m \int_{t_1}^{t_2} \dod{\vecb{v}}{t} \cdot \vecb{v}\dif{t} \\
&= m \int_C \vecb{v} \cdot \dif{\vecb{v}} \\
&= m \left( \int_{v_{x1}}^{v_{x2}} v_x \dif{v_x} + \int_{v_{y1}}^{v_{y2}} v_y \dif{v_y} + \int_{v_{z1}}^{v_{z2}} v_z \dif{v_z} \right) \\
&= \frac{1}{2}m \left( \left(v_{x2}^2 - v_{x1}^2\right) + \left(v_{y2}^2 - v_{y1}^2\right) + \left(v_{z2}^2 - v_{z1}^2\right) \right) \\
&= \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \\
\end{align}$$
And we’re left with the Work-Energy Theorem:
$$\int_C \vecb{F} \cdot \dif{\vecb{r}} = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2$$
Now, if we define the quantity $\frac{1}{2}mv^2$ to be the “kinetic energy” and the quantity $\int \vecb{F} \cdot \dif{\vecb{r}}$ to be the “work” (the cumulative effect of a force applied over a distance), then we can express the theorem intuitively: the work done on a system equals its change in kinetic energy. And now we can use it to solve problems without having to deal with times — only distances.
Let’s try a quick example. If gravity applies a force of 10 N to a box, initially at rest, over a distance of 10 m, what will its velocity be? In this case, $F = 10 \text{ N}$, $r_2 = 10 \text{ m}$, and $r_1 = 0 \text{ m}$. We can now plug in (replacing the line integral with a regular integral, since we’re moving in a straight line along the force):
$$\begin{gather}
\int_{r_1}^{r_2} F \dif{r} = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \\
F(r_2 - r_1) = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \\
\end{gather}$$
And from there, we can solve for velocity. Again, notice that we don’t care about the time it takes, just the distance over which a force acts. Cool! (What if we cared about the time it takes rather than the distance? Then we can use impulse, which comes from the definition of force.)
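A quick numeric sketch of that example (note: the post never specifies the box’s mass, so the $m = 5 \text{ kg}$ below is a made-up value):

```javascript
// Solving F * (r2 - r1) = (1/2) m v2^2 - (1/2) m v1^2 for v2.
// The example never gives the box's mass, so m = 5 kg is a made-up value here.
const F = 10;  // N
const r1 = 0;  // m
const r2 = 10; // m
const m = 5;   // kg (assumed)
const v1 = 0;  // m/s (starts at rest)

const v2 = Math.sqrt((2 * F * (r2 - r1)) / m + v1 * v1);
console.log(v2.toFixed(2) + ' m/s'); // ≈ 6.32 m/s, with no mention of time
```

Notice that time never appears anywhere in the calculation.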
There’s one problem with the Work-Energy Theorem: evaluating the line integral is annoying. Converting that line integral into a regular integral is fine for the example I just gave, but it won’t work if the path is not in the same direction as the force.
It turns out, though, that a lot of the forces we care about (gravity, springs, electromagnetic force) are conservative forces. This means that the line integral is path-independent: its value doesn’t depend on which path you take from point A to point B; it only matters where point A and point B are.
One analogy is with regular integrals. If you’re integrating from $x = 0$ to $x = 5$, you’ll always get the same answer as long as the ultimate endpoints are the same.
$$ \int_0^5 f(x) \dif{x} = \int_0^{-10} f(x) \dif{x} + \int_{-10}^7 f(x) \dif{x} + \int_7^5 f(x) \dif{x}$$
Along the same lines, for line integrals of conservative forces, you’ll always get the same answer no matter which path you take.
The path-independence of regular integrals is what makes the fundamental theorem of calculus possible:
$$ \int_a^b f(x) \dif{x} = F(b) - F(a)$$
It turns out that there’s an analogous version of the fundamental theorem for line integrals of conservative forces:
$$ \int_C \vecb{F}(\vecb{r}) \cdot \dif{\vecb{r}} = U(\vecb{r}_1) - U(\vecb{r}_2)$$ where $U$ is the scalar potential of the vector field $\vecb{F}$, the analogue of the “antiderivative.” (The sign is flipped relative to the fundamental theorem because, by convention, $\vecb{F} = -\nabla U$.) In other words, for every conservative force, we can define a potential function that we can use to evaluate its line integrals. This potential function has a value at every point in space. (Note that the value is a scalar, not a vector.) For example, for the force of gravity, $\vecb{F}(\vecb{r}) = -\frac{GMm}{r^2}\hatb{r}$ and $U(\vecb{r}) = -\frac{GMm}{r} + C$, where $C$ is the constant of integration, usually chosen so that $U(\infty) = 0$. Notice that $F_r = -\dod{U}{r}$ (or, in multiple dimensions, $\vecb{F} = -\nabla U$)!
So for a conservative force, the Work-Energy Theorem becomes:
$$U(\vecb{r}_1) - U(\vecb{r}_2) = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2$$ or
$$\frac{1}{2}mv_1^2 + U(\vecb{r}_1) = \frac{1}{2}mv_2^2 + U(\vecb{r}_2)$$
Let’s call $U$ the potential energy, and now we’ve obtained an expression for the conservation of mechanical energy. As long as we’re dealing only with conservative forces like gravity, the sum of the kinetic and potential energy stays the same no matter what you do.
So that’s how you get work, kinetic energy, and potential energy from forces. I hope they seem more connected now and less like a jumble of concepts that physics teachers throw at you! Leave a comment if it was helpful!
For example, here’s one: Consider 1.00 L of a 0.082 M solution of aqueous formic acid (HCO_{2}H), where $K_a = 1.78 \times 10^{-4}$. What are the equilibrium concentrations of HCO_{2}H, HCO_{2}^{–}, H_{3}O^{+}, and OH^{–}?
There are two relevant reactions (we know their equilibrium constants):
$$\begin{gather*}
2\,\text{H}_2\text{O} \rightleftharpoons \text{H}_3\text{O}^+ + \text{OH}^- \qquad K_w = 10^{-14} \\
\text{HCO}_2\text{H} + \text{H}_2\text{O} \rightleftharpoons \text{HCO}_2^- + \text{H}_3\text{O}^+ \qquad K_a = 1.78 \times 10^{-4}
\end{gather*}$$
We can set up an ICE table with the initial, change in, and equilibrium concentrations ([X] denotes concentration of chemical X):
| | [HCO_{2}H] | [HCO_{2}^{–}] | [H_{3}O^{+}] | [OH^{–}] |
| --- | --- | --- | --- | --- |
| initial | $$0.082$$ | $$0$$ | $$10^{-7}$$ | $$10^{-7}$$ |
| change | $$-x$$ | $$+x$$ | $$+(x + y)$$ | $$+y$$ |
| equilibrium | $$0.082-x$$ | $$x$$ | $$10^{-7}+x+y$$ | $$10^{-7}+y$$ |
And then we can plug these final concentrations into our two equilibrium constant formulas:
$$\begin{gather*}
K_w = [\text{H}_3\text{O}^+][\text{OH}^-] = 10^{-14} \\
K_a = \frac{[\text{HCO}_2^-][\text{H}_3\text{O}^+]}{[\text{HCO}_2\text{H}]} = 1.78 \times 10^{-4}
\end{gather*}$$
We can solve this system by either making some approximations or shoving it all into something like Mathematica. If we do that, we get [HCO_{2}H] = 0.0783 M, [HCO_{2}^{–}] = 0.00373 M, [H_{3}O^{+}] = 0.00373 M, and [OH^{–}] = 2.68 × 10^{-12} M.
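As a quick sanity check, we can plug those reported concentrations back into the two equilibrium-constant expressions and confirm that we recover $K_a$ and $K_w$:

```javascript
// Plug the reported equilibrium concentrations back into the two
// equilibrium-constant expressions; we should recover Ka and Kw.
const HCO2H = 0.0783;  // mol/L
const HCO2m = 0.00373; // mol/L (formate)
const H3O = 0.00373;   // mol/L
const OH = 2.68e-12;   // mol/L

const Ka = (HCO2m * H3O) / HCO2H;
const Kw = H3O * OH;

console.log(Ka); // ≈ 1.78e-4
console.log(Kw); // ≈ 1e-14
```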
But that’s annoying. Being a programmer, I thought it would be a good idea to try to write a program that numerically solves these problems. It was a fun distraction from my actual homework! It works by pushing each reaction in the right direction until its reaction quotient reaches its equilibrium constant. The result is below (Chrome works best — Firefox doesn’t support the HTML5 <input type="range"> element!):
Here is the code:
HTML
```html
<fieldset id="equilibria">
  <legend>Equilibrium simulator <button id="eq-start">Start</button></legend>
  <b>Reactions</b>
  <ul id="eq-equations"></ul>
  <b>Concentrations</b>
  <div id="eq-concentrations"></div>
</fieldset>
```
JavaScript
```javascript
// Initial concentrations of each chemical (in mol/L)
var concentrations = {
  'HCO2H': 0.082,
  'HCO2-': 0,
  'H3O+': 1e-7,
  'OH-': 1e-7
};

// The equations describing the equilibria and their equilibrium constants
var equilibria = [
  { 'left': ['H2O'], 'right': ['H3O+', 'OH-'], 'K': 1e-14 },
  { 'left': ['HCO2H', 'H2O'], 'right': ['HCO2-', 'H3O+'], 'K': 1.78e-4 },
];

// Whether the simulation is running
var going = false;

// Some helper functions
function p(x) { return -Math.log(x) / Math.LN10; }
function e(x) { return Math.pow(10, -x); }
function product(a, b) { return a * b; }
function min(a, b) { return a < b ? a : b; }
function max(a, b) { return a > b ? a : b; }

// Returns the concentration of a chemical
function c(chem) { return concentrations[chem]; }

// Returns true or false depending on whether the reaction quotient
// depends on this chemical (that is, does its activity = 1?)
function acts(chem) { return chem in concentrations; }

// Takes an equation and pushes it towards equilibrium
function equilibriate(eq) {
  var left = eq.left.filter(acts);
  var right = eq.right.filter(acts);

  // Calculate Q
  var quotient = right.map(c).reduce(product, 1) / left.map(c).reduce(product, 1);
  var reactants = quotient < eq.K ? left : right;
  var products = quotient < eq.K ? right : left;
  eq.updateQ(quotient);

  // Calculate how close we are to equilibrium and adjust the step size.
  // This formula is kind of hacked together, and it breaks on certain reactions.
  // Any suggestions for how to implement how big the step size should be?
  var closeness = Math.min(Math.abs(eq.K / quotient - 1), 1);
  var diff = (reactants.length
    ? reactants.map(c).reduce(min)
    : Math.pow(products.map(c).reduce(product), 1 / products.length)) / 2 * closeness * closeness;

  // Convert some reactants into products
  reactants.forEach(function (chem) { concentrations[chem] -= diff; });
  products.forEach(function (chem) { concentrations[chem] += diff; });
}

// Display all the outputs... this is some ugly code; don't judge!
// Write out each equilibrium reaction
equilibria.forEach(function (eq, i) {
  var li = document.createElement('li');
  var quotient = document.createElement('span');
  li.appendChild(document.createTextNode(eq.left.join(' + ') + ' \u21C4 ' +
    eq.right.join(' + ') + '; K = ' + eq.K + ', Q = '));
  li.appendChild(quotient);
  document.getElementById('eq-equations').appendChild(li);
  eq.updateQ = function (q) { quotient.innerHTML = q; }
});

// Make all the concentration outputs
var updateConcentration = {};
Object.getOwnPropertyNames(concentrations).forEach(function (chem) {
  var display = document.createElement('span');
  var slider = document.createElement('input');
  slider.type = 'range';
  slider.min = 0;
  slider.max = 14;
  slider.step = 0.1;
  slider.onchange = function () { concentrations[chem] = e(this.value); };
  var div = document.createElement('div');
  div.appendChild(document.createTextNode('[' + chem + '] = '));
  div.appendChild(display);
  div.appendChild(document.createTextNode(' mol/L'));
  div.appendChild(document.createElement('br'));
  div.appendChild(slider);
  updateConcentration[chem] = function () {
    display.innerHTML = c(chem);
    slider.value = p(c(chem));
  };
  document.getElementById('eq-concentrations').appendChild(div);
});

document.getElementById('eq-start').onclick = function () {
  this.innerHTML = going ? 'Start' : 'Stop';
  going = !going;
};

// Start the simulation!
setInterval(function () {
  if (going) equilibria.forEach(equilibriate);
  for (var chem in concentrations) updateConcentration[chem]();
}, 50);
```
I think it’s pretty cool that it gets the right answers! The thing I ran into the most trouble with was how much reactant to convert to product at each step — it’s hard to pick one because the concentrations span many orders of magnitude. If anyone has any ideas, I’d love to know in the comments below!
But I think that it’s easy to get lost in the details and overlook one crucial piece of logic. It’s the basis of all algebra, but it’s a kind of hidden fact that becomes important in more complicated problems.
Let’s say we have the equation $2x + 1 = 5$, and we’re trying to find all possible values of $x$.
By instinct, one might jump directly into subtracting $1$ from both sides and dividing both sides by $2$:
$$\begin{gather*}
2x + 1 = 5 \\
2x + 1 - 1 = 5 - 1 \\
2x = 4 \\
\frac{2x}{2} = \frac{4}{2} \\
x = 2 \\
\end{gather*}$$
But here’s why the algebra works: these clauses are linked by “if and only if”s.
$2x + 1$ equals $5$ if and only if $2x$ equals $4$, which is true if and only if $x = 2$. Therefore, we can conclude that $2x + 1 = 5$ if and only if $x = 2$.
One way to look at it is that we’re replacing the original equation with equivalent forms. $2x + 1 = 5$ says exactly the same information as $x = 2$, just written slightly differently. That’s really the whole idea of algebra (and logic) — replacing one statement with another exactly equivalent until it’s obvious what your result is.
That’s why, in general, it’s not a productive idea to multiply both sides by zero: $2x + 1 = 5$ is not exactly equivalent to $0 \cdot (2x + 1) = 0 \cdot 5$ because the latter is true no matter what $x$ is. Therefore, we’ve lost some information in multiplying by zero.
“If and only if” is abbreviated $\Leftrightarrow$, and I wish that I was taught algebra with that notation. It’d be easier to remember that $\sqrt{x^2} = 1$ is not exactly equivalent to $x = 1$ (it can be $-1$ too!) if I had that symbol to remind me.
The first thing I needed to do was to find the URLs of all the photos. I wrote this quick JavaScript snippet that does that. To get the URLs, I navigate to the page with the album, open the console (Ctrl-Shift-J in Chrome), and paste in this code:
```javascript
console.log(Array.prototype.map.call(
  document.querySelectorAll('table.uiGrid.fbPhotosGrid a.uiMediaThumb'),
  function (a) {
    // The URL to the photo is within the 'ajaxify' attribute of each thumbnail link
    // Construct an 'a' element to parse the query string
    var b = document.createElement('a');
    b.href = a.getAttribute('ajaxify');
    var params = b.search.replace(/^\?/, '').split('&');
    for (var i = 0; i < params.length; i++)
      if (params[i].indexOf('src') == 0)
        return unescape(params[i].split('=')[1]);
    return '';
  }).join('\n'));
```
I hit enter and get a list of URLs:
Now that I have the URLs, I pasted them into a file named `urls`. I made a directory named `album` and then wrote this simple PHP script that downloads the files and then compresses them into a ZIP file.
```php
<?php
$fp = fopen('urls', 'r');
$i = 1;
while ($url = fgets($fp)) {
    file_put_contents("album/$i.jpg", file_get_contents($url));
    echo "Downloaded $i\n";
    $i++;
}
shell_exec('zip album.zip album/*');
```
And now, I have a file named `album.zip` with all the photos!
The Dirichlet function is the indicator function of the rational numbers:
$$D(x) = \begin{cases}1 & \text{ if } x \in \mathbb{Q} \\ 0 & \text{ otherwise }\end{cases}$$
That’s cool, but what’s even cooler is that there exists this expression for the Dirichlet function:
$$D(x) = \lim_{m \to \infty} \lim_{n \to \infty} [\cos(m! \pi x)]^{2n}$$
Unfortunately, neither Wikipedia nor Wolfram’s web site explains why this is equivalent to the Dirichlet function, so I spent a long time thinking about it. Here’s what I’ve come up with.
Let’s consider a simplified version first. Consider the function $f(x) = \cos^2\pi x$. It equals $1$ whenever $x$ is an integer, and a nonnegative value strictly less than $1$ otherwise. This is simple to see by graphing the function.
Now, if you exponentiate the result over and over (i.e., take $\lim_{n \to \infty} f(x)^n$), you end up with two outcomes. If $x$ is an integer, then $f(x) = 1$, and $1$ raised to any power is still $1$. If $x$ is not an integer, then $f(x) < 1$, so the powers shrink with every exponentiation and approach $0$ in the limit. We’ve basically created an indicator function for integers:

$$\operatorname{isInteger}(x) = \lim_{n \to \infty} (\cos\pi x)^{2n} = \begin{cases}1 & \text{ if } x \in \mathbb{Z} \\ 0 & \text{ otherwise }\end{cases}$$

Now we can rewrite the Dirichlet function in terms of this $\operatorname{isInteger}$ function:

$$D(x) = \lim_{m \to \infty} \operatorname{isInteger}(m! x)$$

This is the especially cool part: if $x$ is rational, then $m! x$ is an integer for every sufficiently large $m$. A rational $x$ can be written as a fraction of integers $\frac{p}{q}$, and once $m \geq q$, the $q$ cancels with one of the factors in $m!$, making the whole product $m! x$ an integer. If $x$ is irrational, this never happens, and $m! x$ is never an integer. Visually, for rational numbers (because $x = \frac{p}{q}$):

$$m! x = \frac{p}{\not{q}} \cdot (1 \cdots \not{q} \cdots m)$$
Now if we plug the either-integer-or-non-integer result into our $\operatorname{isInteger}$ function, what results is an indicator function for rationality — the Dirichlet function.
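We can even check this numerically with finite stand-ins for the two limits (capping $m$ and $n$ makes this a heuristic approximation, not the true limit):

```javascript
// Approximating D(x) with finite m and n. Capping the limits makes this a
// heuristic check (floating point would break down for much larger m).
function factorial(m) {
  let r = 1;
  for (let i = 2; i <= m; i++) r *= i;
  return r;
}

function dirichletApprox(x, m = 10, n = 1000) {
  return Math.pow(Math.cos(factorial(m) * Math.PI * x), 2 * n);
}

console.log(dirichletApprox(1 / 3));      // rational input: ≈ 1
console.log(dirichletApprox(Math.SQRT2)); // irrational input: ≈ 0
```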
This is a Markov chain text generator. It takes a source document and records the probability that certain words appear after each word. For example, if the sentence is “I think that I will think of the will that I wrote,” it would record that “I” is followed by “think” a third of the time, “will” a third of the time, and “wrote” a third of the time. “Think” is followed by “that” and “of” half of the time each, and “that” is followed by “I” a hundred percent of the time, etc.
Visually:
Once the program records the probability, it can generate realistic-sounding texts. It’ll start with the root node (I) and follow random paths until it reaches a node that doesn’t go anywhere. For example, a text it might generate is: “I will think of the will that I will think that I wrote.” Obviously, it doesn’t sound perfect, but if you feed it more source text, it usually produces some pretty hilarious text.
Here’s the demo. Paste some source text into the first box, click Add, and repeat. Then, click Generate, and it should produce some Markov chain text! You can keep clicking Generate for new text. I’ve prepopulated the source textbox with some text about lions from Wikipedia, but feel free to try it with something else!
Sadly, the generator wasn’t able to produce a usable essay for my college applications.
Here’s the HTML:
```html
<textarea id="in" rows="15" cols="50"></textarea><br />
<button id="add">Add</button>
<button id="generate" disabled="disabled">Generate</button><br />
<textarea id="out" rows="15" cols="50"></textarea>
```
And here’s the JavaScript to go with it:
```javascript
// Returns the source text
function get() { return document.getElementById('in').value; }
// Clears the source textbox
function clear() { document.getElementById('in').value = ''; }
// Writes to the output textbox
function set(v) { document.getElementById('out').value = v; }

// Holds the state information
var cache = { '_START': [] };

document.getElementById('add').onclick = function () {
  // Get the source text and split it into words
  var text = get().split(/\s+/g);
  if (!text.length) return;
  document.getElementById('generate').disabled = false;

  // Add it to the start node's list of possibilities
  cache['_START'].push(text[0]);

  // Now go through each word and add it to the previous word's node
  for (var i = 0; i < text.length - 1; i++) {
    if (!cache[text[i]]) cache[text[i]] = [];
    cache[text[i]].push(text[i + 1]);
    // If it's the beginning of a sentence, add the next word to the start node too
    if (text[i].match(/\.$/)) cache['_START'].push(text[i + 1]);
  }
  clear();
};

document.getElementById('generate').onclick = function () {
  // Start with the root node
  var currentWord = '_START';
  var str = '';

  // Generate 300 words of text
  for (var i = 0; i < 300; i++) {
    // Follow a random node, append the word to the string, and move to that node
    var rand = Math.floor(Math.random() * cache[currentWord].length);
    var next = cache[currentWord][rand];
    str += next;

    // No more nodes to follow? Start again. (Add a period to make things look better.)
    if (!cache[next]) {
      currentWord = '_START';
      if (!next.match(/\.$/)) str += '. ';
      else str += ' ';
    } else {
      currentWord = next;
      str += ' ';
    }
  }
  set(str);
};
```
It turns out that there’s a simple closed-form expression for the pointwise minimum of two functions:
$$\operatorname{min}(f(x),\ g(x)) = \frac{f(x) + g(x) - |f(x) - g(x)|}{2}$$
For example, take $f(x) = x^2$ and $g(x) = x + 2$. Then,
$$\operatorname{min}(x^2,\ x + 2) = \frac{x^2 + x + 2 - |x^2 - (x + 2)|}{2}$$
whose graph looks like this:
As you can see, it takes on the shape of $f(x)$ (the parabola) when $f(x)$ is smaller and $g(x)$ (the linear part) when $g(x)$ is smaller.
This can be explained by splitting the fraction for the expression up:
$$\begin{align}
\operatorname{min}(f(x),\ g(x)) &= \frac{f(x) + g(x) - |f(x) - g(x)|}{2} \\
&= \frac{f(x) + g(x)}{2} - \frac{|f(x) - g(x)|}{2}
\end{align}$$
You can see that the first term in the split formula is the average of the two functions, and the second term is half the distance (absolute value) between the two functions. When you subtract half the distance between the two functions from the average of the two functions, you always get the smaller function.
What would happen if you add the distance between the two functions instead of subtracting? You get the greater of the two functions:
$$\operatorname{max}(f(x),\ g(x)) = \frac{f(x) + g(x) + |f(x) – g(x)|}{2}$$
I don’t remember where I found this, but it’s pretty awesome!
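Both formulas are easy to sanity-check numerically; here’s a quick sketch using the $f(x) = x^2$ and $g(x) = x + 2$ example from above:

```javascript
// Checking the min and max formulas against Math.min/Math.max at a few
// sample points, using f(x) = x^2 and g(x) = x + 2 from the example.
const f = (x) => x * x;
const g = (x) => x + 2;
const minF = (x) => (f(x) + g(x) - Math.abs(f(x) - g(x))) / 2;
const maxF = (x) => (f(x) + g(x) + Math.abs(f(x) - g(x))) / 2;

for (const x of [-3, 0, 1, 5]) {
  console.log(x, minF(x) === Math.min(f(x), g(x)), maxF(x) === Math.max(f(x), g(x)));
}
```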