## Proof of separation of variables

Let’s say we have a differential equation:
$$f(y) \dod{y}{x} = g(x)$$

Separation of variables says that we can simply “split” the derivative $\dod{y}{x}$ and write
$$f(y) \dif{y} = g(x) \dif{x}$$

Then, we can integrate both sides: $$\begin{gather*} \int f(y) \dif{y} = \int g(x) \dif{x} \\ F(y) = G(x) + C \end{gather*}$$ where $F(y)$ and $G(x)$ are antiderivatives of $f(y)$ and $g(x)$ respectively. And voilà, we can use algebra to solve the differential equation.

But the “splitting” part always bugged me. Now my professor has explained why this is okay to do. Consider the antiderivative of $f(y)$, and differentiate it with respect to $x$.

We know that
\begin{align*} \dod{}{x}F(y) &= \dod{}{y}F(y) \cdot \dod{y}{x} \text{ by the chain rule} \\ &= f(y) \dod{y}{x} \end{align*} because $F(y)$ is the antiderivative of $f(y)$, so the derivative of $F(y)$ with respect to $y$ must be the function $f(y)$ itself.

Now, substituting into the differential equation for $f(y) \dod{y}{x}$, we have
$$\dod{}{x}F(y) = g(x)$$

This is saying that we have two functions of $x$ that are equal: $g(x)$, and $h(x) = \dod{}{x}F(y)$. Now, if we integrate both functions with respect to $x$, we should get the same antiderivative, up to a constant, so
$$\int \left[\dod{}{x}F(y)\right] \dif{x} = G(x) + C$$

The antiderivative of a derivative of a function is just the function itself (up to a constant), so
$$F(y) = G(x) + C$$ which is the result we get from doing it the “intuitive” way.
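As a sanity check, here is a small Python sketch that solves one separable equation numerically and confirms that $F(y) - G(x)$ really does stay constant along the solution. The particular choice $f(y) = 2y$, $g(x) = \cos x$ (so $F(y) = y^2$ and $G(x) = \sin x$), the starting point $y(0) = 1$, and the hand-rolled Runge-Kutta stepper are all my own illustrative assumptions, not part of the proof.

```python
import math

# Illustrative choice (an assumption): f(y) = 2y, g(x) = cos(x),
# with antiderivatives F(y) = y^2 and G(x) = sin(x).
def f(y): return 2.0 * y
def g(x): return math.cos(x)
def F(y): return y * y
def G(x): return math.sin(x)

def slope(x, y):
    """dy/dx from f(y) dy/dx = g(x)."""
    return g(x) / f(y)

def rk4(x, y, x_end, steps):
    """Classic fourth-order Runge-Kutta integration from x to x_end."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + h / 2, y + h * k1 / 2)
        k3 = slope(x + h / 2, y + h * k2 / 2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

x0, y0 = 0.0, 1.0
C = F(y0) - G(x0)             # constant fixed by the starting point
y1 = rk4(x0, y0, 1.0, 1000)   # numerical solution at x = 1
print(F(y1) - G(1.0), C)      # the two numbers should agree closely
```

The numerical solver never "splits" the derivative, yet the relation $F(y) = G(x) + C$ still holds at the end, which is what the proof predicts.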

March 15, 2013, 1:45am by Casey

## Why eigenvectors and eigenvalues?

I took linear algebra in high school, and one thing that really confused me was eigenvectors and eigenvalues. Why are they so important? I’m taking linear algebra again, and my professor made it very clear why.

Let’s say we have a linear transformation $T$, and we want to find $T(\vecb{x})$. But what if transforming $\vecb{x}$ is difficult? Then, we’ll say that $T(\vecb{x}) = A\vecb{x}$, where $A$ is the associated matrix, and we’ll find its eigenvectors and eigenvalues, the $\vecb{v}$’s and $\lambda$’s such that $A\vecb{v} = \lambda\vecb{v}$.

If these eigenvectors happen to form a basis, then $\vecb{x}$ can be written as a linear combination of the eigenvectors, $c_1\vecb{v}_1 + \ldots + c_n\vecb{v}_n$, and then we can plug it back into our transformation.

\begin{align} T(\vecb{x}) &= A\vecb{x} \\ &= A(c_1\vecb{v}_1 + \ldots + c_n\vecb{v}_n) \\ &= c_1A\vecb{v}_1 + \ldots + c_nA\vecb{v}_n \\ &= c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n \\ \end{align}
Depending on the transformation we’re working with, this new calculation could turn out to be a lot easier!
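Here’s a small worked example. The matrix and the vector are just ones I picked for illustration. Take
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad \vecb{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix},\ \lambda_1 = 3, \qquad \vecb{v}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix},\ \lambda_2 = 1$$
and $\vecb{x} = \begin{bmatrix} 3 \\ 1 \end{bmatrix} = 2\vecb{v}_1 + 1\vecb{v}_2$, so $c_1 = 2$ and $c_2 = 1$. Then
\begin{align*} T(\vecb{x}) &= c_1\lambda_1\vecb{v}_1 + c_2\lambda_2\vecb{v}_2 \\ &= 2 \cdot 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1 \cdot 1 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \\ &= \begin{bmatrix} 7 \\ 5 \end{bmatrix} \end{align*}
which matches the direct product $A\vecb{x} = \begin{bmatrix} 2 \cdot 3 + 1 \cdot 1 \\ 1 \cdot 3 + 2 \cdot 1 \end{bmatrix} = \begin{bmatrix} 7 \\ 5 \end{bmatrix}$.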

Update: But that’s kind of clunky. Let’s try to express more simply the sum
$$T(\vecb{x}) = c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n$$

Matrices were made to deal with large sets of data, so let’s define $P$ to be a matrix of eigenvectors (as column vectors) and $D$ to be a matrix of eigenvalues along the main diagonal (and zeros elsewhere):
$$P = \begin{bmatrix} | & & | \\ \vecb{v}_1 & \ldots & \vecb{v}_n \\ | & & | \\ \end{bmatrix} \qquad D = \begin{bmatrix} \lambda_1 & \\ & \ddots \\ && \lambda_n \end{bmatrix}$$

If we multiply the matrix $P$, the matrix $D$, and the vector $\vecb{c}$ of coefficients $c_1, \ldots, c_n$, notice that we get
\begin{align*} PD\vecb{c} &= \begin{bmatrix} | & & | \\ \vecb{v}_1 & \ldots & \vecb{v}_n \\ | & & | \\ \end{bmatrix} \begin{bmatrix} \lambda_1 & \\ & \ddots \\ && \lambda_n \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\ &= \begin{bmatrix} | & & | \\ \vecb{v}_1 & \ldots & \vecb{v}_n \\ | & & | \\ \end{bmatrix}\begin{bmatrix} \lambda_1 c_1 \\ \vdots \\ \lambda_n c_n \end{bmatrix} \\ &=c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n \\ \end{align*}
which is the original sum.

But we still have those $c_k$’s in there. We know that
\begin{align*} \vecb{x} &= \begin{bmatrix} | & & | \\ \vecb{v}_1 & \ldots & \vecb{v}_n \\ | & & | \\ \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\ &= P\vecb{c} \end{align*}
so $\vecb{c} = P^{-1}\vecb{x}$ (provided $P$ is invertible).

Cool! Since we know that $T(\vecb{x}) = PD\vecb{c}$, we have our final expression:
$$T(\vecb{x}) = A\vecb{x} = PDP^{-1} \vecb{x}$$

This is the diagonalization of $A$, where $D$ is the diagonalized matrix and $P$ is the change-of-basis matrix. Why would we do this? Working with diagonal matrices tends to be a lot easier than working with non-diagonal matrices. Again, depending on the transformation, this new calculation could be a lot easier than simply applying the original transformation.
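One common place where the diagonal form pays off is applying the same transformation many times, since $D^k$ just raises each eigenvalue to the $k$th power. The Python sketch below checks this on a 2×2 example; the matrix, the vector, the power `k = 5`, and the helper names `matvec` and `inverse_2x2` are all assumptions I made up for illustration.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative example (an assumption): A has eigenvectors (1, 1) and (1, -1)
# with eigenvalues 3 and 1.
A = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]      # eigenvectors as columns
eigenvalues = [3, 1]       # diagonal of D
x = [3, 1]
k = 5

# Route 1: P D^k P^{-1} x
c = matvec(inverse_2x2(P), x)                               # coordinates in the eigenbasis
scaled = [lam ** k * ci for lam, ci in zip(eigenvalues, c)]  # D^k c
via_eigen = matvec(P, scaled)                                # back to the standard basis

# Route 2: multiply by A, k times
direct = x
for _ in range(k):
    direct = matvec(A, direct)

print(via_eigen, direct)
```

Both routes give the same vector, but the first one needs only one change of basis each way no matter how large `k` gets.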

March 5, 2013, 11:08am by Casey

## Beyond forces: work, kinetic energy, and potential energy

I learned about stuff like work and kinetic energy in high school, but I never really understood why these concepts were introduced; it felt like they just fell out of the sky. That changed today in physics lecture, and I think it’s so cool to understand where exactly these concepts come from.

So let’s say we’re working with forces in physics. If you look at some common forces, like gravity and the electromagnetic force, we see that forces tend to depend on position:

$$\begin{gather} \vecb{F}(\vecb{r}) = -\frac{GMm}{r^2} \hatb{r} \\ \vecb{F}(\vecb{r}) = \frac{kq_1q_2}{r^2} \hatb{r} \end{gather}$$

But force is defined using a time derivative of velocity:
$$\vecb{F}(t) = m\vecb{a} = m\dod{\mathbf{v}}{t}$$

This is kind of inconvenient. The definition of force depends on time, but most forces are expressed with respect to position. Meanwhile, the positions of particles tend to vary with time (i.e., they move), making things more complicated. In order to work with moving particles experiencing forces, we would need to convert all of our times into positions or vice-versa.

It turns out that there’s an easier way. Someone clever decided to integrate both sides of Newton’s second law along the position to get these line integrals along a path $C$:
\begin{align} \int_C \vecb{F} \cdot \dif{\vecb{r}} &= m \int_C \dod{\vecb{v}}{t} \cdot \dif{\vecb{r}} \\ &= m \int_{t_1}^{t_2} \dod{\vecb{v}}{t} \cdot \vecb{v}\dif{t} \\ &= m \int_C \vecb{v} \cdot \dif{\vecb{v}} \\ &= m \left( \int_{v_{x1}}^{v_{x2}} v_x \dif{v_x} + \int_{v_{y1}}^{v_{y2}} v_y \dif{v_y} + \int_{v_{z1}}^{v_{z2}} v_z \dif{v_z} \right) \\ &= \frac{1}{2}m \left( \left(v_{x2}^2 - v_{x1}^2\right) + \left(v_{y2}^2 - v_{y1}^2\right) + \left(v_{z2}^2 - v_{z1}^2\right) \right) \\ &= \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \end{align}

And we’re left with the Work-Energy Theorem:
$$\int_C \vecb{F} \cdot \dif{\vecb{r}} = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2$$

Now, if we define the quantity $\frac{1}{2}mv^2$ to be the “kinetic energy” and the quantity $\int \vecb{F} \cdot \dif{\vecb{r}}$ to be the “work” (the “total” force applied to a system over a distance), then we can express the theorem intuitively: work applied to a system increases the kinetic energy. And now we can use it to solve problems without having to deal with times — only distances.

Let’s try a quick example. If gravity applies a constant force of 10 N to a box of mass $m$, starting from rest, over a distance of 10 m, what will its speed be? In this case, $F = 10 \text{ N}$, $r_2 = 10 \text{ m}$, $r_1 = 0 \text{ m}$, and $v_1 = 0$. We can now plug in (we can replace the line integral with a regular integral, assuming we’re moving in a straight line along the force):
$$\begin{gather} \int_{r_1}^{r_2} F \dif{r} = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \\ F(r_2 - r_1) = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2 \end{gather}$$
And from there, we can solve for velocity. Again, notice that we don’t care about the time it takes, just the distance over which a force acts. Cool! (What if we cared about the time it takes rather than the distance? Then we can use impulse, which comes from the definition of force.)
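To put numbers on it, here is a short Python sketch. The example never states the box’s mass, so `mass = 1.0` (in kg) is purely an assumption I’m making to get a concrete answer; a different mass gives a different speed. The second half redoes the problem the long way, through time, just to show both routes agree.

```python
import math

force = 10.0      # N, from the example
distance = 10.0   # m, from the example
mass = 1.0        # kg -- an assumption; the example does not give the mass
v_initial = 0.0   # the box starts from rest

# Work-energy route: F * (r2 - r1) = (1/2) m v2^2 - (1/2) m v1^2
work = force * distance
v_final = math.sqrt(v_initial ** 2 + 2.0 * work / mass)

# Time-based route for comparison: constant acceleration from rest
acceleration = force / mass
time_taken = math.sqrt(2.0 * distance / acceleration)
v_from_time = v_initial + acceleration * time_taken

print(v_final, v_from_time)
```

With the assumed 1 kg mass, both routes give $\sqrt{200} \approx 14.1$ m/s, but only the second one ever had to work out how long the fall takes.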

There’s one problem with the Work-Energy Theorem: evaluating the line integral is annoying. Converting that line integral into a regular integral is fine for the example I just gave, but it won’t work if the path is not in the same direction as the force.

It turns out, though, that a lot of the forces we care about (gravity, springs, electromagnetic force) are conservative forces. This means that the line integral is path-independent: its value doesn’t depend on which path you take from point A to point B; it only matters where point A and point B are.

One analogy is with regular integrals. If you’re integrating from $x = 0$ to $x = 5$, you’ll always get the same answer as long as the ultimate endpoints are the same.
$$\int_0^5 f(x) \dif{x} = \int_0^{-10} f(x) \dif{x} + \int_{-10}^7 f(x) \dif{x} + \int_7^5 f(x) \dif{x}$$
Along the same lines, for line integrals of conservative forces, you’ll always get the same answer no matter which path you take.
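Path independence is easy to check numerically. The Python sketch below approximates the line integral along two different routes between the same endpoints. The spring-like force $\vecb{F} = -k\vecb{r}$, the "swirl" force $(-y, x)$ used as a non-conservative contrast, the endpoints, and the `work` helper are all my own illustrative assumptions.

```python
def work(force, path, steps=1000):
    """Approximate the line integral of `force` along the straight segments
    joining consecutive points of `path`, using the midpoint rule."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for i in range(steps):
            t = (i + 0.5) / steps   # midpoint of the i-th sub-step
            fx, fy = force(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            total += fx * dx + fy * dy
    return total

def spring(x, y, k=1.0):
    """Conservative: a spring anchored at the origin, F = -k r."""
    return (-k * x, -k * y)

def swirl(x, y):
    """Not conservative: a force that circulates around the origin."""
    return (-y, x)

# Two routes from (1, 0) to (0, 2): a straight line and a detour via the origin.
straight = [(1.0, 0.0), (0.0, 2.0)]
bent = [(1.0, 0.0), (0.0, 0.0), (0.0, 2.0)]

spring_straight = work(spring, straight)
spring_bent = work(spring, bent)
swirl_straight = work(swirl, straight)
swirl_bent = work(swirl, bent)

print(spring_straight, spring_bent)  # same work along both routes
print(swirl_straight, swirl_bent)    # different work along each route
```

For the spring force the two routes agree, while for the swirl force they don’t, so only the first kind of force lets us forget about the path.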

The path-independence of regular integrals is what makes the fundamental theorem of calculus possible:
$$\int_a^b f(x) \dif{x} = F(b) - F(a)$$

It turns out that there’s an analogous version of the fundamental theorem for line integrals of conservative forces:
$$\int_C \vecb{F}(\vecb{r}) \cdot \dif{\vecb{r}} = U(\vecb{r}_1) - U(\vecb{r}_2)$$ where $U$ is (up to a sign) the “antiderivative”: the scalar potential of the vector field $\vecb{F}$. The sign is a convention, chosen so that a force doing positive work lowers $U$. In other words, for every conservative force, we can define a potential function that we can use to evaluate line integrals of that force. This potential function will have a value for every point in space. (Note that the value will be a scalar and not a vector.) For example, for the force of gravity, $\vecb{F}(\vecb{r}) = -\frac{GMm}{r^2}\hatb{r}$ (it points back toward the other mass), and $U(\vecb{r}) = -\frac{GMm}{r} + C$, where $C$ is the constant of integration, here usually taken to be zero so that $U(\infty) = 0$. Notice that $-\dod{U}{r} = F$ (or the multidimensional equivalent, $-\nabla{U} = \vecb{F}$)!

So for a conservative force, the Work-Energy Theorem becomes:
$$U(\vecb{r}_1) - U(\vecb{r}_2) = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2$$ or
$$\frac{1}{2}mv_1^2 + U(\vecb{r}_1) = \frac{1}{2}mv_2^2 + U(\vecb{r}_2)$$
Let’s call $U$ the potential energy, and now we’ve obtained an expression for the conservation of mechanical energy. As long as we’re talking about conservative forces like gravity, the sum of the kinetic and potential energy stays the same no matter what you do.

So that’s how you get work, kinetic energy, and potential energy from forces. I hope they seem more connected now and less like a jumble of concepts that physics teachers throw at you! Leave a comment if it was helpful!

February 26, 2013, 5:01pm by Casey

## Equilibrium simulator

I’m taking a chemistry class right now on acid-base equilibria, and one problem that keeps getting assigned is one where you’re given some initial concentrations, and they want you to find the equilibrium concentrations.

For example, here’s one: Consider 1.00 L of a 0.082 M solution of aqueous formic acid (HCO2H), where $K_a= 1.78 \times 10^{-4}$. What are the equilibrium concentrations of HCO2H, HCO2-, H3O+, and OH-?

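There are two relevant reactions (we know their equilibrium constants): the dissociation of the acid, with constant $K_a$, and the autoionization of water, with constant $K_w = 1.0 \times 10^{-14}$:
$$\begin{gather*} \text{HCO}_2\text{H} + \text{H}_2\text{O} \rightleftharpoons \text{HCO}_2^- + \text{H}_3\text{O}^+ \\ 2\,\text{H}_2\text{O} \rightleftharpoons \text{H}_3\text{O}^+ + \text{OH}^- \end{gather*}$$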

We can set up an ICE table with the initial, change in, and equilibrium concentrations ([X] denotes concentration of chemical X):

| | [HCO2H] | [H3O+] | [HCO2-] | [OH-] |
|---|---|---|---|---|
| initial | $0.082$ | $10^{-7}$ | $0$ | $10^{-7}$ |
| change | $-x$ | $x + y$ | $x$ | $y$ |
| equilibrium | $0.082-x$ | $10^{-7}+x+y$ | $x$ | $10^{-7}+y$ |

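And then we can plug these final concentrations into our two equilibrium constant formulas:
$$\begin{gather*} K_a = \frac{[\text{HCO}_2^-][\text{H}_3\text{O}^+]}{[\text{HCO}_2\text{H}]} = \frac{x\,(10^{-7}+x+y)}{0.082-x} = 1.78 \times 10^{-4} \\ K_w = [\text{H}_3\text{O}^+][\text{OH}^-] = (10^{-7}+x+y)(10^{-7}+y) = 1.0 \times 10^{-14} \end{gather*}$$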

We can solve this system by either making some approximations or shoving it all into something like Mathematica. If we do that, we get [HCO2H] = 0.0783 M, [HCO2-] = 0.00373 M, [H3O+] = 0.00373 M, and [OH-] = $2.68 \times 10^{-12}$ M.

But that’s annoying. Being a programmer, I thought it would be a good idea to try and write a program that numerically solves these problems. It was a fun distraction from my actual homework! It works by pushing each reaction in the right direction until its reaction quotient equals its equilibrium constant. The result is below (Chrome is best; Firefox doesn’t support the HTML5 `<input type="range">` element!):

[Interactive equilibrium simulator: Reactions and Concentrations panels]
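The idea of pushing each reaction until its reaction quotient matches its equilibrium constant can be sketched in a few lines of Python. To be clear, this is not the simulator’s actual code, just a minimal illustration of the same idea. The bisection search for each reaction’s extent, the repeated sweeps over the reactions, the `cap` on how far a reaction with no listed reactants may go, and every function name are my assumptions. Water is left out of the reactions because its concentration is treated as constant.

```python
from math import prod

def residual(conc, reaction, extent):
    """(product concentrations) - K * (reactant concentrations) after the
    reaction advances by `extent`; zero exactly when Q equals K."""
    stoich, K = reaction
    products = prod((conc[s] + n * extent) ** n for s, n in stoich.items() if n > 0)
    reactants = prod((conc[s] + n * extent) ** (-n) for s, n in stoich.items() if n < 0)
    return products - K * reactants

def push(conc, reaction, cap=1.0, steps=200):
    """Advance one reaction (forward or backward) until Q = K, by bisection
    over the range of extents that keeps every concentration non-negative."""
    stoich, _ = reaction
    lo = max((-conc[s] / n for s, n in stoich.items() if n > 0), default=-cap)
    hi = min((conc[s] / -n for s, n in stoich.items() if n < 0), default=cap)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if residual(conc, reaction, mid) > 0:
            hi = mid   # overshot: Q > K, so back off
        else:
            lo = mid   # not far enough: Q <= K
    extent = 0.5 * (lo + hi)
    for s, n in stoich.items():
        conc[s] += n * extent

Ka, Kw = 1.78e-4, 1.0e-14
reactions = [
    ({"HCO2H": -1, "HCO2-": 1, "H3O+": 1}, Ka),   # acid dissociation
    ({"H3O+": 1, "OH-": 1}, Kw),                  # water autoionization
]
conc = {"HCO2H": 0.082, "HCO2-": 0.0, "H3O+": 1e-7, "OH-": 1e-7}

# Each push disturbs the other reaction slightly, so sweep until it settles.
for _ in range(50):
    for reaction in reactions:
        push(conc, reaction)

for species, value in conc.items():
    print(f"[{species}] = {value:.3g} M")
```

The printed concentrations should come out close to the values from solving the system of equations directly.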

February 13, 2013, 8:49pm by Casey

## Why does algebra work?

When I was in middle school, my teachers taught me the mechanics of algebra: try to isolate the variable by adding, subtracting, multiplying, and dividing both sides. And then in later years, my teachers expanded that toolbox to include stuff like logarithms and whatnot.

But I think that it’s easy to get lost in the details and overlook one crucial piece of logic. It’s the basis of all algebra, but it’s a kind of hidden fact that becomes important in more complicated problems.

Let’s say we have the equation $2x + 1 = 5$, and we’re trying to find all possible values of $x$.

By instinct, one might jump directly into subtracting $1$ from both sides and dividing both sides by $2$:

$$\begin{gather*} 2x + 1 = 5 \\ 2x + 1 - 1 = 5 - 1 \\ 2x = 4 \\ \frac{2x}{2} = \frac{4}{2} \\ x = 2 \end{gather*}$$

But here’s why the algebra works: these clauses are linked by “if and only if”s.

$2x + 1$ equals $5$ if and only if $2x$ equals $4$, which is true if and only if $x = 2$. Therefore, we can conclude that $2x + 1 = 5$ if and only if $x = 2$.

One way to look at it is that we’re replacing the original equation with equivalent forms. $2x + 1 = 5$ says exactly the same information as $x = 2$, just written slightly differently. That’s really the whole idea of algebra (and logic): replacing one statement with another, exactly equivalent one until it’s obvious what your result is.

That’s why, in general, it’s not a productive idea to multiply both sides by zero: $2x + 1 = 5$ is not exactly equivalent to $0 \cdot (2x + 1) = 0 \cdot 5$ because the latter is true no matter what $x$ is. Therefore, we’ve lost some information in multiplying by zero.

“If and only if” is abbreviated $\Leftrightarrow$, and I wish I had been taught algebra with that notation. It’d be easier to remember that $\sqrt{x^2} = 1$ is not exactly equivalent to $x = 1$ (it can be $-1$ too!) if I had that symbol to remind me.
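Written with that symbol, the example from earlier might look something like this (just a sketch of how I’d lay it out, not a standard textbook format):
$$\begin{gather*} 2x + 1 = 5 \\ \Leftrightarrow\ 2x = 4 \\ \Leftrightarrow\ x = 2 \end{gather*}$$
Squaring, on the other hand, only goes one way unless we keep both roots:
$$x = 1 \;\Rightarrow\; x^2 = 1, \qquad x^2 = 1 \;\Leftrightarrow\; (x = 1 \text{ or } x = -1)$$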

January 31, 2013, 11:15am by Casey
Categories: Math