Conway’s fruitful fractions (or how to generate all the prime numbers by multiplying one of 14 fractions)

(Credit: I recently learned of the fruitful fractions from this blog post on The Aperiodical.)

Conway’s fruitful fractions first appeared as a problem in the Mathematical Intelligencer in 1980 (Reference 1), and are described in more detail in Guy (1983) (Reference 2). The discussion here closely follows that of Reference 2, and the figures in this post are taken from that paper.

The fruitful fractions are a set of 14 innocent-looking fractions:

Consider the following algorithm:

  1. Start with the input x = 2.
  2. While TRUE (i.e. repeat the steps in the following loop forever):
    1. Multiply x by the leftmost fraction in the row above such that the result is still a whole number.
    2. If x is a pure power of 2, i.e. x = 2^t for some non-negative integer t, output the exponent t.

It turns out that the output sequence is simply the list of all primes! Here are the steps taken to go from the initial input 2 to the first output 2 (when x = 2^2 = 4):
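If you want to replay the run yourself, here is a minimal Python sketch of the algorithm. The figure listing the 14 fractions isn’t reproduced here, so the list hard-coded below is the standard published one from References 1 and 2 (treat it as imported from those papers rather than from anything shown in this post).

```python
from fractions import Fraction

# The 14 fruitful fractions, in order, as published in References 1 and 2.
FRACTIONS = [Fraction(n, d) for n, d in [
    (17, 91), (78, 85), (19, 51), (23, 38), (29, 33), (77, 29), (95, 23),
    (77, 19), (1, 17), (11, 13), (13, 11), (15, 14), (15, 2), (55, 1),
]]

def fruitful(num_outputs=5):
    """Run the algorithm starting from x = 2 and collect the first few outputs."""
    x = Fraction(2)
    outputs = []
    while len(outputs) < num_outputs:
        # Step 2.1: multiply x by the leftmost fraction that keeps it a whole number.
        for f in FRACTIONS:
            y = x * f
            if y.denominator == 1:
                x = y
                break
        # Step 2.2: if x is a pure power of 2, output the exponent t.
        n = int(x)
        if (n & (n - 1)) == 0:
            outputs.append(n.bit_length() - 1)
    return outputs

print(fruitful())   # [2, 3, 5, 7, 11] -- the primes appear in order
```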

Amazed?! The rest of this post outlines why this is the case. TLDR: We can reformulate the algorithm above as a flow diagram, which can be interpreted as a computer program that finds all the primes using just 3 types of operations (adding 1 to the contents of a register, subtracting 1 from the contents of a register, and checking if a register is 0).

Step 1: Write out the computer program to generate primes with simple operations

The figure below depicts the computer program that generates all the primes as a flow diagram:

The box in blue (“Is n prime?”) is a complex operation which we need to break down even further; the figure below demonstrates how we can do so. For a given n, for d = n, n-1, n-2, \dots, 1, we check if d divides n (i.e. d is a divisor of n). If d \mid n for some d with 1 < d < n, then n is composite; otherwise it is prime.

The box in red (“Does d \mid n?”) is still too complex: we can break it down further as shown in the figure below. For given d and n, repeatedly subtract d from n until we reach a number r < d. If r = 0, then d \mid n; otherwise, d does not divide n.

If we combine these 3 flow diagrams into one, we get a flow diagram that generates all the primes using 3 simple operations: adding 1 to a variable, subtracting 1 from a variable, and checking if a variable is equal to 0. The rest of the steps will show that the fruitful fractions replicate this flow diagram.
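To make this concrete, here is one way to render that combined flow diagram in Python, (loosely) restricting ourselves to the three primitive operations. The function names and loop structure are my own sketch rather than a transcription of the diagram in Guy’s paper.

```python
import itertools

def divides(d, n):
    # Does d divide n? Repeatedly subtract d from n, using only "-1" steps and "is it 0?" tests.
    r = n
    while True:
        if r == 0:
            return True              # n was used up in exact copies of d
        c = d
        while c != 0:                # try to remove one full copy of d from r
            if r == 0:
                return False         # 0 < r < d, so d does not divide n
            r -= 1
            c -= 1

def is_prime(n):
    # n >= 2 is prime iff no d with 1 < d < n divides it.
    d = n - 1
    while d != 1:
        if divides(d, n):
            return False
        d -= 1
    return True

def primes():
    # The outer "next number" loop of the flow diagram.
    n = 2
    while True:
        if is_prime(n):
            yield n
        n += 1

print(list(itertools.islice(primes(), 10)))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```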

Step 2: Think of x’s prime factorization as encoding a state

Rewrite the 14 fruitful fractions with their prime factorizations:

It’s not immediately obvious, but one can show that at any point, the number x can be written in the form

x = 2^t 3^s 5^r 7^q p,

where t, s, r, q are non-negative integers and p \in \{1, 11, 13, 17, 19, 23, 29\}. We can think of x as keeping track of 5 pieces of information, which we write as x = (t, s, r, q)_p. Thinking of this as a computer program / machine, p represents the state of the machine while t, s, r, q are the contents of 4 registers of the machine.
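To watch the registers and the state while the fraction machine runs, here is a small helper (purely illustrative, my own) that strips the powers of 2, 3, 5 and 7 out of x and reports what is left:

```python
def decode(x):
    # Write x = 2^t * 3^s * 5^r * 7^q * p and return ((t, s, r, q), p).
    registers = []
    for base in (2, 3, 5, 7):
        exponent = 0
        while x % base == 0:
            x //= base
            exponent += 1
        registers.append(exponent)
    return tuple(registers), x   # x is now the state p, one of {1, 11, 13, 17, 19, 23, 29}

print(decode(425))      # ((0, 0, 2, 0), 17): r = 2, machine in state 17
print(decode(2 ** 5))   # ((5, 0, 0, 0), 1): a pure power of 2, so the machine outputs 5
```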

Step 3: Draw the possible transitions between states

Assume that the machine is in a particular state. What other states can the machine reach from there? Each of the 14 fractions corresponds to one state transition, depicted in the state transition diagram below. The numbers in the circle represent the state (the value of p), the letters correspond to the fraction being multiplied, and the numbers correspond to changes to the registers (t, s, r, q). For example, step B takes the machine from state 17 to state 13, increases t and s (the 2- and 3-registers) by 1, decreases r (the 5-register) by 1, and leaves q (the 7-register) unchanged.

Step 4: Amalgamate steps into subroutines

Running through the state diagram several times, you will find that some sequences of steps always happen together: we can amalgamate these into subroutines. For example, in state 19, the machine will perform the pair of steps HI t times (where t is the 2-register), transferring the contents of t to r (the 5-register), and then do step R to arrive at state 11. State 19 can only be reached by step D from state 17 or by step I from state 23, so we can replace states 19 and 23 by the subroutine

D_n = D(HI)^n R

which goes directly from state 17 to state 11. With some work we can come up with the following subroutines:

We can then simplify our original state diagram with these subroutines:

Step 5: Assign meaning to registers and subroutines

This is the trickiest step to work out in my opinion. Notice the following 3 things:

  • The sum of the 2- and 5-registers, t + r is always equal to n (except when we have finished checking if n is prime and are moving on to n+1).
  • The sum of the 3- and 7- registers, s + q is always equal to d (except when we have finished checking if d \mid n and are moving on to checking if (d-1) \mid n).
  • The “Next Number” subroutine N_n is always followed by T_n, the “Decrease Divisor” subroutine D_n is always followed by T_{r-1}, and each of these is followed by the “division routine” (S_d T_d)^l A_r.

With some tedious work, we can now match our original prime producing flow diagram to the states and subroutines we have developed! The blue and red boxes match with those earlier in this post. (Because of the font face used in this diagram, the variable t looks like a slightly bigger version of the variable r.)

References:

  1. Conway, J. H. (1980). Problem 2.4.
  2. Guy, R. K. (1983). Conway’s Prime Producing Machine.

Two ways to approximate the root of an equation involving a combinatorial sum

I recently came across two approximations of an equation involving a particular combinatorial sum which I thought were pretty neat. Let r be a positive integer and let x \in [0, 1]. The combinatorial sum in question is

\begin{aligned} f(x, r) &= \sum_{i=1}^r \dfrac{1}{i} \cdot \binom{r}{i} x^{r-i}(1-x)^i  \\  &= \binom{r}{1}x^{r-1}(1-x) + \dfrac{1}{2}\binom{r}{2}x^{r-2}(1-x)^2 + \dots + \dfrac{1}{r}\binom{r}{r}(1-x)^r. \end{aligned}

Think of r as fixed and large. The equation to be solved is

\begin{aligned} x^r = f(x, r). \end{aligned}

This equation comes from the solution of Problem 48 in Mosteller (1987) (Reference 1), and in the solution Mosteller provides two ways to approximate the solution to the equation.

The first approximation notes that for large r, (1-x)^r gets small. Hence, we can approximate the combinatorial sum by its leading term, i.e.

\begin{aligned} f(x, r) &\approx \binom{r}{1}x^{r-1}(1-x) = r x^{r-1} (1-x). \end{aligned}

With this approximation, the equation becomes

\begin{aligned} x^r &= r x^{r-1} (1-x), \\  x &= r(1-x), \\  x &= \dfrac{r}{r+1}. \end{aligned}

The plot below compares the actual value of the root (black line) with this approximation (red line) for values of r from 1 to 20. It’s not a bad approximation, and we can see the gap narrowing as r increases.

The second approximation is a bit more sophisticated. If we let z = \dfrac{1-x}{x}, we can transform the equation:

\begin{aligned} x^r &= \sum_{i=1}^r \dfrac{1}{i} \cdot \binom{r}{i} x^{r-i}(1-x)^i, \\  1 &= \sum_{i=1}^r \dfrac{1}{i} \cdot \binom{r}{i} x^{-i}(1-x)^i, \\  1 &= \sum_{i=1}^r \dfrac{1}{i} \cdot \binom{r}{i} z^i. \end{aligned}

From the first approximation, we know that

z = \dfrac{1-x}{x} \approx \dfrac{1 - r/(r+1)}{r/(r+1)} = \dfrac{1}{r}.

Let’s refine this approximation a bit by letting z = \alpha / r, where \alpha is a constant for us to determine. Since we want a solution for large r, let’s take r to infinity and see what value of \alpha might make sense there. Plugging this value of z into the equation above,

\begin{aligned} 1 &= \sum_{i=1}^\infty \dfrac{1}{i} \cdot \binom{r}{i} \dfrac{\alpha^i}{r^i} \\  &= \sum_{i=1}^\infty \dfrac{1}{i} \cdot \dfrac{r(r-1) \dots (r-i+1)}{i!} \dfrac{\alpha^i}{r^i}  \\  &\approx \sum_{i=1}^\infty \dfrac{1}{i} \cdot \dfrac{r^i}{i!} \dfrac{\alpha^i}{r^i} \\  &= \sum_{i=1}^\infty \dfrac{\alpha^i}{i \cdot i!}. \end{aligned}

Since the RHS is increasing in \alpha, we can use root-finding techniques like bisection search to approximate the value of \alpha. We find that \alpha \approx 0.8043, i.e. x \approx \dfrac{r}{r+0.8043}.
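Here is a short Python sketch (the helper names are mine) that finds \alpha by bisection and compares both approximations against the actual root of the equation, itself found by bisection:

```python
import math

def f(x, r):
    # The combinatorial sum f(x, r).
    return sum(math.comb(r, i) / i * x ** (r - i) * (1 - x) ** i for i in range(1, r + 1))

def bisect(g, lo, hi, tol=1e-12):
    # Simple bisection for a root of g on [lo, hi], assuming a sign change.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# alpha solves sum_{i >= 1} alpha^i / (i * i!) = 1 (series truncated far out).
series = lambda a: sum(a ** i / (i * math.factorial(i)) for i in range(1, 60)) - 1
alpha = bisect(series, 0, 2)
print(alpha)   # roughly 0.8043..., the value quoted above

r = 20
exact = bisect(lambda x: x ** r - f(x, r), 0.5, 1 - 1e-12)
print(exact, r / (r + 1), r / (r + alpha))   # true root vs. the first and second approximations
```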

The plot below shows this second approximation in blue (the true value is in black and the first approximation is in red). You can see that this is a much better approximation!

This last plot compares the relative error of the two approximations (i.e. |approx - true| / true \times 100\%).

References:

  1. Mosteller, F. (1987). Fifty challenging problems in probability with solutions.

A proof of the Frobenius-König theorem

The Frobenius-König theorem is the following statement:

Theorem (Frobenius-König). Every diagonal of an n \times n matrix \mathbf{A} = \{ a_{i,j}\}_{i,j=1}^n contains a zero element if and only if \mathbf{A} has an s \times t submatrix of zeros with s + t = n + 1.

In the statement above, a diagonal means a set \{ a_{1, \sigma(1)}, a_{2, \sigma(2)}, \dots, a_{n, \sigma(n)} \}, where \sigma is any permutation of the set \{1, 2, \dots, n\}. The theorem is more commonly stated using the concept of a matrix permanent, and is equivalent to the version above:

Theorem (Frobenius-König). The permanent of an n \times n matrix \mathbf{A} = \{ a_{i,j}\}_{i,j=1}^n with all entries either 0 or 1 is zero if and only if \mathbf{A} has an s \times t submatrix of zeros with s + t = n + 1.

A proof of this theorem was surprisingly difficult to track down. It’s commonly stated that the result follows from the König-Egerváry theorem, but the statement of Frobenius-König is so much simpler that one would imagine that there is a simpler proof that does not invoke König-Egerváry. Such a simpler proof does indeed exist; the proof below is paraphrased from that provided in Reference 1.

Let’s start with proving sufficiency first, i.e. if \mathbf{A} has an s \times t submatrix of zeros with s + t = n + 1, then every diagonal contains a zero element. Without loss of generality, we can assume that the s \times t submatrix of zeros is in the top-right corner:

Consider the generic diagonal \{ a_{1, \sigma(1)}, a_{2, \sigma(2)}, \dots, a_{n, \sigma(n)} \}. In particular, consider where the first s elements a_{1, \sigma(1)}, \dots, a_{s, \sigma(s)} might be. In order for the diagonal not to contain any zeros, these s elements must all fall in the submatrix X (shaded green above). However, since X only has s-1 columns, two of these elements must lie in the same column, i.e. \sigma(i) = \sigma(j) for some i \neq j. This contradicts the fact that \sigma is a permutation, and hence the diagonal must contain a zero element.

Next, let’s prove necessity, i.e. a matrix with every diagonal having a zero element must have an s \times t submatrix of zeros with s + t = n + 1. We may assume that the matrix is a 0-1 matrix, since changing all non-zero entries to 1 doesn’t change the truth status of the statement. Under this assumption, having a zero element on every diagonal is equivalent to the matrix having a permanent of zero.

We proceed by induction on n. When n = 1, having a permanent of zero implies that the sole element in the matrix is zero, so the statement holds trivially with s = t = 1.

For the induction step, assume that the statement holds for all values less than n. If every entry of the matrix \mathbf{A} is zero, then the statement is trivially true. Assume, then, that some element of the matrix, a_{h, k}, is non-zero. Let \mathbf{A}(i \mid j) denote the submatrix of \mathbf{A} excluding row i and column j. If we consider the Laplace expansion of the permanent along row h:

\begin{aligned} \text{perm}(\mathbf{A}) = \sum_{j=1}^n a_{hj} \cdot \text{perm}(\mathbf{A}(h \mid j)),  \end{aligned}

in order for this to be zero, we must have \text{perm}(\mathbf{A}(h \mid k)) = 0. Applying the induction assumption to \mathbf{A}(h \mid k), it has an s_1 \times t_1 submatrix of zeroes with s_1 + t_1 = (n-1) + 1 = n. Without loss of generality, we may permute the rows and columns such that this s_1 \times t_1 submatrix of zeroes appears in the top-right corner:

Since \mathbf{X} and \mathbf{Z} are both square matrices, it follows that

\begin{aligned} 0 = \text{perm}(\mathbf{A}) = \text{perm}(\mathbf{X}) \text{perm}(\mathbf{Z}). \end{aligned}

Hence, either \text{perm}(\mathbf{X}) = 0 or \text{perm}(\mathbf{Z}) = 0. Without loss of generality, assume \text{perm}(\mathbf{X}) = 0. Then by the induction hypothesis, \mathbf{X} has a u \times v submatrix of zeroes with u + v = s_1 + 1. Let i_1, \dots, i_u and j_1, \dots, j_v denote the row and column indices of this submatrix, i.e.

\begin{aligned} a_{i, j} = 0 \text{ for all } i \in \{ i_1, \dots, i_u\} \text{ and } j \in \{ j_1, \dots, j_v\}. \end{aligned}

Since the rows i_1, \dots, i_u lie among the first s_1 rows of \mathbf{A}, whose entries in columns s_1 + 1, \dots, n are all zero, it immediately follows that

\begin{aligned} a_{i, j} = 0 \text{ for all } i \in \{ i_1, \dots, i_u\} \text{ and } j \in \{ j_1, \dots, j_v, s_1 + 1, s_1 + 2, \dots, n \}. \end{aligned}

This is a submatrix of zeroes with dimension u \times (v + n - s_1), and u + (v + n - s_1) = n + 1. This completes the induction step.
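The theorem is easy to sanity-check numerically on small matrices. The following Python sketch (helper names mine; the permanent is computed by brute force from the definition) verifies on random 0-1 matrices that the permanent vanishes exactly when an s \times t zero submatrix with s + t = n + 1 exists:

```python
import itertools, math, random

def permanent(A):
    n = len(A)
    # Brute force: sum over all permutations of the product a_{1,sigma(1)} ... a_{n,sigma(n)}.
    return sum(math.prod(A[i][sigma[i]] for i in range(n))
               for sigma in itertools.permutations(range(n)))

def has_zero_block(A):
    # Is there an s x t all-zero submatrix with s + t = n + 1 (s, t >= 1)?
    n = len(A)
    for s in range(1, n + 1):
        for rows in itertools.combinations(range(n), s):
            zero_cols = [j for j in range(n) if all(A[i][j] == 0 for i in rows)]
            if len(zero_cols) >= n + 1 - s:
                return True
    return False

random.seed(0)
for _ in range(200):
    n = random.randint(1, 5)
    A = [[random.randint(0, 1) for _ in range(n)] for _ in range(n)]
    assert (permanent(A) == 0) == has_zero_block(A)
print("Frobenius-Konig verified on 200 random 0-1 matrices")
```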

References:

  1. Minc, H. (1984). Permanents.

What is the permanent of a matrix?

Let \mathbf{A} = \{ a_{i,j} \}_{i,j = 1}^n be an n \times n matrix. Most people are familiar with the concept of the determinant of \mathbf{A}. If we let S_n denote the symmetric group of degree n (i.e. the group of permutations of the set \{1, 2, \dots, n \}), the determinant can be defined as

\begin{aligned} \det(\mathbf{A}) = \sum_{\sigma \in S_n} \left( \text{sgn}(\sigma) \prod_{i=1}^n a_{i, \sigma(i)} \right), \end{aligned}

where \text{sgn}(\sigma) denotes the signature of the permutation \sigma (which is always equal to either +1 or -1).

Did you know that there’s a related quantity known as the permanent of a matrix? It’s essentially the same as the determinant except we don’t keep track of the signatures:

\begin{aligned} \text{perm}(\mathbf{A}) = \sum_{\sigma \in S_n} \left( \prod_{i=1}^n a_{i, \sigma(i)} \right). \end{aligned}
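Here is a small Python illustration of the two definitions side by side, computed naively as a sum over all n! permutations (fine for small n):

```python
import itertools, math

def signature(sigma):
    # sgn(sigma) = (-1)^(number of inversions).
    inversions = sum(1 for i, j in itertools.combinations(range(len(sigma)), 2)
                     if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def det_and_perm(A):
    n = len(A)
    det = perm = 0
    for sigma in itertools.permutations(range(n)):
        prod = math.prod(A[i][sigma[i]] for i in range(n))
        det += signature(sigma) * prod   # the determinant keeps track of the signature...
        perm += prod                     # ...the permanent simply doesn't
    return det, perm

print(det_and_perm([[1, 2], [3, 4]]))   # (-2, 10): det = 1*4 - 2*3, perm = 1*4 + 2*3
```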

It appears that permanents were more in vogue in the 1960s and 70s; we don’t seem to talk about them much nowadays. Permanents don’t have an easy geometric interpretation but do have a graph-theoretic interpretation; see the Wikipedia article (Reference 1) for some applications of permanents. Permanents are also often used in combinatorics.

Here are some useful properties of permanents:

  • The permanent of a matrix is invariant under permutations of rows and/or columns of the matrix. This property can be written as \text{perm}(\mathbf{A}) = \text{perm}(\mathbf{PAQ}) for any permutation matrices \mathbf{P} and \mathbf{Q}.
  • If \mathbf{A} is a triangular matrix, then the permanent is equal to the product of the diagonal entries.
  • If we view the permanent as a map that takes n vectors as arguments, then it is a symmetric multilinear map.

References:

  1. Wikipedia. Permanent (mathematics).

What is the Abel-Ruffini theorem?

The Abel-Ruffini theorem states that

“there is no solution in radicals to general polynomial equations of degree five or higher with arbitrary coefficients” (Reference 1).

By “general polynomial equations of degree five or higher with arbitrary coefficients”, we mean equations of the form

\begin{aligned} a_n x^n + a_{n-1}x^{n-1} + \dots + a_1x + a_0 = 0, \end{aligned}

with a_0, \dots, a_n \in \mathbb{R}, a_n \neq 0, and n \geq 5. By “solution in radicals”, we mean

“a closed-form algebraic expression… [that] relies only on addition, subtraction, multiplication, division, raising to integer powers, and [taking] nth roots” (Reference 2).

The proof of this theorem is often taught in an undergraduate course in Galois theory, and Reference 1 has a section outlining the main steps of the proof.

On the flip side, general polynomial equations of degree 4 and below with arbitrary coefficients do have solutions in radicals. The solutions to linear and quadratic equations are well known, while those to cubic and quartic equations are not. I must say that the cubic and quartic formulas are not super useful in general. They are also complicated in that there are special things you need to take note of (e.g. which square root or which cube root to take). If you are solving one of these by hand, it’s often much easier to find one root of the equation by inspection, factor it out, then work with the remaining equation of smaller degree.

Keeping these thoughts in mind, here are the links to the general formula for roots of cubic equations and quartic equations.

References:

  1. Wikipedia. Abel-Ruffini theorem.
  2. Wikipedia. Solution in radicals.

What are Carmichael numbers / absolute pseudoprimes?

Fermat’s little theorem states that if p is a prime number, then

\begin{aligned} a^p \equiv a \ (\textrm{mod} \ p) \quad \text{for all integers } a. \end{aligned}

Fermat’s little theorem is not a foolproof test for primality: there are composite numbers n for which a^n \equiv a \ (\textrm{mod}\ n) for some integer a. If n is composite and a^n \equiv a \ (\textrm{mod}\ n) for a given value of a, then n is said to be a (Fermat) pseudoprime to base a. For example, the smallest pseudoprime to base 2 is 341.

Despite the presence of these pseudoprimes, the condition in Fermat’s little theorem is still a widely used test for primality since (i) pseudoprimes to a particular base are pretty rare, and (ii) even if a number is a pseudoprime for some base a, it might not be a pseudoprime for another base b. (Click here for a list of pseudoprimes to different bases.)

The question remains: are there numbers which satisfy the condition of Fermat’s little theorem for all integers a, but yet are not prime? (In other words, is the converse of Fermat’s little theorem true?) It turns out that the converse is not true: there are composite numbers n for which a^n \equiv a \ (\textrm{mod}\ n) for all integers a. These numbers are called Carmichael numbers (or absolute pseudoprimes).

The smallest Carmichael number is 561. I always thought that the proof for this was complicated, but it turns out to be pretty elementary! The proof below is from Reference 1 and hinges on the identity a^k - 1 = (a-1)(a^{k-1} + a^{k-2} + \dots + 1). Since 561 = 3 \cdot 11 \cdot 17, we need to show that a^{561} - a is divisible by 3, 11 and 17. (These together would imply that a^{561} - a is divisible by 561.)

\begin{aligned} a^{561} - a &= a [(a^2)^{280} - 1^{280}] \\  &= a(a^2 - 1) (\dots) \\  &= (a^3 - a) (\dots) \\  &\equiv 0 \qquad  (\textrm{mod}\ 3), \end{aligned}

where the last implication is due to applying Fermat’s little theorem with p = 3. Similarly,

\begin{aligned} a^{561} - a &= a [(a^{10})^{56} - 1^{56}] = a(a^{10} - 1) (\dots) = (a^{11} - a) (\dots) \\  &\equiv 0 \qquad  (\textrm{mod}\ 11), \\  a^{561} - a &= a [(a^{16})^{35} - 1^{35}] = a(a^{16} - 1)(\dots) = (a^{17} - a) (\dots) \\  &\equiv 0 \qquad  (\textrm{mod}\ 17).  \end{aligned}
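Here’s a quick brute-force check in Python (my own sketch, not from Reference 1): it confirms that 561 is the smallest Carmichael number and that the next two are 1105 and 1729.

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_carmichael(n):
    # Composite n with a^n = a (mod n) for all integers a; checking 0 <= a < n suffices,
    # since the condition only depends on a mod n.
    return n > 1 and not is_prime(n) and all(pow(a, n, n) == a % n for a in range(n))

print([n for n in range(2, 2000) if is_carmichael(n)])   # [561, 1105, 1729]
```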

At the time Reference 1 was published (1974), it was not known whether there are infinitely many Carmichael numbers. In 1994, Alford et al. (Reference 2) proved that there are infinitely many Carmichael numbers.

References:

  1. Honsberger, R. (1974). Mathematical Gems I.
  2. Alford, W. R., Granville, A., and Pomerance, C. (1994). There are infinitely many Carmichael numbers.

What is Hardy’s inequality?

Hardy’s inequality is the following:

Hardy’s inequality. For any integrable function f: (0, T) \mapsto \mathbb{R},

\begin{aligned} \int_0^T \left[ \dfrac{1}{x}\int_0^x f(u) du \right]^2 dx &\leq 4 \int_0^T f(x)^2 dx, \end{aligned}

and the constant 4 cannot be replaced by any smaller constant.

Interpretation

If we let \varphi(x) = \dfrac{1}{x}\int_0^x f(u) du, we can see that \varphi(x) is the mean of the function f on the interval (0, x). Hardy’s inequality loosely says that the mean of a function is “similarly behaved” as the original function: the integral of its square can’t be more than 4 times larger than that for the original function.

Proof

If f is not square-integrable, the RHS is infinite and the inequality is trivial. Assume that f is square-integrable, i.e. \int_0^T f(x)^2 dx = A < \infty for some A.

Applying integration by parts,

\begin{aligned} \int_0^T \left[ \dfrac{1}{x}\int_0^x f(u) du \right]^2 dx &= \left[ \left[ \int_0^x f(u) du \right]^2 \left( -\dfrac{1}{x} \right) \right]_0^T - \int_0^T \left( -\dfrac{1}{x} \right) \cdot 2 f(x) \left[ \int_0^x f(u) du \right] dx \\  &= 2 \int_0^T \varphi(x) f(x) dx - \left[ \left[ \int_0^x f(u) du \right]^2 \left( \dfrac{1}{x} \right) \right]_0^T.  \end{aligned}

Since, by Cauchy-Schwarz and the fact that f is square-integrable,

\begin{aligned} \dfrac{1}{x} \left[ \int_0^x f(u) du \right]^2  &\leq \dfrac{1}{x} \left( \int_0^x 1 du \right) \left( \int_0^x f(u)^2 du \right) \rightarrow 0 \end{aligned}

as x \rightarrow 0, we have

\begin{aligned} \int_0^T \varphi (x)^2 dx =  \int_0^T \left[ \dfrac{1}{x}\int_0^x f(u) du \right]^2 dx &= 2 \int_0^T \varphi(x) f(x) dx - \dfrac{1}{T} \left[ \int_0^T f(u) du \right]^2 \\  &\leq 2 \int_0^T \varphi(x) f(x) dx \\  &\leq 2 \left( \int_0^T \varphi(x)^2 dx \right)^{1/2} \left( \int_0^T f(x)^2 dx \right)^{1/2}. \end{aligned}

If \varphi (x) is not identically zero (in which case the inequality is trivial), we have \left( \int_0^T \varphi(x)^2 dx \right)^{1/2} > 0 and so we can divide both sides of the inequality by it to get

\begin{aligned} \left( \int_0^T \varphi(x)^2 dx \right)^{1/2} &\leq 2\left( \int_0^T f(x)^2 dx \right)^{1/2}, \\  \int_0^T \varphi(x)^2 dx &\leq 4  \int_0^T f(x)^2 dx, \end{aligned}

which is the inequality we wanted to prove.

To show that the constant 4 cannot be replaced by a smaller constant, consider the family of functions x \mapsto x^\alpha. For \alpha > -1/2, we have

\begin{aligned} \int_0^T \left[ \dfrac{1}{x}\int_0^x f(u) du \right]^2 dx &= \int_0^T \left[ \dfrac{1}{x} \left[ \dfrac{1}{\alpha + 1} x^{\alpha + 1} \right]_0^x \right]^2 dx \\  &= \int_0^T \left[ \dfrac{1}{\alpha + 1} x^\alpha \right]^2 dx \\  &= \dfrac{1}{(\alpha+1)^2 (2\alpha + 1)} T^{2\alpha + 1}, \\  \int_0^T f(x)^2 dx &= \int_0^T x^{2\alpha} dx \\  &= \dfrac{1}{2\alpha + 1} T^{2 \alpha + 1}.  \end{aligned}

Hence, any constant C (in the place of the 4) must satisfy

\begin{aligned} \dfrac{1}{(\alpha+1)^2 (2\alpha + 1)} T^{2\alpha + 1} &\leq C \cdot \dfrac{1}{2\alpha + 1} T^{2 \alpha + 1}, \\  C &\geq \dfrac{1}{(\alpha+1)^2} \end{aligned}

for all \alpha > -1/2. As we let \alpha \rightarrow -1/2, the RHS \rightarrow 4, and so C = 4 is indeed the smallest possible constant.
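As a numerical illustration (a sketch assuming SciPy is available, not part of the proof), we can check the inequality for a smooth choice of f, and watch the ratio of the two sides creep towards 4 for f(x) = x^\alpha using the closed forms computed above:

```python
import math
from scipy.integrate import quad

def hardy_sides(f, F, T):
    # LHS and RHS of Hardy's inequality, where F(x) is the integral of f over (0, x).
    lhs, _ = quad(lambda x: (F(x) / x) ** 2, 0, T)
    rhs, _ = quad(lambda x: f(x) ** 2, 0, T)
    return lhs, rhs

# A smooth example: f(x) = e^x, so F(x) = e^x - 1 (expm1 is just an accurate e^x - 1).
lhs, rhs = hardy_sides(math.exp, math.expm1, 5.0)
print(lhs <= 4 * rhs)   # True: the inequality holds comfortably here

# Sharpness: for f(x) = x^alpha the ratio of the two integrals is 1/(alpha + 1)^2,
# which tends to 4 as alpha -> -1/2.
for alpha in (0.0, -0.25, -0.4, -0.49):
    print(alpha, 1 / (alpha + 1) ** 2)
```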

References:

  1. Steele, J. M. (2004). The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities. Chapter 11.

Proofs with majorant and minorant functions

Let f: \mathcal{D} \mapsto \mathbb{R} be some function. Another function g: \mathcal{D} \mapsto \mathbb{R} is said to be a majorant of f if g(x) \geq f(x) for all x \in \mathcal{D}. It is said to be a minorant of f if g(x) \leq f(x) for all x \in \mathcal{D}.

Often the concept of a majorant/minorant is useful when this majorant/minorant has a certain property P that the original function does not have. Let’s say we want to prove something about f. Let g be a minorant of f that has property P. If g is similar enough to f, a proof strategy might be to prove the statement for g (using property P in the process), then use the similarity between the two functions to prove the statement for f.

Here is an example of this proof technique, where the property P is “convexity”. (This is an abridged version from Chapter 6 of Reference 1. I’ve been reading through the book over the last few weeks and am really loving it!)

Problem: Prove that for a, b, c > 0 such that abc \geq 2^9,

\begin{aligned} \dfrac{1}{\sqrt{1 + (abc)^{1/3}}} \leq \dfrac{1}{3} \left( \dfrac{1}{\sqrt{1 + a}} + \dfrac{1}{\sqrt{1 + b}} + \dfrac{1}{\sqrt{1 + c}} \right). \end{aligned}

Let me leave a few lines of space here so that those of you who want to try solving the problem on your own won’t see any spoilers…

 

 

 

 

 

The first step of the solution is to notice that if we let f(x) = \dfrac{1}{\sqrt{1 + e^x}} and write a = e^x, b = e^y, c = e^z, the inequality we want to show is

\begin{aligned} f \left( \dfrac{x+y+z}{3}\right) \leq \dfrac{f(x) + f(y) + f(z)}{3} \qquad \text{for } x + y + z \geq 9 \log 2. \end{aligned}

The inequality looks like Jensen’s inequality! If f were convex, then we would have solved the problem (without the restriction on x, y, z).

Unfortunately the function f is not convex. By computing its derivatives, we find that f is only convex on [ \log 2, \infty), and so Jensen’s inequality gives us

\begin{aligned} f \left( \dfrac{x+y+z}{3}\right) \leq \dfrac{f(x) + f(y) + f(z)}{3} \qquad \text{for } x, y, z \geq \log 2, \end{aligned}

which is not exactly what we are looking for (there are (x, y, z) such that x + y + z \geq 9 \log 2 but one of x, y, z is < \log 2).

Let’s plot the function f:

The function is convex to the right of the blue vertical line only, but it sure looks pretty linear on the left too: the concavity looks very minor. It’s “almost convex”, which gives us some hope that we can use the Jensen inequality argument with a slight modification to f.

Let’s say we find a function g that is a convex minorant of f. Applying Jensen’s inequality to the minorant:

\begin{aligned} \dfrac{f(x) + f(y) + f(z)}{3} \geq \dfrac{g(x) + g(y) + g(z)}{3} \geq g \left( \dfrac{x+y+z}{3}\right).  \end{aligned}

If the minorant is actually equal to f for all values greater than 3 \log 2, then for x + y + z \geq 9 \log 2, we have \dfrac{x+y+z}{3} \geq 3 \log 2 and we can complete the chain of reasoning:

\begin{aligned} \dfrac{f(x) + f(y) + f(z)}{3} \geq \dfrac{g(x) + g(y) + g(z)}{3} &\geq g \left( \dfrac{x+y+z}{3}\right) \\ &= f \left( \dfrac{x+y+z}{3}\right).  \end{aligned}

It remains to prove the existence of such a minorant. The simplest one we could come up with is g(x) = f(x) for x \geq 3 \log 2, and g(x) equal to the tangent line to f at the point (3 \log 2, 1/3) for x < 3 \log 2. Here’s a plot of the minorant (in red) overlaid on the earlier plot:

We can see visually that g is indeed a minorant of f; proving it rigorously is just a matter of deriving the expression for g on the interval [0, 3 \log 2] and using some algebra to show that it is less than f there.
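For the skeptical, here’s a small Python sketch (my own, not from Reference 1) that checks g \leq f on a grid over [0, 3 \log 2] and spot-checks the original inequality for random a, b, c with abc \geq 2^9:

```python
import math, random

def f(x):
    return 1 / math.sqrt(1 + math.exp(x))

x0 = 3 * math.log(2)                                        # the gluing point; f(x0) = 1/3
slope = -math.exp(x0) / (2 * (1 + math.exp(x0)) ** 1.5)     # f'(x0) = -4/27

def g(x):
    # Tangent line to f at (x0, 1/3) on the left, f itself on the right.
    return f(x) if x >= x0 else f(x0) + slope * (x - x0)

# g is a minorant of f on [0, x0] (and equals f beyond it).
assert all(g(x) <= f(x) + 1e-12 for x in (i * x0 / 1000 for i in range(1001)))

# Spot-check the original inequality for random a, b, c > 0 with abc >= 2^9.
random.seed(1)
for _ in range(10_000):
    a, b = random.uniform(1, 50), random.uniform(1, 50)
    c = random.uniform(2 ** 9 / (a * b), 100 + 2 ** 9 / (a * b))   # forces abc >= 2^9
    lhs = 1 / math.sqrt(1 + (a * b * c) ** (1 / 3))
    rhs = (1 / math.sqrt(1 + a) + 1 / math.sqrt(1 + b) + 1 / math.sqrt(1 + c)) / 3
    assert lhs <= rhs + 1e-12
print("all checks passed")
```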

References:

  1. Steele, J. M. (2004). The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities.

A J-convex (midpoint-convex) function which is not convex

For this post, consider functions f: \mathcal{D} \mapsto \mathbb{R}, where \mathcal{D} = (a, b) is some open interval on the real line (a can be -\infty and b can be \infty). It’s possible that the result holds for functions with a different domain, but I only verified it for this setting.

f is said to be convex if for all x, y \in \mathcal{D} and \lambda \in [0, 1],

\begin{aligned} f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y). \end{aligned}

f is said to be midpoint-convex (or J-convex) if for all x, y \in \mathcal{D},

\begin{aligned} f \left( \dfrac{x + y}{2} \right) \leq \dfrac{f(x) + f(y)}{2}. \end{aligned}

The “J” in J-convex refers to J. L. W. V. Jensen, who introduced such functions in 1906 (Reference 1).

It is obvious that a convex function is J-convex. What about the converse? Is a J-convex function necessarily convex? The answer turns out not to be as straightforward as one might expect.

If f is continuous and J-convex, then it is convex. The proof is quite straightforward: see Reference 2.

If f is measurable and J-convex, it can be proven that it must be continuous, and hence by the above it must be convex. This result is attributed to Sierpinski, and is Theorem 9.4.2 in Reference 3. (Note that in Reference 3, J-convex functions are referred to as simply convex functions; see note on p130 of the book for clarification.)

The weirdness happens when we consider non-measurable functions. If one allows the use of the axiom of choice, then one can construct a function that is midpoint-convex but not convex; see Reference 4.

However, there are models of Zermelo-Fraenkel set theory without the axiom of choice (ZF) in which there are no non-measurable functions (e.g. the Solovay model). In these models, since there are no non-measurable functions, all midpoint-convex functions must be convex. This illustrates that the axiom of choice must be used to construct a counterexample, and so there is no way to explicitly construct a midpoint-convex function that is not convex.

Because midpoint-convex functions that are not convex are very pathological (they are at least non-measurable), for all intents and purposes we can treat midpoint-convex functions as convex functions.

References:

  1. Jensen, J. L. W. V. (1906). Sur les fonctions convexes et les inégalités entre les valeurs moyennes.
  2. StackExchange. Midpoint-Convex and Continuous Implies Convex.
  3. Kuczma, M. (2009). An Introduction to the Theory of Functional Equations and Inequalities.
  4. StackExchange. Example of a function such that φ((x+y)/2) ≤ (φ(x)+φ(y))/2 but φ is not convex.

What is Lagrange’s identity?

Lagrange’s identity refers to the following: for any real numbers a_1, \dots, a_n and b_1, \dots, b_n,

\begin{aligned} \left( \sum_{i=1}^n a_i b_i \right)^2 = \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right) - \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n (a_i b_j - a_j b_i)^2. \end{aligned}

The proof is straightforward algebraic manipulation:

\begin{aligned} \left( \sum_{i=1}^n a_i b_i \right)^2 &= \sum_{i=1}^n a_i^2 b_i^2 + \sum_{i \neq j} a_i b_i a_j b_j \\  &= \sum_{i=1}^n a_i^2 b_i^2 + \sum_{i \neq j} a_i^2 b_j^2 - \sum_{i \neq j} a_i^2 b_j^2 + \sum_{i \neq j} a_i b_i a_j b_j \\  &= \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right) - \sum_{i \neq j} (a_i^2 b_j^2 - a_i b_i a_j b_j) \\  &= \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right) - \frac{1}{2}\sum_{i \neq j} (a_i^2 b_j^2 - 2a_i b_i a_j b_j + a_j^2 b_i^2) \\  &= \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right) - \frac{1}{2}\sum_{i \neq j} (a_i b_j - a_j b_i)^2 \\  &= \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right) - \frac{1}{2}\sum_{i = 1}^n \sum_{j=1}^n (a_i b_j - a_j b_i)^2. \end{aligned}

Proving such identities is usually not difficult; it’s often a mechanical matter of expanding both sides and equating terms. What’s difficult is coming up with the identity in the first place! It takes a leap of faith to believe that the difference between (\sum a_i b_i)^2 and (\sum a_i^2) (\sum b_i^2) admits a nice expression.

Notice that the proof above also works if we allow the variables to be complex numbers.

One immediate consequence of Lagrange’s identity is the Cauchy-Schwarz inequality. Since

\begin{aligned} \sum_{i=1}^n \sum_{j=1}^n (a_i b_j - a_j b_i)^2 \geq 0, \end{aligned}

it follows immediately that

\begin{aligned} \left( \sum_{i=1}^n a_i b_i \right)^2 \leq \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right), \end{aligned}

with equality when

\begin{aligned} a_i b_j = a_j b_i \qquad \text{for all } i \neq j. \end{aligned}

If b_j = 0 for all j, the equality conditions hold. If b_j \neq 0 for some j, then a_i = (a_j / b_j) b_i for all i, i.e. there is some \lambda such that a_i = \lambda b_i for all i. This is the other equality condition.
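Here’s a quick numerical check in Python (my own sketch) of both the identity and the Cauchy-Schwarz consequence, including the remark above that the identity holds formally for complex entries:

```python
import random

def lagrange_sides(a, b):
    # Left- and right-hand sides of Lagrange's identity.
    n = len(a)
    lhs = sum(ai * bi for ai, bi in zip(a, b)) ** 2
    rhs = (sum(ai ** 2 for ai in a) * sum(bi ** 2 for bi in b)
           - 0.5 * sum((a[i] * b[j] - a[j] * b[i]) ** 2
                       for i in range(n) for j in range(n)))
    return lhs, rhs

random.seed(0)
for _ in range(100):
    n = random.randint(1, 6)
    a = [random.uniform(-3, 3) for _ in range(n)]
    b = [random.uniform(-3, 3) for _ in range(n)]
    lhs, rhs = lagrange_sides(a, b)
    assert abs(lhs - rhs) < 1e-9
    # Cauchy-Schwarz follows because the subtracted double sum is non-negative.
    assert lhs <= sum(x * x for x in a) * sum(y * y for y in b) + 1e-9

# The identity also holds formally with complex entries (no conjugates anywhere).
a, b = [1 + 2j, 3 - 1j], [2 - 1j, 1j]
lhs, rhs = lagrange_sides(a, b)
print(abs(lhs - rhs) < 1e-9)   # True
```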

References:

  1. Steele, J. M. (2004). The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities.