Comment by ndriscoll

2 months ago

One way to think about things is in terms of diagonalization. A generic linear operator is a fairly complicated thing that mixes information from different dimensions. When you diagonalize an operator, it's the same operator, but you're choosing a coordinate system where it becomes clear that all it was really doing was stretching along each coordinate axis, so you've broken it down into something that acts independently in a simple way on each dimension. The Fourier transform is unitary, so the intuition is that you're basically doing something like a rigid transformation of space (e.g. a rotation), and when you look from the correct angle, you see that your original operator wasn't some complicated "add a stretched version of this dimension to this other dimension minus a stretched version of a third dimension", but just "stretch this by this, this by this, etc."
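
To make that concrete, here is a minimal sketch for the discrete, periodic case, assuming a naive O(n^2) DFT and made-up sample vectors `h` and `x` (the names and values are illustrative only). Circular convolution with `h` mixes every coordinate of `x`, but after a DFT the same operation is just coordinate-wise multiplication, i.e. "stretch this by this, this by this":

```python
import cmath

def dft(x):
    # Naive, unnormalized DFT; dividing by sqrt(n) would make it unitary.
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def circular_convolve(h, x):
    # (h * x)[i] = sum_k h[(i - k) mod n] * x[k]; every output mixes all inputs.
    n = len(x)
    return [sum(h[(i - k) % n] * x[k] for k in range(n)) for i in range(n)]

# Illustrative values, not taken from anything above.
h = [1.0, 2.0, 0.0, -1.0]
x = [3.0, 0.5, -2.0, 4.0]

# In the original coordinates: a "complicated" mixing operator.
mixed = dft(circular_convolve(h, x))

# In Fourier coordinates: an independent stretch of each coordinate.
stretched = [a * b for a, b in zip(dft(h), dft(x))]

max_error = max(abs(a - b) for a, b in zip(mixed, stretched))
print(max_error)  # tiny, floating-point noise only
```

The entries of `dft(h)` play the role of the eigenvalues: they are the stretch factors along each Fourier coordinate.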

On the other hand, convolution itself is already "just" multiplication. e.g. multiplying polynomials is convolution of their coefficients (to get the x^n coefficient, you need to add up all the combinations of a_i a_j x^i x^j where i+j=n), and this point of view also applies to e.g. linear time-invariant systems[0] by thinking of your function as the weights of an infinite weighted sum (so sort of an infinite polynomial) of time-shift operators (and this point of view works for other groups, not just time shifts). So f(t) is then the "coefficient" for the t-shift, and multiplying two such weighted sums again has you convolve the coefficients (so your original functions). The jargon way to say this is that your space of G-invariant functions is secretly the group algebra of G (G being a group), i.e. weighted sums of group elements multiplied using the group operation. From that point of view, convolution is the "natural" multiplication on G-invariant functions. One can then ask whether there's a Fourier transform for other groups, which leads to abstract harmonic analysis. e.g. the Mellin transform is the Fourier transform for scale/stretch invariant functions as opposed to shift invariant.
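
The polynomial case can be sketched in a few lines. This assumes coefficients are stored lowest degree first, and the helper name `convolve` is just for illustration:

```python
def convolve(a, b):
    # out[n] collects every product a[i] * b[j] with i + j == n,
    # which is exactly the x^n coefficient of the product polynomial.
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

# (1 + 2x) * (3 + x + x^2) = 3 + 7x + 3x^2 + 2x^3
product = convolve([1, 2], [3, 1, 1])
print(product)  # [3, 7, 3, 2]
```

Reading `a` and `b` as sampled signals instead of coefficient lists gives the usual discrete convolution with no change to the code.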

[0] The typical systems that one studies in signal processing contexts where convolution and Fourier transforms are your bread and butter: https://en.wikipedia.org/wiki/Linear_time-invariant_system

This is a good general tool for less-mathematically-deep folks to keep in their pocket: look for well-behaved objects that do something nice under the operation you're interested in. Typical "well-behaved" objects do things like stay where they are, or end up as a constant multiple of themselves, or end up as 0 or 1, or something like that. Then try to represent everything else in terms of those objects, so that you can take advantage of their convenient behavior. Less-difficult examples include:

- Prime factorization: primes have nice properties, and you can turn every integer into a product of primes (polynomial factorization is an extension of this idea) and work with the nice prime properties

- Vector spaces: basis vectors have nice properties, so you write vectors as sums of them and do operations on the coefficients instead of the vectors themselves

- The exponential function: it's the unique function (up to a constant multiple) with f'(x) = f(x), so you try to turn everything else into exponentials anytime you have to solve some painful differential equation because you know those terms will go away

- Fixed points in dynamical systems: if you don't want to analyze how arbitrary things change, find the points that don't, then think of the other points as (fixed point) + (small perturbation) and reduce your work to handling the perturbation

- Taylor series: polynomials are easy, smooth functions are hard, so turn your smooth function into a polynomial and do polynomial things with it
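
As a small sketch of the fixed-point item, assume the logistic map f(x) = r x (1 - x) with r = 2.5 (my choice of example, not something from the list above). Its nonzero fixed point is x* = 1 - 1/r, and a small perturbation around it is, to first order, just multiplied by f'(x*) = 2 - r on each step:

```python
R = 2.5  # illustrative parameter choice

def f(x):
    # Logistic map, used here only as a convenient example system.
    return R * x * (1.0 - x)

x_star = 1.0 - 1.0 / R   # fixed point: f(x_star) == x_star
multiplier = 2.0 - R     # f'(x_star) = R * (1 - 2 * x_star) = 2 - R

delta0 = 1e-3                    # small perturbation
delta1 = f(x_star + delta0) - x_star
ratio = delta1 / delta0          # how much the perturbation was scaled

print(multiplier, ratio)  # ratio is close to the multiplier, -0.5
```

Since |multiplier| < 1 here, perturbations shrink and the fixed point is stable; the whole analysis reduces to one number.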

  • Yeah, this mindset is often called "mathematical maturity" in books, and you've laid out a good practical subset of it.

  • A nice generalization.

    An example in statistics is the expectation operator. You can throw away a lot of detail if you only care about one moment, such as the mean. And if you need more information about a distribution, add more moments.

    Also, this works for public policy. Frame everything as a well-functioning market and hope for the best. /s

    But seriously, a nice intuition.

    • Well, “think of the other points as (fixed point) + (small perturbation) and reduce your work to handling the perturbation” is literally how modern economic models (DSGE) are studied and then used for public policy.