Gaussian integration is cool

6 months ago (rohangautam.github.io)

No, the integral cannot be "expressed" as a sum over weights and function evaluations (with a "="); it can only be approximated with this idea. If you fix any n+1 nodes, interpolate your function, and integrate the resulting polynomial, you get exactly such a sum, where the weights are the integrals of the (Lagrange) basis polynomials. By construction, this computes the integral of polynomials up to degree n exactly. Now, if you choose the nodes in a particular way (namely, as the zeros of certain orthogonal polynomials), you can raise this to degree 2n+1. What I'm getting at is that Gaussian integration is not estimating the integrals of polynomials of degree up to 2n+1, it's evaluating them exactly.
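
For a concrete check (a minimal numpy sketch; leggauss gives the Gauss-Legendre nodes and weights): with n+1 = 3 nodes the rule is exact for any polynomial up to degree 2n+1 = 5.

    import numpy as np

    # 3 Gauss-Legendre nodes on [-1, 1]: exact for degree <= 5
    x, w = np.polynomial.legendre.leggauss(3)

    f = lambda t: 3 * t**5 + t**4 + 1    # a degree-5 polynomial
    exact = 2 / 5 + 2                    # odd terms vanish; t^4 gives 2/5, 1 gives 2
    print(w @ f(x), exact)               # agree to machine precision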

  • You are right, and almost all methods of numerical integration (at least all those that are useful and that I am aware of) are equivalent to approximating the target function with another, simpler function, which is then integrated using an exact formula.

    So the estimation error is introduced at the step where a function is approximated with another function, which is usually chosen as either a polynomial or a polynomial spline (composed of straight line segments for the simplest trapezoidal integration), not at the actual integration.

    Fortunately, for well-behaved functions approximated by a suitable simpler function, the errors that this approximation introduces into the values of integrals are smaller than the errors in the interpolated values of the function itself, which are in turn smaller than the errors in values estimated for the derivative (using the same approximating function).
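
    To make the "error lives in the approximation step" point concrete (a small sketch, not the article's code): the trapezoidal rule is literally the exact integral of the piecewise-linear interpolant, so all of its error comes from the interpolation.

        import numpy as np

        f = lambda t: np.exp(t)
        x = np.linspace(0.0, 1.0, 11)        # 10 straight-line segments
        y = f(x)

        # Trapezoidal rule = integrating the piecewise-linear interpolant
        # exactly: each segment contributes area (y_i + y_{i+1}) / 2 * dx.
        trap = np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

        exact = np.exp(1) - 1                # true integral of e^t on [0, 1]
        print(trap - exact)                  # the approximation error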

    • > [...] almost all methods of numerical integration (at least all those that are useful and that I am aware of) are equivalent to approximating the target function with another, simpler function, which is then integrated using an exact formula.

      The key property of quadrature formulas (i.e. numerical integration formulas) is the degree of exactness, which just says up to which degree we can integrate polynomials exactly. The (convergence of the) error of the quadrature depends on this exactness degree.

      If you approximate the integral using a sum of n+1 weights and function evaluations, then any quadrature that has exactness degree n or better is in fact an interpolatory quadrature, that is, it is equivalent to interpolating your function on the n+1 nodes and integrating the polynomial. You can check this by (exactly) integrating the Lagrange basis polynomials, through which you can express the interpolation polynomial.
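
      You can see this concretely (a quick numpy sketch, variable names mine): integrate each Lagrange basis polynomial over [-1, 1] and you recover exactly the Gauss-Legendre weights.

          import numpy as np

          x, w_gauss = np.polynomial.legendre.leggauss(4)

          w_interp = []
          for i in range(len(x)):
              others = np.delete(x, i)
              # ell_i: Lagrange basis polynomial, 1 at x[i], 0 at the other nodes
              ell = np.poly1d(others, r=True) / np.prod(x[i] - others)
              antideriv = np.polyint(ell)
              w_interp.append(antideriv(1.0) - antideriv(-1.0))

          print(np.allclose(w_interp, w_gauss))   # True: it's interpolatory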

  • You're totally correct; my writeup in that section definitely was not clear. I've updated the blog, hopefully it's better. I've also given you a shoutout at the end of the post in my edit log, if that's cool with you.

    • Sure, thanks. I also saw this sentence, which was not there in the last version (I think):

      > That is to say, with n nodes, gaussian integration will approximate your function's integral with a higher order polynomial than a basic technique would - resulting in more accuracy.

      This is not really the case: Gaussian integration is still just interpolation on n nodes, but the way of choosing the nodes increases the integration's exactness degree to 2n-1. What's actually more interesting is that Gaussian integration does not require any more work in terms of interpolation; we just choose our nodes better. (Computing the nodes is sometimes more work, but we can do that once and use them forever.)
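
      A quick way to see the "same work, better nodes" point (a sketch using scipy's newton_cotes for the equally spaced weights):

          import numpy as np
          from scipy.integrate import newton_cotes

          f = lambda t: t**6
          exact = 2 / 7                    # integral of t^6 over [-1, 1]
          n = 4                            # same number of nodes for both rules

          # Equally spaced nodes (Newton-Cotes): exactness degree n - 1 = 3
          x = np.linspace(-1, 1, n)
          an, _ = newton_cotes(n - 1, 1)
          print(2 / (n - 1) * (an @ f(x)) - exact)   # visible error

          # Gauss nodes, same interpolatory machinery: exactness 2n - 1 = 7
          xg, wg = np.polynomial.legendre.leggauss(n)
          print(wg @ f(xg) - exact)                  # ~0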

A good introduction to the basics.

What is also worth pointing out, and which was somewhat glossed over, is the close connection between the weight function and the polynomials. For different weight functions you get different classes of orthogonal polynomials. "Orthogonal" has to be understood with respect to the scalar product given by integrating against the weight function.

Interestingly, Gauss-Hermite integrates over the entire real line, i.e. from -infinity to infinity. So the choice of weight function also influences the choice of integration domain.
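
As a quick illustration of that coupling (a numpy sketch; hermgauss gives the Gauss-Hermite nodes/weights for the weight e^(-x^2)):

    import numpy as np

    # Gauss-Hermite: weight exp(-x^2), domain (-inf, inf); nodes are Hermite zeros.
    x, w = np.polynomial.hermite.hermgauss(12)

    # Orthogonality w.r.t. this weight: integral of H_2 H_3 exp(-x^2) dx = 0.
    # The product has degree 5 <= 2*12 - 1, so the quadrature reproduces it exactly.
    H2 = np.polynomial.hermite.Hermite([0, 0, 1])
    H3 = np.polynomial.hermite.Hermite([0, 0, 0, 1])
    print(w @ (H2(x) * H3(x)))   # ~0 up to roundoff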

  • Sorry if this is a stupid question, but is there a direct link between the choice of weight function and the applications of the polynomial?

    Like, is it possible to infer that Chebyshev polynomials would be useful in approximation theory using only the fact that they're orthogonal wrt the Wigner semicircle (U_n) or arcsine (T_n) distribution?

    • Chebyshev polynomials are useful in approximation theory because they're the minimax polynomials. The remainder of polynomial interpolation can be given in terms of the nodal polynomial, which is the polynomial with the interpolation nodes as zeros. Minimizing the maximum error then leads to the Chebyshev polynomials. This is a basic fact in numerical analysis that has tons of derivations online and in books.

      The weight function shows the Chebyshev polynomials' relation to the Fourier series (substituting x = cos(theta) turns the arcsine weight into a uniform one and T_n into cos(n*theta)). But they are not what you would usually think of as a good candidate for L2 approximation on the interval. Normally you'd use Legendre polynomials, since they have w = 1, but they are a much less convenient basis than Chebyshev for numerics.
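
      The minimax/node story is easy to see numerically (a sketch with Runge's classic example; the node count is arbitrary):

          import numpy as np

          f = lambda t: 1 / (1 + 25 * t**2)   # Runge's function
          n = 15
          t = np.linspace(-1, 1, 1000)

          equi = np.linspace(-1, 1, n)
          cheb = np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))  # zeros of T_n
          for nodes in (equi, cheb):
              c = np.polynomial.chebyshev.chebfit(nodes, f(nodes), n - 1)
              err = np.abs(np.polynomial.chebyshev.chebval(t, c) - f(t)).max()
              print(err)   # equispaced blows up; Chebyshev nodes stay small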

What is Fig. 1 showing? Is it the value of the integral compared with two approximations? Would it not be more interesting to show the error of the approximations instead? Asking for a friend who isn’t computing a lot of integrals.

  • Fig 1 could use a rethink. It uses log scale, but the dynamic range of the y-axis is tiny, so the log transform isn't doing anything.

    It would be better shown as a table with 3 numbers. Or, maybe two columns, one for integral value and one for error, as you suggest.

    • Yeah - my guess is this was just a very roundabout solution for setting axis limits.

      (For some reason, plt.bar was used instead of plt.plot, so the y axis would start at 0 by default, making all results look the same. But when the log scale is applied, the lower y limit becomes the data’s minimum. So, because the dynamic range is so low, the end result is visually identical to having just set y limits using the original linear scale).

        Anyhow, for anyone interested, the values for those 3 points are 2.0000 (exact), 1.9671 (trapezoid), and 1.9998 (Gaussian). The relative errors are 1.6% vs. 0.01%.
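
        To illustrate the mechanism with those numbers (a hypothetical matplotlib sketch, not the blog's actual code):

            import matplotlib.pyplot as plt

            labels = ["exact", "trapezoid", "gaussian"]
            vals = [2.0000, 1.9671, 1.9998]

            fig, (left, right) = plt.subplots(1, 2)
            left.bar(labels, vals)        # default bars start at 0: all look equal
            right.bar(labels, vals)
            right.set_ylim(1.95, 2.005)   # explicit linear limits show the gap
            plt.show()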

While I love seeing mathy articles on HN, the lead image in this article is terrible.

At first glance I see we're talking about numerical integration, so I assume the red part is the method being discussed and that it's much better than the other two. Then I look at the axis, which the caption notes is log scale, and see that it goes from 1.995 to 2. Uh-oh, is someone trying to inflate performance by cutting off the origin? Big no-no. But then wait a minute, this is ground truth compared to two approximations, so actually the one on the right is better. But the middle one is still accurate to within 0.25%. And why is it log scale?

Anyway point is, there's lots of room for improvement!

In particular, it's probably not worth putting ground truth as a separate bar; just plot the error of the two methods, and then you don't need to cut off the origin. And ditch the log scale.

  • Yeah - I totally agree. Some others on the thread pointed it out as well. I ended up replacing it with a table that additionally shows the error percentages. Thanks for the fair critique! I've also mentioned your username in the edit log at the end of the blog, if you don't mind.

I thought when I first saw the title that it was going to be about the Gaussian integral[1], which has to be one of the coolest results in all of maths.

That is, the integral from -infinity to +infinity of e^(-x^2) dx = sqrt(pi).

I remember being given this as an exercise and just being totally shocked by how beautiful it was as a result (when I eventually managed to work out how to evaluate it).
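
Fittingly, the article's topic gives a cute numerical check of this result: Gauss-Hermite quadrature uses e^(-x^2) as its weight, so its weights must sum to exactly sqrt(pi). A two-line sketch:

    import numpy as np

    x, w = np.polynomial.hermite.hermgauss(20)
    print(w.sum(), np.sqrt(np.pi))   # both ~1.7724538509055159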

[1] https://mathworld.wolfram.com/GaussianIntegral.html

This is a neat and accessible introduction to the topic. Nice work! I want to mention nested quadratures like Gauss-Kronrod and Clenshaw-Curtis/Fejér. Nesting means that the quadrature points of a lower-order rule are included in the points required for the higher-order ones. This lets you increase your order of accuracy, if required, with fewer total evaluations.
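
For instance (a small sketch; Clenshaw-Curtis nodes are cos(k*pi/n), so doubling n reuses every existing node):

    import numpy as np

    cc_nodes = lambda n: np.cos(np.arange(n + 1) * np.pi / n)

    coarse, fine = cc_nodes(8), cc_nodes(16)   # double the order...
    # ...and every old node survives, so only the new points need evaluating
    print(np.all(np.isin(coarse.round(12), fine.round(12))))   # True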

Did anyone else's antivirus complain about an exploit on this page?

---EDIT---

I'm about 98% sure this blog has a browser hijack embedded in it, targeted at Windows + MS Edge browsers, that attempted to launch a malicious PowerShell script to covertly screen-record the target machine.

  • That's a major claim. The only thing different in this blog post from my others is that I've embedded an executable Python notebook in an iframe. It's a marimo notebook that runs code using WASM in your browser. That project is open source too, with no exploit as far as I know.

    The code for my blog is here: https://github.com/RohanGautam/rohangautam.github.io

    If you could point to anything specific to support that claim, that would be nice.

    • I would be happy to be wrong on this one, but I've gotten two pretty convincing threat notifications from the SentinelOne antivirus platform when visiting the page, saying that my msedge process had been compromised by a known exploit.

      I'll try to get more details.

      I should note, I do not believe the site is malicious, but I am worried about third-party compromise of the site without the owner's knowledge.
