hey remember how I spent the entire last week yelling at math
well I DID IT
I FIGURED OUT IMPORTANCE SAMPLING
This is a record of a twitter thread, originally posted in 2022
hey remember how I spent the entire last week yelling at math
well I DID IT
I FIGURED OUT IMPORTANCE SAMPLING
So the goal here was to take an arbitrary image and treat it as a Probability Distribution Function, and then generate random sample points that follow that distribution
Meaning, if you throw a bunch of darts, the darts will be concentrated where the image is brighter
Here's a test that shows my math works: Generate a huge number of sample points, and just increment the value of the pixel that they land on (effectively just a 2D histogram)
After enough samples add up they will eventually recreate the original PDF image
fucking
probability DENSITY function, not distribution function
the integral of the density function is the distribution function
statisticians are psychopaths and all of the terminology for this stuff is FAR more complicated than the actual math
anyway there's a bunch of ways to do this but the one that I Found Online is to integrate the PDF to obtain the Cumulative Distribution Function, then find the inverse of the CDF
The inverse CDF takes a uniform random input, and transforms it into a density-weighted value
Meaning: It doesn't just produce sample points out of nothing; it TAKES a uniform random point (like, C#'s Random.value) and then MOVES it to a new position that obeys the PDF
If you pass it a grid of points instead of random positions, you can see how it transforms space:
Everything about this is wild to me, because the math has like... no intuitive connection to geometry
it is not at ALL clear to me why integrating an image and then finding the integral's bijective inverse would produce a spatial transform? This is way outside my mathwheelhouse
Notably, the integrals are much harder in 2D, and there's essentially infinite ways you COULD find the answer. The way that the transform preserves vertical lines appears to be an implementation detail, not an inherent property of importance sampling
makes cool gifs in any case
anyway I doubt this explanation made much sense; it took me a week of work to understand all of this lol
BUT: "I have an arbitrary curve, and I would like to generate random values whose density matches the curve height" is def a problem I've run into several times before
Anyway if you ever find yourself needing to solve this problem, this is by far the best resource I've found for understanding it:
It's SUPER dense but it's graphics-focused and they actually explain all the concepts/terminology
YouTube
Rendering Lecture 06 - Importance Sampling
you do need to already know how monte carlo integration works
I was very mad when I learned it's hilariously dumb and totally intuitive and I've been doing it for literal years without knowing I was doing 𝓜𝓸𝓷𝓽𝓮 𝓒𝓪𝓻𝓵𝓸 𝓘𝓷𝓽𝓮𝓰𝓻𝓪𝓽𝓲𝓸𝓷