What Underlies Diffusion?
This article presents an intuitive analysis of diffusion models.
Getting Started with Mean-Shift
Suppose there are $m$ points $x_1, x_2, \cdots, x_m$ in $\mathbb R^n$. These points may lie far apart from one another, or they may be scattered in clusters. Either way, we want to find the regions where the points are densest.
First, we model the probability distribution of the points $x_i$ by placing an equal point mass on each sample:

$$p(x) = \frac{1}{m}\sum_{i=1}^{m}\delta(x - x_i)$$
The Dirac distribution above is a classical probabilistic model, but it offers no usable gradient, because it is zero everywhere except at the sample points themselves. However, if we assume that the real data distribution stays close to the sample points (which is reasonable when the sample set is large), we can slightly diffuse each Dirac mass so that the density in the neighbourhood of each sample can be estimated:
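To make the diffusing step concrete, here is a minimal sketch that assumes each point mass is replaced by an isotropic Gaussian. The kernel choice, the bandwidth `sigma`, the function names and the toy points are all assumptions made for illustration; they are not taken from the text. Under that assumption the smoothed density has a well-defined gradient everywhere, and the gradient points toward the denser group of samples.

```python
import math

def smoothed_density(x, points, sigma=0.5):
    """Average of isotropic Gaussians centred on each sample (assumed kernel)."""
    n = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (-n / 2.0)
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return norm * total / len(points)

def density_gradient(x, points, sigma=0.5):
    """Analytic gradient of smoothed_density with respect to x."""
    n = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (-n / 2.0)
    grad = [0.0] * n
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        weight = math.exp(-sq_dist / (2.0 * sigma ** 2))
        for k in range(n):
            # d/dx_k exp(-|x - p|^2 / (2 sigma^2)) = weight * (p_k - x_k) / sigma^2
            grad[k] += weight * (p[k] - x[k]) / sigma ** 2
    return [norm * g / len(points) for g in grad]

# Toy data: three samples grouped near the origin and one distant outlier.
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0)]
query = (1.0, 1.0)
print(smoothed_density(query, points))
print(density_gradient(query, points))  # both components are negative here
```

At the query point `(1.0, 1.0)` both gradient components are negative, so a small step along the gradient moves toward the group near the origin. A smaller `sigma` keeps the estimate closer to the original point masses, while a larger one blurs neighbouring samples together.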