Contents

Flow-Matching-Formula

graph LR;
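    A[Flow Matching] --> B("Conditional probability / marginal probability")
    A[Flow Matching] --> C("Conditional velocity field / marginal velocity field")
    A[Flow Matching] --> D("Velocity scheduler change")
    A[Flow Matching] --> E("Parameterizing the marginal velocity field under Gaussian paths (converting between velocity, x_0, x_1, and score)")
    A[Flow Matching] --> F("Computing the marginal probability (diffeomorphism / push-forward map / change of variables)")
    A[Flow Matching] --> G("Conditional guidance")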

  • Random vectors $X, Y$ with joint PDF $p_{X,Y}(x,y)$ satisfy the marginalization property:
    • $p_X(x) = \int p_{X,Y}(x,y)\, dy$
    • $p_Y(y) = \int p_{X,Y}(x,y)\, dx$
  • Conditional PDF: $p_{X \mid Y}(x \mid y) = \frac{p_{X,Y}(x,y)}{p_Y(y)}$ (requires $p_Y(y) > 0$).
  • $z$: a data sample; $x$: a sampled point.
  • Conditional probability path: $p_{t|Z}(x|z)$ (the path conditioned on generating $Z=z$);
  • Marginal probability path: $p_t(x) = \int p_{t|Z}(x|z)\, p_Z(z)\, dz$
  • Conditional expectation: $\mathbb{E}[X \mid Y = y] = \int x\, p_{X \mid Y}(x \mid y)\, dx$. Viewed as a function of $y$, it is the function of $Y$ closest to $X$ in the least-squares sense;
  • Tower property (law of total expectation): $\mathbb{E}[\mathbb{E}[X \mid Y]] = \mathbb{E}[X]$. A nested expectation collapses to a single expectation; this is the key tool in the marginal velocity field derivation later on.
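The marginal probability path can be evaluated numerically once a concrete conditional path is chosen. The sketch below is illustrative only: it assumes a 1-D Gaussian conditional path $p_{t|Z}(x|z) = \mathcal{N}(x;\, t z,\, (1-t)^2)$ and replaces $p_Z$ with the empirical distribution of a small made-up dataset, so the integral over $z$ becomes a plain average. The function names and data values are hypothetical.

```python
import numpy as np

def cond_path_pdf(x, z, t):
    """Assumed conditional path p_{t|Z}(x|z) = N(x; t*z, (1-t)^2), 1-D."""
    sigma = 1.0 - t
    return np.exp(-(x - t * z) ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)

def marginal_path_pdf(x, data, t):
    """p_t(x) = E_{z ~ p_Z}[p_{t|Z}(x|z)], with p_Z taken as the empirical distribution of `data`."""
    x = np.asarray(x, dtype=float)
    return np.mean([cond_path_pdf(x, z, t) for z in data], axis=0)

# Hypothetical toy dataset standing in for samples of Z.
data = np.array([-2.0, 0.5, 3.0])
grid = np.linspace(-12.0, 12.0, 4801)

p = marginal_path_pdf(grid, data, t=0.4)
# Riemann sum: a valid density should integrate to (approximately) 1.
mass = np.sum(p) * (grid[1] - grid[0])
print("total probability mass:", mass)
```

Averaging the conditional densities over the data points is the discrete counterpart of integrating against $p_Z(z)\,dz$.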

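Tower property: let $\mu(Y) = \mathbb{E}[X \mid Y]$ (the conditional expectation of $X$ given $Y$). It is a function of $Y$ and therefore itself a random variable.

  • Inner $\mathbb{E}[X \mid Y]$: average over $X$. With $Y$ fixed at a value $y$, take the expectation under the conditional distribution $p_{X|Y}(x|y)$, i.e. $\mathbb{E}[X \mid Y=y] = \int x\, p_{X|Y}(x|y)\, dx$. The inner result is therefore a function of $Y$, namely $\mu(Y)$.
  • Outer $\mathbb{E}[\mathbb{E}[X \mid Y]] = \mathbb{E}[\mu(Y)]$: average over $Y$. Take the expectation of $\mu(y)$ under the marginal $p_Y(y)$, i.e. $\int \mu(y)\, p_Y(y)\, dy$.
  • Right-hand side $\mathbb{E}[X]$: average over the joint distribution of $(X,Y)$ (equivalently, over the marginal of $X$), i.e. $\int x\, p_X(x)\, dx = \iint x\, p_{X,Y}(x,y)\, dx\, dy$.

So taking the expectation of $X$ given $Y$ first, and then the expectation over $Y$, equals taking the expectation of $X$ directly. The tower property says that "condition first, then marginalize" agrees with "marginalize directly".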
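The identity is easy to check on a small discrete example. The sketch below uses a discrete joint PMF instead of the continuous PDFs in the text, so the integrals become sums; the table values are made up for illustration.

```python
import numpy as np

# Hypothetical joint PMF p_{X,Y}: rows index values of X, columns index values of Y.
x_vals = np.array([0.0, 1.0, 2.0])
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.05],
                 [0.15, 0.20]])

p_x = p_xy.sum(axis=1)   # marginal of X: sum over y
p_y = p_xy.sum(axis=0)   # marginal of Y: sum over x

# Conditional PMF p_{X|Y}(x|y) = p_{X,Y}(x,y) / p_Y(y); each column sums to 1.
p_x_given_y = p_xy / p_y

# Inner expectation: mu(y) = E[X | Y=y], one value per y.
mu = x_vals @ p_x_given_y

# Outer expectation over Y versus the direct expectation of X.
lhs = mu @ p_y       # E[E[X | Y]]
rhs = x_vals @ p_x   # E[X]
print(lhs, rhs)
```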

  • Conditional velocity field: $u_t(x|z)$ is a velocity field that generates the conditional path $p_{t|Z}(x|z)$, i.e. it satisfies the continuity equation together with that path. For the linear conditional flow it is $u_t(x|z) = \frac{z-x}{1-t}$ (pointing from the current $x$ toward the target $z$);
  • Marginal velocity field: $u_t(x) = \int u_t(x|z)\, p_{Z|t}(z|x)\, dz = \int u_t(x|z)\, \frac{p_{t|Z}(x|z)\, p_Z(z)}{p_t(x)}\, dz = \mathbb{E}[u_t(x|Z) \mid X_t=x]$ (the second expression expands the posterior $p_{Z|t}(z|x)$ with Bayes' rule; the last is the conditional-expectation form, which is convenient for interpretation and computation).
  • Practical formula for the marginal velocity field: $u_t(x) \approx \frac{ \sum_{k=1}^K u_t(x\mid z^{(k)}) \cdot \overbrace{p_{t\mid Z}(x\mid z^{(k)})}^{\text{weight } w_k} }{ \sum_{k=1}^K w_k }$, where $z^{(k)} \sim p_Z(z)$.
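The linear conditional velocity can be sanity-checked in a few lines. The sketch below assumes the straight-line conditional flow $x_t = (1-t)\,x_0 + t\,z$ with $x_0$ drawn from a standard Gaussian; that interpolation and the helper names are assumptions for illustration. Along such a trajectory, $\frac{z - x_t}{1-t}$ reduces to the constant $z - x_0$, so Euler integration from $t=0$ lands on $z$.

```python
import numpy as np

def conditional_velocity(x, z, t):
    """u_t(x|z) = (z - x) / (1 - t) for the linear conditional flow (valid for t < 1)."""
    return (z - x) / (1.0 - t)

def euler_integrate(x0, z, n_steps=10):
    """Integrate dx/dt = u_t(x|z) from t=0 to t=1 with explicit Euler steps."""
    x = np.array(x0, dtype=float)
    h = 1.0 / n_steps
    for k in range(n_steps):
        t = k * h  # the velocity is only evaluated at t < 1
        x = x + h * conditional_velocity(x, z, t)
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal(3)        # assumed source sample x_0 ~ N(0, I)
z = np.array([1.0, -2.0, 0.5])     # hypothetical target data point

# On the straight-line path, the velocity equals the constant z - x0.
t = 0.25
x_t = (1.0 - t) * x0 + t * z
print(conditional_velocity(x_t, z, t), z - x0)
print(euler_integrate(x0, z), z)
```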
Deriving the marginal velocity field: turning the expectation into a computable form

The integral we want:

$$u_t(x) = \int u_t(x\mid z)\,\color{red}{p_{Z\mid t}(z\mid x)}\,dz$$

Substitute Bayes' rule:

$$\color{red}{p_{Z\mid t}(z\mid x)} = \frac{p_{t\mid Z}(x\mid z)\,p_Z(z)}{p_t(x)}$$

So:

$$u_t(x) = \int u_t(x\mid z) \cdot \frac{p_{t\mid Z}(x\mid z)\,p_Z(z)}{p_t(x)}\, dz$$

Pull the denominator out of the integral (it does not depend on $z$):

$$u_t(x) = \frac{1}{p_t(x)} \int u_t(x\mid z)\,p_{t\mid Z}(x\mid z)\,\color{red}{p_Z(z)}\,dz$$

Note the red factor:

$$\int (\cdots)\, \color{red}{p_Z(z)}\, dz = \mathbb{E}_{z\sim p_Z}\big[\,\cdots\,\big]$$

So:

$$u_t(x) = \frac{1}{p_t(x)}\; \mathbb{E}_{z\sim p_Z}\big[\,u_t(x\mid z)\,p_{t\mid Z}(x\mid z)\,\big]$$

The denominator $p_t(x)$ can also be written as an expectation

$$p_t(x) = \int p_{t\mid Z}(x\mid z)\,p_Z(z)\,dz$$

which is again an expectation under $p_Z(z)$:

$$p_t(x) = \mathbb{E}_{z\sim p_Z}\big[\,p_{t\mid Z}(x\mid z)\,\big]$$

Putting it together: the importance-sampling formula

Combine the two expectations:

$$u_t(x) = \frac{\;\mathbb{E}_{z\sim p_Z}\big[\,u_t(x\mid z)\cdot p_{t\mid Z}(x\mid z)\,\big]\;} {\;\mathbb{E}_{z\sim p_Z}\big[\,p_{t\mid Z}(x\mid z)\,\big]\;}$$

Discretization: a weighted average

Approximate each expectation by a sample mean:

$$\mathbb{E}[\cdots] \approx \frac{1}{K}\sum_{k=1}^K (\cdots)$$

Substituting (the $1/K$ factors in numerator and denominator cancel):

$$u_t(x) \approx \frac{ \sum_{k=1}^K u_t(x\mid z^{(k)}) \cdot \overbrace{p_{t\mid Z}(x\mid z^{(k)})}^{\text{weight } w_k} }{ \sum_{k=1}^K w_k }$$

where $z^{(k)} \sim p_Z(z)$.
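The weighted average above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes the Gaussian conditional path $p_{t|Z}(x|z) = \mathcal{N}(x;\, t z,\, (1-t)^2 I)$ and the linear conditional velocity $u_t(x|z) = \frac{z-x}{1-t}$, and the function name `marginal_velocity` is hypothetical. The weights are computed in log space and normalized, which leaves the ratio unchanged because the Gaussian normalizing constant is the same for every $k$.

```python
import numpy as np

def marginal_velocity(x, t, z_samples):
    """Self-normalized estimate of u_t(x) from samples z^(k) ~ p_Z.

    Assumes p_{t|Z}(x|z) = N(x; t*z, (1-t)^2 I) and u_t(x|z) = (z - x) / (1 - t).
    x: shape (d,), z_samples: shape (K, d), 0 <= t < 1.
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z_samples, dtype=float)
    sigma = 1.0 - t

    # log w_k up to an additive constant shared by all k (it cancels in the ratio).
    log_w = -np.sum((x - t * z) ** 2, axis=1) / (2.0 * sigma ** 2)
    log_w -= log_w.max()            # stabilize the exponentials
    w = np.exp(log_w)
    w /= w.sum()                    # w_k / sum_j w_j

    u_cond = (z - x) / (1.0 - t)    # u_t(x | z^(k)), shape (K, d)
    return w @ u_cond               # weighted average, shape (d,)

# Toy check with p_Z = N(0, 1) in 1-D, where E[Z | X_t = x] = t*x / (t^2 + (1-t)^2).
rng = np.random.default_rng(0)
z_samples = rng.standard_normal((200_000, 1))
t, x = 0.7, np.array([0.5])
estimate = marginal_velocity(x, t, z_samples)
closed_form = (t * x / (t ** 2 + (1.0 - t) ** 2) - x) / (1.0 - t)
print(estimate, closed_form)
```

As $t \to 1$ the weights concentrate on the few samples nearest to $x / t$, so the estimate becomes noisy unless $K$ is large.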

todo

Computing the velocity field from a predicted score:

$$u_t(x|y) = a_t x + b_t \nabla \log p_{t|Y}(x|y). \tag{4.87}$$

$$p_{t|Y}(x|y) = \frac{p_{Y|t}(y|x)\, p_t(x)}{p_Y(y)}. \tag{4.88}$$

Taking $\nabla \log$ of (4.88) with respect to $x$, the $p_Y(y)$ term drops out because it does not depend on $x$:

$$\underbrace{\nabla \log p_{t|Y}(x|y)}_{\text{conditional score}} = \underbrace{\nabla \log p_{Y|t}(y|x)}_{\text{classifier}} + \underbrace{\nabla \log p_t(x)}_{\text{unconditional score}}, \tag{4.89}$$

$$\tilde{u}_t^{\theta,\phi}(x|y) = a_t x + b_t \bigl( \nabla \log p_{Y|t}^\phi(y|x) + \nabla \log p_t^\theta(x) \bigr) = u_t^\theta(x) + b_t \nabla \log p_{Y|t}^\phi(y|x), \tag{4.90}$$

Scaling the classifier term by a guidance weight $w$:

$$\tilde{u}_t^{\theta,\phi}(x|y) = u_t^\theta(x) + b_t w \nabla \log p_{Y|t}^\phi(y|x), \tag{4.91}$$

$$\underbrace{\nabla \log p_{Y|t}(y|x)}_{\text{classifier}} = \underbrace{\nabla \log p_{t|Y}(x|y)}_{\text{conditional score}} - \underbrace{\nabla \log p_t(x)}_{\text{unconditional score}}, \tag{4.92}$$

By (4.87), $\nabla \log p_{t|Y}(x|y) = \frac{u_t^\theta(x|y) - a_t x}{b_t}$ and $\nabla \log p_t(x) = \frac{u_t^\theta(x|\emptyset) - a_t x}{b_t}$, where $u_t^\theta(x|\emptyset)$ denotes the unconditional model $u_t^\theta(x)$. Substituting these into (4.92) and then into (4.91):

$$\tilde{u}_t^\theta(x|y) = u_t^\theta(x|\emptyset) + b_t w\,\frac{u_t^\theta(x|y) - u_t^\theta(x|\emptyset)}{b_t} = (1-w)\, u_t^\theta(x|\emptyset) + w\, u_t^\theta(x|y).$$
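The equivalence between the score form and the final mixing form can be checked numerically. The sketch below uses arbitrary made-up values for $a_t$, $b_t$, $w$, $x$ and the two model outputs; the helper names are hypothetical and no trained network is involved.

```python
import numpy as np

def score_from_velocity(u, x, a_t, b_t):
    """Invert u = a_t * x + b_t * score, as in (4.87)."""
    return (u - a_t * x) / b_t

def guided_velocity(u_uncond, u_cond, w):
    """Final mixing form: (1 - w) * u(x | empty) + w * u(x | y)."""
    return (1.0 - w) * u_uncond + w * u_cond

# Arbitrary illustrative values standing in for model outputs at some (x, t).
x = np.array([0.2, -1.0])
a_t, b_t, w = 1.5, 0.4, 2.0
u_uncond = np.array([0.3, 0.1])    # stands in for u_t^theta(x | empty)
u_cond = np.array([-0.5, 0.8])     # stands in for u_t^theta(x | y)

# Score route: classifier gradient via (4.92), then guidance via (4.91).
classifier_grad = (score_from_velocity(u_cond, x, a_t, b_t)
                   - score_from_velocity(u_uncond, x, a_t, b_t))
via_score = u_uncond + b_t * w * classifier_grad

# Direct route: mix the two velocities.
via_mix = guided_velocity(u_uncond, u_cond, w)
print(via_score, via_mix)
```

Setting $w = 1$ recovers the conditional velocity and $w = 0$ the unconditional one; the $a_t x$ terms and $b_t$ cancel, so the mixing form needs neither coefficient at sampling time.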
