首页 文章

使用pymc与MCMC拟合两个正态分布(直方图)?

提问于
浏览
13

我正在尝试使用CCD上的光谱仪检测线条轮廓 . 为了便于考虑,我已经包含了一个演示,如果解决了,它与我实际想要解决的演示非常相似 .

我看过这个:https://stats.stackexchange.com/questions/46626/fitting-model-for-two-normal-distributions-in-pymc以及其他各种问题和答案,但他们正在做的事情与我想做的事情根本不同 .

import pymc as mc
import numpy as np
import pylab as pl
def GaussFunc(x, amplitude, centroid, sigma):
    return amplitude * np.exp(-0.5 * ((x - centroid) / sigma)**2)

wavelength = np.arange(5000, 5050, 0.02)

# Profile 1
centroid_one = 5025.0
sigma_one = 2.2
height_one = 0.8
profile1 = GaussFunc(wavelength, height_one, centroid_one, sigma_one, )

# Profile 2
centroid_two = 5027.0
sigma_two = 1.2
height_two = 0.5
profile2 = GaussFunc(wavelength, height_two, centroid_two, sigma_two, )

# Measured values
noise = np.random.normal(0.0, 0.02, len(wavelength))
combined = profile1 + profile2 + noise

# If you want to plot what this looks like
pl.plot(wavelength, combined, label="Measured")
pl.plot(wavelength, profile1, color='red', linestyle='dashed', label="1")
pl.plot(wavelength, profile2, color='green', linestyle='dashed', label="2")
pl.title("Feature One and Two")
pl.legend()

Plot of true values and measured values

My question: PyMC(或某些变体)可以给出上面使用的两个组件的平均值,幅度和西格玛吗?

请注意,我真正适合我真正问题的函数不是高斯函数 - 所以请使用泛型函数(如我的示例中的GaussFunc)提供示例,而不是“内置”pymc.Normal()类型功能 .

另外,我理解模型选择是另一个问题:因此,对于当前的噪声,1个组件(配置文件)可能是统计上合理的 . 但我想看看1,2,3等组件的最佳解决方案是什么 .

我也不喜欢使用PyMC的想法 - 如果scikit-learn,astroML或其他一些软件包看起来很完美,请告诉我!

编辑:

我在很多方面都失败了,但我认为其中一条正确的方法是:

sigma_mc_one = mc.Uniform('sig', 0.01, 6.5)
height_mc_one = mc.Uniform('height', 0.1, 2.5)
centroid_mc_one = mc.Uniform('cen', 5015., 5040.)

但我无法构建一个有效的mc.model .

1 回答

  • 15

    不是最简洁的PyMC代码,但我做出了帮助读者的决定 . 这应该运行,并给出(真正)准确的结果 .

    我决定使用自由范围的Uniform priors,因为我真的不知道我们在建模什么 . 但是,可能有一个关于质心位置的想法,并且可以在那里使用更好的先验 .

    ### Suggested one runs the above code first.
    ### Unknowns we are interested in
    
    
    est_centroid_one = mc.Uniform("est_centroid_one", 5000, 5050 )
    est_centroid_two = mc.Uniform("est_centroid_two", 5000, 5050 )
    
    est_sigma_one = mc.Uniform( "est_sigma_one", 0, 5 )
    est_sigma_two = mc.Uniform( "est_sigma_two", 0, 5 )
    
    est_height_one = mc.Uniform( "est_height_one", 0, 5 ) 
    est_height_two = mc.Uniform( "est_height_two", 0, 5 ) 
    
    #std deviation of the noise, converted to precision by tau = 1/sigma**2
    precision= 1./mc.Uniform("std", 0, 1)**2
    
    #Set up the model's relationships.
    
    @mc.deterministic( trace = False) 
    def est_profile_1(x = wavelength, centroid = est_centroid_one, sigma = est_sigma_one, height= est_height_one):
        return GaussFunc( x, height, centroid, sigma )
    
    
    @mc.deterministic( trace = False) 
    def est_profile_2(x = wavelength, centroid = est_centroid_two, sigma = est_sigma_two, height= est_height_two):
        return GaussFunc( x, height, centroid, sigma )
    
    
    @mc.deterministic( trace = False )
    def mean( profile_1 = est_profile_1, profile_2 = est_profile_2 ):
        return profile_1 + profile_2
    
    
    observations = mc.Normal("obs", mean, precision, value = combined, observed = True)
    
    
    model = mc.Model([est_centroid_one, 
                  est_centroid_two, 
                    est_height_one,
                    est_height_two,
                    est_sigma_one,
                    est_sigma_two,
                    precision])
    
    #always a good idea to MAP it prior to MCMC, so as to start with good initial values
    map_ = mc.MAP( model )
    map_.fit()
    
    mcmc = mc.MCMC( model )
    mcmc.sample( 50000,40000 ) #try running for longer if not happy with convergence.
    

    重要

    请记住,算法与标记无关,因此结果可能显示 profile1 具有 profile2 的所有特征,反之亦然 .

相关问题