C++ and Cython: looking for a design pattern to avoid template limitations

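A major problem with Cython is the lack of template support from within Python files. I have a simulation system written in C++, and I wrap the various classes with Cython and run them from Python.

When a C++ method is templated, there is no way to pass the template class to the wrapper method from Python. Instead, I end up sending a string to Cython, which then has to check the string against known values and manually pass the corresponding C++ class to the underlying C++ method. This makes complete sense, since Cython needs to know the possible template arguments in order to compile the C++, but it is a problem nonetheless.

This is becoming a real annoyance as the list of candidates for these templated methods grows, especially when a single C++ method has two or three template parameters and I have to write two or three layers of if statements in Cython.

Fortunately, I am currently the sole author and user of this codebase. I am happy to refactor, and I would like to take this opportunity to do so to save myself future headaches. Specifically, I am looking for advice on ways to avoid templates on the C++ side (as a design pattern question), rather than relying on some hacky approach on the Cython side, given that Cython has these limitations with templates.

I have written a minimal working example to highlight the kind of process that occurs in my program. In reality it is a condensed matter simulation that benefits greatly from parallel processing (using OMP), which is where templating seems necessary to me. I have tried to keep it minimal for simplicity, but it compiles and produces output so that you can see what is going on. It compiles with g++; I link OMP with -lgomp (or remove the pragmas and the include) and use the -std=c++11 flag.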

#include <vector>
#include <map>
#include <algorithm>
#include <omp.h>
#include <iostream>
#include <iomanip>

/*
 * Just a class containing some components to run through
 * a Modifier (see below)
 */
class ToModify{
public:
    std::vector<double> Components;
    ToModify(std::vector<double> components) : Components(components){}
};

/*
 * An abstract class which handles the modification of ToModify
 * components in an arbitrary way.
 * It is, however, known that child classes have a parameter
 * (here, unimaginatively called Parameter).
 * These parameters have a minimum and maximum value, which is
 * to be determined by the child class.
 */
class Modifier{
protected:
    double Parameter;
public:
    Modifier(double parameter = 0) : Parameter(parameter){}
    void setParameter(double parameter){
        Parameter = parameter;
    }
    double getParameter(){
        return Parameter;
    }
    virtual double getResult(double component) = 0;
};

/*
 * Compute component ratios with a pre-factor.
 * The minimum is zero, such that getResult(component) == 0 for all components.
 * The maximum is such that getResult(component) <= 1 for all components.
 */
class RatioModifier : public Modifier{
public:
    RatioModifier(double parameter = 0) : Modifier(parameter){}
    double getResult(double component){
        return Parameter * component;
    }

    static double getMaxParameter(const ToModify toModify){
        double maxComponent = *std::max_element(toModify.Components.begin(), toModify.Components.end());
        return 1.0 / maxComponent;
    }
    static double getMinParameter(const ToModify toModify){
        return 0;
    }
};

/*
 * Compute the multiple of components with a factor f.
 * The minimum parameter is the minimum of the components,
 *     such that f(min(components)) == min(components)^2.
 * The maximum parameter is the maximum of the components,
 *     such that f(max(components)) == max(components)^2.
 */
class MultipleModifier : public Modifier{
public:
    MultipleModifier(double parameter = 0) : Modifier(parameter){}

    double getResult(double component){
        return Parameter * component;
    }

    static double getMaxParameter(const ToModify toModify){
        return *std::max_element(toModify.Components.begin(), toModify.Components.end());
    }
    static double getMinParameter(const ToModify toModify){
        return *std::min_element(toModify.Components.begin(), toModify.Components.end());
    }

};

/*
 * A class to handle the mass-calculation of a ToModify objects' components
 * through a given Modifier child class, across a range of parameters.
 * The use of parallel processing highlights
 * my need to generate multiple classes of a given type, and
 * hence my (apparent) need to use templating.
 */
class ModifyManager{
protected:
    const ToModify Modify;
public:
    ModifyManager(ToModify modify) : Modify(modify){}

    template<class ModifierClass>
    std::map<double, std::vector<double>> scanModifiers(unsigned steps){
        double min = ModifierClass::getMinParameter(Modify);
        double max = ModifierClass::getMaxParameter(Modify);
        double step = (max - min)/(steps-1);

        std::map<double, std::vector<double>> result;

        #pragma omp parallel for
        for(unsigned i = 0; i < steps; ++i){
            double parameter = min + step*i;
            ModifierClass modifier(parameter);
            std::vector<double> currentResult;
            for(double m : Modify.Components){
                currentResult.push_back(modifier.getResult(m));
            }
            #pragma omp critical
            result[parameter] = currentResult;
        }
        return result;
    }

    template<class ModifierClass>
    void outputScan(unsigned steps){
        std::cout << std::endl << "-----------------" << std::endl;
        std::cout << "original: " << std::endl;
        std::cout << std::setprecision(3);
        for(double component : Modify.Components){
            std::cout << component << "\t";
        }
        std::cout << std::endl << "-----------------" << std::endl;
        std::map<double, std::vector<double>> scan = scanModifiers<ModifierClass>(steps);
        for(std::pair<double,std::vector<double>> valueSet : scan){
            std::cout << "parameter: " << valueSet.first << ": ";
            std::cout << std::endl << "-----------------" << std::endl;
            for(double component : valueSet.second){
                std::cout << component << "\t";
            }
            std::cout << std::endl << "-----------------" << std::endl;
        }
    }
};

int main(){
    ToModify m({1,2,3,4,5});
    ModifyManager manager(m);
    manager.outputScan<RatioModifier>(10);
    return 0;
}

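I hope this is not too much code; I felt an example was necessary. I can produce a trimmed-down version if that helps.

To use this kind of thing from Python, I (with my current approach) have to pass "RatioModifier" or "MultipleModifier" to Cython as an argument, check the string against known values, and then run scanModifiers with the corresponding class as the template argument. That is all well and good, but it becomes a problem on the Cython side when I add a new type of modifier or have multiple template parameters, and it is especially bad if I have several variants of scanModifiers with different arguments.

The general idea is that I have a set of modifiers (in the real application these simulate magnetic/electric fields and strain on a lattice, rather than performing basic arithmetic on a list of numbers) which act on values inside an object. These modifiers have a range of potential values, and it is important that modifiers have state (the 'parameter' is used and accessed elsewhere, for purposes other than scanning a range). The ToModify (lattice) objects take up a lot of RAM, so creating copies is not an option.

Each modifier class has a different range of values for a given ToModify object. This is determined by the nature of the modification rather than by the instance itself, so I cannot (semantically) justify making these non-static methods of the object. Sending an instance of a Modifier class to the scan method seems too awkward, because its state would be meaningless.

I have considered using the factory pattern. However, since it has no reason to hold any kind of state, it would be static, and passing a static class to a method still requires templating, which brings me back to the template translation problem with Cython. I could create a factory class that accepts a class-name string and selects the right class to use, but that seems to just move my problem to the C++ side.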
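For illustration, the string-keyed factory idea could be sketched roughly as below. This is only a sketch under assumptions: `ModifierRegistry`, `add`, `make`, and `defaultRegistry` are hypothetical names that do not appear in the original code, and the `Modifier` classes are cut-down stand-ins for the ones in the example. The point it tries to show is that the name lookup can live in a table, so a new modifier costs one registration line rather than another if branch.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Cut-down stand-ins for the classes in the example above.
class Modifier {
protected:
    double Parameter;
public:
    explicit Modifier(double parameter = 0) : Parameter(parameter) {}
    virtual ~Modifier() {}
    virtual double getResult(double component) = 0;
};

class RatioModifier : public Modifier {
public:
    explicit RatioModifier(double parameter = 0) : Modifier(parameter) {}
    double getResult(double component) override { return Parameter * component; }
};

class MultipleModifier : public Modifier {
public:
    explicit MultipleModifier(double parameter = 0) : Modifier(parameter) {}
    double getResult(double component) override { return Parameter * component; }
};

// Hypothetical registry: maps a class name to a function that builds that class.
class ModifierRegistry {
    using Maker = std::function<std::unique_ptr<Modifier>(double)>;
    std::map<std::string, Maker> makers;
public:
    // Register a concrete Modifier type under a name (one line per new type).
    template <class M>
    void add(const std::string& name) {
        makers[name] = [](double p) { return std::unique_ptr<Modifier>(new M(p)); };
    }

    // Build a modifier by name; throws if the name was never registered.
    std::unique_ptr<Modifier> make(const std::string& name, double parameter) const {
        auto it = makers.find(name);
        if (it == makers.end())
            throw std::invalid_argument("unknown modifier: " + name);
        return it->second(parameter);
    }
};

// Builds a registry that knows the two modifiers from the example.
ModifierRegistry defaultRegistry() {
    ModifierRegistry r;
    r.add<RatioModifier>("RatioModifier");
    r.add<MultipleModifier>("MultipleModifier");
    return r;
}
```

With something like this, a Cython wrapper would only need to forward a string and a double to `make`; the template instantiations stay entirely on the C++ side, inside `add`.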

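Since I always aim to write meaningful code, I am in a bit of a bind. The easiest way to solve the problem seems to be to give state to objects that do not need it, but I do not like that approach at all. What other approaches are there to this kind of problem? Should I change how the scan method actually works, or move it into its own class? I am stuck on this one.

Edit

I thought it would be a good idea to provide an example of the Cython side, to show just how much of a nightmare this can be.

Imagine I have a method like the one above, but with two template parameters. Say, for example, that one is a child of Modifier and the other is a SecondaryModifier, which further modifies the result (for anyone interested in the use case: in the real program, one 'Modifier' is an EdgeManager that modifies edge weights to simulate the effect of strain or an external magnetic field; the other could be a simulation type, for example a tight-binding approach to finding energies/states, or something more involved).

And say my modifiers are ModifierA1, ModifierA2, and ModifierA3, and my secondary modifiers are ModifierB1, ModifierB2, and ModifierB3. And, to make it really ugly, let's have three methods that take the two template parameters, method1, method2, and method3, and give each two signatures (one taking a double, one taking an integer). In a normal C++ setting this is very common and does not require the horrible code that follows.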

cdef class SimulationManager:
    cdef SimulationManager_Object* pointer

    def __cinit__(self, ToModify toModify):
        self.pointer = new SimulationManager_Object(<ToModify_Object*>(toModify.pointer))

    def method1(self, str ModifierA, str ModifierB, someParameter):

        useInt = False

        if isinstance(someParameter, int):
            useInt = True
        elif not isinstance(someParameter, str):
            raise NotImplementedError("Third argument to method1 must be an int or a string")

        if ModifierA not in ["ModifierA1", "ModifierA2", "ModifierA3"]:
            raise NotImplementedError("ModifierA '%s' not handled in SimulationManager.method1" % ModifierA)
        if ModifierB not in ["ModifierB1", "ModifierB2", "ModifierB3"]:
            raise NotImplementedError("ModifierB '%s' not handled in SimulationManager.method1" % ModifierB)

        if ModifierA == "ModifierA1":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method1[ModifierA1, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA1, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method1[ModifierA1, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA1, ModifierB2](<str>someParameter)      
            else:
                if useInt:
                    return self.pointer.method1[ModifierA1, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA1, ModifierB3](<str>someParameter)

        elif ModifierA == "ModifierA2":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method1[ModifierA2, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA2, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method1[ModifierA2, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA2, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method1[ModifierA2, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA2, ModifierB3](<str>someParameter)

        elif ModifierA == "ModifierA3":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method1[ModifierA3, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA3, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method1[ModifierA3, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA3, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method1[ModifierA3, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method1[ModifierA3, ModifierB3](<str>someParameter)

    def method2(self, str ModifierA, str ModifierB, someParameter):

        useInt = False

        if isinstance(someParameter, int):
            useInt = True
        elif not isinstance(someParameter, str):
            raise NotImplementedError("Third argument to method2 must be an int or a string")

        if ModifierA not in ["ModifierA1", "ModifierA2", "ModifierA3"]:
            raise NotImplementedError("ModifierA '%s' not handled in SimulationManager.method2" % ModifierA)
        if ModifierB not in ["ModifierB1", "ModifierB2", "ModifierB3"]:
            raise NotImplementedError("ModifierB '%s' not handled in SimulationManager.method2" % ModifierB)

        if ModifierA == "ModifierA1":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method2[ModifierA1, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA1, ModifierB1](<str>someParameter)
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method2[ModifierA1, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA1, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method2[ModifierA1, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA1, ModifierB3](<str>someParameter)

        elif ModifierA == "ModifierA2":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method2[ModifierA2, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA2, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method2[ModifierA2, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA2, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method2[ModifierA2, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA2, ModifierB3](<str>someParameter)

        elif ModifierA == "ModifierA3":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method2[ModifierA3, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA3, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method2[ModifierA3, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA3, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method2[ModifierA3, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method2[ModifierA3, ModifierB3](<str>someParameter)


    def method3(self, str ModifierA, str ModifierB, someParameter):

        useInt = False

        if isinstance(someParameter, int):
            useInt = True
        elif not isinstance(someParameter, str):
            raise NotImplementedError("Third argument to method3 must be an int or a string")

        if ModifierA not in ["ModifierA1", "ModifierA2", "ModifierA3"]:
            raise NotImplementedError("ModifierA '%s' not handled in SimulationManager.method3" % ModifierA)
        if ModifierB not in ["ModifierB1", "ModifierB2", "ModifierB3"]:
            raise NotImplementedError("ModifierB '%s' not handled in SimulationManager.method3" % ModifierB)

        if ModifierA == "ModifierA1":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method3[ModifierA1, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA1, ModifierB1](<str>someParameter)
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method3[ModifierA1, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA1, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method3[ModifierA1, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA1, ModifierB3](<str>someParameter)

        elif ModifierA == "ModifierA2":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method3[ModifierA2, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA2, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method3[ModifierA2, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA2, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method3[ModifierA2, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA2, ModifierB3](<str>someParameter)

        elif ModifierA == "ModifierA3":
            if ModifierB == "ModifierB1":
                if useInt:
                    return self.pointer.method3[ModifierA3, ModifierB1](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA3, ModifierB1](<str>someParameter)                    
            elif ModifierB == "ModifierB2":
                if useInt:
                    return self.pointer.method3[ModifierA3, ModifierB2](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA3, ModifierB2](<str>someParameter)
            else:
                if useInt:
                    return self.pointer.method3[ModifierA3, ModifierB3](<int>someParameter)
                else:
                    return self.pointer.method3[ModifierA3, ModifierB3](<str>someParameter)

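Not only is this amount of code absurd for the functionality, it also means I now need to edit a .h file, a .cpp file, a .pxd file, and a .pyx file just to add one new type of modifier. Given that we programmers have an innate obsession with efficiency, this kind of process is unacceptable to me.

Again, I accept that this is, for now, a necessary process with Cython (though I can think of plenty of ways it could be improved; perhaps when I have more free time I will join the community effort). What I am asking is purely about the C++ side (unless there is a workaround in Cython that neither Google nor I know about).

One thing I had not considered before is a factory with state that indicates the type of object to create, and passing that around. However, this seems a little wasteful and, again, just sweeps the problem under the rug. I am really asking for ideas (or design patterns), if there are any, and I do not mind how crazy or incomplete they are; I just want to get some creativity flowing.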
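To make the combinatorial point concrete, here is a rough C++ sketch of how runtime polymorphism could flatten the two-parameter case. Everything in it is an assumption for illustration: the `PrimaryModifier` and `SecondaryModifier` interfaces, the `apply` method, the arithmetic inside each class, and the `primaryByName`, `secondaryByName`, and `method1ByName` helpers are hypothetical and not taken from the real program. Each family is resolved independently, so the lookup grows as N + M entries rather than N * M branches.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical interfaces for the two families of modifiers.
struct PrimaryModifier {
    virtual ~PrimaryModifier() {}
    virtual double apply(double x) const = 0;
};
struct SecondaryModifier {
    virtual ~SecondaryModifier() {}
    virtual double apply(double x) const = 0;
};

// Placeholder implementations; the arithmetic is arbitrary.
struct ModifierA1 : PrimaryModifier {
    double apply(double x) const override { return x + 1; }
};
struct ModifierA2 : PrimaryModifier {
    double apply(double x) const override { return x * 2; }
};
struct ModifierB1 : SecondaryModifier {
    double apply(double x) const override { return -x; }
};
struct ModifierB2 : SecondaryModifier {
    double apply(double x) const override { return x * x; }
};

// One non-template function stands in for every method1<A, B> instantiation.
double method1(const PrimaryModifier& a, const SecondaryModifier& b, double x) {
    return b.apply(a.apply(x));
}

// Each family has its own lookup, independent of the other family.
std::shared_ptr<PrimaryModifier> primaryByName(const std::string& name) {
    if (name == "ModifierA1") return std::make_shared<ModifierA1>();
    if (name == "ModifierA2") return std::make_shared<ModifierA2>();
    throw std::invalid_argument("unknown primary modifier: " + name);
}

std::shared_ptr<SecondaryModifier> secondaryByName(const std::string& name) {
    if (name == "ModifierB1") return std::make_shared<ModifierB1>();
    if (name == "ModifierB2") return std::make_shared<ModifierB2>();
    throw std::invalid_argument("unknown secondary modifier: " + name);
}

// The only entry point a wrapper would need: two strings and a value.
double method1ByName(const std::string& a, const std::string& b, double x) {
    return method1(*primaryByName(a), *secondaryByName(b), x);
}
```

The trade-off is a virtual call where the template version had a direct (possibly inlined) call, which may or may not matter inside a hot simulation loop; that would need measuring.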

1 Answer


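    OK, so I have been playing with the factory idea. I am still not convinced it is 'meaningful', but perhaps my obsession with 'justified' state is not worth the trouble in this case.

    To that end, I propose the following: a general factory class that returns general modifiers, with a templated child factory that handles some common (but class-specific) methods, and specific sub-factories that override the parameter ranges. This does mean relying on pointers (returning an abstract class pointer from the general factory), but I already use them in my original codebase (and not just for the sake of it, honest).

    I am not convinced this is the best approach (and I will not 'accept' it as the answer). However, it does mean I can avoid the nested if statements. I thought I would post it for comment. Some of your suggestions in the comments were excellent, and I would like to thank you all.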

    #include <vector>
    #include <map>
    #include <algorithm>
    #include <omp.h>
    #include <iostream>
    #include <iomanip>
    
    /*
     * Just a class containing some components to run through
     * a Modifier (see below)
     */
    class ToModify{
    public:
        std::vector<double> Components;
        ToModify(std::vector<double> components) : Components(components){}
    };
    
    /*
     * An abstract class which handles the modification of ToModify
     * components in an arbitrary way. They each have a range of valid
     * parameters.
     * These parameters have a minimum and maximum value, which is
     * to be determined by the _factory_.
     */
    class Modifier{
    protected:
        double Parameter;
    public:
        Modifier(double parameter = 0) : Parameter(parameter){}
        void setParameter(double parameter){
            Parameter = parameter;
        }
        double getParameter(){
            return Parameter;
        }
        virtual double getResult(double component) = 0;
        // Virtual destructor: scanModifiers deletes modifiers through a Modifier*.
        virtual ~Modifier(){}
    };
    
    /*
     * A generalised modifier factory, acting as the parent class
     * for the specialised ChildModifierFactories below. This will
     * be the type that the scanning method accepts as an argument.
     */
    class GeneralModifierFactory{
    public:
        virtual Modifier* get(double parameter) = 0;
        virtual double getMinParameter(ToModify const toModify) = 0;
        virtual double getMaxParameter(ToModify const toModify) = 0;
    };
    
    /*
     * This takes the type of modifier as a template argument. It
     * is designed to be a parent to the ModifierFactories that
     * follow. Other common methods that involve the modifier
     * can be placed here to save code.
     */
    template<class ChildModifier>
    class ChildModifierFactory : public GeneralModifierFactory{
    public:
        ChildModifier* get(double parameter){
            return new ChildModifier(parameter);
        }
        virtual double getMinParameter(ToModify const toModify) = 0;
        virtual double getMaxParameter(ToModify const toModify) = 0;
    };
    
    /*
     * Compute component ratios with a pre-factor.
     * The minimum is zero, such that getResult(component) == 0 for all components.
     * The maximum is such that getResult(component) <= 1 for all components.
     */
    class RatioModifier : public Modifier{
    public:
        RatioModifier(double parameter = 0) : Modifier(parameter){}
        double getResult(double component){
            return Parameter * component;
        }
    };
    
    /*
     * This class handles the ranges of parameters which are valid in
     * the RatioModifier. The parent class handles the constructions.
     */
    class RatioModifierFactory : public ChildModifierFactory<RatioModifier>{
    public:
    
        double getMaxParameter(ToModify const toModify){
            double maxComponent = *std::max_element(toModify.Components.begin(), toModify.Components.end());
            return 1.0 / maxComponent;
        }
    
        double getMinParameter(ToModify const toModify){
            return 0;
        }
    };
    
    /*
     * Compute the multiple of components with a factor f.
     * The minimum parameter is the minimum of the components,
     *     such that f(min(components)) == min(components)^2.
     * The maximum parameter is the maximum of the components,
     *     such that f(max(components)) == max(components)^2.
     */
    class MultipleModifier : public Modifier{
    public:
        MultipleModifier(double parameter = 0) : Modifier(parameter){}
        double getResult(double component){
            return Parameter * component;
        }
    };
    
    /*
     * This class handles the ranges of parameters which are valid in
     * the MultipleModifier. The parent class handles the constructions.
     */
    class MultipleModifierFactory : public ChildModifierFactory<MultipleModifier>{
    public:
        double getMaxParameter(ToModify const toModify){
            return *std::max_element(toModify.Components.begin(), toModify.Components.end());
        }
        double getMinParameter(ToModify const toModify){
            return *std::min_element(toModify.Components.begin(), toModify.Components.end());
        }
    };
    
    /*
     * A class to handle the mass-calculation of a ToModify objects' components
     * through a given Modifier child class, across a range of parameters.
     */
    class ModifyManager{
    protected:
        ToModify const Modify;
    public:
        ModifyManager(ToModify modify) : Modify(modify){}
    
        std::map<double, std::vector<double>> scanModifiers(GeneralModifierFactory& factory, unsigned steps){
            double min = factory.getMinParameter(Modify);
            double max = factory.getMaxParameter(Modify);
            double step = (max - min)/(steps-1);
    
            std::map<double, std::vector<double>> result;
    
            #pragma omp parallel for
            for(unsigned i = 0; i < steps; ++i){
                double parameter = min + step*i;
                Modifier* modifier = factory.get(parameter);
                std::vector<double> currentResult;
                for(double m : Modify.Components){
                    currentResult.push_back(modifier->getResult(m));
                }
                delete modifier;
                #pragma omp critical
                result[parameter] = currentResult;
            }
            return result;
        }
    
        void outputScan(GeneralModifierFactory& factory, unsigned steps){
            std::cout << std::endl << "-----------------" << std::endl;
            std::cout << "original: " << std::endl;
            std::cout << std::setprecision(3);
            for(double component : Modify.Components){
                std::cout << component << "\t";
            }
            std::cout << std::endl << "-----------------" << std::endl;
            std::map<double, std::vector<double>> scan = scanModifiers(factory, steps);
            for(std::pair<double,std::vector<double>> valueSet : scan){
                std::cout << "parameter: " << valueSet.first << ": ";
                std::cout << std::endl << "-----------------" << std::endl;
                for(double component : valueSet.second){
                    std::cout << component << "\t";
                }
                std::cout << std::endl << "-----------------" << std::endl;
            }
        }
    };
    
    int main(){
        ToModify m({1,2,3,4,5});
        ModifyManager manager(m);
        RatioModifierFactory ratio;
        MultipleModifierFactory multiple;
        manager.outputScan(ratio, 10);
        std::cout << " --------------- " << std::endl;
        manager.outputScan(multiple, 10);
        return 0;
    }
    

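    Now I can pass a wrapped factory class, or a string that can be converted into such a class (via a helper function), for each such argument. It is not quite ideal, since the factory has state that it does not need, unless it held the ToModify object as a member (which seems rather pointless). But, alas, it works.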
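One possible variation, offered only as a sketch: if the parameter ranges stay as static methods on each modifier (as in the question), a single templated adapter can expose them through the virtual factory interface, so no per-modifier factory class has to be written by hand. `StaticModifierFactory` is a hypothetical name, the classes are cut-down versions of those above, and `std::unique_ptr` is swapped in for the raw `new`/`delete` pair purely as an assumption about preferred ownership style.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

class ToModify {
public:
    std::vector<double> Components;
    ToModify(std::vector<double> components) : Components(components) {}
};

class Modifier {
protected:
    double Parameter;
public:
    explicit Modifier(double parameter = 0) : Parameter(parameter) {}
    virtual ~Modifier() {}
    virtual double getResult(double component) = 0;
};

// The range logic stays on the modifier as static methods, as in the question.
class RatioModifier : public Modifier {
public:
    explicit RatioModifier(double parameter = 0) : Modifier(parameter) {}
    double getResult(double component) override { return Parameter * component; }
    static double getMinParameter(const ToModify&) { return 0; }
    static double getMaxParameter(const ToModify& t) {
        return 1.0 / *std::max_element(t.Components.begin(), t.Components.end());
    }
};

// Runtime interface that a non-template scan method can accept.
class GeneralModifierFactory {
public:
    virtual ~GeneralModifierFactory() {}
    virtual std::unique_ptr<Modifier> get(double parameter) const = 0;
    virtual double getMinParameter(const ToModify& toModify) const = 0;
    virtual double getMaxParameter(const ToModify& toModify) const = 0;
};

// Hypothetical adapter: forwards each virtual call to M's constructor or
// static methods, and holds no data members of its own.
template <class M>
class StaticModifierFactory : public GeneralModifierFactory {
public:
    std::unique_ptr<Modifier> get(double parameter) const override {
        return std::unique_ptr<Modifier>(new M(parameter));
    }
    double getMinParameter(const ToModify& t) const override {
        return M::getMinParameter(t);
    }
    double getMaxParameter(const ToModify& t) const override {
        return M::getMaxParameter(t);
    }
};
```

A scan method taking `const GeneralModifierFactory&` could then be called with `StaticModifierFactory<RatioModifier>()`. Passing `ToModify` by const reference also avoids copying the large lattice object on every range query.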
