
How can I make immutable F# more performant?


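I want to rewrite a big chunk of C# code in immutable F#. It's a device monitor, and the current implementation works by constantly reading data from a serial port and updating member variables based on the new data. I'd like to move that to F# and get the benefits of immutable records, but my first shot at a proof-of-concept implementation is really slow.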

open System
open System.Diagnostics

type DeviceStatus = { RPM         : int;
                      Pressure    : int;
                      Temperature : int }

// I'm assuming my actual implementation, using serial data, would be something like 
// "let rec UpdateStatusWithSerialReadings (status:DeviceStatus) (serialInput:string[])".
// where serialInput is whatever the device streamed out since the previous check: something like
// ["RPM=90","Pres=50","Temp=85","RPM=40","Pres=23", etc.]
// The device streams out different parameters at different intervals, so I can't just wait for them all to arrive and aggregate them all at once.
// I'm just doing a POC here, so want to eliminate noise from parsing etc.
// So this just updates the status's RPM i times and returns the result.
let rec UpdateStatusITimes (status:DeviceStatus) (i:int) = 
    match i with
    | 0 -> status
    | _ -> UpdateStatusITimes {status with RPM = 90} (i - 1)

let initStatus = { RPM = 80 ; Pressure = 100 ; Temperature = 70 }
let stopwatch = new Stopwatch()

stopwatch.Start()
let endStatus = UpdateStatusITimes initStatus 100000000
stopwatch.Stop()

printfn "endStatus.RPM = %A" endStatus.RPM
printfn "stopwatch.ElapsedMilliseconds = %A" stopwatch.ElapsedMilliseconds
Console.ReadLine() |> ignore

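This runs in about 1400 ms on my machine, whereas the equivalent C# code (with mutable member variables) runs in about 310 ms. Is there any way to speed this up without losing the immutability? I was hoping the F# compiler would notice that initStatus and all the intermediate status values are never reused, and would simply mutate those records behind the scenes, but I guess not.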

4 Answers

  • 4

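    In the F# community, imperative code and mutable data are not frowned upon as long as they are not part of your public interface. That is, using mutable data is fine as long as you encapsulate it and isolate it from the rest of your code. To that end, I suggest something like the following: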

    type DeviceStatus =
      { RPM         : int
        Pressure    : int
        Temperature : int }
    
    // one of the rare scenarios in which I prefer explicit classes,
    // to avoid writing out all the get/set properties for each field
    [<Sealed>]
    type private DeviceStatusFacade =
        val mutable RPM         : int
        val mutable Pressure    : int
        val mutable Temperature : int
        new(s) =
            { RPM = s.RPM; Pressure = s.Pressure; Temperature = s.Temperature }
        member x.ToDeviceStatus () =
            { RPM = x.RPM; Pressure = x.Pressure; Temperature = x.Temperature }
    
    let UpdateStatusITimes status i =
        let facade = DeviceStatusFacade(status)
        let rec impl i =
            if i > 0 then
                facade.RPM <- 90
                impl (i - 1)
        impl i
        facade.ToDeviceStatus ()
    
    let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }
    let stopwatch = System.Diagnostics.Stopwatch.StartNew ()
    let endStatus = UpdateStatusITimes initStatus 100000000
    stopwatch.Stop ()
    
    printfn "endStatus.RPM = %d" endStatus.RPM
    printfn "stopwatch.ElapsedMilliseconds = %d" stopwatch.ElapsedMilliseconds
    stdin.ReadLine () |> ignore
    

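    This way the public interface is unaffected: UpdateStatusITimes still takes and returns an intrinsically immutable DeviceStatus, but internally it uses a mutable class to eliminate the allocation overhead.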

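    EDIT: (In response to a comment) This is the style of class I would normally prefer, using a primary constructor and let bindings plus properties rather than val fields: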

    [<Sealed>]
    type private DeviceStatusFacade(status) =
        let mutable rpm      = status.RPM
        let mutable pressure = status.Pressure
        let mutable temp     = status.Temperature
        member x.RPM         with get () = rpm      and set n = rpm      <- n
        member x.Pressure    with get () = pressure and set n = pressure <- n
        member x.Temperature with get () = temp     and set n = temp     <- n
        member x.ToDeviceStatus () =
            { RPM = rpm; Pressure = pressure; Temperature = temp }
    

    But for simple facade classes, where each property is a blind getter/setter, I find this a bit tedious.

    F# 3 allows the following instead, but personally I still don't find it an improvement (unless one dogmatically avoids fields):

    [<Sealed>]
    type private DeviceStatusFacade(status) =
        member val RPM         = status.RPM with get, set
        member val Pressure    = status.Pressure with get, set
        member val Temperature = status.Temperature with get, set
        member x.ToDeviceStatus () =
            { RPM = x.RPM; Pressure = x.Pressure; Temperature = x.Temperature }
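
To connect the encapsulation idea above to the serial data described in the question, here is a minimal sketch. It assumes readings arrive as "Key=Value" strings using the key names from the question's sample (RPM, Pres, Temp); the function name updateStatusWithReadings is hypothetical, and the real device protocol may differ. The mutation stays local to the function, so callers only ever see immutable records.

```fsharp
type DeviceStatus =
    { RPM         : int
      Pressure    : int
      Temperature : int }

// Hypothetical helper: applies one batch of "Key=Value" readings to a status.
// Key names follow the sample in the question and are an assumption.
let updateStatusWithReadings (status: DeviceStatus) (readings: string[]) : DeviceStatus =
    // Local mutable state; it never escapes this function.
    let mutable rpm = status.RPM
    let mutable pressure = status.Pressure
    let mutable temp = status.Temperature
    for reading in readings do
        match reading.Split('=') with
        | [| "RPM"; v |]  -> rpm <- int v
        | [| "Pres"; v |] -> pressure <- int v
        | [| "Temp"; v |] -> temp <- int v
        | _ -> () // unrecognised input is ignored in this sketch
    // One record allocation per batch instead of one per reading.
    { RPM = rpm; Pressure = pressure; Temperature = temp }

let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }
let updated =
    updateStatusWithReadings initStatus [| "RPM=90"; "Pres=50"; "Temp=85"; "RPM=40"; "Pres=23" |]
printfn "%A" updated
```

Note that `int v` throws on malformed numbers, so production code would probably want `System.Int32.TryParse` instead.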
    
  • 7

    This won't answer your question, but it is probably worth stepping back and considering the big picture:

    • What do you see as the advantage of immutable data structures for this use case? F# supports mutable data structures, too.

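    • You claim the F# version is "really slow", but it takes only about 4.5 times as long as the C# code and still performs more than 70 million updates per second. Is that likely to be unacceptable performance for your actual application? Do you have a specific performance target in mind? Is there reason to believe this type of code will be the bottleneck in your application?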
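
The two figures in the last bullet follow directly from the timings reported in the question (1400 ms and 310 ms for 100,000,000 updates). A quick check of the arithmetic:

```fsharp
let fsharpMs = 1400.0       // F# time reported in the question, in milliseconds
let csharpMs = 310.0        // C# time reported in the question, in milliseconds
let updates  = 100000000.0  // iterations in the benchmark

let slowdown = fsharpMs / csharpMs
let updatesPerSecond = updates / (fsharpMs / 1000.0)

printfn "slowdown: %.1fx" slowdown                                      // slowdown: 4.5x
printfn "throughput: %.1f million updates/s" (updatesPerSecond / 1e6)   // throughput: 71.4 million updates/s
```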

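    Design is always about tradeoffs. You may find that, for recording many changes in a short period of time, immutable data structures carry a performance penalty that is unacceptable for your needs. On the other hand, if you have requirements such as keeping track of several older versions of a data structure at once, the benefits of immutable data structures may make them attractive despite the performance penalty.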
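
As an illustration of the "multiple older versions" case, here is a minimal sketch. The Reading type and the apply function are hypothetical names introduced for this example, not part of the question's code. Because every update yields a new record, earlier states remain valid and can be kept for free:

```fsharp
type DeviceStatus =
    { RPM : int; Pressure : int; Temperature : int }

// Hypothetical update type, introduced only for this illustration.
type Reading =
    | Rpm of int
    | Pres of int
    | Temp of int

// Returns a new record; the input status is left untouched.
let apply (status: DeviceStatus) (reading: Reading) : DeviceStatus =
    match reading with
    | Rpm v  -> { status with RPM = v }
    | Pres v -> { status with Pressure = v }
    | Temp v -> { status with Temperature = v }

let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }

// List.scan keeps every intermediate state, oldest first.
let history = List.scan apply initStatus [ Rpm 90; Pres 50; Temp 85 ]
let latest = List.last history

printfn "versions kept: %d" (List.length history)
printfn "latest: %A" latest
```

With mutable state, keeping the same history would require copying the object before every update.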

  • 12

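    I suspect the performance problem you are seeing comes from the block memory zeroing involved in cloning the record on every iteration of the loop (plus the negligible time spent allocating it and later garbage collecting it). You could rewrite your example using a struct: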

    [<Struct>]
    type DeviceStatus =
        val RPM : int
        val Pressure : int
        val Temperature : int
        new(rpm:int, pres:int, temp:int) = { RPM = rpm; Pressure = pres; Temperature = temp }
    
    let rec UpdateStatusITimes (status:DeviceStatus) (i:int) = 
        match i with
        | 0 -> status
        | _ -> UpdateStatusITimes (DeviceStatus(90, status.Pressure, status.Temperature)) (i - 1)
    
    let initStatus = DeviceStatus(80, 100, 70)
    

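    Performance will now be close to that of using global mutable variables, or of redefining UpdateStatusITimes status i as UpdateStatusITimes rpm pres temp i. This only works if your struct is no more than 16 bytes long; otherwise it will be copied in the same sluggish manner as the record.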
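
The second alternative mentioned above, passing the fields as separate arguments, might look like the following sketch. The inner function name loop is an assumption for illustration, and no timing is claimed here. The public signature still takes and returns a record, while the loop itself carries only plain ints:

```fsharp
type DeviceStatus =
    { RPM : int; Pressure : int; Temperature : int }

let updateStatusITimes (status: DeviceStatus) (i: int) : DeviceStatus =
    // The tail-recursive loop passes the fields as plain arguments,
    // so no record is allocated per iteration.
    let rec loop rpm pres temp i =
        match i with
        | 0 -> { RPM = rpm; Pressure = pres; Temperature = temp } // single allocation at the end
        | _ -> loop 90 pres temp (i - 1)
    loop status.RPM status.Pressure status.Temperature i

let initStatus = { RPM = 80; Pressure = 100; Temperature = 70 }
let endStatus = updateStatusITimes initStatus 1000000
printfn "endStatus.RPM = %d" endStatus.RPM
```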

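    If, as you hinted in your comments, you intend to use this as part of a shared-memory multithreaded design, then you will need mutability at some point. Your choices are (a) a shared mutable variable for each parameter, (b) one shared mutable variable containing a struct, or (c) a shared facade object containing mutable fields (as in ildjarn's answer). I would go for the last one, since it is nicely encapsulated and scales beyond four int fields.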
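
Option (c) could be sketched as below. The class name SharedDeviceStatus, its members, and the choice of a simple lock for synchronization are all assumptions made for this illustration; the answer does not prescribe a locking strategy. Writers mutate fields under the lock, and readers take an immutable snapshot:

```fsharp
open System.Threading.Tasks

type DeviceStatus =
    { RPM : int; Pressure : int; Temperature : int }

// Hypothetical shared facade: mutable fields guarded by a lock.
[<Sealed>]
type SharedDeviceStatus(initial: DeviceStatus) =
    let gate = obj ()
    let mutable rpm = initial.RPM
    let mutable pressure = initial.Pressure
    let mutable temp = initial.Temperature
    member x.SetRpm(v: int) = lock gate (fun () -> rpm <- v)
    member x.SetPressure(v: int) = lock gate (fun () -> pressure <- v)
    member x.SetTemperature(v: int) = lock gate (fun () -> temp <- v)
    // Readers get a consistent, immutable copy of all fields.
    member x.Snapshot() : DeviceStatus =
        lock gate (fun () -> { RPM = rpm; Pressure = pressure; Temperature = temp })

let shared = SharedDeviceStatus({ RPM = 80; Pressure = 100; Temperature = 70 })

// Several threads update the RPM concurrently; which value lands last is not deterministic.
Parallel.For(0, 1000, fun i -> shared.SetRpm i) |> ignore

let snapshot = shared.Snapshot()
printfn "snapshot = %A" snapshot
```

Taking the snapshot under the same lock means a reader never sees a half-applied set of fields, which is harder to guarantee with option (a).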

  • 7

    Using a tuple as follows is 15 times faster than your original solution:

    open System.Diagnostics // needed for Stopwatch below
    
    type DeviceStatus = int * int * int
    
    let rec UpdateStatusITimes (rpm, pressure, temp) (i:int) = 
        match i with
        | 0 -> rpm, pressure, temp
        | _ -> UpdateStatusITimes (90,pressure,temp) (i - 1)
    
    while true do
      let initStatus = 80, 100, 70
      let stopwatch = new Stopwatch()
    
      stopwatch.Start()
      let rpm,_,_ as endStatus = UpdateStatusITimes initStatus 100000000
      stopwatch.Stop()
    
      printfn "endStatus.RPM = %A" rpm
      printfn "Took %fs" stopwatch.Elapsed.TotalSeconds
    

    By the way, you should use stopwatch.Elapsed.TotalSeconds when timing.
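
One reason to prefer it: ElapsedMilliseconds is a whole number of milliseconds, whereas Elapsed.TotalSeconds is a float that keeps the fractional part. A minimal sketch of a reusable timing helper follows; the name time is hypothetical:

```fsharp
open System.Diagnostics

// Hypothetical helper: runs f once and returns its result together with
// the elapsed wall-clock time in (fractional) seconds.
let time (f: unit -> 'a) : 'a * float =
    let stopwatch = Stopwatch.StartNew()
    let result = f ()
    stopwatch.Stop()
    result, stopwatch.Elapsed.TotalSeconds

let total, seconds = time (fun () -> List.sum [ 1 .. 1000 ])
printfn "total = %d, took %fs" total seconds
```

For benchmarks this short, running the measured function several times and discarding the first run (which includes JIT compilation) gives more stable numbers.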
