Corrupted winmail.dat attachment when using ActionMailer in a Rails application

I'm using ActionMailer in a Ruby on Rails application to read emails (Ruby 1.9.3, Rails 3.2.13). I have an email with a winmail.dat attachment (ms-tnef), and I'm using the tnef gem to extract its contents.

The problem is that when I read the attachment from the mail, it gets corrupted, and tnef cannot extract the files from it.

$ tnef winmail.dat
ERROR: invalid checksum, input file may be corrupted

If I extract the winmail.dat attachment with any mail application instead, the extracted file works fine with tnef and I get its contents.

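Comparing the two files, I noticed:

- The original file is larger (76k vs. 72k).
- They differ in line endings: the original file has Windows line endings (0D 0A), while the file saved by Rails has Linux line endings (0A).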
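One way to confirm this kind of difference is to count line-ending byte sequences in each file. The following is a minimal sketch using only the Ruby standard library; the helper name `newline_stats` and the sample strings are made up for illustration and are not part of the original test.

```ruby
# Count line-ending byte sequences in a string, treating it as raw bytes.
# `newline_stats` is a hypothetical helper name used only for this sketch.
def newline_stats(bytes)
  data = bytes.dup.force_encoding(Encoding::BINARY)
  crlf = data.scan("\r\n").size
  {
    size: data.bytesize,
    crlf: crlf,
    bare_lf: data.count("\n") - crlf,  # LF bytes not preceded by CR
    bare_cr: data.count("\r") - crlf   # CR bytes not followed by LF
  }
end

# Made-up sample data standing in for the two winmail.dat files.
original  = "x\r\ny\r\nz"
rewritten = original.gsub("\r\n", "\n")

p newline_stats(original)
p newline_stats(rewritten)
```

For real files, the input could be read with `File.binread(path)` so that Ruby does not apply any newline translation while reading.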

I wrote this test:

it 'should extract winmail.dat from email and extract its contents' do
    file_path = "#{::Rails.root}/spec/files/winmail-dat-001.eml"
    message = Mail::Message.new(File.read(file_path))
    anexo = message.attachments[0]
    files = []
    Tnef.unpack(anexo) do |file|
      files << File.basename(file)
    end
    puts files.inspect
    files.size.should == 2
end

It fails with these messages:

WARNING: invalid checksum, input file may be corrupted
Invalid RTF CRC, input file may be corrupted
WARNING: invalid checksum, input file may be corrupted

Assertion failed: ((attr->lvl_type == LVL_MESSAGE) || (attr->lvl_type == LVL_ATTACHMENT)), function attr_read, file attr.c, line 240.

Errno::EPIPE: Broken pipe


anexo = message.attachments[0]
 => #<Mail::Part:2159872060, Multipart: false, Headers: <Content-Type: application/ms-tnef; name="winmail.dat">, <Content-Transfer-Encoding: quoted-printable>, <Content-Disposition: attachment; filename="winmail.dat">>

I tried saving it to disk and reading it back, but I got the same result:

it 'should extract winmail.dat from email and extract its contents' do
    file_path = "#{::Rails.root}/spec/files/winmail-dat-001.eml"
    message = Mail::Message.new(File.read(file_path))
    anexo = message.attachments[0]

    tmpfile_name = "#{::Rails.root}/tmp/#{anexo.filename}"
    File.open(tmpfile_name, 'w+b', 0644) { |f| f.write anexo.body.decoded }
    anexo = File.open(tmpfile_name)

    files = []
    Tnef.unpack(anexo) do |file|
      files << File.basename(file)
    end
    puts files.inspect
    files.size.should == 2
end
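Writing a file in binary mode does not change its bytes, so saving to disk can neither cause nor repair the damage; the corruption has to happen earlier, when the body is decoded. A small sketch of that check, using only the standard library (the helper name `disk_round_trip` and the sample bytes are assumptions for illustration):

```ruby
require "tempfile"

# Write bytes to a temporary file in binary mode and read them back.
# `disk_round_trip` is a hypothetical helper name used only for this sketch.
def disk_round_trip(bytes)
  Tempfile.create("winmail") do |f|
    f.binmode
    f.write(bytes)
    f.flush
    File.binread(f.path)
  end
end

# Made-up binary sample that contains the bytes 0D 0A.
payload = [0x01, 0x0D, 0x0A, 0x02, 0xFF].pack("C*")

puts disk_round_trip(payload) == payload
```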

How should I read the attachment?

1 Answer

    The anexo.body.decoded method calls the decode method of the encoding (Mail::Encodings) that best fits the attachment, which in your case is quoted_printable.

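    Some of these encodings (7bit, 8bit and quoted_printable) perform a conversion that normalizes the different kinds of line breaks. quoted_printable calls .to_lf, which rewrites every CRLF as LF and so corrupts the winmail.dat file: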

    # Decode the string from Quoted-Printable. Cope with hard line breaks
    # that were incorrectly encoded as hex instead of literal CRLF.
    def self.decode(str)
      str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first.to_lf
    end
    

    mail/core_extensions/string.rb:

    def to_lf
      to_str.gsub(/\n|\r\n|\r/) { "\n" }
    end
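The effect can be reproduced without the mail gem. The sketch below is an illustration only: `normalize_to_lf` is a local copy of the substitution shown above, `qp_round_trip` is a hypothetical helper, and the sample bytes are made up. It suggests that the quoted-printable round trip by itself preserves the bytes, and that the extra newline normalization is what drops a byte for every 0D 0A pair.

```ruby
# Local copy of the newline normalization shown above (not the gem's own method).
def normalize_to_lf(str)
  str.gsub(/\n|\r\n|\r/) { "\n" }
end

# Encode bytes as quoted-printable and decode them again without touching newlines.
# `qp_round_trip` is a hypothetical helper name used only for this sketch.
def qp_round_trip(bytes)
  [bytes].pack("M").unpack("M*").first
end

# Made-up binary sample that contains the bytes 0D 0A.
payload = [0x01, 0x0D, 0x0A, 0x02, 0xFF].pack("C*")

decoded = qp_round_trip(payload)
mangled = normalize_to_lf(decoded)

puts decoded == payload                        # plain decode keeps every byte
puts payload.bytesize - mangled.bytesize       # bytes lost to normalization
```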
    

    To fix this, you can perform the same decoding without the final .to_lf. To do that, create a new encoding that does not corrupt your file and use it to decode your attachment.

    Create the file lib/encodings/tnef_encoding.rb:

    require 'mail/encodings/7bit'
    
    module Mail
      module Encodings
    
        # Encoding to handle Microsoft TNEF format
        # It's pretty similar to quoted_printable, except for the 'to_lf' (decode) and 'to_crlf' (encode)
        class TnefEncoding < SevenBit
          NAME='tnef'
    
          PRIORITY = 2
    
          def self.can_encode?(str)
            EightBit.can_encode? str
          end
    
          def self.decode(str)
            # **difference here** removed '.to_lf'
            str.gsub(/(?:=0D=0A|=0D|=0A)\r\n/, "\r\n").unpack("M*").first
          end
    
          def self.encode(str)
            # **difference here** removed the trailing '.to_crlf' (the input is still normalized with 'to_lf')
            [str.to_lf].pack("M")
          end
    
          def self.cost(str)
            # These bytes probably do not need encoding
            c = str.count("\x9\xA\xD\x20-\x3C\x3E-\x7E")
            # Everything else turns into =XX where XX is a
            # two digit hex number (taking 3 bytes)
            total = (str.bytesize - c)*3 + c
            total.to_f/str.bytesize
          end
    
          private
    
          Encodings.register(NAME, self)
        end
      end
    end
    

    To use the custom encoding, you must first register it:

    Mail::Encodings.register('tnef', Mail::Encodings::TnefEncoding)
    

    Then set it as the preferred encoding for the attachment:

    anexo.body.encoding('tnef')
    

    Your test then becomes:

    it 'should extract winmail.dat from email and extract its contents' do
        file_path = "#{::Rails.root}/spec/files/winmail-dat-001.eml"
        message = Mail::Message.new(File.read(file_path))
        anexo = message.attachments[0]
    
        tmpfile_name = "#{::Rails.root}/tmp/#{anexo.filename}"
        Mail::Encodings.register('tnef', Mail::Encodings::TnefEncoding)
        anexo.body.encoding('tnef')
        File.open(tmpfile_name, 'w+b', 0644) { |f| f.write anexo.body.decoded }
        anexo = File.open(tmpfile_name)
    
        files = []
        Tnef.unpack(anexo) do |file|
            files << File.basename(file)
        end
        puts files.inspect
        files.size.should == 2
    end
    

    Hope this helps!
