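I'm deep in the weeds reverse engineering a very old proprietary document storage format (Keyfile). Embedded in the middle of larger files are blocks of image data encoded with CCITT Group 4 (CCITT4), each one a scan of a single document page. So far I've learned enough about the files and the TIFF spec to write a filter that extracts the data from the source file and writes it to a new file that should be a plain TIFF, but it's not quite there yet and I can't figure out what I'm still missing.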

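Encouragingly, Adobe Photoshop opens my newly minted TIFF files and displays the document just fine (no errors, no warnings). Unfortunately, none of the other common tools will. I'm on a Mac and have access to Linux, so I've tried: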

  • Gimp

  • Preview (OS X)

  • ImageMagick

  • Some libtiff utilities, such as fax2pdf

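I suspect there's still something wrong with my TIFF file that Photoshop is silently ignoring. I hope it isn't in the raw CCITT4 image data, since I'd rather not have to write code to fully decode it.

I can't post the files I'm working with because they contain sensitive data. However, it may be that I've simply done something wrong in my TIFF header block that someone can point out. To that end, here is some basic information about my test file (the one that opens in Photoshop).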

Keyfile.tiff 31K (32300 bytes)
 Keyfile TIFF Version 1.01
   0100.0004.00000001.000009f0 ImageWidth
   0101.0004.00000001.00000ce0 ImageLength
   0102.0003.00000001.00000001 BitsPerSample
   0103.0003.00000001.00000004 Compression
   0106.0003.00000001.00000000 PhotometricInterpretation
   0111.0004.00000001.00000200 StripOffsets
   0115.0003.00000001.00000001 SamplesPerPixel
   0116.0004.00000001.00000ce0 RowsPerStrip
   0117.0004.00000001.00007c2c StripByteCounts
   011a.0005.00000001.000001d6 XResolution
   011b.0005.00000001.000001de YResolution
   0128.0004.00000001.00000002 ResolutionUnit
   0131.0002.0000001a.000001e6 Software

This decoding of the TIFF header block comes from code I wrote. Here is a hex dump of the header portion of the file, up to address 0x200.

49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C000017010400010000002C7C00001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E303100
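For anyone who wants to reproduce a listing like the one above from the raw bytes, here is a minimal Python sketch. It is not the filter code itself; the names `parse_ifd`, `dump` and `TAG_NAMES` are made up for illustration, and it only reads the first IFD and only treats a single SHORT as an inline value (everything else is shown as the raw 32-bit value or offset field).

```python
import struct
import sys

# Names of the baseline TIFF 6.0 tags that appear in the listing above.
TAG_NAMES = {
    0x0100: "ImageWidth", 0x0101: "ImageLength", 0x0102: "BitsPerSample",
    0x0103: "Compression", 0x0106: "PhotometricInterpretation",
    0x0111: "StripOffsets", 0x0115: "SamplesPerPixel", 0x0116: "RowsPerStrip",
    0x0117: "StripByteCounts", 0x011A: "XResolution", 0x011B: "YResolution",
    0x0128: "ResolutionUnit", 0x0131: "Software",
}


def parse_ifd(data):
    """Return (tag, type, count, value_or_offset) tuples from the first IFD."""
    # "II" means little-endian, "MM" means big-endian.
    order = {b"II": "<", b"MM": ">"}[data[:2]]
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (n_entries,) = struct.unpack_from(order + "H", data, ifd_offset)
    entries = []
    for i in range(n_entries):
        base = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, typ, count = struct.unpack_from(order + "HHI", data, base)
        if typ == 3 and count == 1:
            # A single SHORT sits in the first two bytes of the value field.
            (value,) = struct.unpack_from(order + "H", data, base + 8)
        else:
            # Otherwise show the 4-byte field as-is (inline LONG or an offset).
            (value,) = struct.unpack_from(order + "I", data, base + 8)
        entries.append((tag, typ, count, value))
    return entries


def dump(entries):
    """Print entries in the same tag.type.count.value layout as the listing."""
    for tag, typ, count, value in entries:
        name = TAG_NAMES.get(tag, "?")
        print(f"{tag:04x}.{typ:04x}.{count:08x}.{value:08x} {name}")


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        dump(parse_ifd(f.read()))
```

Run it as `python dump_ifd.py Keyfile.tiff` (the script name is arbitrary) to print one line per IFD entry.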

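After the header come 0x7c2c bytes of compressed image data. I say this based on the TIFF Compression tag (4), which was copied intact from the original file, and on having looked through dozens and dozens of files in a hex editor and learned to recognize the image data blocks. Also, the fact that Photoshop opens this file seems to indicate that I'm right.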
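As a sanity check on the offsets, the numbers in the tag listing can be tested for overlaps and for running past the end of the file. The sketch below is only an illustration: `check_layout` is a made-up helper, and the region lengths are derived from the tag types (a RATIONAL is 8 bytes, the Software string is 0x1a bytes, the IFD is 2 + 13 × 12 + 4 bytes).

```python
def check_layout(file_size, regions):
    """regions is a list of (name, offset, length); returns a list of problems."""
    problems = []
    ordered = sorted(regions, key=lambda r: r[1])
    for name, offset, length in ordered:
        if offset + length > file_size:
            problems.append(f"{name} runs past end of file")
    # Compare each region with the next one in file order.
    for (n1, o1, l1), (n2, o2, _) in zip(ordered, ordered[1:]):
        if o1 + l1 > o2:
            problems.append(f"{n1} overlaps {n2}")
    return problems


# Offsets and sizes taken from the tag listing for Keyfile.tiff (32300 bytes).
regions = [
    ("IFD", 8, 2 + 13 * 12 + 4),
    ("XResolution", 0x1D6, 8),
    ("YResolution", 0x1DE, 8),
    ("Software", 0x1E6, 0x1A),
    ("strip 0", 0x200, 0x7C2C),
]
print(check_layout(32300, regions))  # prints []
```

With these values the strip ends exactly at byte 32300 (0x200 + 0x7c2c), so the file layout itself is self-consistent; this says nothing about whether the bytes inside the strip are valid Group 4 data.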

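Any help figuring out what I still need to do to make this file compatible with other utilities would be greatly appreciated.

For what it's worth, here is the error that ImageMagick produces:

    convert Keyfile.tiff Keyfile.pdf
    convert: Premature EOL at line 0 of strip 0 (got 0, expected 2544). `Fax4Decode' @ warning/tiff.c/TIFFWarnings/881.

I'm new to encoding TIFF, so any utilities or tips that would let me gather more details about what's going on would also be appreciated.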

Update:

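Here are the first 0x318 bytes of the file. There's nothing sensitive in here, and it gives you the first 0x118 bytes of image data. I can provide more of the file if needed.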

49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C000017010400010000002C7C00001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E3031000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFFFFFC8085B51FFFFFFFFFFFFFFFFFFFFFFFF90154E0C4221836AC80A900F04142050814204679705E823C0D3089900E92D641B9B1D2907364E94886C112854118E6208686E6492B47D11C1A29289806DC25083A41427495102E6D349641736AA96439B08496113867960B314A08CC1A2102141410221AADC28102123E918508E02AC41143D2C5131C3C68B1620B8CCB02A8238F564536394D16F11AA050CEA8A9944105DB92591D12D04513E195B23E1252561A742191D11B0628110DA6E5259A6881891832C74B704A0C8F1B4618450E2AA4087391D17988888EA41CDAD8A2B0AAA4436A2647D94CC585

Update 2:

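OK, I've found a file that I can post. It's a mostly white page, but if it renders correctly you'll see two dark crescent moons, which are reflections of the punched holes on the original scanned page. There's also some noise along the right side and the top. Here's what it looks like (image):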

Sample problem page

I used Photoshop to convert and save a version I could upload. Here is a hex dump of the file generated by my code, which opens fine in Photoshop but in nothing else.

49492A00080000000D000001040001000000F00900000101040001000000E00C00000201030001000000010000000301030001000000040000000601030001000000000000001101040001000000000200001501030001000000010000001601040001000000E00C00001701040001000000530300001A01050001000000D60100001B01050001000000DE010000280104000100000002000000310102001A000000E6010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002C010000010000002C010000010000004B657966696C6520544946462056657273696F6E20312E3031002C19461170350282E88E8AF52889A91024623806A1C8F97C8E8D111D1847115B44CF3A2388DA2E8C2388122F98C868E23451112508B88600D4297C8E88E44788F91E308BC4745CC8F91E23A2EC8E88F11E23B36447C8F11CC8E611020711111A6888390E39C738E0848E8BA23A388D4A224111B03681C206478DA892946E2E06D06B51121718036032092844E0AE470350604AA229C88E0680CC224511803402E24A11F88E0660D8224A40CD1016ACC8E0B606048906482C101752460C8E19006E224AC3203901D091B03C08122D9C0DA12141BFFFFFFFFFFFFFFFFFFFFFFFFFFFFF2D2125082123F1A2EA08124122EB6820A475E2105130A8209826474388886475612449543B295550C8E88224EC591D1174295B23A48C0EC591E08762111E23A2F9F46D11D02E22323E088A3870447542223EE35BDF56AD5856AD430A1856AC2879692C06C2FC304259A688BA23D2D23211A4088FC504162A5373447C20A2396062188A891F23F7C48E89502F41A46D11B417126E51328709EDE4747D04171D8B23A650E5714E13158921F111588AB0AF72CA6AB50ED27690664750C286B6B1B29D351609F21976B8685A8613C309A96014631FFFFFFFFFFFFF2039C720383A5C5DFEB56B0B51FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF9601A8FFFFFFFFFFFFFFFFFFFFFFFFFFCEC6947FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF95CEA3FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE5A852A3FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFC004004
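To eyeball the start of the strip as a bit stream, a small helper like the one below can print the bytes at offset 0x200 in binary. This is just a sketch: `bit_dump` is a made-up name, and the `lsb_first` option is only there in case the bit order within each byte turns out to matter (TIFF has a FillOrder tag, 266, for that; nothing here confirms it is relevant to this file).

```python
def bit_dump(data, lsb_first=False):
    """Render bytes as space-separated 8-bit groups."""
    groups = []
    for byte in data:
        s = format(byte, "08b")  # most significant bit first
        groups.append(s[::-1] if lsb_first else s)
    return " ".join(groups)


# First four bytes of the strip in the dump above (file offset 0x200).
strip_start = bytes.fromhex("2C194611")
print(bit_dump(strip_start))
print(bit_dump(strip_start, lsb_first=True))
```

Comparing either rendering against the Group 4 code tables by hand is tedious, but it is a cheap way to see what a decoder would be looking at on the first scan line.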

Here are its specs.

Keyfile_66.tiff 1K (1363 bytes)
Keyfile TIFF Version 1.01
 0100.0004.00000001.000009f0 ImageWidth
 0101.0004.00000001.00000ce0 ImageLength
 0102.0003.00000001.00000001 BitsPerSample
 0103.0003.00000001.00000004 Compression
 0106.0003.00000001.00000000 PhotometricInterpretation
 0111.0004.00000001.00000200 StripOffsets
 0115.0003.00000001.00000001 SamplesPerPixel
 0116.0004.00000001.00000ce0 RowsPerStrip
 0117.0004.00000001.00000353 StripByteCounts
 011a.0005.00000001.000001d6 XResolution
 011b.0005.00000001.000001de YResolution
 0128.0004.00000001.00000002 ResolutionUnit
 0131.0002.0000001a.000001e6 Software

Here is a link to download the file.

Any ideas as to why would be greatly appreciated.