
Reading and base64-encoding a binary file


I'm trying to read a binary file from the filesystem and then base64 encode it in JavaScript. I'm using the FileReader API to read the data and the base64 encoder found here.

My code seems close to working; the problem is that the generated base64 data is wrong. Here's what I've got so far:

function saveResource() {
    var file = $(".resourceFile")[0].files[0];

    var reader = new FileReader();
    reader.onload = function(evt) {
        var fileData = evt.target.result;
        var bytes = new Uint8Array(fileData);
        var binaryText = '';

        for (var index = 0; index < bytes.byteLength; index++) {
            binaryText += String.fromCharCode( bytes[index] );
        }

        console.log(Base64.encode(binaryText));

    };
    reader.readAsArrayBuffer(file);
};

This is the file I'm testing with (it's a 100x100 blue square):

[image: 100x100 blue square]

According to an online base64 decoder/encoder, this file should encode to:

/9j/4AAQSkZJRgABAgAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCABkAGQDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwDxyiiiv3E8wKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooA//Z

...but what I get from JavaScript instead is:

W7 / DmMO / w6AAEEpGSUYAAQIAAAEAAQAAw7 / DmwBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDLDv8ObAEMBCQkJDAsMGA0NGDIhHCEyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMsO / w4AAEQgAZABkAwEiAAIRAQMRAcO / w4QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoLw7 / DhADCtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDLCgcKRwqEII0LCscOBFVLDkcOwJDNicsKCCQoWFxgZGiUmJygpKjQ1Njc4OTpDREVGR0hJSlNUVVZXWFlaY2RlZmdoaWpzdHV2d3h5esKDwoTChcKGwofCiMKJworCksKTwpTClcKWwpfCmMKZwprCosKjwqTCpcKmwqfCqMKpwqrCssKzwrTCtcK2wrfCuMK5wrrDgsODw4TDhcOGw4fDiMOJw4rDksOTw5TDlcOWw5fDmMOZw5rDocOiw6PDpMOlw6bDp8Oow6nDqsOxw7LDs8O0w7XDtsO3w7jDucO6w7 / DhAAfAQADAQEBAQEBAQEBAAAAAAAAAQIDBAUGBwgJCgvDv8OEAMK1EQACAQIEBAMEBwUEBAABAncAAQIDEQQFITEGEkFRB2FxEyIywoEIFELCkcKhwrHDgQkjM1LDsBVicsORChYkNMOhJcOxFxgZGiYnKCkqNTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXrCgsKDwoTChcKGwofCiMKJworCksKTwpTClcKWwpfCmMKZwprCosKjwqTCpcKmwqfCqMKpwqrCssKzwrTCtcK2wrfCuMK5wrrDgsODw4TDhcOGw4fDiMOJw4rDksOTw5TDlcOWw5 fDmMOZw5rDosOjw6TDpcOmw6fDqMOpw6rDssOzw7TDtcO2w7fDuMO5w7rDv8OaAAwDAQACEQMRAD8Aw7HDiijCosK / cTzDgMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooAMKiwoooA8O / w5k =

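If I had to hazard a guess, I'd say the problem has something to do with the non-printable characters in the binary data (if I encode a plaintext document, it works fine). But what's the best way to solve this?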

Edit

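It looks like this might be a problem in the base64 library itself (or, if not, in how the Uint8Array gets unpacked into a string for the library call). If I use the browser's btoa() function instead and pass it the binaryText string built from the Uint8Array, it works. Too bad that function doesn't exist in all browsers.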
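For reference, the btoa() variant described in the edit could look roughly like the sketch below. It assumes an environment where btoa() is available; bytesToBase64 is a hypothetical helper name used only for illustration, and the four sample bytes are the usual JPEG header bytes rather than the full test file.

```javascript
// Sketch: base64-encode raw bytes with the built-in btoa().
// bytesToBase64 is a hypothetical helper name, not part of any library.
function bytesToBase64(bytes) {
    var binaryText = '';
    for (var i = 0; i < bytes.length; i++) {
        // Each byte becomes one character with the same code (0..255).
        binaryText += String.fromCharCode(bytes[i]);
    }
    // btoa() treats every character code as one byte.
    return btoa(binaryText);
}

// Sample input: the bytes FF D8 FF E0 that typically start a JPEG file.
var header = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE0]);
console.log(bytesToBase64(header)); // "/9j/4A=="
```

In saveResource() above, this would replace the Base64.encode(binaryText) call with btoa(binaryText).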

2 Answers

  • 2

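    Google to the rescue. I found the following code, which takes the input data as a plain array of "bytes" (numbers between 0 and 255, inclusive; it also works fine if a Uint8Array is passed to it directly), and added it to the library I was using: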

    //note:  it is assumed that the Base64 object has already been defined
    //License:  Apache 2.0
    Base64.byteToCharMap_ = null;
    Base64.charToByteMap_ = null;
    Base64.byteToCharMapWebSafe_ = null;
    Base64.charToByteMapWebSafe_ = null;
    Base64.ENCODED_VALS_BASE =
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
        'abcdefghijklmnopqrstuvwxyz' +
        '0123456789';
    
    /**
     * Our default alphabet. Value 64 (=) is special; it means "nothing."
     * @type {string}
     */
    Base64.ENCODED_VALS = Base64.ENCODED_VALS_BASE + '+/=';
    Base64.ENCODED_VALS_WEBSAFE = Base64.ENCODED_VALS_BASE + '-_.';
    
    /**
     * Base64-encode an array of bytes.
     *
     * @param {Array.<number>|Uint8Array} input An array of bytes (numbers with
     *     value in [0, 255]) to encode.
     * @param {boolean=} opt_webSafe Boolean indicating we should use the
     *     alternative alphabet.
     * @return {string} The base64 encoded string.
     */
    Base64.encodeByteArray = function(input, opt_webSafe) {
      Base64.init_();
    
      var byteToCharMap = opt_webSafe ?
                          Base64.byteToCharMapWebSafe_ :
                          Base64.byteToCharMap_;
    
      var output = [];
    
      for (var i = 0; i < input.length; i += 3) {
        var byte1 = input[i];
        var haveByte2 = i + 1 < input.length;
        var byte2 = haveByte2 ? input[i + 1] : 0;
        var haveByte3 = i + 2 < input.length;
        var byte3 = haveByte3 ? input[i + 2] : 0;
    
        var outByte1 = byte1 >> 2;
        var outByte2 = ((byte1 & 0x03) << 4) | (byte2 >> 4);
        var outByte3 = ((byte2 & 0x0F) << 2) | (byte3 >> 6);
        var outByte4 = byte3 & 0x3F;
    
        if (!haveByte3) {
          outByte4 = 64;
    
          if (!haveByte2) {
            outByte3 = 64;
          }
        }
    
        output.push(byteToCharMap[outByte1],
                    byteToCharMap[outByte2],
                    byteToCharMap[outByte3],
                    byteToCharMap[outByte4]);
      }
    
      return output.join('');
    };
    
    /**
     * Lazy static initialization function. Called before
     * accessing any of the static map variables.
     * @private
     */
    Base64.init_ = function() {
      if (!Base64.byteToCharMap_) {
        Base64.byteToCharMap_ = {};
        Base64.charToByteMap_ = {};
        Base64.byteToCharMapWebSafe_ = {};
        Base64.charToByteMapWebSafe_ = {};
    
        // We want quick mappings back and forth, so we precompute two maps.
        for (var i = 0; i < Base64.ENCODED_VALS.length; i++) {
          Base64.byteToCharMap_[i] =
              Base64.ENCODED_VALS.charAt(i);
          Base64.charToByteMap_[Base64.byteToCharMap_[i]] = i;
          Base64.byteToCharMapWebSafe_[i] =
              Base64.ENCODED_VALS_WEBSAFE.charAt(i);
          Base64.charToByteMapWebSafe_[
              Base64.byteToCharMapWebSafe_[i]] = i;
        }
      }
    };
    

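    The full code of the library containing the above function is available here, but in its unmodified form it appears to depend on a number of other libraries. The slightly hacked-up version above should work for anyone who just needs a quick fix for this problem.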

  • 8

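    Treat the binary data as an ArrayBuffer, which is independent of any character encoding. Your blue square (.jpg) consists of 361 raw bytes, that is, octets with values from 0 to 255 (decimal), and they are not characters!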
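    The following sketch illustrates what can go wrong when bytes are treated as characters. It assumes, without confirmation from the library's source, that the encoder converts its input string to UTF-8 before encoding; TextEncoder and btoa() are standard, while bytesToBinaryString is a hypothetical helper name.

```javascript
// Sketch: what happens when byte values are treated as characters
// and converted to UTF-8 before base64 encoding.
// bytesToBinaryString is a hypothetical helper name.
function bytesToBinaryString(bytes) {
    var text = '';
    for (var i = 0; i < bytes.length; i++) {
        text += String.fromCharCode(bytes[i]);
    }
    return text;
}

// The bytes FF D8 FF E0 that typically start a JPEG file.
var jpegHeader = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE0]);
var asText = bytesToBinaryString(jpegHeader);

// Every character above 0x7F becomes two bytes in UTF-8,
// so the four original bytes turn into eight.
var utf8Bytes = new TextEncoder().encode(asText);

var rawBase64 = btoa(asText);                             // bytes encoded as-is
var doubleEncoded = btoa(bytesToBinaryString(utf8Bytes)); // UTF-8 step first

console.log(utf8Bytes.length); // 8
console.log(rawBase64);        // "/9j/4A=="
console.log(doubleEncoded);    // "w7/DmMO/w6A="
```

    The first result matches the start of the expected encoding in the question, and the second matches the start of the output the question actually reports, which is consistent with a UTF-8 conversion happening somewhere before the base64 step.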

    That means: take the ArrayBuffer and encode it to Base64 directly with the well-known base64 algorithm.
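    A minimal sketch of that idea, working on the ArrayBuffer without any intermediate string of characters. The names ALPHABET and arrayBufferToBase64 are hypothetical, and this is only one straightforward way to write the standard algorithm, not code taken from any particular library.

```javascript
// Sketch: standard base64 over the bytes of an ArrayBuffer.
// ALPHABET and arrayBufferToBase64 are hypothetical names.
var ALPHABET =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ' +
    'abcdefghijklmnopqrstuvwxyz' +
    '0123456789+/';

function arrayBufferToBase64(buffer) {
    var bytes = new Uint8Array(buffer);
    var out = '';

    // Process three input bytes (24 bits) as four 6-bit groups.
    for (var i = 0; i < bytes.length; i += 3) {
        var has2 = i + 1 < bytes.length;
        var has3 = i + 2 < bytes.length;
        var b1 = bytes[i];
        var b2 = has2 ? bytes[i + 1] : 0;
        var b3 = has3 ? bytes[i + 2] : 0;

        out += ALPHABET.charAt(b1 >> 2);
        out += ALPHABET.charAt(((b1 & 0x03) << 4) | (b2 >> 4));
        // Missing input bytes are marked with '=' padding.
        out += has2 ? ALPHABET.charAt(((b2 & 0x0F) << 2) | (b3 >> 6)) : '=';
        out += has3 ? ALPHABET.charAt(b3 & 0x3F) : '=';
    }
    return out;
}

// Sample input: the bytes FF D8 FF E0 that typically start a JPEG file.
var buffer = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE0]).buffer;
console.log(arrayBufferToBase64(buffer)); // "/9j/4A=="
```

    With FileReader, the value of evt.target.result after readAsArrayBuffer() could be passed to such a function as-is.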

    Going back the other way with Perl (decoding the Base64) reproduces the blue square shown above:

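    use IO::File;
    use Fcntl;                            # O_* open flags (O_BINARY is only defined on platforms such as Windows)
    use MIME::Base64 qw(decode_base64);   # provides decode_base64()

    my $fh = IO::File->new;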
    $fh->open("d:/tmp/x.jpg", O_BINARY|O_CREAT|O_RDWR|O_TRUNC) or die $!;
    
    $fh->print(decode_base64("/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBD
    AQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCABkAGQDASIAAhEBAxEB/8QAFQABAQAA
    AAAAAAAAAAAAAAAAAAf/xAAUEAEAAAAAAAAAAAAAAAAAAAAA/8QAFgEBAQEAAAAAAAAAAAAAAAAAAAUH/8QAFBEBAAAAAAAAAAAAAAAAAAAAAP/aAAwDAQACEQMR
    AD8AjgDcUwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB//2Q==
    "));
    
    
    $fh->close;
    
