How to decode / detect the file encoding (Power BI Desktop file)

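I am working with the internal file (DataMashup) of a Power BI Desktop report (pbix) and am trying to decode it. My goal is to create Power BI Desktop reports and data models from any programming language. I am starting with Java.

The file is encoded with some kind of encoding technique.

I tried to detect the file's encoding, and the detection returns windows-1254, but decoding with that charset does not produce readable content.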

    File f = new File("example.txt");

    // Candidate charsets, tried in this order.
    String[] charsetsToBeTested = {"UTF-8", "windows-1254", "ISO-8859-7"};

    // CharsetDetector is a helper class that is not shown in the question.
    CharsetDetector cd = new CharsetDetector();
    Charset charset = cd.detectCharset(f, charsetsToBeTested);

    if (charset != null) {
        // try-with-resources closes the reader even if reading fails.
        try (InputStreamReader reader = new InputStreamReader(new FileInputStream(f), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                System.out.print((char) c);
            }
        } catch (FileNotFoundException fnfe) {
            fnfe.printStackTrace();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    } else {
        System.out.println("Unrecognized charset.");
    }

Unzipping the file does not work either:

    public void unZipIt(String zipFile, String outputFolder) {
        byte[] buffer = new byte[1024];
        File folder = new File(outputFolder);
        if (!folder.exists()) {
            folder.mkdirs();
        }
        try (ZipInputStream zis = new ZipInputStream(new FileInputStream(zipFile))) {
            // Do not call zis.getNextEntry() before the loop (for example in a
            // debug print): that consumes the first entry, so the loop skips it.
            for (ZipEntry ze = zis.getNextEntry(); ze != null; ze = zis.getNextEntry()) {
                File newFile = new File(outputFolder, ze.getName());
                System.out.println("file unzip : " + newFile.getAbsolutePath());
                if (ze.isDirectory()) {
                    newFile.mkdirs();
                    continue;
                }
                newFile.getParentFile().mkdirs();
                try (FileOutputStream fos = new FileOutputStream(newFile)) {
                    int len;
                    while ((len = zis.read(buffer)) > 0) {
                        fos.write(buffer, 0, len);
                    }
                }
            }
            System.out.println("Done");
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }

1 Answer

0
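The file contains a binary header, followed by XML that declares UTF-8. The header data appears to hold a file name (Config/Package.xml), so assuming a zip format is understandable. A zip file would also have binary data at the end of the file.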

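Perhaps the file was downloaded via FTP in text mode, so that a newline conversion ("\n" to "\r\n") was applied. That would corrupt the zip. Renaming the file to .zip may help to test it with a zip tool.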
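Another way to test the zip assumption, without renaming the file, is to search the raw bytes for the ZIP local file header signature (`50 4B 03 04`, i.e. "PK" followed by 0x03 0x04). The following is only a minimal sketch: the class name `ZipSignatureFinder` and the helper `indexOf` are hypothetical, and nothing here is specific to the DataMashup layout. It assumes that, if a zip archive is embedded after a binary header, `ZipInputStream` can be started at the offset where the signature is found.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipSignatureFinder {
    // ZIP local file header signature: 'P' 'K' 0x03 0x04
    static final byte[] ZIP_LOCAL_HEADER = {0x50, 0x4B, 0x03, 0x04};

    // Hypothetical helper: first index of pattern in data, or -1 if absent.
    static int indexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= data.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ZipSignatureFinder <file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        int offset = indexOf(bytes, ZIP_LOCAL_HEADER);
        if (offset < 0) {
            System.out.println("No ZIP local file header signature found.");
            return;
        }
        System.out.println("ZIP signature found at byte offset " + offset);
        // Assumption: the archive starts at the signature; skip the leading bytes.
        try (ZipInputStream zis = new ZipInputStream(
                new ByteArrayInputStream(bytes, offset, bytes.length - offset))) {
            for (ZipEntry ze = zis.getNextEntry(); ze != null; ze = zis.getNextEntry()) {
                System.out.println("entry: " + ze.getName());
            }
        }
    }
}
```

If the signature turns up at a non-zero offset, that would explain why opening the whole file as a zip fails: `ZipInputStream` expects the first local header at the very start of the stream.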

Try the .tar format first. That would be logical, since the XML files are not compressed. Append .tar to the file name.

Otherwise, if the content is always UTF-8 XML:
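```
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

Path f = Paths.get("example.txt");
String start = "<?xml";
String end = ">";
byte[] bytes = Files.readAllBytes(f);
// Decode with a single-byte charset so that char indices equal byte offsets.
String s = new String(bytes, StandardCharsets.ISO_8859_1);
int startI = s.indexOf(start);                 // -1 if there is no XML declaration
int endI = s.lastIndexOf(end) + end.length();  // just past the last '>'
if (startI < 0 || endI <= startI) throw new IllegalStateException("No XML found");
String xml = new String(bytes, startI, endI - startI, StandardCharsets.UTF_8);
```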
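To try this approach without a real file, the same idea can be wrapped in a small helper and run on an in-memory sample. This is a sketch under assumptions: the class `XmlExtractor`, the method `extractXml`, and the sample bytes are made up for illustration and do not reflect the real DataMashup contents.

```java
import java.nio.charset.StandardCharsets;

public class XmlExtractor {
    // Hypothetical helper: returns the text from the first "<?xml" up to and
    // including the last '>' decoded as UTF-8, or null if no such range exists.
    static String extractXml(byte[] bytes) {
        // ISO-8859-1 maps every byte to exactly one char, so indices in this
        // string are also byte offsets into the original array.
        String s = new String(bytes, StandardCharsets.ISO_8859_1);
        int start = s.indexOf("<?xml");
        int end = s.lastIndexOf('>') + 1;
        if (start < 0 || end <= start) {
            return null;
        }
        return new String(bytes, start, end - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up sample: binary header + UTF-8 XML + binary trailer.
        byte[] header = {0x00, 0x01, (byte) 0xFF};
        byte[] xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><a>\u00e9</a>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] trailer = {0x7F, 0x00};

        byte[] all = new byte[header.length + xml.length + trailer.length];
        System.arraycopy(header, 0, all, 0, header.length);
        System.arraycopy(xml, 0, all, header.length, xml.length);
        System.arraycopy(trailer, 0, all, header.length + xml.length, trailer.length);

        System.out.println(extractXml(all));
    }
}
```

One limitation to keep in mind: if the trailing binary data happens to contain the byte 0x3E ('>'), `lastIndexOf` will match it and the extracted range will include garbage after the XML.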
Answered on 2024-04-25T17:02:39+08:00
