I want to read password-protected Excel files (.xls and .xlsx) with Apache POI. Instead of the usermodel (org.apache.poi.ss.usermodel), I am using the Event API for both xls and xlsx files (to keep memory usage down).

For xls files I implement HSSFListener and override its processRecord(Record record) method. For xlsx files I use javax.xml.parsers.SAXParser and org.xml.sax.XMLReader.
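One detail that may matter when choosing between the two readers: a password-encrypted .xlsx is, per the Office encryption format, no longer a plain ZIP package but is wrapped in an OLE2 compound file, the same container that .xls uses, so the file extension alone does not say which path applies. Below is a minimal JDK-only sketch that checks the leading magic bytes. `ContainerSniffer`, `Container`, and `sniff` are hypothetical names used for illustration and are not part of POI.

```java
// Hypothetical helper (not a POI class): guesses the container format from magic bytes.
final class ContainerSniffer {

    enum Container { OLE2, ZIP, UNKNOWN }

    // OLE2 compound file signature (used by .xls and by encrypted .xlsx).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature, "PK" followed by 03 04 (used by unencrypted .xlsx).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // header: the first bytes of the file (eight bytes are enough).
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;
        }
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Pass in the first eight bytes of the file. An `OLE2` result for a file named .xlsx would suggest that it has to be decrypted before any SAX parsing can start.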

If I use the following code to read an .xls file:

// Password POI should use for decryption (held per thread)
Biff8EncryptionKey.setCurrentUserPassword("password");
POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(this.getFileName()));

// Wrap this HSSFListener so that missing records and cell formats are tracked
MissingRecordAwareHSSFListener listener = new MissingRecordAwareHSSFListener(this);
formatListener = new FormatTrackingHSSFListener(listener);

HSSFEventFactory factory = new HSSFEventFactory();
HSSFRequest request = new HSSFRequest();

request.addListenerForAllRecords(formatListener);
rowsReadSet.clear();
// This call is where the exception is thrown
factory.processWorkbookEvents(request, fs);

I get this exception:

Exception in thread "Thread-6" org.apache.poi.EncryptedDocumentException: HSSF does not currently support CryptoAPI encryption
    at org.apache.poi.hssf.record.FilePassRecord$Rc4KeyData.read(FilePassRecord.java:65)
    at org.apache.poi.hssf.record.FilePassRecord.<init>(FilePassRecord.java:193)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:87)
    at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:338)
    at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.<init>(RecordFactoryInputStream.java:74)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:207)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:136)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:103)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:62)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:50)
    at com.mycompany.component.reader.MSExcelReader.readxls(MSExcelReader.java:300)
    at com.mycompany.component.reader.MSExcelReader.run(MSExcelReader.java:274)
    at java.lang.Thread.run(Unknown Source)
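
For context on this message: going by the published BIFF8 record layout, the body of the FilePass record starts with a 16-bit encryption type (0 for XOR obfuscation, 1 for RC4), and in the RC4 case the next 16-bit value is a major version, where 1 means plain RC4 and 2 to 4 mean RC4 CryptoAPI. The trace suggests that POI 3.11 rejects the CryptoAPI variants while parsing that record. The sketch below classifies a FilePass body along those lines. `FilePassSketch` and its members are hypothetical names, the layout is an assumption of this sketch rather than something verified against POI's source, and this is not POI's actual code.

```java
// Hypothetical sketch (not POI code): classifies the encryption scheme
// announced by the body of a BIFF8 FilePass record.
final class FilePassSketch {

    enum Scheme { XOR, RC4, RC4_CRYPTOAPI, UNKNOWN }

    // Reads an unsigned 16-bit little-endian value.
    static int u16le(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    // body: the record payload, without the 4-byte record header.
    static Scheme classify(byte[] body) {
        if (body == null || body.length < 2) {
            return Scheme.UNKNOWN;
        }
        int encryptionType = u16le(body, 0);
        if (encryptionType == 0) {
            return Scheme.XOR;
        }
        if (encryptionType != 1 || body.length < 4) {
            return Scheme.UNKNOWN;
        }
        int majorVersion = u16le(body, 2);
        if (majorVersion == 1) {
            return Scheme.RC4;
        }
        if (majorVersion >= 2 && majorVersion <= 4) {
            // Assumed to be the case reported as "does not currently support CryptoAPI encryption".
            return Scheme.RC4_CRYPTOAPI;
        }
        return Scheme.UNKNOWN;
    }
}
```

Under these assumptions, a body beginning with the bytes `01 00 02 00` would be classified as `RC4_CRYPTOAPI`, which would match a file saved with a file-level password in a recent Excel version.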

Once I can read .xls files, I will post the code for .xlsx later.

I am using JDK 7 and Apache POI 3.11. Can anyone help?

[EDITED]

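Another question: I read the following here:

POI will not be able to read encrypted workbooks - that means that if you protect the entire workbook (rather than just a sheet), it will not be able to read it. Otherwise, it should work.

Is this true? So with POI 3.11, I cannot read a password-protected file where the password is set for the whole workbook (this is usually done via Save As -> Tools -> General Options)?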

[EDITED]:

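If I set a sheet password (using the Review -> Protect Sheet ribbon option) and read the file with the event model, it works fine. However, if I set a password for the whole workbook (using the Review -> Protect Workbook ribbon option) or set a file password (using Save As -> Tools -> General Options), it fails. Here are the exceptions I get for these two approaches: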

1. Using Review -> Protect Workbook ribbon option

Exception in thread "Thread-6" org.apache.poi.EncryptedDocumentException: Supplied password is invalid for salt/verifier/verifierHash
    at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.createDecryptingStream(RecordFactoryInputStream.java:127)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:209)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:136)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:103)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:62)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:50)

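I can read this file by supplying Decryptor.DEFAULT_PASSWORD or the string "VelvetSweatshop" as the password (this is the default password). So why can't it be read with the password I set manually?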

2. Using Save As -> Tools -> General Options

Exception in thread "Thread-6" org.apache.poi.EncryptedDocumentException: HSSF does not currently support CryptoAPI encryption
    at org.apache.poi.hssf.record.FilePassRecord$Rc4KeyData.read(FilePassRecord.java:65)
    at org.apache.poi.hssf.record.FilePassRecord.<init>(FilePassRecord.java:193)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at org.apache.poi.hssf.record.RecordFactory$ReflectionConstructorRecordCreator.create(RecordFactory.java:87)
    at org.apache.poi.hssf.record.RecordFactory.createSingleRecord(RecordFactory.java:338)
    at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.<init>(RecordFactoryInputStream.java:74)
    at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:207)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:136)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:103)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:62)
    at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processWorkbookEvents(HSSFEventFactory.java:50)

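I used the same password for all three approaches above, with MS Office Professional Plus 2013. Why does it not work with the first approach (Review -> Protect Workbook)? The exception says the password is incorrect. The exception from the second approach explicitly states that HSSF does not support the encryption, so that one is fine. But since POI can read a sheet protected with a password (set via Review -> Protect Sheet), I would expect it to also work when the workbook is protected (via Review -> Protect Workbook). Can an expert clarify?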
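
Regarding the first case, where only the default password opens the file: a minimal sketch of trying the user-supplied password first and then falling back to the default one. `PasswordFallback` and `Opener` are hypothetical names. In real code the callback would call `Biff8EncryptionKey.setCurrentUserPassword(...)` and then run the POI read, which is left out here so that the sketch stays JDK-only. It assumes that a wrong password surfaces as an unchecked exception, as `EncryptedDocumentException` does in the traces above.

```java
// Hypothetical helper (not a POI class): tries the user's password, then Excel's default.
final class PasswordFallback {

    // Default password mentioned above (the value of Decryptor.DEFAULT_PASSWORD).
    static final String EXCEL_DEFAULT_PASSWORD = "VelvetSweatshop";

    // Callback that attempts to open the workbook with one password
    // and throws an unchecked exception when the password is rejected.
    interface Opener<T> {
        T open(String password);
    }

    static <T> T openWithFallback(Opener<T> opener, String userPassword) {
        String[] candidates = { userPassword, EXCEL_DEFAULT_PASSWORD };
        RuntimeException last = null;
        for (String candidate : candidates) {
            if (candidate == null) {
                continue; // no user password supplied
            }
            try {
                return opener.open(candidate);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next candidate
            }
        }
        // The default password is always tried, so 'last' is non-null here.
        throw last;
    }
}
```

Catching `RuntimeException` keeps the sketch short; production code would more likely catch only the exception type that signals a rejected password, so that unrelated failures are not retried.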

[EDITED]

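OK, I was wrong when I said it works with the usermodel. With the two approaches above (listed in the previous edit), it does not work with the usermodel either: approach 1, using the Review -> Protect Workbook ribbon option, and approach 2, using Save As -> Tools -> General Options. I can, however, read a file that has a sheet password set (the same observation holds for the event model). See the sample test case below. I could not find an option to attach Excel files, so I can't do that, but any simple Excel file can be tested with the test case below (although the assertions will need to change depending on the input).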

Simple test case with the user model and the event model:

import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import junit.framework.TestCase;
import org.apache.poi.hssf.eventusermodel.HSSFEventFactory;
import org.apache.poi.hssf.eventusermodel.HSSFListener;
import org.apache.poi.hssf.eventusermodel.HSSFRequest;
import org.apache.poi.hssf.record.BoundSheetRecord;
import org.apache.poi.hssf.record.NumberRecord;
import org.apache.poi.hssf.record.Record;
import org.apache.poi.hssf.record.crypto.Biff8EncryptionKey;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.NPOIFSFileSystem;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.ss.usermodel.Cell;

/**
 * Testing for {@link HSSFEventFactory}
 */
public final class TestHSSFEventFactory extends TestCase {

    private String[] fileNames = {"C:\\XLS\\General_Password.xls",
            "C:\\XLS\\Sheet_Password.xls",
            "C:\\XLS\\Workbook_Password.xls"};

    private static class MockHSSFListener implements HSSFListener {
        private final List<Record> records = new ArrayList<Record>();

        public MockHSSFListener() {}
        public Record[] getRecords() {
            Record[] result = new Record[records.size()];
            records.toArray(result);
            return result;
        }

        public void processRecord(Record record) {
            records.add(record);
        }
    }

    public void testWithPasswordProtectedWorkbooksUserModel() throws Exception {
        NPOIFSFileSystem nfs = null;
        try {
            // XOR/RC4 decryption for xls
            Biff8EncryptionKey.setCurrentUserPassword("4Sys-Tem");
            nfs = new NPOIFSFileSystem(new File(fileNames[2]), true);
            HSSFWorkbook hwb = new HSSFWorkbook(nfs.getRoot(), true);
            HSSFSheet sheet = hwb.getSheetAt(0);
            // Cells D3, D4 and D5 (row and column indices are zero-based)
            HSSFRow row = sheet.getRow(2);
            Cell cell1 = row.getCell(3);
            row = sheet.getRow(3);
            Cell cell2 = row.getCell(3);
            row = sheet.getRow(4);
            Cell cell3 = row.getCell(3);

            assertEquals("17000.0", cell1.toString());
            assertEquals("7500.0", cell2.toString());
            assertEquals("5000.0", cell3.toString());
        } finally {
            // Always clear the per-thread password and release the file, even on failure
            Biff8EncryptionKey.setCurrentUserPassword(null);
            if (nfs != null) {
                nfs.close();
            }
        }
    }

    public void testWithPasswordProtectedWorkbooksEventModel() throws Exception {
        MockHSSFListener mockListen = new MockHSSFListener();
        FileInputStream in = new FileInputStream(fileNames[2]);
        try {
            // With the correct password, the file should be processed properly
            Biff8EncryptionKey.setCurrentUserPassword("4Sys-Tem");
            HSSFRequest req = new HSSFRequest();
            POIFSFileSystem fs = new POIFSFileSystem(in);
            HSSFEventFactory factory = new HSSFEventFactory();
            req.addListenerForAllRecords(mockListen);
            factory.processWorkbookEvents(req, fs);
        } finally {
            // Always clear the per-thread password and close the stream, even on failure
            Biff8EncryptionKey.setCurrentUserPassword(null);
            in.close();
        }

        // Check we got the sheet and the contents
        Record[] recs = mockListen.getRecords();
        assertTrue( recs.length > 50 );

        // Has one sheet, with values 17000, 7500, 5000 in column D, rows 3-5
        // (record row and column indices are zero-based)
        boolean hasSheet=false, hasD3=false, hasD4=false, hasD5=false;
        for (Record r : recs) {
            if (r instanceof BoundSheetRecord) {
                BoundSheetRecord bsr = (BoundSheetRecord)r;
                assertEquals("Trade Data", bsr.getSheetname());
                hasSheet = true;
            }
            if (r instanceof NumberRecord) {
                NumberRecord nr = (NumberRecord)r;
                if (nr.getColumn() == 3 && nr.getRow() == 2) {
                    assertEquals(17000, (int)nr.getValue());
                    hasD3 = true;
                }
                if (nr.getColumn() == 3 && nr.getRow() == 3) {
                    assertEquals(7500, (int)nr.getValue());
                    hasD4 = true;
                }
                if (nr.getColumn() == 3 && nr.getRow() == 4) {
                    assertEquals(5000, (int)nr.getValue());
                    hasD5 = true;
                }
            }
        }

        assertTrue("Sheet record not found", hasSheet);
        assertTrue("Numeric record for D3 not found", hasD3);
        assertTrue("Numeric record for D4 not found", hasD4);
        assertTrue("Numeric record for D5 not found", hasD5);
    }
}