I'm new to Solr. I have worked through the Solr wiki tutorial and am now trying to index some XML files with the DataImportHandler, but my data does not seem to get indexed.
My data-config.xml is:
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="f" processor="FileListEntityProcessor" fileName=".*xml" newerThan="'NOW-3DAYS'"
            recursive="true" rootEntity="false" dataSource="null" baseDir="C:/Users/admin/Desktop/Project_work/ImageCLEF(Dataset)/all_text/all_text/metadata/sample">
      <entity name="image" processor="XPathEntityProcessor" forEach="/image/text[@xml:lang='en']" url="${f.fileAbsolutePath}" >
        <field column="name" xpath="/image/name"/>
        <field column="description" xpath="/image/text/description"/>
        <field column="caption" xpath="/image/text/caption"/>
        <field column="comment" xpath="/image/text/comment"/>
      </entity>
    </entity>
  </document>
</dataConfig>
Here is what one of my sample XML files looks like:
<?xml version="1.0" encoding="UTF-8" ?>
<image id="140017" file="images/15/140017.jpg">
  <name>MTwainAppletonsJournal4July74.jpg</name>
  <text xml:lang='en'>
    <description>adsadasd</description>
    <comment>sddasd</comment>
    <caption article="text/en/2/310906">1874 engraving of Twain</caption>
  </text>
  <text xml:lang="de">
    <description/>
    <comment />
    <caption />
  </text>
  <text xml:lang="fr">
    <description />
    <comment />
    <caption />
  </text>
  <comment>({{Information |Description= |Source= |Date= |Author= |Permission= |other_versions= }})</comment>
  <license>Public Domain</license>
</image>
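To make clear what I expect one indexed document per file to contain, here is a minimal sketch. It is plain Python using the standard-library `xml.etree.ElementTree`, not Solr or DataImportHandler code, and it says nothing about how XPathEntityProcessor evaluates its expressions. The helper name `expected_fields` and the trimmed `SAMPLE` string are only for illustration.

```python
import xml.etree.ElementTree as ET

# xml:lang belongs to the reserved XML namespace; ElementTree exposes
# the attribute under this fully qualified (Clark notation) name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Trimmed copy of the sample file above (illustration only).
SAMPLE = """<image id="140017" file="images/15/140017.jpg">
<name>MTwainAppletonsJournal4July74.jpg</name>
<text xml:lang='en'>
<description>adsadasd</description>
<comment>sddasd</comment>
<caption article="text/en/2/310906">1874 engraving of Twain</caption>
</text>
<text xml:lang="de">
<description/>
<comment />
<caption />
</text>
</image>"""


def expected_fields(xml_string, lang="en"):
    """Collect the values I would like indexed for one <image> file."""
    root = ET.fromstring(xml_string)
    doc = {"name": root.findtext("name")}
    # Only the <text> block in the requested language is of interest.
    for text in root.findall("text"):
        if text.get(XML_LANG) == lang:
            for tag in ("description", "caption", "comment"):
                doc[tag] = text.findtext(tag)
    return doc


if __name__ == "__main__":
    print(expected_fields(SAMPLE))
```

So for this file I would hope to end up with one Solr document whose `name`, `description`, `caption` and `comment` fields come from the `<name>` element and the English `<text>` block.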
In my schema.xml:
<fields>
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <field name="name" type="string" indexed="true" stored="true" required="true" />
  <field name="description" type="text_en_splitting" indexed="true" stored="true" required="true"/>
  <field name="caption" type="text_en_splitting" indexed="true" stored="true" required="true"/>
  <field name="comment" type="text_en_splitting" indexed="true" stored="true" required="true"/>
  <field name="text" type="text_en_splitting" indexed="true" stored="false" multiValued="true"/>
</fields>
<uniqueKey>name</uniqueKey>
I have configured the data import handler in solrconfig.xml:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">C:/solr-4.6.1/example/solr/collection1/conf/data-config.xml</str>
  </lst>
</requestHandler>
After calling the Solr import URL:
http://localhost:8983/solr/collection1/dataimport?command=full-import
I get the following response:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">4</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">C:/solr-4.6.1/example/solr/collection1/conf/data-config.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">2</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2014-02-16 16:48:36</str>
    <str name="">
      <b>Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</b>
    </str>
    <str name="Committed">2014-02-16 16:48:36</str>
    <str name="Total Documents Processed">0</str>
    <str name="Time taken">0:0:0.23</str>
  </lst>
  <str name="WARNING">
    <b>This response format is experimental. It is likely to change in the future.</b>
  </str>
</response>
I have tried searching the internet and Stack Overflow for a solution, but nothing I found solved my problem. Please help.
Thanks in advance.