
How to get values from XML that is not well-formed


I have the following strings (you could call them XML):

<News News-type="alert" ID="498" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="1507" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
<News News-type="alert" ID="1509" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
<News News-type="alert" ID="1511" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="1520" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="2999" NewsPath="data-theft[1]" NewsMark="0" />
<News News-type="alert" ID="2535" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="6052" NewsPath="GetNewsFrom[3]" NewsMark="100" />

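I cannot use an XML reader/parser on it, because the parser reports that it is not a well-formed XML file. Can you help me work out how to get the following output from these strings?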

String[] attr = {"News-type", "ID", "NewsPath", "NewsMark"};
String[] values = new String[4];
// The values should be filled into this array dynamically as well
int i;
for (i = 0; i < 4; i++)
{
    if (i == 0)
        values[i] = ????;
    else if (i == 1)
    ...
}

How can I get all the attribute values into the values[] array so that I can use them further?

Exception:
When I pass it to Java as an XML file, I get this exception on execution:

[Fatal Error] :2:2: The markup in the document following the root element must be well-formed.
Mar 18, 2014 11:43:21 AM GUI.NewsReport jMenuItem2ActionPerformed
SEVERE: null
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 2; The markup in the document following the root element must be well-formed.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at GUI.NewsReport.AcessXML(NewsReport.java:185)
    at GUI.NewsReport.jMenuItem2ActionPerformed(NewsReport.java:126)
    at GUI.NewsReport.access$100(NewsReport.java:33)
    at GUI.NewsReport$2.actionPerformed(NewsReport.java:88)
    at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2018)
    at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2341)
    at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
    at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
    at javax.swing.AbstractButton.doClick(AbstractButton.java:376)
    at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:833)
    at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:877)
    at java.awt.Component.processMouseEvent(Component.java:6505)
    at javax.swing.JComponent.processMouseEvent(JComponent.java:3320)
    at java.awt.Component.processEvent(Component.java:6270)
    at java.awt.Container.processEvent(Container.java:2229)
    at java.awt.Component.dispatchEventImpl(Component.java:4861)
    at java.awt.Container.dispatchEventImpl(Container.java:2287)
    at java.awt.Component.dispatchEvent(Component.java:4687)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4832)
    at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4492)
    at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4422)
    at java.awt.Container.dispatchEventImpl(Container.java:2273)
    at java.awt.Window.dispatchEventImpl(Window.java:2719)
    at java.awt.Component.dispatchEvent(Component.java:4687)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:735)
    at java.awt.EventQueue.access$200(EventQueue.java:103)
    at java.awt.EventQueue$3.run(EventQueue.java:694)
    at java.awt.EventQueue$3.run(EventQueue.java:692)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:87)
    at java.awt.EventQueue$4.run(EventQueue.java:708)
    at java.awt.EventQueue$4.run(EventQueue.java:706)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:705)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)

  • Thanks a lot!

3 Answers

  • 1

    There is no single root element, so this is not a well-formed XML document, although it may be a well-formed XML document fragment.

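    If the latter is true, the simplest way to parse it in Java is to implement a modified reader that wraps the content in a dummy top-level element, for example by emitting <wrapper> before the content and </wrapper> after it. Then implement the rest of the application with the awareness that <wrapper> is not part of the original file content.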
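A minimal sketch of such a wrapping reader, assuming the fragment arrives as a UTF-8 byte stream and does not begin with an XML declaration (a leading <?xml ...?> would no longer be at the start of the document once wrapped). The class name WrappedFragmentReader and its method names are made up for illustration; only the <wrapper> element name comes from the answer.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: the name and structure are illustrative only.
public class WrappedFragmentReader {

    // Surrounds the fragment stream with <wrapper> ... </wrapper>
    // so that the parser sees exactly one root element.
    static InputStream wrap(InputStream fragment) {
        InputStream open = new ByteArrayInputStream(
                "<wrapper>".getBytes(StandardCharsets.UTF_8));
        InputStream close = new ByteArrayInputStream(
                "</wrapper>".getBytes(StandardCharsets.UTF_8));
        return new SequenceInputStream(
                Collections.enumeration(Arrays.asList(open, fragment, close)));
    }

    // Parses the wrapped stream with the standard JAXP DOM parser.
    static Document parseFragment(InputStream fragment) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(wrap(fragment));
    }

    public static void main(String[] args) throws Exception {
        String fragment =
                "<News News-type=\"alert\" ID=\"498\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"0\" />\n"
              + "<News News-type=\"alert\" ID=\"1507\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"0\"/>";
        Document doc = parseFragment(
                new ByteArrayInputStream(fragment.getBytes(StandardCharsets.UTF_8)));
        // The root is the dummy element, not part of the original content.
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("News").getLength());
    }
}
```

Wrapping at the stream level avoids loading the whole file into a string first, which may matter for large inputs; for small inputs, plain string concatenation would work just as well.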

  • 0

    In that case, a simple way to solve this is to wrap all the News tags in a parent tag and then parse the result like any other XML.

    <NewsParent>
    <News News-type="alert" ID="498" NewsPath="GetNewsFrom[3]" NewsMark="0" />
    <News News-type="alert" ID="1507" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
    <News News-type="alert" ID="1509" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
    <News News-type="alert" ID="1511" NewsPath="GetNewsFrom[3]" NewsMark="0" />
    <News News-type="alert" ID="1520" NewsPath="GetNewsFrom[3]" NewsMark="0" />
    <News News-type="alert" ID="2999" NewsPath="data-theft[1]" NewsMark="0" />
    <News News-type="alert" ID="2535" NewsPath="GetNewsFrom[3]" NewsMark="0" />
    <News News-type="alert" ID="6052" NewsPath="GetNewsFrom[3]" NewsMark="100" />
    </NewsParent>
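Once the parent tag is in place, reading the four attributes into a values array could look roughly like the sketch below. It uses the standard JAXP DOM API; the class name NewsAttributeReader and the readValues method are hypothetical, and the sketch assumes the wrapped XML is already available as a String. One values array is produced per News element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: the name and structure are illustrative only.
public class NewsAttributeReader {

    // Attribute names taken from the question.
    static final String[] ATTR = {"News-type", "ID", "NewsPath", "NewsMark"};

    // Returns one String[4] per <News> element, in document order.
    static List<String[]> readValues(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList items = doc.getElementsByTagName("News");
        List<String[]> rows = new ArrayList<>();
        for (int n = 0; n < items.getLength(); n++) {
            Element news = (Element) items.item(n);
            String[] values = new String[ATTR.length];
            for (int i = 0; i < ATTR.length; i++) {
                // getAttribute returns "" when the attribute is absent.
                values[i] = news.getAttribute(ATTR[i]);
            }
            rows.add(values);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<NewsParent>"
                + "<News News-type=\"alert\" ID=\"498\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"0\" />"
                + "<News News-type=\"alert\" ID=\"6052\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"100\" />"
                + "</NewsParent>";
        for (String[] values : readValues(xml)) {
            System.out.println(String.join(", ", values));
        }
    }
}
```

Looking attributes up by name means the result does not depend on the order in which they appear in each tag.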
    
  • 3

    Besides doing some preprocessing (which should be preferable to regular expressions), your other option is to use a regular expression such as: News-type="([^"]+?)"\s+ID="([^"]+?)"\s+NewsPath="([^"]+?)"\s+NewsMark="([^"]+?)" .

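    The regular expression above should match the strings you have and capture each attribute value in a group that you can access afterwards.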
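As a rough sketch, the expression can be applied with java.util.regex as shown below. The class name NewsRegexReader and the readValues method are hypothetical. Note that this approach assumes the attributes always appear in the same order, are separated by whitespace, and use double quotes; input that deviates from this is silently skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the name and structure are illustrative only.
public class NewsRegexReader {

    // Same expression as in the answer, escaped for a Java string literal.
    static final Pattern NEWS = Pattern.compile(
            "News-type=\"([^\"]+?)\"\\s+ID=\"([^\"]+?)\"\\s+"
          + "NewsPath=\"([^\"]+?)\"\\s+NewsMark=\"([^\"]+?)\"");

    // Returns one String[4] per match: News-type, ID, NewsPath, NewsMark.
    static List<String[]> readValues(String text) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = NEWS.matcher(text);
        while (m.find()) {
            rows.add(new String[] {m.group(1), m.group(2), m.group(3), m.group(4)});
        }
        return rows;
    }

    public static void main(String[] args) {
        String text =
                "<News News-type=\"alert\" ID=\"2999\" NewsPath=\"data-theft[1]\" NewsMark=\"0\" />\n"
              + "<News News-type=\"alert\" ID=\"6052\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"100\" />";
        for (String[] values : readValues(text)) {
            System.out.println(String.join(", ", values));
        }
    }
}
```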

    An explanation of the regular expression is available here.
