当我尝试使用sax在套接字上解析xml时,我遇到了一个奇怪的现象 . 经过分析,我注意到DataOutputStream在我的数据前添加了2个字节 .
DataOutputStream发送的消息:
0020 50 18 00 20 0f df 00 00 00 9d 3c 3f 78 6d 6c 20 P.. .... ..<?xml
0030 76 65 72 73 69 6f 6e 3d 22 31 2e 30 22 3f 3e 3c version= "1.0"?><
0040 63 6f 6d 70 61 6e 79 3e 3c 73 74 61 66 66 3e 3c company> <staff><
0050 66 69 72 73 74 6e 61 6d 65 3e 79 6f 6e 67 3c 2f firstnam e>yong</
0060 66 69 72 73 74 6e 61 6d 65 3e 3c 6c 61 73 74 6e firstnam e><lastn
0070 61 6d 65 3e 6d 6f 6f 6b 20 6b 69 6d 3c 2f 6c 61 ame>mook kim</la
0080 73 74 6e 61 6d 65 3e 3c 6e 69 63 6b 6e 61 6d 65 stname>< nickname
0090 3e c2 a7 3c 2f 6e 69 63 6b 6e 61 6d 65 3e 3c 73 >..</nic kname><s
00a0 61 6c 61 72 79 3e 31 30 30 30 30 30 3c 2f 73 61 alary>10 0000</sa
00b0 6c 61 72 79 3e 3c 2f 73 74 61 66 66 3e 3c 2f 63 lary></s taff></c
00c0 6f 6d 70 61 6e 79 3e ompany>
使用Transformer发送消息:
0020 50 18 00 20 b6 b1 00 00 3c 3f 78 6d 6c 20 76 65 P.. .... <?xml ve
0030 72 73 69 6f 6e 3d 22 31 2e 30 22 20 65 6e 63 6f rsion="1 .0" enco
0040 64 69 6e 67 3d 22 75 74 66 2d 38 22 3f 3e 3c 63 ding="ut f-8"?><c
0050 6f 6d 70 61 6e 79 3e 3c 73 74 61 66 66 3e 3c 66 ompany>< staff><f
0060 69 72 73 74 6e 61 6d 65 3e 79 6f 6e 67 3c 2f 66 irstname >yong</f
0070 69 72 73 74 6e 61 6d 65 3e 3c 6c 61 73 74 6e 61 irstname ><lastna
0080 6d 65 3e 6d 6f 6f 6b 20 6b 69 6d 3c 2f 6c 61 73 me>mook kim</las
0090 74 6e 61 6d 65 3e 3c 6e 69 63 6b 6e 61 6d 65 3e tname><n ickname>
00a0 c2 a7 3c 2f 6e 69 63 6b 6e 61 6d 65 3e 3c 73 61 ..</nick name><sa
00b0 6c 61 72 79 3e 31 30 30 30 30 30 3c 2f 73 61 6c lary>100 000</sal
00c0 61 72 79 3e 3c 2f 73 74 61 66 66 3e 3c 2f 63 6f ary></st aff></co
00d0 6d 70 61 6e 79 3e mpany>
正如人们可能会注意到DataOutputStream在消息前面添加了两个字节 . 因此,sax解析器抛出异常“org.xml.sax.SAXParseException:prolog中不允许使用内容 . ”但是,当我跳过这两个字节时,sax解析器工作得很好 . 另外我注意到DataInputStream无法读取Transformer消息 .
我的问题是:为什么DataOutputStream会添加这些字节,为什么不是Transformer呢?
对于那些有兴趣复制问题的人,这里有一些代码:
使用DataInputStream的服务器:
String data = "<?xml version=\"1.0\"?><company><staff><firstname>yong</firstname><lastname>mook kim</lastname><nickname>§</nickname><salary>100000</salary></staff></company>";
ServerSocket server = new ServerSocket(60000);
Socket socket = server.accept();
DataOutputStream os = new DataOutputStream(socket.getOutputStream());
os.writeUTF(data);
os.close();
socket.close();
使用Transformer的服务器:
ServerSocket server = new ServerSocket(60000);
Socket socket = server.accept();
Document doc = createDocument();
printXML(doc, os);
os.close();
socket.close();
public synchronized static void printXML(Document document, OutputStream stream) throws TransformerException
{
DOMSource domSource = new DOMSource(document);
StreamResult streamResult = new StreamResult(stream);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
serializer.setOutputProperty(OutputKeys.INDENT, "no");
serializer.transform(domSource, streamResult);
}
private static Document createDocument() throws ParserConfigurationException
{
Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element company = document.createElement("company");
Element staff = document.createElement("staff");
Element firstname = document.createElement("firstname");
Element lastname = document.createElement("lastname");
Element nickname = document.createElement("nickname");
Element salary = document.createElement("salary");
Text firstnameText = document.createTextNode("yong");
Text lastnameText = document.createTextNode("mook kim");
Text nicknameText = document.createTextNode("§");
Text salaryText = document.createTextNode("100000");
document.appendChild(company);
company.appendChild(staff);
staff.appendChild(firstname);
staff.appendChild(lastname);
staff.appendChild(nickname);
staff.appendChild(salary);
firstname.appendChild(firstnameText);
lastname.appendChild(lastnameText);
nickname.appendChild(nicknameText);
salary.appendChild(salaryText);
return document;
}
使用SAX Parser的客户:
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new MyHandler();
Socket socket = new Socket("localhost", 60000);
InputSource is = new InputSource(new InputStreamReader(socket.getInputStream()));
is.setEncoding("UTF-8");
//socket.getInputStream().skip(2); // skip over the 2 bytes from the DataInputStream
saxParser.parse(is, handler);
使用DataInputStream的客户端:
Socket socket = new Socket("localhost", 60000);
DataInputStream os = new DataInputStream(socket.getInputStream());
while(true) {
String data = os.readUTF();
System.out.println("Data: " + data);
}
2 回答
DataOutputStream.writeUTF()
的输出是自定义格式,旨在由DataInputStream.readUTF()
读取 .您正在调用的
writeUTF
方法的javadocs说:在读取和写入数据时始终使用相同类型的流 . 如果要将流直接提供给sax解析器,则不应使用DataOutputStream .
只是用