What does <![CDATA[]]> in XML mean?

841

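I often find this strange CDATA tag in XML files:

<![CDATA[some stuff]]>

I have observed that this CDATA tag always comes at the beginning, followed by some stuff.

But sometimes it is used and sometimes it is not. I assume it marks some stuff as the "data" that will be inserted after it. But what kind of data is some stuff? Isn't anything I write inside XML tags some sort of data?

11 Answers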

814

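It is used to contain data that would otherwise be interpreted as XML markup because it contains certain characters.

This way, the data inside is passed through as-is but is not interpreted as markup.

Answered 2024-04-24T17:54:43+08:00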
26

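CDATA is data that you want to pass through an XML parser without having it interpreted as XML.

For example, suppose you have an XML document that encapsulates question/answer objects. Such open fields can hold any data that does not strictly fall under a basic data type or an XML-defined custom data type, for instance the text "Is this a correct tag for an XML comment?". You may need to pass it along as is, without the XML parser interpreting it as another child element. This is where CDATA comes to the rescue. By declaring the content as CDATA, you tell the parser not to treat the wrapped data as XML (even though it may look like XML).

Answered 2024-04-24T17:54:43+08:00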

314

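CDATA stands for Character Data. It means that the data between these delimiters includes text that could be interpreted as XML markup, but should not be.

The key differences between CDATA and comments are:

As Richard points out, CDATA is still part of the document, while a comment is not.
In CDATA you cannot include the string ]]> (CDEnd), while in a comment -- is invalid.
Parameter entity references are not recognized inside comments.

This means, given the following XML snippets from one well-formed document: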

<!ENTITY MyParamEntity "Has been expanded">

<!--
Within this comment I can use ]]>
and other reserved characters like <
&, ', and ", but %MyParamEntity; will not be expanded
(if I retrieve the text of this node it will contain
%MyParamEntity; and not "Has been expanded")
and I can't place two dashes next to each other.
-->

<![CDATA[
Within this Character Data block I can
use double dashes as much as I want (along with <, &, ', and ").
%MyParamEntity; stays literal text here too, because parameter
entity references are recognized only inside the DTD ... however, I can't use
the CEND sequence. If I need to use CEND I must escape one of the
brackets or the greater-than sign using concatenated CDATA sections.
]]>

<description>An example of escaped CENDs</description>
<!-- This text contains a CEND ]]> -->
<!-- In this first case we put the ]] at the end of the first CDATA block
     and the > in the second CDATA block -->
<data><![CDATA[This text contains a CEND ]]]]><![CDATA[>]]></data>
<!-- In this second case we put a ] at the end of the first CDATA block
     and the ]> in the second CDATA block -->
<alternative><![CDATA[This text contains a CEND ]]]><![CDATA[]>]]></alternative>
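The concatenated-CDATA trick above can be checked with a short script. This is only a minimal Python sketch: the helper name `wrap_in_cdata` and the sample payload are made up for illustration, and it assumes nothing beyond the standard-library `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET


def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every ']]>' across two sections.

    Hypothetical helper for this example: ']]' ends one section and
    '>' starts the next, as in the <data> element above.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


payload = "This text contains a CEND ]]> plus <tags> & ampersands"
xml_doc = "<data>" + wrap_in_cdata(payload) + "</data>"

# The parser joins the adjacent CDATA sections back into one string.
parsed = ET.fromstring(xml_doc)
print(parsed.text == payload)  # prints True
```

The split point is arbitrary; placing `]` in the first section and `]>` in the second, as in the `<alternative>` element, should round-trip the same way.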

Answered 2024-04-24T17:54:43+08:00

0

The data contained in it is not parsed as XML, so it does not need to be valid XML and may contain elements that look like XML but are not.

Answered 2024-04-24T17:54:43+08:00
34

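It is often used to embed custom data, such as picture or sound data, in an XML document (binary data still has to be encoded as text first, for example with Base64).

Answered 2024-04-24T17:54:43+08:00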

A CDATA section is "a section of element content that is marked for the parser to interpret as only character data, not markup."

Syntactically, it behaves similarly to a comment:

<exampleOfAComment>
<!--
    Since this is a comment
    I can use all sorts of reserved characters
    like > < " and &
    or write things like
    <foo></bar>
    but my document is still well-formed!
-->
</exampleOfAComment>

... but it is still part of the document:

<exampleOfACDATA>
<![CDATA[
    Since this is a CDATA section
    I can use all sorts of reserved characters
    like > < " and &
    or write things like
    <foo></bar>
    but my document is still well formed!
]]>
</exampleOfACDATA>
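To see the same difference from a parser's point of view, here is a minimal Python sketch. It assumes the standard-library `xml.etree.ElementTree` parser with its default settings (which drop comments); the shortened element bodies are made up for the example.

```python
import xml.etree.ElementTree as ET

comment_doc = "<exampleOfAComment><!-- <foo></bar> & more --></exampleOfAComment>"
cdata_doc = "<exampleOfACDATA><![CDATA[<foo></bar> & more]]></exampleOfACDATA>"

# Both documents are well-formed, but only the CDATA text survives parsing.
comment_text = ET.fromstring(comment_doc).text
cdata_text = ET.fromstring(cdata_doc).text

print(repr(comment_text))  # None: the comment is not element content
print(repr(cdata_text))    # '<foo></bar> & more': plain characters, not markup
```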

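Try saving the following as a .xhtml file (not .html) and opening it in Firefox (not Internet Explorer) to see the difference between a comment and a CDATA section. When you view the document in a browser, the comment does not appear, while the CDATA section does: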

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" >
<head>
<title>CDATA Example</title>
</head>
<body>

<h2>Using a Comment</h2>
<div id="commentExample">
<!--
You won't see this in the document
and can use reserved characters like
< > & "
-->
</div>

<h2>Using a CDATA Section</h2>
<div id="cdataExample">
<![CDATA[
You will see this in the document
and can use reserved characters like
< > & "
]]>
</div>

</body>
</html>

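One thing to note about CDATA sections is that they have no encoding (escaping) mechanism, so there is no way to include the string ]]> inside one. As far as I know, any character data containing ]]> has to be a text node instead. Likewise, from a DOM manipulation perspective, you cannot create a CDATA section that includes ]]>: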

var myEl = xmlDoc.getElementById("cdata-wrapper");
myEl.appendChild(xmlDoc.createCDATASection("This section cannot contain ]]>"));

This DOM manipulation code will either throw an exception (in Firefox) or result in a poorly structured XML document: http://jsfiddle.net/9NNHA/
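For comparison outside the browser, the sketch below repeats the experiment with Python's standard-library `xml.dom.minidom`. It relies on an assumption about CPython's implementation: as far as I know, `createCDATASection` accepts the string, but serializing the document raises `ValueError` because of the embedded `]]>`. Other DOM implementations may behave differently.

```python
from xml.dom import minidom

doc = minidom.parseString('<root id="cdata-wrapper"/>')

# Creating and attaching the node succeeds; no validation happens here.
section = doc.createCDATASection("This section cannot contain ]]>")
doc.documentElement.appendChild(section)

try:
    doc.toxml()
    serialization_failed = False
except ValueError as err:
    # minidom refuses to write a CDATA section whose data contains ']]>'.
    serialization_failed = True
    print("serialization refused:", err)
```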

Answered 2024-04-24T17:54:43+08:00

58

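From Wikipedia:

[In an] XML document or external parsed entity, a CDATA section is a section of element content that is marked for the parser to interpret as only character data, not markup. http://en.wikipedia.org/wiki/CDATA

Thus: the parser sees the text inside CDATA, but only as characters, not as XML nodes.

Answered 2024-04-24T17:54:43+08:00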
3
As another example of its use...

If you have an RSS feed (an XML document) and want to include some basic HTML markup in the displayed description, you can wrap it in CDATA:
```
<item>
  <title>Title of Feed Item</title>
  <link>/mylink/article1</link>
  <description>
    <![CDATA[
      <p>
      <a href="/mylink/article1"><img style="float: left; margin-right: 5px;" height="80" src="/mylink/image" alt=""/></a>
      Author Names
      
<em>Date</em>
      
Paragraph of text describing the article to be displayed</p>
    ]]>
  </description>
</item>
```
The RSS reader pulls in the description and renders the HTML inside the CDATA.

Note: not all HTML tags work. I think it depends on the RSS reader you are using.
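To illustrate what a feed consumer receives, here is a minimal Python sketch using the standard-library `xml.etree.ElementTree`. The trimmed-down item and the variable names are made up for the example; real RSS readers may process the description quite differently.

```python
import xml.etree.ElementTree as ET

item_xml = """<item>
  <title>Title of Feed Item</title>
  <link>/mylink/article1</link>
  <description><![CDATA[<p><em>Date</em> Paragraph of text describing the article</p>]]></description>
</item>"""

item = ET.fromstring(item_xml)
description = item.find("description")

# The HTML arrives as one plain string that the reader can hand to a renderer.
description_html = description.text
print(description_html)

# The <p> and <em> tags did not become XML child elements of <description>.
child_count = len(description)
print(child_count)  # prints 0
```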

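To explain why this example uses CDATA (rather than the appropriate pubDate and dc:creator tags): this is for display on a website using an RSS widget over whose formatting we have no real control.

This lets us specify the height and position of the included image, format the author names and date correctly, and so on, without needing a new widget. It also means I can script this rather than add them by hand.
Answered 2024-04-24T17:54:43+08:00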
6
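One big use case: your XML includes a program as data (for example, a web-page tutorial for Java). In this situation your data includes a large number of characters such as '&' and '<', but those characters are not meant to be XML markup.

Compare: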
```
<example-code>
while (x &lt; len &amp;&amp; !done) {
    print( &quot;Still working, &apos;zzz&apos;.&quot; );
    ++x;
    }
</example-code>
```
with
```
<example-code><![CDATA[
while (x < len && !done) {
    print( "Still working, 'zzz'." );
    ++x;
    }
]]></example-code>
```
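Especially if you are copying and pasting this code from a file (or including it in a preprocessor), it is nice to just have the characters you want in your XML file, without confusing them with XML tags or attributes. As @paary mentioned, other common uses include embedding URLs that contain ampersands. Finally, even if the data contains only a few special characters but is very long (the text of a chapter, say), it is nice not to have to encode and decode those few entities while editing the XML file.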
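The two forms compared above should be equivalent to a parser. The following minimal Python sketch checks that, assuming the standard-library `xml.etree.ElementTree` parser and `xml.sax.saxutils.escape`; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

code = """
while (x < len && !done) {
    print( "Still working, 'zzz'." );
    ++x;
    }
"""

# Variant 1: every '&', '<' and '>' is replaced by an entity reference.
escaped_doc = "<example-code>" + escape(code) + "</example-code>"
# Variant 2: the code is pasted verbatim into a CDATA section.
cdata_doc = "<example-code><![CDATA[" + code + "]]></example-code>"

escaped_text = ET.fromstring(escaped_doc).text
cdata_text = ET.fromstring(cdata_doc).text

# Both variants hand the application exactly the same characters.
print(escaped_text == cdata_text == code)  # prints True
```

So the choice between the two is mostly about how readable and editable the XML source is, not about what the application ends up seeing.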

(I suspect all the comparisons to comments are kind of misleading or unhelpful.)
Answered 2024-04-24T17:54:43+08:00
10
I once had to use CDATA when my XML tag needed to store HTML code. Something like:
```
<codearea>
  <![CDATA[ 
  <div> <p> my para </p> </div> 
  ]]>
</codearea>
```
So CDATA means that the parser will ignore any characters that could otherwise be interpreted as XML tags, such as < and >.
Answered 2024-04-24T17:54:43+08:00
7

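CDATA stands for Character Data. You can use it to escape characters that would otherwise be treated as regular XML. The data inside it is not parsed. For example, if you want to pass a URL that contains &, you can use CDATA to do so. Otherwise you will get an error, because it will be parsed as regular XML.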
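The URL case can be tried out with a minimal Python sketch. It assumes the standard-library `xml.etree.ElementTree` parser; the URL and the `<link>` element name are made up for the example.

```python
import xml.etree.ElementTree as ET

url = "http://example.com/search?q=cdata&lang=en"

# A bare '&' starts an entity reference, so the raw URL is not well-formed XML.
try:
    ET.fromstring("<link>" + url + "</link>")
    raw_parsed = True
except ET.ParseError:
    raw_parsed = False

# Either wrap the URL in CDATA or escape the ampersand as &amp;.
cdata_text = ET.fromstring("<link><![CDATA[" + url + "]]></link>").text
entity_text = ET.fromstring("<link>" + url.replace("&", "&amp;") + "</link>").text

print(raw_parsed, cdata_text == url, entity_text == url)  # prints False True True
```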

Answered 2024-04-24T17:54:43+08:00
