
The DOMNode class

(PHP 5)

Class synopsis

DOMNode {
/* Properties */
public readonly string $nodeName ;
public string $nodeValue ;
public readonly int $nodeType ;
public readonly DOMNode $parentNode ;
public readonly DOMNodeList $childNodes ;
public readonly DOMNode $firstChild ;
public readonly DOMNode $lastChild ;
public readonly DOMNode $previousSibling ;
public readonly DOMNode $nextSibling ;
public readonly DOMNamedNodeMap $attributes ;
public readonly DOMDocument $ownerDocument ;
public readonly string $namespaceURI ;
public string $prefix ;
public readonly string $localName ;
public readonly string $baseURI ;
public string $textContent ;
/* Methods */
public DOMNode appendChild ( DOMNode $newnode )
public string C14N ([ bool $exclusive [, bool $with_comments [, array $xpath [, array $ns_prefixes ]]]] )
public int C14NFile ( string $uri [, bool $exclusive [, bool $with_comments [, array $xpath [, array $ns_prefixes ]]]] )
public DOMNode cloneNode ([ bool $deep ] )
public int getLineNo ( void )
public string getNodePath ( void )
public bool hasAttributes ( void )
public bool hasChildNodes ( void )
public DOMNode insertBefore ( DOMNode $newnode [, DOMNode $refnode ] )
public bool isDefaultNamespace ( string $namespaceURI )
public bool isSameNode ( DOMNode $node )
public bool isSupported ( string $feature , string $version )
public string lookupNamespaceURI ( string $prefix )
public string lookupPrefix ( string $namespaceURI )
public void normalize ( void )
public DOMNode removeChild ( DOMNode $oldnode )
public DOMNode replaceChild ( DOMNode $newnode , DOMNode $oldnode )
}

Properties

nodeName

Returns the most accurate name for the current node type

nodeValue

The value of this node, depending on its type

nodeType

Gets the type of the node. One of the predefined XML_xxx_NODE constants

parentNode

The parent of this node

childNodes

A DOMNodeList that contains all children of this node. If there are no children, this is an empty DOMNodeList.

firstChild

The first child of this node. If there is no such node, this returns NULL.

lastChild

The last child of this node. If there is no such node, this returns NULL.

previousSibling

The node immediately preceding this node. If there is no such node, this returns NULL.

nextSibling

The node immediately following this node. If there is no such node, this returns NULL.

attributes

A DOMNamedNodeMap containing the attributes of this node (if it is a DOMElement) or NULL otherwise.

ownerDocument

The DOMDocument object associated with this node.

namespaceURI

The namespace URI of this node, or NULL if it is unspecified.

prefix

The namespace prefix of this node, or NULL if it is unspecified.

localName

Returns the local part of the qualified name of this node.

baseURI

The absolute base URI of this node or NULL if the implementation wasn't able to obtain an absolute URI.

textContent

This attribute returns the text content of this node and its descendants.
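
As a rough illustration of how these properties relate to each other, the following sketch loads a small, made-up document (the `<root>`/`<item>` markup is an assumption for demonstration only) and reads several of the properties described above.

```php
<?php
// Hypothetical sample document, used only for illustration.
$doc = new DOMDocument();
$doc->loadXML('<root><item id="a">first</item><item>second</item></root>');

$root  = $doc->documentElement;
$first = $root->firstChild;   // the first <item> element

echo $first->nodeName, "\n";                                          // item
echo $first->nodeType === XML_ELEMENT_NODE ? "element" : "other", "\n"; // element
echo $first->parentNode->nodeName, "\n";                              // root
echo $root->childNodes->length, "\n";                                 // 2
echo $first->nextSibling->textContent, "\n";                          // second
var_dump($first->previousSibling);                                    // NULL
echo $first->attributes->length, "\n";                                // 1
var_dump($first->firstChild->attributes);                             // NULL (text node)
var_dump($first->ownerDocument->isSameNode($doc));                    // bool(true)
```

The markup is written without whitespace between elements, so no whitespace-only #text nodes appear among the children.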

Notes

Note:

The DOM extension uses UTF-8 encoding. Use utf8_encode() and utf8_decode() to work with texts in ISO-8859-1 encoding or Iconv for other encodings.
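
A minimal sketch of the conversion step, assuming the iconv extension is available and that the input bytes really are ISO-8859-1 (the sample string and the `<word>` element name are made up for illustration):

```php
<?php
// "café" encoded as ISO-8859-1: the final byte 0xE9 is not valid UTF-8 on its own.
$latin1 = "caf\xE9";

// Convert to UTF-8 before handing the text to the DOM extension.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

$doc = new DOMDocument('1.0', 'UTF-8');
$el  = $doc->createElement('word');
$el->appendChild($doc->createTextNode($utf8));
$doc->appendChild($el);

// The node now holds the UTF-8 byte sequence 0xC3 0xA9 for "é".
var_dump($el->textContent === "caf\xC3\xA9"); // bool(true)
```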


User comments:

alastair dot dallas at gmail dot com (25-Sep-2011 04:44)

The issues around mixed content took me some experimentation to remember, so I thought I'd add this note to save others time.

When your markup is something like: <div><p>First text.</p><ul><li><p>First bullet</p></li></ul></div>, you'll get XML_ELEMENT_NODEs that are quite regular. The <div> has children <p> and <ul> and the nodeValue for both <p>s yields the text you expect.

But when your markup is more like <p>This is <b>bold</b> and this is <i>italic</i>.</p>, you realize that the nodeValue for XML_ELEMENT_NODEs is not reliable. In this case, you need to look at the <p>'s child nodes. For this example, the <p> has children: #text, <b>, #text, <i>, #text.

In this example, the nodeValue of <b> and <i> is the same as their #text children. But you could have markup like: <p>This <b>is bold and <i>bold italic</i></b>, you see?</p>. In this case, you need to look at the children of <b>, which will be #text, <i>, because the nodeValue of <b> will not be sufficient.

XML_TEXT_NODEs have no children and are always named '#text'. Depending on how whitespace is handled, your tree may have "empty" #text nodes as children of <body> and elsewhere.

Attributes are nodes, but I had forgotten that they are not in the tree expressed by childNodes. Walking the full tree using childNodes will not visit any attribute nodes.
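
The points above can be sketched as follows. The helper name `collectNodeNames` and the `class="intro"` attribute are assumptions added for illustration. The walk uses firstChild/nextSibling, so it never visits attribute nodes.

```php
<?php
// Hypothetical helper: depth-first walk that records every node name it visits.
function collectNodeNames(DOMNode $node, array &$names): void
{
    $names[] = $node->nodeName;
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        collectNodeNames($child, $names);
    }
}

$doc = new DOMDocument();
$doc->loadXML('<p class="intro">This <b>is bold and <i>bold italic</i></b>, you see?</p>');
$p = $doc->documentElement;

$names = [];
collectNodeNames($p, $names);
echo implode(', ', $names), "\n";             // p, #text, b, #text, i, #text, #text

// nodeValue of <b> flattens its descendants into plain text.
$b = $p->firstChild->nextSibling;
echo $b->nodeValue, "\n";                     // is bold and bold italic

// The attribute is reachable through attributes, not through the child walk.
echo $p->attributes->length, "\n";            // 1
echo $p->attributes->item(0)->nodeName, "\n"; // class
```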

mjpelmear at gmail dot com (03-Jun-2011 09:33)

getAttribute() returns an empty string if the requested attribute doesn't exist in the node.
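
Because of this, an empty attribute and a missing one look the same through getAttribute(). A small sketch of telling them apart with DOMElement::hasAttribute() (the `<a>` markup is invented for the example):

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<a href="" title="Home">link</a>');
$a = $doc->documentElement; // a DOMElement

var_dump($a->getAttribute('href')); // string(0) "" : present but empty
var_dump($a->getAttribute('rel'));  // string(0) "" : not present at all
var_dump($a->hasAttribute('href')); // bool(true)
var_dump($a->hasAttribute('rel'));  // bool(false)
```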

stilgar at gilevski dot fakepart dot name (21-May-2011 09:17)

If $node->textContent and $node->nodeValue are empty, check whether the loaded document is UTF-8 encoded.

imranomar at gmail dot com (20-Mar-2011 02:10)

Just discovered that $node->nodeValue strips out all the tags.
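
If the markup is needed rather than the flattened text, one option is to serialize the node with DOMDocument::saveXML(). A short sketch with made-up input:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<p>This is <b>bold</b> text.</p>');
$p = $doc->documentElement;

echo $p->nodeValue, "\n";    // This is bold text.
echo $doc->saveXML($p), "\n"; // <p>This is <b>bold</b> text.</p>
```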

I. Cook (19-Apr-2010 10:43)

For a reference with more information about the XML DOM node types, see http://www.w3schools.com/dom/dom_nodetype.asp

(When using PHP DOMNode, these constants need to be prefaced with "XML_")

R. Studer (13-Jan-2010 05:03)

For clarification:
The methods that previous posters supposedly 'discovered' and took to be undocumented on this class (getElementsByTagName and getAttribute on DOMNode) are in fact methods of the class DOMElement, which inherits from DOMNode.

See: http://www.php.net/manual/en/class.domelement.php
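
In practice this means checking that a node is a DOMElement before calling element-only methods on it. A minimal sketch, using an invented `<links>` document that mixes element and text children:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<links><a href="/one">1</a>text<a href="/two">2</a></links>');

$hrefs = [];
foreach ($doc->documentElement->childNodes as $node) {
    // Text nodes are DOMNode instances too, but have no getAttribute().
    if ($node instanceof DOMElement) {
        $hrefs[] = $node->getAttribute('href');
    }
}

print_r($hrefs); // contains "/one" and "/two"
```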

David Rekowski (08-Jan-2010 09:54)

Although $textContent is not flagged readonly, in PHP versions before 5.6.1 you cannot simply overwrite it to replace the text content of a DOMNode. Instead you have to do something like this:

<?php

$node->removeChild($node->firstChild);
$node->appendChild(new DOMText('new text content'));

?>

This example shows what happens:

<?php

// loadXML() is an instance method; calling it statically is an error in PHP 8.
$doc = new DOMDocument();
$doc->loadXML('<node>old content</node>');
$node = $doc->getElementsByTagName('node')->item(0);
echo "Content 1: ".$node->textContent."\n";

$node->textContent = 'new content';
echo "Content 2: ".$node->textContent."\n";

$newText = new DOMText('new content');

$node->appendChild($newText);
echo "Content 3: ".$node->textContent."\n";

$node->removeChild($node->firstChild);
$node->appendChild($newText);
echo "Content 4: ".$node->textContent."\n";

?>

The output (on PHP versions before 5.6.1, where assigning to $textContent has no effect) is:

Content 1: old content // starting content
Content 2: old content // trying to replace overwriting $node->textContent
Content 3: old contentnew content // simply appending the new text node
Content 4: new content // removing firstchild before appending the new text node

If you want to have a CDATA section, use this:

<?php
// loadXML() is an instance method; calling it statically is an error in PHP 8.
$doc = new DOMDocument();
$doc->loadXML('<node>old content</node>');
$node = $doc->getElementsByTagName('node')->item(0);
$node->removeChild($node->firstChild);
$newText = $doc->createCDATASection('new cdata content');
$node->appendChild($newText);
echo "Content withCDATA: ".$doc->saveXML($node)."\n";
?>
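
Removing only the first child leaves other children in place when the node has mixed content. A possible generalization is sketched below. The helper name `replaceText` is hypothetical and not part of the DOM extension. It removes every child before appending a single text node.

```php
<?php
// Hypothetical helper: replace all children of $node with one text node.
function replaceText(DOMNode $node, string $text): void
{
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }
    $node->appendChild($node->ownerDocument->createTextNode($text));
}

$doc = new DOMDocument();
$doc->loadXML('<node>old <b>mixed</b> content</node>');
$node = $doc->documentElement;

replaceText($node, 'new content');
echo $doc->saveXML($node), "\n"; // <node>new content</node>
```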

Steve K (03-Nov-2009 07:47)

This class apparently also has a getElementsByTagName method.

I was able to confirm this by evaluating the output from DOMNodeList->item() against various tests with the is_a() function.

marc at ermshaus dot org (05-May-2009 04:36)

It took me forever to find a mapping for the XML_*_NODE constants, so I thought it'd be handy to paste it here:

 1 XML_ELEMENT_NODE
 2 XML_ATTRIBUTE_NODE
 3 XML_TEXT_NODE
 4 XML_CDATA_SECTION_NODE
 5 XML_ENTITY_REF_NODE
 6 XML_ENTITY_NODE
 7 XML_PI_NODE
 8 XML_COMMENT_NODE
 9 XML_DOCUMENT_NODE
10 XML_DOCUMENT_TYPE_NODE
11 XML_DOCUMENT_FRAG_NODE
12 XML_NOTATION_NODE
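
For debugging, such a mapping can be built from the constants themselves instead of hard-coding the numbers. This is only a sketch: the `$typeNames` array and the sample document are assumptions for illustration, and PHP spells three of the constants in abbreviated form (XML_ENTITY_REF_NODE, XML_PI_NODE, XML_DOCUMENT_FRAG_NODE).

```php
<?php
// Build a value => name lookup from the constants defined by the DOM extension.
$typeNames = [];
foreach ([
    'XML_ELEMENT_NODE', 'XML_ATTRIBUTE_NODE', 'XML_TEXT_NODE',
    'XML_CDATA_SECTION_NODE', 'XML_ENTITY_REF_NODE', 'XML_ENTITY_NODE',
    'XML_PI_NODE', 'XML_COMMENT_NODE', 'XML_DOCUMENT_NODE',
    'XML_DOCUMENT_TYPE_NODE', 'XML_DOCUMENT_FRAG_NODE', 'XML_NOTATION_NODE',
] as $name) {
    $typeNames[constant($name)] = $name;
}

$doc = new DOMDocument();
$doc->loadXML('<root><!-- note -->text</root>');

$seen = [];
foreach ($doc->documentElement->childNodes as $child) {
    $seen[] = $typeNames[$child->nodeType] ?? 'unknown';
}
echo implode(', ', $seen), "\n"; // XML_COMMENT_NODE, XML_TEXT_NODE
```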

matt at lamplightdb dot co dot uk (06-Apr-2009 01:39)

And apparently a setAttribute method too:

$node->setAttribute( 'attrName' , 'value' );

jorge dot hebrard at gmail dot com (24-Jan-2009 01:29)

Try canonicalization:
<?php
$dom = new DOMDocument;
$dom->loadHTMLFile('http://www.example.com/');
echo $dom->documentElement->C14N();
?>

Or output it to a file, using C14NFile()

Undocumented stuff ;)
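
To see what canonicalization changes without fetching a remote page, here is a small local sketch (the input markup is made up). Canonical XML sorts attributes and writes empty elements as start/end tag pairs.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root b="2" a="1"><child/></root>');

$canonical = $doc->documentElement->C14N();
echo $canonical, "\n"; // <root a="1" b="2"><child></child></root>
```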

brian wildwoodassociates.info (08-Dec-2008 07:27)

This class has a getAttribute method.

Assume that a DOMNode object $ref contains an anchor taken out of a DOMNodeList. Then

    $url = $ref->getAttribute('href');

would isolate the url associated with the href part of the anchor.