A closer look at XPath, with Java sample code

2020-04-01 02:02:04
OfStack


import java.io.IOException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
public class XpathTest {
 public static void main(String[] args) throws ParserConfigurationException,
   SAXException, IOException, XPathExpressionException {
  DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
  factory.setNamespaceAware(false);
  DocumentBuilder builder = factory.newDocumentBuilder();
  Document doc = builder.parse("C:/Users/Administrator/Desktop/test.xml");
  System.out.println(doc.getChildNodes().getLength());
  XPathFactory xFactory = XPathFactory.newInstance();
  XPath xpath = xFactory.newXPath();
  XPathExpression expr = xpath
    .compile("//name/text()");
  Object result = expr.evaluate(doc, XPathConstants.NODESET);
  NodeList nodes = (NodeList) result;
  System.out.println(nodes.getLength());
  for (int i = 0; i < nodes.getLength(); i++) {
   System.out.println(nodes.item(i).getNodeValue());
  }
 }
}

One, node types
XPath has seven node types: element, attribute, text, namespace, processing-instruction, comment, and document (root) nodes. The root of the tree is the document node; each element is an element node and each attribute is an attribute node.
Two, common path expressions
Expression  Description
nodename    Selects the child elements of the current node that are named nodename
/           Selects from the root node
//          Selects matching nodes anywhere below the current node, regardless of their location
.           Selects the current node
..          Selects the parent of the current node
@           Selects attributes
For example, given the following document:


<?xml version="1.0" encoding="ISO-8859-1"?>  
<bookstore>  
<book>  
  <title lang="eng">Harry Potter</title>  
  <price>29.99</price>  
</book>  
<book>  
  <title lang="eng">Learning XML</title>  
  <price>39.95</price>  
</book>  
</bookstore>

The following path expressions give these results:
Path expression  Result
bookstore        Selects the bookstore child elements of the current node
/bookstore       Selects the root element bookstore. Note: a path that starts with a slash (/) is always an absolute path to an element.
bookstore/book   Selects all book elements that are children of bookstore
//book           Selects all book elements, regardless of their location in the document
bookstore//book  Selects all book elements that are descendants of bookstore, regardless of where they sit under it
//@lang          Selects all attributes named lang
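To check these path expressions in practice, here is a minimal sketch that evaluates each of them against the bookstore document with the standard javax.xml.xpath API. The class name PathExpressionDemo and the helper count are made up for this illustration, and the XML is inlined without whitespace so the sketch is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PathExpressionDemo {
    // The bookstore document from the text, inlined for the example.
    static final String XML =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    // Hypothetical helper: number of nodes the expression selects,
    // evaluated with the document node as the current node.
    static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {"/bookstore", "bookstore/book", "//book", "bookstore//book", "//@lang"};
        for (String expression : expressions) {
            System.out.println(expression + " selects " + count(expression) + " node(s)");
        }
    }
}
```

With this document, /bookstore selects one node and each of the other expressions selects two.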
Three, predicates
Predicates are used to find a specific node, or a node that contains a specific value. A predicate is always enclosed in square brackets.
For example:
Path expression                     Result
/bookstore/book[1]                  Selects the first book element that is a child of bookstore
/bookstore/book[last()]             Selects the last book element that is a child of bookstore
/bookstore/book[last()-1]           Selects the second-to-last book element that is a child of bookstore
/bookstore/book[position()<3]       Selects the first two book elements that are children of bookstore
//title[@lang]                      Selects all title elements that have an attribute named lang
//title[@lang='eng']                Selects all title elements whose lang attribute has the value eng
/bookstore/book[price>35.00]        Selects all book elements under bookstore whose price element has a value greater than 35.00
/bookstore/book[price>35.00]/title  Selects the title elements of all book elements under bookstore whose price element has a value greater than 35.00
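The predicates above can be tried out with a few lines of Java. This is only a sketch: PredicateDemo and its select helper are names invented here, and the helper simply joins the text of every selected node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PredicateDemo {
    // The bookstore document from the text, inlined for the example.
    static final String XML =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    // Hypothetical helper: text content of every selected node, joined with ", ".
    static String select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) {
                    text.append(", ");
                }
                text.append(nodes.item(i).getTextContent());
            }
            return text.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/bookstore/book[1]/title",
            "/bookstore/book[last()]/title",
            "/bookstore/book[last()-1]/title",
            "//title[@lang='eng']",
            "/bookstore/book[price>35.00]/title"
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + select(expression));
        }
    }
}
```

Because the sample document has only two books, book[1] and book[last()-1] both select the Harry Potter entry.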
Four, wildcard characters
Wildcard  Description
*         Matches any element node
@*        Matches any attribute node
node()    Matches any node of any type
|         Combines several paths into one selection
For example:
Path expression                   Result
/bookstore/*                      Selects all child elements of the bookstore element
//*                               Selects all elements in the document
//title[@*]                       Selects all title elements that have at least one attribute
//book/title | //book/price       Selects all title and price elements of all book elements
//title | //price                 Selects all title and price elements in the document
/bookstore/book/title | //price   Selects the title elements of all book elements under bookstore, as well as all price elements in the document
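As a quick check of the wildcard and union expressions, the sketch below lists the names of the selected nodes. WildcardDemo and names are illustrative names, not part of any API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class WildcardDemo {
    // The bookstore document from the text, inlined without whitespace.
    static final String XML =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    // Hypothetical helper: names of the selected nodes, joined with ",".
    static String names(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            StringBuilder joined = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) {
                    joined.append(',');
                }
                joined.append(nodes.item(i).getNodeName());
            }
            return joined.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {"/bookstore/*", "//*", "//title[@*]", "//book/title | //book/price"};
        for (String expression : expressions) {
            System.out.println(expression + " -> " + names(expression));
        }
    }
}
```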

Five, axes
Axis name           Result
ancestor            Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self    Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute           Selects all attributes of the current node
child               Selects all children of the current node
descendant          Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self  Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
following           Selects everything in the document after the closing tag of the current node
namespace           Selects all namespace nodes of the current node
parent              Selects the parent of the current node
preceding           Selects all nodes in the document that come before the start tag of the current node (ancestors excluded)
preceding-sibling   Selects all siblings before the current node
self                Selects the current node
A location path can be either absolute or relative. For example:
Absolute location path:
/step/step/...
Relative location path:
step/step/...
Each step has the form axisname::nodetest[predicate] and consists of:
an axis: defines the tree relationship between the selected nodes and the current node
a node test: identifies the nodes within the axis
zero or more predicates: further refine the selected node-set
For example:
Example                 Result
child::book             Selects all book elements that are children of the current node
attribute::lang         Selects the lang attribute of the current node
child::*                Selects all child elements of the current node
attribute::*            Selects all attributes of the current node
child::text()           Selects all text node children of the current node
child::node()           Selects all children of the current node
descendant::book        Selects all book descendants of the current node
ancestor::book          Selects all book ancestors of the current node
ancestor-or-self::book  Selects all book ancestors of the current node, and the current node itself if it is a book element
child::*/child::price   Selects all price grandchildren of the current node
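Axes only make sense relative to a current node, so the following sketch first picks the first title element and then evaluates each step from there. The names AxisDemo and fromFirstTitle are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AxisDemo {
    // The bookstore document from the text, inlined without whitespace.
    static final String XML =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    // Hypothetical helper: evaluates a step with the first title element as the
    // current node and returns the names of the selected nodes, joined with ",".
    static String fromFirstTitle(String step) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node title = (Node) xpath.evaluate("/bookstore/book[1]/title", doc, XPathConstants.NODE);
            NodeList nodes = (NodeList) xpath.evaluate(step, title, XPathConstants.NODESET);
            StringBuilder joined = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) {
                    joined.append(',');
                }
                joined.append(nodes.item(i).getNodeName());
            }
            return joined.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + step, e);
        }
    }

    public static void main(String[] args) {
        String[] steps = {"ancestor::book", "parent::*", "attribute::lang", "child::text()", "ancestor-or-self::*"};
        for (String step : steps) {
            System.out.println(step + " -> " + fromFirstTitle(step));
        }
    }
}
```

Text nodes report the name #text, which is why child::text() prints that instead of the title string.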
Six, operators
Operator  Description               Example                       Return value
|         Union of two node-sets    //book | //cd                 A node-set with all book and cd elements
+         Addition                  6 + 4                         10
-         Subtraction               6 - 4                         2
*         Multiplication            6 * 4                         24
div       Division                  8 div 4                       2
=         Equal                     price=9.80                    true if price is 9.80, false if price is 9.90
!=        Not equal                 price!=9.80                   true if price is 9.90, false if price is 9.80
<         Less than                 price<9.80                    true if price is 9.00, false if price is 9.98
<=        Less than or equal to     price<=9.80                   true if price is 9.00, false if price is 9.90
>         Greater than              price>9.80                    true if price is 9.90, false if price is 9.80
>=        Greater than or equal to  price>=9.80                   true if price is 9.90, false if price is 9.70
or        Logical or                price=9.80 or price=9.70      true if price is 9.80, false if price is 9.50
and       Logical and               price>9.00 and price<9.90     true if price is 9.80, false if price is 8.50
mod       Modulus (remainder)       5 mod 2                       1
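The operators can be exercised directly from Java by asking for a number or a boolean result. The sketch below assumes a tiny one-book document whose price is 9.50 (a value chosen for this example, not taken from the table); OperatorDemo, number, and bool are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class OperatorDemo {
    // Assumed sample document: a single book priced at 9.50.
    static final String XML = "<book><price>9.50</price></book>";

    // Parses the sample and returns the book element, used as the current node.
    private static Element book() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)))
                .getDocumentElement();
    }

    // Evaluates an arithmetic expression; XPath numbers come back as Double.
    static double number(String expression) {
        try {
            return (Double) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, book(), XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Evaluates a comparison or logical expression; XPath booleans come back as Boolean.
    static boolean bool(String expression) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, book(), XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        for (String e : new String[] {"6 + 4", "6 - 4", "6 * 4", "8 div 4", "5 mod 2"}) {
            System.out.println(e + " = " + number(e));
        }
        for (String e : new String[] {"price=9.50", "price!=9.50",
                "price>9.00 and price<9.90", "price=9.80 or price=9.70"}) {
            System.out.println(e + " is " + bool(e));
        }
    }
}
```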

Using XPath in Java
Java 5 (JDK 1.5) introduced the javax.xml.xpath package for evaluating XPath expressions against XML documents.
1. Data types
The first thing to note is that XPath data types do not correspond one-to-one to Java types. XPath 1.0 defines only four data types:
• node-set
• number
• boolean
• string

The corresponding Java types are:
• number maps to java.lang.Double
• string maps to java.lang.String
• boolean maps to java.lang.Boolean
• node-set maps to org.w3c.dom.NodeList
Therefore, when using the Java XPath API, pay attention to the return type of each evaluate method (these are the methods of XPathExpression):


public Object evaluate(Object item, QName returnType)throws XPathExpressionException;    

public String evaluate(Object item)throws XPathExpressionException;    

public Object evaluate(InputSource source, QName returnType)throws XPathExpressionException;    

public String evaluate(InputSource source)throws XPathExpressionException;



When no return type is specified, the result is returned as a String. When a return type is specified, the result comes back as an Object and must be cast to the Java type that corresponds to the requested type (for example, a result requested as XPathConstants.NODESET is cast to NodeList).
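The difference between the two styles shows up clearly with the InputSource overloads. The sketch below reads the same price once as the default String and once as a number; DefaultStringDemo and its two methods are names invented for the illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class DefaultStringDemo {
    // The bookstore document from the text, inlined without whitespace.
    static final String XML =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    private static XPathExpression firstPrice() throws Exception {
        return XPathFactory.newInstance().newXPath().compile("/bookstore/book[1]/price");
    }

    // evaluate(InputSource): no return type given, so the result is a String.
    static String priceAsString() {
        try {
            return firstPrice().evaluate(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // evaluate(InputSource, QName): the Object result is cast to Double for NUMBER.
    static double priceAsNumber() {
        try {
            Object result = firstPrice().evaluate(
                    new InputSource(new StringReader(XML)), XPathConstants.NUMBER);
            return (Double) result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("as String: " + priceAsString());
        System.out.println("as number: " + priceAsNumber());
    }
}
```

Casting to the wrong type (for example, casting a NUMBER result to NodeList) fails with a ClassCastException at run time, which is why the requested type and the cast have to agree.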
2. Using the API
As with DOM, an XPath object is obtained from a factory:


XPathFactory factory = XPathFactory.newInstance();    
XPath xpath = factory.newXPath();    
XPathExpression expression = xpath.compile("/bookstore//book/title/text()");



Again we use a bookstore document like the one above. To get the result of this expression, we first need an input object, such as a Document:


DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();    
DocumentBuilder documentBuilder = builderFactory.newDocumentBuilder();    
Document document = documentBuilder.parse(new File("books.xml"));    
NodeList list = (NodeList) expression.evaluate(document,XPathConstants.NODESET);



As you can see, when using XPath you need to know exactly what type an expression returns; otherwise you will not get the desired result.
Finally, we print the value of each title; the output is shown after the loop:


for (int i = 0; i < list.getLength(); i++) {
    System.out.println(list.item(i).getNodeValue());
}




Everyday Italian  
Harry Potter  
XQuery Kick Start  
Learning XML



In practice, XML documents usually declare namespaces, for example:




<?xml version="1.0" encoding="UTF-8"?>    
<tg:bookstore xmlns:tg="http://www.tibco.com/cdc/liugang"    
           xmlns:ns="http://www.tibco.com/cdc/liugang/ns">    
          <ns:book>    
            <tg:title>Hello</tg:title>    
          </ns:book>    
</tg:bookstore>

XPath defines three functions related to node names and namespaces:
• local-name()
• namespace-uri()
• name()
For example, to find every element in the document whose local name is book, whatever its namespace:


XPathFactory xPathFactory = XPathFactory.newInstance();    
XPath xpath = xPathFactory.newXPath();    
XPathExpression compile = xpath.compile("//*[local-name()='book']");    
NodeList list = (NodeList) compile.evaluate(document,XPathConstants.NODESET);
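To see what the three functions return for the namespaced document above, the sketch below parses it with a namespace-aware factory and evaluates each function as a string. NameFunctionDemo and str are hypothetical names; setNamespaceAware(true) is used so that the DOM actually records local names and namespace URIs.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NameFunctionDemo {
    // The namespaced bookstore document from the text, inlined without whitespace.
    static final String XML =
            "<tg:bookstore xmlns:tg=\"http://www.tibco.com/cdc/liugang\""
          + " xmlns:ns=\"http://www.tibco.com/cdc/liugang/ns\">"
          + "<ns:book><tg:title>Hello</tg:title></ns:book>"
          + "</tg:bookstore>";

    // Hypothetical helper: evaluates the expression and returns its string value.
    static String str(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep namespace information in the DOM
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String book = "//*[local-name()='book']";
        System.out.println("local-name:    " + str("local-name(" + book + ")"));
        System.out.println("name:          " + str("name(" + book + ")"));
        System.out.println("namespace-uri: " + str("namespace-uri(" + book + ")"));
    }
}
```

Matching on local-name() is a convenient way to sidestep namespaces, at the cost of also matching same-named elements from unrelated namespaces.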



If an element is in a namespace, an XPath lookup must name that same namespace, even when the element uses the default namespace. For example, given the document:


<?xml version="1.0" encoding="UTF-8"?>    
<bookstore xmlns="http://www.tibco.com/cdc/liugang" xmlns:tg="http://www.tibco.com/cdc/liugang/tg"    
           xmlns:ns="http://www.tibco.com/cdc/liugang/ns">    
          <ns:book>    
            <tg:title>Hello</tg:title>    
          </ns:book>    
          <computer>    
               <id>ElsIOIELdslke-1233</id>    
          </computer>    
</bookstore>



Three namespaces are declared: the default namespace, xmlns:tg, and xmlns:ns. To use namespaces in an expression, we need to set a NamespaceContext on the XPath object. NamespaceContext is an interface, so we have to provide our own implementation. For example, the three namespaces of the document above can be mapped as follows (df is a prefix we choose ourselves for the default namespace):


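import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

class CustomNamespaceContext implements NamespaceContext {

    // Maps each prefix used in XPath expressions to its namespace URI.
    // "df" is an arbitrary prefix standing for the document's default namespace.
    public String getNamespaceURI(String prefix) {
        if (prefix.equals("ns")) {
            return "http://www.tibco.com/cdc/liugang/ns";
        } else if (prefix.equals("tg")) {
            return "http://www.tibco.com/cdc/liugang/tg";
        } else if (prefix.equals("df")) {
            return "http://www.tibco.com/cdc/liugang";
        }
        return XMLConstants.NULL_NS_URI;
    }

    // Reverse lookups (URI to prefix) are left unimplemented in this example.
    public String getPrefix(String namespaceURI) {
        return null;
    }

    public Iterator getPrefixes(String namespaceURI) {
        return null;
    }
}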



The method names are self-explanatory; only the first method is actually implemented here. With this in place, to find all elements named computer in the default namespace, you can do the following:


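// Note: document must come from a namespace-aware parser
// (DocumentBuilderFactory.setNamespaceAware(true)), otherwise df:computer matches nothing.
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
xpath.setNamespaceContext(new CustomNamespaceContext());
XPathExpression compile = xpath.compile("//df:computer");
NodeList list = (NodeList) compile.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < list.getLength(); i++) {
    Node item = list.item(i);
    // getNodeValue() is null for element nodes; use getTextContent() to read their text.
    System.out.println(item.getNodeName() + "  " + item.getNodeValue());
}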
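If the reverse lookups are needed as well, one possible approach is a small map-backed NamespaceContext. The sketch below is an assumption-laden illustration rather than a reference implementation: MapNamespaceContext, bind, NamespaceLookupDemo, and find are names made up here, the df prefix is again our own choice, and the reserved xml and xmlns prefixes are not handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical map-backed context that implements all three interface methods.
final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> uriByPrefix = new LinkedHashMap<>();

    MapNamespaceContext bind(String prefix, String uri) {
        uriByPrefix.put(prefix, uri);
        return this;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = uriByPrefix.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : uriByPrefix.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        return prefixes.iterator();
    }
}

final class NamespaceLookupDemo {
    // The document with a default namespace from the text, inlined without whitespace.
    static final String XML =
            "<bookstore xmlns=\"http://www.tibco.com/cdc/liugang\""
          + " xmlns:tg=\"http://www.tibco.com/cdc/liugang/tg\""
          + " xmlns:ns=\"http://www.tibco.com/cdc/liugang/ns\">"
          + "<ns:book><tg:title>Hello</tg:title></ns:book>"
          + "<computer><id>ElsIOIELdslke-1233</id></computer>"
          + "</bookstore>";

    // Hypothetical helper: string value of the first node the expression selects.
    static String find(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed lookups to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new MapNamespaceContext()
                    .bind("df", "http://www.tibco.com/cdc/liugang")
                    .bind("tg", "http://www.tibco.com/cdc/liugang/tg")
                    .bind("ns", "http://www.tibco.com/cdc/liugang/ns"));
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(find("//df:computer/df:id"));
        System.out.println(find("//ns:book/tg:title"));
        System.out.println("unprefixed match: '" + find("//computer") + "'");
    }
}
```

The last line prints an empty match: in XPath 1.0 an unprefixed name means "no namespace", so //computer does not find an element that lives in the default namespace.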



Nine, other
In addition, the Java API lets you register a resolver for extension functions and a resolver for variables on the XPath object:


        
    /**
     * Establish a variable resolver.
     *
     * A <code>NullPointerException</code> is thrown if <code>resolver</code> is <code>null</code>.
     *
     * @param resolver Variable resolver.
     *
     * @throws NullPointerException If <code>resolver</code> is <code>null</code>.
     */
    public void setXPathVariableResolver(XPathVariableResolver resolver);

    /**
     * Establish a function resolver.
     *
     * A <code>NullPointerException</code> is thrown if <code>resolver</code> is <code>null</code>.
     *
     * @param resolver XPath function resolver.
     *
     * @throws NullPointerException If <code>resolver</code> is <code>null</code>.
     */
    public void setXPathFunctionResolver(XPathFunctionResolver resolver);
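As an illustration of the variable resolver, the sketch below binds an XPath variable named $minPrice to a Java value and uses it inside a predicate. The variable name, VariableResolverDemo, and firstTitleAbove are all assumptions made for this example; a function resolver is registered in the same way through setXPathFunctionResolver, but is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class VariableResolverDemo {
    // The bookstore document from the text, inlined without whitespace.
    static final String XML =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    // Hypothetical helper: title of the first book costing more than minPrice,
    // or an empty string when no book qualifies.
    static String firstTitleAbove(double minPrice) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Resolve $minPrice to the Java value; any other variable stays unresolved.
            xpath.setXPathVariableResolver(name ->
                    "minPrice".equals(name.getLocalPart()) ? Double.valueOf(minPrice) : null);
            return xpath.evaluate("/bookstore/book[price > $minPrice]/title", doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("above 20: " + firstTitleAbove(20.0));
        System.out.println("above 35: " + firstTitleAbove(35.0));
        System.out.println("above 50: '" + firstTitleAbove(50.0) + "'");
    }
}
```

Passing values through a variable rather than concatenating them into the expression string also keeps user-supplied input from changing the structure of the query.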