ASA SQL User's Guide
Using XML in the Database
Importing XML documents as relational data
OPENXML is used in the FROM clause of a query to generate a result set from an XML document. OPENXML uses a subset of the XPath query language to select nodes from an XML document.
When you use OPENXML, the XML document is parsed and the result is modeled as a tree. The tree is made up of nodes. XPath expressions are used to select nodes in the tree. The following list describes some commonly-used XPath expressions:
/ indicates the root node of the XML document
. (single period) indicates the current node of the XML document
// indicates all descendants of the current node, including the current node
.. indicates the parent node of the current node
./@attributename indicates the attribute of the current node having the name attributename
./childname indicates the children of the current node that are elements having the name childname
Consider the following XML document:
<inventory> <product id="301" size="Medium">Tee Shirt <quantity>54</quantity> </product> <product id="302" size="One size fits all">Tee Shirt <quantity>75</quantity> </product> <product id="400" size="One size fits all">Baseball Cap <quantity>112</quantity> </product> </inventory>
The <inventory> element is the root node. You can refer to it using the following XPath expression:
/inventory
Suppose that the current node is a <quantity> element. You can refer to this node using the following XPath expression:
.
To find all the <product> elements that are children of the <inventory> element, use the following XPath expression:
/inventory/product
If the current node is a <product> element and you want to refer to the size attribute, use the following XPath expression:
./@size
For a complete list of XPath syntax supported by OPENXML, see OPENXML function [String].
For information about the XPath query language, see http://www.w3.org/TR/xpath.
Each match for the first xpath-query argument to OPENXML generates one row in the result set. The WITH clause specifies the schema of the result set and how the value is found for each column in the result set. For example, consider the following query:
SELECT * FROM OPENXML( '<inventory> <product>Tee Shirt <quantity>54</quantity> <color>Orange</color> </product> <product>Baseball Cap
<quantity>112</quantity> <color>Black</color> </product> </inventory>', '/inventory/product' ) WITH ( quantity CHAR(3) 'quantity', color CHAR(20) 'color')
The first xpath-query argument is /inventory/product, and there are two <product> elements in the XML, so two rows are generated by this query.
The WITH clause specifies that there are two columns: quantity and color. The values for these columns are taken from the <quantity> and <color> elements. The query above generates the following result:
quantity | color |
---|---|
54 | Orange |
112 | Black |
For more information, see OPENXML function [String].
OPENXML can be used to generate an edge table, a table that contains a row for every element in the XML document. You may wish to generate an edge table so that you can query the data in the result set using SQL.
The following SQL statement creates a variable, x, that contains an XML document. The XML generated by the query has a root element called <root>, which is generated using the XMLELEMENT function, and elements are generated for each column in the employee, sales_order, and customer tables using FOR XML AUTO with the ELEMENTS modifier specified.
For information about the XMLELEMENT function, see XMLELEMENT function [String].
For information about FOR XML AUTO, see Using FOR XML AUTO.
CREATE VARIABLE x XML; SET x=(SELECT XMLELEMENT( NAME root, (SELECT * FROM employee KEY JOIN sales_order KEY JOIN customer FOR XML AUTO, ELEMENTS)));
The generated XML looks as follows:
<root> <employee> <emp_id>299</emp_id> <manager_id>902</manager_id> <emp_fname>Rollin</emp_fname> <emp_lname>Overbey</emp_lname> <dept_id>200</dept_id> <street>191 Companion Ct.</street>
<city>Kanata</city> <state>CA</state> <zip_code>94608</zip_code> <phone>5105557255</phone> <status>A</status> <ss_number>025487133</ss_number> <salary>39300.000</salary> <start_date>1987-02-19</start_date>
<birth_date>1964-03-15</birth_date> <bene_health_ins>Y</bene_health_ins> <bene_life_ins>Y</bene_life_ins> <bene_day_care>N</bene_day_care> <sex>M</sex> <sales_order> <id>2001</id> <cust_id>101</cust_id>
<order_date>2000-03-16</order_date> <fin_code_id>r1</fin_code_id> <region>Eastern</region> <sales_rep>299</sales_rep> <customer> <id>101</id> <fname>Michaels</fname> <lname>Devlin</lname> <address>114 Pioneer Avenue</address>
<city>Kingston</city> <state>NJ</state> <zip>07070</zip> <phone>2015558966</phone> <company_name>The Power Group</company_name> </customer> </sales_order> </employee> ...
The following query uses the descendant-or-self (//*) XPath expression to match every element in the above XML document, and for each element the id metaproperty is used to obtain an id for the node, and the parent (../) XPath expression is used with the id metaproperty to get the parent node. The localname metaproperty is used to obtain the name of each element.
SELECT * FROM OPENXML( x, '//*' ) WITH (id INT '@mp:id', parent INT '../@mp:id', name CHAR(20) '@mp:localname', text LONG VARCHAR 'text()' ) ORDER BY id
The result set generated by this query shows the id of each node, the id of the parent node, and the name and content for each element in the XML document.
id | parent | name | text |
---|---|---|---|
5 | (NULL) | root | (NULL) |
23 | 15 | emp_id | 299 |
47 | 15 | manager_id | 902 |
74 | 15 | emp_fname | Rollin |
... | ... | ... | ... |
If you have a table with a column that contains XML, you can use OPENXML to query all the XML values in the column at once. This can be done using a lateral derived table.
The following statements create a table with two columns, manager_id and reports. The reports column contains XML data generated from the employee table.
CREATE TABLE t (manager_id INT, reports XML); INSERT INTO t SELECT manager_id, XMLELEMENT( NAME reports, XMLAGG( XMLELEMENT( NAME e, emp_id))) FROM employee GROUP BY manager_id;
Execute the following query to view the data in the t table:
SELECT * FROM t;
This query produces the following result:
manager_id | reports |
---|---|
1293 |
<reports> <e>148</e> <e>390</e> <e>586</e> <e>757</e> ... </reports> |
1576 |
<reports> <e>184</e> <e>207</e> <e>318</e> <e>409</e> ... </reports> |
902 |
<reports> <e>129</e> <e>195</e> <e>299</e> <e>467</e> ... </reports> |
703 |
<reports> <e>191</e> <e>750</e> <e>868</e> <e>921</e> ... </reports> |
... | ... |
The following query uses a lateral derived table to generate a result set with two columns: one that lists the id for each manager, and one that lists the id for each employee that reports to that manager:
SELECT manager_id, eid FROM t, LATERAL( OPENXML( t.reports, '//e' ) WITH (eid INT '.') ) dt
This query generates the following result:
manager_id | eid |
---|---|
1293 | 148 |
1293 | 390 |
1293 | 586 |
1293 | 757 |
... | ... |
For more information about lateral derived tables, see FROM clause.