Tuesday, May 17, 2011

XML Parsing with Recursive Table Structure

After you load the XML data into the temp table using the code from yesterday's post, you end up with a hierarchical structure representing the XML data with simple id/parentid columns. So a node at the top level with an id=1 would have nodes at the next level with parentid=1, where id is unique and parentid always references id. There is also a column called 'nodetype' that seems to determine what this node represents. Now just on inspection, I deduced that nodetype=1 are root nodes, 3's are leaf nodes, and 2's are a combination (they have attribute info and children). This code will condense type 3's into their parent:

SELECT
X1.Id
,X1.ParentId
,X1.NodeType
,X1.LocalName
,X1.prev
,text = COALESCE(X1.text, X3.text)
INTO #tmp2
FROM #tmp X1
LEFT JOIN #tmp X3
ON X3.ParentId = X1.Id
AND X3.NodeType = 3
WHERE X1.NodeType <> 3
ORDER BY Id

No comments:

Post a Comment