Tuesday, May 17, 2011

XML SQL - Probing Depth of Document

Now let's say that you want to know how many levels deep this relationship goes. Without knowing the answer ahead of time, we have to use a recursive query to determine this:

; WITH XML_Doc AS (
SELECT
id
,parentid
,[level] = 1
,nodetype
,localname
,prev
,text
FROM #tmp2
WHERE LocalName = 'RootNode'
UNION ALL
SELECT
T.id
,T.parentid
,[level] = R.[level] + 1
,T.nodetype
,T.localname
,T.prev
,T.text
FROM XML_Doc R
JOIN #tmp2 T
ON R.Id = T.ParentId
)
SELECT *
INTO #XML_Doc
FROM XML_Doc

XML Parsing with Recursive Table Structure

After you load the XML data into the temp table using the code from yesterday's post, you end up with a hierarchical structure representing the XML data with simple id/parentid columns. So a node at the top level with an id=1 would have nodes at the next level with parentid=1, where id is unique and parentid always references id. There is also a column called 'nodetype' that seems to determine what this node represents. Now just on inspection, I deduced that nodetype=1 are root nodes, 3's are leaf nodes, and 2's are a combination (they have attribute info and children). This code will condense type 3's into their parent:

SELECT
X1.Id
,X1.ParentId
,X1.NodeType
,X1.LocalName
,X1.prev
,text = COALESCE(X1.text, X3.text)
INTO #tmp2
FROM #tmp X1
LEFT JOIN #tmp X3
ON X3.ParentId = X1.Id
AND X3.NodeType = 3
WHERE X1.NodeType <> 3
ORDER BY Id

Monday, May 16, 2011

XML into SQL

I'm working on loading an XML file into SQL right now, and struggling a bit with it. Not so much the coding itself, but figuring out how to represent the hierarchical XML data in a relational SQL table.

I did discover this tidbit for taking advantage of SQL's built-in tools for handling XML:

DECLARE @hdoc int DECLARE @doc varchar(max) SELECT @doc = CONVERT(varchar(max), XML_Column) FROM dbo.XML_Table EXEC sp_xml_preparedocument @hdoc OUTPUT, @doc PRINT @hdoc SELECT * INTO #tmp FROM OPENXML (@hdoc, '/RootNodeName',2) EXEC sp_xml_removedocument @hdoc

Works well.

Thursday, May 12, 2011

Getting the Real Name of a Temp Table

If you've ever wanted to retrieve the schema for a temp table, you probably noticed that in the TempDb database, the name of your temp table is not quite the same (unless you're using a global temp table - this entry does not apply to those). Say that you create a temp table named "#tmp" in your development database. If you go into TempDb, for example in the Sys.Tables view, you will find your table name, but padded on the right with underscores and a string of digits, to a length of 128 characters. So if you try to look up the schema in TempDb.Information_Schema.Columns using "#tmp", it will fail. To alleviate this problem, I wrote a simple SELECT that returns the true name of that temp table as it is stored in TempDb:

SELECT TempTableName = OBJECT_NAME(OBJECT_ID('TempDb..#tmp'), (SELECT Database_Id FROM SYS.DATABASES WHERE Name = 'TempDb'))