Jesse's SQL Blog

Tuesday, December 24, 2019

Complicated Comments

This is a simpler version of a post I made 10 years ago.

DECLARE @x int

/*

-- */

SET @x = 1

-- /*

--*/ SET @x = 2

PRINT @x

Question: after running the statement above, what will the output be?

1
2
Null
An error occurs.

Monday, December 16, 2019

Removing 100% Duplicates

In a table that contains duplicates as identified by some business rule (same name/address, etc.), but the rows included in the sets of duplicates are actually 100% the same, it's impossible to identify a "survivor" of a de-duplication process that deletes all but one of the rows in each set. There is a way in T-SQL to do this, by using the TOP(n) clause after the DELETE. It's kind of unusual, and you should be careful with it. Here's an example:

CREATE TABLE #src (x int, y int, z int)
GO
INSERT #src VALUES (1, 2, 3)
INSERT #src VALUES (1, 2, 3)
INSERT #src VALUES (1, 2, 3)
INSERT #src VALUES (10, 20, 30)
INSERT #src VALUES (100, 200, 300)
GO
SELECT * FROM #src

At this point, #src looks like this (I cannot seem to remove the odd white space in this table):
x y z
1 2 3
1 2 3
1 2 3
10 20 30
100 200 300

Now let's remove our duplicates:
DELETE TOP(2)
FROM #src
WHERE x = 1

SELECT * FROM #src

And we see the results:
x y z
1 2 3
10 20 30
100 200 300

In the DELETE TOP(n) part of the statement, n=#dupes - 1. In our case, we had 3 duplicates and removed 2 of them. Which brings up an important point - we used the "x=1" condition to identify the group, and we knew that it had exactly 3 rows in that dupe group. So if we generalize the problem, we may want to use dynamic SQL to build a solution:

; WITH DupeGroups AS (
SELECT
[x]
,[cnt] = COUNT(*)
FROM #src
GROUP BY [x]
HAVING COUNT(*) > 1
)
SELECT

[sql] = 'DELETE TOP(' + CONVERT(varchar, [cnt] - 1)

+ ') FROM #src WHERE [x] = ' + CONVERT(varchar, [x]) + ';'

FROM DupeGroups

This can then be executed using sp_SqlExec to remove the duplicates.

Wednesday, June 12, 2019

Default Schema

I'm working at a site where prior developers referenced objects without schemas, because everything is done under 'dbo'. I fell into this habit, and this led to me creating a table with no explicit schema - which means it was created under a schema tied to my login. When I later tried to reference the table in code, using the implicit 'dbo' schema, I got an error that it didn't exist, when it certainly seemed to exist! It took a bit to track down the error, and it was a 'smack my forehead' feeling when I found it. Lesson learned: don't get lazy about explicit schemas.

Saturday, May 25, 2019

Multiple UNPIVOTs

This is a great article about using multiple UNPIVOTs simultaneously on a data set, which I hadn't seen before.

Wednesday, May 15, 2019

Conversion failed when converting the varchar value '*' to data type int.

Interesting error today: "Conversion failed when converting the varchar value '*' to data type int." I partially recognized it as a conversion issue, when an int value with more digits than the target varchar column has, it will hold the value "*". A later process tried to convert it back to an int.

Thursday, February 14, 2019

SSMS Slowness

I’ve had some Hulk-like moments dealing with SSMS’s tendency to burn through all available memory, push usage to near 100%, and then freeze when I try to continue working in it. Classic “turn it off and turn it on again” fix works sometimes. I’ve even threatened it with uninstalling it (I think it knows I was bluffing). But today – today – I found a silver bullet to kill the beast. This post recommends turning off Auto-recovery. I tried it – and it works. Before, memory for SSMS was allocated at around 2.5gb, but now it’s a sweet and cuddly half gig. Yup, 80% reduction. Now, you may flinch at turning off Auto-recovery, and in principle I agree. The thing about that is, when SSMS has completely crashed with it on, it doesn’t always recover my most important documents. So, if you’re sick of SSMS slowness, consider this.

Monday, February 11, 2019

CREATE TABLE Script for Temp Table

Following up on the earlier queries that interrogated INFORMATION_SCHEMA.TABLES to get metadata on temp tables, this script builds on those to produce a CREATE TABLE statement. I find this useful for capturing the table-typed output of a stored procedure, especially if I need to capture that output in an INSERT..EXEC statement.

DECLARE @sql varchar(max); SET @sql = 'CREATE TABLE #capture_output (' + CHAR(13) + CHAR(10)
; WITH struct AS (
SELECT *
FROM TempDb.INFORMATION_SCHEMA.COLUMNS
WHERE [TABLE_NAME] = OBJECT_NAME(OBJECT_ID('TempDb..#tmp'), (SELECT Database_Id FROM SYS.DATABASES WHERE Name = 'TempDb'))
)
SELECT
@sql += CHAR(9) + ',' + QUOTENAME(struct.[COLUMN_NAME]) + ' ' + struct.[DATA_TYPE]
+ ISNULL('(' + CASE struct.[DATA_TYPE]
WHEN 'varchar' THEN CONVERT(varchar, struct.[CHARACTER_MAXIMUM_LENGTH])
WHEN 'decimal' THEN CONVERT(varchar, struct.[NUMERIC_PRECISION]) + ', ' + CONVERT(varchar, struct.[NUMERIC_SCALE])
END
+ ')', '')
+ CHAR(13) + CHAR(10)
FROM struct
SET @sql = REPLACE(@sql, '(' + CHAR(13) + CHAR(10) + CHAR(9) + ',', '(' + CHAR(13) + CHAR(10) + CHAR(9)) + ')'
PRINT @sql

I realized that I don't have a link to a tip on the FOR XML trick for concatenating values over rows into one value. I like this one: https://www.mytecbits.com/microsoft/sql-server/concatenate-multiple-rows-into-single-string

Tuesday, November 27, 2018

Classic Mistake

Today I was working on an ETL feed that parsed data elements from a single column of a single string value. The table held a log from an enterprise build tool, and we wanted to analyze the logs to find patterns of unusually slow builds.

https://dba.stackexchange.com/questions/218073/the-disappearing-act-of-the-invalid-length-parameter-passed-to-the-left-or-subs

Monday, September 10, 2018

DISTINCT with TOP n

I came across an uncommon query need to use DISTINCT and TOP n in the same query, and it made me realize that I wasn't sure how they would interact. Which would take precedence? I decided to create a simple example to help understand this.

DECLARE @x TABLE (gender char(1))

INSERT @x VALUES ('M')
INSERT @x VALUES ('M')
INSERT @x VALUES ('F')
INSERT @x VALUES ('F')

SELECT DISTINCT TOP 2 gender FROM @x ORDER BY gender

If the TOP 2 took precedence and acted first, it would take the top 2 values of gender, 'F' and 'F', and then remove duplicates in the DISTINCT clause, resulting in one row of 'F'.

If the DISTINCT acted first, it would have the preliminary result of two rows, 'F' and 'M', and then the TOP 2 would return just that.

When we run this code, we get two rows, 'F' and 'M', showing that the DISTINCT takes precedence over the TOP 2.

Monday, September 18, 2017

Clever workaround to the limitation of 900 byte index widths

Clever workaround to the limitation of 900 byte index widths: https://www.brentozar.com/archive/2013/05/indexing-wide-keys-in-sql-server/.

Friday, June 23, 2017

View defined as SELECT * FROM Another View

This has been covered previously, but is so insidious that it deserves more attention. Take a look at the code below. Fairly straightforward, right?

CREATE VIEW V1 AS SELECT [object_id] FROM sys.objects
GO

CREATE VIEW V2 AS SELECT * FROM V1
GO

SELECT TOP 10 * FROM V2
GO

Now let's modify the definition of view V1, such that the column definition includes [name]:

ALTER VIEW V1 AS SELECT [name], [object_id] FROM sys.objects

SELECT TOP 10 * FROM V2

See how the definition of V2 has not changed, even as the data that it contains has? No error has occured, which would have alerted us to the issue. This illustrates one of the biggest risks of using "SELECT *" in SQL code.

Thursday, February 20, 2014

SSMS Freeze Workaround

I'm not sure if you've ever had this problem, but while quickly cutting/pasting in a SQL script using the shortcut keys (ctrl+c, ctrl+v), in rare occasions SSMS will freeze. It doesn't crash, there's no error, it doesn't explode, it just freezes - which in and of itself is no big deal, but losing the code you've written since the last save is. What makes this worse is that there's nothing autosaved (that I've ever seen), in the SQL temp folders. The only thing I could do is shut SSMS down via Task Manager (or unplug my PC out of spite). But after a little research on the interwebs (yes, this problem just happened to me again!), I discovered a way to break the freeze. That link suggests directly right-clicking the SSMS window in the Task Bar, but I found that I had to hover above it, then right-click the instance that pops up, like this:

One time this didn't work. Playing around with it, I found that when the single instance of SSMS running popped up, I right-clicked it, then selected "Restore" from the popup menu (which only appears if there is no modal popup already in SSMS), and then re-maximized it. Then the blinking cursor reappeared, and all was well.

Monday, September 16, 2013

Temp Table Collision Between Sprocs

I recently debugged a report in which one sproc called another (let's call them OuterSproc and InnerSproc), and the resulting error was a complete mystery. When I executed InnerSproc on it's own, with the same param values provided by OuterSproc, the results were fine. After scratching my head for awhile, I decided to just keep debugging by executing lines from OuterSproc, and that's when I noticed that the two sprocs shared some temp table names. We tend to think of temp tables (as in "#temp"), as being local to the scope that created them, but that's not the whole story. Temp tables are scoped in with any code that is called from that scope also. So when OuterSproc creates data in #temp, then calls InnerSproc, InnerSproc can see and modify #temp. The problem is that InnerSproc creates its own #temp table, not thinking that it already exists. As an example of how this can be a problem, take a look at this code sample:

USE TRPTools
GO
DROP PROCEDURE InnerSproc
GO
CREATE PROCEDURE InnerSproc
AS
BEGIN
SELECT * FROM #temp
CREATE TABLE #temp (Sproc_id int, sproc_name varchar(20), when_done datetime)
INSERT #temp (Sproc_id, sproc_name) VALUES (OBJECT_ID('tempdb..#temp'), 'InnerSproc')
SELECT OBJECT_ID('tempdb..#temp')
SELECT * FROM #temp
END
GO

DROP PROCEDURE OuterSproc
GO
CREATE PROCEDURE OuterSproc
AS
BEGIN
CREATE TABLE #temp (Sproc_id int, sproc_name varchar(20), who_did varchar(20))
INSERT #temp (Sproc_id, sproc_name) VALUES (OBJECT_ID('tempdb..#temp'), 'OuterSproc')
SELECT OBJECT_ID('tempdb..#temp')
EXEC InnerSproc
SELECT * FROM #temp
END
GO
EXEC OuterSproc

Notice in sproc InnerSproc that before it creates #temp, it selects from it. This is perfectly acceptable code, because of the downstream scope of temp tables. What is weird is that the same line of code, SELECT * FROM #temp, at the beginning and end of the sproc, produces different results! The first time, it pulls from the OuterSproc version of #temp, and the second, from the InnerSproc version. It's as if the CREATE TABLE #temp statement in InnerSproc creates a version of #temp that overrides the original definition, but only within the scope of InnerSproc. Preliminary research finds no explanation of this behavior.

Takeaways:

Use temp table names that are non-generic, that are specific to your project/code/data
Consider using table variables instead, as these are scoped only to the code that creates them.

Wednesday, June 26, 2013

Case-Sensitive Comparisons

One of the first things we "learn" about SQL is that it is case-insensitive, so that "Smith", "SMITH", and "smith" are equal in comparison. In fact, it is not SQL Server that is case-insensitive, or even the database, but the collation of the database. Collation is the way the database sorts string values. An example of why this is important can be found in the ordering of alphabets of foreign languages. Another example is case-sensitivity. Most SQL Server databases are set up as case-insensitive, which allows searches for string values to ignore case.
If we right-click on a typical database in the Object Explorer of SSMS, and select the last menu option, "Properties", a window named "Database Properties" pops up. The last property listed on the General page is Collation, and we can see that it is set to SQL_Latin1_General_CP1_CI_AS. According to this SO Post, this collation has the following properties:

Latin1 makes the server treat strings using charset Latin 1, basically ASCII
CI case insensitive comparisons so 'ABC' would equal 'abc'
AS accent sensitive, so 'ü' does not equal 'u'

Let's run a query to look for all of the "Smiths" in an imaginary sample table
SELECT * FROM contacts
WHERE [last_name] = 'smith'
Let's say that the query retrieves 10 rows, and that 5 of the rows have names that appear completely in upper-case, and none are completely lower-cased. If we wanted to clean up this data so that these 5 names appear in mixed-case (e.g., "Smith" rather than "SMITH"), how would we rewrite the query to isolate these rows?
If we run this
SELECT * FROM contact
WHERE [last_name] = 'smith'
AND [last_name] = UPPER([last_name])
We still get 10 rows. The additional restriction had no effect, in the same way that it makes no difference in the first query whether we search for Smith, smith, or SMITH. The way we get around this is by forcing the collation in our query
SELECT * FROM contacts
WHERE [last_name] = 'smith'
AND [last_name] = UPPER([last_name]) COLLATE SQL_Latin1_General_CP1_CS_AS
Notice the "CS" in the collation string. It specifies case-sensitive comparisons. Running this query successfully returns the 5 rows in which we are interested. This override of the database collation applies only to the comparison preceding it (otherwise no rows would be returned, because there are no rows where [last_name] = 'smith' with respect to case-sensitivity).
References
Collation - Wikipedia, the free encyclopedia
COLLATE (Transact-SQL)
http://msdn.microsoft.com/en-us/library/aa174903%28v=sql.80%29.aspx
sql - what does COLLATE SQL_Latin1_General_CP1_CI_AS do - Stack Overflow

Tuesday, March 5, 2013

Random Numbers

While working on a project, I needed to create test data, and came across another of Pindal Dave's excellent posts: "SQL SERVER – Random Number Generator Script – SQL Query".

SQL Permissions

Good article on SQL permissions: "Understanding GRANT, DENY, and REVOKE in SQL Server".

Clever workaround for passing an array to a stored procedure or function: "Using a Variable for an IN Predicate".

Friday, March 1, 2013

Data Profiling Article

There's a good article on Data Profiling in SQL Server, complete with a script to profile tables, in SSC. It reminded me of a talk I gave at BSSUG on a data profiling script I wrote.

Thursday, February 28, 2013

Interview Questions for SQL Developers

Read a good article on SSC today: "15 Quick Short Interview Questions Useful When Hiring SQL Developers". Some of the questions one might cover only in specific situations, but overall they were broad enough to cover what a Db Developer must know.