If we right-click on a typical database in the Object Explorer of SSMS, and select the last menu option, "Properties", a window named "Database Properties" pops up. The last property listed on the General page is Collation, and we can see that it is set to SQL_Latin1_General_CP1_CI_AS. According to this SO Post, this collation has the following properties:
- Latin1 makes the server treat strings using charset Latin 1, basically ASCII
- CI case insensitive comparisons so 'ABC' would equal 'abc'
- AS accent sensitive, so 'ΓΌ' does not equal 'u'
SELECT * FROM contacts
WHERE [last_name] = 'smith'
Let's say that the query retrieves 10 rows, and that 5 of the rows have names that appear completely in upper-case, and none are completely lower-cased. If we wanted to clean up this data so that these 5 names appear in mixed-case (e.g., "Smith" rather than "SMITH"), how would we rewrite the query to isolate these rows?
If we run this
SELECT * FROM contact
WHERE [last_name] = 'smith'
AND [last_name] = UPPER([last_name])
We still get 10 rows. The additional restriction had no effect, in the same way that it makes no difference in the first query whether we search for Smith, smith, or SMITH. The way we get around this is by forcing the collation in our query
SELECT * FROM contacts
WHERE [last_name] = 'smith'
AND [last_name] = UPPER([last_name]) COLLATE SQL_Latin1_General_CP1_CS_AS
Notice the "CS" in the collation string. It specifies case-sensitive comparisons. Running this query successfully returns the 5 rows in which we are interested. This override of the database collation applies only to the comparison preceding it (otherwise no rows would be returned, because there are no rows where [last_name] = 'smith' with respect to case-sensitivity).
References
Collation - Wikipedia, the free encyclopedia
COLLATE (Transact-SQL)
http://msdn.microsoft.com/en-us/library/aa174903%28v=sql.80%29.aspx
sql - what does COLLATE SQL_Latin1_General_CP1_CI_AS do - Stack Overflow
No comments:
Post a Comment