How to compare data in two tables?

In the beginning of this week I got a task from Susantha to modify a stored procedure. After the modification I had to test to compare the result sets before modification and after modification. That procedure only returns a data table. So I had so many test data on another table. Then I wrote a script to go through the script one by one. Actually I wanted to check whether those results are identical. I mean not the table structure. I wanted to perform a data comparison.

DECLARE @COUNTN INT;
DECLARE @COUNTO INT;
DECLARE @COUNTU INT;

CREATE TABLE #TEMPN
    (
        --Columns Here
    )

CREATE TABLE #TEMPO
    (
        --Columns Here
    )

CREATE TABLE #TEMPU
    (
        --Columns Here
    )

INSERT INTO #TEMPN 
    --Select From new table         

INSERT INTO #TEMPO
    --Select From old table     

SELECT  @COUNTU = COUNT(1)
FROM    #TEMPU
SELECT  @COUNTN = COUNT(1)
FROM    #TEMPN
SELECT  @COUNTO = COUNT(1)
FROM    #TEMPO

IF @COUNTN <> @COUNTO 
    BEGIN
        SELECT  *
        FROM    #TEMPN
        SELECT  *
        FROM    #TEMPO
        PRINT 'ERROR' 
    END

IF @COUNTN <> @COUNTU 
    BEGIN
        SELECT  *
        FROM    #TEMPN
        SELECT  *
        FROM    #TEMPO
        PRINT 'ERROR' 
    END

IF @COUNTU <> @COUNTO 
    BEGIN
        SELECT  *
        FROM    #TEMPN
        SELECT  *
        FROM    #TEMPO
        PRINT 'ERROR' 
    END

DROP TABLE #TEMPN
DROP TABLE #TEMPO
DROP TABLE #TEMPU    

This will check whether your data in two tables are identical or not. if there’s any mismatch it will print error and show the both result sets. And there’s another issue on this script if one table returns NULL and the other table returns empty table it will take it as matched. Then I corrected it again.

DECLARE @COUNTN INT;
DECLARE @COUNTO INT;
DECLARE @COUNTU INT;

CREATE TABLE #TEMPN
    (
        --Columns Here
    )

CREATE TABLE #TEMPO
    (
        --Columns Here
    )

CREATE TABLE #TEMPU
    (
        --Columns Here
    )

INSERT INTO #TEMPN 
    --Select From new table         

INSERT INTO #TEMPO
    --Select From old table     

SELECT  @COUNTU = COUNT(1)
FROM    #TEMPU
SELECT  @COUNTN = COUNT(1)
FROM    #TEMPN
SELECT  @COUNTO = COUNT(1)
FROM    #TEMPO

IF @COUNTN <> @COUNTO 
    BEGIN
        SELECT  *
        FROM    #TEMPN
        SELECT  *
        FROM    #TEMPO
        PRINT 'ERROR' 
    END

IF @COUNTN <> @COUNTU 
    BEGIN
        SELECT  *
        FROM    #TEMPN
        SELECT  *
        FROM    #TEMPO
        PRINT 'ERROR' 
    END

IF @COUNTU <> @COUNTO 
    BEGIN
        SELECT  *
        FROM    #TEMPN
        SELECT  *
        FROM    #TEMPO
        PRINT 'ERROR' 
    END

--New Code Block here

IF EXISTS(#TEMPO) BEGIN IF NOT EXISTS(#TEMPN) BEGIN SELECT * FROM #TEMPN SELECT * FROM #TEMPO PRINT 'ERROR' END END IF EXISTS(#TEMPN) BEGIN IF NOT EXISTS(#TEMPO) BEGIN SELECT * FROM #TEMPN SELECT * FROM #TEMPO PRINT 'ERROR' END END

DROP TABLE #TEMPN
DROP TABLE #TEMPO
DROP TABLE #TEMPU

Advertisements

Be careful if you are using CTE!

Previously I wrote a post about CTE (Common Table Expressions). And Common Table Expressions are valid for a single statement only. But single statement can use many CTEs. And statement also should be within same batch.

USE AdventureWorks2012
GO

--Defining CTE
WITH ctePerson 
AS
(
    SELECT * 
    FROM Person.Person
)
--First Statement will run 
SELECT FirstName FROM ctePerson 
--Second statement will occur an error
SELECT FirstName FROM ctePerson

And in my previous post I have mentioned about many CTEs for one statement. And there is another important thing that is be careful when you are naming CTEs. Take a look on below example.

USE tempdb
GO

--Creating two tables
CREATE TABLE cteSampletbl1
(
id int,
name varchar(50)
)
GO

CREATE TABLE cteSampletbl2
(
id int,
name varchar(50)
)
GO

--Populating tables
INSERT INTO cteSampletbl1 VALUES
(1,'Satheeq'),
(2,'Preethi'),
(3,'Dinesh'),
(4,'Angelo')
GO

INSERT INTO cteSampletbl2 VALUES
(1,'Shamil'),
(2,'Hasitha'),
(3,'Abhinandana'),
(4,'Susantha')
GO

/*
    This CTE is defined with existing table name without any error
*/
WITH cteSampletbl2
AS(
    SELECT * FROM cteSampletbl1
)
--This statement uses CTE
SELECT * FROM cteSampletbl2
--This statement uses existing table
SELECT * FROM cteSampletbl2

--Dropping tables
DROP TABLE cteSampletbl1
DROP TABLE cteSampletbl2
GO

Below example it lets us to create a CTE with a name of existing table without any warnings or errors. And it gives priority to CTE in first statement. Then it uses existing table in second statement. So it would be a best practice to use a separate prefix for CTEs. Then take a look into next sample which is taken from Dinesh’s presentation for SQL Server Universe Group.

USE tempdb
GO

--Creating two tables
CREATE TABLE cteSampletbl1
(
id int,
name varchar(50)
)
GO

--Populating tables
INSERT INTO cteSampletbl1 VALUES
(1,'Satheeq'),
(2,'Preethi'),
(3,'Dinesh'),
(4,'Angelo')
GO

--This CTE got error
WITH cteSampletbl1
AS(
    --Same name could not be used as name and in CTE
    SELECT * FROM cteSampletbl1
)
SELECT * FROM cteSampletbl1

--This CTE don’t have any error
WITH cteSampletbl1
AS(
    --full name of object should provide
    SELECT * FROM dbo.cteSampletbl1
)
SELECT * FROM cteSampletbl1

--Dropping tables
DROP TABLE cteSampletbl1
GO

You can’t use same table name as the name of the CTE and inside the CTE.

Common Table Expressions (CTE)

In Stored procedures and in our SQL Statement sometimes we need to use same query again and again. In that case you can use CTE.

USE AdventureWorks2012
GO

--Defining CTE
WITH Empl
AS
(
    SELECT * FROM HumanResources.Employee
)

--Using CTE
SELECT Empl.* FROM Empl
GO

And you can use several CTE’s in one statement.

USE AdventureWorks2012 GO

–Defining CTE

WITH Empl AS ( SELECT * FROM HumanResources.Employee ),

Pers AS ( SELECT * FROM Person.Person )

–Using CTE

SELECT Empl.*, Pers.*

FROM Empl INNER JOIN Pers

ON Empl.BusinessEntityID = Pers.BusinessEntityID

GO

Updated : At the very begining I though that CTE were reusable again and again inside a one batch. But it was a myth. I realized it after posting this post, but forgot to remove that today I removed that.

Ranking in T-SQL

In Queries we need to rank and number records. SQL Server gives you few functions to rank your records.

RANK

Rank we can use  to rank data in normal way. In this ranking if we have two 1 next one will have rank 3. 

Name Marks Rank
John 75 1
Mark 75 1
Steve 64 3
Gates 54 4

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('John',75),('Mark',75),('Steve',64),('Gates',54)
GO

--Ranking data over Score
SELECT *, RANK() OVER (ORDER BY [Score] DESC) AS [RANK] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

This ranks based on the column that you specify by OVER.

DENSE_RANK

Dense Rank is not missing ranks like in normal rank function. It has continuity over ranking.

Name Marks Dense Rank
John 75 1
Mark 75 1
Steve 64 2
Gates 54 3

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('John',75),('Mark',75),('Steve',64),('Gates',54)
GO

--Ranking data over Score
SELECT *, DENSE_RANK() OVER (ORDER BY [Score] DESC) AS [DENSE RANK] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

ROW NUMBER

Row Number don’t worry much about ranking it just give an incremental number to row. It’s simple.

Name Marks Row Number
John 75 1
Mark 75 2
Steve 64 3
Gates 54 4

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('John',75),('Mark',75),('Steve',64),('Gates',54)
GO

--Numbering data over Score
SELECT *, ROW_NUMBER() OVER (ORDER BY [Score] DESC) AS [ROW NUMBER] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

NTile

NTile will separate your recode into the number that you parse. If you want to group your results into 3 groups you can use NTILE(3). If you want to divide your results into 7 you can use NTILE(7).

Name Mark NTile(3)
Dedunu 89 1
Dhananjaya 85 1
Hasitha 75 2
Shamil 74 2
Sanjana 65 3
Anuradha 55 3

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('Dedunu',89),('Dhananjaya',85),('Hasitha',75),('Shamil',74),('Sanjana',65),('Anuradha',55)
GO

--NTile(3) over Score
SELECT *, NTILE(3) OVER (ORDER BY [Score] DESC) AS [NTile(3)] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

If you are going to group them into 4, you will have this result. It always try to round to the lower numbers.

Name Mark NTile(4)
Dedunu 89 1
Dhananjaya 85 1
Hasitha 75 2
Shamil 74 2
Sanjana 65 3
Anuradha 55 4

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('Dedunu',89),('Dhananjaya',85),('Hasitha',75),('Shamil',74),('Sanjana',65),('Anuradha',55)
GO

--NTile(4) over Score
SELECT *, NTILE(4) OVER (ORDER BY [Score] DESC) AS [NTile(4)] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

Then if you are going to group odd numbered records by even or even numbered records by odd, first tile will have an extra record.

Name Mark NTile(5)
Dedunu 89 1
Dhananjaya 85 1
Hasitha 75 2
Shamil 74 3
Sanjana 65 4
Anuradha 55 5

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('Dedunu',89),('Dhananjaya',85),('Hasitha',75),('Shamil',74),('Sanjana',65),('Anuradha',55)
GO

--NTile(5) over Score
SELECT *, NTILE(5) OVER (ORDER BY [Score] DESC) AS [NTile(5)] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

If you are trying to group records by more than records number or same number it will act like ROW_NUMBER function. 

Name Mark NTile(>=6)
Dedunu 89 1
Dhananjaya 85 2
Hasitha 75 3
Shamil 74 4
Sanjana 65 5
Anuradha 55 6

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('Dedunu',89),('Dhananjaya',85),('Hasitha',75),('Shamil',74),('Sanjana',65),('Anuradha',55)
GO

--NTile(6) data over Score
SELECT *, NTILE(6) OVER (ORDER BY [Score] DESC) AS [NTile(6)] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Score]) 
    VALUES ('Dedunu',89),('Dhananjaya',85),('Hasitha',75),('Shamil',74),('Sanjana',65),('Anuradha',55)
GO

--NTile(8) over Score
SELECT *, NTILE(8) OVER (ORDER BY [Score] DESC) AS [NTile(8)] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

PARTITION BY

Lets move to Partition By clause. If you want to categorize your ranks into any other factor, you can use PARTITION BY clause. It lets you to have separate rankings on a factor that you want.

Name Subject Mark Rank
Hasitha Eng 75 1
Anuradha Eng 55 2
Dedunu Math 89 1
Dhananjaya Math 85 2
Sanjana Math 65 3
Shamil Sci 74 1

 

USE [tempdb]
GO

--Creating table Marks
CREATE TABLE [MARKS]
(
    [Name] varchar(25),
    [Subject] varchar(25),
    [Score] int
)
GO

--Inserting values into Mark table
INSERT INTO [MARKS]([Name], [Subject], [Score]) 
    VALUES ('Dedunu','Math',89),('Dhananjaya','Math',85),('Hasitha','Eng',75),('Shamil','Sci',74),('Sanjana','Math',65),('Anuradha','Eng',55)
GO

--Partitioning rank by Suject, and ranking over Score
SELECT *, RANK() OVER (PARTITION BY [Subject] ORDER BY [Score] DESC) AS [RANK] FROM MARKS

--Dropping table
DROP TABLE [MARKS]
GO

Delete using another Table

In this example I create a temp table on AdventureWorks2012 Database, and first I load all the data in HumanResources.Employee to the temp table. Then I delete data using HumanResources.Employee. This command will delete all the records in Tmp which has Gender = ‘M’ in HumanResources.Employee table.

USE [AdventureWorks2012]
GO

--Inserting data from HumanResources.Employee and Person.Person to non existing table
SELECT  *
    INTO Tmp
    FROM [HumanResources].[Employee] 
GO

--DELETE Using another table
DELETE Tmp 
    FROM [HumanResources].[Employee] AS HE, Tmp
    WHERE Tmp.BusinessEntityID = HE.BusinessEntityID
    AND HE.Gender = 'M'

--Retrieving data
SELECT * FROM Tmp
GO

--Dropping Table
DROP TABLE Tmp
GO

You can use inner or outer join to combine those data.

USE [AdventureWorks2012]
GO

--Inserting data from HumanResources.Employee and Person.Person to non existing table
SELECT  *
    INTO Tmp
    FROM [HumanResources].[Employee] 
GO

--DELETE Using another table
DELETE Tmp 
    FROM [HumanResources].[Employee] AS HE
    INNER JOIN Tmp
    ON Tmp.BusinessEntityID = HE.BusinessEntityID
    WHERE HE.Gender = 'M'

--Retrieving data
SELECT * FROM Tmp
GO

--Dropping Table
DROP TABLE Tmp
GO

Both of those statements gives same results. It will delete all the male employees.

Updating from other Tables

If you want to synchronize data from another table only for once, you can use this way. When you are going to update from another table don’t forget to give a condition.

USE [AdventureWorks2012]
GO

--Inserting data from HumanResources.Employee and Person.Person to non existing table
SELECT  *
    INTO Tmp
    FROM [HumanResources].[Employee] 
GO

--Retrieving data
SELECT * FROM Tmp

--Update Using another table
UPDATE Tmp
    SET Tmp.JobTitle = HE.JobTitle + ', updated'
    FROM [HumanResources].[Employee] AS HE
    WHERE Tmp.BusinessEntityID = HE.BusinessEntityID


--Retrieving data
SELECT * FROM Tmp
SELECT * FROM HumanResources.Employee
GO

--Dropping Table
DROP TABLE Tmp
GO

If you don’t specify a condition for update it will update all the rows with one value. For example if you run below code you will see the difference.

USE [AdventureWorks2012]
GO

--Inserting data from HumanResources.Employee and Person.Person to non existing table
SELECT  *
    INTO Tmp
    FROM [HumanResources].[Employee] 
GO

--Retrieving data
SELECT * FROM Tmp

--Update Using another table
UPDATE Tmp
    SET Tmp.JobTitle = HE.JobTitle + ', updated'
    FROM [HumanResources].[Employee] AS HE
    --WHERE Tmp.BusinessEntityID = HE.BusinessEntityID


--Retrieving data
SELECT * FROM Tmp
SELECT * FROM HumanResources.Employee
GO

--Dropping Table
DROP TABLE Tmp
GO

Also you can use complex update statement using conditions.

USE [AdventureWorks2012]
GO

--Inserting data from HumanResources.Employee and Person.Person to non existing table
SELECT  *
    INTO Tmp
    FROM [HumanResources].[Employee] 
GO

--Retrieving data
SELECT * FROM Tmp

--Update Using another table
UPDATE Tmp 
    SET Tmp.JobTitle = HE.JobTitle + ', updated'
    FROM [HumanResources].[Employee] AS HE, Tmp AS T
    WHERE T.BusinessEntityID = HE.BusinessEntityID
    AND HE.Gender = 'M'


--Retrieving data
SELECT * FROM Tmp
SELECT * FROM HumanResources.Employee
GO

--Dropping Table
DROP TABLE Tmp
GO