The Data Shizzle: The EXISTS predicate

Today, I've been learning about correlated subqueries. One of the cool new things that I learned today was the EXISTS predicate. I haven't really used it before. But it is similar to the IN predicate. However, after doing some research, IN and EXISTS are quite different from each other especially when it comes from an optimization perspective.

Basically the way EXISTS works is that it accepts a subquery as input and returns true if the subquery returns at least one row, otherwise it returns false. For example:


SELECT custid, companyname

FROM Sales.Customers AS C 

WHERE EXISTS

  (SELECT * 

   FROM Sales.Orders AS O

   WHERE O.custid = C.custid 

     AND O.orderdate = '20070212');

The query returns customers and their company that had orders placed on February 12, 2007.

custid	companyname
5	Customer HGVLZ
66	Customer LHANT

If the subquery did not return anything, then the outer query will not return any customer.

Can be negated

The EXISTS predicate can also be negated. For example:


SELECT custid, companyname

FROM Sales.Customers AS C 

WHERE NOT EXISTS

  (SELECT * 

   FROM Sales.Orders AS O

   WHERE O.custid = C.custid 

     AND O.orderdate = '20070212');

The result will return customers who did not order on February 7, 2012.

custid	companyname
72	Customer AHPOP
58	Customer AHXHT
25	Customer AZJED
18	Customer BSVAR

EXISTS vs IN

As I mentioned earlier, the first question that came to mind when I was looking at the EXISTS, was how is it different from IN? Someone posted the same question at Stack Overflow which produced some interesting discussion http://stackoverflow.com/questions/24929/difference-between-exists-and-in-in-sql. EXISTS predicate is supposed to be faster than the IN. The main purpose of the EXISTS is to well, check if a row does exists. Thus, avoiding the counting stuff such as:

SELECT COUNT(*) FROM Table

I've actually seen a lot of the query above used at work in places where they could have used EXISTS instead. Maybe I can start introducing more EXISTS since it is supposed to be faster and simpler to write. Also the example above has to check the entire table wherein if we use EXISTS, the statement will return true as soon as a row is found.

The Data Shizzle

Thursday, August 6, 2015

The EXISTS predicate

No comments:

Post a Comment