SQL case whens or where ANDs

epistemus · December 26, 2023, 10:58pm

You have a query pulling with more than one where clause. maybe state, evaldate, and product. Is it computationally easier to pull

select * from table
where state=targetstate
and evaldate<=datecuttoff
and product=targetproduct

or

select * from table
where
(
( case when state=targetstate then 1 else 0 end )
+
( case when evaldate<=datecuttoff then 1 else 0 end )
+
( case when product=targetproduct then 1 else 0 end)
)
= 3

or is this just your mileage may vary and any distinction in runtime is anecdotal?

Till all are one,

Epistemus

magillaG · December 26, 2023, 11:19pm

I would guess the first one.

In any case, computation will not be an issue. Computation usually becomes important only with joins. What matters is how many times the engine must cycle through each table.

And the second one does not read well. And you are still doing all the same evaluations but then converting them to numbers and adding them.

epistemus · December 26, 2023, 11:22pm

You are correct. The syntax isn’t as user friendly to read as WHERE AND, but it’s the same filter. A couple years back a friend recommending using the case when in lieu of some OR statements, because that was easier for the compiler on our 30+ year old db.

Thx

SteveGrondin · December 27, 2023, 2:03am

I am not sure if the idea on CASE is that it stops evaluation of alternatives once a case is satisfied. If that is what really happens in a CASE statement.

SredniVashtar · December 27, 2023, 3:02am

I would randomly expect the ANDs to be faster.

If nothing else, I suspect some DB’s would quit after the first false.

Rastiln · December 27, 2023, 1:04pm

This is my take. I believe in SSMS the former would call it after the first failed WHERE condition, or in the latter case it would evaluate all 3 for each row.

epistemus · December 27, 2023, 3:14pm

This is a learning opportunity for me, what’s “SMSS” ?

Rastiln · December 27, 2023, 3:19pm

SQL Server Management Studio. It’s actually a graphical user interface that many people use to run SQL code. There are many different flavors of SQL. Different sub-languages may handle certain optimizations differently. I’m far from an expert on the topic.

I think that SSMS by default runs on T-SQL which is kind of the “standard base” version of SQL. I’m sure that’s not a perfect explanation. I think that SSMS can work with other languages but not sure if they can run in that GUI or if it just connects to a MySQL database management system, etc. (I’m pretty sure T-SQL predated most SQL flavors, albeit it’s been updated.)

epistemus · December 27, 2023, 3:35pm

Thanks for the explanation. At various times in my career i’ve run sql in toad, oracle programs, Microsoft programs, proc sql in SAS, snowflake, and most recently snowflake through visual studio.

So SSMS would apply to me, if i was writing sql code in the SQL Server Management Studio program, i think, but i’m not so that would not apply to me.

I believe in addition to T-SQL, there’s also PL-SQL which i think is the ORACLE variety. they’ve always been similar, but some you don’t get temp tables, some you can say select top 5 *, others you say select * limit 5.

Rastiln · December 27, 2023, 3:45pm

Fair. I don’t think you said how you’re using SQL and I’m not an authority on the differences anyway. I think in T-SQL, WHEREs and CASE WHENs are evaluated clause by clause and terminate when they’re no longer relevant to continue checking.

I’d assume in general, your first case would be faster or no difference at least.

SteveGrondin · December 27, 2023, 3:50pm

Agree with this about the specific example. Although CASE is being used, all three CASES will still be evaluated.

Rastiln · December 27, 2023, 3:56pm

Yes for sure, in this case the CASE WHENs can’t terminate early because it’s not CASE WHEN X=Y AND Z=A AND…

It must evaluate every one to check the condition.

dr_t_non-fan · December 27, 2023, 7:24pm

Second one looks like shit. Stop that, right now.
Else …

examsarehard · December 28, 2023, 12:28am

The first query will be faster if all 3 columns are indexed. If they’re not, then the first statement will still require a read of the entire table (called a table scan) and it’s really up in the air.

As a rule of thumb, using functions on data in your WHERE statement causes the query planner to be inaccurate, which may cause the planner to choose an inappropriate execution plan. For simple select statements this won’t be a problem, but it can be an issue on more complicated queries.

The first statement is a lot easier to read than the second.