Important note:
https://www.cdata.com/blog/transactional-vs-analytical-databases
Database tech is going to be a big factor in performance.
Important note:
https://www.cdata.com/blog/transactional-vs-analytical-databases
Database tech is going to be a big factor in performance.
If I was doing this in R, I’d be doing the not in on just one column. If you had a multicore computer, I’d think you could do this in parallel. I’d guess this is an operation that could run quickly (less than a minute).
There is always a reason for a left join vs an inner join. In the first, you want to keep all rows on the left, in the second, you only want to keep matches. You should never default to either of these choices. But if there is a default, I would personally default to a left join over an inner join, but an outer/full join is probably even better. It’s never good to lose data by accident, which is what an inner join will do.
There are ways to put left join criteria in a where clause without losing the left join. (+) anyone?
I haven’t experienced issues with NOT IN vs IN and I have used tables with that many records.
But to be honest, lots of times people are not using tables with that many records anyway. You should always balance the fastest coding time to the fastest performance time. If you’re optimizing a query that you’re only going to run once and you optimize it so it takes 5 secs instead of 5 minutes, but you spend 15 minutes doing that, what was the point exactly?
This assumes the captain doesn’t have other Scotties giving more honest views of how long something can take.
Like Geordi?
You might think so, but 'tis the C.
I used a right join the other day. Not because I had to, just because I knew how.
You spent so much time wondering if you could…
woot woot
Now time for a full join
From time to time, I use a Cartesian product.
That’s gotta endear you to the DBA
Yknow, when I learned SQL, I learned about 1st normal form, 2nd normal form, etc. These days, in the lake, IT throws normal from out the window.
would expect nothing less from my vantage point up here in the cloud
sorry about the waste of time with normal form but at least you can chat with the DBA over old times when he pings you about your cartesian join
“cloud” is a fancy term for somebody else’s computer
cartesian join
Cartesian Product