Beginner SQL help

Oh, seeing the ‘-’ in SV example makes me think: My example would be insufficient if [Policy] were not “well-behaved” since Policy XXX with 11 units and Policy XXX1 with 1 unit would be indistinguishable. SV would have a problem if negative units were possible and “-” could be a terminal character in [Policy].

Slight modification -

SELECT LOB,
COUNT(DISTINCT CASE
WHEN LOB = ‘prop’ THEN Policy
WHEN LOB = ‘auto’ THEN Policy || ‘_’ || Unit END) AS thecount
FROM table
GROUP BY LOB

This will depend on what your platform uses as a concatenation operator (|| in this case). You may also need to cast the Policy and Unit as string if they are numeric.

This seems to make sense. I wasn’t sure you could condition on the concatenation without making it it’s own column but I’ll give it a try.

Someone gave me some Sql that almost does what I need and was trying to edit it accordingly. If I was going from scratch I would’ve just SAS-ed it and saved you lads the trouble.

It will depend on the platform. On a related note, I once had trouble getting sum(case… to work in Snowflake. I would have to do the case statement in an inner query and the aggregate function in an outer query. It could be that count(distinct case…. has the same problem.

Don’t use UNION ALL, Use just UNION

In this case, no difference, but in T-SQL, UNION eliminates duplicates, where UNION ALL preserves duplicates.

It worked. Got a chance to mess with it tonight.
Knowledge = knowledge+1

Next followup…
Why do I have to reference the logic to build a group in the GROUP BY instead of just being able to reference the group aliases created earlier? That is annoying. The actual dataset is split into individual lines, and grouped into Auto/Prop/Othr within the actual query…I stole the Case/When logic to do this as well. Apparently I have to use that same Case/When logic in the GROUP BY which makes it messier than I feel it should be.

So something like…
Select …,
Case
when ‘auto lines’ Auto
when ‘prop lines’ Prop
else ‘Othr’
End as NewName,
GROUP BY NewName

…doesn’t work. Because the last NewName has to be a repitition of the Case/When logic again in the Group By. Anything I’m missing so as not to have to actually do that?

UNION also runs much, much faster. UNION ALL, like LEFT JOIN, should only be used if you know you need it. Don’t make me break out Big-O notation.

1 Like

Some of the newer platforms let you group by the number of the column - Group by 1,2,3 etc.

Well, the answer to that question is because the developers chose not include that feature. But, you don’t need to put the whole case statement in the GROUP BY clause. Just the component fields.

That’s interesting. Is the why a straightforward explanation or too in depth to explain?

So instead of this:

SELECT …
CASE
WHEN SYMBOL IN (‘BSN’,‘CF’,‘DP’,‘FO’,‘FP’,‘HP’,‘IM’,‘LH’,‘MHP’,‘SMP’)
THEN ‘PROP’
WHEN SYMBOL IN (‘AP’,‘BAP’)
THEN ‘AUTO’
ELSE ‘OTHR’
END AS PROP_AUTO,
GROUP BY
, CASE
WHEN SYMBOL IN (‘BSN’,‘CF’,‘DP’,‘FO’,‘FP’,‘HP’,‘IM’,‘LH’,‘MHP’,‘SMP’)
THEN ‘PROP’
WHEN SYMBOL IN (‘AP’,‘BAP’)
THEN ‘AUTO’
ELSE ‘OTHR’
END

Do this?:

SELECT …
CASE
WHEN SYMBOL IN (‘BSN’,‘CF’,‘DP’,‘FO’,‘FP’,‘HP’,‘IM’,‘LH’,‘MHP’,‘SMP’)
THEN ‘PROP’
WHEN SYMBOL IN (‘AP’,‘BAP’)
THEN ‘AUTO’
ELSE ‘OTHR’
END AS PROP_AUTO,
GROUP BY SYMBOL

eta: no. this does not work.

It may be differences in versions of SQL. I would not expect the second block of code to work in T-SQL.

My further digging has uncovered we’re using SQLServer, which doesn’t actually tell me much personally, but I DID discover the order by which each component is processed, and since GROUP BY is processed before SELECT, the alias doesn’t exist yet. I did see an interesting suggestion on using CROSS APPLY to set the alias ahead of time, so you can then use the alias in both the SELECT and GROUP BY components. That seemed neat.

As an FYI, SQL Server is Microsoft’s SQL product. Transact SQL (T-SQL) is MS’s version of SQL. SQL Server has become somewhat ubiquitous. On the other hand, my company has moved us to a different platform that doesn’t use T-SQL. The version we use now stinks. It doesn’t even do implicit conversions.

If it’s not too many columns you’re selecting, you could put it all in a subquery, then group -

Select …
,Prop_Auto
from
(SELECT …
CASE
WHEN SYMBOL IN (‘BSN’,‘CF’,‘DP’,‘FO’,‘FP’,‘HP’,‘IM’,‘LH’,‘MHP’,‘SMP’)
THEN ‘PROP’
WHEN SYMBOL IN (‘AP’,‘BAP’)
THEN ‘AUTO’
ELSE ‘OTHR’
END AS PROP_AUTO…
)dt
group by …,Prop_Auto

I did consider that, but the other way seems easier given what I’m working with.

Sidenote, does anyone know what ‘AQT’ is? Advanced Query Tool? It’s apparently what we use to actually run the SQL stuff. My guess is it’s hilariously outdated like everything else we do, but curious if any outside thoughts.

Then it should be “Antiquated Query Tool” . . . no?

1 Like

DATA:
SUPPLIER_NAME = BANNER HEALTH SYSTEM
SUPPLIER_LOCATION_NAME = BANNER URGENT CARE_123456789 OTHER STUFF

Trying to get everything after the SUPPLIER_LOCATION_NAME “_” character deleted
Google says to do this:

,[SUPPLIER_NAME]
,LEFT([SUPPLIER_LOCATION_NAME], CHARINDEX(‘_’, [SUPPLIER_LOCATION_NAME]) - 1) as supplier_location_name

But it gives me this error

Msg 537, Level 16, State 2, Line 64
Invalid length parameter passed to the LEFT or SUBSTRING function.

Anyone have an idea to get rid of all the garbage characters after the “_”?

I merely ask Chatgpt these days. This might or might not be helpful:

1 Like