Beginner SQL help

SteveGrondin · May 24, 2023, 10:54pm

Oh, seeing the ‘-’ in SV example makes me think: My example would be insufficient if [Policy] were not “well-behaved” since Policy XXX with 11 units and Policy XXX1 with 1 unit would be indistinguishable. SV would have a problem if negative units were possible and “-” could be a terminal character in [Policy].
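The collision described above is easy to reproduce. Here is a minimal sketch, assuming Python's built-in sqlite3 as a stand-in engine (not the platform used in this thread) and a hypothetical `policies` table:

```python
import sqlite3

# Hypothetical table; only the Policy and Unit column names come from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policies (Policy TEXT, Unit INTEGER)")
conn.executemany(
    "INSERT INTO policies VALUES (?, ?)",
    [("XXX", 11), ("XXX1", 1)],
)

# Without a separator both rows build the same key, 'XXX11'.
no_sep = conn.execute(
    "SELECT COUNT(DISTINCT Policy || Unit) FROM policies"
).fetchone()[0]

# With a separator the keys stay distinct: 'XXX_11' and 'XXX1_1'.
with_sep = conn.execute(
    "SELECT COUNT(DISTINCT Policy || '_' || Unit) FROM policies"
).fetchone()[0]

print(no_sep, with_sep)
```

The separator only helps if it cannot itself appear in [Policy], which is the same caveat raised above for "-".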

knoath · May 24, 2023, 11:11pm

Slight modification -

SELECT LOB,
       COUNT(DISTINCT CASE
           WHEN LOB = 'prop' THEN Policy
           WHEN LOB = 'auto' THEN Policy || '_' || Unit END) AS thecount
FROM table
GROUP BY LOB

This will depend on what your platform uses as a concatenation operator (|| in this case). You may also need to cast the Policy and Unit as string if they are numeric.
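For anyone who wants to try the idea before touching real data, here is a rough sketch run through Python's built-in sqlite3. SQLite is only a stand-in (it happens to use `||` for concatenation), and the `policy_units` table and its rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE policy_units (LOB TEXT, Policy TEXT, Unit INTEGER)")
conn.executemany(
    "INSERT INTO policy_units VALUES (?, ?, ?)",
    [
        ("prop", "P1", 1),
        ("prop", "P1", 2),  # same property policy, counted once
        ("prop", "P2", 1),
        ("auto", "A1", 1),
        ("auto", "A1", 2),  # second unit on the same auto policy
        ("auto", "A1", 2),  # exact duplicate, counted once
        ("auto", "A2", 1),
    ],
)

# Property counts distinct policies; auto counts distinct policy/unit pairs.
# The explicit CAST covers the case where Unit is numeric.
query = """
SELECT LOB,
       COUNT(DISTINCT CASE
           WHEN LOB = 'prop' THEN Policy
           WHEN LOB = 'auto' THEN Policy || '_' || CAST(Unit AS TEXT)
       END) AS thecount
FROM policy_units
GROUP BY LOB
ORDER BY LOB
"""
counts = dict(conn.execute(query).fetchall())
print(counts)  # {'auto': 3, 'prop': 2}
```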

ALivelySedative · May 24, 2023, 11:27pm

This seems to make sense. I wasn’t sure you could condition on the concatenation without making it its own column but I’ll give it a try.

Someone gave me some SQL that almost does what I need and I was trying to edit it accordingly. If I was going from scratch I would’ve just SAS-ed it and saved you lads the trouble.

knoath · May 24, 2023, 11:58pm

It will depend on the platform. On a related note, I once had trouble getting sum(case… to work in Snowflake. I would have to do the case statement in an inner query and the aggregate function in an outer query. It could be that count(distinct case…. has the same problem.

dothemath · May 24, 2023, 11:59pm

Don’t use UNION ALL. Just use UNION.

SteveGrondin · May 25, 2023, 2:48am

In this case, no difference, but in T-SQL, UNION eliminates duplicates, where UNION ALL preserves duplicates.
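A small sketch of that difference, using Python's built-in sqlite3 as a stand-in for T-SQL (the two operators behave the same way in this respect) and two hypothetical one-column tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION removes the duplicate 2 that appears in both tables.
union_rows = sorted(
    r[0] for r in conn.execute("SELECT x FROM a UNION SELECT x FROM b")
)

# UNION ALL simply appends one result to the other and keeps both 2s.
union_all_rows = sorted(
    r[0] for r in conn.execute("SELECT x FROM a UNION ALL SELECT x FROM b")
)

print(union_rows)      # [1, 2, 3]
print(union_all_rows)  # [1, 2, 2, 3]
```

Because UNION has to compare rows to remove duplicates, it does extra work relative to UNION ALL, so the choice affects both the result and the cost.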

ALivelySedative · May 25, 2023, 3:03am

It worked. Got a chance to mess with it tonight.
Knowledge = knowledge+1

Next followup…
Why do I have to reference the logic to build a group in the GROUP BY instead of just being able to reference the group aliases created earlier? That is annoying. The actual dataset is split into individual lines, and grouped into Auto/Prop/Othr within the actual query…I stole the Case/When logic to do this as well. Apparently I have to use that same Case/When logic in the GROUP BY which makes it messier than I feel it should be.

So something like…
Select …,
  Case
    when <auto lines> then 'Auto'
    when <prop lines> then 'Prop'
    else 'Othr'
  End as NewName,
GROUP BY NewName

…doesn’t work, because the last NewName has to be a repetition of the Case/When logic in the Group By. Anything I’m missing so as not to have to actually do that?

dothemath · May 25, 2023, 3:05am

UNION also runs much, much faster. UNION ALL, like LEFT JOIN, should only be used if you know you need it. Don’t make me break out Big-O notation.

knoath · May 25, 2023, 3:12am

Some of the newer platforms let you group by the number of the column - Group by 1,2,3 etc.
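As a rough illustration of grouping by column position, here is a sketch run through Python's built-in sqlite3, which accepts `GROUP BY 1`. Support varies by platform, so treat this as something to test on your own system rather than a guarantee. The `lines` table, its rows, and the 'WC' symbol are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lines (SYMBOL TEXT)")
conn.executemany(
    "INSERT INTO lines VALUES (?)",
    [("HP",), ("DP",), ("AP",), ("BAP",), ("AP",), ("WC",)],
)

# GROUP BY 1 refers to the first column of the SELECT list,
# so the CASE expression is written only once.
query = """
SELECT CASE
           WHEN SYMBOL IN ('DP', 'HP') THEN 'PROP'
           WHEN SYMBOL IN ('AP', 'BAP') THEN 'AUTO'
           ELSE 'OTHR'
       END AS grp,
       COUNT(*) AS n
FROM lines
GROUP BY 1
ORDER BY 1
"""
by_position = conn.execute(query).fetchall()
print(by_position)  # [('AUTO', 3), ('OTHR', 1), ('PROP', 2)]
```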

dothemath · May 25, 2023, 3:33am

Well, the answer to that question is because the developers chose not to include that feature. But, you don’t need to put the whole case statement in the GROUP BY clause. Just the component fields.

SteveGrondin · May 25, 2023, 6:13am

That’s interesting. Is the why a straightforward explanation or too in depth to explain?

ALivelySedative · May 25, 2023, 12:42pm

So instead of this:

SELECT …
  CASE
    WHEN SYMBOL IN ('BSN','CF','DP','FO','FP','HP','IM','LH','MHP','SMP')
      THEN 'PROP'
    WHEN SYMBOL IN ('AP','BAP')
      THEN 'AUTO'
    ELSE 'OTHR'
  END AS PROP_AUTO,
GROUP BY
  , CASE
    WHEN SYMBOL IN ('BSN','CF','DP','FO','FP','HP','IM','LH','MHP','SMP')
      THEN 'PROP'
    WHEN SYMBOL IN ('AP','BAP')
      THEN 'AUTO'
    ELSE 'OTHR'
  END

Do this?:

SELECT …
  CASE
    WHEN SYMBOL IN ('BSN','CF','DP','FO','FP','HP','IM','LH','MHP','SMP')
      THEN 'PROP'
    WHEN SYMBOL IN ('AP','BAP')
      THEN 'AUTO'
    ELSE 'OTHR'
  END AS PROP_AUTO,
GROUP BY SYMBOL

eta: no. this does not work.

SteveGrondin · May 25, 2023, 1:03pm

It may be differences in versions of SQL. I would not expect the second block of code to work in T-SQL.

ALivelySedative · May 25, 2023, 2:11pm

My further digging has uncovered we’re using SQL Server, which doesn’t actually tell me much personally, but I DID discover the order in which each component is processed, and since GROUP BY is processed before SELECT, the alias doesn’t exist yet. I did see an interesting suggestion on using CROSS APPLY to set the alias ahead of time, so you can then use the alias in both the SELECT and GROUP BY components. That seemed neat.

dothemath · May 25, 2023, 2:37pm

As an FYI, SQL Server is Microsoft’s SQL product. Transact SQL (T-SQL) is MS’s version of SQL. SQL Server has become somewhat ubiquitous. On the other hand, my company has moved us to a different platform that doesn’t use T-SQL. The version we use now stinks. It doesn’t even do implicit conversions.

knoath · May 26, 2023, 12:15am

If it’s not too many columns you’re selecting, you could put it all in a subquery, then group -

Select …
  ,Prop_Auto
from
  (SELECT …
     CASE
       WHEN SYMBOL IN ('BSN','CF','DP','FO','FP','HP','IM','LH','MHP','SMP')
         THEN 'PROP'
       WHEN SYMBOL IN ('AP','BAP')
         THEN 'AUTO'
       ELSE 'OTHR'
     END AS PROP_AUTO…
  )dt
group by …,Prop_Auto
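A filled-in sketch of that subquery pattern, run through Python's built-in sqlite3 as a stand-in engine. The `lines` table, its rows, the 'WC' symbol, and the COUNT(*) aggregate are assumptions added for illustration; only the CASE logic comes from the thread. SQLite would also accept the alias directly in GROUP BY, so the point here is just that the derived-table form runs and writes the CASE once:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lines (SYMBOL TEXT)")
conn.executemany(
    "INSERT INTO lines VALUES (?)",
    [("HP",), ("DP",), ("SMP",), ("AP",), ("BAP",), ("WC",)],
)

# The inner query builds the alias; the outer query can then group on it
# because, to the outer query, PROP_AUTO is an ordinary column of dt.
query = """
SELECT PROP_AUTO, COUNT(*) AS n
FROM (
    SELECT CASE
               WHEN SYMBOL IN ('BSN','CF','DP','FO','FP','HP','IM','LH','MHP','SMP')
                   THEN 'PROP'
               WHEN SYMBOL IN ('AP','BAP')
                   THEN 'AUTO'
               ELSE 'OTHR'
           END AS PROP_AUTO
    FROM lines
) dt
GROUP BY PROP_AUTO
ORDER BY PROP_AUTO
"""
grouped = dict(conn.execute(query).fetchall())
print(grouped)  # {'AUTO': 2, 'OTHR': 1, 'PROP': 3}
```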

ALivelySedative · May 26, 2023, 2:43am

I did consider that, but the other way seems easier given what I’m working with.

Sidenote, does anyone know what ‘AQT’ is? Advanced Query Tool? It’s apparently what we use to actually run the SQL stuff. My guess is it’s hilariously outdated like everything else we do, but curious if any outside thoughts.

Vorian_Atreides · May 26, 2023, 2:44pm

Then it should be “Antiquated Query Tool” . . . no?

T-roy · August 1, 2023, 8:35pm

DATA:
SUPPLIER_NAME = BANNER HEALTH SYSTEM
SUPPLIER_LOCATION_NAME = BANNER URGENT CARE_123456789 OTHER STUFF

Trying to delete everything after the “_” character in SUPPLIER_LOCATION_NAME.
Google says to do this:

,[SUPPLIER_NAME]
,LEFT([SUPPLIER_LOCATION_NAME], CHARINDEX('_', [SUPPLIER_LOCATION_NAME]) - 1) as supplier_location_name

But it gives me this error

Msg 537, Level 16, State 2, Line 64
Invalid length parameter passed to the LEFT or SUBSTRING function.

Anyone have an idea to get rid of all the garbage characters after the “_”?

1695814 · August 1, 2023, 8:45pm

I just ask ChatGPT these days. This might or might not be helpful:

The error you are encountering is likely due to some records in the SUPPLIER_LOCATION_NAME column not containing the character ‘_’ (underscore), which causes the CHARINDEX function to return 0. To handle this, you can use a combination of CHARINDEX and CASE statement to check if the underscore exists before applying the LEFT function. Here’s the modified SQL query:

SELECT
    [SUPPLIER_NAME],
    CASE
        WHEN CHARINDEX('_', [SUPPLIER_LOCATION_NAME]) > 0 THEN
            LEFT([SUPPLIER_LOCATION_NAME], CHARINDEX('_', [SUPPLIER_LOCATION_NAME]) - 1)
        ELSE
            [SUPPLIER_LOCATION_NAME]
    END AS supplier_location_name
FROM
    your_table_name;

This query will return the SUPPLIER_NAME column as it is and for the SUPPLIER_LOCATION_NAME column, it will remove everything after the underscore character (‘_’) if it exists. If there is no underscore, it will keep the original value. Replace “your_table_name” with the actual name of your table in the query.
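The same guard can be tried out locally. This is a sketch only: it runs in Python's built-in sqlite3, where `instr` and `substr` stand in for T-SQL's CHARINDEX and LEFT, and the `suppliers` table and the second row (a value with no underscore) are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suppliers (SUPPLIER_NAME TEXT, SUPPLIER_LOCATION_NAME TEXT)"
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?)",
    [
        ("BANNER HEALTH SYSTEM", "BANNER URGENT CARE_123456789 OTHER STUFF"),
        ("BANNER HEALTH SYSTEM", "BANNER MAIN CAMPUS"),  # hypothetical row, no '_'
    ],
)

# instr() returns 0 when '_' is absent, so the CASE only truncates rows
# that actually contain an underscore and passes the rest through unchanged.
query = """
SELECT CASE
           WHEN instr(SUPPLIER_LOCATION_NAME, '_') > 0
               THEN substr(SUPPLIER_LOCATION_NAME, 1,
                           instr(SUPPLIER_LOCATION_NAME, '_') - 1)
           ELSE SUPPLIER_LOCATION_NAME
       END AS supplier_location_name
FROM suppliers
ORDER BY rowid
"""
cleaned = [row[0] for row in conn.execute(query)]
print(cleaned)  # ['BANNER URGENT CARE', 'BANNER MAIN CAMPUS']
```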