I have a large data set I'd like to do some EDA on.
For a given variable, I can run:
SELECT Variable1, COUNT(*) AS counter FROM bigdata GROUP BY Variable1 ORDER BY counter DESC;
But I have to do that for each variable, then look at the results.
Is there any way to write a loop in SQL, Python, or R that gives me, for each variable I'm interested in, the distinct values and their counts? The ideal end data set would be:
Variable1 Variable1counter Variable2 Variable2counter … VariableN VariableNcounter
I do recognize that Variable1 may have 5 distinct values, while Variable2 has 3 and VariableN has only 1 or 2, so the columns would have different lengths.
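To make the target shape concrete, here is a rough sketch in Python of what I have in mind. It uses only the standard library on a tiny made-up sample. The function name value_counts_wide and the sample rows are hypothetical, and it assumes the data fits in memory as a list of dicts, which may not hold for the real table:

```python
from collections import Counter
from itertools import zip_longest


def value_counts_wide(rows, variables):
    """Build one (value, count) column pair per variable, side by side."""
    per_variable = []
    for var in variables:
        counts = Counter(row[var] for row in rows)
        # most_common() orders by count descending, like ORDER BY counter DESC
        per_variable.append(counts.most_common())

    header = []
    for var in variables:
        header += [var, f"{var}counter"]

    table = []
    # zip_longest pads variables with fewer distinct values with (None, None)
    for line in zip_longest(*per_variable, fillvalue=(None, None)):
        table.append([cell for pair in line for cell in pair])
    return header, table


# Hypothetical sample standing in for bigdata
rows = [
    {"Variable1": "a", "Variable2": "x"},
    {"Variable1": "a", "Variable2": "x"},
    {"Variable1": "b", "Variable2": "x"},
    {"Variable1": "c", "Variable2": "y"},
]

header, table = value_counts_wide(rows, ["Variable1", "Variable2"])
print(header)
for line in table:
    print(line)
```

Padding the shorter columns with None is just one way to handle the unequal lengths. A long format with one row per (variable, value, count) might be easier to work with.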
Any ideas?