tl;dr: Which should I use and/or which is more efficient =SUMIF(A$5:A$5555,$D5,B$5:B$5555) or =SUMIF(A:A,$D5,B:B)?
I’m doing a SUMIF in Excel.
Col A has LOB, starting in row 5 and running through row 5555.
Col B has Face Amount over the same rows.
Row 4 is a header.
Cell B3 is =SUM(B5:B5555)…just as a checksum since that’s good information to have.
Then, starting in cell D5, I sum up the face amount by LOB: D5 is “WL,” D6 is “UL,” etc., and E5 is =SUMIF(A$5:A$5555,$D5,B$5:B$5555), copied down column E for each LOB that I have.
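To make the setup concrete, here's a minimal Python sketch of what that SUMIF column computes (the LOB labels and face amounts below are made-up sample data, not the OP's actual figures):

```python
# Column A = LOB, column B = Face Amount; E5 sums B where A matches each LOB in D.
lob = ["WL", "UL", "WL", "Term", "UL"]               # stand-in for A5:A5555
face = [100_000, 250_000, 50_000, 500_000, 75_000]   # stand-in for B5:B5555

def sumif(criteria_range, criterion, sum_range):
    """Conditional sum, like =SUMIF(criteria_range, criterion, sum_range)."""
    return sum(s for c, s in zip(criteria_range, sum_range) if c == criterion)

for label in ["WL", "UL", "Term"]:                   # stand-in for D5, D6, ...
    print(label, sumif(lob, label, face))
```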
I’m looking to make the spreadsheet’s calculation efficiency as good as is reasonable…if efficiency improves by 1% but usability degrades by 50%, that’s not a good tradeoff.
Should I use =SUMIF(A$5:A$5555,$D5,B$5:B$5555) or =SUMIF(A:A,$D5,B:B)?
The problem with =SUMIF(A$5:A$5555… is that you always need to make sure your data range is fully covered by the SUMIF ranges. I’m wondering if =SUMIF(A:A… is horribly worse in calculation efficiency (since it may look at every cell in the column) or if it’s actually better. I have a vague recollection of reading somewhere that Excel actually does better with =SUMIF(A:A… but maybe I’m misremembering.
I’ve started to use Tables frequently when I need SUMIFs, for just that reason. The SUMIF then uses the table column name, so the formula is invariant to the length of the data, but still very efficient. Tables also simplify formula columns by auto-copying them down when the table changes length.
Optimal is to use an external data connection to query the data directly into the table. Then you can just right-click and “Refresh” to update the data. Length/formula columns should all auto-correct.
If it’s the typical “get a file from somewhere/someone” and copy/paste the data into your file, then Tables are easier when the data gets longer with each update. If the data is longer, you can simply paste-value it in, and the table will grow to the new length; formula columns should auto-complete to the end. When the data is shorter, I usually delete (not clear) everything except the header and first data row (to preserve the formula columns), then paste-value my new data in. Since it’s now longer than the table, the table will auto-grow and be ready.
I don’t actually know if it matters, but I generally keep all formula columns to the right of any data only ones; and when I paste-value, I only overwrite the data portion, not the headers.
If you are appending data, just make sure you insert new data into your old data.
If you are replacing data, insert new data into old data, then delete the old data.
(As long as the formats are equal. If not, well, that sucks for you.)
I generally paste the new data a few rows below the old data. Then, including an empty row on top and an empty row on the bottom, I insert the new data into the old data and delete the old data, leaving only the new data.
The empty rows make it easy to tell the old data from the new; without them you won’t know which is which.
Also, SUMPRODUCT((data) * (conditions)), not SUMIF. SUMIF sucks, no matter what people say, because the format is confusing.
For the OP formula (summing the face amounts in col B where the LOB in col A matches): SUMPRODUCT((B$5:B$5555) * (A$5:A$5555=$D5))
(With spaces around the asterisk, since the GoA board renders single stars as italics.)
Also note that SUMPRODUCT does not require the data first and conditions afterward.
It does, however, require those summed cells to be numbers.
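The SUMPRODUCT((conditions) * (data)) pattern works because the comparison yields TRUE/FALSE flags that the multiplication coerces to 1/0, zeroing out non-matching rows before the sum. A minimal Python sketch of that idea, with made-up sample data:

```python
lob = ["WL", "UL", "WL"]   # criteria column (stand-in for A5:A5555)
face = [100, 200, 300]     # data column (stand-in for B5:B5555)

def sumproduct_conditional(criteria_range, criterion, sum_range):
    """Like =SUMPRODUCT((criteria_range=criterion) * (sum_range)).

    (c == criterion) is True/False, which multiplies as 1/0 -- the
    same coercion SUMPRODUCT does. Note that, as the post says, the
    summed values must be numbers; text in the data column would
    break the multiplication.
    """
    return sum((c == criterion) * s for c, s in zip(criteria_range, sum_range))

print(sumproduct_conditional(lob, "WL", face))  # 100*1 + 200*0 + 300*1 = 400
```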
Indeed, my predecessor set up this monstrosity. It works, and correctly at that.
I’ve inherited it and want to make it work better.
I have little to no choice to be doing this in Excel.
But, good point, nonetheless.
But, also, and, on the other hand, if you know how to do something better, why not just do it the right way (or at least a better way) the first time?
Right, this is a “replacing data” sort of thing…what I’ve done in the past is put a border row at the bottom of the data and just inserted/deleted rows above it as necessary. This works really well whenever I do the monthly updates. It doesn’t work so well when the intern or boss-guy tries to do the updating, because they don’t fully consider how the data fits into the process.
…which is why I was considering using the A:A approach since it wouldn’t require having to worry about where the soggy bottom of the data was.
Well, if you’re going to be working on it for a while, I recommend finding a way to identify bottlenecks and to time your changes. There are often just a few formulas causing all your problems, though I guess it depends on a lot of details.
Yes…yes, I do. I have found that the good people of GoActuary and its predecessor are generally very helpful, so I post a lot of them here.
I have a short attention span…and tend to be able to do just one thing at time. I’ll leave it to the archaeologists & paleontologists to sift through the rubble that is my threads and come up with a story of how one man defied expectations and was able to make a career out of such a limited skillset.
It’s not that impressive…just a lot of VLOOKUP, SUMIF, SUMIFS, IF statements, etc.…only in the 10s of thousands of rows…not too bulky yet.