Inferring Race from Name/Location Data

At the CAS Spring Meeting, there were a number of sessions on bias testing and what is coming in terms of regulation. There was mention of a methodology where you don’t need to collect racial data, but can infer it from other characteristics. Does anyone know of a paper or vendor that does this type of thing?


From the discussion it sounded like it was a publicly available thing, but maybe I misunderstood.

I remember there were some known proxies for protected classes. You could Google proxies for protected classes.


There is a vendor that does this. They talked to us about it, but it was too much for our company. I don’t remember the company name, but I will try to find it.


Thanks!

I remember hearing about something similar at the Annual Meeting last fall.

I imagine that if one contacted the presenters, they might be willing to share the details of who they contracted with.

From past life experience, I’d recommend having a chat with your corporate legal folks before going too far down that path if your interest is more than just professional curiosity. It seemed to me at the fall meeting that the folks doing the work had responses to the potential legal can of worms involved, but I don’t recall the details.


Given how quickly regulators are moving to potentially require bias testing, I don’t think the legal arguments for avoiding it will hold water much longer, but I’m sure our legal team will be in the loop as we move forward.


Okay here we go:

Bayesian Improved Surname Geocoding (BISG).

That’s BISG
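
For anyone curious what the calculation looks like, here is a minimal sketch of the Bayes update at the core of BISG, as I understand it: a race/ethnicity distribution given surname is combined with how each group is spread across geographic areas, and the result is renormalized. The function name, group labels, and all numbers below are made up for illustration. A real implementation would pull these probabilities from Census surname and geography tables.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine a surname-based prior with a geography likelihood.

    p_race_given_surname: dict of group -> P(group | surname)
    p_geo_given_race:     dict of group -> P(this area | group)
    Returns a dict of group -> P(group | surname, area).
    """
    # Bayes' rule: posterior is proportional to prior * likelihood.
    unnormalized = {
        group: p_race_given_surname[group] * p_geo_given_race[group]
        for group in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("No group has positive probability for this surname/area.")
    # Renormalize so the probabilities sum to 1.
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical inputs, NOT real Census figures.
surname_prior = {"group_a": 0.6, "group_b": 0.3, "group_c": 0.1}
geo_likelihood = {"group_a": 0.001, "group_b": 0.004, "group_c": 0.002}

posterior = bisg_posterior(surname_prior, geo_likelihood)
```

Note that the output is a probability for each group, not a single label, so anything downstream has to work with weights rather than hard assignments.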

Adding in first name:


I would think insurance would want to specifically AVOID trying to find proxies for race. Either the classification used for underwriting stands on its own or it doesn’t.

I can see where BISG would be helpful for a study specifically looking at racial breakdowns. Outside of certain health effects, which tend to offset, I would think race has little to no effect on insurance in the US. As a proxy for socio-economic status it would have a large effect, but you can get that better from direct financial data.

Thanks!

Regulation appears to be getting ready to move away from ‘banning input variables’ and toward requiring insurers to explicitly test the output of their models to show that they aren’t unfairly charging policyholders from protected classes more. What that means remains to be seen, but my suspicion is that there will be a (small?) list of adjustments you can make for things like driving record, limits/deductibles, car type, etc., and beyond that, the average premium charged will need to be close.

Colorado wants insurers to explicitly show that they’re not discriminating w/ their algorithms, etc.

As insurers generally do not gather racial/ethnic data, they have to impute it somehow.
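
As a rough sketch of how imputed probabilities might feed into an output test: since the imputation gives each policyholder a probability for each group rather than a label, one option is a probability-weighted average premium per group. This is only an assumption about how such a comparison could be set up, not anything a regulator has prescribed, and the function name and numbers are hypothetical. It also does nothing to adjust for driving record, limits, or other allowed factors.

```python
def weighted_mean_premium(premiums, group_probs):
    """Probability-weighted average premium per imputed group.

    premiums:    list of premiums, one per policy
    group_probs: list of dicts (group -> imputed probability), one per policy
    """
    totals = {}
    weights = {}
    for premium, probs in zip(premiums, group_probs):
        for group, p in probs.items():
            # Each policy contributes to every group in proportion
            # to its imputed probability of belonging to that group.
            totals[group] = totals.get(group, 0.0) + p * premium
            weights[group] = weights.get(group, 0.0) + p
    return {g: totals[g] / weights[g] for g in totals if weights[g] > 0}


# Hypothetical data for three policies.
premiums = [1000.0, 1200.0, 800.0]
group_probs = [
    {"group_a": 0.5, "group_b": 0.5},
    {"group_a": 1.0, "group_b": 0.0},
    {"group_a": 0.0, "group_b": 1.0},
]

avg_by_group = weighted_mean_premium(premiums, group_probs)
```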


This is pretty much exactly what I wanted to do a master’s in. Guess I’m about 2 years too late.

Socio-economic status is a major underwriting factor in almost any type of insurance. It has effects across the board. If they allow for racial studies that control for socio-economic and geo-location effects (where applicable) then fine(ish). Trying to control for racial bias without controlling for at least those 2 confounding factors would be stupid.

I’m not trying to argue policy here, just trying to get the data. Personally, if it were up to me, there would be about 10-12 rating factors total for personal lines. The point of insurance is to pool risk, and this endless drive to segment the population more and more is harmful.
