Financial institutions must comply with multiple laws and regulations that seek to prevent unfair treatment in any phase of a credit transaction. Those laws include:

a. The Equal Credit Opportunity Act (ECOA) of 1974, which prohibits discrimination on the grounds of race or color, religion, national origin, sex, marital status, age, or receipt of public assistance.

b. The Fair Housing Act of 1968, which prohibits discrimination in all residential real estate-related transactions based on race or color, religion, national origin, sex, familial status, or disability.

The ECOA prohibits lenders not only from considering those "prohibited" characteristics in credit decisions, but also from inquiring about and recording the race, color, religion, national origin, or sex of a credit applicant, with only one exception, which applies to residential real estate loans. Under such a restriction, how can a financial institution meet its obligation to monitor that it is following the law for loans not backed by real estate collateral? How would it prove, without knowing the race or gender of a credit applicant or borrower, that its processes do not result in disparate impact on protected classes? Similarly, any fairness measurement or improvement solution for AI/ML models used in consumer credit is subject to this constraint on data availability.

The answer is through proxy assignments. Race/ethnicity proxies are derived through a methodology known as Bayesian Improved Surname Geocoding (BISG). First developed by the RAND Corporation and applied to patient treatment in healthcare, this approach is now the default method used by banks and regulators alike in their fair lending assessments. BISG relies on frequency-based probability vectors derived from U.S. Census tables for the race/ethnicity categories defined by the U.S. Office of Management and Budget (OMB).
In this talk we will present:
How to calculate those probability vectors
The tradeoffs of using thresholds to assign a proxy race/ethnicity
Efforts to test for accuracy, and the results
Main limitations of the BISG methodology and their implications
Recent work to improve accuracy by adding more conditional probabilities (e.g., including first name frequency tables in addition to surnames)
Finally, we will briefly describe how a proxy method is also used to assign a gender category.
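To make the core idea concrete, here is a minimal sketch of the BISG Bayesian update and a threshold-based proxy assignment. It assumes two inputs that in practice come from U.S. Census tables: P(race | surname) from the Census surname list and P(geography | race) from block-group counts; the frequencies below are purely illustrative, not real Census values.

```python
# Minimal BISG sketch. Inputs are assumed to come from Census tables;
# the numbers in the example are made up for illustration only.

def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and geography evidence via Bayes' rule:
    P(race | surname, geo) is proportional to
    P(race | surname) * P(geo | race), then normalized to sum to 1."""
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    return {race: value / total for race, value in unnormalized.items()}

def assign_proxy(posterior, threshold=0.8):
    """Assign a single proxy category only when the top posterior
    probability clears a confidence threshold; otherwise return None
    (the tradeoff discussed above: coverage vs. assignment accuracy)."""
    race, prob = max(posterior.items(), key=lambda item: item[1])
    return race if prob >= threshold else None

# Illustrative example for a hypothetical surname and census tract:
p_surname = {"hispanic": 0.80, "white": 0.10, "black": 0.05, "api": 0.05}
p_geo = {"hispanic": 0.60, "white": 0.30, "black": 0.05, "api": 0.05}

posterior = bisg_posterior(p_surname, p_geo)
proxy = assign_proxy(posterior)  # geography reinforces the surname evidence
```

Note how the threshold governs the tradeoff: a higher threshold assigns fewer records but with higher confidence, while a lower one covers more records at the cost of more misclassification.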
Dr. Cruz is a consultant, startup advisor and mentor, and former bank executive with 20+ years of corporate experience leading highly skilled teams in predictive modeling and advanced analytics at large U.S. financial institutions, including Truist, Bank of America, Fannie Mae, and Ally Financial.
Her professional experience combines expertise in advanced quantitative techniques—grounded in statistics, engineering, econometrics, and machine learning—with extensive knowledge of risk management in the highly regulated banking industry.
She is also a proven project execution leader delivering data-based solutions, managing internal resources as well as external consultants. Those solutions addressed fraud risk detection, compliance (sanction screening and anti-money laundering monitoring), revenue forecasting, capital planning, credit risk management, and fair lending compliance analytics. In addition, Dr. Cruz led model risk governance and validation functions within model risk management organizations.
In her current role, Dr. Cruz helps organizations develop a sound data analytics strategic roadmap. She also offers model risk governance expertise to financial services organizations facing heightened regulatory expectations.
Dr. Cruz holds a B.S. degree in Electronics Engineering, a Ph.D. from Virginia Tech, and an MBA. She is a certified Financial Risk Manager and a Six Sigma Master Black Belt. She is a member of INFORMS, the Global Association of Risk Professionals (GARP), IEEE, Women in Analytics, and Women in Data. She is also a nonprofit founder, having established a foundation to support STEM scholarships in the Dominican Republic.