Testing Methodologies for Credit Score Models to Identify Statistical Bias toward Protected Classes


White Paper Series, May 2014

Contents: Introduction; Measurement methodology; Data design for study of statistical bias toward protected classes; Statistical bias analysis, unsecured credit (bankcard); Statistical bias analysis, secured credit (first mortgage); Conclusions

Introduction

The Equal Credit Opportunity Act (ECOA), implemented by the Federal Reserve Board's Regulation B (12 CFR 202), prohibits discrimination in any aspect of a credit transaction on the basis of specific population classifications. Protected classes are:

Race
Color
National origin
Marital status
The applicant's exercise, in good faith, of any right under the Consumer Credit Protection Act
Religion
Sex
Age (provided the applicant has the capacity to contract)
The applicant's receipt of income derived from any public assistance program

Credit score models, such as VantageScore 3.0, are mathematical algorithms derived from information in a consumer's credit report to assess whether a consumer is likely to pay their debt obligations within the agreed upon terms. These credit reports, which use data from lenders, other creditors and public records, are primarily based on information regarding an individual's previous use of, and application for, credit. No additional information such as age of the consumer, marital status, employment history, ethnicity, etc., is used in the algorithm.

ECOA concerns would arise with respect to the credit score used in a credit extension transaction if the model unduly favors one group of people over another, even though both groups receive the same score from the same model. Specifically, if a given credit score represents different levels of risk (probability of default) for two similarly situated populations of different ethnicity, then the credit score model is favoring one population over another. The purpose of this paper is to discuss how to appropriately analyze and measure evidence of statistical bias in a credit score that causes disparate impact in a lender's credit extension transaction, and to demonstrate that VantageScore 3.0 reflects no bias toward protected classes, specifically by analyzing ethnic classes.

Measurement Methodology

How to assess if a credit score model reflects statistical bias toward a protected class

The formal definition of a credit score is a measure of risk defined by the probability that a consumer will default on a loan. Default, in this instance, is defined as a consumer being 90 or more days past due (90+ DPD) on an account. Assessing whether a credit score reflects statistical bias requires estimating the probability of default at each credit score (collectively known as "credit score default curves") for each sub-population of consumers and comparing each sub-population's curve to those of all other sub-populations. If there are measurable differences between the sub-populations, the corresponding curves will look decidedly different from one another. In these cases, the credit score model favors or penalizes a particular sub-population, which suggests potential bias or preferential treatment/mistreatment.
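The default curve comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `default_curves`, the record layout, the 25-point grouping, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict

def default_curves(records, band_width=25):
    """Map each sub-population to {band_lower_edge: default_rate}.

    `records` is an iterable of (group, score, defaulted) tuples, where
    `defaulted` is True when the account reached 90+ days past due.
    The grouping width is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # (group, band) -> [defaults, total]
    for group, score, defaulted in records:
        band = (score // band_width) * band_width
        counts[(group, band)][0] += int(defaulted)
        counts[(group, band)][1] += 1
    curves = defaultdict(dict)
    for (group, band), (bad, total) in counts.items():
        curves[group][band] = bad / total
    return {g: dict(sorted(c.items())) for g, c in curves.items()}

# Hypothetical toy data, not taken from the study.
sample = [
    ("A", 552, True), ("A", 560, False), ("A", 610, False), ("A", 615, False),
    ("B", 555, True), ("B", 570, False), ("B", 605, False), ("B", 620, True),
]
curves = default_curves(sample)
print(curves)
```

In this toy data the two groups share the same default rate in the 550 band but differ in the 600 band, which is the kind of gap between curves that the methodology is designed to detect.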

Graphically, such a comparison between biased and unbiased scores would look like the charts below:

[Figure: two panels, "Biased Score" and "Unbiased Score"; vertical axis: Probability of Default (90 Days or More Past Due)]

Example of a Biased Model: In this case, the grey sub-population is defaulting at a much lower rate than the orange sub-population at each credit score value. Thus, the same score does not represent the same risk for the two groups. For example, at a score of 550, the grey population defaults at a 30% rate, whereas the orange population defaults at a 40% rate. A consumer in the grey population with a score of 550 behaves more similarly to a consumer in the orange population at a score of 600, and is being negatively impacted by the credit score.

Example of an Unbiased Model: Here, given the same credit model, all the sub-populations have similar outcomes; thus, there is no statistical bias that favors any sub-population.

Although graphs give a useful visual explanation of bias, they are not conclusive in determining the existence of bias, since they do not formally compare default probabilities. A more rigorous process is required: statistical testing of the default probabilities within each sub-population across the entire credit score range to determine whether there are significant differences.

A formal test to determine if there are differences between sub-population default probabilities is a statistical comparison test called the Chi-Square test for multiple probabilities. To perform this test on VantageScore 3.0, the score range is divided into 25-point bands to ensure that sufficient samples exist for the testing procedure. The initial score band covers every score of 500 or below, since the population in the distribution tail is very sparse. All intervals above 500 are 25-point bands.

In each score band, the Chi-Square comparison tests whether there are differences in default probabilities amongst sub-populations. The test calculates the actual default proportions within each sub-population in the score interval and compares them to the proportion for the whole population in the same interval. If the differences between the sub-population and whole-population proportions are large (i.e., statistically significant, as measured by comparing the test statistic to a critical value), then there is demonstrated measurable bias. If not, then there is no measurable impact. The test is performed across all score bands; if any one band fails the test, then there is a bias implication for the model as a whole.

This test can be represented graphically. The comparison test produces lower and upper thresholds that determine where each sub-population is considered to be within normal population boundaries. If the orange sub-population breaches the lower or upper threshold (grey dashed lines), then there is statistically significant evidence to suggest bias.
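The per-band test can be sketched as follows. The paper does not give its formula, so this example assumes the standard Pearson chi-square statistic for comparing several proportions against the pooled proportion; the function names and the counts are hypothetical, and the constant 11.408 is the critical value reported in the paper's result tables.

```python
CRITICAL_VALUE = 11.408  # critical value reported in the paper's result tables

def chi_square_proportions(defaults, totals):
    """Chi-square statistic comparing each group's default rate to the pooled rate.

    `defaults[i]` and `totals[i]` are the (possibly weighted) number of
    defaulted accounts and of all accounts for sub-population i in one band.
    Assumes the standard Pearson form; the paper does not state its formula.
    """
    pooled = sum(defaults) / sum(totals)
    if pooled in (0.0, 1.0):
        return 0.0  # no variation in outcomes, so nothing to compare
    variance = pooled * (1.0 - pooled)
    return sum((d - n * pooled) ** 2 / (n * variance)
               for d, n in zip(defaults, totals))

def model_shows_bias(bands, critical=CRITICAL_VALUE):
    """Return True if any score band's statistic exceeds the critical value."""
    return any(chi_square_proportions(d, t) > critical for d, t in bands)

# Hypothetical counts for two score bands, three sub-populations each.
bands = [
    ([10, 20, 30], [100, 200, 300]),  # identical 10% default rates
    ([10, 30, 20], [100, 100, 100]),  # 10%, 30% and 20% default rates
]
stats = [chi_square_proportions(d, t) for d, t in bands]
flagged = model_shows_bias(bands)
print(stats, flagged)
```

The first band yields a statistic of essentially zero because all three rates match the pooled rate. The second yields 12.5, which exceeds 11.408, so the hypothetical model would be flagged as a whole even though only one band failed.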

Data Design for Study of Statistical Bias toward Protected Classes

To assess whether VantageScore 3.0 exhibits statistical bias toward protected classes, two products are considered: an unsecured credit product (Bankcard) and a secured credit product (First Mortgage). The study examines ethnic protected class sub-populations, namely the African-American (AOMC) and Hispanic-American (AOHC) populations. One million consumers owning the product in question are randomly selected from data spanning the 2010 to 2012 time frame. To measure ethnicity, each consumer's ZIP Code was also appended to the file for look-up against the US Census Bureau's database.

Ethnicity Weighting

Since a consumer's ethnicity cannot be directly determined, weights are applied to each of the randomly selected consumer credit reports in the sample. This is done by matching ZIP Codes to the US Census Bureau's 2011 American Community Survey [1], which helps identify the ethnic demographic; specifically:

AOMC: proportion of African-American households in the ZIP Code
AOHC: proportion of Hispanic-American households in the ZIP Code
Non-AOMC/Non-AOHC: proportion of households that are neither African-American nor Hispanic-American in the ZIP Code

For each consumer, these proportions are attached to their credit file, and the corresponding credit score and account information are summed to produce population credit score default curves. For example, if a particular ZIP Code has the following weights: 30% African-American, 20% Hispanic-American and 50% Non-African-American/Non-Hispanic-American, then each consumer is weighted 0.3 AOMC, 0.2 AOHC and 0.5 Non-AOMC/Non-AOHC. If the ZIP Code had 100 consumers sampled, then the weights would sum to 30 AOMC, 20 AOHC and 50 Non-AOMC/Non-AOHC, respectively.

[1] US Census Bureau's American Community Survey, 2011: http://www.census.gov/acs/www/
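The weighting example above can be reproduced with a short Python sketch. The function name `weighted_counts`, the group labels, and the number of defaulted consumers are assumptions made for illustration; only the 0.3 / 0.2 / 0.5 weights and the 100-consumer ZIP Code come from the text.

```python
def weighted_counts(consumers):
    """Sum ZIP-based ethnicity weights into per-group totals and defaults.

    `consumers` is an iterable of (weights, defaulted) pairs, where `weights`
    maps each group label to the proportion for the consumer's ZIP Code.
    """
    totals = {}
    defaults = {}
    for weights, defaulted in consumers:
        for group, weight in weights.items():
            totals[group] = totals.get(group, 0.0) + weight
            if defaulted:
                defaults[group] = defaults.get(group, 0.0) + weight
    return defaults, totals

# ZIP Code weights from the worked example in the text.
zip_weights = {"AOMC": 0.3, "AOHC": 0.2, "Non-AOMC/Non-AOHC": 0.5}

# 100 sampled consumers in that ZIP Code; the 10 defaults are hypothetical.
consumers = [(zip_weights, i < 10) for i in range(100)]
defaults, totals = weighted_counts(consumers)

rounded_totals = {g: round(t, 6) for g, t in totals.items()}
print(rounded_totals)
```

The rounded totals come out to 30, 20 and 50, matching the text. Because every consumer in a single ZIP Code carries the same weights, differences between group default curves only emerge once consumers from ZIP Codes with different demographic mixes are combined.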

Statistical Bias Analysis: Unsecured Credit (Bankcard)

Protected Classes: Ethnicity

Using VantageScore 3.0, a graphical comparison shows all three ethnic classes exhibiting essentially the same probability of default, as the credit default curves lie on top of each other. Moreover, all default curves based on VantageScore 3.0 are well contained within the upper and lower acceptable thresholds. Although the curves appear to align, the exact results from the Chi-Square test are needed to provide conclusive evidence that no disparate impacts exist.

[Figure: Disparate Impact: Bankcard Default Profiles by Ethnicity with Confidence Intervals. Vertical axis: Probability of Default (90 Days or More Past Due), 0 to 0.7. Horizontal axis: VantageScore 3.0 Range, 500 to 839. Series: Non-AOMC/AOHC, AOMC, Lower AOMC, Upper AOMC, AOHC, Lower AOHC, Upper AOHC, Overall]

A closer inspection of the graph (see below) shows that default rates in the lower score range (500 to 575) are slightly lower for the AOHC population (Hispanic, grey line) than for the other groups, whereas AOMC (African-American, orange line) default rates lie on top of the overall population default rates. Both ethnic groups are well within their confidence intervals, indicating there are no measurable differences between the groups at each credit score value and the overall population default rates.

[Figure: Disparate Impact: Bankcard Default Profiles by Ethnicity with Confidence Intervals (Score Range 500-575)]

Multiple Comparison Test of Probability to Default for Identifying Statistical Bias in the Credit Score Model toward Ethnic Classes on Unsecured Credit

VantageScore 3.0 interval
Start point   End point   Test Chi-Square   Critical value   Test > critical value? (if "Yes" then Disparate Impact)
350           500         1.048             11.408           NO
501           525         0.424             11.408           NO
526           550         0.821             11.408           NO
551           575         1.879             11.408           NO
576           600         3.581             11.408           NO
601           625         3.744             11.408           NO
626           650         2.265             11.408           NO
651           675         3.543             11.408           NO
676           700         7.545             11.408           NO
701           725         9.682             11.408           NO
726           750         3.239             11.408           NO
751           775         6.821             11.408           NO
776           800         2.932             11.408           NO
801           825         3.729             11.408           NO
826           839         0.808             11.408           NO

Within each VantageScore 3.0 score interval, the Chi-Square test statistics for comparisons of probability of default are below the critical value, 11.408, established as the threshold to determine if statistical bias is evident. In other words, since no test statistic is larger than 11.408, there is no evidence of statistical bias amongst the three sub-populations' default rates across the entire VantageScore 3.0 score range.
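The paper does not explain where the critical value 11.408 comes from. One reading that reproduces it, offered here only as an assumption, is a chi-square distribution with 2 degrees of freedom (three sub-populations minus one) evaluated at a 5% significance level divided across the 15 score bands (a Bonferroni-style adjustment). For 2 degrees of freedom the upper-tail probability is exp(-x/2), so the critical value has a closed form:

```python
import math

def chi2_critical_df2(alpha):
    """Upper-tail critical value of a chi-square with 2 degrees of freedom.

    For df = 2 the survival function is exp(-x / 2), so x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)

n_bands = 15         # number of score intervals in the table
family_alpha = 0.05  # assumed overall significance level, not stated in the paper
critical = chi2_critical_df2(family_alpha / n_bands)
print(round(critical, 3))  # prints 11.408

# Test statistics for unsecured credit, copied from the table above.
unsecured = [1.048, 0.424, 0.821, 1.879, 3.581, 3.744, 2.265, 3.543,
             7.545, 9.682, 3.239, 6.821, 2.932, 3.729, 0.808]
any_band_fails = any(stat > critical for stat in unsecured)
print(any_band_fails)
```

Under this reading, the largest unsecured statistic (9.682, for the 701 to 725 band) would exceed an unadjusted 5% critical value for 2 degrees of freedom (about 5.99) but stays below the adjusted threshold.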

Statistical Bias Analysis: Secured Credit (First Mortgage)

Unlike the unsecured product, the secured product (namely a first mortgage) has an asset value attached to the loan. As a result of the 2008 housing crisis, these asset values have been under extreme stress. Hence, most mortgages originated prior to 2009 have been subject to unprecedented stresses that produced overwhelming factors, unrelated to credit scores, contributing to default behavior.

Data Design

To remove the inherent problems associated with these stress factors, two filters have been established to assess whether the model reflects bias in credit extension transactions for secured products. First, reviewing only mortgages originated after 2009 removes some of the stresses induced by the crisis and the housing bubble. The second filter, price-to-income (PTI) scaled mortgages, represents ability to repay based on an income factor. This scale determines whether the consumer has enough resources to pay off the mortgage. Many of the mortgages that went into default during the housing crisis have been linked to a class of low-documentation or no-documentation loans, which essentially ignored the ability-to-repay requirement.

To identify a sound mortgage, the value of the mortgage is divided by the household's income to produce a PTI ratio for the mortgage. A mortgage is considered sound if it has a PTI of 3 or less. If no actual income information is available on the credit file, the US Census American Community Survey data can be used as a proxy. The survey reports median homeowner household income by ZIP Code. A random sample of 860,000 mortgage consumers from 2009 onwards with sound price-to-income values (PTI ≤ 3) was obtained for this analysis.
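The PTI filter can be sketched as below. The function names and dollar amounts are hypothetical; the threshold of 3 and the fallback to ZIP Code median income follow the description above.

```python
def pti_ratio(mortgage_value, reported_income=None, zip_median_income=None):
    """Price-to-income ratio: mortgage value divided by household income.

    Falls back to the ZIP Code median homeowner household income (used as a
    proxy) when no income is available on the credit file.
    """
    income = reported_income if reported_income else zip_median_income
    if not income:
        raise ValueError("no income information available")
    return mortgage_value / income

def is_sound(mortgage_value, reported_income=None, zip_median_income=None,
             max_pti=3.0):
    """A mortgage counts as sound when its PTI is at or below `max_pti`."""
    return pti_ratio(mortgage_value, reported_income, zip_median_income) <= max_pti

# Hypothetical mortgages, not taken from the study.
with_income = is_sound(240_000, reported_income=80_000)    # PTI = 3.0
with_proxy = is_sound(240_000, zip_median_income=60_000)   # PTI = 4.0
print(with_income, with_proxy)
```

The first mortgage sits exactly at the threshold and is kept; the second relies on the ZIP Code proxy, has a PTI of 4, and would be excluded from the sample.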

Protected Classes: Ethnicity

Again applying VantageScore 3.0, the graphical comparison below shows some initial separation, in terms of default profiles, in the lower credit score range. Yet all credit score default curves are contained within the upper and lower acceptable boundaries, providing evidence that VantageScore 3.0 does not exhibit statistical bias when used in credit extension transactions for sound mortgages.

[Figure: Measurable Bias: First Mortgage Default Profiles by Ethnicity with Confidence Intervals]

Again, a closer inspection of the graph (see below) shows that default rates in the lower score range (500 to 575) are lower for the AOHC population (Hispanic, solid grey line) than for the overall population, while AOMC (African-American, solid orange line) default rates are higher than the overall population. However, both ethnic groups are well within their confidence intervals, indicating there are no measurable differences between the groups at each credit score value and the overall population default rates.

[Figure: Measurable Bias: First Mortgage Default Profiles by Ethnicity with Confidence Intervals (Score Range 500-575)]

Multiple Comparison Test of Probability to Default for Identifying Statistical Bias in the Credit Score Model toward Ethnic Classes on Secured Credit

VantageScore 3.0 interval
Start point   End point   Test Chi-Square   Critical value   Test > critical value? (if "Yes" then Disparate Impact)
350           500         0.450             11.408           NO
501           525         2.102             11.408           NO
526           550         4.651             11.408           NO
551           575         1.420             11.408           NO
576           600         6.325             11.408           NO
601           625         5.606             11.408           NO
626           650         5.819             11.408           NO
651           675         2.111             11.408           NO
676           700         9.261             11.408           NO
701           725         7.618             11.408           NO
726           750         5.111             11.408           NO
751           775         3.993             11.408           NO
776           800         4.500             11.408           NO
801           825         0.568             11.408           NO
826           839         0.943             11.408           NO

Examining the Chi-Square tests for proportions demonstrates that VantageScore 3.0 does not exhibit statistical bias in credit extension transactions for secured lending, since no interval shows a statistically significant difference in probability of default amongst the ethnic classes.

Conclusions

Analyzing credit score bias toward protected classes requires focusing only on the outcomes used in credit decisions. Comparing default rates at each credit score across protected classes ensures that all consumers are properly assessed. The probability of default given a credit score should be consistent amongst all protected classes, although there may be score distribution differences between different protected classes.

VantageScore 3.0 exhibits no statistical bias amongst protected classes.

Ethnicity Study: For both secured and unsecured credit products, there is no evidence of bias toward protected classes when using VantageScore 3.0. In a comparison of one million randomly selected bankcard consumers and 860,000 randomly selected mortgage consumers, there were no discernible differences in the probability of default within each score band when these consumers were overlaid with demographic data. VantageScore 3.0 produces no favorable biases in assessing risk outcomes for either product for any impacted ethnic group.

Call 1-888-202-4025. Visit www.equifax.com/vantagescore
Call 1-888-414-1120. Visit www.experian.com/consumer-information/vantagescore-lenders.html
Call 1-866-922-2100. Visit www.transunion.com/corporate/business/solutions/financialservices/bank_acq_vantage-score.page

VantageScore is a registered trademark of VantageScore Solutions, LLC. © 2014 VantageScore Solutions, LLC. All rights reserved.