SEX DISCRIMINATION PROBLEM

5. Displaying Relationships between Variables

In this section we will use scatterplots to examine the relationship between the dependent variable (starting salary) and each of the four independent variables: seniority, age, education, and previous experience. The relationship with the third variable, gender, will be visualized by using different marking symbols for male and female subjects.

5.1 Why should starting salary be examined on the log scale?
5.2 Scatterplot of log starting salary versus prior education.
5.3 Scatterplot of log starting salary versus seniority.
5.4 Scatterplot of log starting salary versus previous experience.
5.5 Scatterplot of log starting salary versus age.

5.1 Why should starting salary be examined on the log scale?

The analysis of the sex discrimination data in the Two-Sample Problems module was carried out on the original scale of the untransformed salaries, and that scale was suitable there. Nevertheless, the graphical displays of the log-transformed salaries in this section will indicate that the analysis would also be suitable on the log scale.

There are two reasons why starting salary should be examined on the log scale. The first is a consequence of the nature of the relationship between starting salary and the independent variables; the second is a consequence of the assumptions of the linear regression model.

How should salaries depend on variables such as amount of education, experience, and time of hire (seniority)? Most would agree that an additional year of education might be reflected in a percentage increase in beginning salary. Similarly, an additional year of experience would lead, up to a point, to another percentage increase. Percentage increases are multiplicative on the original scale but additive on the log scale, so it is quite natural to apply a log transformation to salary before beginning the regression analysis.

The other reason follows from the assumptions of the linear regression model. The scatterplots of starting salary versus independent variables such as prior experience, education, age, and seniority, displayed in Section 6 of the Two-Sample Problems module, revealed some non-linear patterns. Since in some cases the pattern resembles an exponential curve, the logarithm transformation is expected to make the relationship linear.

The logarithm transformation also helps to compress the data. In general, it tends to pull in the long tail of the distribution on the right and stretch out the left: the higher values are pulled in, and the lower values are more spread out.
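The percentage-increase argument can be checked numerically. The following is a minimal sketch, not part of the original analysis: the base salary, the 5% rate, and the function name salary_with_raises are hypothetical values chosen only for illustration, and the natural logarithm is assumed.

```python
import math


def salary_with_raises(base, pct_per_year, years):
    """Salary after compounding a fixed percentage increase once per year."""
    return base * (1 + pct_per_year) ** years


# Hypothetical figures for illustration only (not taken from the bank data).
base_salary = 4000.0
pct = 0.05

salaries = [salary_with_raises(base_salary, pct, y) for y in range(5)]
log_salaries = [math.log(s) for s in salaries]

# Year-to-year differences on the original scale and on the log scale.
raw_steps = [b - a for a, b in zip(salaries, salaries[1:])]
log_steps = [b - a for a, b in zip(log_salaries, log_salaries[1:])]

print("dollar increase per extra year:", [round(d, 2) for d in raw_steps])
print("log increase per extra year:   ", [round(d, 4) for d in log_steps])
```

On the original scale each additional year adds a larger dollar amount than the one before, which produces a curved pattern. On the log scale every year adds the same amount, log(1 + rate), which is what a straight line requires.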

5.2 Scatterplot of log starting salary versus prior education.

Did the females tend to receive lower starting salaries than similarly educated males? To answer the question, we will obtain a scatterplot of log starting salaries versus the number of years of education for males and females. The scatterplot of starting salary versus education displayed in Section 6.3 of the Two-Sample Problems module revealed a non-linear pattern. The log transformation brings the pattern closer to a straight line. The following plot is a scatterplot of log starting salaries versus education:

[Figure: Scatterplot of Log Starting Salary vs. Education. Vertical axis: Log Starting Salary (in dollars); horizontal axis: Education (in years), ranging from 6 to 18.]

There is a slight upward trend, and there is no compelling reason to rule out a linear trend. The log transformation compressed the starting salaries and made the pattern in the plot approximately linear. The plot also shows that the males tend to be better educated than the females.
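The comparison of "similarly educated" males and females can also be made numerically by averaging log salaries within each education level. The sketch below uses a handful of hypothetical records invented for illustration (they are not the bank data), and the function name mean_log_salary_by_group is likewise an assumption.

```python
import math
from collections import defaultdict

# Hypothetical records: (education in years, gender, starting salary in dollars).
# These values are invented for illustration and are NOT the case-study data.
records = [
    (12, "F", 4800), (12, "F", 5100), (12, "M", 5700), (12, "M", 6000),
    (15, "F", 5400), (15, "M", 6300),
    (16, "F", 5600), (16, "M", 6600),
]


def mean_log_salary_by_group(rows):
    """Return {(education, gender): mean of log(salary)} for the given rows."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for education, gender, salary in rows:
        key = (education, gender)
        sums[key] += math.log(salary)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


means = mean_log_salary_by_group(records)

gaps = {}
for education in sorted({e for e, _, _ in records}):
    # Difference of mean logs; exp(gap) is the ratio of geometric mean salaries.
    gaps[education] = means[(education, "M")] - means[(education, "F")]
    print(education, round(gaps[education], 3), round(math.exp(gaps[education]), 3))
```

A difference on the log scale translates directly into a ratio on the original scale, so a gap of g means that the typical (geometric mean) male salary is exp(g) times the typical female salary at that education level.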

5.3 Scatterplot of log starting salary versus seniority.

Did the bank pay higher starting salaries to men than to women hired at the same time? To answer the question, we will obtain a scatterplot of log starting salaries versus seniority for males and females. Plotting salaries against seniority allows us to compare the salaries of the two gender groups among employees hired at the same time.

[Figure: Scatterplot of Log Starting Salary vs. Seniority. Vertical axis: Log Starting Salary (dollars); horizontal axis: Seniority (in months), ranging from 60 to 100.]

As you can see, the starting salaries of males tend to be higher than the salaries of females hired at the same time. No matter when the clerks were hired, the highest paid employees are males. The situation did not improve for those hired at the end of the three-year period (low seniority); if anything, it worsened, because almost all new male employees received higher salaries than the females. The plot indicates increasing disparity over the period considered.

A slow upward drift of salaries over the study period is discernible in the plot. However, the rate of increase is smaller for females, whose starting salaries appear rather flat. The spread increases over time for both male and female salaries. Several males stand out as having much higher salaries than other employees hired at approximately the same time. There is no compelling reason to rule out a linear trend in the data.

Notice that the plot also shows the change in the gender structure over the time period: most new clerks hired at the end of the period are females.

5.4 Scatterplot of log starting salary versus previous experience.

Did the bank discriminatorily pay higher starting salaries to men than to women with approximately the same previous experience? To answer the question, we will obtain a scatterplot of log starting salaries versus the number of months of prior experience for males and females.

[Figure: Scatterplot of Log Starting Salary vs. Prior Experience. Vertical axis: Log Starting Salary (dollars); horizontal axis: Prior Experience (in months), with axis ticks from -100 to 400.]

It is clear from the plot that the males tend to receive higher salaries than females with the same number of months of prior experience. The plot also shows that male employees tend to have less previous experience than females.

Since only entry-level jobs are being considered, the effect of experience on beginning salary shows diminishing returns. Beginning salaries increase noticeably up to about 80 months of prior experience, but then the relationship seems to level off. For an entry-level position, very large amounts of experience do not correspond to large beginning salaries.

As you can see, there is a curved pattern in the plot. One approach to modeling this relationship would be to use a quadratic curve in the experience variable. We will do this in Section 10 to develop a polynomial regression model. Here we will instead construct another measure of experience such that the relationship between log starting salary and the new variable is approximately linear. We will use this variable in Section 8 in a multiple linear regression model.

We will obtain the new measure of experience by using logs or reciprocals of the experience variable. Trying to take logs of the experience variable results in an immediate problem: some employees have zero months of prior experience, and it is not possible to take the logarithm of zero. A similar difficulty arises when trying to calculate the reciprocal of zero. When zero occurs as a predictor value, it is customary to add a small constant to all of the values before taking logs or reciprocals. What value should be added? The goal is to produce a relationship between log salary and a transformed experience variable that is reasonably modelled by a straight line.

With a computer and SPSS, it is easy to try several different transformations and compare the results. For this data set, 1, 6, and 12 months were added before taking both logs and reciprocals. In each case, the resulting regression equation and the significance of the regression coefficients were considered, together with diagnostics assessing how well the model satisfied the model assumptions. Although the regression results were not very different, the reciprocal of (experience + 12 months) was finally chosen because it produced the best model with respect to nearly all criteria. The following exhibit gives the scatterplot of log starting salary versus this transformed predictor.

[Figure: Scatterplot of Log Starting Salary vs. 1/(Experience+12). Vertical axis: Log Starting Salary (dollars); horizontal axis: 1/(Experience+12), ranging from 0.00 to 0.10.]

As you can see, the relationship is now approximately linear, but with plenty of scatter.
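The search over constants and transformations can be sketched outside SPSS as well. The example below is only an illustration under stated assumptions: the data are synthetic, generated from a reciprocal relationship with the constant 12 so that the procedure has something to find, and the Pearson correlation is used as a simple stand-in for the regression output and diagnostics that the original comparison relied on. The helper name pearson_r is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic data, NOT the bank data: months of prior experience (including
# zero) and a log salary that rises and then levels off, plus a small
# deterministic wobble standing in for scatter.
experience = [0, 3, 6, 12, 24, 36, 60, 80, 120, 200, 300, 380]
log_salary = [
    8.70 - 3.0 / (e + 12) + 0.01 * ((i % 3) - 1)
    for i, e in enumerate(experience)
]

# Try each constant with both transformations. Adding the constant first
# avoids log(0) and 1/0 for employees with no prior experience.
results = {}
for constant in (1, 6, 12):
    logs = [math.log(e + constant) for e in experience]
    reciprocals = [1.0 / (e + constant) for e in experience]
    results[("log", constant)] = pearson_r(logs, log_salary)
    results[("reciprocal", constant)] = pearson_r(reciprocals, log_salary)

for (transform, constant), r in sorted(results.items()):
    print(f"{transform:>10} of (experience + {constant:>2}): r = {r:+.3f}")
```

A correlation close to +1 or -1 suggests that a straight line describes the transformed relationship well. The reciprocal transformations give negative correlations because 1/(experience + constant) decreases as experience increases. In a real analysis the choice should also rest on residual diagnostics, as described above, and not on the correlation alone.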

5.5 Scatterplot of log starting salary versus age.

Did the older employees tend to receive lower starting salaries in our case study? Did the female employees tend to be older than the male employees? To answer the two questions, we will obtain a scatterplot of log starting salaries versus age for males and females.

[Figure: Scatterplot of Log Starting Salary vs. Age. Vertical axis: Log Starting Salary (in dollars); horizontal axis: Age (in months), ranging from 200 to 800.]

As you can see, the older employees tend to receive lower starting salaries. The positions considered in the case study are entry-level clerical jobs, which are usually given to young people with little or no prior job experience. Older applicants have a smaller chance of getting the job, and even if they do, the employer very likely takes advantage of their age by offering them a lower salary. Older employees are also often considered to be slow learners and less willing to take over different job responsibilities when needed.

The plot also shows that the female employees tend to be older than the male employees.

The relationship between log starting salary and age shows a clear nonlinear pattern. We tried several different transformations of salaries to obtain a straight-line pattern, but none of them achieved that goal.