Data Mining Linear and Logistic Regression


07/02/2017

Data Mining Linear and Logistic Regression
Michael Li

Regression

In statistical modelling, regression analysis is a statistical process for estimating the relationships among variables. Regression models are built from data to predict the average you would expect one variable to have, given you know the value of one or more others. Simple linear regression maps one variable onto the mean value of another.

Example: weight-height relation

[Figure: scatter plot of Weight against Height, with the fitted line $y = bx + a$.]

Simple Linear Regression

To find the best values for a and b, simple linear regression uses a method known as ordinary least squares (OLS). Least squares means that the sum of the squared distances between each data point and its associated prediction is minimised. That is, it minimises

$$\sum_{i=1}^{n} \bigl(y_i - (b x_i + a)\bigr)^2$$
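The quantity being minimised can be sketched in a few lines of Python. The heights and weights below are invented for illustration (they are not the data behind the slide's plot), and the function name `sum_squared_error` is an assumption of this sketch.

```python
def sum_squared_error(xs, ys, a, b):
    """Sum of squared distances between each y and its prediction b*x + a."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration only.
heights = [50, 60, 70, 80, 90]
weights = [32, 41, 48, 59, 65]

# Two candidate lines: one close to the data, one far from it.
sse_close = sum_squared_error(heights, weights, a=-10.0, b=0.85)
sse_far = sum_squared_error(heights, weights, a=0.0, b=0.5)

print("SSE for the close line:", sse_close)
print("SSE for the distant line:", sse_far)
```

OLS picks the pair (a, b) for which this sum is smallest, so the closer line scores lower than the distant one.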

Finding a and b

In the case of simple linear regression, a and b can be calculated as follows:

$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad a = \bar{y} - b\,\bar{x}$$

Multiple Regression

With multiple inputs, the general form of linear regression is

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots$$

or, in matrix form, $Y = Xb$. The parameters in $b$ are calculated as

$$b = (X^T X)^{-1} X^T Y$$
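As a rough sketch of both formulas, the code below computes a and b with the closed-form expressions and then solves the normal equations $(X^T X)\,b = X^T Y$ for the multiple-regression case. The data are invented, the function names are assumptions of this sketch, and the small Gaussian-elimination helper stands in for a library routine (a stats package or linear-algebra library would normally do this step).

```python
def fit_simple(xs, ys):
    """Closed-form OLS for y = b*x + a; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def solve(A, v):
    """Solve A x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_multiple(rows, ys):
    """Solve (X^T X) b = X^T Y; a leading 1 in each row gives b_0."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data, invented for illustration only.
heights = [50, 60, 70, 80, 90]
weights = [32, 41, 48, 59, 65]

a, b = fit_simple(heights, weights)
coeffs = fit_multiple([[h] for h in heights], weights)
print("simple regression: a =", a, "b =", b)
print("normal equations:  b0 =", coeffs[0], "b1 =", coeffs[1])
```

With a single input the two routes should agree: `coeffs[0]` plays the role of a and `coeffs[1]` the role of b.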

Stats Packages

Many statistics packages (such as SPSS) offer multiple regression. It assumes there is a linear relationship between the inputs and the output. It is widely used in many fields, for example for trend lines and for the risk of investment.

Logistic Regression

But what if one of the variables is a class, rather than a number? For example, let's say we have data describing height and gender. When we want to predict height from gender, it is easy: just calculate the average height of males and that of females, and that is it. What if you want to predict gender from height?

Logistic Regression

There is no average gender for a given height. It is better to predict the probability of being male (or female) given a height value. One way to do this is to recode the classes, for example Male = 0 and Female = 1. Then you can do a regression.

Linear Class Regression

[Figure: Gender Code against height, with the fitted line y = -0.0277x + 2.686.]

$$P(c \mid x) = bx + a$$

Problems:
Probability values go outside [0, 1].
It violates other assumptions made by linear regression.
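The first problem is easy to see by evaluating the fitted line at a few heights. This sketch uses the coefficients exactly as they appear in the transcribed slide; the heights chosen and the function name `linear_class_score` are assumptions made for illustration.

```python
# Coefficients as they appear on the transcribed slide (Male = 0, Female = 1).
SLOPE = -0.0277
INTERCEPT = 2.686


def linear_class_score(height):
    """Value of the straight-line fit, read as a 'probability' of class 1."""
    return SLOPE * height + INTERCEPT


for height in (20, 70, 90, 120):
    score = linear_class_score(height)
    valid = 0.0 <= score <= 1.0
    print(f"height={height:>3}  score={score:6.3f}  valid probability: {valid}")
```

Towards either end of the height axis the line leaves the interval [0, 1], so its output cannot be read as a probability there.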

There is a Better Way

Leave the class labels as they are (Male, Female, in this case). Calculate a probability based on log odds.

Odds

The odds of an event (being male, for example) are

$$\frac{P(c)}{1 - P(c)}$$

For example, 0.5/0.5 = 1 and 0.75/0.25 = 3. So odds mean "times as probable".
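A minimal sketch of the odds calculation, reproducing the two worked values from the slide. The function name `odds` and the input check are assumptions of this sketch.

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return p / (1.0 - p)


print(odds(0.5))   # an even chance
print(odds(0.75))  # three times as probable as not
```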

Odds and Probability

[Figure: Probability against Odds.]

Odds lack a desirable symmetry, as the odds of male are not the opposite of the odds of female.

Log Odds

Note that ln(x) = -ln(1/x). So we take the log odds and get a function known as the logit:

$$\ln \frac{P(c)}{1 - P(c)}$$

[Figure: Probability against Log Odds (Logit).]
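The symmetry argument can be checked numerically. In this sketch the probabilities 0.75 and 0.25 are chosen only as an example of an event and its complement, and the function names are assumptions.

```python
import math


def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


def logit(p):
    """Log odds of an event with probability p."""
    return math.log(odds(p))


p_male, p_female = 0.75, 0.25  # complementary probabilities, chosen for illustration

# The odds are reciprocals of each other, not opposites.
print("odds:  ", odds(p_male), odds(p_female))

# The log odds are equal in size and opposite in sign.
print("logits:", logit(p_male), logit(p_female))
```

Taking the log turns the reciprocal relationship between the two odds into a sign change, which is the symmetry the odds themselves lack.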

Logistic Regression

Instead of trying to predict $P(c \mid x) = ax + b$, we can predict the log odds given x:

$$\ln \frac{P(c \mid x)}{1 - P(c \mid x)} = ax + b$$

Solving this equation (later...) gives us the logistic regression curve we need.

Logit to Probability

OK, but if I say "The logit of x being male is 0.8", you may not know what I mean. We can get back to probabilities:

$$\ln \frac{P(c \mid x)}{1 - P(c \mid x)} = ax + b$$

$$\frac{P(c \mid x)}{1 - P(c \mid x)} = e^{ax + b}$$

$$P(c \mid x) = \frac{e^{ax + b}}{1 + e^{ax + b}} = \frac{1}{1 + e^{-(ax + b)}}$$
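As a small sketch of the conversion, the code below turns the slide's example logit of 0.8 back into a probability and checks that the logit undoes the conversion. The function names `logistic` and `logit` are assumptions of this sketch.

```python
import math


def logistic(z):
    """Map a logit z = a*x + b back to a probability: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


p = logistic(0.8)
print("a logit of 0.8 corresponds to a probability of about", round(p, 3))
print("converting back gives a logit of about", round(logit(p), 3))
```

So a logit of 0.8 means a probability of roughly 0.69 of being male.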

Finding a and b

All we need to do now is solve the set of equations that result from plugging our data into

$$P(c \mid x) = \frac{1}{1 + e^{-(ax + b)}}$$

But there is a problem: for a given x (height) we don't have a probability measure, we have a 1 or a 0.

Maximum Likelihood

Let's say we want to guess a parameter that predicts a probability (which, in this case, we do, but this is more general...). We can test a candidate value for the parameter using Maximum Likelihood. Likelihood is the reverse of a conditional probability:

$$L(x \mid y) = P(y \mid x)$$

Maximum Likelihood

Tossing a coin: what is the probability distribution of tossing this coin? Assume that we observed 40 heads in 100 tosses. What is the probability of a head?

Maximum Likelihood

Given 40 heads in 100 tosses, compare the candidate values $P(\text{head}) = 0.5$ and $P(\text{head}) = 0.4$ using

$$L(x \mid y) = P(y \mid x)$$
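The comparison can be sketched by computing the likelihood of each candidate directly. The code assumes independent tosses, so the probability of one particular sequence with 40 heads and 60 tails is $p^{40}(1-p)^{60}$; the function name and the grid of candidates are assumptions of this sketch.

```python
def coin_likelihood(p, heads, tosses):
    """Probability of one specific sequence with the given number of heads."""
    return p ** heads * (1.0 - p) ** (tosses - heads)


HEADS, TOSSES = 40, 100

like_fair = coin_likelihood(0.5, HEADS, TOSSES)
like_04 = coin_likelihood(0.4, HEADS, TOSSES)
print("L(0.5) =", like_fair)
print("L(0.4) =", like_04)

# Try every candidate from 0.01 to 0.99 and keep the most likely one.
candidates = [i / 100 for i in range(1, 100)]
best = max(candidates, key=lambda p: coin_likelihood(p, HEADS, TOSSES))
print("most likely candidate:", best)
```

The candidate 0.4 has the higher likelihood, and it is also the best value on the grid, matching the observed proportion of heads.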

Likelihood of a Model

Call our data set D and imagine we want to estimate a single parameter, a. The likelihood of the parameter, given the data, is

$$L(a \mid D) = P(D \mid a)$$

The probability of the data is

$$P(D \mid a) = \prod_{d \in D} p(d \mid a)$$

University of Stirling 2067 CSCU9T6 Information Systems

Likelihood of a Model

The likelihood of a model is a measure of how well the parameters guess at the true distribution, without ever needing to know the true distribution. Note that $P(c \mid x)$ does not appear in the formula, and we don't need to know it. $p(d \mid a)$ is the estimate by the model of the probability of each data point.

Maximum Likelihood Logistic Regression

1. Pick a value for a and b.
2. Plug those values into $P(x) = \frac{1}{1 + e^{-(ax + b)}}$ for every value of x in the data.
3. Find the product of all of these values by multiplying them together.
4. Record that value as the likelihood.
5. Choose better values for a and b and repeat.

Log Likelihood

One more problem to fix... Multiplying many small probabilities together soon suffers from arithmetic underflow: the number is too small to represent or compare. The solution is to take logs and sum, because ln(a) + ln(b) = ln(ab).
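The steps above, combined with the log trick, can be sketched as follows. Several things here are assumptions rather than part of the slides: the data are invented, the "choose better values" step is implemented as plain gradient ascent on the log likelihood, and each data point contributes P(x) if its class is 1 and 1 - P(x) if its class is 0 (the usual Bernoulli form of the likelihood).

```python
import math


def predict(x, a, b):
    """P(class = 1 | x) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(a * x + b)))


def log_likelihood(xs, ys, a, b):
    """Sum of log probabilities the model assigns to the observed classes."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(x, a, b)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def fit(xs, ys, rate=0.1, steps=2000):
    """Gradient ascent on the log likelihood, starting from a = b = 0."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [y - predict(x, a, b) for x, y in zip(xs, ys)]
        a += rate * sum(e * x for e, x in zip(errors, xs)) / n
        b += rate * sum(errors) / n
    return a, b


# Hypothetical centred heights and class codes, invented for illustration.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1]

start_ll = log_likelihood(xs, ys, 0.0, 0.0)
a, b = fit(xs, ys)
final_ll = log_likelihood(xs, ys, a, b)
print("log likelihood at a = b = 0:", start_ll)
print("log likelihood after fitting:", final_ll, "with a =", a, "b =", b)

# Underflow: a product of 2000 probabilities of 0.5 collapses to zero,
# while the sum of their logs is still an ordinary number.
product = 1.0
log_sum = 0.0
for _ in range(2000):
    product *= 0.5
    log_sum += math.log(0.5)
print("product:", product, "sum of logs:", log_sum)
```

Maximising the log likelihood picks the same a and b as maximising the likelihood itself, because the logarithm is an increasing function.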

An example of using logistic regression

Can I get a mortgage with my credit rating?

[Table: Credit score against Result (a binary outcome); the values are not fully recoverable from the transcription.]

[Figure: P(Fail | score) against Credit score.]

Logistic regression

Rule of Ten: A widely used rule of thumb states that logistic regression models give stable values for the explanatory variables if based on a minimum of about 10 events per explanatory variable.

Sampling: As a rule of thumb, sampling controls at a rate of five times the number of cases will produce sufficient control data.

Convergence: In some instances the model may not reach convergence.

In Weka

WEKA tutorial: http://www.cs.ccsu.edu/~markov/weka-tutorial.pdf