Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Evaluation of Models. Niels Landwehr

1 Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen
Evaluation of Models
Niels Landwehr

2 Learning and Prediction
Classification and regression are learning problems. Input: training data $\{(x_1, y_1), \ldots, (x_m, y_m)\}$. Output: a model $f : X \to Y$. The model will be used to obtain predictions $f(x) = \,?$ for novel test instances $x$. We have seen several different types of models: linear models, decision trees.

3 Evaluation of Models
Question: after having implemented a learning algorithm, trained a model, etc.: how accurate are our predictions? What exactly do we mean by accurate? How do we calculate / measure / estimate accuracy? We care about the accuracy of predictions when applying the model to unseen, novel test data (not about accuracy on the observable training data). Evaluation of models: estimate the accuracy of the predictions of learned models.

4 Evaluation of models: Assumptions
In order to study the evaluation of models formally, we have to make assumptions about the properties of training and test data. Central assumption: all data are drawn from a fixed (unknown) distribution $p(x,y) = p(x)\,p(y \mid x)$, where $p(x)$ is the distribution over instances and $p(y \mid x)$ is the distribution over labels given an instance. Example spam filtering: $p(x)$ is the probability to see email $x$, and $p(y \mid x)$ is the probability to see label $y \in \{\text{Spam}, \text{Ok}\}$ for $x$.

5 Evaluation of models: Assumptions
i.i.d. assumption: examples are independent and identically distributed. Training instances are drawn independently from the distribution $p(x,y)$: $(x_i, y_i) \sim p(x,y)$. Test instances (seen when applying the model) are also drawn independently from this distribution: $(x, y) \sim p(x,y)$. Is this always realistic? In the following, we will always assume i.i.d. data.

6 Loss functions
We have made assumptions about the instances (x, y) that we will see. A new test instance (x, y) arrives and the model predicts f(x). A loss function $\ell(y, f(x))$ defines how good or bad the prediction is: it is the loss of the prediction f(x) for the instance (x, y). It is non-negative, $\ell(y, y') \ge 0$ for all $y, y'$, problem specific, and given a priori. Loss functions for classification: the zero-one loss $\ell(y, y') = 0$ if $y = y'$ and $1$ otherwise, or a class-dependent cost matrix. Loss function for regression: the squared loss $\ell(y, y') = (y - y')^2$.
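To make these definitions concrete, here is a minimal Python sketch of the two loss functions (the function names are illustrative, not part of the lecture):

```python
def zero_one_loss(y, y_pred):
    """Zero-one loss for classification: 0 if the prediction is correct, 1 otherwise."""
    return 0.0 if y == y_pred else 1.0

def squared_loss(y, y_pred):
    """Squared loss for regression: (y - y')^2."""
    return (y - y_pred) ** 2
```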

7 Evaluating models: Risk of a model
The central definition when evaluating models is the risk. The risk of a model is the expected loss for a novel test instance: for $(x, y) \sim p(x,y)$, a test instance $x$ with label $y$ (a random variable), the loss $\ell(y, f(x))$ for the test instance is also a random variable, and $R(f) = E[\ell(y, f(x))] = \int \ell(y, f(x))\, p(x,y)\, dx\, dy$. For zero-one loss the risk is also called the error rate; for squared loss it is also called the mean squared error. The main goal when evaluating models is to determine the risk of the model. The risk cannot be determined exactly, because p(x,y) is unknown: it is an estimation problem.

8 Evaluation of models: Risk estimate
Estimating the risk from data: if data sampled from p(x,y) is available, $T = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ with $(x_i, y_i) \sim p(x,y)$, we can estimate the risk by the "empirical risk" $\hat R_T(f) = \frac{1}{m} \sum_{j=1}^{m} \ell(y_j, f(x_j))$. Important: which data T should we use? The training data (T = L)? Split the available data into L and T? Cross-validation?
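The empirical risk is just the average loss of a fixed model over a sample; a minimal sketch, assuming the loss functions above and a model given as a plain Python callable:

```python
import numpy as np

def empirical_risk(f, sample, loss):
    """Empirical risk: average loss of the model f over sample = [(x_1, y_1), ..., (x_m, y_m)]."""
    return float(np.mean([loss(y, f(x)) for x, y in sample]))
```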

9 Estimator as a random variable
The estimator $\hat R_T(f) = \frac{1}{m} \sum_{j=1}^{m} \ell(y_j, f(x_j))$ is a random variable: the instances in T have been drawn randomly, $(x_j, y_j) \sim p(x,y)$, so which $(x_j, y_j)$ were drawn is random. The value of the estimator depends on the randomly sampled instances, thus it is the result of a random process. The estimator has an expected value $E[\hat R_T(f)]$. The estimator is unbiased if and only if the expectation of the empirical risk equals the true risk.

10 Bias of estimator
The estimator $\hat R(f)$ is unbiased if and only if $E[\hat R(f)] = R(f)$. Otherwise, $\hat R(f)$ has a bias: $\mathrm{Bias} = E[\hat R(f)] - R(f)$. The estimator is optimistic if $E[\hat R(f)] < R(f)$. The estimator is pessimistic if $E[\hat R(f)] > R(f)$.

11 Variance of estimator
The estimator $\hat R(f)$ has a variance $\mathrm{Var}[\hat R(f)] = E[\hat R(f)^2] - E[\hat R(f)]^2$. The larger the sample T used for computing the estimate, the lower the resulting variance. Variance vs. bias: high variance means a large random component in the empirical risk estimate; a large bias means a systematic error in the empirical risk estimate. (Figure: distributions of the value of $\hat R(f)$ around the true risk $R(f)$ when bias dominates vs. when variance dominates.)

12 Risk estimate on training data
Which set T should we use? First try: the training data. Model $f_L$, trained on $L = \{(x_1, y_1), \ldots, (x_m, y_m)\}$. Empirical risk measured on the training data: $\hat R_L(f_L) = \frac{1}{m} \sum_{j=1}^{m} \ell(y_j, f_L(x_j))$, the risk estimated on L. Is this risk estimate an unbiased / optimistic / pessimistic estimator of the true risk $R(f_L)$?

13 Risk estimate on training data
The empirical risk on the training data is an optimistic estimator of the true risk. Consider the empirical risk of all possible models for a fixed training set L: due to random effects it holds for some models f that $\hat R_L(f) < R(f)$ and for other models f that $\hat R_L(f) > R(f)$. The learning algorithm chooses a model $f_L$ with small empirical risk $\hat R_L(f_L)$. It is therefore likely that $\hat R_L(f_L) < R(f_L)$ (an optimistic risk estimate).

14 Risk estimate on training data
The empirical risk of the model chosen by the learning algorithm, measured on the training data (the "training error"), is an optimistic estimator of the true risk: $E[\hat R_L(f_L)] < R(f_L)$. The problem is caused by the dependency of the chosen model on the data used for evaluation. Approach to fix the problem: use test data that are independent of the training data.

15 Holdout-Testing
Idea: estimate the risk on independent test data. Given: data $D = \{(x_1, y_1), \ldots, (x_d, y_d)\}$. Split the data into training data $L = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ and test data $T = \{(x_{m+1}, y_{m+1}), \ldots, (x_d, y_d)\}$.

16 Holdout-Testing
Run the learning algorithm on the training data L; this yields the model $f_L$. Compute the empirical risk $\hat R_T(f_L)$ on the test data T. Run the learning algorithm on all data D; this yields the model $f_D$. Output: the model $f_D$, using $\hat R_T(f_L)$ as the estimator for the true risk of the model $f_D$.
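A minimal sketch of this holdout procedure, assuming a `learn` function that maps a list of (x, y) pairs to a model callable and a `loss` function as above (all names are illustrative):

```python
import numpy as np

def holdout_evaluation(learn, data, loss, test_fraction=0.3, seed=0):
    """Holdout testing: split data into training data L and test data T, learn f_L on L,
    estimate its risk on T, and return the final model f_D trained on all data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_test = int(test_fraction * len(data))
    T = [data[i] for i in idx[:n_test]]   # test data
    L = [data[i] for i in idx[n_test:]]   # training data
    f_L = learn(L)                        # model used only for the risk estimate
    risk_estimate = float(np.mean([loss(y, f_L(x)) for x, y in T]))
    f_D = learn(data)                     # final model, trained on all data
    return f_D, risk_estimate
```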

17 Holdout-Testing: Analysis
Is the estimator $\hat R_T(f_L)$ for the risk of the model $f_D$ unbiased, optimistic, or pessimistic?

18 Holdout-Testing: Analysis
The estimator $\hat R_T(f_L)$ is pessimistic for $R(f_D)$: $\hat R_T(f_L)$ is unbiased for $R(f_L)$, but $f_L$ was learned on fewer training examples than $f_D$ and therefore has a higher risk (in expectation). Still, the estimator $\hat R_T(f_L)$ is useful in practice, while the estimator $\hat R_L(f_L)$ is usually wildly optimistic (often close to 0). Why do we train and return the model $f_D$? We return the final model $f_D$ rather than $f_L$ because $f_D$ has a lower risk (in expectation) and is therefore better.

19 Holdout-Testing: Analysis
What are the advantages and disadvantages of choosing the test set T large or small? T should be large to ensure that the risk estimate $\hat R_T(f_L)$ has low variance. T should be small to ensure that $\hat R_T(f_L)$ has low bias, that is, is not too pessimistic. We need a lot of data in order to obtain good estimates. In practice, holdout testing is only used when data is plentiful. Cross-validation (next slide) usually gives better results.

20 Cross-Validation
Given: data $D = \{(x_1, y_1), \ldots, (x_d, y_d)\}$. Split D into n equally sized blocks $D_1, \ldots, D_n$ with $\bigcup_{i=1}^{n} D_i = D$ and $D_i \cap D_j = \emptyset$ for $i \ne j$. Repeat for $i = 1, \ldots, n$: learn the model $f_i$ on $L_i = D \setminus D_i$, and compute the empirical risk $\hat R_{D_i}(f_i)$ on $D_i$. (Figure: the training examples split into blocks $D_1, D_2, \ldots, D_n$.)

21 Cross-Validation
Average the empirical risk estimates from the different test sets $D_i$: $R = \frac{1}{n} \sum_{i=1}^{n} \hat R_{D_i}(f_i)$. Learn the model $f_D$ on the complete data set D. Return the model $f_D$ and the estimator $R$.
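A minimal sketch of n-fold cross-validation under the same assumptions as the holdout sketch above (a `learn` function and a `loss` function; names are illustrative):

```python
import numpy as np

def cross_validation(learn, data, loss, n_folds=5):
    """n-fold cross-validation: average the test errors over the blocks D_1, ..., D_n,
    then train and return the final model f_D on the complete data set."""
    blocks = np.array_split(np.arange(len(data)), n_folds)
    fold_risks = []
    for i in range(n_folds):
        D_i = [data[j] for j in blocks[i]]                                # held-out test block
        L_i = [data[j] for k in range(n_folds) if k != i for j in blocks[k]]
        f_i = learn(L_i)                                                  # model trained without block i
        fold_risks.append(np.mean([loss(y, f_i(x)) for x, y in D_i]))
    f_D = learn(data)                                                     # final model on all data
    return f_D, float(np.mean(fold_risks))
```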

22 Cross-Validation: Analysis
Is the estimator optimistic / pessimistic / unbiased?

23 Cross-Validation: Analysis
Is the estimator optimistic / pessimistic / unbiased? The estimator is pessimistic: the models $f_i$ are trained on a fraction $(n-1)/n$ of the overall data, while the model $f_D$ will be trained on all data. (Figure: training set sizes in cross-validation vs. holdout testing.)

24 Cross-Validation: Analysis
Bias and variance compared to holdout testing? The variance is lower than for holdout testing: we average over several holdout experiments, which reduces variance, and the estimator is based on all data, because every instance appears as a test instance in some block. The bias is similar to holdout testing; it depends on the number of blocks.

25 Example: regularized polynomial regression
Polynomial model: $f_w(x) = \sum_{i=0}^{M} w_i x^i$. Learn the model by minimizing the regularized loss $w^* = \arg\min_w \sum_{i=1}^{m} (f_w(x_i) - y_i)^2 + \lambda \sum_i w_i^2$, given training data $\{(x_1, y_1), \ldots, (x_m, y_m)\}$. (Figure: training data, learned model, and true model for $\ln \lambda = -18$.)
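A minimal sketch of this example, using the closed-form ridge solution for the regularized least-squares problem (the degree M and the value of λ are illustrative, not the lecture's settings):

```python
import numpy as np

def fit_polynomial_ridge(x, y, M=9, lam=1e-3):
    """Fit f_w(x) = sum_{i=0}^{M} w_i x^i by minimizing
    sum_j (f_w(x_j) - y_j)^2 + lam * sum_i w_i^2 (closed-form ridge solution)."""
    Phi = np.vander(np.asarray(x), M + 1, increasing=True)        # columns x^0, ..., x^M
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ np.asarray(y))
    return lambda x_new: np.vander(np.atleast_1d(x_new), M + 1, increasing=True) @ w
```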

26 Tune regularization parameter
We have to determine a good regularization parameter $\lambda$. The regularization parameter controls the complexity of the model. (Figure: learned models for different amounts of regularization, e.g. $\ln \lambda = -18$ and $\ln \lambda = 0$.)

27 Tune regularization parameter
Perform cross-validation for different parameters $\lambda$ and save the corresponding risk estimates. When training the final model on all of the data, use the parameter $\lambda^*$ that resulted in the smallest risk estimate. The training error is minimal for the unregularized model, but the test error is better for moderate regularization. (Figure: training and test error as functions of the regularization parameter, with the optimum $\lambda^*$ marked.)

28 Tune regularization parameter
Algorithm: learn a model with the optimal regularization parameter.
Function trainModelOptimalLambda(D):
  For $\lambda \in \{2^{-k}, 2^{-k+1}, \ldots, 2^{k-1}, 2^{k}\}$: determine error($\lambda$) = crossValidation($\lambda$, D), the cross-validation risk estimate for a model with parameter $\lambda$ on data D.
  Set $\lambda^* = \arg\min_{\lambda}$ error($\lambda$).
  Learn $f_{\lambda^*}$ = trainModel($\lambda^*$, D), i.e. learn the model with parameter $\lambda^*$ on data D.
  Output: model $f_{\lambda^*}$.
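A minimal sketch of this tuning algorithm, assuming a `learn_with_lambda(lam, data)` function that trains a model for a given λ and a `cv_risk_estimate(lam, data)` helper that returns its cross-validation risk estimate (e.g. built from the cross-validation sketch above); all names are illustrative:

```python
def train_model_optimal_lambda(data, learn_with_lambda, cv_risk_estimate, k=10):
    """Try lambda on a grid {2^-k, ..., 2^k}, keep the value with the smallest
    cross-validation risk estimate, and retrain the final model on all of the data."""
    candidates = [2.0 ** e for e in range(-k, k + 1)]
    errors = {lam: cv_risk_estimate(lam, data) for lam in candidates}
    lam_star = min(errors, key=errors.get)
    return learn_with_lambda(lam_star, data), lam_star
```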

29 Estimating error of model with tuned regularization parameter
How do we estimate the error of the model with the tuned regularization parameter $\lambda^*$? Warning: we cannot simply use the error estimate error($\lambda^*$)! The parameter $\lambda^*$ was chosen such that error($\lambda^*$) is as small as possible; the error estimate error($\lambda^*$) is therefore optimistic. Compare with the earlier argument: the training error is optimistic, because the model parameters have been chosen based on the training data. Instead, what we need is a nested cross-validation (see next slide).

30 Nested Cross-Validation
Algorithm: risk estimate with tuned regularization parameter.
Function trainAndEvaluateOptimalLambda(D):
  Split D into n equally sized blocks $D_1, \ldots, D_n$ with $\bigcup_{i=1}^{n} D_i = D$ and $D_i \cap D_j = \emptyset$.
  For $i \in \{1, \ldots, n\}$: learn $f_i^*$ = trainModelOptimalLambda($D \setminus D_i$), and determine the empirical risk $\hat R_{D_i}(f_i^*)$ on the data $D_i$.
  Average the different empirical risk estimates: $R = \frac{1}{n} \sum_{i=1}^{n} \hat R_{D_i}(f_i^*)$.
  Learn $f^*$ = trainModelOptimalLambda(D).
  Output: model $f^*$ and risk estimate $R$.
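A minimal sketch of the nested procedure, assuming a `tune_and_learn(data)` function that performs the inner cross-validation and returns a tuned model together with its λ (e.g. `train_model_optimal_lambda` above with its other arguments bound); names are illustrative:

```python
import numpy as np

def train_and_evaluate_optimal_lambda(data, tune_and_learn, loss, n_folds=5):
    """Nested cross-validation: the outer loop holds out each block D_i, tunes lambda and
    trains a model on the remaining data via tune_and_learn (which runs its own inner
    cross-validation), and averages the outer test errors into the risk estimate."""
    blocks = np.array_split(np.arange(len(data)), n_folds)
    outer_risks = []
    for i in range(n_folds):
        D_i = [data[j] for j in blocks[i]]
        rest = [data[j] for k in range(n_folds) if k != i for j in blocks[k]]
        f_i, _ = tune_and_learn(rest)        # inner CV and tuning happen inside tune_and_learn
        outer_risks.append(np.mean([loss(y, f_i(x)) for x, y in D_i]))
    f_star, _ = tune_and_learn(data)         # final model with tuned lambda on all data
    return f_star, float(np.mean(outer_risks))
```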

31 Evaluation: Summary
We studied the problem of risk estimation: the expected loss on novel test data. The training error is optimistic and cannot be used as a risk estimate. Appropriate approaches are holdout testing and cross-validation. Cross-validation is also used to tune hyperparameters such as the regularization parameter. The error estimate for a model with tuned hyperparameters requires nested cross-validation.
