Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Evaluation of Models. Niels Landwehr
1 Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Evaluation of Models. Niels Landwehr
2 Learning and Prediction. Classification, regression: the learning problem. Input: training data {(x_1, y_1), ..., (x_m, y_m)}. Output: a model f : X → Y. The model will be used to obtain predictions f(x) = ? for novel test instances x. We have seen several different types of models: linear models, decision trees.
3 Evaluation of Models. Question: after having implemented a learning algorithm, trained a model, etc.: how accurate are our predictions? What exactly do we mean by accurate? How do we calculate / measure / estimate accuracy? We care about the accuracy of predictions when applying the model to unseen, novel test data (not about accuracy on the observable training data). Evaluation of models: estimate the accuracy of predictions of learned models.
4 Evaluation of Models: Assumptions. In order to study the evaluation of models formally, we have to make assumptions about the properties of training and test data. Central assumption: all data are drawn from a fixed (unknown) distribution p(x, y) = p(x) p(y | x), a distribution over instances times a distribution over labels given the instance. Example spam filtering: p(x) is the probability to see instance x, and p(y | x) is the probability to see label y ∈ {Spam, Ok} for x.
5 Evaluation of Models: Assumptions. i.i.d. assumption: examples are independent and identically distributed. Training instances are drawn independently from the distribution p(x, y): (x_i, y_i) ~ p(x, y). Test instances (seen when applying the model) are also drawn independently from this distribution: (x, y) ~ p(x, y). Is this always realistic? In the following, we will always assume i.i.d. data.
6 Loss Functions. We have made assumptions about the instances (x, y) that we will see. A new test instance (x, y) arrives and the model predicts f(x). A loss function ℓ(y, f(x)) defines how good or bad the prediction f(x) is for the instance (x, y). It is non-negative (ℓ(y, y') ≥ 0 for all y, y'), problem specific, and given a priori. Loss functions for classification: zero-one loss, ℓ(y, y') = 0 if y = y' and 1 otherwise; or a class-dependent cost matrix. Loss function for regression: squared loss, ℓ(y, y') = (y − y')².
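The two loss functions above can be sketched in a few lines of Python (the function names are my own, not from the lecture):

```python
def zero_one_loss(y, y_pred):
    """Zero-one loss for classification: 0 if correct, 1 otherwise."""
    return 0.0 if y == y_pred else 1.0

def squared_loss(y, y_pred):
    """Squared loss for regression: (y - y')^2."""
    return (y - y_pred) ** 2

print(zero_one_loss("Spam", "Ok"))   # a misclassification costs 1
print(squared_loss(3.0, 2.5))        # small residuals cost little
```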
7 Evaluating Models: Risk of a Model. Central definition when evaluating models: risk. The risk of a model is the expected loss for a novel test instance (x, y) ~ p(x, y); both the test instance x with label y and the loss ℓ(y, f(x)) are random variables. R(f) = E[ℓ(y, f(x))] = ∫ ℓ(y, f(x)) p(x, y) dx dy. For zero-one loss the risk is also called the error rate; for squared loss it is also called the mean squared error. The main goal when evaluating models is to determine the risk of the model. The risk cannot be determined exactly, because p(x, y) is unknown: it is an estimation problem.
8 Evaluation of Models: Risk Estimate. Estimating the risk from data: if data T = {(x_1, y_1), ..., (x_m, y_m)} sampled from p(x, y) is available, (x_j, y_j) ~ p(x, y), we can estimate the risk by the "empirical risk" R̂(f) = (1/m) Σ_{j=1}^m ℓ(y_j, f(x_j)). Important: which data T to use? The training data (T = L)? Split the available data into L and T? Cross-validation?
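A minimal sketch of the empirical risk computation; the threshold classifier and the tiny sample here are my own stand-ins for f and T:

```python
def empirical_risk(model, data, loss):
    """R_hat(f) = (1/m) * sum over the sample of loss(y_j, f(x_j))."""
    return sum(loss(y, model(x)) for x, y in data) / len(data)

# Hypothetical example: a threshold classifier with zero-one loss.
f = lambda x: 1 if x > 0.5 else 0
zero_one = lambda y, yp: 0.0 if y == yp else 1.0
T = [(0.9, 1), (0.2, 0), (0.7, 0), (0.1, 0)]

print(empirical_risk(f, T, zero_one))  # one mistake on four instances
```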
9 Estimator as a Random Variable. The estimator R̂(f) = (1/m) Σ_{j=1}^m ℓ(y_j, f(x_j)) is a random variable: the instances in T have been drawn randomly, (x_j, y_j) ~ p(x, y). The value of the estimator depends on which instances were drawn, so it is the result of a random process. The estimator has an expected value E[R̂(f)]. The estimator is unbiased if and only if the expectation of the empirical risk equals the true risk.
10 Bias of an Estimator. The estimator R̂(f) is unbiased if and only if E[R̂(f)] = R(f). Otherwise, R̂(f) has a bias: Bias = E[R̂(f)] − R(f). The estimator is optimistic if E[R̂(f)] < R(f), and pessimistic if E[R̂(f)] > R(f).
11 Variance of an Estimator. The estimator R̂(f) has a variance Var[R̂(f)] = E[R̂(f)²] − E[R̂(f)]². The larger the sample T used for computing the estimate, the lower the resulting variance. Variance vs. bias: high variance means a large random component in the empirical risk estimate; a large bias means a systematic error in the empirical risk estimate. (Figure: distributions of the value R̂(f) around the true risk R(f); one panel where bias dominates, one where variance dominates.)
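The effect of sample size on the estimator's variance can be illustrated with a small simulation. The setup is my own toy example: a fixed model with true error rate 0.3 under zero-one loss, so each test example contributes an independent Bernoulli(0.3) loss and R̂(f) is a sample mean.

```python
import random
import statistics

def empirical_risk_sample(m, true_error, rng):
    """Draw one value of R_hat(f) from a test sample of size m."""
    return sum(1 for _ in range(m) if rng.random() < true_error) / m

rng = random.Random(0)
# Repeat the estimation many times to see the estimator's distribution.
small = [empirical_risk_sample(10, 0.3, rng) for _ in range(2000)]
large = [empirical_risk_sample(1000, 0.3, rng) for _ in range(2000)]

# Both estimators are unbiased (means near 0.3), but the estimate based
# on the larger test sample has a far lower variance.
print(statistics.mean(small), statistics.variance(small))
print(statistics.mean(large), statistics.variance(large))
```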
12 Risk Estimate on Training Data. Which set T should we use? First try: the training data. The model f is trained on L = {(x_1, y_1), ..., (x_m, y_m)}, and the empirical risk is measured on the same training data: R̂_L(f) = (1/m) Σ_{j=1}^m ℓ(y_j, f(x_j)). Is this risk estimate an unbiased, optimistic, or pessimistic estimator of the true risk R(f)?
13 Risk Estimate on Training Data. The empirical risk on the training data is an optimistic estimator of the true risk. Consider the empirical risk of all possible models for a fixed L: due to random effects, R̂_L(f) < R(f) holds for some models f, and R̂_L(f) > R(f) for other models f. The learning algorithm chooses a model f with small empirical risk R̂_L(f), so it is likely that R̂_L(f) < R(f) (an optimistic risk estimate).
14 Risk Estimate on Training Data. The empirical risk of the model chosen by the learning algorithm on the training data (the "training error") is an optimistic estimator of the true risk: E[R̂_L(f)] < R(f). The problem is caused by the dependency of the chosen model on the data used for evaluation. Approach to fix the problem: use test data that are independent of the training data.
15 Holdout Testing. Idea: estimate the risk on independent test data. Given data D = {(x_1, y_1), ..., (x_d, y_d)}, split the data into training data L = {(x_1, y_1), ..., (x_m, y_m)} and test data T = {(x_{m+1}, y_{m+1}), ..., (x_d, y_d)}.
16 Holdout Testing. Run the learning algorithm on the data L; this yields a model f_L. Compute the empirical risk R̂_T(f_L) on the test data T. Run the learning algorithm on the data D; this yields a model f_D. Output: the model f_D, using R̂_T(f_L) as an estimator for the true risk of the model f_D.
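The holdout procedure can be sketched as follows. The mean-predicting "learner" and the squared loss are my own stand-ins for an actual learning algorithm; the function names are not from the lecture:

```python
import random

def train_mean(data):
    """Toy 'learner': predict the mean label, ignoring x."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def holdout(data, learn, loss, test_fraction=0.3, seed=0):
    """Split D into L and T, evaluate f_L on T, return f_D and the estimate."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    f_L = learn(train)                                       # model on L only
    risk = sum(loss(y, f_L(x)) for x, y in test) / len(test)  # R_hat_T(f_L)
    f_D = learn(data)                                        # final model on D
    return f_D, risk

squared = lambda y, yp: (y - yp) ** 2
data = [(float(x), 2.0 * x) for x in range(10)]
model, risk_estimate = holdout(data, train_mean, squared)
print(risk_estimate)
```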
17 Holdout Testing: Analysis. Is the estimator R̂_T(f_L) for the risk of the model f_D unbiased, optimistic, or pessimistic?
18 Holdout Testing: Analysis. The estimator R̂_T(f_L) is pessimistic for R(f_D): R̂_T(f_L) is unbiased for R(f_L), but f_L was learned on fewer training examples than f_D, and therefore has a higher risk (in expectation). Still, the estimator R̂_T(f_L) is useful in practice, while the estimator R̂_L(f_L) is usually wildly optimistic (often close to 0). Why do we train and return the model f_D? We return the final model f_D rather than f_L because f_D has a lower risk, and is therefore better.
19 Holdout Testing: Analysis. What are the advantages and disadvantages of choosing the test set T large or small? T should be large to ensure that the risk estimate R̂_T(f_L) has low variance. T should be small to ensure that R̂_T(f_L) has low bias, that is, is not too pessimistic. We need a lot of data to obtain good estimates, so in practice holdout testing is only used when data is plentiful. Cross-validation (next slide) usually gives better results.
20 Cross-Validation. Given data D = {(x_1, y_1), ..., (x_d, y_d)}, split D into n equally sized blocks D_1, ..., D_n with ⋃_{i=1}^n D_i = D and D_i ∩ D_j = ∅ for i ≠ j. Repeat for i = 1, ..., n: learn a model f_i with L_i = D \ D_i, and compute the empirical risk R̂_{D_i}(f_i) on D_i.
21 Cross-Validation. Average the empirical risk estimates from the different test sets D_i: R̄ = (1/n) Σ_{i=1}^n R̂_{D_i}(f_i). Learn the model f_D on the complete data set D. Return the model f_D and the estimator R̄.
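The full cross-validation procedure from these two slides might look like this in Python. The mean-predicting toy learner and squared loss are assumptions for illustration, not the lecture's models:

```python
def learn_mean(data):
    """Toy 'learner': predict the mean label, ignoring x."""
    m = sum(y for _, y in data) / len(data)
    return lambda x: m

def cross_validate(data, learn, loss, n=5):
    """Each block D_i serves once as test set; return f_D and the averaged risk."""
    blocks = [data[i::n] for i in range(n)]  # n disjoint blocks covering D
    risks = []
    for i in range(n):
        train = [ex for j, b in enumerate(blocks) if j != i for ex in b]
        f_i = learn(train)                   # f_i trained on D \ D_i
        test = blocks[i]
        risks.append(sum(loss(y, f_i(x)) for x, y in test) / len(test))
    f_D = learn(data)                        # final model on all of D
    return f_D, sum(risks) / n

squared = lambda y, yp: (y - yp) ** 2
data = [(float(x), float(x)) for x in range(20)]
model, risk = cross_validate(data, learn_mean, squared)
print(risk)
```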
22 Cross-Validation: Analysis. Is the estimator optimistic, pessimistic, or unbiased?
23 Cross-Validation: Analysis. Is the estimator optimistic, pessimistic, or unbiased? The estimator is pessimistic: the models f_i are trained on a fraction (n−1)/n of the overall data, while the model f_D will be trained on all data. (Figure: number of training examples used, cross-validation vs. holdout.)
24 Cross-Validation: Analysis. Bias and variance compared to holdout testing? The variance is lower than for holdout testing: we average over several holdout experiments, which reduces variance, and the estimator is based on all data, because every instance appears as a test instance in some block. The bias is similar to holdout testing and depends on the number of blocks.
25 Example: Regularized Polynomial Regression. Polynomial model f_w(x) = Σ_{i=0}^M w_i x^i. Learn the model from training data {(x_1, y_1), ..., (x_m, y_m)} by minimizing the regularized loss w* = argmin_w Σ_{i=1}^m (f_w(x_i) − y_i)² + λ ‖w‖², here with ln λ = −18. (Figure: learned model vs. true model y(x).)
26 Tuning the Regularization Parameter. We have to determine a good regularization parameter λ. The regularization parameter controls the complexity of the model. (Figure: fitted curves for ln λ = −18 and ln λ = 0.)
27 Tuning the Regularization Parameter. Perform cross-validation for different parameters λ and save the corresponding risk estimates. When training the final model on all of the data, use the parameter λ* that resulted in the smallest risk estimate. The training error is minimal for the unregularized model (λ = 0), but the test error is better for moderate regularization.
28 Tuning the Regularization Parameter. Algorithm: learn a model with the optimal regularization parameter.
Function trainModelOptimalLambda(D):
For λ ∈ {2^−k, 2^−k+1, ..., 2^k−1, 2^k}: determine error(λ) = crossValidation(λ, D), the cross-validation risk estimate for a model with parameter λ on data D.
Set λ* = argmin_λ error(λ).
Learn f_{λ*} = trainModel(λ*, D), a model with parameter λ* on data D.
Output: model f_{λ*}.
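A sketch of trainModelOptimalLambda under an assumed toy model: one-dimensional ridge regression without intercept, which has the closed form w = Σxy / (Σx² + λ). The grid and the cross-validation inner loop follow the slide; everything else is illustrative.

```python
def train_ridge(lam, data):
    """Toy regularized learner: 1-D ridge regression without intercept."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    w = sxy / (sxx + lam)
    return lambda x: w * x

def cv_error(lam, data, n=5):
    """Cross-validation risk estimate error(lambda) for one grid point."""
    blocks = [data[i::n] for i in range(n)]
    total = 0.0
    for i in range(n):
        train = [ex for j, b in enumerate(blocks) if j != i for ex in b]
        f = train_ridge(lam, train)
        total += sum((y - f(x)) ** 2 for x, y in blocks[i]) / len(blocks[i])
    return total / n

def train_model_optimal_lambda(data, k=8):
    """Grid-search lambda in {2^-k, ..., 2^k}, retrain on all of D with the best."""
    grid = [2.0 ** e for e in range(-k, k + 1)]
    best = min(grid, key=lambda lam: cv_error(lam, data))
    return train_ridge(best, data), best

data = [(float(x), 3.0 * float(x)) for x in range(1, 21)]
model, best_lam = train_model_optimal_lambda(data)
print(best_lam, model(1.0))
```

On this noiseless data the smallest λ in the grid wins, as the slide's training-error discussion would predict for data the model can fit exactly.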
29 Estimating the Error of a Model with Tuned Regularization Parameter. How do we estimate the error of the model with tuned regularization parameter λ*? Warning: we cannot simply use the error estimate error(λ*). The parameter λ* was chosen such that error(λ*) is as small as possible; the error estimate error(λ*) is therefore optimistic. Compare with the earlier argument: the training error is optimistic because the model parameters have been chosen based on the training data. Instead, what we need is a nested cross-validation (see next slide).
30 Nested Cross-Validation. Algorithm: risk estimate with tuned regularization parameter.
Function trainAndEvaluateOptimalLambda(D):
Split D into n equally sized blocks D_1, ..., D_n with ⋃_{i=1}^n D_i = D and D_i ∩ D_j = ∅.
For i ∈ {1, ..., n}: learn f_i^{λ*} = trainModelOptimalLambda(D \ D_i) and determine the empirical risk R̂_{D_i}(f_i^{λ*}) on data D_i.
Average the empirical risk estimates: R̄ = (1/n) Σ_{i=1}^n R̂_{D_i}(f_i^{λ*}).
Learn f^{λ*} = trainModelOptimalLambda(D).
Output: model f^{λ*} and risk estimate R̄.
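Nested cross-validation can be sketched by wrapping the tuning procedure inside an outer loop: the inner cross-validation (over a lambda grid) only ever sees the outer training blocks. The one-dimensional ridge toy learner is again my own assumption.

```python
def ridge(lam, data):
    """Toy learner: 1-D ridge without intercept, w = sum(xy) / (sum(x^2) + lam)."""
    w = sum(x * y for x, y in data) / (sum(x * x for x, _ in data) + lam)
    return lambda x: w * x

def cv_error(lam, data, n=3):
    """Inner cross-validation error for one value of lambda."""
    blocks = [data[i::n] for i in range(n)]
    errs = []
    for i in range(n):
        train = [e for j, b in enumerate(blocks) if j != i for e in b]
        f = ridge(lam, train)
        errs.append(sum((y - f(x)) ** 2 for x, y in blocks[i]) / len(blocks[i]))
    return sum(errs) / n

def tune_and_train(data):
    """trainModelOptimalLambda: pick lambda by inner CV, retrain on all data."""
    grid = [2.0 ** e for e in range(-6, 7)]
    return ridge(min(grid, key=lambda l: cv_error(l, data)), data)

def nested_cv(data, n=3):
    """Outer loop: estimate the risk of the whole tune-then-train procedure."""
    blocks = [data[i::n] for i in range(n)]
    risks = []
    for i in range(n):
        train = [e for j, b in enumerate(blocks) if j != i for e in b]
        f_i = tune_and_train(train)          # inner CV sees only the outer train set
        risks.append(sum((y - f_i(x)) ** 2 for x, y in blocks[i]) / len(blocks[i]))
    return tune_and_train(data), sum(risks) / n

data = [(float(x), 2.0 * float(x)) for x in range(1, 16)]
model, risk = nested_cv(data)
print(risk)
```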
31 Evaluation: Summary. We studied the problem of risk estimation: the expected loss on novel test data. The training error is optimistic and cannot be used as a risk estimate; appropriate approaches are holdout testing and cross-validation. Cross-validation is also used to tune hyperparameters such as the regularization parameter. An error estimate for a model with tuned hyperparameters requires nested cross-validation.
More informationReliable region predictions for Automated Valuation Models
Reliable region predictions for Automated Valuation Models Tony Bellotti, Department of Mathematics, Imperial College London Royal Holloway, University of London 29 April 2016 Outline Automated valuation
More informationPortfolio Credit Risk Models
Portfolio Credit Risk Models Paul Embrechts London School of Economics Department of Accounting and Finance AC 402 FINANCIAL RISK ANALYSIS Lent Term, 2003 c Paul Embrechts and Philipp Schönbucher, 2003
More informationMonte-Carlo Planning: Introduction and Bandit Basics. Alan Fern
Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned
More informationHigh Frequency Trading Strategy Based on Prex Trees
High Frequency Trading Strategy Based on Prex Trees Yijia Zhou, 05592862, Financial Mathematics, Stanford University December 11, 2010 1 Introduction 1.1 Goal I am an M.S. Finanical Mathematics student
More informationStat 139 Homework 2 Solutions, Fall 2016
Stat 139 Homework 2 Solutions, Fall 2016 Problem 1. The sum of squares of a sample of data is minimized when the sample mean, X = Xi /n, is used as the basis of the calculation. Define g(c) as a function
More informationMath 140 Introductory Statistics
Math 140 Introductory Statistics Let s make our own sampling! If we use a random sample (a survey) or if we randomly assign treatments to subjects (an experiment) we can come up with proper, unbiased conclusions
More informationSampling and sampling distribution
Sampling and sampling distribution September 12, 2017 STAT 101 Class 5 Slide 1 Outline of Topics 1 Sampling 2 Sampling distribution of a mean 3 Sampling distribution of a proportion STAT 101 Class 5 Slide
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation The likelihood and log-likelihood functions are the basis for deriving estimators for parameters, given data. While the shapes of these two functions are different, they have
More informationHeavy-tailedness and dependence: implications for economic decisions, risk management and financial markets
Heavy-tailedness and dependence: implications for economic decisions, risk management and financial markets Rustam Ibragimov Department of Economics Harvard University Based on joint works with Johan Walden
More informationStatistics for Business and Economics
Statistics for Business and Economics Chapter 7 Estimation: Single Population Copyright 010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 7-1 Confidence Intervals Contents of this chapter: Confidence
More informationAntitrust Notice. Copyright 2010 National Council on Compensation Insurance, Inc. All Rights Reserved.
Antitrust Notice The Casualty Actuarial Society is committed to adhering strictly to the letter and spirit of the antitrust laws. Seminars conducted under the auspices of the CAS are designed solely to
More informationIntroduction to Sequential Monte Carlo Methods
Introduction to Sequential Monte Carlo Methods Arnaud Doucet NCSU, October 2008 Arnaud Doucet () Introduction to SMC NCSU, October 2008 1 / 36 Preliminary Remarks Sequential Monte Carlo (SMC) are a set
More informationMachine Learning in Risk Forecasting and its Application in Low Volatility Strategies
NEW THINKING Machine Learning in Risk Forecasting and its Application in Strategies By Yuriy Bodjov Artificial intelligence and machine learning are two terms that have gained increased popularity within
More information2011 Pearson Education, Inc
Statistics for Business and Economics Chapter 4 Random Variables & Probability Distributions Content 1. Two Types of Random Variables 2. Probability Distributions for Discrete Random Variables 3. The Binomial
More informationA useful modeling tricks.
.7 Joint models for more than two outcomes We saw that we could write joint models for a pair of variables by specifying the joint probabilities over all pairs of outcomes. In principal, we could do this
More information1. You are given the following information about a stationary AR(2) model:
Fall 2003 Society of Actuaries **BEGINNING OF EXAMINATION** 1. You are given the following information about a stationary AR(2) model: (i) ρ 1 = 05. (ii) ρ 2 = 01. Determine φ 2. (A) 0.2 (B) 0.1 (C) 0.4
More information12 The Bootstrap and why it works
12 he Bootstrap and why it works For a review of many applications of bootstrap see Efron and ibshirani (1994). For the theory behind the bootstrap see the books by Hall (1992), van der Waart (2000), Lahiri
More informationAP STATISTICS FALL SEMESTSER FINAL EXAM STUDY GUIDE
AP STATISTICS Name: FALL SEMESTSER FINAL EXAM STUDY GUIDE Period: *Go over Vocabulary Notecards! *This is not a comprehensive review you still should look over your past notes, homework/practice, Quizzes,
More information