Machine Learning Group Homework 3
MSc Business Analytics
Team 9: Alexander Romanenko, Artemis Tomadaki, Justin Leiendecker, Zijun Wei, Reza Brianca Widodo

The Loans_processed.csv file is the dataset we obtained after the pre-processing step, in which the clean-up Python code was used.

1. Import the pre-processed data set in R. Shuffle the records and split them into a training set (20,000 records), a validation set (8,000 records) and a test set (all remaining records).

We first shuffle the data and then split it randomly into a training, a validation and a test set of 20,000, 8,000 and 10,697 records respectively (a 51.7%, 20.7% and 27.6% split).

2. Using a classification tree (look at the C50 library), try to predict with an accuracy greater than (# of repaid loans) / (# of repaid loans + # of charged off loans) if a loan will be repaid. Do you manage to achieve this performance on the validation set? What about the training set?

First, we calculate our threshold, which equals the value of %. This implies that repaid loans constitute % of the sample or, in other words, that charged off loans make up % of the sample.

Next, we build a classification tree with the C50 library on the training set, so that we can later predict, for the validation set, whether a loan is likely to be repaid or not.

Results of the classification tree based on the training set:

Evaluation on training data (20000 cases):
Decision Tree: Size, Errors (13.9%) <<
(a) (b) <- classified as
(a): class Charged Off
(b): class Fully Paid

These results tell us that the tree split the data at one spot: whether the loan_status variable had the value Charged Off or Fully Paid. The error rate is 13.9% (the share of incorrectly classified cases), which corresponds to 2,779 of the 20,000 records used for training.

In other words, our accuracy on the training set is 86.1%, which is higher than our specified threshold.
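The split sizes and the training accuracy quoted above can be checked with a few lines of arithmetic. The sketch below is in Python rather than the R used for the homework, and it shuffles stand-in record IDs instead of the real Loans_processed.csv rows; the function name `shuffle_and_split` and the seed are illustrative assumptions, not part of the original work.

```python
import random

def shuffle_and_split(records, n_train, n_valid, seed=0):
    """Shuffle the records, then cut them into train / validation / test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed only for reproducibility
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]    # all remaining records
    return train, valid, test

# Stand-in record IDs: 20,000 + 8,000 + 10,697 = 38,697 rows in total.
n_total = 20_000 + 8_000 + 10_697
train, valid, test = shuffle_and_split(range(n_total), 20_000, 8_000)

# Percentage of the data in each part, rounded to one decimal.
shares = [round(100 * len(part) / n_total, 1) for part in (train, valid, test)]

# Training accuracy implied by 2,779 errors out of 20,000 cases.
training_accuracy = 1 - 2_779 / 20_000

print(len(train), len(valid), len(test), shares, round(training_accuracy, 3))
```

The shares come out at 51.7%, 20.7% and 27.6%, matching the split reported in question 1.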
Next, we add weak learners to our model in such a way that newer learners pick up the slack of older learners. In this way, we aim to incrementally increase the accuracy of the model. With the C5.0() function, the number of boosting iterations is set through the trials parameter.

Results of boosting:

Evaluation on training data (20000 cases):
Decision Tree: Size, Errors (13.9%) <<
(a) (b) <- classified as
(a): class Charged Off
(b): class Fully Paid

The results we obtain from boosting are exactly the same as before.

Finally, we predict the loan status of the samples in our validation set with the model built on the training set, and measure the accuracy of that prediction by counting how many predicted labels equal the actual labels of the validation set (the loan_status column in the validation table). The predict() function is used for this purpose. The type="class" argument specifies that we want the actual class labels as output, rather than the probability of each class label.

Results of our prediction (counts for Charged Off and Fully Paid) and the obtained confusion matrix: [table values not preserved in the transcription]

According to our prediction, all 8,000 samples of our validation set are predicted to be Fully Paid. In reality, however, only 6,886 of the samples in our validation set are Fully Paid. Our accuracy here is therefore 6,886 / 8,000, or about 86.1%, with an error rate of about 13.9%, which is again higher than our threshold. We can infer from this that the method may not be the best way to find the default cases, since it never predicts a default.
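To make the limitation concrete, the validation figures can be replayed in a short Python sketch (the homework itself was done in R). It assumes, as the text states, that every one of the 8,000 validation loans is predicted Fully Paid and that 6,886 are actually Fully Paid; the helper `confusion_counts` is a hypothetical name introduced for illustration.

```python
def confusion_counts(actual, predicted, positive="Charged Off"):
    """Count true/false positives and negatives, treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Validation set as described in the text: 6,886 Fully Paid out of 8,000.
actual = ["Fully Paid"] * 6_886 + ["Charged Off"] * (8_000 - 6_886)
predicted = ["Fully Paid"] * 8_000  # the tree predicts Fully Paid for every loan

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
sensitivity = tp / (tp + fn)  # share of charged off loans that were caught

print(f"accuracy={accuracy:.4f}, sensitivity={sensitivity:.2f}, missed defaults={fn}")
```

Accuracy is high only because Fully Paid is the majority class; sensitivity for Charged Off is zero, which is what motivates the cost matrices in question 3.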
3. The majority of loans in the data set are being repaid. By default, a classification tree algorithm uses majority votes in the leaf nodes and thus classifies loans in leaf nodes with more than 50% non-defaults as safe. This strategy optimizes the default metric: the number of correctly classified loans. From a business perspective, however, we are interested in identifying loans with a high probability of default, even if the associated data record falls in a leaf node with more than 50% of safe loan samples. R's C50 library contains a cost matrix parameter that allows you to change the optimized metric and thus put more weight on one type of error over the other (e.g., false positives or false negatives). Experiment with different cost matrices to achieve a sensitivity (also known as recall) of approximately 25%, 40% and 50% in your validation set. Also, report the percentage of the loans (n11 / (n11 + n21)) you would recommend to the bank for re-evaluation that were indeed charged off (also known as precision).

Until now we have tried to achieve the highest accuracy. However, this is not our goal here. In business, we are mainly interested in identifying the cases where the risk of default is high, as a default is costly for us. The cost matrix feature of the C5.0 package enables us to reflect this. Note that the columns of the cost matrix represent the actual class and the rows represent the predicted class.

a. A sensitivity of approximately 25% (24.23%) is obtained with the following cost matrix:

Predicted \ Actual    Charged Off    Fully Paid
Charged Off           0              1
Fully Paid            14             0

This matrix gives a precision of 30.75%.
Confusion matrix for validation data: [table values not preserved in the transcription]

b. A sensitivity of approximately 40% (38.77%) is obtained with the following cost matrix:

Predicted \ Actual    Charged Off    Fully Paid
Charged Off           0              1
Fully Paid            30             0

This matrix gives a precision of 27.41%.
Confusion matrix for validation data: [table values not preserved in the transcription]

c. A sensitivity of approximately 50% (51.52%) is obtained with the following cost matrix:

Predicted \ Actual    Charged Off    Fully Paid
Charged Off           0              1
Fully Paid            48             0

This matrix gives a precision of 23.8%.
Confusion matrix for validation data: [table values not preserved in the transcription]
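One way to read such a cost matrix is through the textbook minimum-expected-cost rule: label a loan Charged Off whenever doing so has a lower expected cost than labelling it Fully Paid. The Python sketch below illustrates that rule only. It is not a description of how C5.0 uses costs internally, and it assumes the matrix layout stated above (rows predicted, columns actual) with a cost of 1 for flagging a good loan; the function names are hypothetical.

```python
def cost_threshold(fn_cost, fp_cost=1.0):
    """Default probability above which predicting 'Charged Off' is the cheaper choice.

    fn_cost: cost of predicting Fully Paid when the loan is actually Charged Off.
    fp_cost: cost of predicting Charged Off when the loan is actually Fully Paid.
    """
    return fp_cost / (fp_cost + fn_cost)

def min_cost_label(p_default, fn_cost, fp_cost=1.0):
    """Pick the label with the lower expected misclassification cost."""
    cost_if_predict_paid = fn_cost * p_default               # pay fn_cost if it defaults
    cost_if_predict_charged_off = fp_cost * (1 - p_default)  # pay fp_cost if it is repaid
    if cost_if_predict_charged_off < cost_if_predict_paid:
        return "Charged Off"
    return "Fully Paid"

# The three false-negative costs used in matrices a, b and c.
for name, fn_cost in (("a", 14), ("b", 30), ("c", 48)):
    print(name, round(cost_threshold(fn_cost), 4))

print(min_cost_label(0.10, fn_cost=14))  # 10% default risk, matrix a
print(min_cost_label(0.05, fn_cost=14))  # 5% default risk, matrix a
```

Under this simplified rule, raising the false-negative cost lowers the probability at which a loan is flagged, which is consistent with sensitivity rising from matrix a to matrix c.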
4. Pick a cost parameter matrix that you assess as the most appropriate for identifying loan applications that deserve further examination.

Based on the matrices identified in question 3, we believe that the most appropriate cost matrix for identifying loan applications that deserve further examination is matrix c, for the following reasons. Matrix c has a lower percentage of charged off loans predicted to be fully paid: 6.8% vs 8.5% (for both matrices a and b). The proportion of fully paid loans classified as charged off is higher for matrix c: 23% vs 7.6% (matrix a) and 14.3% (matrix b). Although this means more good loans are misclassified, we need to keep in mind that for a bank the cost of a lost customer (a fully paid loan predicted to default and therefore refused) is much lower than the cost of a defaulted customer. We should therefore give much more weight to the cases where charged off loans are incorrectly classified as good loans. This confirms our initial statement that matrix c is the best option here.

In addition, the results of question 3 show a negative relationship between sensitivity and precision: as sensitivity rises from 24.23% to 38.77% and 51.52%, precision falls from 30.75% to 27.41% and 23.8%.
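The percentages compared above can be approximately reconstructed from the sensitivity and precision reported in question 3. The Python sketch below assumes that both error shares are expressed relative to the 8,000 validation records and that 8,000 - 6,886 = 1,114 of them are actually Charged Off; the function `error_shares` is a hypothetical helper, and rounding to whole loans is an assumption.

```python
N_VALID = 8_000
N_CHARGED_OFF = 8_000 - 6_886  # actual charged off loans in the validation set

# (sensitivity, precision) as reported for cost matrices a, b and c.
reported = {"a": (0.2423, 0.3075), "b": (0.3877, 0.2741), "c": (0.5152, 0.2380)}

def error_shares(sensitivity, precision, positives=N_CHARGED_OFF, total=N_VALID):
    """Return (missed defaults, wrongly flagged good loans) as shares of all records."""
    tp = round(sensitivity * positives)  # charged off loans correctly flagged
    fn = positives - tp                  # charged off loans predicted Fully Paid
    fp = round(tp / precision) - tp      # fully paid loans predicted Charged Off
    return fn / total, fp / total

for name, (sens, prec) in reported.items():
    fn_share, fp_share = error_shares(sens, prec)
    print(f"matrix {name}: missed defaults {fn_share:.1%}, flagged good loans {fp_share:.1%}")
```

Under these assumptions the derived values agree with the quoted 7.6%, 14.3% and 23% for flagged good loans and with 8.5% (matrix b) and 6.8% (matrix c) for missed defaults. For matrix a the derived share of missed defaults comes out near 10.5% rather than 8.5%, so that input is worth re-checking; the ranking in favour of matrix c is unaffected.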
5. Evaluate the performance of your cost parameter matrix on the test set.

Confusion matrix based on our test data set: [table values not preserved in the transcription]

The sensitivity is 49.22% and the precision is 22.69%. To summarise, these results are somewhat disappointing, as the test set shows a misclassification rate of 31.5%. However, charged off clients incorrectly classified as fully paid account for only 7.3%, which, based on the discussion in question 4, is a relatively low figure.
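As a rough consistency check, the misclassification rate can be recomputed from the other three figures quoted for the test set. The Python sketch below assumes that the 7.3% is the share of all test records that are charged off loans predicted Fully Paid (false negatives); the variable names are illustrative.

```python
sensitivity = 0.4922  # TP / (TP + FN), as reported for the test set
precision = 0.2269    # TP / (TP + FP), as reported for the test set
fn_share = 0.073      # assumed: false negatives as a share of all test records

# From sensitivity: TP = FN * s / (1 - s)
tp_share = fn_share * sensitivity / (1 - sensitivity)

# From precision: FP = TP * (1 - p) / p
fp_share = tp_share * (1 - precision) / precision

# Misclassified records are the false negatives plus the false positives.
misclassification_rate = fn_share + fp_share

print(f"false positives: {fp_share:.1%}, misclassification rate: {misclassification_rate:.1%}")
```

The result is about 31.4%, in line with the reported 31.5% given that the inputs are rounded. Most of the misclassification comes from fully paid loans flagged for re-evaluation, not from missed defaults.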
More informationThe Geometry of Interest Rate Risk
The Geometry of Interest Rate Risk [Maio-de Jong (2014)] World Finance Conference, Buenos Aires, Argentina, July 23 rd 2015 Michele Maio ugly Duckling m.maio@uglyduckling.nl Slides available at: http://uglyduckling.nl/wfc2015
More informationECO 5341 (Section 2) Spring 2016 Midterm March 24th 2016 Total Points: 100
Name:... ECO 5341 (Section 2) Spring 2016 Midterm March 24th 2016 Total Points: 100 For full credit, please be formal, precise, concise and tidy. If your answer is illegible and not well organized, if
More informationExamining Long-Term Trends in Company Fundamentals Data
Examining Long-Term Trends in Company Fundamentals Data Michael Dickens 2015-11-12 Introduction The equities market is generally considered to be efficient, but there are a few indicators that are known
More informationSection 8.1 Distributions of Random Variables
Section 8.1 Distributions of Random Variables Random Variable A random variable is a rule that assigns a number to each outcome of a chance experiment. There are three types of random variables: 1. Finite
More informationHow To Prevent Another Financial Crisis On Wall Street
How To Prevent Another Financial Crisis On Wall Street Helin Gao helingao@stanford.edu Qianying Lin qlin1@stanford.edu Kaidi Yan kaidi@stanford.edu Abstract Riskiness of a particular loan can be estimated
More informationB-tagging based on Boosted Decision Trees
B-tagging based on Boosted Decision Trees Haijun Yang University of Michigan (with Xuefei Li and Bing Zhou) ATLAS B-tagging Meeting CERN, July 7, 2009 1 Introduction Outline Boosted Decision Trees B-tagging
More informationEconomics 109 Practice Problems 1, Vincent Crawford, Spring 2002
Economics 109 Practice Problems 1, Vincent Crawford, Spring 2002 P1. Consider the following game. There are two piles of matches and two players. The game starts with Player 1 and thereafter the players
More informationAmazon Elastic Compute Cloud
Amazon Elastic Compute Cloud An Introduction to Spot Instances API version 2011-05-01 May 26, 2011 Table of Contents Overview... 1 Tutorial #1: Choosing Your Maximum Price... 2 Core Concepts... 2 Step
More information2.1 Mathematical Basis: Risk-Neutral Pricing
Chapter Monte-Carlo Simulation.1 Mathematical Basis: Risk-Neutral Pricing Suppose that F T is the payoff at T for a European-type derivative f. Then the price at times t before T is given by f t = e r(t
More information