Report concerning the estimation of variables at various spatial scales for Luxembourg


Report produced in the framework of the Urban Audit project and presented to EUROSTAT on December 1st 2010.

Luxembourgish partners in the project: GEODE team at CEPS/INSTEAD and STATEC
Corresponding authors: Dr. Hichem Omrani and Dr. Philippe Gerber
Contact: GEODE department, CEPS/INSTEAD, Luxembourg

Contents:
1. Introduction
2. Overview
3. Description of the work
3.1. Problems in the sub-level estimation
3.2. Examining the data
3.3. Fitting the linear probability model
4. Coverage of variables
4.1. Selected variables
4.2. Spatial units
5. Description of the developed estimation methodology
5.1. Logit model
5.2. Neural network model
5.2.1. Results from the learning and test steps
5.2.2. Optimal number of units in the hidden layer
5.3. Comparison of results from the neural network and Logit models
5.3.1. Capacity of prediction and efficiency
5.3.2. Assessment by ROC curve analysis
5.3.3. Monte Carlo simulation: evaluation and sub-level data quality measures (confidence intervals)
Conclusion
Acknowledgments
References

1. Introduction

Urban Audit 4 (UA) started on 1 September 2009 and will finish on 1 September 2011 with a final report presenting our results on estimating variables at multiple spatial units. The GEODE (Geography and Development) department of CEPS/INSTEAD has to estimate various variables at several scales (i.e. CLSN: city, large city, sub-city and national levels). For this assessment, multiple data sources are used: COL (the administrative file of all residents of the City of Luxembourg) and census data. In the framework of Urban Audit, the GEODE department of CEPS/INSTEAD, STATEC and EUROSTAT acknowledge the demand for small area data to support planning, decision making and service delivery at the local level. The aim of this report is to estimate a certain number of spatially scaled variables. Such estimation (from the census dataset and PSELL, the Panel Socio-Economique Liewen zu Lëtzebuerg, a panel survey at the national level) is available only at the national scale, not at sub-level scales (e.g. city level and municipalities). Our concern here is to use the national-level dataset together with sub-level characteristics in order to estimate sub-level variables. These estimates are useful for decision making. We present hereafter the estimation methods developed and the results that they produced. The methods developed here could be applied in several fields and applications whenever estimation is needed at sub-levels, or whenever a nonlinear model must be fitted from a set of explanatory variables.

2. Overview

For effective social and economic planning and for distributing government funds, there is a growing demand to produce reliable estimates for smaller geographic areas and subpopulations, called small areas, for which adequate samples are not available. Several methods have already been developed for sub-level estimation, e.g. small area estimation (SAE).
For sub-level estimation, several methods can be applied, such as regression approaches and small area estimation (SAE); for a review of SAE, see Ghosh and Rao (1994), among others. The different SAE methods can be classified into the following classes: synthetic estimators, regression models, base unit estimators, composite small area estimators and Bayesian estimators (see Omrani and Gerber, 2009). These methods are not sufficient for all kinds of data and are not appropriate for sub-level estimation (the disaggregation process). In fact, the PSELL-EU-SILC dataset only allows estimation at the national level; it is not appropriate for estimation at sub-levels (e.g. at the municipality level or for Luxembourg City), due to several issues such as sample sizes and missing data or responses.

3. Description of the work

We present in this report adequate techniques for sub-level estimation by applying Logit and neural network models. The methods developed here could be applied in several fields and applications when estimation is needed at sub-levels.

3.1. Problems in the sub-level estimation

Some problems arise in sub-level estimation:
- the census and the survey are only approximately contemporaneous;
- the variables are often collected and coded differently across surveys and censuses.

In the framework of the PSELL survey derived from EU-SILC, 746 households were interviewed in Luxembourg City in 2008, out of 3,779 households questioned in the whole country. The problem is that these households were not sampled according to a spatial stratification designed to ensure representativeness of the city: the sample is valid at the national level but not reliable enough at the sub-level (city level). However, the sample size at the city level seems high enough for estimating variables at sub-levels.

Figure 1: Number of individuals in the sample (from PSELL, 2008) by canton (see footnote 1)

For instance, from figure 1, we note that the number of sampled individuals in the canton of Esch-sur-Alzette (1,296 persons) is larger than in the canton of Luxembourg City (746 persons), whereas the population of the canton of Esch-sur-Alzette (146,000) is only slightly larger than that of Luxembourg City (139,000); this makes for biased estimation. This is why we apply more advanced techniques (i.e. Logit and neural network models) for variable estimation.

3.2. Examining the data

Initial examination of the data is a preliminary task aimed at studying the distribution of the variables, their relationships, dependency, causality and linearity. It also consists in treating missing values. Some of this data examination is presented in figure 2, from which we note that the population of Luxembourg City increased from 2003 to 2008, which does not match the evolution of the number of households over the same period according to the PSELL panel. This is due to the quality of the PSELL data sources.
In fact, the COL dataset is administrative data which is exhaustive (it contains all residents of Luxembourg City) but not precise (a problem of over-estimation, since it includes residents who have already left the city), whereas the PSELL dataset is a national survey which may contain some imprecision at the local level, and which therefore provides biased estimates of variables at that level. The variables of interest from the survey and COL datasets are listed below.

Footnote 1 - number of sampled individuals by canton: 1: LUXEMBOURG VILLE (746), 2: CAPELLEN (199), 3: ESCH-SUR-ALZETTE (1296), 4: LUXEMBOURG-CAMPAGNE (355), 5: MERSCH (183), 6: CLERVAUX (155), 7: DIEKIRCH (202), 8: REDANGE (88), 9: VIANDEN (24), 10: WILTZ (109), 11: ECHTERNACH (110), 12: REMICH (142)
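The sampling imbalance described above can be verified with a quick calculation. A minimal sketch in Python (the report's own computations use R), using only the sample counts and rounded population figures quoted in the text:

```python
# Sampling fractions by canton, using the counts quoted in the text:
# 746 sampled for ~139,000 inhabitants in Luxembourg City,
# 1,296 sampled for ~146,000 inhabitants in Esch-sur-Alzette.
cantons = {
    "Luxembourg City":  {"sampled": 746,  "population": 139_000},
    "Esch-sur-Alzette": {"sampled": 1296, "population": 146_000},
}

for name, c in cantons.items():
    fraction = c["sampled"] / c["population"]
    print(f"{name}: sampling fraction {fraction:.2%}"
          f" (1 sampled per {1 / fraction:.0f} inhabitants)")
```

Luxembourg City is sampled at roughly 0.54% against roughly 0.89% for Esch-sur-Alzette, which is exactly the under-representation of the city that motivates the modelling approach below.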

Let X = {x1, x2, x3, x4, x5} and Y = {y1} be the input and output variables, where:
- x1: age of the individual;
- x2: nationality of the individual;
- x3: gender of the individual;
- x4: marital status of the individual;
- x5: activity;
- y1: being household head (a dummy variable (2): 1 (yes), 0 (no)), to be estimated using the Logit and neural network models and the COL datasets.

Figure 2: Population (left) vs. number of households (right) from 2003 to 2008

3.3. Fitting the linear probability model

We proceed to fit a linear probability model (also known as a discrete choice model) to the data (represented in figure 3 and table 1), including fixed effects for the 5 explanatory variables mentioned before. Let data03 be the PSELL database in 2003 with dummy variables (16 inputs (I) and 1 output (O)). The linear model in R is:

    lm(O ~ I)   # I = data03[, 1:16] (inputs), O = data03[, 17] (output)

Table 1: Linear probability model (intercept and slopes), fitted by least squares with lm(O ~ I): the intercept and the dummy coefficients shown are significant at p < 2e-16.

(2) A dummy variable is a binary variable taking the value 1 or 0. It is commonly used to examine group and time effects in regression.

The remaining rows of table 1 follow the same pattern: all of the 16 dummy coefficients except one are significant at p < 2e-16 (significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05). Multiple R-squared: 0.326; F-statistic: 1.331e+04 on 16 and the residual degrees of freedom, with p-value < 2.2e-16; AIC: 446730.

Figure 3: Results from the linear model: density of the estimated probabilities of being household head, for the learning and test datasets

The number of households (hh) estimated from the test dataset is too low, so the linear model is inefficient for estimating the number of hh. A linear equation for estimating the number of households (as a function of age, marital status, activity, etc.) is not adequate; this is clear from the low values of the multiple R-squared and adjusted R-squared (as shown in table 1). Therefore, we propose to model the household reference person (i.e. being household head) as a function of several variables (e.g. age, gender, marital status, nationality and activity) using more advanced techniques for sub-level estimation, namely Logit and neural network models.
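To make the role of the dummy variables concrete, here is a minimal sketch in Python (the report itself uses R's lm()) with hypothetical toy records, not real PSELL data. In the special case of a full set of dummies for a single categorical variable and no intercept, the least-squares coefficient of each dummy equals the mean of the 0/1 outcome within that category, i.e. the fitted probability of being household head for that category:

```python
# Linear probability model with dummies for one categorical variable:
# the OLS coefficient of each dummy is the within-category mean of y.
from collections import defaultdict

# (marital_status, is_household_head) -- hypothetical toy records
records = [
    ("married", 1), ("married", 1), ("married", 0), ("married", 1),
    ("single",  0), ("single",  1), ("single",  0), ("single",  0),
    ("widowed", 1), ("widowed", 1),
]

sums, counts = defaultdict(float), defaultdict(int)
for status, y in records:
    sums[status] += y
    counts[status] += 1

# Fitted probability of being household head per category
coef = {status: sums[status] / counts[status] for status in sums}
print(coef)   # {'married': 0.75, 'single': 0.25, 'widowed': 1.0}
```

Note that such fitted probabilities can reach 0 or 1 exactly (and, with several regressors, can fall outside [0, 1]), which is one of the reasons the report moves on to the Logit model.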

4. Coverage of variables

4.1. Selected variables

Three sources of information (COL, census and PSELL) are used for the estimation of variables at three spatial units (national, sub-city district and city levels). Estimation at the national level is done from the PSELL panel. For estimation at sub-levels (i.e. core city and sub-city district), specific techniques are used, such as the small area estimation method, which is the most widely applied for this task. All variables to be estimated pertain to the number of households residing in determined spatial units of various sizes. The broad categories of Urban Audit variables to be estimated are the following:
- total number of households (excluding institutional households);
- one-person households;
- households with children aged 0 to under 18;
- total number of households with less than half of the national average disposable annual household income.

4.2. Spatial units

There are three spatial levels of estimation: core city, sub-city district level and national level (see tables 2-3).

Table 2: Variables transmitted to Eurostat (to collect), counted by status (A: collected, B: estimated, C: cannot yet be estimated, CC: collected centrally) and by spatial unit (S: city, S: sub-city, L: large urban zone, N: national), with totals and percentages

Table 3: List of variables (collected), with the same status categories and spatial units as table 2

5. Description of the developed estimation methodology

For sub-level estimation, several methods could be applied, such as a regression approach. As discussed and demonstrated in section 2, these methods are not sufficient for all kinds of data. The PSELL-EU-SILC dataset, being a national survey, only allows estimation at the national level; it is not appropriate for estimation at sub-levels (e.g. at the municipality level or for Luxembourg City), due to issues such as sample sizes and missing data or responses. We therefore present hereafter adequate techniques for sub-level estimation. The methodology discussed below derives, for example, the number of households by sub-level (or small area) by applying Logit and neural network models. Both methods use information derived from a set of independent variables obtained from administrative information sources (in our study, the census and COL datasets) that are symptomatic for regional statistics estimation. In order to estimate the Urban Audit variables, we use the multiple information sources mentioned above. The census dataset has been used to calibrate the models by estimating their coefficients; the calibrated models have then been used to produce the sub-level estimates from the COL datasets. Furthermore, for each estimated indicator, we present a confidence interval which indicates the accuracy of the estimation. The learning dataset used here (from PSELL-EU-SILC) for the Logit and neural network models is composed of 5 independent variables (age, sex, marital status, nationality and activity) and 1 dependent variable (being household head). After data treatment, we have 16 dummy variables as inputs and 1 dummy variable as the target variable (output). The 2003 dataset contains the individuals living in Luxembourg, a subset of whom are household heads.
Indeed, the share of individuals who are household heads is 39%. Let data03 be the PSELL database in 2003 with dummy variables (16 inputs (I) and 1 output (O)).

5.1. Logit model

The Logit model is fitted with R as follows (see table 4):

    logit <- glm(O ~ I, family = binomial(link = "logit"))
    # I = data03[, 1:16] (inputs), O = data03[, 17] (output)
    coef <- logit$coefficients
    r <- coef["(Intercept)"] + as.matrix(I) %*% coef[-1]  # linear predictor
    r <- exp(r) / (1 + exp(r))   # predicted probabilities
    # equivalently: predict(logit, type = "response")

Table 4: Variables in the Logit model: coefficient estimates (B), standard errors, Wald statistics, degrees of freedom, significance and Exp(B) for the 16 dummy inputs and the intercept

From the Logit model fitted on the learning database (PSELL-EU-SILC dataset), the estimated number of households is close to the real value (as shown in figure 4), so the estimate is reliable at the learning step. The reliability of the Logit model on the COL data is shown in figure 5 (the test dataset including all residents of Luxembourg City), which gives the estimated number of households at the level of Luxembourg City in 2007. The Logit model has a markedly lower AIC than the linear model (AIC = 446730) and is therefore preferable. Hereafter we compare the reliability of the Logit model with that of the neural network model.

Figure 4: Density of the probabilities of being household head from the Logit model at the learning step (2003 PSELL-EU-SILC dataset); estimated number of households and its variation with the threshold (sensitivity analysis): with a threshold of 0.5, the estimated number of households is close to the real value.
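The last line of the R code in section 5.1, r <- exp(r)/(1+exp(r)), is the inverse-logit (sigmoid) map from the linear predictor to a probability. A minimal Python sketch with hypothetical coefficients (the actual estimated coefficients are those of table 4):

```python
import math

def inv_logit(r):
    """Map a linear predictor to a probability, as in r <- exp(r)/(1+exp(r))."""
    return 1.0 / (1.0 + math.exp(-r))

# Hypothetical intercept and two dummy coefficients (the report estimates
# an intercept plus 16 dummy coefficients from the PSELL data).
intercept, b = -1.0, [2.0, -0.5]
x = [1, 0]                                   # dummy inputs for one individual
r = intercept + sum(bi * xi for bi, xi in zip(b, x))
p = inv_logit(r)                             # probability of being household head
print(round(p, 3))                           # 0.731, i.e. inv_logit(1.0)
```

The form 1/(1+exp(-r)) is algebraically identical to exp(r)/(1+exp(r)) but avoids overflow of exp(r) for large positive linear predictors.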

Figure 5: Logit results: density of the probabilities of being household head on the test dataset, and estimated number of households in 2007 at the level of Luxembourg City as a function of the threshold (from 0.2 to 0.8)

5.2. Neural network model

The neural network is a more general and widespread theoretical framework than the Logit model: it can handle several dependent variables, each following a given distribution. The neural network (denoted NN below) is applied here in order to estimate the number of households in Luxembourg City. We use the softmax activation function (Bridle, 1990), a useful way of describing the relationship between one or more explanatory variables (i.e. v1 = age, v2 = gender, v3 = marital status and v4 = activity) and an outcome (being household head), expressed as a probability over only two possible values, "household head" or "not household head".

5.2.1. Results from the learning and test steps

Figure 6 below presents the estimated number of households and its variation with the threshold value. The estimate corresponding to the threshold 0.5 seems to be the most adequate, with 4 units in the hidden layer (see fig. 6).

Figure 6: Density of the probabilities of being household head; estimated number of households (at the optimal threshold) from the learning dataset (PSELL-EU-SILC) using the neural network model, compared with the real number of households
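The report does not give the trained network's weights or software, so purely as an illustration of the architecture (16 dummy inputs in the study, reduced to 3 here; one hidden layer of 4 logistic units; one probability output), here is a forward pass in Python with made-up weights:

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W_hidden, b_hidden, w_out, b_out):
    """One hidden layer of logistic units feeding a logistic output unit."""
    h = [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W_hidden, b_hidden)]
    return logistic(sum(w * hi for w, hi in zip(w_out, h)) + b_out)

# Hypothetical weights: 3 inputs -> 4 hidden units -> 1 output
W_hidden = [[0.5, -0.2, 0.1],
            [-0.3, 0.8, 0.0],
            [0.2, 0.2, -0.4],
            [0.0, -0.1, 0.6]]
b_hidden = [0.1, -0.2, 0.0, 0.3]
w_out = [0.7, -0.5, 0.3, 0.2]
b_out = -0.1

x = [1, 0, 1]   # dummy-coded inputs for one individual
p = forward(x, W_hidden, b_hidden, w_out, b_out)
print(0.0 < p < 1.0)   # True: the output is a valid probability
```

For the binary outcome used here, a logistic output unit and a two-class softmax output are equivalent parameterizations.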

Figure 7: Density of the probabilities of being household head on the test dataset from the NN model; estimated number of households (48,506, with 4 units in the hidden layer) from the test dataset

Figure 8: Comparison of the probability densities from the NN model on the learning and test datasets

From the neural network model with 4 units in the hidden layer (results in figure 8), the estimated number of households is as shown in figure 7.

5.2.2. Optimal number of units in the hidden layer

We vary the number of units in the hidden layer and observe the resulting estimate of the number of households (see figure 9). By comparison with the real value, it is easy to determine the optimal number of units in the hidden layer, as shown in figure 9: the optimum is 4 units, in line with the literature recommendation that the number of hidden units be about the square root of the number of input variables.
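The selection procedure above amounts to retraining the network for each candidate hidden-layer size and keeping the size whose household estimate is closest to the known total. A sketch of that selection step, with hypothetical estimates and a hypothetical true total (the real curve is the one in figure 9):

```python
import math

# Hypothetical household estimates by hidden-layer size (the report obtains
# these by retraining the network for each size) and a hypothetical truth.
estimates = {1: 52_000, 2: 49_500, 4: 48_200, 8: 46_000, 16: 51_000}
true_total = 48_000

# Keep the size whose estimate is closest to the known total
best = min(estimates, key=lambda k: abs(estimates[k] - true_total))
print(best)                    # 4 for these toy numbers

# The rule of thumb quoted in the text: about sqrt(number of inputs)
print(round(math.sqrt(16)))    # 4, matching the 16 dummy inputs
```

With the report's 16 dummy inputs, the empirical optimum (4 units) coincides with the square-root heuristic.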

Figure 9: Estimated number of households from the test dataset with the NN model, for various numbers of hidden units (from 0 to 50)

5.3. Comparison of results from the neural network and Logit models

5.3.1. Capacity of prediction and efficiency

Figure 10 presents the estimated results for the number of households from the neural network and Logit models. The comparison is done by varying the threshold from 0.2 to 0.8 by a fixed step. The neural network appears more precise and less sensitive to the threshold value than the Logit model (see tables 5-8 below).

Figure 10: Prediction results: comparison of the probability densities from the Logit and NN models on the test dataset

Figure 11: Estimated number of households for various thresholds: comparison of the Logit and neural network models at the test step (COL dataset)

After the learning step, which estimates the parameters of the models, a test step is important for studying the efficiency and the generalization capacity of each model. The test dataset is composed of 5 variables (sex, age, marital status, nationality and activity) for the individuals resident in Luxembourg City. We applied the Logit and the neural network models in order to estimate the number of households from this administrative dataset containing all the residents of Luxembourg City (results shown in figure 12). The efficiency of the applied models is shown in tables 5-8. We underline that the NN model is the most reliable in terms of capacity of prediction: its percentage of correct classification (i.e. 0 or 1, household head or not) is 82.34% (see tables 5-8).

Table 5: Estimation efficiency on the learning dataset with the LPM: classification table of observed vs. predicted outcomes (0/1) and percentage correct, with a cut value of 0.5

Table 6: Estimation efficiency on the learning dataset with the Logit model: classification table with a cut value of 0.5; overall percentage correct: 79.7

Table 7: Estimation efficiency on the learning dataset with the NN: classification table of observed vs. predicted outcomes with a cut value of 0.5

Table 8: Comparison of the percentage of correct classification from the LPM, Logit and NN models on the learning and test datasets

Table 9 summarizes the estimation results from the NN and Logit models from 2003 to 2008.

Table 9: Comparison of results: estimated number of households from the three models (LPM, Logit and neural network) by year, alongside the population of Luxembourg City (3) and the household number from PSELL

(3) The population of the city is taken from the STATEC institute.
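Classification tables like tables 5-8 are built by thresholding each predicted probability at the cut value 0.5 and cross-tabulating against the observed outcome. A minimal sketch in Python with toy probabilities and labels (not the report's data):

```python
# Classification table at a given cut value, as in tables 5-8.
def classification_table(y_true, p_pred, cut=0.5):
    """Cross-tabulate observed (0/1) vs. predicted (p >= cut) outcomes."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for y, p in zip(y_true, p_pred):
        table[(y, int(p >= cut))] += 1
    correct = table[(0, 0)] + table[(1, 1)]
    return table, 100.0 * correct / len(y_true)

# Toy observed outcomes and predicted probabilities
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
p_pred = [0.2, 0.4, 0.7, 0.8, 0.6, 0.3, 0.1, 0.9]
table, pct = classification_table(y_true, p_pred)
print(table, pct)   # 6 of 8 correct -> 75.0
```

The diagonal cells (0, 0) and (1, 1) are the correctly classified individuals; their share is the "percentage correct" reported in table 8.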

Figure 12: Results from the estimation methodology at the city level: population and number of households from the PSELL and COL datasets, 2003 to 2008

5.3.2. Assessment by ROC curve analysis

The ROC (Receiver Operating Characteristic) curve is a tool for the assessment and comparison of models. We use ROC curve analysis in order to evaluate the efficiency of the three applied prediction models (LPM, Logit and neural network (NN)). According to figure 13, the NN model is more efficient than the Logit and LPM models.

Figure 13: ROC curve analysis (sensitivity vs. 1 - specificity) for the three prediction models (NN, Logit and LPM)

5.3.3. Monte Carlo simulation: evaluation and sub-level data quality measures (confidence intervals)

We recall that a confidence interval illustrates the degree of variability associated with a number: wide confidence intervals indicate high variability, so such numbers should be interpreted and compared with due caution. Here we apply Monte Carlo simulation in order to determine confidence intervals for the number of households (hh). The confidence

intervals of the number of households are produced here by assuming that the hh number follows a normal distribution. The applied Monte Carlo simulation proceeds in 5 steps:
1. Predict the number of hh from the test dataset with the NN and Logit models: y = f(x1, x2, ..., xq).
2. Generate a set of random inputs (x1, x2, ..., xq) by introducing an error term.
3. Evaluate the model and store the result as yi.
4. Repeat steps 2 and 3 for i = 1 to N iterations.
5. Analyze the results using confidence intervals and check that the first predicted value lies within the confidence interval.

First, we estimate the number of households on the test datasets (COL) using the parameters estimated on the learning datasets. Then we introduce an error term in the input variables, re-evaluate the model, and repeat the process N times, obtaining at each iteration a predicted value of the number of households, as shown in figure 14. We compute the mean and the standard deviation of the predicted hh numbers, and finally determine the confidence interval. For N iterations, let (m, σ) be respectively the mean and the standard deviation of the number of households, which is supposed to follow a normal distribution. Its confidence interval (CI) at 95% is then, for the lower and upper limits:

CI = [m - 1.96 σ / sqrt(N), m + 1.96 σ / sqrt(N)]

The CI for the estimated number of households (from the test dataset) is shown in figure 14; for 2003 it is equal to [33351, 42004]. The predicted number of hh in 2003 belongs to this CI, so we conclude that the prediction is consistent. To conclude this section, we have seen that the neural network is better than the Logit model: the NN model is able to classify correctly about 86% of the individuals.
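The five-step Monte Carlo procedure above can be sketched as follows; the model f, the baseline input and the noise scale are all hypothetical stand-ins (the report perturbs the real COL inputs and re-runs the fitted NN and Logit models):

```python
import random
import statistics

random.seed(0)                        # reproducible run

def predict(x):
    """Hypothetical stand-in for the fitted model y = f(x1, ..., xq)."""
    return 40_000 + 100 * x

x0 = 10.0                             # baseline input
y0 = predict(x0)                      # step 1: first predicted value
N = 1_000
# Steps 2-4: perturb the input, re-evaluate, store, repeat N times
ys = [predict(x0 + random.gauss(0, 1)) for _ in range(N)]

m = statistics.mean(ys)
s = statistics.stdev(ys)
half = 1.96 * s / N ** 0.5            # 95% half-width, normality assumed
ci = (m - half, m + half)
print(ci[0] <= y0 <= ci[1])           # step 5: is the first prediction inside?
```

The 1.96 factor is the 97.5th percentile of the standard normal, giving a 95% interval under the normality assumption stated in the text.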
We may still slightly improve these results by running a parametric study on the number of hidden units. This could be done by minimizing the Akaike information criterion (AIC; Greene, 2000), but it is rather time-consuming and our results are already quite good.

Figure 14: Estimated number of households from the Monte Carlo simulation (N iterations) with the NN model, for various thresholds, from the COL data
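The ROC curve analysis used above to compare the three models can be reproduced from any model's predicted probabilities by sweeping the classification threshold. A self-contained sketch with toy labels and scores (not the report's data; ties between scores are not specially handled):

```python
# ROC points and AUC via the trapezoidal rule.
def roc_auc(y_true, scores):
    """Return ROC points (1 - specificity, sensitivity) and the AUC."""
    pairs = sorted(zip(scores, y_true), reverse=True)
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in pairs:              # lower the threshold one case at a time
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    auc = sum((x2 - x1) * (y1 + y2) / 2
              for (x1, y1), (x2, y2) in zip(points, points[1:]))
    return points, auc

y_true = [0, 0, 1, 1]               # toy observed outcomes
scores = [0.1, 0.4, 0.35, 0.8]      # toy predicted probabilities
_, auc = roc_auc(y_true, scores)
print(auc)                          # 0.75 for this toy example
```

A higher AUC means the model's probabilities rank household heads above non-heads more consistently, which is how figure 13 separates the NN from the Logit and LPM curves.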

Conclusion

In this report, we support the use of neural network techniques that integrate census and survey data to produce sub-level estimates. The use of the neural network model for sub-level estimation is motivated by its universal approximation capabilities. As illustrated by the application described above, neural learning is particularly useful and efficient when complex nonlinear phenomena are involved; the neural network technique seems particularly well adapted to estimating sub-level variables, better than the Logit model. The results of the proposed methods were assessed by ROC curve analysis, and confidence intervals were attached to them via Monte Carlo simulation. In the future, we plan to apply the developed NN-based technique to estimate the full set of Urban Audit variables at sub-level scales (i.e. not only the city level but also the municipality or sub-city district level). In a next step of our research, we may try to include the spatial units (e.g. the municipality level) in the NN model and predict the income and the structure of households (e.g. number of children), comparing the results with a Poisson regression model, which is widely used to predict such variables.

Acknowledgments

We would like to thank the anonymous reviewers for their comments. This work was supported in part by the EUROSTAT institute and in part by the GEODE department of the CEPS/INSTEAD research institute.
Annex: Time schedule, Luxembourg Urban Audit
- Start: 1 September 2009
- Identification of variables: 1 December 2009 (Statec/CEPS)
- Compilation of variables, 2008 annual UA: 1 March 2010 (Statec/CEPS)
- Compilation of variables, exhaustive UA, reference year 2008: 1 September 2010 (Statec/CEPS)
- Compilation of interim operational report: 1 December 2010 (CEPS)
- Compilation of variables, 2009 annual UA: 1 March 2011 (Statec/CEPS)
- Participation in quality control of variables: 1 June 2011 (Statec/CEPS)
- Compilation of maps: 1 June 2011 (CEPS)
- Compilation of final operational report: 1 September 2011 (CEPS)

References

Omrani H., Gerber P., Small Area Estimation, International Conference on Small Area Estimation (SAE'09), Elche, Spain, June 29-July 1, 2009 (a).
Omrani H., Gerber P., Bousch P., Model-Based Small Area Estimation with application to unemployment estimates, International Conference on Mathematics, Statistics and Scientific Computing (ICMSSC), Dubai, UAE, January 28-30, 2009 (b).
Omrani H., Gerber P., Small area estimation: methods and application, final report of the Urban Audit project, EUROSTAT, CEPS/INSTEAD, Luxembourg, December 2008.

Roy G. et Vanheuverzwyn A., « Redressement par la macro CALMAR : applications et pistes d'amélioration », in Traitements des fichiers d'enquêtes, éditions PUG.
Ghosh M. and Rao J. N. K., Small Area Estimation: An Appraisal, Statistical Science, Vol. 9, No. 1, 1994.
Bridle J. S., Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition, in F. Fogelman Soulié and J. Hérault (eds.), Neurocomputing: Algorithms, Architectures and Applications, Berlin: Springer-Verlag, 1990.
Greene W. H., Econometric Analysis, Prentice Hall International, 4th edition, 2000.

Publication in 2010 on the research work conducted in the framework of the Urban Audit project:
Omrani H., Gerber P., Small Area Estimation: sub-level estimation of variables at various spatial scales, submitted to a scientific journal (October 2010).


More information

Stock Trading Following Stock Price Index Movement Classification Using Machine Learning Techniques

Stock Trading Following Stock Price Index Movement Classification Using Machine Learning Techniques Stock Trading Following Stock Price Index Movement Classification Using Machine Learning Techniques 6.1 Introduction Trading in stock market is one of the most popular channels of financial investments.

More information

WesVar uses repeated replication variance estimation methods exclusively and as a result does not offer the Taylor Series Linearization approach.

WesVar uses repeated replication variance estimation methods exclusively and as a result does not offer the Taylor Series Linearization approach. CHAPTER 9 ANALYSIS EXAMPLES REPLICATION WesVar 4.3 GENERAL NOTES ABOUT ANALYSIS EXAMPLES REPLICATION These examples are intended to provide guidance on how to use the commands/procedures for analysis of

More information

The data definition file provided by the authors is reproduced below: Obs: 1500 home sales in Stockton, CA from Oct 1, 1996 to Nov 30, 1998

The data definition file provided by the authors is reproduced below: Obs: 1500 home sales in Stockton, CA from Oct 1, 1996 to Nov 30, 1998 Economics 312 Sample Project Report Jeffrey Parker Introduction This project is based on Exercise 2.12 on page 81 of the Hill, Griffiths, and Lim text. It examines how the sale price of houses in Stockton,

More information

Power of t-test for Simple Linear Regression Model with Non-normal Error Distribution: A Quantile Function Distribution Approach

Power of t-test for Simple Linear Regression Model with Non-normal Error Distribution: A Quantile Function Distribution Approach Available Online Publications J. Sci. Res. 4 (3), 609-622 (2012) JOURNAL OF SCIENTIFIC RESEARCH www.banglajol.info/index.php/jsr of t-test for Simple Linear Regression Model with Non-normal Error Distribution:

More information

A Comparative Study of Various Forecasting Techniques in Predicting. BSE S&P Sensex

A Comparative Study of Various Forecasting Techniques in Predicting. BSE S&P Sensex NavaJyoti, International Journal of Multi-Disciplinary Research Volume 1, Issue 1, August 2016 A Comparative Study of Various Forecasting Techniques in Predicting BSE S&P Sensex Dr. Jahnavi M 1 Assistant

More information

Econometrics is. The estimation of relationships suggested by economic theory

Econometrics is. The estimation of relationships suggested by economic theory Econometrics is Econometrics is The estimation of relationships suggested by economic theory Econometrics is The estimation of relationships suggested by economic theory The application of mathematical

More information

Estimating term structure of interest rates: neural network vs one factor parametric models

Estimating term structure of interest rates: neural network vs one factor parametric models Estimating term structure of interest rates: neural network vs one factor parametric models F. Abid & M. B. Salah Faculty of Economics and Busines, Sfax, Tunisia Abstract The aim of this paper is twofold;

More information

Mortality Rates Estimation Using Whittaker-Henderson Graduation Technique

Mortality Rates Estimation Using Whittaker-Henderson Graduation Technique MATIMYÁS MATEMATIKA Journal of the Mathematical Society of the Philippines ISSN 0115-6926 Vol. 39 Special Issue (2016) pp. 7-16 Mortality Rates Estimation Using Whittaker-Henderson Graduation Technique

More information

Market Variables and Financial Distress. Giovanni Fernandez Stetson University

Market Variables and Financial Distress. Giovanni Fernandez Stetson University Market Variables and Financial Distress Giovanni Fernandez Stetson University In this paper, I investigate the predictive ability of market variables in correctly predicting and distinguishing going concern

More information

GENERATION OF STANDARD NORMAL RANDOM NUMBERS. Naveen Kumar Boiroju and M. Krishna Reddy

GENERATION OF STANDARD NORMAL RANDOM NUMBERS. Naveen Kumar Boiroju and M. Krishna Reddy GENERATION OF STANDARD NORMAL RANDOM NUMBERS Naveen Kumar Boiroju and M. Krishna Reddy Department of Statistics, Osmania University, Hyderabad- 500 007, INDIA Email: nanibyrozu@gmail.com, reddymk54@gmail.com

More information

The Impact of a $15 Minimum Wage on Hunger in America

The Impact of a $15 Minimum Wage on Hunger in America The Impact of a $15 Minimum Wage on Hunger in America Appendix A: Theoretical Model SEPTEMBER 1, 2016 WILLIAM M. RODGERS III Since I only observe the outcome of whether the household nutritional level

More information

The state of the social economy in Luxembourg

The state of the social economy in Luxembourg The state of the social economy in Luxembourg Chiara Peroni Research Division, STATEC Establishing Satellite Accounts for the Social Economy Wednesday 14 th October, 2015 1/17 Introduction The social economy

More information

Optimal Search for Parameters in Monte Carlo Simulation for Derivative Pricing

Optimal Search for Parameters in Monte Carlo Simulation for Derivative Pricing Optimal Search for Parameters in Monte Carlo Simulation for Derivative Pricing Prof. Chuan-Ju Wang Department of Computer Science University of Taipei Joint work with Prof. Ming-Yang Kao March 28, 2014

More information

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times.

Let us assume that we are measuring the yield of a crop plant on 5 different plots at 4 different observation times. Mixed-effects models An introduction by Christoph Scherber Up to now, we have been dealing with linear models of the form where ß0 and ß1 are parameters of fixed value. Example: Let us assume that we are

More information

The Duration Derby: A Comparison of Duration Based Strategies in Asset Liability Management

The Duration Derby: A Comparison of Duration Based Strategies in Asset Liability Management The Duration Derby: A Comparison of Duration Based Strategies in Asset Liability Management H. Zheng Department of Mathematics, Imperial College London SW7 2BZ, UK h.zheng@ic.ac.uk L. C. Thomas School

More information

SUPPLEMENTARY ONLINE APPENDIX FOR: TECHNOLOGY AND COLLECTIVE ACTION: THE EFFECT OF CELL PHONE COVERAGE ON POLITICAL VIOLENCE IN AFRICA

SUPPLEMENTARY ONLINE APPENDIX FOR: TECHNOLOGY AND COLLECTIVE ACTION: THE EFFECT OF CELL PHONE COVERAGE ON POLITICAL VIOLENCE IN AFRICA SUPPLEMENTARY ONLINE APPENDIX FOR: TECHNOLOGY AND COLLECTIVE ACTION: THE EFFECT OF CELL PHONE COVERAGE ON POLITICAL VIOLENCE IN AFRICA 1. CELL PHONES AND PROTEST The Afrobarometer survey asks whether respondents

More information

Lecture 10: Alternatives to OLS with limited dependent variables, part 1. PEA vs APE Logit/Probit

Lecture 10: Alternatives to OLS with limited dependent variables, part 1. PEA vs APE Logit/Probit Lecture 10: Alternatives to OLS with limited dependent variables, part 1 PEA vs APE Logit/Probit PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

More information

Improving Returns-Based Style Analysis

Improving Returns-Based Style Analysis Improving Returns-Based Style Analysis Autumn, 2007 Daniel Mostovoy Northfield Information Services Daniel@northinfo.com Main Points For Today Over the past 15 years, Returns-Based Style Analysis become

More information

STOCK PRICE PREDICTION: KOHONEN VERSUS BACKPROPAGATION

STOCK PRICE PREDICTION: KOHONEN VERSUS BACKPROPAGATION STOCK PRICE PREDICTION: KOHONEN VERSUS BACKPROPAGATION Alexey Zorin Technical University of Riga Decision Support Systems Group 1 Kalkyu Street, Riga LV-1658, phone: 371-7089530, LATVIA E-mail: alex@rulv

More information

List of tables List of boxes List of screenshots Preface to the third edition Acknowledgements

List of tables List of boxes List of screenshots Preface to the third edition Acknowledgements Table of List of figures List of tables List of boxes List of screenshots Preface to the third edition Acknowledgements page xii xv xvii xix xxi xxv 1 Introduction 1 1.1 What is econometrics? 2 1.2 Is

More information

CHAPTER 4 DATA ANALYSIS Data Hypothesis

CHAPTER 4 DATA ANALYSIS Data Hypothesis CHAPTER 4 DATA ANALYSIS 4.1. Data Hypothesis The hypothesis for each independent variable to express our expectations about the characteristic of each independent variable and the pay back performance

More information

To be two or not be two, that is a LOGISTIC question

To be two or not be two, that is a LOGISTIC question MWSUG 2016 - Paper AA18 To be two or not be two, that is a LOGISTIC question Robert G. Downer, Grand Valley State University, Allendale, MI ABSTRACT A binary response is very common in logistic regression

More information

WC-5 Just How Credible Is That Employer? Exploring GLMs and Multilevel Modeling for NCCI s Excess Loss Factor Methodology

WC-5 Just How Credible Is That Employer? Exploring GLMs and Multilevel Modeling for NCCI s Excess Loss Factor Methodology Antitrust Notice The Casualty Actuarial Society is committed to adhering strictly to the letter and spirit of the antitrust laws. Seminars conducted under the auspices of the CAS are designed solely to

More information

Multiple regression - a brief introduction

Multiple regression - a brief introduction Multiple regression - a brief introduction Multiple regression is an extension to regular (simple) regression. Instead of one X, we now have several. Suppose, for example, that you are trying to predict

More information

A Comparison of Univariate Probit and Logit. Models Using Simulation

A Comparison of Univariate Probit and Logit. Models Using Simulation Applied Mathematical Sciences, Vol. 12, 2018, no. 4, 185-204 HIKARI Ltd, www.m-hikari.com https://doi.org/10.12988/ams.2018.818 A Comparison of Univariate Probit and Logit Models Using Simulation Abeer

More information

Predicting Abnormal Stock Returns with a. Nonparametric Nonlinear Method

Predicting Abnormal Stock Returns with a. Nonparametric Nonlinear Method Predicting Abnormal Stock Returns with a Nonparametric Nonlinear Method Alan M. Safer California State University, Long Beach Department of Mathematics 1250 Bellflower Boulevard Long Beach, CA 90840-1001

More information

Approximating the Confidence Intervals for Sharpe Style Weights

Approximating the Confidence Intervals for Sharpe Style Weights Approximating the Confidence Intervals for Sharpe Style Weights Angelo Lobosco and Dan DiBartolomeo Style analysis is a form of constrained regression that uses a weighted combination of market indexes

More information

Introduction to Population Modeling

Introduction to Population Modeling Introduction to Population Modeling In addition to estimating the size of a population, it is often beneficial to estimate how the population size changes over time. Ecologists often uses models to create

More information

Business Strategies in Credit Rating and the Control of Misclassification Costs in Neural Network Predictions

Business Strategies in Credit Rating and the Control of Misclassification Costs in Neural Network Predictions Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2001 Proceedings Americas Conference on Information Systems (AMCIS) December 2001 Business Strategies in Credit Rating and the Control

More information

The Use of Artificial Neural Network for Forecasting of FTSE Bursa Malaysia KLCI Stock Price Index

The Use of Artificial Neural Network for Forecasting of FTSE Bursa Malaysia KLCI Stock Price Index The Use of Artificial Neural Network for Forecasting of FTSE Bursa Malaysia KLCI Stock Price Index Soleh Ardiansyah 1, Mazlina Abdul Majid 2, JasniMohamad Zain 2 Faculty of Computer System and Software

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Scott Creel Wednesday, September 10, 2014 This exercise extends the prior material on using the lm() function to fit an OLS regression and test hypotheses about effects on a parameter.

More information

International Journal of Research in Engineering Technology - Volume 2 Issue 5, July - August 2017

International Journal of Research in Engineering Technology - Volume 2 Issue 5, July - August 2017 RESEARCH ARTICLE OPEN ACCESS The technical indicator Z-core as a forecasting input for neural networks in the Dutch stock market Gerardo Alfonso Department of automation and systems engineering, University

More information

Gamma Distribution Fitting

Gamma Distribution Fitting Chapter 552 Gamma Distribution Fitting Introduction This module fits the gamma probability distributions to a complete or censored set of individual or grouped data values. It outputs various statistics

More information

Introduction to the Maximum Likelihood Estimation Technique. September 24, 2015

Introduction to the Maximum Likelihood Estimation Technique. September 24, 2015 Introduction to the Maximum Likelihood Estimation Technique September 24, 2015 So far our Dependent Variable is Continuous That is, our outcome variable Y is assumed to follow a normal distribution having

More information

Iran s Stock Market Prediction By Neural Networks and GA

Iran s Stock Market Prediction By Neural Networks and GA Iran s Stock Market Prediction By Neural Networks and GA Mahmood Khatibi MS. in Control Engineering mahmood.khatibi@gmail.com Habib Rajabi Mashhadi Associate Professor h_mashhadi@ferdowsi.um.ac.ir Electrical

More information

Backpropagation and Recurrent Neural Networks in Financial Analysis of Multiple Stock Market Returns

Backpropagation and Recurrent Neural Networks in Financial Analysis of Multiple Stock Market Returns Backpropagation and Recurrent Neural Networks in Financial Analysis of Multiple Stock Market Returns Jovina Roman and Akhtar Jameel Department of Computer Science Xavier University of Louisiana 7325 Palmetto

More information

A Novel Prediction Method for Stock Index Applying Grey Theory and Neural Networks

A Novel Prediction Method for Stock Index Applying Grey Theory and Neural Networks The 7th International Symposium on Operations Research and Its Applications (ISORA 08) Lijiang, China, October 31 Novemver 3, 2008 Copyright 2008 ORSC & APORC, pp. 104 111 A Novel Prediction Method for

More information

STOCHASTIC COST ESTIMATION AND RISK ANALYSIS IN MANAGING SOFTWARE PROJECTS

STOCHASTIC COST ESTIMATION AND RISK ANALYSIS IN MANAGING SOFTWARE PROJECTS Full citation: Connor, A.M., & MacDonell, S.G. (25) Stochastic cost estimation and risk analysis in managing software projects, in Proceedings of the ISCA 14th International Conference on Intelligent and

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #6 EPSY 905: Maximum Likelihood In This Lecture The basics of maximum likelihood estimation Ø The engine that

More information

A NEW POINT ESTIMATOR FOR THE MEDIAN OF GAMMA DISTRIBUTION

A NEW POINT ESTIMATOR FOR THE MEDIAN OF GAMMA DISTRIBUTION Banneheka, B.M.S.G., Ekanayake, G.E.M.U.P.D. Viyodaya Journal of Science, 009. Vol 4. pp. 95-03 A NEW POINT ESTIMATOR FOR THE MEDIAN OF GAMMA DISTRIBUTION B.M.S.G. Banneheka Department of Statistics and

More information

Web Appendix. Are the effects of monetary policy shocks big or small? Olivier Coibion

Web Appendix. Are the effects of monetary policy shocks big or small? Olivier Coibion Web Appendix Are the effects of monetary policy shocks big or small? Olivier Coibion Appendix 1: Description of the Model-Averaging Procedure This section describes the model-averaging procedure used in

More information

The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2010, Mr. Ruey S. Tsay Solutions to Final Exam

The University of Chicago, Booth School of Business Business 41202, Spring Quarter 2010, Mr. Ruey S. Tsay Solutions to Final Exam The University of Chicago, Booth School of Business Business 410, Spring Quarter 010, Mr. Ruey S. Tsay Solutions to Final Exam Problem A: (4 pts) Answer briefly the following questions. 1. Questions 1

More information

Final Exam - section 1. Thursday, December hours, 30 minutes

Final Exam - section 1. Thursday, December hours, 30 minutes Econometrics, ECON312 San Francisco State University Michael Bar Fall 2013 Final Exam - section 1 Thursday, December 19 1 hours, 30 minutes Name: Instructions 1. This is closed book, closed notes exam.

More information

Chapter 2 Uncertainty Analysis and Sampling Techniques

Chapter 2 Uncertainty Analysis and Sampling Techniques Chapter 2 Uncertainty Analysis and Sampling Techniques The probabilistic or stochastic modeling (Fig. 2.) iterative loop in the stochastic optimization procedure (Fig..4 in Chap. ) involves:. Specifying

More information

Econometric Methods for Valuation Analysis

Econometric Methods for Valuation Analysis Econometric Methods for Valuation Analysis Margarita Genius Dept of Economics M. Genius (Univ. of Crete) Econometric Methods for Valuation Analysis Cagliari, 2017 1 / 25 Outline We will consider econometric

More information

Journal of Economics and Financial Analysis, Vol:1, No:1 (2017) 1-13

Journal of Economics and Financial Analysis, Vol:1, No:1 (2017) 1-13 Journal of Economics and Financial Analysis, Vol:1, No:1 (2017) 1-13 Journal of Economics and Financial Analysis Type: Double Blind Peer Reviewed Scientific Journal Printed ISSN: 2521-6627 Online ISSN:

More information

Final Exam Suggested Solutions

Final Exam Suggested Solutions University of Washington Fall 003 Department of Economics Eric Zivot Economics 483 Final Exam Suggested Solutions This is a closed book and closed note exam. However, you are allowed one page of handwritten

More information

Stock Price and Index Forecasting by Arbitrage Pricing Theory-Based Gaussian TFA Learning

Stock Price and Index Forecasting by Arbitrage Pricing Theory-Based Gaussian TFA Learning Stock Price and Index Forecasting by Arbitrage Pricing Theory-Based Gaussian TFA Learning Kai Chun Chiu and Lei Xu Department of Computer Science and Engineering The Chinese University of Hong Kong, Shatin,

More information

International Journal of Computer Engineering and Applications, Volume XII, Issue II, Feb. 18, ISSN

International Journal of Computer Engineering and Applications, Volume XII, Issue II, Feb. 18,   ISSN Volume XII, Issue II, Feb. 18, www.ijcea.com ISSN 31-3469 AN INVESTIGATION OF FINANCIAL TIME SERIES PREDICTION USING BACK PROPAGATION NEURAL NETWORKS K. Jayanthi, Dr. K. Suresh 1 Department of Computer

More information

Subject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018

Subject CS1 Actuarial Statistics 1 Core Principles. Syllabus. for the 2019 exams. 1 June 2018 ` Subject CS1 Actuarial Statistics 1 Core Principles Syllabus for the 2019 exams 1 June 2018 Copyright in this Core Reading is the property of the Institute and Faculty of Actuaries who are the sole distributors.

More information

Dividend Strategies for Insurance risk models

Dividend Strategies for Insurance risk models 1 Introduction Based on different objectives, various insurance risk models with adaptive polices have been proposed, such as dividend model, tax model, model with credibility premium, and so on. In this

More information

Consistent estimators for multilevel generalised linear models using an iterated bootstrap

Consistent estimators for multilevel generalised linear models using an iterated bootstrap Multilevel Models Project Working Paper December, 98 Consistent estimators for multilevel generalised linear models using an iterated bootstrap by Harvey Goldstein hgoldstn@ioe.ac.uk Introduction Several

More information

AN AGENT BASED ESTIMATION METHOD OF HOUSEHOLD MICRO-DATA INCLUDING HOUSING INFORMATION FOR THE BASE YEAR IN LAND-USE MICROSIMULATION

AN AGENT BASED ESTIMATION METHOD OF HOUSEHOLD MICRO-DATA INCLUDING HOUSING INFORMATION FOR THE BASE YEAR IN LAND-USE MICROSIMULATION AN AGENT BASED ESTIMATION METHOD OF HOUSEHOLD MICRO-DATA INCLUDING HOUSING INFORMATION FOR THE BASE YEAR IN LAND-USE MICROSIMULATION Kazuaki Miyamoto, Tokyo City University, Japan Nao Sugiki, Docon Co.,

More information

Monte-Carlo Methods in Financial Engineering

Monte-Carlo Methods in Financial Engineering Monte-Carlo Methods in Financial Engineering Universität zu Köln May 12, 2017 Outline Table of Contents 1 Introduction 2 Repetition Definitions Least-Squares Method 3 Derivation Mathematical Derivation

More information

Homework Assignment Section 3

Homework Assignment Section 3 Homework Assignment Section 3 Tengyuan Liang Business Statistics Booth School of Business Problem 1 A company sets different prices for a particular stereo system in eight different regions of the country.

More information

XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING

XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING XLSTAT TIP SHEET FOR BUSINESS STATISTICS CENGAGE LEARNING INTRODUCTION XLSTAT makes accessible to anyone a powerful, complete and user-friendly data analysis and statistical solution. Accessibility to

More information

Ordinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013

Ordinal Multinomial Logistic Regression. Thom M. Suhy Southern Methodist University May14th, 2013 Ordinal Multinomial Logistic Thom M. Suhy Southern Methodist University May14th, 2013 GLM Generalized Linear Model (GLM) Framework for statistical analysis (Gelman and Hill, 2007, p. 135) Linear Continuous

More information

Chapter 5. Continuous Random Variables and Probability Distributions. 5.1 Continuous Random Variables

Chapter 5. Continuous Random Variables and Probability Distributions. 5.1 Continuous Random Variables Chapter 5 Continuous Random Variables and Probability Distributions 5.1 Continuous Random Variables 1 2CHAPTER 5. CONTINUOUS RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS Probability Distributions Probability

More information

Modelling the potential human capital on the labor market using logistic regression in R

Modelling the potential human capital on the labor market using logistic regression in R Modelling the potential human capital on the labor market using logistic regression in R Ana-Maria Ciuhu (dobre.anamaria@hotmail.com) Institute of National Economy, Romanian Academy; National Institute

More information

An ex-post analysis of Italian fiscal policy on renovation

An ex-post analysis of Italian fiscal policy on renovation An ex-post analysis of Italian fiscal policy on renovation Marco Manzo, Daniela Tellone VERY FIRST DRAFT, PLEASE DO NOT CITE June 9 th 2017 Abstract In June 2012, the share of dwellings renovation costs

More information

Modeling Private Firm Default: PFirm

Modeling Private Firm Default: PFirm Modeling Private Firm Default: PFirm Grigoris Karakoulas Business Analytic Solutions May 30 th, 2002 Outline Problem Statement Modelling Approaches Private Firm Data Mining Model Development Model Evaluation

More information

Models of Patterns. Lecture 3, SMMD 2005 Bob Stine

Models of Patterns. Lecture 3, SMMD 2005 Bob Stine Models of Patterns Lecture 3, SMMD 2005 Bob Stine Review Speculative investing and portfolios Risk and variance Volatility adjusted return Volatility drag Dependence Covariance Review Example Stock and

More information

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley.

Copyright 2011 Pearson Education, Inc. Publishing as Addison-Wesley. Appendix: Statistics in Action Part I Financial Time Series 1. These data show the effects of stock splits. If you investigate further, you ll find that most of these splits (such as in May 1970) are 3-for-1

More information

DATABASE AND RESEARCH METHODOLOGY

DATABASE AND RESEARCH METHODOLOGY CHAPTER III DATABASE AND RESEARCH METHODOLOGY The nature of the present study Direct Tax Reforms in India: A Comparative Study of Pre and Post-liberalization periods is such that it requires secondary

More information

The Impact of Financial Parameters on Agricultural Cooperative and Investor-Owned Firm Performance in Greece

The Impact of Financial Parameters on Agricultural Cooperative and Investor-Owned Firm Performance in Greece The Impact of Financial Parameters on Agricultural Cooperative and Investor-Owned Firm Performance in Greece Panagiota Sergaki and Anastasios Semos Aristotle University of Thessaloniki Abstract. This paper

More information

Module 4 Bivariate Regressions

Module 4 Bivariate Regressions AGRODEP Stata Training April 2013 Module 4 Bivariate Regressions Manuel Barron 1 and Pia Basurto 2 1 University of California, Berkeley, Department of Agricultural and Resource Economics 2 University of

More information

COMPARING NEURAL NETWORK AND REGRESSION MODELS IN ASSET PRICING MODEL WITH HETEROGENEOUS BELIEFS

COMPARING NEURAL NETWORK AND REGRESSION MODELS IN ASSET PRICING MODEL WITH HETEROGENEOUS BELIEFS Akademie ved Leske republiky Ustav teorie informace a automatizace Academy of Sciences of the Czech Republic Institute of Information Theory and Automation RESEARCH REPORT JIRI KRTEK COMPARING NEURAL NETWORK

More information

Modeling customer revolving credit scoring using logistic regression, survival analysis and neural networks

Modeling customer revolving credit scoring using logistic regression, survival analysis and neural networks Modeling customer revolving credit scoring using logistic regression, survival analysis and neural networks NATASA SARLIJA a, MIRTA BENSIC b, MARIJANA ZEKIC-SUSAC c a Faculty of Economics, J.J.Strossmayer

More information

Public Opinion about the Pension Reform in Albania

Public Opinion about the Pension Reform in Albania EUROPEAN ACADEMIC RESEARCH Vol. II, Issue 4/ July 2014 ISSN 2286-4822 www.euacademic.org Impact Factor: 3.1 (UIF) DRJI Value: 5.9 (B+) Public Opinion about the Pension Reform in Albania AIDA GUXHO Faculty

More information

Jaime Frade Dr. Niu Interest rate modeling

Jaime Frade Dr. Niu Interest rate modeling Interest rate modeling Abstract In this paper, three models were used to forecast short term interest rates for the 3 month LIBOR. Each of the models, regression time series, GARCH, and Cox, Ingersoll,

More information

Smooth estimation of yield curves by Laguerre functions

Smooth estimation of yield curves by Laguerre functions Smooth estimation of yield curves by Laguerre functions A.S. Hurn 1, K.A. Lindsay 2 and V. Pavlov 1 1 School of Economics and Finance, Queensland University of Technology 2 Department of Mathematics, University

More information

MCMC Package Example

MCMC Package Example MCMC Package Example Charles J. Geyer April 4, 2005 This is an example of using the mcmc package in R. The problem comes from a take-home question on a (take-home) PhD qualifying exam (School of Statistics,

More information

APPLICATION OF ARTIFICIAL NEURAL NETWORK SUPPORTING THE PROCESS OF PORTFOLIO MANAGEMENT IN TERMS OF TIME INVESTMENT ON THE WARSAW STOCK EXCHANGE

APPLICATION OF ARTIFICIAL NEURAL NETWORK SUPPORTING THE PROCESS OF PORTFOLIO MANAGEMENT IN TERMS OF TIME INVESTMENT ON THE WARSAW STOCK EXCHANGE QUANTITATIVE METHODS IN ECONOMICS Vol. XV, No. 2, 2014, pp. 307 316 APPLICATION OF ARTIFICIAL NEURAL NETWORK SUPPORTING THE PROCESS OF PORTFOLIO MANAGEMENT IN TERMS OF TIME INVESTMENT ON THE WARSAW STOCK

More information

Comparison of OLS and LAD regression techniques for estimating beta

Comparison of OLS and LAD regression techniques for estimating beta Comparison of OLS and LAD regression techniques for estimating beta 26 June 2013 Contents 1. Preparation of this report... 1 2. Executive summary... 2 3. Issue and evaluation approach... 4 4. Data... 6

More information

MODEL SELECTION CRITERIA IN R:

MODEL SELECTION CRITERIA IN R: 1. R 2 statistics We may use MODEL SELECTION CRITERIA IN R R 2 = SS R SS T = 1 SS Res SS T or R 2 Adj = 1 SS Res/(n p) SS T /(n 1) = 1 ( ) n 1 (1 R 2 ). n p where p is the total number of parameters. R

More information

SMALL AREA ESTIMATES OF INCOME: MEANS, MEDIANS

SMALL AREA ESTIMATES OF INCOME: MEANS, MEDIANS SMALL AREA ESTIMATES OF INCOME: MEANS, MEDIANS AND PERCENTILES Alison Whitworth (alison.whitworth@ons.gsi.gov.uk) (1), Kieran Martin (2), Cruddas, Christine Sexton, Alan Taylor Nikos Tzavidis (3), Marie

More information

Presented at the 2012 SCEA/ISPA Joint Annual Conference and Training Workshop -

Presented at the 2012 SCEA/ISPA Joint Annual Conference and Training Workshop - Applying the Pareto Principle to Distribution Assignment in Cost Risk and Uncertainty Analysis James Glenn, Computer Sciences Corporation Christian Smart, Missile Defense Agency Hetal Patel, Missile Defense

More information

The duration derby : a comparison of duration based strategies in asset liability management

The duration derby : a comparison of duration based strategies in asset liability management Edith Cowan University Research Online ECU Publications Pre. 2011 2001 The duration derby : a comparison of duration based strategies in asset liability management Harry Zheng David E. Allen Lyn C. Thomas

More information

Credit Risk Modeling Using Excel and VBA with DVD O. Gunter Loffler Peter N. Posch. WILEY A John Wiley and Sons, Ltd., Publication

Credit Risk Modeling Using Excel and VBA with DVD O. Gunter Loffler Peter N. Posch. WILEY A John Wiley and Sons, Ltd., Publication Credit Risk Modeling Using Excel and VBA with DVD O Gunter Loffler Peter N. Posch WILEY A John Wiley and Sons, Ltd., Publication Preface to the 2nd edition Preface to the 1st edition Some Hints for Troubleshooting

More information

Risk management methodology in Latvian economics

Risk management methodology in Latvian economics Risk management methodology in Latvian economics Dr.sc.ing. Irina Arhipova irina@cs.llu.lv Latvia University of Agriculture Faculty of Information Technologies, Liela street 2, Jelgava, LV-3001 Fax: +

More information

Exchange Rate Regime Classification with Structural Change Methods

Exchange Rate Regime Classification with Structural Change Methods Exchange Rate Regime Classification with Structural Change Methods Achim Zeileis Ajay Shah Ila Patnaik http://statmath.wu-wien.ac.at/ zeileis/ Overview Exchange rate regimes What is the new Chinese exchange

More information

The Vasicek adjustment to beta estimates in the Capital Asset Pricing Model

The Vasicek adjustment to beta estimates in the Capital Asset Pricing Model The Vasicek adjustment to beta estimates in the Capital Asset Pricing Model 17 June 2013 Contents 1. Preparation of this report... 1 2. Executive summary... 2 3. Issue and evaluation approach... 4 3.1.

More information

Some Characteristics of Data

Some Characteristics of Data Some Characteristics of Data Not all data is the same, and depending on some characteristics of a particular dataset, there are some limitations as to what can and cannot be done with that data. Some key

More information