Summary of Statistical Analysis Tools EDAD 5630

| Test Name | Program Used | Purpose | Steps | Main Uses/Applications in Schools |
|---|---|---|---|---|
| Principal Component Analysis | SPSS | Measure underlying constructs | See below | Examine latent constructs of an assessment |
| Reliability | SPSS | Measure reliability of a latent construct | See below | Determine how well individual items measure a latent construct |
| Item Response Theory | MPlus http://www.statmodel.com | Examine psychometric properties of assessments | See below | Examine test information curves, item difficulty, and item discrimination |
| Descriptive Measures | SPSS | Examine descriptive measures of individual variables | See below | Examine measures of central tendency among continuous variables, or frequencies and percentages among categorical variables |
| Correlation | SPSS | Determine the linear relation between two variables | See below | Examine the relation between two variables (e.g., TAKS Objective score to total test score) |
| Independent Sample T-Test | SPSS | Examine mean score difference between two independent groups | See below | Example: examine the mean difference between males and females or between two teachers |
| Paired Sample T-Test | SPSS | Examine change from one test to the next for the same individuals | See below | Example: determine if a statistically significant change has occurred in exams from the beginning of school to the end of school |
| One-Way ANOVA | SPSS | Examine mean differences between three or more groups | See below | Example: compare mean scores among race/ethnicity groups or three or more teachers |
| Growth Modeling | SPSS | Determine the amount of growth students make between assessments | See below | Determine the amount of growth students make between assessments |
| Financial Efficiency Frontier Analysis | | Determine financial efficiency among campuses | See below | Example: compare expenditures in relation to student achievement to determine whether the return on investment is adequate |

PRINCIPAL COMPONENT ANALYSIS

The main idea of this method is to form, from a set of existing variables, a new variable (or new variables, but as few as possible) that contains as much of the variability of the original data as possible. This is a method of data reduction; we reduce the number of variables in order to handle the data more easily. In most cases we wish to get only one dimension (variable) that contains most of the variability of the original data. This variable then represents some sort of index of a certain property that is measured by the original variables. For example:
- we are measuring the development of a region. We measure the differences with several variables (e.g. GDP/pc, infant mortality, ...). With the help of principal component analysis we can construct an index of development.
- a controller in a factory has several indicators of quality; with principal component analysis we can construct a quality index.

PRINCIPAL COMPONENT ANALYSIS WITH SPSS PROCEDURE FACTOR ANALYSIS

SPSS can perform principal component analysis, but the procedure for doing so is hidden within the procedure for factor analysis. The procedure can perform the analysis with standardized and original (non-standardized) data. With this procedure we can
- compute descriptive statistics for all variables
- make the correlation matrix
- compute communalities
- compute the share of variance of the original data explained by each component and by all components together
- plot the scree plot

COMPUTATION OF THE PARAMETERS OF PRINCIPAL COMPONENTS ANALYSIS

1. Enter or load the data.
2. Select Analyze > Dimension Reduction > Factor; we get the menu Factor Analysis (Figure 1).

3. In the left box we select the variables that we want to enter into the principal components analysis and transfer them into the right box.
4. Click Extraction...; we get the menu Factor Analysis: Extraction (Figure 2). The option for performing principal components analysis is Principal Components in the field Method. Other options in this field are for factor analysis.
5. We click OK, the window Factor Analysis closes and the results of the analysis appear in the Viewer window.
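As a rough illustration of what the procedure computes for standardized data, the sketch below builds a correlation matrix and extracts the first principal component by power iteration. This is only a sketch under stated assumptions: it is not how SPSS implements the analysis, and the function names and the three regional indicator columns are made up for illustration. For standardized data the eigenvalues sum to the number of variables, so the share of variance explained by a component is its eigenvalue divided by that number.

```python
import math
from statistics import mean, stdev


def correlation_matrix(columns):
    """Pearson correlations between variables (one list of values per variable)."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        m, s = mean(col), stdev(col)
        standardized.append([(x - m) / s for x in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in standardized]
            for zi in standardized]


def first_principal_component(corr, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(corr)
    # Uneven start vector; a start exactly orthogonal to the answer would fail.
    vec = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in product))
        vec = [v / norm for v in product]
    # Rayleigh quotient v'Rv gives the eigenvalue for the unit vector v.
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, vec


# Hypothetical indicators for five regions (illustrative values only).
gdp_per_capita = [12.1, 18.4, 25.0, 31.2, 40.5]
infant_mortality = [9.8, 7.5, 6.1, 4.0, 3.2]
literacy_rate = [81.0, 88.0, 90.0, 95.0, 99.0]

corr = correlation_matrix([gdp_per_capita, infant_mortality, literacy_rate])
eigenvalue, loadings_direction = first_principal_component(corr)
share_of_variance = eigenvalue / len(corr)
print(f"first eigenvalue: {eigenvalue:.3f}, share of variance: {share_of_variance:.1%}")
```

Power iteration only recovers the largest component, which matches the common case described above where a single index is wanted; a full eigendecomposition would be needed for a scree plot.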

Figure 2: Dialog window Factor Analysis: Extraction

In the box Analyze we can set whether the analysis will be performed on original (non-standardized) data (Covariance matrix) or on standardized data (Correlation matrix). When the analysis is performed on original data, the importance of a variable is determined by the relative size of its variance: higher variance means higher importance of that variable. If we don't want the variability of a variable to determine its importance, we standardize the data and so use the correlation matrix. The decision which one to use depends on the nature of the problem. If we think the variables are more or less equally important, we decide for standardization; if the variability of a variable is of any importance, we use the covariance matrix in the analysis. When variables are measured on very different scales (e.g. infant mortality in % against GDP/pc in $), standardization is usually the only sensible choice.

Field Display offers the possibility of printing the unrotated solution (the only one in principal component analysis). The solution can contain only some components; the number of components is set by the rules in the field Extract. Field Display also sets the display of the scree plot. The scree plot is useful in determining the number of components needed.

In field Extract we set how many components we want to be displayed. We can set the number of components we want or set the cut-off eigenvalue. The default cut-off is 1 in the case of standardized data, or the average eigenvalue in the case of original data.

DESCRIPTIVE STATISTICS AND CORRELATION MATRICES

Click Descriptives, which opens the dialog window Factor Analysis: Descriptives (Figure 3).

Figure 3: Dialog window Factor Analysis: Descriptives

In this dialog we set:
- in field Statistics, the display of descriptive statistics and the initial solution (all components)
- in field Correlation Matrix, the display of the correlation matrix, significances, ...

The KMO (Kaiser-Meyer-Olkin) measure of sampling adequacy shows the strength of connection between variables; it can be between 0 and 1, and values closer to 1 are more desirable. Bartlett's test of sphericity tests the hypothesis that the correlation matrix is an identity matrix (the variables are not correlated). If that hypothesis cannot be rejected, principal component analysis should not be performed.

-----------------------------------------------------------------------------------------------------------------------------------------------

Reliability Example

A researcher has devised a nine-question questionnaire with which they hope to measure how safe people feel at work at an industrial complex. Each question was a 5-point Likert item from "strongly disagree" to "strongly agree". In order to understand whether the questions in this questionnaire all reliably measure the same latent variable (feeling of safety), so that a Likert scale could be constructed, a Cronbach's alpha was run on a sample of 15 workers.

Setup in SPSS

The nine questions have been labelled "Qu1" through to "Qu9". To know how to correctly enter your data into SPSS in order to run a Cronbach's alpha test please read our Entering Data into SPSS tutorial.

Test Procedure in SPSS

1. Click Analyze > Scale > Reliability Analysis... on the top menu.
2. You will be presented with the Reliability Analysis dialogue box.
3. Transfer the variables "Qu1" to "Qu9" into the "Items:" box. You can do this by dragging and dropping the variables into the box or by using the arrow button.
4. Leave the "Model:" set as "Alpha", which represents Cronbach's alpha in SPSS. If you want to provide a name for the scale, enter it in the "Scale label:" box. Since this only prints the name you enter at the top of the SPSS output, it is not essential that you do; in this case we will leave it blank.
5. Click on the Statistics button, which will present the Reliability Analysis: Statistics dialogue box.
6. Select "Item", "Scale" and "Scale if item deleted" in the "Descriptives for" box and "Correlations" in the "Inter-Item" box.
7. Click the Continue button. This will return you to the Reliability Analysis dialogue box.
8. Click the OK button to generate the output.

Published with written permission from SPSS Inc, an IBM Company.

SPSS Output for Cronbach's Alpha

SPSS produces many different tables. The first important table is the Reliability Statistics table, which provides the actual value for Cronbach's alpha. In our example, Cronbach's alpha is 0.805, which indicates a high level of internal consistency for our scale with this specific sample.

Item-Total Statistics

The Item-Total Statistics table presents the Cronbach's Alpha if Item Deleted in its final column.
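Both figures can be approximated outside SPSS. The sketch below assumes the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), and recomputes alpha with each item left out in turn. The three items and six respondents are invented for illustration; they are not the study's data, and the function names are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per question, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item removed in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Hypothetical 5-point Likert responses (illustrative values only).
qu1 = [4, 5, 3, 4, 2, 5]
qu2 = [4, 4, 3, 5, 2, 5]
qu3 = [3, 5, 2, 4, 1, 4]

alpha = cronbach_alpha([qu1, qu2, qu3])
print(f"alpha = {alpha:.3f}")
for name, value in zip(["Qu1", "Qu2", "Qu3"], alpha_if_item_deleted([qu1, qu2, qu3])):
    print(f"alpha if {name} deleted = {value:.3f}")
```

The per-item values printed by the loop correspond to the Cronbach's Alpha if Item Deleted column of the Item-Total Statistics table.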

This column presents the value that Cronbach's alpha would be if that particular item were deleted from the scale. We can see that removal of any question except question 8 would result in a lower Cronbach's alpha. Therefore, we would not want to remove these questions. Removal of question 8 would lead to a small improvement in Cronbach's alpha, and we can also see that the Corrected Item-Total Correlation value was low (0.128) for this item. This might lead us to consider whether we should remove this item.

Cronbach's alpha simply provides you with an overall reliability coefficient for a set of variables, e.g. questions. If your questions reflect different underlying personal qualities (or other dimensions), for example, employee motivation and employee commitment, then Cronbach's alpha will not be able to distinguish between these. In order to do this and then check their reliability (using Cronbach's alpha), you will first need to run a test such as a principal components analysis (PCA).

---------------------------------------------------------------------------------------------------------------------------------------------------------

IRT Syntax:

TITLE: this is an example of a two-parameter logistic item response theory (IRT) model
DATA: FILE IS ex5.5.dat;
VARIABLE: NAMES ARE u1-u20;
  CATEGORICAL ARE u1-u20;
ANALYSIS: ESTIMATOR = MLR;
MODEL: f BY u1-u20*;
  f@1;
OUTPUT: TECH1 TECH8;
PLOT: TYPE = PLOT3;

---------------------------------------------------------------------------------------------------------------------------------------------------------
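To make the item difficulty, item discrimination, and test information mentioned for IRT more concrete, here is a minimal sketch of the conventional two-parameter logistic model: P(correct) = 1 / (1 + exp(-a(theta - b))), with item information a^2 * P * (1 - P). The a (discrimination) and b (difficulty) values are hypothetical and are not output from the Mplus run above; how Mplus labels and scales its estimates should be checked against its own documentation.

```python
import math


def probability_correct(theta, a, b):
    """2PL item characteristic curve: chance of a correct answer at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Information one item contributes at ability theta."""
    p = probability_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def total_information(theta, items):
    """Test information: the sum of item information over all items."""
    return sum(item_information(theta, a, b) for a, b in items)


# Hypothetical (discrimination, difficulty) pairs for three items.
example_items = [(1.2, -0.5), (0.8, 0.0), (1.5, 1.0)]

for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    info = total_information(theta, example_items)
    print(f"theta = {theta:+.1f}  test information = {info:.3f}")
```

An item is most informative near its own difficulty, so plotting these values over a range of theta gives a rough picture of where the test measures most precisely.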

Descriptive Measures

SPSS descriptive statistics are designed to give you information about the distributions of your variables. SPSS allows you to complete a number of statistical procedures including: measures of central tendency, measures of variability around the mean, measures of deviation from normality, and information concerning the spread of the distribution.

For the following instructions:
* = A single click of the left mouse button
** = A double-click of the left mouse button

After opening the file you desire to use, *Analyze, *Descriptive Statistics, *Descriptives. Select the variables for which you wish to compute descriptives by clicking the desired variable name in the box to the left and then pasting it into the Variables box to the right by clicking the right arrow in the middle of the screen.

If you want to calculate more than four statistics, after selecting the desired variables (and before *OK), *Options. To select the desired descriptive statistics, * on the box next to the procedure you wish to have completed. Under Descriptives: Options, you can choose a number of statistics; * on the box next to an option to have SPSS compute it.

Some additional SPSS features include the Display Order options, also under Descriptives: Options (again by * on the circle next to the option):
- Variable list: This is the default for this option; it arranges the items in the same order as found in the data editor.
- Alphabetic: Names of variables are arranged alphabetically.
- Ascending means: This orders the means from smallest mean value to largest mean value in the output.
- Descending means: This orders the means from largest mean value to smallest mean value in the output.

Kinds of descriptive statistics that SPSS provides

Measures of Central Tendency

Central tendency measures give an estimate of how a group did as a whole.
- Mean: the average value of the distribution
- Median: the middle value of the distribution
- Mode: the most frequently occurring value

***Note that to calculate both median and mode of your distribution, you need to *Analyze, *Descriptive Statistics, and then *Frequencies. Then * on the boxes of Median and/or Mode under "Central Tendency."

***Note also that percentiles and quartiles are done under Frequencies too.
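For readers who want to check these three measures by hand, the following short sketch uses Python's standard statistics module. The test scores are made up for illustration, and the function name is hypothetical.

```python
from statistics import mean, median, multimode


def central_tendency(scores):
    """Mean, median, and all modes of a list of scores."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "modes": multimode(scores),  # every value tied for most frequent
    }


# Hypothetical test scores for one class.
scores = [72, 85, 85, 90, 64, 78, 85, 90]
print(central_tendency(scores))
```

When several values tie for most frequent, this sketch lists all of them, so compare with care if a package reports only a single mode.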

Measures of Variability

Variability provides an estimate of how much scores within a group of scores varied. In SPSS these measures can be found under the "Analyze", "Descriptive Statistics" menus in either the Descriptives or Frequencies options.

Measures for the size of the distribution:
- Maximum: largest value in the distribution
- Minimum: smallest value in the distribution
- Range of values in the distribution
- Sum of the scores in the distribution

Measures of stability:
- Standard error: designed to be a measure of stability or of sampling error. SPSS computes the SE for the mean, the kurtosis, and the skewness. A small value indicates greater stability or smaller sampling error.

Measures of the shape of the distribution (measures of the deviation from normality):
- Kurtosis: a measure of the "peakedness" or "flatness" of a distribution. A kurtosis value near zero indicates a shape close to normal. A positive value indicates a distribution which is more peaked than normal, and a negative kurtosis indicates a shape flatter than normal. An extreme positive kurtosis indicates a distribution where more of the values are located in the tails of the distribution rather than around the mean. A kurtosis value within +/-1 is considered very good for most psychometric uses, but +/-2 is also usually acceptable.
- Skewness: the extent to which a distribution of values deviates from symmetry around the mean. A value of zero means the distribution is symmetric, while a positive skewness indicates a greater number of smaller values, and a negative value indicates a greater number of larger values. Values for acceptability for psychometric purposes (+/-1 to +/-2) are the same as with kurtosis.

Normal Distributions

No skew (lopsidedness of the distribution)
- mean > median = positive skew
- mean < median = negative skew

No kurtosis (peakedness or flatness)
- negative value (very flat) is undesirable
- positive value (very pointed) is also undesirable

Skew and Kurtosis in SPSS
- Choose Statistics, Descriptives
- Choose "Options"
- Select skew and kurtosis

Interpretation of Skew and Kurtosis Output
- Divide Skew by SE Skew and divide Kurtosis by SE Kurtosis
- Values of 2 or more (in absolute value) suggest skew or kurtosis

Viewing Normality of Distribution
- Choose Charts, Histogram
- Enter variable
- Check "Display normal curve"

Creating Standard Scores
- A z-score is a standard score obtained by subtracting the mean from a score and dividing by the standard deviation
- In SPSS, Compute a new variable
- Or, choose Descriptives and "save standardized values as variables".
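The ratio rule and the z-score definition above can be sketched in a few lines. The formulas used here are the usual bias-adjusted sample skewness and excess kurtosis with their standard errors; treating them as the ones behind the SPSS output is an assumption to verify against the SPSS documentation. The scores and function names are illustrative only.

```python
import math
from statistics import mean, stdev


def skewness(x):
    """Bias-adjusted sample skewness."""
    n, m, s = len(x), mean(x), stdev(x)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in x)


def kurtosis(x):
    """Bias-adjusted sample excess kurtosis (0 for a normal shape)."""
    n, m, s = len(x), mean(x), stdev(x)
    fourth = sum(((v - m) / s) ** 4 for v in x)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def se_skewness(n):
    """Standard error of skewness; depends only on the sample size."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of kurtosis; depends only on the sample size."""
    return 2 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


def z_scores(x):
    """Standard scores: (score - mean) / standard deviation."""
    m, s = mean(x), stdev(x)
    return [(v - m) / s for v in x]


# Hypothetical scores with one high outlier.
scores = [55, 60, 62, 65, 66, 68, 70, 71, 73, 98]
n = len(scores)
print("skew / SE skew:", round(skewness(scores) / se_skewness(n), 2))
print("kurtosis / SE kurtosis:", round(kurtosis(scores) / se_kurtosis(n), 2))
print("z-scores:", [round(z, 2) for z in z_scores(scores)])
```

A ratio of 2 or more in absolute value would flag the variable under the rule of thumb given above.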

Comparing Means

Comparing means allows us to look at differences between groups of participants.
- Choose Statistics, Compare Means, Means
- Continuous variables go in Dependent List
- Grouping variable goes in Independent List
- Under "Options," choose statistics
- Enter second categorical variable as layer

Plotting Group Means
- Choose Graphs, Bar, Simple, Define
- Choose "Other summary function" and enter dependent variable
- Independent variable is "Category Axis"
- Under "Options," uncheck "Display groups defined by missing values"
- A Clustered Chart can be used for a second grouping variable

----------------------------------------------------------------------------------------------------------------------------------------------------
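The group comparison described under Comparing Means amounts to averaging the dependent variable within each level of the grouping variable. A minimal sketch, with invented data and a hypothetical function name:

```python
from collections import defaultdict
from statistics import mean


def group_means(groups, values):
    """Mean of the dependent variable for each level of the grouping variable."""
    buckets = defaultdict(list)
    for group, value in zip(groups, values):
        buckets[group].append(value)
    return {group: mean(vals) for group, vals in buckets.items()}


# Hypothetical data: teacher for each student and that student's test score.
teacher = ["A", "A", "B", "B", "B", "C"]
score = [80, 90, 70, 75, 85, 95]
print(group_means(teacher, score))
```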

Correlation

Bivariate correlation can be used to determine if two variables are linearly related to each other. Remember that you will want to create a scatter plot before performing the correlation (to see if the assumptions have been met). The command for correlation is found at Analyze > Correlate > Bivariate (this is shorthand for clicking on the Analyze menu item at the top of the window, and then clicking on Correlate from the drop-down menu, and Bivariate from the pop-up menu).

The Bivariate Correlations dialog box will appear. Select one of the variables that you want to correlate by clicking on it in the left-hand pane of the Bivariate Correlations dialog box. Then click on the arrow button to move the variable into the Variables pane. Click on the other variable that you want to correlate in the left-hand pane and move it into the Variables pane by clicking on the arrow button.

Specify whether the test of significance should be one-tailed or two-tailed. You can click on the Options button to have some descriptive statistics calculated. The Options dialog box will appear. From the Options dialog box, click on "Means and standard deviations" to get some common descriptive statistics. Click on the Continue button in the Options dialog box. Click on OK in the Bivariate Correlations dialog box. The SPSS Output Viewer will appear.

In the SPSS Output Viewer, you will see a table with the requested descriptive statistics and correlations.
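For reference, the Pearson correlation in that output table can be reproduced by hand. The sketch below computes r and the t statistic used to test it against zero (with n - 2 degrees of freedom); the p-value itself is left to SPSS or a t table. The variable names and scores are hypothetical.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic_for_r(r, n):
    """t statistic for testing r against zero; degrees of freedom are n - 2."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical objective scores and total test scores for six students.
objective_score = [3, 5, 4, 6, 8, 7]
total_score = [41, 52, 50, 58, 71, 65]

r = pearson_r(objective_score, total_score)
t = t_statistic_for_r(r, len(objective_score))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(objective_score) - 2}")
```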

Independent Samples T-Test

1. Click Analyze > Compare Means > Independent-Samples T Test... on the top menu.
2. Put the "Cholesterol Concentration" variable into the "Test Variable(s):" box and the "Treatment" variable into the "Grouping Variable:" box by highlighting the relevant variables and pressing the arrow buttons.
3. You then need to define the groups (treatments). Press the Define Groups... button.
4. Enter "1" into the "Group 1:" box and enter "2" into the "Group 2:" box. Remember that we labelled the Diet Treatment group as "1" and the Exercise Treatment group as "2". If you have more than 2 treatment groups, e.g. a diet, exercise and drug treatment group, then you could type "1" into the "Group 1:" box and "3" into the "Group 2:" box if you wished to compare the diet with the drug treatment.
5. Press the Continue button.
6. If you need to change the confidence level limits, or change how to exclude cases, then press the Options... button.
7. Click the Continue button.
8. Click the OK button.

----------------------------------------------------------------------------------------------------------------------------------------------------------
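The test statistic behind the independent-samples procedure above, and behind the paired-samples procedure in the next section, can be sketched as follows. The independent version assumes equal variances (pooled variance); the paired version works on within-person differences. Only t and the degrees of freedom are computed here, with the p-value left to SPSS or a t table. All values and function names are illustrative.

```python
import math
from statistics import mean, variance


def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2


def paired_t(first, second):
    """t statistic and degrees of freedom for two measurements on the same individuals."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = math.sqrt(variance(differences) / n)
    return mean(differences) / standard_error, n - 1


# Hypothetical cholesterol concentrations for two treatment groups.
diet = [6.1, 5.8, 6.4, 5.9, 6.3]
exercise = [5.5, 5.7, 5.2, 5.9, 5.4]
print("independent:", independent_t(diet, exercise))

# Hypothetical beginning-of-year and end-of-year scores for the same students.
fall = [61, 70, 58, 66, 73]
spring = [68, 74, 63, 65, 80]
print("paired:", paired_t(fall, spring))
```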

Paired Samples T-Test

The command for the paired samples t test is found at Analyze > Compare Means > Paired-Samples T Test (this is shorthand for clicking on the Analyze menu item at the top of the window, and then clicking on Compare Means from the drop-down menu, and Paired-Samples T Test from the pop-up menu).

The Paired-Samples t Test dialog box will appear. You must select a pair of variables that represent the two conditions. Click on one of the variables in the left-hand pane of the Paired-Samples t Test dialog box. Then click on the other variable in the left-hand pane. Click on the arrow button to move the variables into the Paired Variables pane. In this example, select the Older and Younger variables (number of older and younger siblings) and then click on the arrow button to move the pair into the Paired Variables box.

Click on the OK button in the Paired-Samples t Test dialog box to perform the t-test. The output viewer will appear with the results of the t test.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------

One-Way Anova

SPSS assumes that the independent variable (technically a quasi-independent variable in this case) is represented numerically. In the sample data set, MAJOR is a string. So we must first convert MAJOR from a string variable to a numerical variable. See the tutorial on transforming a variable to learn how to do this. We need to automatically recode the MAJOR variable into a variable called MAJORNUM.
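Conceptually, the automatic recode replaces each distinct string with a whole number. The sketch below assumes the codes are assigned in alphabetical order starting at 1, which is an assumption to confirm in SPSS; the major names and the function name are made up for illustration.

```python
def autorecode(values):
    """Map each distinct string to a consecutive integer code, alphabetically."""
    codes = {label: code for code, label in enumerate(sorted(set(values)), start=1)}
    return [codes[v] for v in values], codes


# Hypothetical MAJOR values for five students.
major = ["psychology", "art", "psychology", "biology", "art"]
majornum, code_book = autorecode(major)
print(majornum)
print(code_book)
```

Keeping the code book alongside the recoded values makes it possible to label the groups again when reading the ANOVA output.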

Once you have recoded the independent variable, you are ready to perform the ANOVA. Click on Analyze > Compare Means > One-Way ANOVA.

The One-Way ANOVA dialog box appears. In the list at the left, click on the variable that corresponds to your dependent variable (the one that was measured). Move it into the Dependent List by clicking on the upper arrow button. In this example, the GPA is the variable that we recorded, so we click on it and the upper arrow button.

Now select the (quasi) independent variable from the list at the left and click on it. Move it into the Factor box by clicking on the lower arrow button. In this example, the quasi-independent variable is the recoded variable from above, MAJORNUM.

Click on the Post Hoc button to specify the type of multiple comparison that you would like to perform. The Post Hoc dialog box appears:

Consult your statistics textbook to decide which post-hoc test is appropriate for you. In this example, I will use a conservative post-hoc test, the Tukey test. Click in the check box next to Tukey (not Tukey's-b).

Click on the Continue button to return to the One-Way ANOVA dialog box. Click on the Options button in the One-Way ANOVA dialog box. The One-Way ANOVA Options dialog box appears.

Click in the check box to the left of Descriptive (to get descriptive statistics), Homogeneity of Variance (to get a test of the assumption of homogeneity of variance) and Means plot (to get a graph of the means of the conditions). Click on the Continue button to return to the One-Way ANOVA dialog box. In the One-Way ANOVA dialog box, click on the OK button to perform the analysis of variance. The SPSS output window will appear.

------------------------------------------------------------------------------------------------------------------------------------------------------------------
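The F ratio reported in the ANOVA output is the between-groups mean square divided by the within-groups mean square. A minimal sketch of that calculation follows; the GPA values for the three majors are invented, the function name is hypothetical, and the p-value and Tukey comparisons are left to SPSS.

```python
from statistics import mean


def one_way_anova(groups):
    """F statistic with between-groups and within-groups degrees of freedom."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical GPAs grouped by the recoded major (1, 2, 3).
gpa_by_major = [
    [3.1, 3.4, 2.9, 3.6],
    [2.7, 3.0, 2.8, 3.2],
    [3.5, 3.8, 3.3, 3.9],
]
f_ratio, df_between, df_within = one_way_anova(gpa_by_major)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```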