A test for balanced coverage across cases and controls as a qualifying criterion in collapsing analysis.
- Frank Cannon
- 6 years ago
Background and Motivation:
Collapsing analyses test the association of qualifying rare variants in defined genomic regions (e.g., all consensus coding sequence, or CCDS, boundaries) with a disease phenotype. Qualifying criteria for variants in these analyses include variant quality (e.g., read depth, genotype quality), variant functional prediction (e.g., protein-coding change) and population frequency (minor allele frequency in ExAC or control datasets). However, before these analyses can be carried out, it is essential to control and minimize signal artifacts arising from differences in sequencing coverage between cases and controls.
In the current IGM collapsing analysis framework, the penultimate step before the collapsing analysis on cases and controls is a site coverage harmonization (SCH) [1]. For each genomic site interrogated in the collapsing run (e.g., CCDS sites), we calculate the fraction of cases and the fraction of controls covered at a predetermined threshold coverage (e.g., 10X), and then the absolute difference in fractional coverage between cases and controls. We then calculate the mean absolute difference over all sites and subtract it from the absolute difference at each site to obtain the deviation from the mean difference, which is squared to define the variation value for that site. The resulting variation estimates across the million CCDS sites are sorted from largest to smallest and plotted as a cumulative sum of variation. The plot is then rotated by 45° to find its peak; in other words, (y - x) is plotted against x. The x value at which (y - x) is maximized gives the suggested cutoff index. Any site where the absolute fractional difference is above this threshold is then excluded from subsequent collapsing analysis.
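The SCH steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ATAV implementation: the function name `sch_prune` is hypothetical, both axes of the cumulative plot are normalized to [0, 1] so that (y - x) is well defined, and the threshold is taken to be the absolute difference of the site at the cutoff index, with sites at or above it pruned.

```python
from itertools import accumulate


def sch_prune(case_frac, ctrl_frac):
    """Return (threshold, pruned_site_indices) for per-site coverage fractions.

    case_frac[i] and ctrl_frac[i] are the fractions of cases and controls
    covered at the chosen depth (e.g. 10X) at site i.
    """
    # Absolute difference in fractional coverage at each site.
    abs_diff = [abs(c - t) for c, t in zip(case_frac, ctrl_frac)]
    mean_diff = sum(abs_diff) / len(abs_diff)

    # Squared deviation from the mean difference: the per-site "variation".
    variation = [(d - mean_diff) ** 2 for d in abs_diff]
    total = sum(variation)
    if total == 0:
        return None, []  # every site has the same difference; nothing to prune

    # Sort sites from largest to smallest variation and accumulate.
    order = sorted(range(len(variation)), key=lambda i: variation[i], reverse=True)
    cumulative = list(accumulate(variation[i] for i in order))

    # Assumption: x = fraction of sites seen, y = fraction of total variation.
    n = len(order)
    cutoff = max(range(n), key=lambda k: cumulative[k] / total - (k + 1) / n)

    # Assumption: the threshold is the absolute difference at the cutoff site.
    threshold = abs_diff[order[cutoff]]
    pruned = [i for i, d in enumerate(abs_diff) if d >= threshold]
    return threshold, pruned


# Toy data: site 3 is far less well covered in cases than in controls.
case_frac = [0.95, 0.90, 0.92, 0.50, 0.97, 0.93]
ctrl_frac = [0.96, 0.91, 0.93, 0.90, 0.96, 0.94]
threshold, pruned = sch_prune(case_frac, ctrl_frac)
```

Note that the whole computation depends on all sites at once (the mean, the sort and the cumulative sum), which is why it cannot be restricted to the sites that carry a qualifying variant.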
This method is effective at pruning sites where the fractional difference between cases and controls is large enough to induce biases in collapsing studies. However, because the absolute fractional coverage difference is not normalized to the cohort mean, we prune well-covered sites that have a marginally larger coverage difference than poorly covered sites with a smaller difference. For instance, at an absolute difference threshold of 0.05, a site with fractional coverage 0.89 in cases and 0.95 in controls will be pruned, while a site with fractional coverage 0.12 in cases and 0.17 in controls will be retained. Additionally, computing the fractional coverage difference across all sites adds substantial computational effort to the collapsing runs. In a typical rare variant collapsing run, only ~300K sites in the CCDS regions carry a qualifying variant. The CCDS comprises ~33M bases, so for every analysis the pre-computation of coverage balance represents a roughly 100-fold excess of computational load.
Methods:
To reduce the retention of poorly covered sites at the expense of highly covered sites, we impose a statistical test of independence between case/control status and coverage. At a given site, let x be the number of cases covered at 10X, y the number of controls covered at 10X, s the total number of cases and t the total number of controls. We can model the number of covered cases X as a binomial random variable:
$$X \sim \mathrm{Bin}(n,\ p), \qquad n = \text{number of covered samples}, \qquad p = P(\text{case} \mid \text{covered})$$
If case/control status and coverage status are independent, then:
$$P(\text{case} \mid \text{covered}) = P(\text{case}) = \frac{s}{s + t}$$
This allows us to perform a two-sided binomial test on the observed number of covered cases, x:
$$\mathrm{BinomTest}\left(k = x,\ n = x + y,\ p = \frac{s}{s + t}\right)$$
A binomial test as described above can be executed independently at each site, which enables parallelization at the computing level. This method also removes the need to pre-compute the fractional coverage difference at all CCDS sites in order to identify a threshold difference, as the SCH method requires. We can apply the binomial test of coverage bias as an additional qualifying criterion only at those sites where an otherwise qualifying variant is identified in a sample, resulting in a 100-fold decrease in computational burden.
Results:
We implemented a binomial test of independence between coverage and case/control status as an additional qualifying criterion in ATAV. We used two IGM cohorts to compare the binomial test method with the SCH method: (1) a chronic kidney disease (CKD) cohort with ~10,000 controls and ~1,700 cases, and (2) an idiopathic pulmonary fibrosis (IPF) cohort with ~4,000 controls and ~200 cases. For each cohort, we analyzed CCDS sites (S) using the SCH method and compiled a list of sites (S_SCH) that would be pruned before subsequent collapsing analysis. Independently, we performed the binomial coverage test described above at every CCDS site for the same cohort and identified sites (S_binom) with a nominal p-value below 0.05 to be pruned prior to collapsing analysis.
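The per-site binomial test used here can be sketched with the Python standard library alone. This is an illustrative version under assumptions, not the ATAV code: the function names are hypothetical, and the two-sided p-value is computed as the total probability of all outcomes that are no more likely than the observed one, which is one common definition of an exact two-sided binomial test.

```python
import math


def binom_logpmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))


def coverage_bias_pvalue(x, y, s, t):
    """Two-sided binomial p-value for coverage imbalance at one site.

    x, y: cases and controls covered at the depth threshold (e.g. 10X).
    s, t: total cases and total controls in the cohort (both > 0).
    """
    n = x + y
    if n == 0:
        return 1.0  # no covered samples, so no evidence of imbalance
    p = s / (s + t)  # P(case) under independence of status and coverage
    log_obs = binom_logpmf(x, n, p)
    total = 0.0
    for k in range(n + 1):
        log_k = binom_logpmf(k, n, p)
        # Small tolerance so outcomes tied with the observed one are included.
        if log_k <= log_obs + 1e-7:
            total += math.exp(log_k)
    return min(1.0, total)


# Hypothetical cohort sized like the CKD example: 1,700 cases, 10,000 controls.
s, t = 1700, 10000
p_balanced = coverage_bias_pvalue(1530, 9000, s, t)  # 90% covered in both groups
p_high_cov = coverage_bias_pvalue(1513, 9500, s, t)  # 0.89 of cases, 0.95 of controls
p_low_cov = coverage_bias_pvalue(204, 1700, s, t)    # 0.12 of cases, 0.17 of controls
```

With these hypothetical counts, the poorly covered site gives a much smaller p-value than the well-covered one even though its absolute coverage difference is smaller, because the test compares the case share among covered samples with the case share in the cohort. Each call uses only the counts for one site, so sites can be tested independently and in parallel.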
Finally, we executed a collapsing analysis on the cohort across all CCDS sites without any coverage analysis method (S_QV). S_QV represents the set of sites where a qualifying variant satisfying the typical qualifying criteria for variant quality, function and minor allele frequency is present in at least one sample. We then calculated:
Qualifying sites pruned by both methods = S_SCH ∩ S_binom
Qualifying sites uniquely pruned by the SCH method = S_SCH - S_binom
Qualifying sites uniquely pruned by the binomial test method = S_binom - S_SCH
| | CKD cohort (MAF ) | CKD cohort (MAF ) | IPF cohort |
| Sites pruned by both methods | | | |
| Sites pruned by SCH only | | | |
| Sites pruned by binom test only | | | |
Table 1. Sites pruned by coverage analysis methods.
For all analyses, we found that the SCH method pruned sites vastly in excess of those pruned by the binomial test method (S_SCH - S_binom >> S_binom - S_SCH). We then investigated the mean coverage of the pruned sites to evaluate the overall coverage of the sites pruned by each method. We are typically interested in sites with high coverage across the cohort, where a sample has an increased probability of carrying a variant that satisfies the qualifying criteria. We evaluated the fractional coverage difference determined by the SCH method against the binomial test p-value at each site (Figure 1A). Sites pruned by the SCH method but retained by the binomial test had high overall coverage across the cohort (mean fractional coverage across all sites = 0.86, Figure 1B), while sites pruned by the binomial test but retained by SCH had low coverage (mean fractional coverage across all sites = 0.13, Figure 1C), implying that the binomial test is capable of rescuing sites with high coverage that are otherwise pruned by the SCH method.
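The three site counts compared in Table 1 correspond to plain set operations. The sketch below assumes sites are represented as hypothetical (chromosome, position) tuples and that both pruned sets are first restricted to the qualifying-variant sites, which is one reading of "qualifying sites" in the definitions above.

```python
def compare_pruning(s_sch, s_binom, s_qv):
    """Split qualifying-variant sites by which coverage method prunes them."""
    # Assumption: only sites that carry a qualifying variant are counted.
    sch_q = s_sch & s_qv
    binom_q = s_binom & s_qv
    return {
        "both": sch_q & binom_q,        # pruned by both methods
        "sch_only": sch_q - binom_q,    # uniquely pruned by SCH
        "binom_only": binom_q - sch_q,  # uniquely pruned by the binomial test
    }


# Made-up sites, for illustration only.
s_qv = {("1", 100), ("1", 200), ("2", 300), ("2", 400)}
s_sch = {("1", 100), ("1", 200), ("3", 999)}
s_binom = {("1", 200), ("2", 300)}
result = compare_pruning(s_sch, s_binom, s_qv)
counts = {name: len(sites) for name, sites in result.items()}
```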
Figure 1. (A) Scatter plot of the absolute difference in coverage fraction against the binomial test p-value for 100,000 CCDS sites. The lower left quadrant represents sites that are pruned due to a nominally significant binomial test p-value (below 0.05) but retained by the SCH method. The upper right quadrant represents sites that are retained by the binomial test but pruned by the SCH method. (B) Frequency histogram of cohort fraction coverage for sites retained by the SCH method and pruned by the binomial test. (C) Frequency histogram of cohort fraction coverage for sites retained by the binomial test method and pruned by SCH.
Inflation:
Additionally, we measured the inflation in collapsing results using lambda (the ratio of observed to expected p-value at the 50th percentile of gene p-values after collapsing) to check for unforeseen biases introduced by the binomial test. In the two cohorts we evaluated, there was no significant difference in the inflation factor between the two methods, with the binomial test method performing nominally better.
| | Lambda SCH | Lambda Binom-test |
| IPF cohort | | |
| CKD cohort ( MAF) | | |
| CKD cohort ( MAF) | | |
Table 2: Lambda from collapsing analysis using SCH or binomial test to control for coverage imbalance.
Qualifying variants in top collapsing genes:
We counted the number of variants pruned uniquely by either the SCH or the binomial test method within the ten most significant collapsing analysis genes for each analysis. The binomial test method rescued several qualifying variants (QVs) in top collapsing genes in each analysis, while the SCH method did not rescue any top-gene QVs in any of the analyses.
| | # Binom. test rescued QVs | # SCH rescued QVs |
| IPF cohort | 42 | 0 |
| CKD cohort ( MAF) | 5 | 0 |
| CKD cohort ( MAF) | 2 | 0 |
Table 3: Number of rescued qualifying variants in top 10 most significant collapsing analysis genes
ATAV runtime:
Eliminating the SCH ATAV step significantly reduces the overall time needed to complete a full collapsing analysis. For the ~11,700 sample CKD cohort, elimination of the SCH step in favor of the binomial test method brought ATAV time down by ~26 hours, while runtime for the IPF cohort decreased by ~13 hours. These reductions are equivalent to around half of the total runtime. Though runtime measurements are affected by overall ATAV load at the time of analysis and are therefore subject to variation, it is clear that the binomial test method has the potential to greatly improve the speed of collapsing analysis.
Conclusions:
We implemented a test of independence of coverage and case/control status as a qualifying criterion in collapsing analysis. Our test of coverage independence rescued sites with reasonably balanced coverage that were pruned out by the SCH method. In general, we found large overlap between the sites pruned by either method for reasons of coverage imbalance. However, the binomial test uniquely retained fold more sites than it uniquely pruned when compared to SCH. The binomial test method could evaluate several thousand additional variant sites in the CCDS region that are pruned by SCH. The inflation factor, measured by lambda, was not significantly altered between the two methods.
Typical collapsing runs require coverage data for the entire cohort to establish minor allele frequency for a variant. Therefore, adding a coverage comparison test on otherwise qualifying variants only marginally added to the compute time for an analysis. Implementing the coverage test as part of the collapsing run resulted in a 50% reduction in ATAV compute load and collapsing analysis time through the elimination of a previously necessary coverage harmonization step. The binomial test for independence of coverage and case-control status is thus a computationally efficient and robust method to control for coverage imbalance in collapsing analysis.
REFERENCES:
1. Petrovski, S., et al., An Exome Sequencing Study to Assess the Role of Rare Genetic Variation in Pulmonary Fibrosis. Am J Respir Crit Care Med, (1): p
More informationLAB 2 INSTRUCTIONS PROBABILITY DISTRIBUTIONS IN EXCEL
LAB 2 INSTRUCTIONS PROBABILITY DISTRIBUTIONS IN EXCEL There is a wide range of probability distributions (both discrete and continuous) available in Excel. They can be accessed through the Insert Function
More informationAlternate Specifications
A Alternate Specifications As described in the text, roughly twenty percent of the sample was dropped because of a discrepancy between eligibility as determined by the AHRQ, and eligibility according to
More informationTutorial Handout Statistics, CM-0128M Descriptive Statistics
Tutorial Handout Statistics, CM-0128M January 18, 2013 Exercise 1. The following figures show the annual salaries in of 20 workers in a small firm. Calculate the arithmetic mean, median and mode salaries.
More informationAccolade: The Effect of Personalized Advocacy on Claims Cost
Aon U.S. Health & Benefits Accolade: The Effect of Personalized Advocacy on Claims Cost A Case Study of Two Employer Groups October, 2018 Risk. Reinsurance. Human Resources. Preparation of This Report
More information23.1 Probability Distributions
3.1 Probability Distributions Essential Question: What is a probability distribution for a discrete random variable, and how can it be displayed? Explore Using Simulation to Obtain an Empirical Probability
More informationThe Binomial Distribution
The Binomial Distribution Patrick Breheny February 16 Patrick Breheny STA 580: Biostatistics I 1/38 Random variables The Binomial Distribution Random variables The binomial coefficients The binomial distribution
More informationEENG473 Mobile Communications Module 3 : Week # (11) Mobile Radio Propagation: Large-Scale Path Loss
EENG473 Mobile Communications Module 3 : Week # (11) Mobile Radio Propagation: Large-Scale Path Loss Practical Link Budget Design using Path Loss Models Most radio propagation models are derived using
More informationContents. An Overview of Statistical Applications CHAPTER 1. Contents (ix) Preface... (vii)
Contents (ix) Contents Preface... (vii) CHAPTER 1 An Overview of Statistical Applications 1.1 Introduction... 1 1. Probability Functions and Statistics... 1..1 Discrete versus Continuous Functions... 1..
More informationIntroduction to QTL (Quantitative Trait Loci) & LOD analysis Steven M. Carr / Biol 4241 / Winter Study Design of Hamer et al.
Introduction to QTL (Quantitative Trait Loci) & LOD analysis Steven M. Carr / Biol 4241 / Winter 2016 Quantitative Trait Loci: contribution of multiple genes to a single trait Linkage between phenotypic
More informationStatistical Intervals. Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage
7 Statistical Intervals Chapter 7 Stat 4570/5570 Material from Devore s book (Ed 8), and Cengage Confidence Intervals The CLT tells us that as the sample size n increases, the sample mean X is close to
More informationAnd The Winner Is? How to Pick a Better Model
And The Winner Is? How to Pick a Better Model Part 2 Goodness-of-Fit and Internal Stability Dan Tevet, FCAS, MAAA Goodness-of-Fit Trying to answer question: How well does our model fit the data? Can be
More informationSTATISTICAL DISTRIBUTIONS AND THE CALCULATOR
STATISTICAL DISTRIBUTIONS AND THE CALCULATOR 1. Basic data sets a. Measures of Center - Mean ( ): average of all values. Characteristic: non-resistant is affected by skew and outliers. - Median: Either
More informationMEMORANDUM. From: Division of Risk, Strategy, and Financial Innovation 1
MEMORANDUM To: File From: Division of Risk, Strategy, and Financial Innovation 1 Re: Information regarding activities and positions of participants in the singlename credit default swap market Date: 3/15/2012
More informationChapter 5 Discrete Probability Distributions. Random Variables Discrete Probability Distributions Expected Value and Variance
Chapter 5 Discrete Probability Distributions Random Variables Discrete Probability Distributions Expected Value and Variance.40.30.20.10 0 1 2 3 4 Random Variables A random variable is a numerical description
More informationSummarising Data. Summarising Data. Examples of Types of Data. Types of Data
Summarising Data Summarising Data Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester Today we will consider Different types of data Appropriate ways to summarise these data 17/10/2017
More informationGenetic testing anti-selection risk and
Genetic testing anti-selection risk and implications for insurers Florian Rechfeld Senior Research Analyst, Life & Health R&D, Swiss Re CRO Assembly, 31 th May 2018 Trends and prospects in genetic testing
More informationCFA Level I - LOS Changes
CFA Level I - LOS Changes 2018-2019 Topic LOS Level I - 2018 (529 LOS) LOS Level I - 2019 (525 LOS) Compared Ethics 1.1.a explain ethics 1.1.a explain ethics Ethics Ethics 1.1.b 1.1.c describe the role
More informationThe Central Limit Theorem
The Central Limit Theorem Patrick Breheny March 1 Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 1 / 29 Kerrich s experiment Introduction The law of averages Mean and SD of
More informationCFA Level I - LOS Changes
CFA Level I - LOS Changes 2017-2018 Topic LOS Level I - 2017 (534 LOS) LOS Level I - 2018 (529 LOS) Compared Ethics 1.1.a explain ethics 1.1.a explain ethics Ethics 1.1.b describe the role of a code of
More information