Improved Inference for Signal Discovery Under Exceptionally Low False Positive Error Rates
1 Improved Inference for Signal Discovery Under Exceptionally Low False Positive Error Rates (to appear in Journal of Instrumentation)
Igor Volobouev & Alex Trindade
Dept. of Physics & Astronomy, Texas Tech University
Dept. of Mathematics & Statistics, Texas Tech University
December 2018
Edgeworth Expansions for Mixture Models
2 Signal Strength Determination by Maximum Likelihood
Model is a signal/background density mixture without nuisance parameters:
$$p(x \mid \alpha) = \alpha\, s(x) + (1 - \alpha)\, b(x) \tag{1}$$
Signal fraction $\alpha$ is estimated by maximizing the log-likelihood:
$$l(\alpha) = \sum_{i=1}^{n} \log p(x_i \mid \alpha) \tag{2}$$
$$\hat\alpha := \arg\max_{\alpha \in \mathbb{R}} l(\alpha) \tag{3}$$
Goal: produce accurate tests of $H_0: \alpha = 0$ vs. $H_1: \alpha > 0$.
Only unknown parameter is $\alpha$; a toy problem... But we'll make it more realistic at the end.
alex.trindade@ttu.edu
3 Example: Flat Background With Gaussian Signal
Let $b(x)$ follow a uniform distribution on $[0, 1]$, and $s(x)$ a truncated Gaussian on $[0, 1]$:
$$b(x) = \begin{cases} 1, & \text{if } x \in [0, 1] \\ 0, & \text{if } x \notin [0, 1] \end{cases} \tag{4}$$
$$s(x) = \begin{cases} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \Big/ \int_0^1 e^{-\frac{(y-\mu)^2}{2\sigma^2}}\, dy, & \text{if } x \in [0, 1] \\ 0, & \text{if } x \notin [0, 1] \end{cases} \tag{5}$$
This will be the model used in simulations, and whenever specific settings of the signal are needed, we use
$$\mu = 0.5, \quad \sigma = 0.1. \tag{6}$$
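The example model in (4)-(6) can be written down directly. The following is a minimal sketch, not the authors' code: the function names (`b`, `s`, `p`, `log_likelihood`, `sample`) are hypothetical, and the sampler assumes $0 \le \alpha \le 1$ even though the fit itself does not restrict $\alpha$.

```python
import math
import random

MU, SIGMA = 0.5, 0.1  # signal settings from (6)

def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Denominator of (5): integral of exp(-(y - mu)^2 / (2 sigma^2)) over [0, 1]
_NORM = SIGMA * math.sqrt(2.0 * math.pi) * (
    _norm_cdf((1.0 - MU) / SIGMA) - _norm_cdf((0.0 - MU) / SIGMA)
)

def b(x):
    """Flat background density (4)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def s(x):
    """Truncated Gaussian signal density (5)."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return math.exp(-((x - MU) ** 2) / (2.0 * SIGMA ** 2)) / _NORM

def p(x, alpha):
    """Mixture density (1)."""
    return alpha * s(x) + (1.0 - alpha) * b(x)

def log_likelihood(xs, alpha):
    """Log-likelihood (2) for a sample xs."""
    return sum(math.log(p(x, alpha)) for x in xs)

def sample(n, alpha, rng):
    """Draw n points from the mixture (assumes 0 <= alpha <= 1)."""
    xs = []
    for _ in range(n):
        if rng.random() < alpha:
            x = rng.gauss(MU, SIGMA)
            while not 0.0 <= x <= 1.0:  # rejection step enforces the truncation
                x = rng.gauss(MU, SIGMA)
            xs.append(x)
        else:
            xs.append(rng.random())
    return xs

if __name__ == "__main__":
    data = sample(200, 0.1, random.Random(0))
    print(log_likelihood(data, 0.0), log_likelihood(data, 0.1))
```

With $\alpha = 0$ the mixture reduces to the flat background, so the log-likelihood of any sample on $[0, 1]$ is zero there; this makes $l(0)$ a convenient reference point for the likelihood ratio.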
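4 Notation
$l_i(\alpha) = \partial^i l / \partial \alpha^i$, the $i$-th derivative of $l(\alpha)$
$J(\alpha) = -l_2(\alpha)$
Expected information number: $I(\alpha) = E[J(\alpha)]$
Observed information number: $J(\hat\alpha) = -l_2(\hat\alpha)$
Assume usual regularity conditions for consistency and asymptotic normality of $\hat\alpha$ are satisfied:
$$\sqrt{n}\,(\hat\alpha - \alpha) \xrightarrow{d} N\!\left(0, \mathcal{I}(\alpha)^{-1}\right), \qquad \mathcal{I}(\alpha) = \lim_{n \to \infty} \frac{1}{n} I(\alpha)$$
$$\Longrightarrow \quad \hat\alpha \approx N\!\left(\alpha, \sigma^2_{\hat\alpha}(\alpha)\right), \qquad \sigma^2_{\hat\alpha}(\alpha) = I(\alpha)^{-1}$$
(By not restricting $\alpha \in [0, 1]$ we avoid exotic asymptotics at the boundaries...)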
5 A Gamut of Tests for $\alpha$
In lack of a UMP test, we have the following:
Table: Promising statistics for tests on $\alpha$.
Method             Statistic    Value
Likelihood Ratio   $T_{LR}$     $2[l(\hat\alpha) - l(0)]$
Wald (Expected)    $T_W$        $\hat\alpha^2 I(0)$
Wald (Observed)    $T_{W2}$     $\hat\alpha^2 J(\hat\alpha)$
Score              $T_S$        $l_1(0)^2 / I(0)$
Wald-type 3        $T_{W3}$     $\hat\alpha^2 / \sigma_3^2$
Wald-type 4        $T_{W4}$     $\hat\alpha^2 / \sigma_4^2$
The Wald-type 3 & 4 statistics are variants of $T_{W2}$ (used by physicists) that use shortcuts for computing $J(\alpha)$ so as to avoid differentiating $l(\alpha)$.
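The four classical statistics in the table can be evaluated for the flat-background example with a few lines of code. This is a sketch under stated assumptions: the helper names (`fit_alpha`, `signed_statistics`) are hypothetical, the MLE is found by a damped Newton iteration chosen here for simplicity, and $I(0)$ is obtained by midpoint-rule integration. The Wald-type 3 and 4 variants are omitted because their definitions of $\sigma_3$ and $\sigma_4$ are not given on the slide.

```python
import math
import random

MU, SIGMA = 0.5, 0.1  # signal settings from (6)
_Z = SIGMA * math.sqrt(2.0 * math.pi) * 0.5 * (
    math.erf((1.0 - MU) / (SIGMA * math.sqrt(2.0)))
    - math.erf((0.0 - MU) / (SIGMA * math.sqrt(2.0)))
)

def s(x):
    """Truncated Gaussian signal density, valid for x in [0, 1]."""
    return math.exp(-((x - MU) ** 2) / (2.0 * SIGMA ** 2)) / _Z

def loglik(alpha, u):
    """l(alpha) = sum log(1 + alpha * u_i), with u_i = s(x_i)/b(x_i) - 1 and b = 1."""
    terms = [1.0 + alpha * ui for ui in u]
    return sum(math.log(t) for t in terms) if min(terms) > 0.0 else -math.inf

def fit_alpha(u, tol=1e-12, max_iter=100):
    """Unrestricted MLE by Newton's method with step halving (a design choice of this sketch)."""
    alpha = 0.0
    for _ in range(max_iter):
        l1 = sum(ui / (1.0 + alpha * ui) for ui in u)
        l2 = -sum((ui / (1.0 + alpha * ui)) ** 2 for ui in u)
        step = -l1 / l2
        current = loglik(alpha, u)
        halvings = 0
        # Halve the step if it leaves the domain or clearly lowers the likelihood
        while loglik(alpha + step, u) < current - 1e-9 and halvings < 60:
            step *= 0.5
            halvings += 1
        alpha += step
        if abs(step) < tol:
            break
    return alpha

def signed_statistics(xs, steps=20000):
    """Return alpha-hat and the signed roots R = sgn(alpha-hat) * sqrt(T)."""
    n = len(xs)
    u = [s(x) - 1.0 for x in xs]
    a_hat = fit_alpha(u)
    # I(0) = n * E[(s/b - 1)^2] under the background, by the midpoint rule
    info0 = n * sum((s((k + 0.5) / steps) - 1.0) ** 2 for k in range(steps)) / steps
    obs_info = sum((ui / (1.0 + a_hat * ui)) ** 2 for ui in u)  # J(alpha-hat)
    t = {
        "LR": max(2.0 * loglik(a_hat, u), 0.0),  # l(0) = 0 for the flat background
        "W": a_hat ** 2 * info0,
        "W2": a_hat ** 2 * obs_info,
        "S": sum(u) ** 2 / info0,
    }
    sign = math.copysign(1.0, a_hat)
    return a_hat, {name: sign * math.sqrt(v) for name, v in t.items()}

if __name__ == "__main__":
    rng = random.Random(0)
    background_only = [rng.random() for _ in range(1000)]
    print(signed_statistics(background_only))
```

Because $l(\alpha)$ is concave in $\alpha$ for this mixture, the score at zero and $\hat\alpha$ share the same sign, which is why a single sign factor can be applied to all four statistics.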
6 Higher-Order Asymptotics
For one-sided testing use the signed version of any of the statistics (say $T$) in the Table: $R = \mathrm{sgn}(\hat\alpha)\sqrt{T}$.
Under $H_0$, to first order $R \approx Z$, where $Z \sim N(0, 1)$, whence
$$p\text{-value} = P(Z > r), \qquad r = \mathrm{sgn}(\hat\alpha)\sqrt{t}$$
In general, $R_n \approx Z$ to $k$-th order means that
$$\text{approx error} = R_n - Z = O_p(n^{-k/2})$$
$$\Longrightarrow \quad P(R_n \le r) = \Phi(r) + a_{1,n}\, n^{-1/2} + a_{2,n}\, n^{-1} + a_{3,n}\, n^{-3/2} + \cdots + a_{k-1,n}\, n^{-(k-1)/2} + O(n^{-k/2})$$
7 Tools for Higher-Order Asymptotic Theory
Taylor expansions of $l(\alpha)$ near the true value of $\alpha$
Joint cumulants for the derivatives of $l(\alpha)$ under $H_0$: $n\,\nu_{ijkl}$ = $(i, j, k, l)$-th joint cumulant of $\{l_1(0), \ldots, l_4(0)\}$
Edgeworth-type series: construct an approximate pdf for $R$ (which is approx. $N(0, 1)$) via the Gram-Charlier expansion:
$$f_R(z) = \varphi(z)\Big[1 + \sum_{j \ge 1} \beta_j H_j(z)\Big],$$
where $H_j(z)$ are the Hermite polynomials. Coefficients $\beta_j$ are chosen to match the cumulants $\kappa_j$ of $R_n$.
8 Final Edgeworth Expansion for CDF of R
Integrate the Gram-Charlier expansion and collect terms in powers of $n^{-1/2}$:
$$F_R(z) = \Phi(z) - \varphi(z)\Big[\kappa_1 + \tfrac{1}{6}\kappa_3 H_2(z) + \tfrac{1}{2}\big(\kappa_1^2 + \kappa_2 - 1\big) z + \big(\tfrac{1}{6}\kappa_1\kappa_3 + \tfrac{1}{24}\kappa_4\big) H_3(z) + \tfrac{1}{72}\kappa_3^2 H_5(z)\Big] + O(n^{-3/2}). \tag{7}$$
(There are some technical assumptions on the cumulant behavior of $R$...)
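As an illustration of how (7) is evaluated once the cumulants are available, here is a short sketch. The function names are hypothetical, and $H_j$ is taken to be the probabilists' Hermite polynomial (the convention under which $H_1(z) = z$, matching the linear term in (7)).

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def hermite(j, z):
    """Probabilists' Hermite polynomial H_j(z) via H_{k+1} = z H_k - k H_{k-1}."""
    h_prev, h = 1.0, z
    if j == 0:
        return h_prev
    for k in range(1, j):
        h_prev, h = h, z * h - k * h_prev
    return h

def edgeworth_cdf(z, k1, k2, k3, k4):
    """Edgeworth-approximated CDF from the first four cumulants of R, as in (7)."""
    bracket = (
        k1
        + k3 * hermite(2, z) / 6.0
        + 0.5 * (k1 * k1 + k2 - 1.0) * z
        + (k1 * k3 / 6.0 + k4 / 24.0) * hermite(3, z)
        + k3 * k3 * hermite(5, z) / 72.0
    )
    return Phi(z) - phi(z) * bracket

if __name__ == "__main__":
    # Cumulants of an exact N(0, 1) variable give back Phi(z)
    print(edgeworth_cdf(1.0, 0.0, 1.0, 0.0, 0.0), Phi(1.0))
```

Feeding in the cumulants of an exactly standard normal variable ($\kappa_1 = 0$, $\kappa_2 = 1$, $\kappa_3 = \kappa_4 = 0$) makes the bracket vanish, which is a quick sanity check on any implementation.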
9 Relating Cumulants of Log-Likelihood to those of Statistic
The above expression for $F_R(z)$ holds for any statistic $R$ which is approx. $N(0, 1)$.
The challenge is to express (approximate) the $\kappa_j$ (which are unknown) in terms of the $\nu_{ijkl}$ (which can be computed).
This has to be done case-by-case for each statistic $R$: start from suitable Taylor expansions in probability, and use some tricks...
In the 20th century this required a lot of bookkeeping. In the 21st century it can be replaced with careful programming of a symbolic algebra system (Maple/Mathematica).
The relationships in the above challenge have been worked out (to 3rd order) for the classical statistics (LR, Wald, Score), so we:
corrected some typos in the existing expressions (Severini, 2000),
worked out 4th order expansions for the classical statistics, and
worked out 3rd order expansions for the non-classical statistics.
10 More Notation
$E$ denotes expectation under the null: $E[q] := \int q(x)\, b(x)\, dx$.
$E_s$ denotes expectation under the signal: $E_s[q] := \int q(x)\, s(x)\, dx$.
Defining $V_i := E[l_i(0)]$ (with $l_i$ taken for a single observation), the Edgeworth expansions for all the statistics in Table 1 depend only on the following (dimensionless & location-scale invariant) expressions:
$$\gamma := \frac{V_3}{2\,(-V_2)^{3/2}} = \frac{E_s\!\left[\frac{s^2}{b^2}\right] - 3E_s\!\left[\frac{s}{b}\right] + 2}{\left(E_s\!\left[\frac{s}{b}\right] - 1\right)^{3/2}}, \tag{8}$$
$$\rho := -\frac{V_4}{6\,V_2^2} = \frac{E_s\!\left[\frac{s^3}{b^3}\right] - 4E_s\!\left[\frac{s^2}{b^2}\right] + 6E_s\!\left[\frac{s}{b}\right] - 3}{\left(E_s\!\left[\frac{s}{b}\right] - 1\right)^{2}}. \tag{9}$$
11 Resulting Higher-Order Distributions of Test Statistics
Plugging the (thus) approximated $\hat\kappa_j$ into $F_R(z)$ in (7) gives, e.g.
Signed Wald statistic:
$$P(R_W \le z) = \Phi(z) - \varphi(z)\Big[n^{-1/2}\,\tfrac{1}{6}\gamma H_2(z) + n^{-1}\Big(\tfrac{1}{2}\big(\rho - \gamma^2 - 1\big) z + \tfrac{1}{24}(\rho - 3) H_3(z) + \tfrac{\gamma^2}{72} H_5(z)\Big)\Big] + O(n^{-3/2}).$$
Signed likelihood ratio statistic:
$$P(R_{LR} \le z) = \Phi(z) - \varphi(z)\Big[n^{-1/2}\Big(-\frac{\gamma}{6}\Big) + n^{-1}\,\tfrac{1}{12}\big(3\rho - 2\gamma^2\big) z\Big] + O(n^{-3/2}).$$
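The two expansions translate directly into tail probabilities. The sketch below evaluates the survival functions $1 - P(R \le z)$ for the signed Wald and likelihood ratio statistics; the function names are hypothetical, and the values of `gamma` and `rho` passed in the usage lines are arbitrary placeholders rather than values taken from the talk.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_sf(z):
    """P(Z > z), written with erfc so that far tails keep their precision."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def wald_sf(z, n, gamma, rho):
    """Edgeworth tail probability P(R_W > z), dropping the O(n^(-3/2)) remainder."""
    h2 = z * z - 1.0
    h3 = z ** 3 - 3.0 * z
    h5 = z ** 5 - 10.0 * z ** 3 + 15.0 * z
    bracket = (
        gamma * h2 / (6.0 * math.sqrt(n))
        + (0.5 * (rho - gamma ** 2 - 1.0) * z
           + (rho - 3.0) * h3 / 24.0
           + gamma ** 2 * h5 / 72.0) / n
    )
    return normal_sf(z) + phi(z) * bracket

def lr_sf(z, n, gamma, rho):
    """Edgeworth tail probability P(R_LR > z), dropping the O(n^(-3/2)) remainder."""
    bracket = -gamma / (6.0 * math.sqrt(n)) + (3.0 * rho - 2.0 * gamma ** 2) * z / (12.0 * n)
    return normal_sf(z) + phi(z) * bracket

if __name__ == "__main__":
    gamma, rho = 1.0, 3.0  # placeholder shape parameters, chosen only for illustration
    for n in (200, 1000, 5000):
        print(n, wald_sf(5.0, n, gamma, rho) / normal_sf(5.0),
              lr_sf(5.0, n, gamma, rho) / normal_sf(5.0))
```

The printed ratios show how far each corrected tail probability sits from the plain $N(0, 1)$ value; the Wald correction grows quickly with $z$ because of the $H_5(z)$ term, while the likelihood ratio correction is only linear in $z$.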
12 But Wait: why do we need higher-order asymptotics?
When $n$ is small (typically not the case in these experiments).
When the Type I error rate ($q_0$) is very small. How small? In signal-hunting particle physics experiments the gold standard is $5\sigma$:
$$q_0 = P(Z > 5) \approx 2.87 \times 10^{-7}$$
This puts us way out in the tail of the $N(0, 1)$...
13 Quantifying Deviations From Normality
Consider the normal approximation error
$$\Delta R(r) = r - \tilde r, \qquad \tilde r = \Phi^{-1}(F_R(r))$$
With Edgeworth-approximated $F_R(\cdot)$: the normalized value $\tilde r$ follows $N(0, 1)$ to an accuracy of $O(n^{-3/2})$ under $H_0$.
If $R$ is exactly $N(0, 1)$: $\Delta R(r) = 0$.
Large values of $\Delta R(r)$: the Edgeworth approximation had a large effect in normalizing $R$.
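The normal approximation error defined on this slide is easy to compute from any tail probability function. A minimal sketch follows, using only the standard library; `delta_r` is a hypothetical name, and the heavier-tailed `survival` function in the usage lines is a made-up stand-in for an Edgeworth-approximated $1 - F_R$.

```python
import math
from statistics import NormalDist

def normal_sf(r):
    """P(Z > r) for a standard normal Z."""
    return 0.5 * math.erfc(r / math.sqrt(2.0))

def delta_r(r, survival):
    """r minus the normalized value Phi^{-1}(F_R(r)), with survival(r) = 1 - F_R(r)."""
    # Phi^{-1}(1 - q) = -Phi^{-1}(q); working with q avoids rounding 1 - q to 1
    r_tilde = -NormalDist().inv_cdf(survival(r))
    return r - r_tilde

if __name__ == "__main__":
    # Exactly normal statistic: the error vanishes
    print(delta_r(5.0, normal_sf))
    # Hypothetical statistic whose tail is 3.5 times heavier than N(0, 1)
    print(delta_r(5.0, lambda r: 3.5 * normal_sf(r)))
```

Working with the survival probability rather than the CDF is a deliberate choice here: at $r = 5$ the CDF differs from 1 by less than $10^{-6}$, so inverting it directly would waste most of the available floating-point precision.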
14 Ex: $\Delta R(r)$ for Flat Background With Gaussian Signal
[Figure: four panels of $\Delta R(r)$ versus $r$, one each for Wald (Expected) $R_W$, Wald (Observed) $R_{W2}$, Score $R_S$, and Likelihood Ratio $R_{LR}$, with one curve per sample size (n = 200, n = 1000, n = 5000, and a fourth value).]
15 Some Deviations From Normality for this Example
$\mu = 0.5$ and $\sigma = 0.1$ fix the values of $\gamma$ and $\rho$.
Wald statistic at $r = 5$ for $n = 200$: $\Delta R_W(r) = 0.25$ means the p-value is wrong by a factor of $P(Z > 5)/P(Z > 5.25) \approx 3.8$: higher signal significance will be claimed than supported by data.
LR statistic at $r = 4$ for $n = 200$: $\Delta R_{LR}(r)$ is such that the p-value is wrong by a factor of $P(Z > 4)/P(Z > 3.995) \approx 0.98$: reported signal significance will be about right.
16 Simulations: Quantify Accuracy of Edgeworth-Approx
$m = 10^9$ Monte Carlo replicates for $n$ = 200, 1000, 5000, [...].
Compared distributions of $R$ with Edgeworth predictions.
For the 3 classical statistics ($R_W$, $R_{LR}$, $R_S$): distributional shape parameters (mean, standard deviation, skewness, kurtosis) are in good agreement with the $O(n^{-3/2})$ predictions for $n$ [...]. The agreement worsens for $n$ = [...].
Statistically significant disagreements with $N(0, 1)$ predictions are shown in red in the next Table.
17 Simulations: Predicted Mean (Equals 0 if Exactly $N(0, 1)$)
[Table layout: columns are statistic, $n$, $O(n^{-3/2})$ prediction, simulated value, and simulation uncertainty; row labels as transcribed are $R_W$ (four rows), $R_{LR}$, and $R_S$.]
18 Simulations: Predicted Survival Probabilities ($R_W$ & $R_{W2}$)
[Figure: four panels of log base 10 of survival probability versus $r$, two for $R_W$ and two for $R_{W2}$ (the first panel is $R_W$ at n = 200), each comparing the $N(0,1)$ prediction, the $O(n^{-1})$ prediction, and the $O(n^{-3/2})$ prediction.]
19 Simulations: Predicted Survival Probabilities ($R_{LR}$ & $R_S$)
[Figure: four panels of log base 10 of survival probability versus $r$, two for $R_{LR}$ and two for $R_S$ (the last panel is $R_S$ at n = 1000), each comparing the $N(0,1)$ prediction, the $O(n^{-1})$ prediction, and the $O(n^{-3/2})$ prediction.]
20 Simulations: Quantiles (Predicted vs. Simulated)
Table: $O(n^{-3/2})$ Edgeworth predictions for the $(1 - q_0)$-th quantiles of the statistics in Table 1 (± std. err.), compared to their corresponding values computed based on $10^9$ simulations. (Predictions deviating by more than twice the std. err. from their simulated values are bolded.)
[Table layout: one column per sample size (n = 200, n = 1000, n = 5000, and a fourth value); for each statistic $R_W$, $R_{W2}$, $R_{W3}$, $R_{W4}$, $R_{LR}$, $R_S$ there is a "Predicted" row (value ± std. err.) and a "Simulated" row.]
21 Simulations: Type I Errors (Nominal Level is $q_0$)
Table: Type I error probabilities: rejection is based on the $O(n^{-3/2})$ Edgeworth-predicted quantiles from the previous Table. (Values that deviate by more than twice the simulation uncertainty from the nominal value are bolded.)
[Table layout: one row per sample size $n$; columns $R_W$, $R_{W2}$, $R_{W3}$, $R_{W4}$, $R_{LR}$, $R_S$.]
22 Type II Errors: Strategy
Since all statistics have the same asymptotics as the MLE (to 1st order), when the truth is $\alpha = \alpha_1$, we have
$$R_n \approx N\!\left(\alpha_1, \sigma^2_{\hat\alpha}(\alpha_1)\right)$$
Let $c_n$ be the Edgeworth-predicted quantiles from the Table such that
$$P_{\alpha=0}(R_n > c_n) = q_0, \qquad c_n \approx 5\sigma_{\hat\alpha}(0)$$
Then the power is:
$$1 - \beta = P_{\alpha=\alpha_1}(R_n > c_n) = P\!\left(Z > \frac{c_n - \alpha_1}{\sigma_{\hat\alpha}(\alpha_1)}\right)$$
Thus choosing $\alpha_1 = 5\sigma_{\hat\alpha}(0)$ we should have, as $n \to \infty$:
$$1 - \beta \to P(Z > 0) = 0.5$$
(Keeps the difficulty in finding the signal approx. constant...)
23 Type II Errors: Settings
Table: Values of the Cramer-Rao uncertainty for $H_0$, $\sigma_{\hat\alpha}$, and corresponding values of $\alpha = \alpha_1$ used as the actual model signal fraction under $H_1$.
[Table layout: columns are $n$, $\sigma_{\hat\alpha}$, and $\alpha_1 = 5\sigma_{\hat\alpha}$.]
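The two quantities tabulated here follow from the model alone, so they can be recomputed for the flat-background example. The sketch below assumes $\sigma_{\hat\alpha}(0) = I(0)^{-1/2}$ with $I(0) = n \int (s - b)^2 / b \, dx$, which is the standard Cramer-Rao expression for this mixture at $\alpha = 0$; the function name is hypothetical and the sample sizes in the loop are the three that survive in the transcription.

```python
import math

MU, SIGMA = 0.5, 0.1  # signal settings from (6)
_Z = SIGMA * math.sqrt(2.0 * math.pi) * 0.5 * (
    math.erf((1.0 - MU) / (SIGMA * math.sqrt(2.0)))
    - math.erf((0.0 - MU) / (SIGMA * math.sqrt(2.0)))
)

def s(x):
    """Truncated Gaussian signal density, valid for x in [0, 1]."""
    return math.exp(-((x - MU) ** 2) / (2.0 * SIGMA ** 2)) / _Z

def sigma_alpha_hat_null(n, steps=20000):
    """Cramer-Rao uncertainty of alpha-hat at alpha = 0 for the flat background b = 1."""
    # Per-observation information: integral of (s - 1)^2 over [0, 1], midpoint rule
    per_obs = sum((s((k + 0.5) / steps) - 1.0) ** 2 for k in range(steps)) / steps
    return 1.0 / math.sqrt(n * per_obs)

if __name__ == "__main__":
    for n in (200, 1000, 5000):
        sig = sigma_alpha_hat_null(n)
        print(n, sig, 5.0 * sig)  # n, sigma_alpha_hat, alpha_1 = 5 sigma_alpha_hat
```

Since $\sigma_{\hat\alpha}(0)$ scales as $n^{-1/2}$, the signal fraction $\alpha_1$ used under $H_1$ shrinks with the sample size, which is what keeps the detection difficulty roughly constant across $n$.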
24 Simulations: Type II Errors
Table: Type II error probabilities (determined empirically), using the predicted & simulated quantiles from the Table. The smallest predicted value at each $n$ is in bold.
[Table layout: one column per sample size $n$; for each statistic $R_W$, $R_{W2}$, $R_{W3}$, $R_{W4}$, $R_{LR}$, $R_S$ there is a "Predicted" row and a "Simulated" row.]
25 Insight: Why does LR perform so well?
Often remarked in math-stat books...
Mykland (1999): proves that the $k$-th cumulant of $R_{LR}$ vanishes to $O(n^{-k/2})$ for all $k \ge 3$:
$\kappa_3 = 0$ to $O(n^{-3/2})$ (but $\kappa_j \ne 0$ for $j \le 2$)
$\kappa_4 = 0$ to $O(n^{-2})$ (but $\kappa_j \ne 0$ for $j \le 3$)
etc.
Mykland speculates: this fact... would seem to be the main asymptotic property governing the accuracy behavior... of $R_{LR}$.
Why? Because the high-order cumulants are precisely the coefficients of the highest degree $H_k(\cdot)$ in the Edgeworth expansion...
26 Extension 1: Nuisance parameters in signal & background
Doable: $s(x)$ & $b(x) \to b(x \mid \phi)$. Extend everything we have done to the nuisance parameter setting (multivariate Edgeworth expansion).
Problem: $s(x) \to s(x \mid \theta)$ means $\theta$ is not identifiable under $H_0$: classical inference for treating nuisance parameters then breaks down...
Davies (Biometrika, 1987): the appropriate p-value is an excursion probability
$$p\text{-value} = P\Big(\max_{\theta \in \Theta} R(\theta) > c\Big)$$
Theory of Random Fields (TRF): emerged as the only analytical solution so far (large-scale searches in neuroimaging, astrophysics, etc.)
$R(\theta)$ is viewed as a Gaussian random field over the manifold $\Theta \subset \mathbb{R}^d$
$\phi$ has been profiled out of $R(\theta, \phi)$: $\phi \to \hat\phi$
provides a closed-form approximation when $c$ is large...
27 TRF: Adler & Taylor (2007), Random Fields and Geometry, Springer
Excursion set of the field above level $c$: $A_c = \{\theta \in \Theta : R(\theta) > c\}$
Euler characteristic of the excursion set: $\varphi(A_c)$ = geometric property of the field
Fundamental result in TRF:
$$E[\varphi(A_c)] = \sum_{i=0}^{d} a_i f_i(c)$$
$a_i$: positive constants (to be determined by Monte Carlo)
$f_i(\cdot)$: known universal functions
For large $c$ (Taylor et al., Annals of Probability, 2005):
$$p\text{-value} = P\Big(\max_{\theta \in \Theta} R(\theta) > c\Big) \approx E[\varphi(A_c)] \equiv p_{\text{global}}$$
28 Nuisance in signal & background: improving TRF (THIS IS THE MAIN CONTRIBUTION OF THIS PAPER!)
Suppose $\theta \ne \emptyset$ and $\phi \ne \emptyset$.
Solution 1 (straightforward): treat all parameters via TRF in conjunction with Edgeworth $O(n^{-3/2})$ normalized versions of the LR statistic
$$r \to \tilde r = \Phi^{-1}(F_R(r))$$
Solution 2 (exotic): adjust the global significance of the test statistic, leading to a (conservative) estimate of $p_{\text{global}}$ in the context of TRF...
$$p_{\text{global}} = P\big(R_{LR}(\hat\theta) > r(\hat\theta)\big)$$
$r(\theta)$ is the observed (local) value of $R_{LR}(\theta)$ computed from the sample, $\hat\theta = \arg\max_{\theta \in \Theta} r(\theta)$.
29 Algorithm for Solution 2: details...
Normal approximation error for each observed (local) $r \equiv r(\theta)$ as before:
$$\Delta R(r(\theta)) = r(\theta) - \tilde r(\theta)$$
Locate: $\theta^* = \arg\max_{\theta \in \Theta} \Delta R(r(\theta))$
The search can use the same grid as the TRF search for $\hat\theta = \arg\max r(\theta)$.
Calculate the global significance of the signal $p_{\text{global}}$ via TRF, and express it in terms of the global $r$:
$$r_{\text{global}} = \Phi^{-1}(1 - p_{\text{global}})$$
Adjust the global $r$:
$$r^{\text{adj}}_{\text{global}} = r_{\text{global}} - \Delta R(r(\theta^*))$$
The global (adjusted) p-value is then:
$$p^{\text{adj}}_{\text{global}} = 1 - \Phi\big(r^{\text{adj}}_{\text{global}}\big)$$
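The adjustment steps listed on this slide can be sketched as follows. This is an illustration only: `adjusted_global_p` is a hypothetical name, the TRF-derived $p_{\text{global}}$ is taken as a given input, and the local values and their Edgeworth-normalized counterparts on the grid are assumed to have been computed elsewhere (the numbers in the usage lines are made up).

```python
import math
from statistics import NormalDist

def normal_sf(r):
    """P(Z > r), i.e. 1 - Phi(r), kept accurate in the far tail via erfc."""
    return 0.5 * math.erfc(r / math.sqrt(2.0))

def adjusted_global_p(p_global, r_local, r_tilde_local):
    """Apply the largest local normal approximation error to the global significance.

    r_local[i] and r_tilde_local[i] are the observed and normalized local
    statistics at grid point i. Returns the adjusted p-value and the index
    of the grid point with the largest error.
    """
    deltas = [r - rt for r, rt in zip(r_local, r_tilde_local)]
    i_star = max(range(len(deltas)), key=deltas.__getitem__)
    # Phi^{-1}(1 - p) = -Phi^{-1}(p); using p directly avoids rounding 1 - p to 1
    r_global = -NormalDist().inv_cdf(p_global)
    r_adj = r_global - deltas[i_star]
    return normal_sf(r_adj), i_star

if __name__ == "__main__":
    # Made-up grid values, for illustration only
    p_adj, i_star = adjusted_global_p(1e-6, [3.1, 4.2, 5.0], [3.0, 3.95, 4.8])
    print(p_adj, i_star)
```

Using the maximum error over the grid, rather than the error at $\hat\theta$, is what makes the adjusted p-value a conservative estimate: no grid point can have a larger correction than the one applied.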
30 Extension 2: Random sample size
Suppose $x_1, \ldots, x_N$ are iid $p(x \mid \alpha)$ and $N \sim \mathcal{P}(\nu(\alpha))$; then we have two cases:
if $\nu(\alpha)$ is independent of $\alpha$, so that $\nu(\alpha) \equiv \nu$, inference on $\alpha$ is as before:
$$l(\alpha, \nu) = \sum_{i=1}^{n} \log p(x_i \mid \alpha) - \nu + n \log \nu = l(\alpha) + l(\nu)$$
otherwise, the log-likelihood is not separable, but under a simplified regime where $\alpha = \nu_s / (\nu_s + \nu_b)$ with $\nu_b$ known, we have
$$l(\nu_s) = \sum_{i=1}^{n} \log p(x_i \mid \alpha) - (\nu_s + \nu_b) + n \log(\nu_s + \nu_b)$$
and can now re-do all calculations for the new parameter $\nu_s$...
THE END!
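A short numerical check helps to see what the second log-likelihood means. Substituting $\alpha = \nu_s / (\nu_s + \nu_b)$ shows that it equals $\sum_i \log\big(\nu_s s(x_i) + \nu_b b(x_i)\big) - (\nu_s + \nu_b)$, the usual extended likelihood written in terms of expected counts. The sketch below verifies this identity; the function names and the density values in the usage lines are hypothetical.

```python
import math

def extended_loglik(nu_s, nu_b, s_vals, b_vals):
    """l(nu_s) as written on the slide, with alpha = nu_s / (nu_s + nu_b)."""
    n = len(s_vals)
    total = nu_s + nu_b
    alpha = nu_s / total
    mixture = sum(
        math.log(alpha * sv + (1.0 - alpha) * bv) for sv, bv in zip(s_vals, b_vals)
    )
    return mixture - total + n * math.log(total)

def extended_loglik_counts(nu_s, nu_b, s_vals, b_vals):
    """Equivalent form in terms of expected signal and background counts."""
    return sum(
        math.log(nu_s * sv + nu_b * bv) for sv, bv in zip(s_vals, b_vals)
    ) - (nu_s + nu_b)

if __name__ == "__main__":
    # Made-up signal and background density values at four observed points
    s_vals = [0.2, 3.5, 1.1, 0.01]
    b_vals = [1.0, 1.0, 1.0, 1.0]
    print(extended_loglik(3.0, 40.0, s_vals, b_vals))
    print(extended_loglik_counts(3.0, 40.0, s_vals, b_vals))
```

The second form makes it clear that $\nu_s$ enters both the shape term and the Poisson normalization, which is why the log-likelihood no longer separates as it does in the first case.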
More informationIntroduction to the Maximum Likelihood Estimation Technique. September 24, 2015
Introduction to the Maximum Likelihood Estimation Technique September 24, 2015 So far our Dependent Variable is Continuous That is, our outcome variable Y is assumed to follow a normal distribution having
More informationدرس هفتم یادگیري ماشین. (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی
یادگیري ماشین توزیع هاي نمونه و تخمین نقطه اي پارامترها Sampling Distributions and Point Estimation of Parameter (Machine Learning) دانشگاه فردوسی مشهد دانشکده مهندسی رضا منصفی درس هفتم 1 Outline Introduction
More informationLecture 8: Markov and Regime
Lecture 8: Markov and Regime Switching Models Prof. Massimo Guidolin 20192 Financial Econometrics Spring 2016 Overview Motivation Deterministic vs. Endogeneous, Stochastic Switching Dummy Regressiom Switching
More informationMuch of what appears here comes from ideas presented in the book:
Chapter 11 Robust statistical methods Much of what appears here comes from ideas presented in the book: Huber, Peter J. (1981), Robust statistics, John Wiley & Sons (New York; Chichester). There are many
More informationPopulations and Samples Bios 662
Populations and Samples Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-08-22 16:29 BIOS 662 1 Populations and Samples Random Variables Random sample: result
More informationInternet Appendix for Asymmetry in Stock Comovements: An Entropy Approach
Internet Appendix for Asymmetry in Stock Comovements: An Entropy Approach Lei Jiang Tsinghua University Ke Wu Renmin University of China Guofu Zhou Washington University in St. Louis August 2017 Jiang,
More informationThe Bernoulli distribution
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More information4-1. Chapter 4. Commonly Used Distributions by The McGraw-Hill Companies, Inc. All rights reserved.
4-1 Chapter 4 Commonly Used Distributions 2014 by The Companies, Inc. All rights reserved. Section 4.1: The Bernoulli Distribution 4-2 We use the Bernoulli distribution when we have an experiment which
More informationChapter 5: Statistical Inference (in General)
Chapter 5: Statistical Inference (in General) Shiwen Shen University of South Carolina 2016 Fall Section 003 1 / 17 Motivation In chapter 3, we learn the discrete probability distributions, including Bernoulli,
More informationWindow Width Selection for L 2 Adjusted Quantile Regression
Window Width Selection for L 2 Adjusted Quantile Regression Yoonsuh Jung, The Ohio State University Steven N. MacEachern, The Ohio State University Yoonkyung Lee, The Ohio State University Technical Report
More informationModeling Portfolios that Contain Risky Assets Stochastic Models I: One Risky Asset
Modeling Portfolios that Contain Risky Assets Stochastic Models I: One Risky Asset C. David Levermore University of Maryland, College Park Math 420: Mathematical Modeling March 25, 2014 version c 2014
More informationCan we use kernel smoothing to estimate Value at Risk and Tail Value at Risk?
Can we use kernel smoothing to estimate Value at Risk and Tail Value at Risk? Ramon Alemany, Catalina Bolancé and Montserrat Guillén Riskcenter - IREA Universitat de Barcelona http://www.ub.edu/riskcenter
More informationComparing the Means of. Two Log-Normal Distributions: A Likelihood Approach
Journal of Statistical and Econometric Methods, vol.3, no.1, 014, 137-15 ISSN: 179-660 (print), 179-6939 (online) Scienpress Ltd, 014 Comparing the Means of Two Log-Normal Distributions: A Likelihood Approach
More informationFINITE SAMPLE DISTRIBUTIONS OF RISK-RETURN RATIOS
Available Online at ESci Journals Journal of Business and Finance ISSN: 305-185 (Online), 308-7714 (Print) http://www.escijournals.net/jbf FINITE SAMPLE DISTRIBUTIONS OF RISK-RETURN RATIOS Reza Habibi*
More informationTwo-term Edgeworth expansions of the distributions of fit indexes under fixed alternatives in covariance structure models
Economic Review (Otaru University of Commerce), Vo.59, No.4, 4-48, March, 009 Two-term Edgeworth expansions of the distributions of fit indexes under fixed alternatives in covariance structure models Haruhiko
More informationOn Complexity of Multistage Stochastic Programs
On Complexity of Multistage Stochastic Programs Alexander Shapiro School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0205, USA e-mail: ashapiro@isye.gatech.edu
More informationSampling Distribution
MAT 2379 (Spring 2012) Sampling Distribution Definition : Let X 1,..., X n be a collection of random variables. We say that they are identically distributed if they have a common distribution. Definition
More informationModeling Co-movements and Tail Dependency in the International Stock Market via Copulae
Modeling Co-movements and Tail Dependency in the International Stock Market via Copulae Katja Ignatieva, Eckhard Platen Bachelier Finance Society World Congress 22-26 June 2010, Toronto K. Ignatieva, E.
More informationActuarial Mathematics and Statistics Statistics 5 Part 2: Statistical Inference Tutorial Problems
Actuarial Mathematics and Statistics Statistics 5 Part 2: Statistical Inference Tutorial Problems Spring 2005 1. Which of the following statements relate to probabilities that can be interpreted as frequencies?
More informationRohini Kumar. Statistics and Applied Probability, UCSB (Joint work with J. Feng and J.-P. Fouque)
Small time asymptotics for fast mean-reverting stochastic volatility models Statistics and Applied Probability, UCSB (Joint work with J. Feng and J.-P. Fouque) March 11, 2011 Frontier Probability Days,
More information1. You are given the following information about a stationary AR(2) model:
Fall 2003 Society of Actuaries **BEGINNING OF EXAMINATION** 1. You are given the following information about a stationary AR(2) model: (i) ρ 1 = 05. (ii) ρ 2 = 01. Determine φ 2. (A) 0.2 (B) 0.1 (C) 0.4
More informationUnobserved Heterogeneity Revisited
Unobserved Heterogeneity Revisited Robert A. Miller Dynamic Discrete Choice March 2018 Miller (Dynamic Discrete Choice) cemmap 7 March 2018 1 / 24 Distributional Assumptions about the Unobserved Variables
More informationLecture 17: More on Markov Decision Processes. Reinforcement learning
Lecture 17: More on Markov Decision Processes. Reinforcement learning Learning a model: maximum likelihood Learning a value function directly Monte Carlo Temporal-difference (TD) learning COMP-424, Lecture
More informationMultivariate Cox PH model with log-skew-normal frailties
Multivariate Cox PH model with log-skew-normal frailties Department of Statistical Sciences, University of Padua, 35121 Padua (IT) Multivariate Cox PH model A standard statistical approach to model clustered
More informationWeb-based Supplementary Materials for. A space-time conditional intensity model. for invasive meningococcal disease occurence
Web-based Supplementary Materials for A space-time conditional intensity model for invasive meningococcal disease occurence by Sebastian Meyer 1,2, Johannes Elias 3, and Michael Höhle 4,2 1 Department
More informationA New Hybrid Estimation Method for the Generalized Pareto Distribution
A New Hybrid Estimation Method for the Generalized Pareto Distribution Chunlin Wang Department of Mathematics and Statistics University of Calgary May 18, 2011 A New Hybrid Estimation Method for the GPD
More informationDepartment of Economics School of Social Sciences
Department of Economics School of Social Sciences Robust misspecification tests for the Heckman s two-step estimator Gabriel V. Montes-Rojas 1 City University Department of Economics Discussion Paper Series
More informationA New Multivariate Kurtosis and Its Asymptotic Distribution
A ew Multivariate Kurtosis and Its Asymptotic Distribution Chiaki Miyagawa 1 and Takashi Seo 1 Department of Mathematical Information Science, Graduate School of Science, Tokyo University of Science, Tokyo,
More information**BEGINNING OF EXAMINATION** A random sample of five observations from a population is:
**BEGINNING OF EXAMINATION** 1. You are given: (i) A random sample of five observations from a population is: 0.2 0.7 0.9 1.1 1.3 (ii) You use the Kolmogorov-Smirnov test for testing the null hypothesis,
More informationEstimating Mixed Logit Models with Large Choice Sets. Roger H. von Haefen, NC State & NBER Adam Domanski, NOAA July 2013
Estimating Mixed Logit Models with Large Choice Sets Roger H. von Haefen, NC State & NBER Adam Domanski, NOAA July 2013 Motivation Bayer et al. (JPE, 2007) Sorting modeling / housing choice 250,000 individuals
More informationSample Size for Assessing Agreement between Two Methods of Measurement by Bland Altman Method
Meng-Jie Lu 1 / Wei-Hua Zhong 1 / Yu-Xiu Liu 1 / Hua-Zhang Miao 1 / Yong-Chang Li 1 / Mu-Huo Ji 2 Sample Size for Assessing Agreement between Two Methods of Measurement by Bland Altman Method Abstract:
More informationOn the Distribution of Kurtosis Test for Multivariate Normality
On the Distribution of Kurtosis Test for Multivariate Normality Takashi Seo and Mayumi Ariga Department of Mathematical Information Science Tokyo University of Science 1-3, Kagurazaka, Shinjuku-ku, Tokyo,
More informationChapter 5 Univariate time-series analysis. () Chapter 5 Univariate time-series analysis 1 / 29
Chapter 5 Univariate time-series analysis () Chapter 5 Univariate time-series analysis 1 / 29 Time-Series Time-series is a sequence fx 1, x 2,..., x T g or fx t g, t = 1,..., T, where t is an index denoting
More informationHigh Dimensional Edgeworth Expansion. Applications to Bootstrap and Its Variants
With Applications to Bootstrap and Its Variants Department of Statistics, UC Berkeley Stanford-Berkeley Colloquium, 2016 Francis Ysidro Edgeworth (1845-1926) Peter Gavin Hall (1951-2016) Table of Contents
More informationStatistical estimation
Statistical estimation Statistical modelling: theory and practice Gilles Guillot gigu@dtu.dk September 3, 2013 Gilles Guillot (gigu@dtu.dk) Estimation September 3, 2013 1 / 27 1 Introductory example 2
More information