Strong consistency of nonparametric Bayes density estimation on compact metric spaces

Abhishek Bhattacharya and David Dunson
Department of Statistical Science, Duke University
dunson@stat.duke.edu

Abstract. This article considers a broad class of kernel mixture density models on compact metric spaces and manifolds. Following a Bayesian approach with a nonparametric prior on the location mixing distribution and bandwidth, sufficient conditions are obtained on the kernel, prior and the underlying space for strong posterior consistency at any positive continuous density. The prior is also allowed to depend on the sample size $n$ and sufficient conditions are obtained for weak and strong consistency. These conditions are verified on the hypersphere using a von Mises-Fisher kernel and on the planar shape space using complex Watson kernels.

1. Introduction

Density estimation on compact metric spaces, such as manifolds, is a fundamental problem in nonparametric inference on non-Euclidean spaces. Some applications include directional data analysis, spatial modeling, shape analysis and dimensionality reduction problems in which the data lie on an unknown lower dimensional space. However, the literature on statistical theory and methods of density estimation in non-Euclidean spaces is still under-developed. Our focus is on Bayesian nonparametric approaches.

For nonparametric Bayes density estimation on the real line $\mathbb{R}$, there is a rich literature, with Dirichlet process mixtures of Gaussian kernels providing a commonly used approach ([6]) that leads to dense support ([5]) and weak and strong posterior consistency ([9]). From the celebrated theorem of [6], weak posterior consistency results when the true density $f_0$ is in the Kullback-Leibler (KL) support of the prior, meaning that all KL neighborhoods around $f_0$ are assigned positive probability. In general, it is quite difficult to show KL support for new priors for a density, though [9] provide useful conditions for a class of kernel mixture priors, with [3] extending these conditions to general compact metric spaces.
Key words and phrases. Nonparametric Bayes; Density estimation; Posterior consistency; Sample dependent prior; Riemannian manifold; Hypersphere; Planar shape space.

It is widely accepted that weak consistency is an insufficient property when the focus is on density estimation. For example, if $f_0$ is a density with respect to Lebesgue measure, weak consistency does not even ensure that the posterior assigns positive probability to the set of densities with respect to Lebesgue measure. Hence, it is important to provide stronger results. Until very recently, essentially all the literature on theory of nonparametric Bayes density estimation focused on one-dimensional Euclidean spaces. An important development in multivariate Euclidean spaces is the article of [20], who provide sufficient conditions for strong consistency in nonparametric Bayes density estimation from Dirichlet process mixtures of multivariate Gaussian kernels. The theory developed in their paper is specialized and cannot be easily generalized to arbitrary kernel mixtures on more general spaces.

We are particularly interested in density estimation in the special case in which the compact metric space $M$ corresponds to a Riemannian manifold, such as a unit hypersphere or landmark-based planar shape space. In order to extend kernel mixture models used in Euclidean spaces to manifolds $M$, the kernel needs to be carefully chosen. One approach is to introduce an invertible coordinate map between a subset of $M$ and a Euclidean space ([]). Under such an approach, the density prior on $M$ can be induced through a kernel mixture model in a Euclidean space. However, several major problems arise in using such an approach. Firstly, it is not possible to cover the entire manifold with a single smooth coordinate chart except for very simple manifolds, so unless the data are very concentrated one may obtain poor performance. Different local charts can be patched together to form an atlas, but this may introduce artifactual discontinuities in the resulting density. Because the coordinate map is not isometric, the geometry of the manifold can be heavily distorted. As good choices of coordinate frames necessarily depend on the observations, additional uncertainty is automatically induced.
Due to these and other shortcomings of coordinate based methods, we focus on modeling approaches that are coordinate free in the sense that we build density models with respect to the invariant volume form on the manifold. In [3], a density model is presented on a general compact metric space with respect to any fixed base measure using a random mixture of probability kernels. Under mild conditions on the kernel and the mixing prior, it is shown that the prior probability of any uniform neighborhood of any continuous density $f_0$ is positive and, if $f_0$ is positive everywhere, it lies in the KL support of the prior. Density estimation on the planar shape space is presented as a special case. In [2], such a density model is used to carry out classification with features on some non-Euclidean manifold and nonparametric Bayes hypothesis testing with observations on the manifold. Consistency results are proved and, for illustration, the methods are applied to hyperspheres.

Focusing on kernel mixture priors for densities on a compact metric space $M$, in this article we provide sufficient conditions on the kernel, prior and the underlying space to ensure strong consistency. Theorem 2.4 and Corollary 2.5 provide sufficient conditions to ensure that all total variation neighborhoods around $f_0$ will be assigned probability converging to one as the sample size increases. The theoretical development relies on the method of sieves and exponentially consistent tests reviewed in [8]. However, applying this framework outside Euclidean spaces is not standard and requires careful use of differential geometry. To illustrate the theory, we focus on density estimation on the unit hypersphere using von Mises-Fisher kernels and on the planar shape space using complex Watson kernels. In both these cases, it is shown that the kernels satisfy the sufficient conditions. The results also apply to Gaussian mixture densities on $\mathbb{R}^d$ whenever the true density has compact support.

When the manifold is high-dimensional, priors satisfying conditions for strong consistency tend to put too little probability near bandwidths close to 0, which is undesirable for applications. A gamma prior on the inverse-bandwidth, for example, cannot be shown to satisfy the conditions. Hence, we extend the consistency results to cover cases with priors depending on the sample size $n$. Theorem 2.6 extends the Schwartz theorem to prove weak consistency, while Theorem 2.9 proves strong consistency using such priors. A gamma prior with scale decreasing with $n$ at an appropriate rate satisfies the conditions for both weak and strong posterior consistency at an exponential rate.

2. Consistency theorems on compact metric spaces

2.1. Weak posterior consistency.

Let $(M, \rho)$ be a compact metric space, $\rho$ being the distance metric, and let $X$ be a random variable on $M$ (from some measurable space $(\Omega, \mathcal{B}, Q)$). We assume that the distribution of $X$ has a density with respect to some fixed finite base measure $\lambda$ on $M$. The natural choice for such a $\lambda$ when $M$ is a Riemannian manifold is the invariant volume form. We are interested in modelling this unknown density via a flexible model. Let $K(m; \mu, \kappa)$ be a probability kernel on $M$ with location $\mu \in M$ and inverse-scale $\kappa \in [0, \infty)$, with $\int_M K(m; \mu, \kappa)\,\lambda(dm) = 1$. Then a location mixture density model for $X$ is defined as

\[ f(m; P, \kappa) = \int_M K(m; \mu, \kappa)\, P(d\mu) \tag{2.1} \]

with parameters $P$ in the space $\mathcal{M}(M)$ of all probability distributions on $M$ and $\kappa \ge 0$. Kernel mixture models are used routinely in Bayesian density estimation in Euclidean spaces, with [4] applying such an approach to bivariate angular data and [2, 3] considering kernel mixtures on general metric spaces. A prior $\Pi_1$ on $(P, \kappa)$ induces a prior $\Pi$ on the space of densities $\mathcal{D}(M)$ on $M$ through the model (2.1). Given a random realization $X_1, \dots, X_n$ of $X$, we can compute the posterior of $f$.
The Schwartz theorem ([6]) provides a useful tool in proving that the posterior assigns probability converging to one in arbitrarily small neighborhoods of the true density $f_0$ as the sample size $n \to \infty$. Let $F_0$ denote the probability distribution corresponding to $f_0$, let $KL(f_0; f) = \int_M f_0(m) \log\{f_0(m)/f(m)\}\,\lambda(dm)$ denote the KL divergence of another density $f$ from $f_0$, and let $K_\epsilon(f_0)$ denote the KL neighborhood $\{f \in \mathcal{D}(M) : KL(f_0; f) < \epsilon\}$. $f_0$ is said to be in the KL support of $\Pi$ if $\Pi\{K_\epsilon(f_0)\} > 0$ for all $\epsilon > 0$.

Proposition 2.1 (Schwartz). If (1) $f_0$ is in the KL support of $\Pi$, and (2) $U \subset \mathcal{D}(M)$ is such that there exists a uniformly exponentially consistent sequence of test functions for testing $H_0 : f = f_0$ versus $H_1 : f \in U^c$, then $\Pi(U \mid X_1, \dots, X_n) \to 1$ as $n \to \infty$ a.s. $F_0^\infty$.

The posterior probability of $U^c$ can be expressed as

\[ \Pi(U^c \mid X_1, \dots, X_n) = \frac{\int_{U^c} \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi(df)}{\int \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi(df)}. \tag{2.2} \]

Condition (1), known as the KL condition, ensures that for any $\beta > 0$,

\[ \liminf_{n \to \infty} \exp(n\beta) \int \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi(df) = \infty \quad \text{a.s.}, \tag{2.3} \]

while condition (2) implies that

\[ \lim_{n \to \infty} \exp(n\beta_0) \int_{U^c} \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi(df) = 0 \quad \text{a.s.} \]

for some $\beta_0 > 0$ and therefore

\[ \lim_{n \to \infty} \exp(n\beta_0/2)\, \Pi(U^c \mid X_1, \dots, X_n) = 0 \quad \text{a.s.} \]

Hence Proposition 2.1 provides conditions for posterior consistency at an exponential rate. Proposition 2.2, proved in [3], derives sufficient conditions on the kernel and the prior so that $f_0$ is in the KL support of $\Pi$. They are

A1 The kernel $K$ is continuous on $M \times M \times (\kappa_0, \infty)$ for some $\kappa_0 \ge 0$.
A2 $\lim_{\kappa \to \infty} \sup_{m \in M} \left| f_0(m) - \int_M K(m; \mu, \kappa) f_0(\mu)\,\lambda(d\mu) \right| = 0$.
A3 For any $P \in \mathcal{M}(M)$ and $\kappa > 0$, there exists $\tilde\kappa \ge \kappa$ such that $(P, \tilde\kappa) \in \mathrm{supp}(\Pi_1)$, with $\mathrm{supp}(\Pi_1)$ denoting the weak support of $\Pi_1$.
A4 $f_0$ is strictly positive and continuous everywhere.

Proposition 2.2. Under assumptions A1-A4, for any $\epsilon > 0$,

\[ \Pi\left\{ f : \sup_{m \in M} |f(m) - f_0(m)| < \epsilon \right\} > 0, \]

which implies that $f_0$ is in the KL support of $\Pi$.

When $U$ is a weakly open neighborhood of $f_0$, condition (2) in Proposition 2.1 is always satisfied. Hence under assumptions A1-A4, from Proposition 2.2, weak posterior consistency at an exponential rate follows. We will provide examples of kernels on some compact manifolds which satisfy A1 and A2. A3 imposes a mild support condition on the prior on the mixing distribution and bandwidth which is easily satisfied by several priors. A common choice is $\Pi_1 = \Pi_{11} \otimes \pi_1$ with $\Pi_{11}$ being a Dirichlet process $DP(w_0 P_0)$ with $\mathrm{supp}(P_0) = M$ and $\pi_1$ being a density on $\mathbb{R}^+$ giving non-zero probability near infinity.

2.2. Strong consistency.

When $U$ is a total variation neighborhood of $f_0$, [3] and [] show that condition (2) of Proposition 2.1 will not be satisfied in most cases. In [] (also see [9]), a sieve method is considered to obtain sufficient conditions for the numerator in (2.2) to decay at an exponential rate and hence get strong posterior consistency at an exponential rate. This is stated in Proposition 2.3.
In its statement, for $\mathcal{F} \subset \mathcal{D}(M)$ and $\epsilon > 0$, the $L_1$-metric entropy $N(\epsilon, \mathcal{F})$ is defined as the logarithm of the minimum number of $\epsilon$-sized (or smaller) $L_1$ subsets needed to cover $\mathcal{F}$.

Proposition 2.3. If there exists a $D_n \subset \mathcal{D}(M)$ such that (1) for $n$ sufficiently large, $\Pi(D_n^c) < \exp(-n\beta)$ for some $\beta > 0$, and (2) $N(\epsilon, D_n)/n \to 0$ as $n \to \infty$ for any $\epsilon > 0$, then for any total variation neighborhood $U$ of $f_0$, there exists a $\beta_0 > 0$ such that

\[ \limsup_{n \to \infty} \exp(n\beta_0) \int_{U^c} \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi(df) = 0 \quad \text{a.s. } F_0^\infty. \]

Hence if $f_0$ is in the KL support of $\Pi$, the posterior probability of any total variation neighborhood of $f_0$ converges to 1 almost surely.

Theorem 2.4, which is the main theorem of this paper, describes a $D_n$ which satisfies condition (2). We impose the following additional restrictions on the kernel $K$ and the space $M$.

A5 There exist positive constants $K_1, a_1, A_1$ such that for all $K \ge K_1$ and $\mu, \nu \in M$,
\[ \sup_{m \in M,\, \kappa \in [0, K]} |K(m; \mu, \kappa) - K(m; \nu, \kappa)| \le A_1 K^{a_1} \rho(\mu, \nu). \]
A6 There exist positive constants $a_2, A_2$ such that for all $\kappa_1, \kappa_2 \in [0, K]$, $K \ge K_1$,
\[ \sup_{m, \mu \in M} |K(m; \mu, \kappa_1) - K(m; \mu, \kappa_2)| \le A_2 K^{a_2} |\kappa_1 - \kappa_2|. \]
A7 There exist positive constants $a_3, A_3, A_4$ such that given any $\epsilon > 0$, $M$ can be covered by $A_3 \epsilon^{-a_3} + A_4$ or fewer subsets of diameter at most $\epsilon$.

Theorem 2.4. For a positive sequence $\{\kappa_n\}$ diverging to $\infty$, define $D_n = \{ f(P, \kappa) : P \in \mathcal{M}(M),\ \kappa \in [0, \kappa_n] \}$. Under assumptions A5-A7, given any $\epsilon > 0$, for $n$ sufficiently large, $N(\epsilon, D_n) \le C(\epsilon)\, \kappa_n^{a_1 a_3}$ for some $C(\epsilon) > 0$. Hence $N(\epsilon, D_n)$ is $o(n)$, that is, $\lim_{n \to \infty} N(\epsilon, D_n)/n = 0$, whenever $\kappa_n = o\big(n^{(a_1 a_3)^{-1}}\big)$.

As a corollary, we derive conditions on the prior $\Pi_1$ on $(P, \kappa)$ under which strong posterior consistency at an exponential rate follows.

Corollary 2.5. Under assumptions A1-A7 and
A8 $\Pi_1\big(\mathcal{M}(M) \times (n^a, \infty)\big) < \exp(-n\beta)$ for some $a < (a_1 a_3)^{-1}$ and $\beta > 0$,
the posterior probability of any total variation neighborhood of $f_0$ converges to 1 a.s. $F_0^\infty$.

When we choose $\Pi_1 = \Pi_{11} \otimes \pi_1$ with a Dirichlet process $\Pi_{11}$ as in Section 2.1, a choice for $\pi_1$ for which assumptions A3 and A8 are satisfied is a Weibull density with shape parameter exceeding $a_1 a_3$.

Remark 2.1. A gamma prior on $\kappa$ satisfies A3 but not A8 (unless $a_1 a_3 < 1$). However, that does not prove that it is not eligible for strong consistency, because Corollary 2.5 provides only sufficient conditions. When the underlying space is non-compact (but separable) such as $\mathbb{R}^d$, Corollary 2.5 applies to any true density $f_0$ with compact support, say $M$. Then the kernel can be chosen to have non-compact support, such as Gaussian, but the prior on the location mixing distribution needs to have support in $\mathcal{M}(M)$.
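The contrast between Weibull and gamma priors in A8 can be checked numerically. The following is a minimal sketch, not part of the paper: it assumes the illustrative value $a_1 a_3 = 4$, a Weibull shape of 5, the exponent $a = 0.22$, and uses the exponential distribution Gam(1, 1) as the gamma example because its tail has a closed form. All function and variable names are hypothetical. A8 asks that $n^{-1} \log \Pi_1(\kappa > n^a)$ stay below some $-\beta < 0$.

```python
import math


def log_weibull_tail(x, shape, scale=1.0):
    """log P(kappa > x) under a Weibull(shape, scale) prior."""
    return -((x / scale) ** shape)


def log_exponential_tail(x, rate=1.0):
    """log P(kappa > x) under a Gam(1, rate), i.e. exponential, prior."""
    return -rate * x


def a8_ratio(log_tail, n, a):
    """(1/n) * log P(kappa > n**a); A8 needs this to stay below -beta < 0."""
    return log_tail(n ** a) / n


# Illustrative values (assumptions): a1 * a3 = 4, so A8 needs a < 1/4.
A = 0.22
WEIBULL_SHAPE = 5.0  # exceeds a1 * a3 = 4, and A * WEIBULL_SHAPE > 1
SAMPLE_SIZES = [10, 1_000, 1_000_000]

weibull_ratios = [
    a8_ratio(lambda x: log_weibull_tail(x, WEIBULL_SHAPE), n, A)
    for n in SAMPLE_SIZES
]
exponential_ratios = [a8_ratio(log_exponential_tail, n, A) for n in SAMPLE_SIZES]

for n, w, e in zip(SAMPLE_SIZES, weibull_ratios, exponential_ratios):
    print(f"n={n:>9}: Weibull {w:9.3f}   exponential {e:9.5f}")
```

With these choices the Weibull ratio equals $-n^{0.1}$ and stays below $-1$, while the exponential ratio equals $-n^{-0.78}$ and drifts to zero, so no fixed $\beta > 0$ works for it at this $a$.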
We may even weaken assumption A5 to

A5′ $\sup_{\kappa \in [0, K]} \| K(\mu, \kappa) - K(\nu, \kappa) \| \le A_1 K^{a_1} \rho(\mu, \nu)$,

where $\| f - g \|$ denotes the $L_1$ distance. The proof of Theorem 2.4 can be easily modified to show consistency under this assumption and is left to the reader. In such a case, we are modelling a compactly supported density with a mixture density possibly having full support but with locations drawn from a compact domain. Using a location mixture of Gaussian kernels on $\mathbb{R}^d$, $a_1$ and $a_3$ from Assumptions A5′ and A7 can be shown to be $d/2$ and $d$ respectively. Hence we can take $\pi_1$ to be Weibull with shape parameter exceeding $d^2/2$, which can be the gamma prior in one dimension.

Remark 2.2. Unlike in [9] and [20], Corollary 2.5 imposes no support restriction on the scale parameter. It will be extended to cover densities with non-compact support, in particular on $\mathbb{R}^d$, in later works. Since most of the non-Euclidean manifolds arising in applications are compact, that is not a high priority.

2.3. Consistency with sample size-dependent priors.

When the dimension of the manifold is large, as is the case in shape analysis with a large number of landmarks, the constraints on the shape parameter in the proposed Weibull prior on the inverse bandwidth become overly restrictive. In particular, for strong posterior consistency, the shape parameter needs to be very large in high-dimensional cases, implying a prior on the bandwidth that places very small probability in neighborhoods close to zero, which is undesirable in many applications. By instead allowing the prior to depend on sample size $n$, we can potentially obtain priors that may have better small sample operating characteristics, while still leading to strong consistency. However, for $n$-dependent priors, the KL condition is no longer sufficient to ensure that (2.3) holds and hence the Schwartz theorem breaks down. In this section, we will modify the conditions and derive weak and strong consistency results for $n$-dependent priors.

As recommended in earlier sections, we let $P$ and $\kappa$ be independent under $\Pi_1$. Then, assuming $P \sim \Pi_{11}$ is a constant prior, we focus on the case in which the inverse-bandwidth has a sample size-dependent prior distribution on $\mathbb{R}^+$, $\kappa \sim \pi_n$. Denote the resulting sequence of induced priors on $\mathcal{D}(M)$ as $\Pi_n$. Theorem 2.6 proves weak posterior consistency under the following assumptions on the prior.

A9 The prior $\Pi_{11}$ has full support.
A10 For any $\beta > 0$, there exists a $\kappa_0 \ge 0$ such that for all $\kappa \ge \kappa_0$, $\liminf_{n \to \infty} \exp(n\beta)\, \pi_n(\kappa) = \infty$.

Theorem 2.6. Under assumptions A1 and A2 on the kernel, A9 and A10 on the prior and A4 on the true density $f_0$, the posterior probability of any weak neighborhood of $f_0$ converges to one a.s. $F_0^\infty$.

The proof is immediate from the following two lemmas.

Lemma 2.7. Under assumptions A1-A2, A4 and A9-A10, for any $\beta > 0$,

\[ \liminf_{n \to \infty} \exp(n\beta) \int \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi_n(df) = \infty \quad \text{a.s. } F_0^\infty. \]

Lemma 2.8. If there exists a uniformly exponentially consistent sequence of test functions for testing $H_0 : f = f_0$ versus $H_1 : f \in U^c$, and $\Pi_n(U^c) > 0$ for all $n$, then for some $\beta_0 > 0$,

\[ \lim_{n \to \infty} \exp(n\beta_0) \int_{U^c} \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\, \Pi_n(df) = 0 \quad \text{a.s. } F_0^\infty. \]

The proof of Lemma 2.8 is related to that of Lemma 4.4.2 of [10], which is stated for a constant prior $\Pi$ but with the set $U^c$ depending on $n$; they call this $V_n$. There it is assumed that $\liminf_{n \to \infty} \Pi(V_n) > 0$, but that is not necessary as long as $\Pi(V_n) > 0$ for all $n > C$ with $C$ a sufficiently large constant. A gamma prior $\pi_n(\kappa) \propto \exp(-\beta_n \kappa)\, \kappa^{\alpha - 1}$, $\alpha, \beta_n > 0$, denoted by $\mathrm{Gam}(\alpha, \beta_n)$, satisfies assumption A10 as long as $\beta_n$ is $o(n)$. For strong consistency, we impose the following additional condition on $\pi_n$. Let $a_1$ and $a_3$ be as in assumptions A5 and A7.

A11 For some $\beta_0 > 0$ and $a < (a_1 a_3)^{-1}$, $\lim_{n \to \infty} \exp(n\beta_0)\, \pi_n\{(n^a, \infty)\} = 0$.

Theorem 2.9. Under assumptions A1-A2, A4-A7 and A9-A11, the posterior probability of any total variation neighborhood of $f_0$ converges to 1 a.s. $F_0^\infty$.

The proof is very similar to that of Corollary 2.5 and hence is omitted. A $\mathrm{Gam}(\alpha, \beta_n)$ prior satisfies A11 when $n^{1 - (a_1 a_3)^{-1}}$ is $o(\beta_n)$. Hence, for example, we have weak and strong posterior consistency with $\beta_n = b_1 n / \{\log(n)\}^{b_2}$ for any $b_1, b_2 > 0$. In the subsequent sections, we consider density estimation on two specific compact manifolds, namely the hypersphere and the planar shape space. We construct mixture models using suitable kernels which satisfy the requirements for weak and strong consistency.

3. Application to unit hypersphere

Let $M$ be the unit sphere $S^d$ embedded in $\mathbb{R}^{d+1}$. It is a compact Riemannian manifold of dimension $d$ and a compact metric space under the chord distance $\rho(u, v) = \| u - v \|_2$, $\| \cdot \|_2$ denoting the $L_2$-norm. To define a probability density model as in (2.1) with respect to the volume form $V$, we need a suitable kernel which satisfies the assumptions in Section 2. One of the most commonly used probability densities on this space is the von Mises-Fisher (vMF) density, which is given by

\[ \mathrm{vMF}(m; \mu, \kappa) = c^{-1}(\kappa) \exp(\kappa\, m^T \mu), \tag{3.1} \]

with $c$ being the normalizing constant, which can be derived to be

\[ c(\kappa) = \frac{2\pi^{d/2}}{\Gamma(d/2)} \int_{-1}^{1} \exp(\kappa t)\, (1 - t^2)^{d/2 - 1}\, dt. \tag{3.2} \]

The vMF density on $S^1$ was first derived in [7] and the density in case of $S^2$ was given by [7]. [8] generalized this distribution to $S^d$ and examined many of its properties.
It can be shown that the parameter $\mu$ is the extrinsic mean (as defined in [4]), and hence can be interpreted as the distribution location. The parameter $\kappa$ is a measure of concentration, with $\kappa = 0$ corresponding to the uniform distribution having constant density equal to $1 / \int_{S^d} V(dm)$. As $\kappa$ diverges to $\infty$, the vMF distribution converges to a point mass at $\mu$ in an $L_1$ sense uniformly. This is proved in Theorem 3.1.

Theorem 3.1. The vMF kernel satisfies assumption A1 with $\kappa_0 = 0$ and assumption A2 for any continuous $f_0$.
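To make the vMF kernel concrete, the sketch below evaluates a normalizing constant of the form $c(\kappa) = \{2\pi^{d/2}/\Gamma(d/2)\} \int_{-1}^{1} \exp(\kappa t)(1 - t^2)^{d/2 - 1}\,dt$ with a plain midpoint rule and uses it to evaluate the density $c^{-1}(\kappa)\exp(\kappa\, m^T\mu)$. It is an illustration under stated assumptions, not the authors' code: the function names and the number of quadrature steps are hypothetical, and the midpoint rule is chosen only because it needs no external library. For $d = 2$ the integral has the closed form $4\pi \sinh(\kappa)/\kappa$, which gives a convenient check.

```python
import math


def vmf_norm_const(kappa, d, steps=20000):
    """Midpoint-rule approximation of c(kappa) for the vMF kernel on S^d."""
    # Surface area of S^(d-1), the factor in front of the integral.
    surface = 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h  # midpoint of the i-th subinterval of (-1, 1)
        total += math.exp(kappa * t) * (1.0 - t * t) ** (d / 2.0 - 1.0)
    return surface * total * h


def vmf_density(m, mu, kappa):
    """vMF density at unit vector m with location mu (both in R^(d+1))."""
    d = len(m) - 1
    dot = sum(a * b for a, b in zip(m, mu))
    return math.exp(kappa * dot) / vmf_norm_const(kappa, d)


north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)
# The density is largest at the location and smallest at the antipode.
print(vmf_density(north, north, 3.0), vmf_density(south, north, 3.0))
```

At $\kappa = 0$ the constant reduces to the total volume of the sphere, so the density is uniform, in line with the interpretation of $\kappa$ as a concentration parameter.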

Hence from Proposition 2.2, weak posterior consistency follows using the location mixture density model (2.1) with a Dirichlet process prior on $P$ and an independent gamma prior on $\kappa$. In the $d = 2$ special case, [4] proposed a closely related model but did not consider theoretical properties. Theorem 3.2 verifies the assumptions for strong consistency.

Theorem 3.2. The vMF kernel on $S^d$ satisfies assumption A5 with $a_1 = d/2 + 1$ and A6 with $a_2 = d/2$. The compact metric space $(S^d, \rho)$ satisfies assumption A7 with $a_3 = d$.

As a result, a Weibull prior on $\kappa$ with shape parameter exceeding $(d + d^2/2)$ satisfies the condition of Corollary 2.5 and strong posterior consistency follows. The proofs of Theorems 3.1 and 3.2 use the following lemma, which establishes certain properties of the normalizing constant.

Lemma 3.3. Define $\tilde c(\kappa) = \exp(-\kappa)\, c(\kappa)$, $\kappa \ge 0$. Then $\tilde c$ is decreasing and for $\kappa \ge 1$, $\tilde c(\kappa) \ge C \kappa^{-d/2}$ for some appropriate positive constant $C$.

When $d$ is large, as is often the case for spherical data, a more appropriate prior on $\kappa$ for which weak and strong consistencies hold can be $\mathrm{Gam}(\alpha, \beta_n)$ as mentioned at the end of Section 2.3.

4. Planar Shape Space

4.1. Background.

Let $M$ be the planar shape space $\Sigma_2^k$, which is defined as follows. Consider a set of $k$ landmark locations, $k > 2$, on a 2D image, not all points being the same. We refer to such a set as a $k$-ad. The similarity shape of this $k$-ad is what remains after removing the Euclidean rigid body motions of translation, rotation and scaling. We use the following shape representation first proposed by [2]. Denote the $k$-ad by a complex $k$-vector $z$ in $\mathbb{C}^k$. To remove the effect of translation from $z$, let $z_c = z - \bar z$, with $\bar z = (\sum_{j=1}^k z_j)/k$ being the centroid. The centered $k$-ad $z_c$ lies in a $k - 1$ dimensional complex subspace, and hence we can use $k - 1$ complex coordinates. The effect of scaling is then removed by normalizing the coordinates of $z_c$ to obtain a point $w$ on the complex unit sphere $\mathbb{C}S^{k-2}$ in $\mathbb{C}^{k-1}$. Since $w$ contains the shape information of $z$ along with rotation, it is called the preshape of $z$.
The similarity shape of $z$ is the orbit of $w$ under all rotations in 2D, which is $[w] = \{ e^{i\theta} w : \theta \in (-\pi, \pi] \}$. This represents a shape as the set of all intersection points of a unique complex line passing through the origin with $\mathbb{C}S^{k-2}$, and the planar shape space $\Sigma_2^k$ is then the set of all such shapes. Hence $\Sigma_2^k$ can be identified with the space of all complex lines passing through the origin in $\mathbb{C}^{k-1}$, which is the complex projective space and is a compact Riemannian manifold of dimension $2k - 4$. Then $\Sigma_2^k$ can be embedded into the space of all order $k - 1$ complex Hermitian matrices via the embedding $J([w]) = w w^*$, $*$ denoting the complex conjugate transpose. This embedding induces a distance on $\Sigma_2^k$ called the extrinsic distance, which generates the manifold topology and is given by

\[ d_E([u], [v]) = \| J([u]) - J([v]) \| = \sqrt{2\,(1 - |u^* v|^2)} \qquad ([u], [v] \in \Sigma_2^k). \]

For more details, see [3] and the references cited therein.
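The construction above (center, scale to unit norm, then compare shapes through $|u^* v|$) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the preshape is kept as a centered unit vector in $\mathbb{C}^k$ rather than being mapped to $k - 1$ complex coordinates, which leaves $|u^* v|$ and hence the extrinsic distance $\sqrt{2(1 - |u^* v|^2)}$ unchanged.

```python
import cmath
import math


def preshape(landmarks):
    """Center a k-ad (sequence of complex landmarks) and scale it to unit norm."""
    k = len(landmarks)
    centroid = sum(landmarks) / k
    centered = [z - centroid for z in landmarks]
    norm = math.sqrt(sum(abs(z) ** 2 for z in centered))
    if norm == 0.0:
        raise ValueError("all landmarks coincide, so the shape is undefined")
    return [z / norm for z in centered]


def extrinsic_distance(u, v):
    """Extrinsic distance sqrt(2 * (1 - |u* v|^2)) between two preshapes."""
    inner = sum(a.conjugate() * b for a, b in zip(u, v))  # u* v
    # Clamp at zero to absorb rounding when the two shapes coincide.
    return math.sqrt(max(0.0, 2.0 * (1.0 - abs(inner) ** 2)))


triangle = [0 + 0j, 1 + 0j, 0 + 1j]
# Same triangle after rotation by 0.7 rad, scaling by 2.5 and translation.
moved = [cmath.exp(0.7j) * 2.5 * z + (3 - 2j) for z in triangle]
other = [0 + 0j, 1 + 0j, 0.5 + 2j]

print(extrinsic_distance(preshape(triangle), preshape(moved)))
print(extrinsic_distance(preshape(triangle), preshape(other)))
```

The first distance is zero up to rounding because translation, rotation and scaling do not change the shape; the second is strictly positive because the two triangles have different shapes.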

4.2. Density model.

We define a location-mixture density on $\Sigma_2^k$ as in (2.1) with respect to the Riemannian volume form $V$ and the kernel being a complex Watson density. This complex Watson density was used in [5] for parametric density modelling and is given by

\[ \mathrm{CW}(m; \mu, \kappa) = c^{-1}(\kappa) \exp\{\kappa\,(|z^* \nu|^2 - 1)\} \qquad (m = [z],\ \mu = [\nu]) \tag{4.1} \]

with

\[ c(\kappa) = \pi^{k-2} \kappa^{2-k} \left\{ 1 - \exp(-\kappa) \sum_{r=0}^{k-3} \frac{\kappa^r}{r!} \right\}. \tag{4.2} \]

It is shown in [3] that the complex Watson kernel satisfies assumptions A1 and A2 in Section 2. Using a Dirichlet process prior on the location mixing distribution and an independent gamma prior on the inverse-scale parameter, Proposition 2.2 implies that the density model (2.1) has full support in the space of all positive continuous densities on $\Sigma_2^k$ in uniform and KL sense, and hence the posterior is weakly consistent. Theorem 4.1 verifies that the complex Watson kernel also satisfies the regularity conditions in A5 and A6.

Theorem 4.1. The complex Watson kernel $\mathrm{CW}(m; \mu, \kappa)$ on the compact metric space $\Sigma_2^k$ endowed with the extrinsic distance $d_E$ satisfies assumption A5 with $a_1 = k - 1$ and A6 with $a_2 = 3k - 8$.

The proof uses Lemma 4.2, which verifies certain properties of the normalizing constant.

Lemma 4.2. Let $c(\kappa)$ be the normalizing constant for $\mathrm{CW}(\mu, \kappa)$ as defined in (4.2). Then $c$ is decreasing on $[0, \infty)$ with

\[ \lim_{\kappa \to 0} c(\kappa) = \frac{\pi^{k-2}}{(k-2)!} \quad \text{and} \quad \lim_{\kappa \to \infty} c(\kappa) = 0. \]

If we define

\[ \tilde c(\kappa) = 1 - \exp(-\kappa) \sum_{r=0}^{k-3} \frac{\kappa^r}{r!}, \]

it follows that $\tilde c$ is increasing with

\[ \lim_{\kappa \to 0} \tilde c(\kappa) = 0, \quad \lim_{\kappa \to \infty} \tilde c(\kappa) = 1 \quad \text{and} \quad \tilde c(\kappa) \ge \frac{1}{(k-2)!} \exp(-\kappa)\, \kappa^{k-2}. \]

Proof. Follows from direct computations.

Theorem 4.3 verifies that assumption A7 holds on $\Sigma_2^k$.

Theorem 4.3. The compact metric space $(\Sigma_2^k, d_E)$ satisfies assumption A7 with $a_3 = 2k - 3$.

As a result, Corollary 2.5 implies that strong posterior consistency holds with $\Pi_1 = DP(\omega_0 P_0) \otimes \pi_1$, for Weibull $\pi_1$ with shape parameter exceeding $(2k - 3)(k - 1)$. Alternatively, one may use a gamma prior on $\kappa$ with inverse-scale increasing with $n$ at a suitable rate, and we have consistency using Theorems 2.6 and 2.9.
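The properties claimed for the complex Watson normalizing constant are easy to probe numerically. The sketch below assumes the constant has the form $c(\kappa) = \pi^{k-2}\kappa^{2-k}\,\tilde c(\kappa)$ with $\tilde c(\kappa) = 1 - \exp(-\kappa)\sum_{r=0}^{k-3}\kappa^r/r!$; this form is a reading of the garbled source formula and should be checked against the published paper. The function names are hypothetical.

```python
import math


def cw_tilde(kappa, k):
    """c~(kappa) = 1 - exp(-kappa) * sum_{r=0}^{k-3} kappa^r / r!."""
    partial = sum(kappa ** r / math.factorial(r) for r in range(k - 2))
    # Note: for very small kappa and large k this subtraction loses precision.
    return 1.0 - math.exp(-kappa) * partial


def cw_norm_const(kappa, k):
    """Assumed complex Watson constant pi^(k-2) * kappa^(2-k) * c~(kappa)."""
    if kappa == 0.0:
        return math.pi ** (k - 2) / math.factorial(k - 2)  # limiting value
    return math.pi ** (k - 2) * kappa ** (2 - k) * cw_tilde(kappa, k)


K_LANDMARKS = 4
GRID = [0.5, 1.0, 2.0, 5.0, 10.0]
c_values = [cw_norm_const(kappa, K_LANDMARKS) for kappa in GRID]
tilde_values = [cw_tilde(kappa, K_LANDMARKS) for kappa in GRID]

for kappa, c, ct in zip(GRID, c_values, tilde_values):
    print(f"kappa={kappa:5.1f}  c={c:8.4f}  c_tilde={ct:6.4f}")
```

On this grid $c$ decreases and $\tilde c$ increases toward 1, and $\tilde c(\kappa)$ stays above $\exp(-\kappa)\kappa^{k-2}/(k-2)!$, matching the statements of Lemma 4.2 under the assumed formula.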

5. Summary

We consider kernel mixture density models on general compact metric spaces and obtain sufficient conditions on the kernel, priors and the space for the density estimate to be strongly consistent. Thereby we extend the existing literature on strong posterior consistency on $\mathbb{R}$ using Gaussian kernels to more general non-Euclidean manifolds. The conditions are verified for specific kernels on two important manifolds, namely the hypersphere and the planar shape space. We also allow the prior to depend on the sample size and obtain sufficient conditions for weak and strong consistency. The assumption that the true density is positive everywhere can be relaxed if the locations for the mixture density model are drawn from the support of the truth.

6. Appendix

6.1. Proof of Theorem 2.4.

In this proof and the subsequent ones, we shall use a general symbol $C$ for any constant not depending on $n$ (but possibly on $\epsilon$).

Proof. Given $\delta_1 > 0$ ($\equiv \delta_1(\epsilon, n)$), cover $M$ by $N_1$ ($\equiv N_1(\delta_1)$) many disjoint subsets of diameter at most $\delta_1$: $M = \cup_{i=1}^{N_1} E_i$. Assumption A7 implies that for $\delta_1$ sufficiently small, $N_1 \le C \delta_1^{-a_3}$. Pick $\mu_i \in E_i$, $i = 1, \dots, N_1$, and define for a probability $P$,

\[ P_n = \sum_{i=1}^{N_1} P(E_i)\, \delta_{\mu_i}, \qquad P(E) = (P(E_1), \dots, P(E_{N_1}))^T. \tag{6.1} \]

Denoting the $L_1$-norm as $\| \cdot \|$, for any $\kappa \le \kappa_n$,

\[ \| f(P, \kappa) - f(P_n, \kappa) \| \le \sum_{i=1}^{N_1} \int_{E_i} \| K(\mu, \kappa) - K(\mu_i, \kappa) \|\, P(d\mu) \tag{6.2} \]
\[ \le C \sum_i \int_{E_i} \sup_{m \in M} |K(m; \mu, \kappa) - K(m; \mu_i, \kappa)|\, P(d\mu) \]
\[ \le C \kappa_n^{a_1} \delta_1. \tag{6.3} \]

The inequality in (6.3) follows from (6.2) using Assumption A5. For $\kappa, \tilde\kappa \le \kappa_n$, $P \in \mathcal{M}(M)$,

\[ \| f(P, \kappa) - f(P, \tilde\kappa) \| \le C \sup_{m, \mu \in M} |K(m; \mu, \kappa) - K(m; \mu, \tilde\kappa)| \le C \kappa_n^{a_2} |\kappa - \tilde\kappa|, \tag{6.4} \]

the inequality in (6.4) following from Assumption A6. Given $\delta_2 > 0$ ($\equiv \delta_2(\epsilon, n)$), cover $[0, \kappa_n]$ by finitely many subsets of length at most $\delta_2$, the number of such subsets required being at most $\kappa_n \delta_2^{-1}$. Call the collection of these subsets $W(\delta_2, n)$. Letting $S_d = \{ x \in [0, 1]^d : \sum x_i \le 1 \}$, $S_d$ is compact under the $L_1$-metric ($\| x \|_{L_1} = \sum |x_i|$, $x \in \mathbb{R}^d$), and hence given any $\delta_3 > 0$ ($\equiv \delta_3(\epsilon)$), it can be covered by finitely many subsets of the cube $[0, 1]^d$, each of diameter at most $\delta_3$.
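In particular cover $S_d$ with cubes of side length $\delta_3/d$ lying partially or totally in $S_d$. Then an upper bound on the number $N_2 \equiv N_2(\delta_3, d)$ of such cubes can be shown to be $\lambda(S_d(1 + \delta_3))\,(\delta_3/d)^{-d}$, $\lambda$ denoting the Lebesgue measure on $\mathbb{R}^d$ and $S_d(r) = \{x \in [0, \infty)^d : \sum x_i \le r\}$. Since $\lambda(S_d(r)) = r^d/d!$, hence

$$N_2(\delta_3, d) \le \frac{d^d}{d!}\left(\frac{1 + \delta_3}{\delta_3}\right)^d.$$

Let $\mathcal{W}(\delta_3, d)$ denote the partition of $S_d$ as constructed above. Let $d_n = N_1(\delta_1)$. For $1 \le i \le N_2(\delta_3, d_n)$, $1 \le j \le \kappa_n\delta_2^{-1}$, define

$$D_{ij} = \{f(P, \kappa) : P_n(E) \in W_i,\ \kappa \in W_j\},$$

with $W_i$ and $W_j$ being elements of $\mathcal{W}(\delta_3, d_n)$ and $W(\delta_2, n)$ respectively. We claim that this subset of $D_n$ has $L^1$ diameter of at most $\epsilon$. For $f(P, \kappa)$, $f(\tilde P, \tilde\kappa)$ in this set,

$$\|f(P, \kappa) - f(\tilde P, \tilde\kappa)\| \le \|f(P, \kappa) - f(P_n, \kappa)\| + \|f(P_n, \kappa) - f(\tilde P_n, \kappa)\| + \|f(\tilde P_n, \kappa) - f(\tilde P, \kappa)\| + \|f(\tilde P, \kappa) - f(\tilde P, \tilde\kappa)\|. \tag{6.5}$$

From inequality (6.3), it follows that the first and third terms in (6.5) are at most $C\kappa_n^{a_1}\delta_1$. The second term can be bounded by

$$\sum_{i=1}^{d_n} |P(E_i) - \tilde P(E_i)| < \delta_3$$

and from the inequality in (6.4), the fourth term is bounded by $C\kappa_n^{a_2}\delta_2$. Hence the claim holds if we choose $\delta_1 = C\kappa_n^{-a_1}$, $\delta_2 = C\kappa_n^{-a_2}$, and $\delta_3 = C$. The number of such subsets covering $D_n$ is at most $N_2(\delta_3, d_n)\,\kappa_n\delta_2^{-1}$. From Assumption A7, it follows that for $n$ sufficiently large, $d_n = N_1(\delta_1) \le C\kappa_n^{a_1 a_3}$. Using Stirling's formula, we can bound $\log(N_2(\delta_3, d_n))$ by $Cd_n$. Also $\kappa_n\delta_2^{-1}$ is bounded by $C\kappa_n^{a_2 + 1}$, so that

$$N(\epsilon, D_n) \le C + C\log(\kappa_n) + Cd_n \le C\kappa_n^{a_1 a_3}$$

for $n$ sufficiently large. This completes the proof.

6.2. Proof of Lemma 2.7.

Proof. Under assumptions A1 and A2, from the proof of Proposition 2.2, it follows that given $\epsilon > 0$, for any $\kappa_0 \ge 0$, there exist $\kappa_2 > \kappa_1 > \kappa_0$ and a weakly open neighborhood $W$ of $F_0$ (all depending on $\epsilon$), such that $K_\epsilon(f_0)$ contains $\{f(P, \kappa) : P \in W,\ \kappa \in (\kappa_1, \kappa_2)\}$. Hence

$$\int \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\,\Pi_n(df) \ge \int_{K_\epsilon(f_0)} \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\,\Pi_n(df) \ge \int_{W \times (\kappa_1, \kappa_2)} \prod_{i=1}^n \frac{f(X_i; P, \kappa)}{f_0(X_i)}\,\pi_n(\kappa)\,\Pi_1(dP)\,d\kappa.$$

By the law of large numbers, for any $f \in K_\epsilon(f_0)$,

$$\frac{1}{n}\sum_{i=1}^n \log\{(f_0/f)(X_i)\} \to KL(f_0; f) < \epsilon$$

a.s. $F_0^\infty$ as $n \to \infty$. Therefore for any $P \in W$ and $\kappa \in (\kappa_1, \kappa_2)$,

$$\liminf_{n \to \infty} \exp(2n\epsilon) \prod_{i=1}^n \frac{f(X_i; P, \kappa)}{f_0(X_i)} = \liminf_{n \to \infty} \exp\Big[n\Big\{2\epsilon - \frac{1}{n}\sum_{i=1}^n \log\{f_0(X_i)/f(X_i; P, \kappa)\}\Big\}\Big] = \infty \quad \text{a.s. } F_0^\infty.$$

Also from Assumption A10, for $\kappa_0$ sufficiently large, $\liminf_{n \to \infty} \exp(n\epsilon)\,\pi_n(\kappa) = \infty$ and hence

$$\liminf_{n \to \infty} \exp(3n\epsilon) \prod_{i=1}^n \frac{f(X_i; P, \kappa)}{f_0(X_i)}\,\pi_n(\kappa) = \infty \quad \text{a.s. } F_0^\infty.$$

By the Fubini-Tonelli theorem, there exists an $\Omega_0 \subseteq \Omega$ with probability 1 such that for any $\omega \in \Omega_0$,

$$\liminf_{n \to \infty} \exp(3n\epsilon) \prod_{i=1}^n \frac{f(X_i(\omega); P, \kappa)}{f_0(X_i(\omega))}\,\pi_n(\kappa) = \infty$$

for all $(P, \kappa) \in W \times (\kappa_1, \kappa_2)$ outside of a $\Pi_1(dP) \otimes d\kappa$ measure 0 subset. By Assumption A9, $\Pi_1(W) > 0$. Therefore using Fatou's lemma, we conclude that

$$\liminf_{n \to \infty} \exp(3n\epsilon) \int \prod_{i=1}^n \frac{f(X_i)}{f_0(X_i)}\,\Pi_n(df) \ge \int_{W \times (\kappa_1, \kappa_2)} \liminf_{n \to \infty} \Big\{\exp(3n\epsilon) \prod_{i=1}^n \frac{f(X_i; P, \kappa)}{f_0(X_i)}\,\pi_n(\kappa)\Big\}\,\Pi_1(dP)\,d\kappa = \infty \quad \text{a.s. } F_0^\infty.$$

Since $\epsilon$ was arbitrary, the proof is completed.

6.3. Proof of Lemma 3.3.

Proof. Express $\tilde c(\kappa)$ as

$$C \int_{-1}^{1} \exp\{-\kappa(1 - t)\}(1 - t^2)^{d/2 - 1}\,dt$$

and it is clear that it is decreasing. This expression suggests that

$$\begin{aligned}
\tilde c(\kappa) &\ge C \int_0^1 \exp\{-\kappa(1 - t)\}(1 - t^2)^{d/2 - 1}\,dt \\
&\ge C \int_0^1 \exp\{-\kappa(1 - t^2)\}(1 - t^2)^{d/2 - 1}\,dt \\
&= C \int_0^1 \exp(-\kappa u)\,u^{d/2 - 1}(1 - u)^{-1/2}\,du \\
&\ge C \int_0^1 \exp(-\kappa u)\,u^{d/2 - 1}\,du \\
&= C\kappa^{-d/2} \int_0^\kappa \exp(-v)\,v^{d/2 - 1}\,dv \\
&\ge C\Big\{\int_0^1 \exp(-v)\,v^{d/2 - 1}\,dv\Big\}\kappa^{-d/2}
\end{aligned}$$

if $\kappa \ge 1$. This completes the proof.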
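The chain of bounds in the proof of Lemma 3.3 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $\tilde c(\kappa)$ to be the integral $\int_{-1}^{1} \exp\{-\kappa(1 - t)\}(1 - t^2)^{d/2 - 1}\,dt$ with the constant $C$ set to 1, and it uses the explicit factor $1/2$ that the substitution $u = 1 - t^2$ produces (our own bookkeeping; the proof absorbs it into $C$). The function names `midpoint`, `c_tilde` and `lower_bound` are hypothetical, and the quadrature is a plain midpoint rule, so it should only be trusted for $d \ge 2$, where the integrand is bounded.

```python
import math


def midpoint(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def c_tilde(kappa, d):
    """Integral of exp{-kappa(1-t)} (1-t^2)^(d/2-1) over [-1, 1].

    Assumption: the normalising constant C of the proof is set to 1.
    """
    return midpoint(
        lambda t: math.exp(-kappa * (1.0 - t)) * (1.0 - t * t) ** (d / 2 - 1),
        -1.0,
        1.0,
    )


def lower_bound(kappa, d):
    """Last expression of the chain: (1/2) * int_0^1 e^{-v} v^(d/2-1) dv * kappa^(-d/2)."""
    gamma_part = midpoint(lambda v: math.exp(-v) * v ** (d / 2 - 1), 0.0, 1.0)
    return 0.5 * gamma_part * kappa ** (-d / 2)


if __name__ == "__main__":
    for d in (2, 3, 5):
        for kappa in (1.0, 5.0, 25.0):
            # Each row: dimension, kappa, numerical c~(kappa), lower bound for kappa >= 1.
            print(d, kappa, c_tilde(kappa, d), lower_bound(kappa, d))
```

For $d = 2$ the integral has the closed form $(1 - e^{-2\kappa})/\kappa$, which gives a quick way to sanity-check the quadrature before trusting the other dimensions.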

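6.4. Proof of Theorem 3.1.

Proof. Denote by $M$ the unit sphere $S^d$ and by $\rho$ the chord distance on it. Express the vMF kernel as

$$K(m; \mu, \kappa) = c^{-1}(\kappa) \exp\big[\kappa\{1 - \rho^2(m, \mu)/2\}\big] \quad (m, \mu \in M;\ \kappa \in [0, \infty)).$$

Since $\rho$ is continuous on the product space $M \times M$ and $c$ is continuous and nonvanishing on $[0, \infty)$, $K$ is continuous on $M \times M \times [0, \infty)$ and assumption A1 follows. For a given continuous function $\phi$ on $M$, $m \in M$, $\kappa \ge 0$, define

$$I(m, \kappa) = \phi(m) - \int_M K(m; \mu, \kappa)\phi(\mu)\,V(d\mu) = \int_M K(m; \mu, \kappa)\{\phi(m) - \phi(\mu)\}\,V(d\mu).$$

Then showing assumption A2 for $f_0 = \phi$ is equivalent to showing

$$\lim_{\kappa \to \infty} \Big(\sup_{m \in M} |I(m, \kappa)|\Big) = 0.$$

To simplify $I(m, \kappa)$, make a change of coordinates $\mu \mapsto \tilde\mu = U(m)^T\mu$, $\tilde\mu \mapsto \theta \in \Theta_d \equiv (0, \pi)^{d-1} \times (0, 2\pi)$, where $U(m)$ is an orthogonal matrix with first column equal to $m$ and $\theta = (\theta_1, \dots, \theta_d)^T$ are the spherical coordinates of $\tilde\mu \equiv \tilde\mu(\theta)$, which are given by

$$\tilde\mu_j = \cos\theta_j \prod_{h < j} \sin\theta_h, \quad j = 1, \dots, d, \qquad \tilde\mu_{d+1} = \prod_{j=1}^{d} \sin\theta_j.$$

Using these coordinates, the volume form can be written as

$$V(d\mu) = V(d\tilde\mu) = \sin^{d-1}(\theta_1)\sin^{d-2}(\theta_2)\cdots\sin(\theta_{d-1})\,d\theta_1 \cdots d\theta_d$$

and hence $I(m, \kappa)$ equals

$$\begin{aligned}
& c^{-1}(\kappa) \int_{\Theta_d} \exp\{\kappa\cos(\theta_1)\}\{\phi(m) - \phi(U(m)\tilde\mu)\}\sin^{d-1}(\theta_1)\cdots\sin(\theta_{d-1})\,d\theta_1 \cdots d\theta_d \\
&= c^{-1}(\kappa) \int_{\Theta_{d-1} \times (-1, 1)} \exp(\kappa t)\{\phi(m) - \phi(U(m)\tilde\mu)\}(1 - t^2)^{d/2 - 1}\sin^{d-2}(\theta_2)\cdots\sin(\theta_{d-1})\,d\theta_2 \cdots d\theta_d\,dt
\end{aligned} \tag{6.6}$$

where $t = \cos(\theta_1)$, $\tilde\mu = \tilde\mu(\theta(t))$ and $\theta(t) = (\arccos(t), \theta_2, \dots, \theta_d)^T$. In the integrand in (6.6), the distance between $m$ and $U(m)\tilde\mu$ is $\sqrt{2(1 - t)}$. Substitute $t = 1 - \kappa^{-1}s$ in the integral with $s \in (0, 2\kappa)$. Define

$$\Phi(s, \kappa) = \sup\big\{|\phi(m) - \phi(\tilde m)| : m, \tilde m \in M,\ \rho(m, \tilde m) \le \sqrt{2\kappa^{-1}s}\big\}.$$

Then $|\phi(m) - \phi(U(m)\tilde\mu)| \le \Phi(s, \kappa)$.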

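Since $\phi$ is uniformly continuous on $(M, \rho)$, $\Phi$ is bounded on $(\mathbb{R}^+)^2$ and $\lim_{\kappa \to \infty} \Phi(s, \kappa) = 0$. Hence from (6.6), we deduce that

$$\begin{aligned}
\sup_{m \in M} |I(m, \kappa)| &\le c^{-1}(\kappa)\,\kappa^{-1} \int_{\Theta_{d-1} \times (0, 2\kappa)} \exp(\kappa - s)\,\Phi(s, \kappa)\big(\kappa^{-1}s(2 - \kappa^{-1}s)\big)^{d/2 - 1}\sin^{d-2}(\theta_2)\cdots\sin(\theta_{d-1})\,d\theta_2 \cdots d\theta_d\,ds \\
&\le C\kappa^{-d/2}\,\tilde c^{-1}(\kappa) \int_0^\infty \Phi(s, \kappa)\,e^{-s}s^{d/2 - 1}\,ds.
\end{aligned} \tag{6.7}$$

From Lemma 3.3, it follows that

$$\limsup_{\kappa \to \infty} \kappa^{-d/2}\,\tilde c^{-1}(\kappa) < \infty.$$

This in turn, using the Lebesgue Dominated Convergence Theorem, implies that the expression in (6.7) converges to 0 as $\kappa \to \infty$. This verifies assumption A2 and completes the proof.

6.5. Proof of Theorem 3.2. In the proof, $B_d(r)$ denotes the ball of radius $r$ around 0 in $\mathbb{R}^d$: $B_d(r) = \{x \in \mathbb{R}^d : \|x\|_2 \le r\}$, and $B_d$ refers to $B_d(1)$.

Proof. It is clear from (3.1) and (3.2) that the vMF kernel $K$ is continuously differentiable on $\mathbb{R}^{d+1} \times \mathbb{R}^{d+1} \times [0, \infty)$. Hence

$$\sup_{m \in S^d,\ \kappa \in [0, K]} |K(m; \mu, \kappa) - K(m; \nu, \kappa)| \le \sup_{m \in S^d,\ x \in B_{d+1},\ \kappa \in [0, K]} \Big\|\frac{\partial}{\partial x}K(m; x, \kappa)\Big\|_2\,\|\mu - \nu\|_2.$$

Since

$$\frac{\partial}{\partial x}K(m; x, \kappa) = \kappa\,\tilde c^{-1}(\kappa)\exp\{-\kappa(1 - m^T x)\}\,m,$$

its norm is bounded by $\kappa\,\tilde c^{-1}(\kappa)$. Lemma 3.3 implies that this in turn is bounded by

$$K\,\tilde c^{-1}(K) \le CK^{d/2 + 1}$$

for $\kappa \le K$ and $K \ge 1$. This proves assumption A5 with $a_1 = d/2 + 1$. To verify A6, given $\kappa_1, \kappa_2 \le K$, use the inequality

$$\sup_{m, \mu \in S^d} |K(m; \mu, \kappa_1) - K(m; \mu, \kappa_2)| \le \sup_{m, \mu \in S^d,\ \kappa \le K} \Big|\frac{\partial}{\partial\kappa}K(m; \mu, \kappa)\Big|\,|\kappa_1 - \kappa_2|.$$

By direct computations, one can show that

$$\frac{\partial}{\partial\kappa}K(m; \mu, \kappa) = -\frac{\partial}{\partial\kappa}\tilde c(\kappa)\,\tilde c^{-2}(\kappa)\exp\{-\kappa(1 - m^T\mu)\} - \tilde c^{-1}(\kappa)\exp\{-\kappa(1 - m^T\mu)\}(1 - m^T\mu),$$

$$\frac{\partial}{\partial\kappa}\tilde c(\kappa) = -C \int_{-1}^{1} \exp\{-\kappa(1 - t)\}(1 - t)(1 - t^2)^{d/2 - 1}\,dt, \qquad \Big|\frac{\partial}{\partial\kappa}\tilde c(\kappa)\Big| \le C\,\tilde c(\kappa).$$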

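Therefore, using Lemma 3.3,

$$\Big|\frac{\partial}{\partial\kappa}K(m; \mu, \kappa)\Big| \le C\,\tilde c^{-1}(\kappa) \le C\,\tilde c^{-1}(K) \le CK^{d/2}$$

for any $\kappa \le K$ and $K \ge 1$. Hence A6 is verified with $a_2 = d/2$. Finally, to verify A7, note that $S^d \subset B_{d+1} \subset [-1, 1]^{d+1}$, which can be covered by finitely many cubes of side length $\epsilon/(d + 1)$. Each such cube has $L^2$ diameter at most $\epsilon$. Hence their intersections with $S^d$ provide a finite $\epsilon$-cover for this manifold. If $\epsilon < 1$, such a cube intersects $S^d$ only if it lies entirely in $B_{d+1}(1 + \epsilon) \cap B_{d+1}(1 - \epsilon)^c$. The number of such cubes, and hence the $\epsilon$-cover size, can be bounded by

$$C\epsilon^{-(d+1)}\{(1 + \epsilon)^{d+1} - (1 - \epsilon)^{d+1}\} \le C\epsilon^{-d}$$

for some $C > 0$ not depending on $\epsilon$. This verifies A7 for appropriate positive constants $A_3$, $A_4$ and $a_3 = d$ and completes the proof.

6.6. Proof of Theorem 4.1.

Proof. Express the complex Watson kernel as

$$K(m; \mu, \kappa) = \tilde c^{-1}(\kappa)\exp\Big(-\frac{\kappa}{2}\,d_E^2(m, \mu)\Big).$$

Given $\kappa \ge 0$, define

$$\phi(t) = \exp\Big(-\frac{\kappa}{2}\,t^2\Big), \quad t \in [0, \sqrt{2}].$$

Then $|\phi'(t)| \le \sqrt{2}\,\kappa$, so that

$$|\phi(t) - \phi(s)| \le \sqrt{2}\,\kappa\,|s - t|, \quad s, t \in [0, \sqrt{2}],$$

which implies that

$$|K(m; \mu, \kappa) - K(m; \nu, \kappa)| \le \tilde c^{-1}(\kappa)\,\sqrt{2}\,\kappa\,|d_E(m, \mu) - d_E(m, \nu)| \le \sqrt{2}\,\kappa\,\tilde c^{-1}(\kappa)\,d_E(\mu, \nu). \tag{6.8}$$

For $\kappa \le K$, from Lemma 4.2, it follows that

$$\kappa\,\tilde c^{-1}(\kappa) \le K\,\tilde c^{-1}(K) = \pi^{2-k}K^{k-1}c_1^{-1}(K) \le \pi^{2-k}K^{k-1}c_1^{-1}(1)$$

provided $K \ge 1$. Hence for any $K \ge 1$,

$$\sup_{\kappa \in [0, K]} \kappa\,\tilde c^{-1}(\kappa) \le CK^{k-1}$$

and from inequality (6.8), $a_1 = k - 1$ follows.

By direct computation, one can show that

$$\begin{aligned}
\frac{\partial}{\partial\kappa}K(m; \mu, \kappa) = {} & \pi^{k-2}\exp\Big\{-\frac{1}{2}\kappa\,d_E^2(m, \mu) - \kappa\Big\}\,\tilde c^{-2}(\kappa)\,\kappa^{2-k} \\
& \times \Big[\sum_{r = k-1}^{\infty} \frac{\kappa^{r-1}}{r!}\Big\{k - 2 - \frac{r}{2}\,d_E^2(m, \mu)\Big\}\Big].
\end{aligned} \tag{6.9}$$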

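Denote by $S$ the sum in the second line of (6.9) and by $T_r$ the factor in braces in its $r$th term, $r \ge k - 1$. Since $d_E^2(m, \mu) \le 2$, it can be shown that

$$|T_r| \le \begin{cases} k - 2 & \text{if } k - 1 \le r \le 2k - 4, \\ r - k + 2 & \text{if } 2k - 3 \le r, \end{cases}$$

so that

$$\begin{aligned}
|S| &\le (k - 2)\sum_{r = k-1}^{2k-4} \frac{\kappa^{r-1}}{r!} + \sum_{r = 2k-3}^{\infty} \frac{\kappa^{r-1}}{r!}(r - k + 2) \\
&= (k - 2)\,\kappa^{k-2}\sum_{r=0}^{k-3} \frac{\kappa^{r}}{(r + k - 1)!} + \kappa^{2k-4}\sum_{r=0}^{\infty} \frac{\kappa^{r}(r + k - 1)}{(r + 2k - 3)!} \\
&\le C\kappa^{k-2}e^{\kappa} + \kappa^{2k-4}e^{\kappa}.
\end{aligned}$$

Plug the above inequality in (6.9) to get

$$\Big|\frac{\partial}{\partial\kappa}K(m; \mu, \kappa)\Big| \le C\,\tilde c^{-2}(\kappa)\,\kappa^{2-k}\exp\Big\{-\frac{1}{2}\kappa\,d_E^2(m, \mu)\Big\}(C\kappa^{k-2} + \kappa^{2k-4}) \le C\,\tilde c^{-2}(\kappa)(C + \kappa^{k-2}). \tag{6.10}$$

For $\kappa \le K$ and $K \ge 1$, using Lemma 4.2, we bound the expression in (6.10) by

$$C\,\tilde c^{-2}(K)(C + K^{k-2}) = CK^{2k-6}c_1^{-2}(K)(C + K^{k-2}) \le CK^{2k-6}c_1^{-2}(1)(C + K^{k-2}) \le CK^{3k-8} \tag{6.11}$$

for $K$ sufficiently large. Since $K$ is continuously differentiable in $\kappa$, from (6.11) it follows that there exists $K_1 > 0$ such that for all $K \ge K_1$, $\kappa_1, \kappa_2 \le K$,

$$\sup_{m, \mu \in \Sigma_2^k} |K(m; \mu, \kappa_1) - K(m; \mu, \kappa_2)| \le \sup_{m, \mu \in \Sigma_2^k,\ \kappa \in [0, K]} \Big|\frac{\partial}{\partial\kappa}K(m; \mu, \kappa)\Big|\,|\kappa_1 - \kappa_2| \le CK^{3k-8}\,|\kappa_1 - \kappa_2|.$$

This proves Assumption A6 with $a_2 = 3k - 8$ and completes the proof.

6.7. Proof of Theorem 4.3. In the proof, $C_i$, $i = 1, 2, \dots$ denote positive constants possibly depending on $k$.

Proof. The preshape sphere $CS^{k-2}$, as a real manifold, can be identified with the real unit sphere $S^{2k-3}$. Endow it with the chord distance induced by the $L^2$-norm

$$\|u\|_2 = \Big(\sum_{i=1}^{k-1} |u_i|^2\Big)^{1/2} \quad (u = (u_1, \dots, u_{k-1})^T).$$

Then from Theorem 3.2, it follows that given any $\delta > 0$, $CS^{k-2}$ can be covered by finitely many subsets of diameter less than or equal to $\delta$, the number of such subsets being bounded by $C_1\delta^{-(2k-3)} + C_2$. The extrinsic distance $d_E$ on $\Sigma_2^k$ can be bounded by the chord distance on $CS^{k-2}$ as follows. For $u, v \in CS^{k-2}$,

$$\|u - v\|_2^2 = 2 - 2\,\mathrm{Re}(u^* v) \ge 2 - 2|u^* v| = 2(1 - |u^* v|) \ge (1 + |u^* v|)(1 - |u^* v|) = \frac{1}{2}\,d_E^2([u], [v]).$$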
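The chain of inequalities above can be spot-checked on random preshapes. The sketch below is an illustration, not part of the proof: it assumes $d_E^2([u], [v]) = 2(1 - |u^* v|^2)$, which is what the last equality in the chain implies, and it represents a preshape as a unit vector in $\mathbb{C}^{k-1}$. The helper names `random_preshape`, `chord_sq`, `extrinsic_sq` and `worst_ratio` are hypothetical.

```python
import cmath
import math
import random


def random_preshape(k, rng):
    """Draw a unit vector in C^(k-1), i.e. a point on the preshape sphere."""
    z = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(k - 1)]
    norm = math.sqrt(sum(abs(zi) ** 2 for zi in z))
    return [zi / norm for zi in z]


def chord_sq(u, v):
    """Squared chord distance ||u - v||_2^2 on the preshape sphere."""
    return sum(abs(a - b) ** 2 for a, b in zip(u, v))


def extrinsic_sq(u, v):
    """Assumed form d_E^2([u],[v]) = 2(1 - |u* v|^2), read off the chain above."""
    inner = sum(a.conjugate() * b for a, b in zip(u, v))
    return 2.0 * (1.0 - abs(inner) ** 2)


def worst_ratio(k=5, trials=1000, seed=0):
    """Largest observed value of d_E^2 / ||u - v||_2^2; the chain says it is at most 2."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u, v = random_preshape(k, rng), random_preshape(k, rng)
        worst = max(worst, extrinsic_sq(u, v) / chord_sq(u, v))
    return worst


if __name__ == "__main__":
    print("largest d_E^2 / chord^2 over random pairs:", worst_ratio())
    # Rotating a preshape changes the chord distance but not the shape distance.
    u = random_preshape(5, random.Random(1))
    rotated = [cmath.exp(0.7j) * a for a in u]
    print(chord_sq(u, rotated), extrinsic_sq(u, rotated))
```

The second printout shows why the bound only goes one way: a rotated copy of $u$ is at positive chord distance from $u$ but has the same shape, so $d_E$ is zero up to rounding.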

Hence $d_E([u], [v]) \le \sqrt{2}\,\|u - v\|_2$, so that given any $\epsilon > 0$, the shape image of a $\delta$-cover for $CS^{k-2}$ with $\delta = \epsilon/\sqrt{2}$ provides an $\epsilon$-cover for $\Sigma_2^k$. Hence the $\epsilon$-covering size for $\Sigma_2^k$ can be bounded by $C_1\epsilon^{-(2k-3)} + C_2$. This completes the proof.

Department of Statistical Science, Duke University, Durham, NC, USA