TL;DR: With a fixed number of observations per group, the joint maximum likelihood estimator of the structural parameters is in general not consistent as the number of groups increases; the paper instead maximizes a conditional likelihood function, conditioning on sufficient statistics for the incidental parameters.
Abstract: In data with a group structure, incidental parameters are included to control for missing variables. Applications include longitudinal data and sibling data. In general, the joint maximum likelihood estimator of the structural parameters is not consistent as the number of groups increases, with a fixed number of observations per group. Instead a conditional likelihood function is maximized, conditional on sufficient statistics for the incidental parameters. In the logit case, a standard conditional logit program can be used. Another solution is a random effects model, in which the distribution of the incidental parameters may depend upon the exogenous variables.
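As a minimal sketch of the conditional-likelihood idea in the logit case, the Python example below assumes two observations per group and a single regressor. With two periods, the sum of a group's outcomes is a sufficient statistic for its incidental parameter, and conditioning on that sum being 1 leaves a logit in the within-group difference of the regressor, with no group effect left to estimate. The function names, the simulated data, and the Newton solver are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate_panel(n_groups, beta, rng):
    """Simulate a two-period logit panel with group effects alpha_i.

    The regressor is correlated with alpha_i, so simply ignoring the
    group effects would bias the estimate of beta.
    """
    alpha = rng.normal(size=n_groups)
    x = alpha[:, None] + rng.normal(size=(n_groups, 2))
    prob = 1.0 / (1.0 + np.exp(-(alpha[:, None] + beta * x)))
    y = (rng.random((n_groups, 2)) < prob).astype(int)
    return x, y


def conditional_logit_two_periods(x, y, max_iter=50, tol=1e-10):
    """Conditional ML estimate of beta for T = 2 (hypothetical helper).

    Only groups with y_i1 + y_i2 == 1 are informative. For those groups
    P(y_i2 = 1 | y_i1 + y_i2 = 1) = sigmoid(beta * (x_i2 - x_i1)),
    which does not involve alpha_i.
    """
    switchers = y.sum(axis=1) == 1
    dx = x[switchers, 1] - x[switchers, 0]
    d = y[switchers, 1].astype(float)
    beta = 0.0
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-beta * dx))
        score = np.sum((d - p) * dx)                # first derivative
        hessian = -np.sum(p * (1.0 - p) * dx ** 2)  # second derivative
        step = score / hessian
        beta -= step                                # Newton update
        if abs(step) < tol:
            break
    return beta


rng = np.random.default_rng(0)
x, y = simulate_panel(n_groups=20_000, beta=1.0, rng=rng)
print("conditional ML estimate of beta:", conditional_logit_two_periods(x, y))
```

The estimate uses only the groups whose outcome changes between the two periods; groups with outcome sums of 0 or 2 contribute nothing to the conditional likelihood.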
TL;DR: Marginal maximum likelihood estimation of item parameters becomes practical with computing procedures based on an EM algorithm, and the EM procedure is shown to apply to general item-response models lacking simple sufficient statistics for ability, including models with more than one latent dimension.
Abstract: Maximum likelihood estimation of item parameters in the marginal distribution, integrating over the distribution of ability, becomes practical when computing procedures based on an EM algorithm are used. By characterizing the ability distribution empirically, arbitrary assumptions about its form are avoided. The EM procedure is shown to apply to general item-response models lacking simple sufficient statistics for ability. This includes models with more than one latent dimension.
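The following Python sketch illustrates the general shape of an EM iteration for marginal maximum likelihood in a deliberately simple setting. It assumes a one-parameter logistic (Rasch-type) item model, a single latent dimension, and a fixed standard-normal ability distribution approximated on a grid of nodes. It does not estimate the ability distribution empirically as the abstract describes, and all names (`simulate_responses`, `em_rasch_difficulties`, the grid size) are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_responses(n_persons, difficulties, rng):
    """Simulate 0/1 item responses with ability theta ~ N(0, 1)."""
    theta = rng.normal(size=n_persons)
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulties[None, :])))
    return (rng.random(p.shape) < p).astype(float)


def em_rasch_difficulties(responses, n_nodes=21, n_iter=200):
    """EM for item difficulties, integrating ability out over a fixed grid."""
    n_persons, n_items = responses.shape
    nodes = np.linspace(-4.0, 4.0, n_nodes)
    weights = np.exp(-0.5 * nodes ** 2)
    weights /= weights.sum()                 # discretized N(0, 1) weights
    b = np.zeros(n_items)
    for _ in range(n_iter):
        # p[k, j] = P(correct | ability = nodes[k], item j)
        p = 1.0 / (1.0 + np.exp(-(nodes[:, None] - b[None, :])))
        # E-step: posterior over nodes for each person's response pattern
        loglik = responses @ np.log(p).T + (1.0 - responses) @ np.log(1.0 - p).T
        post = np.exp(loglik) * weights
        post /= post.sum(axis=1, keepdims=True)
        n_k = post.sum(axis=0)               # expected persons at each node
        r_kj = post.T @ responses            # expected correct answers
        # M-step: one Newton step per item on the expected log-likelihood
        grad = (n_k[:, None] * p - r_kj).sum(axis=0)
        hess = -(n_k[:, None] * p * (1.0 - p)).sum(axis=0)
        b = b - grad / hess
    return b


rng = np.random.default_rng(1)
true_b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
data = simulate_responses(3000, true_b, rng)
print("estimated difficulties:", np.round(em_rasch_difficulties(data), 2))
```

Taking a single Newton step in the M-step makes this a generalized EM iteration rather than a full maximization; that choice keeps the sketch short.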
TL;DR: It is shown that if α is a sufficient statistic for a procedure equivalent to β (α ≻ β), then α is more informative than β (α ⊃ β), with the converse proved for dichotomies; among other properties, if α ≻ β and γ is independent of both, then (α, γ) ≻ (β, γ).
Abstract: Bohnenblust, Shapley, and Sherman [2] have introduced a method of comparing two sampling procedures or experiments; essentially their concept is that one experiment α is more informative than a second experiment β, α ⊃ β, if, for every possible risk function, any risk attainable with β is also attainable with α. If α is a sufficient statistic for a procedure equivalent to β, α ≻ β, it is shown that α ⊃ β. In the case of dichotomies, the converse is proved. Whether ≻ and ⊃ are equivalent in general is not known. Various properties of ≻ and ⊃ are obtained, such as the following: if α ≻ β and γ is independent of both, then the combination (α, γ) ≻ (β, γ). An application to a problem in 2 × 2 tables is discussed.
TL;DR: The table of contents of this book covers probability models, sufficient statistics, decision theory, hypothesis testing, estimation, equivariance, large sample theory, hierarchical models, and sequential analysis, with appendices on measure and integration theory, probability theory, and distributions.
Abstract: Content.- 1: Probability Models.- 1.1 Background.- 1.1.1 General Concepts.- 1.1.2 Classical Statistics.- 1.1.3 Bayesian Statistics.- 1.2 Exchangeability.- 1.2.1 Distributional Symmetry.- 1.2.2 Frequency and Exchangeability.- 1.3 Parametric Models.- 1.3.1 Prior, Posterior, and Predictive Distributions.- 1.3.2 Improper Prior Distributions.- 1.3.3 Choosing Probability Distributions.- 1.4 DeFinetti's Representation Theorem.- 1.4.1 Understanding the Theorems.- 1.4.2 The Mathematical Statements.- 1.4.3 Some Examples.- 1.5 Proofs of DeFinetti's Theorem and Related Results*.- 1.5.1 Strong Law of Large Numbers.- 1.5.2 The Bernoulli Case.- 1.5.3 The General Finite Case*.- 1.5.4 The General Infinite Case.- 1.5.5 Formal Introduction to Parametric Models*.- 1.6 Infinite-Dimensional Parameters*.- 1.6.1 Dirichlet Processes.- 1.6.2 Tailfree Processes+.- 1.7 Problems.- 2: Sufficient Statistics.- 2.1 Definitions.- 2.1.1 Notational Overview.- 2.1.2 Sufficiency.- 2.1.3 Minimal and Complete Sufficiency.- 2.1.4 Ancillarity.- 2.2 Exponential Families of Distributions.- 2.2.1 Basic Properties.- 2.2.2 Smoothness Properties.- 2.2.3 A Characterization Theorem*.- 2.3 Information.- 2.3.1 Fisher Information.- 2.3.2 Kullback-Leibler Information.- 2.3.3 Conditional Information*.- 2.3.4 Jeffreys' Prior*.- 2.4 Extremal Families*.- 2.4.1 The Main Results.- 2.4.2 Examples.- 2.4.3 Proofs+.- 2.5 Problems.- 3: Decision Theory.- 3.1 Decision Problems.- 3.1.1 Framework.- 3.1.2 Elements of Bayesian Decision Theory.- 3.1.3 Elements of Classical Decision Theory.- 3.1.4 Summary.- 3.2 Classical Decision Theory.- 3.2.1 The Role of Sufficient Statistics.- 3.2.2 Admissibility.- 3.2.3 James-Stein Estimators.- 3.2.4 Minimax Rules.- 3.2.5 Complete Classes.- 3.3 Axiomatic Derivation of Decision Theory*.- 3.3.1 Definitions and Axioms.- 3.3.2 Examples.- 3.3.3 The Main Theorems.- 3.3.4 Relation to Decision Theory.- 3.3.5 Proofs of the Main Theorems*.- 3.3.6 State-Dependent Utility*.- 3.4 Problems.- 4: 
Hypothesis Testing.- 4.1 Introduction.- 4.1.1 A Special Kind of Decision Problem.- 4.1.2 Pure Significance Tests.- 4.2 Bayesian Solutions.- 4.2.1 Testing in General.- 4.2.2 Bayes Factors.- 4.3 Most Powerful Tests.- 4.3.1 Simple Hypotheses and Alternatives.- 4.3.2 Simple Hypotheses, Composite Alternatives.- 4.3.3 One-Sided Tests.- 4.3.4 Two-Sided Hypotheses.- 4.4 Unbiased Tests.- 4.4.1 General Results.- 4.4.2 Interval Hypotheses.- 4.4.3 Point Hypotheses.- 4.5 Nuisance Parameters.- 4.5.1 Neyman Structure.- 4.5.2 Tests about Natural Parameters.- 4.5.3 Linear Combinations of Natural Parameters.- 4.5.4 Other Two-Sided Cases*.- 4.5.5 Likelihood Ratio Tests.- 4.5.6 The Standard F-Test as a Bayes Rule.- 4.6 P-Values.- 4.6.1 Definitions and Examples.- 4.6.2 P-Values and Bayes Factors.- 4.7 Problems.- 5: Estimation.- 5.1 Point Estimation.- 5.1.1 Minimum Variance Unbiased Estimation.- 5.1.2 Lower Bounds on the Variance of Unbiased Estimators.- 5.1.3 Maximum Likelihood Estimation.- 5.1.4 Bayesian Estimation.- 5.1.5 Robust Estimation*.- 5.2 Set Estimation.- 5.2.1 Confidence Sets.- 5.2.2 Prediction Sets*.- 5.2.3 Tolerance Sets*.- 5.2.4 Bayesian Set Estimation.- 5.2.5 Decision Theoretic Set Estimation.- 5.3 The Bootstrap*.- 5.3.1 The General Concept.- 5.3.2 Standard Deviations and Bias.- 5.3.3 Bootstrap Confidence Intervals.- 5.4 Problems.- 6: Equivariance*.- 6.1 Common Examples.- 6.1.1 Location Problems.- 6.1.2 Scale Problems.- 6.2 Equivariant Decision Theory.- 6.2.1 Groups of Transformations.- 6.2.2 Equivariance and Changes of Units.- 6.2.3 Minimum Risk Equivariant Decisions.- 6.3 Testing and Confidence Intervals*.- 6.3.1 P-Values in Invariant Problems.- 6.3.2 Equivariant Confidence Sets.- 6.3.3 Invariant Tests*.- 6.4 Problems.- 7: Large Sample Theory.- 7.1 Convergence Concepts.- 7.1.1 Deterministic Convergence.- 7.1.2 Stochastic Convergence.- 7.1.3 The Delta Method.- 7.2 Sample Quantiles.- 7.2.1 A Single Quantile.- 7.2.2 Several Quantiles.- 7.2.3 Linear Combinations of 
Quantiles*.- 7.3 Large Sample Estimation.- 7.3.1 Some Principles of Large Sample Estimation.- 7.3.2 Maximum Likelihood Estimators.- 7.3.3 MLEs in Exponential Families.- 7.3.4 Examples of Inconsistent MLEs.- 7.3.5 Asymptotic Normality of MLEs.- 7.3.6 Asymptotic Properties of M-Estimators.- 7.4 Large Sample Properties of Posterior Distributions.- 7.4.1 Consistency of Posterior Distributions+.- 7.4.2 Asymptotic Normality of Posterior Distributions.- 7.4.3 Laplace Approximations to Posterior Distributions*.- 7.4.4 Asymptotic Agreement of Predictive Distributions+.- 7.5 Large Sample Tests.- 7.5.1 Likelihood Ratio Tests.- 7.5.2 Chi-Squared Goodness of Fit Tests.- 7.6 Problems.- 8: Hierarchical Models.- 8.1 Introduction.- 8.1.1 General Hierarchical Models.- 8.1.2 Partial Exchangeability*.- 8.1.3 Examples of the Representation Theorem*.- 8.2 Normal Linear Models.- 8.2.1 One-Way ANOVA.- 8.2.2 Two-Way Mixed Model ANOVA*.- 8.2.3 Hypothesis Testing.- 8.3 Nonnormal Models*.- 8.3.1 Poisson Process Data.- 8.3.2 Bernoulli Process Data.- 8.4 Empirical Bayes Analysis*.- 8.4.1 Naive Empirical Bayes.- 8.4.2 Adjusted Empirical Bayes.- 8.4.3 Unequal Variance Case.- 8.5 Successive Substitution Sampling.- 8.5.1 The General Algorithm.- 8.5.2 Normal Hierarchical Models.- 8.5.3 Nonnormal Models.- 8.6 Mixtures of Models.- 8.6.1 General Mixture Models.- 8.6.2 Outliers.- 8.6.3 Bayesian Robustness.- 8.7 Problems.- 9: Sequential Analysis.- 9.1 Sequential Decision Problems.- 9.2 The Sequential Probability Ratio Test.- 9.3 Interval Estimation*.- 9.4 The Relevance of Stopping Rules.- 9.5 Problems.- Appendix A: Measure and Integration Theory.- A.1 Overview.- A.1.1 Definitions.- A.1.2 Measurable Functions.- A.1.3 Integration.- A.1.4 Absolute Continuity.- A.2 Measures.- A.3 Measurable Functions.- A.4 Integration.- A.5 Product Spaces.- A.6 Absolute Continuity.- A.7 Problems.- Appendix B: Probability Theory.- B.1 Overview.- B.1.1 Mathematical Probability.- B.1.2 Conditioning.- B.1.3 Limit Theorems.- B.2 
Mathematical Probability.- B.2.1 Random Quantities and Distributions.- B.2.2 Some Useful Inequalities.- B.3 Conditioning.- B.3.1 Conditional Expectations.- B.3.2 Borel Spaces*.- B.3.3 Conditional Densities.- B.3.4 Conditional Independence.- B.3.5 The Law of Total Probability.- B.4 Limit Theorems.- B.4.1 Convergence in Distribution and in Probability.- B.4.2 Characteristic Functions.- B.5 Stochastic Processes.- B.5.1 Introduction.- B.5.3 Markov Chains*.- B.5.4 General Stochastic Processes.- B.6 Subjective Probability.- B.7 Simulation*.- B.8 Problems.- Appendix C: Mathematical Theorems Not Proven Here.- C.1 Real Analysis.- C.2 Complex Analysis.- C.3 Functional Analysis.- Appendix D: Summary of Distributions.- D.1 Univariate Continuous Distributions.- D.2 Univariate Discrete Distributions.- D.3 Multivariate Distributions.- References.- Notation and Abbreviation Index.- Name Index.
TL;DR: In this article, a maximum-likelihood estimator based on the conditional distribution given minimal sufficient statistics for the incidental parameters is proposed, and it is proved that conditional maximum likelihood estimates in the regular case are consistent and asymptotically normally distributed.
Abstract: The problem of obtaining consistent estimates for structural parameters in the presence of infinitely many incidental parameters was first discussed by Neyman and Scott (1948). In this paper a maximum-likelihood method based on the conditional distribution given minimal sufficient statistics for the incidental parameters is suggested. It is proved that conditional maximum-likelihood estimates in the regular case are consistent and asymptotically normally distributed with a simple asymptotic variance. The efficiency problem of this new estimator is discussed in particular with respect to some situations with ancillary information.
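A small simulation can make the incidental-parameter problem concrete. The Python sketch below uses the textbook normal-means setting commonly associated with Neyman and Scott: each group has its own mean (the incidental parameter) and all groups share one variance (the structural parameter). The group mean is sufficient for the group's incidental parameter, so the conditional likelihood depends only on within-group deviations. The function names and simulation settings are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np


def joint_mle_variance(y):
    """Joint ML estimate of the common variance.

    Each group mean is estimated by its sample mean, and the squared
    residuals are averaged over all n * m observations.
    """
    resid = y - y.mean(axis=1, keepdims=True)
    return np.mean(resid ** 2)


def conditional_mle_variance(y):
    """Conditional ML estimate of the common variance.

    Conditioning on each group's sample mean removes the group means
    from the likelihood; maximizing what remains divides the
    within-group sum of squares by n * (m - 1).
    """
    n_groups, m = y.shape
    resid = y - y.mean(axis=1, keepdims=True)
    return np.sum(resid ** 2) / (n_groups * (m - 1))


rng = np.random.default_rng(2)
n_groups, m, sigma2 = 50_000, 2, 4.0
group_means = rng.normal(scale=10.0, size=n_groups)  # incidental parameters
y = group_means[:, None] + rng.normal(scale=np.sqrt(sigma2), size=(n_groups, m))

print("true variance:          ", sigma2)
print("joint ML estimate:      ", joint_mle_variance(y))
print("conditional ML estimate:", conditional_mle_variance(y))
```

With m observations per group the joint estimate converges to sigma2 * (m - 1) / m as the number of groups grows, so adding groups does not remove the bias, whereas the conditional estimate converges to sigma2.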