Log-linear model
A log-linear model is a statistical method for analyzing associations among two or more categorical variables by modeling the logarithms of expected cell counts in a multi-way contingency table as a linear function of parameters that capture main effects and interactions between the variables.[1] These models treat all variables symmetrically, without distinguishing between response and predictor variables, and are particularly suited for count data observed under multinomial or Poisson sampling schemes.[1] Log-linear models form a special case of generalized linear models (GLMs), employing a Poisson distribution for the response and a logarithmic link function to ensure predicted counts remain positive.[1] Parameter estimation typically uses maximum likelihood methods, often implemented via iterative algorithms such as iteratively reweighted least squares or iterative proportional fitting, with goodness-of-fit assessed through deviance statistics like the likelihood ratio chi-square (G²) or Pearson's chi-square (X²).[1] Hierarchical principles guide model specification, where higher-order interactions imply lower-order ones, enabling tests for conditional independence and the estimation of measures like odds ratios to quantify associations.[1]
The foundations of log-linear models trace back to early 20th-century work on contingency tables by Karl Pearson, who introduced the chi-square test in 1900, and George Udny Yule, though the modern log-linear framework emerged in the 1960s.[2] Key advancements include Bartlett's 1935 maximum likelihood estimation for three-way tables and the iterative proportional fitting algorithm by Deming and Stephan in 1940, but the pivotal unification came with Michael Birch's 1963 paper on maximum likelihood for multi-way tables, followed by Leo Goodman's extensions in 1963–1971 that popularized their use for testing interactions.[2] Shelby Haberman's 1974 work further clarified estimation conditions, solidifying log-linear models as a cornerstone of categorical data analysis.[2]
In practice, log-linear models are widely applied in fields such as the social sciences, epidemiology,[3] and market research[4] to explore complex dependencies in categorical data, including latent class analysis[3] and graphical models for higher-dimensional tables.[5] They offer flexibility for sparse data and model selection via criteria like the Bayesian information criterion (BIC), as highlighted in Raftery's contributions in the 1980s.[1] Modern software such as R's loglin function or SAS's PROC GENMOD facilitates their implementation, ensuring accessibility for rigorous inference on multivariate associations.[1]
Definition and Basic Concepts
General Definition
A log-linear model is a statistical method for categorical data analysis in which the logarithms of the expected cell counts in a multi-way contingency table are expressed as a linear function of parameters that capture main effects and interactions among the categorical variables. This structure makes the model additive on the logarithmic scale and multiplicative on the original count scale, accommodating scenarios in which the effects of the factors combine multiplicatively. Unlike models with designated response variables, log-linear models treat all categorical variables symmetrically, without distinguishing between response and predictors.[6]
Log-linear models originated in the 1960s and gained prominence in the 1970s for the analysis of categorical data, with foundational contributions from researchers such as Leo A. Goodman and Stephen E. Fienberg. Goodman's early 1970s papers introduced hierarchical formulations for multi-way data structures, Shelby Haberman advanced likelihood-based inference in 1973–1974, and Fienberg developed iterative estimation methods and addressed sampling challenges in model fitting. These innovations, facilitated by emerging computational capabilities, established log-linear models as a cornerstone for multivariate analysis.[2]
Unlike standard linear models, which posit additive effects suitable for unbounded responses, log-linear models assume multiplicativity to handle strictly positive outcomes, ensuring that predictions remain non-negative and reflect proportional changes. In log-linear models for contingency tables, the responses are cell counts, typically assumed to follow a Poisson distribution (or a multinomial distribution when margins are fixed), which belongs to the exponential family of distributions used in generalized linear models.[7][8]
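For example, for a two-way table cross-classifying variables A and B, the model of independence specifies \log(\mu_{ij}) = \mu + \lambda_i^A + \lambda_j^B, so that on the count scale each expected frequency \mu_{ij} is the product of a row term and a column term; adding the interaction term \lambda_{ij}^{AB} gives the saturated model, which reproduces the observed table exactly.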
Relation to Other Models
The log-linear model is a special case of the generalized linear model (GLM) framework, where the response variable follows a Poisson or multinomial distribution and a logarithmic link function is employed to model the expected cell counts in contingency tables.[9] This positioning allows log-linear models to handle non-normal responses, such as counts, through the exponential family of distributions, unifying them with other GLMs like logistic regression under a common estimation paradigm via maximum likelihood.
In comparison to ordinary linear regression, which assumes additive effects and constant variance, the log-linear model is particularly suited for positive, skewed data like counts or rates, as the log link transforms multiplicative relationships into additive ones on the log scale, yielding interpretable percentage changes in expectations.[10] It also addresses heteroscedasticity inherent in count data, where variance increases with the mean, by incorporating Poisson variance-mean equality, which linear models often violate without transformations.
Log-linear models extend analysis of variance (ANOVA) and regression techniques for categorical predictors to the realm of count data, treating all variables symmetrically in multi-way contingency tables rather than distinguishing response from predictors.[11] By modeling the logarithm of expected frequencies as a linear combination of main effects and interactions—analogous to ANOVA decompositions on the log scale—they enable hierarchical testing of associations among categorical variables, surpassing traditional chi-square tests in flexibility for complex structures.
A precursor to modern log-linear estimation, the iterative proportional fitting (IPF) procedure, developed in the mid-20th century for adjusting contingency tables to marginal constraints, provides an efficient algorithm for obtaining maximum likelihood estimates under log-linear specifications, especially for hierarchical models. This method, formalized in the context of log-linear models during the 1970s, remains computationally valuable for large tables where direct GLM fitting may be intensive.
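As an illustration of the IPF idea, the following minimal numpy sketch fits the no-three-factor-interaction model [AB][AC][BC] to a three-way table by cycling through the three two-way margins until the fitted margins match the observed ones. The function and variable names are illustrative, and practical implementations add safeguards such as handling zero margins.

```python
import numpy as np

def ipf_no_three_way(n, tol=1e-8, max_iter=1000):
    """Iterative proportional fitting for the model [AB][AC][BC] on an
    I x J x K table of counts n: repeatedly rescale the fitted table so that
    each of its two-way margins matches the corresponding observed margin."""
    mu = np.ones_like(n, dtype=float)
    for _ in range(max_iter):
        mu *= (n.sum(axis=2) / mu.sum(axis=2))[:, :, None]   # match the AB margin
        mu *= (n.sum(axis=1) / mu.sum(axis=1))[:, None, :]   # match the AC margin
        mu *= (n.sum(axis=0) / mu.sum(axis=0))[None, :, :]   # match the BC margin
        if np.allclose(mu.sum(axis=2), n.sum(axis=2), atol=tol):
            break
    return mu   # converges to the maximum likelihood fit under [AB][AC][BC]
```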
Mathematical Formulation
Model Equation
The log-linear model is a type of generalized linear model (GLM) that employs a logarithmic link function to model the expected cell counts in multi-way contingency tables arising from categorical variables. These models treat all variables symmetrically and are particularly suited for count data. In the hierarchical parameterization for categorical data, the logarithm of the expected cell count \mu_{ijk\dots} is expressed as a linear function of parameters capturing main effects and interactions: \log(\mu_{ijk\dots}) = \mu + \lambda_i^A + \lambda_j^B + \lambda_k^C + \dots + \lambda_{ij}^{AB} + \lambda_{ik}^{AC} + \dots + \lambda_{ijk}^{ABC} + \dots, where \mu is the overall mean (on the log scale) and the \lambda terms represent main effects for each factor (e.g., \lambda_i^A for levels of factor A) and higher-order interactions (e.g., \lambda_{ij}^{AB} for the two-way interaction between A and B), up to the highest-order interaction among all factors if included.[11] This can be a saturated model, which includes all possible interaction terms and fits the data perfectly, or an unsaturated model that omits higher-order terms to impose parsimony and test specific hypotheses about associations.[1] Exponentiating both sides yields \mu_{ijk\dots} = \exp(\mu + \lambda_i^A + \lambda_j^B + \dots), ensuring that predicted counts are positive and allowing for multiplicative effects among the factors.[12]
Key assumptions underlying the log-linear model include independence of observations across cells, positive expected values (\mu > 0) to avoid undefined logarithms, and, for count data in contingency tables, observed counts that typically follow a Poisson or multinomial distribution with mean \mu.[13]
The log-linear form arises naturally within the exponential family of distributions, specifically as the canonical link function for the Poisson distribution. For a Poisson random variable Y \sim \mathrm{Poisson}(\mu), the probability mass function is P(Y = y) = \frac{\mu^y e^{-\mu}}{y!}, which can be rewritten in exponential family form as \log P(Y = y) = y \log \mu - \mu - \log(y!), where the natural parameter \theta = \log \mu links directly to the linear predictor \eta = X\beta, yielding the log-linear specification \log \mu = \eta.[14] This canonical parameterization simplifies maximum likelihood estimation and ensures desirable statistical properties, with a similar structure under multinomial sampling.[12]
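As a minimal numerical sketch of this structure, the following Python snippet builds the expected counts of a 2×2 table under effect coding, using arbitrary illustrative parameter values rather than estimates from data, and shows that additivity on the log scale corresponds to multiplicativity on the count scale. Fitting such a model to observed counts by maximum likelihood is described in the estimation section below.

```python
import numpy as np

# Arbitrary illustrative parameter values (not estimates from data) for a 2x2 table
# under effect coding: log(mu_ij) = mu0 + lam_A[i] + lam_B[j] + lam_AB[i, j].
mu0 = 3.0
lam_A = np.array([0.4, -0.4])          # main effect of factor A, sums to zero
lam_B = np.array([0.2, -0.2])          # main effect of factor B, sums to zero
lam_AB = np.zeros((2, 2))              # no interaction term: independence model

log_mu = mu0 + lam_A[:, None] + lam_B[None, :] + lam_AB
mu = np.exp(log_mu)                    # expected counts are strictly positive

# Additivity on the log scale is multiplicativity on the count scale:
# with lam_AB = 0, mu_ij factorizes into a row term times a column term.
row_factor = np.exp(mu0 / 2 + lam_A)
col_factor = np.exp(mu0 / 2 + lam_B)
print(np.allclose(mu, np.outer(row_factor, col_factor)))   # True
```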
Parameter Interpretation
In log-linear models, the parameters associated with main effects, denoted \lambda_i for the i-th category of a categorical variable, represent the logarithmic contribution of that category to the expected cell counts relative to a baseline or reference category.[15] The exponential of these parameters, \exp(\lambda_i), serves as a multiplicative factor that scales the expected count for the given category compared to the baseline, indicating how much the count increases or decreases due to the presence of that level.[16] For instance, under effect coding constraints, where the parameters for a variable sum to zero, \lambda_i measures the deviation of the log-expected count for category i from the overall mean log-expected count across all categories of that variable.[16]
Interaction parameters, such as \lambda_{ij}^{AB} for the joint effect of categories i and j from variables A and B, capture the combined influence of multiple variables on the expected counts beyond what main effects alone would predict.[1] These terms quantify associations between variables; specifically, \exp(\lambda_{ij}^{AB}) can be interpreted as a ratio of expected counts or, in the context of contingency tables, as a measure of how the relationship between two variables modifies the odds or risks within subgroups.[15] For example, in a two-way interaction, \lambda_{ij}^{AB} reflects the partial association between A and B, adjusting for other factors, and its exponential form indicates the proportional change in expected counts due to the specific category combination.[17]
The hierarchy principle in log-linear models ensures that the inclusion of a higher-order interaction term implies the presence of all corresponding lower-order terms, facilitating interpretable and nested model structures.[16] For instance, specifying a three-way interaction \lambda_{ijk}^{ABC} requires including the two-way interactions such as \lambda_{ij}^{AB} and the main effects \lambda_i^A, as the higher-order term builds upon and modifies the lower-order associations.[1] This principle maintains consistency in parameter meanings across models and prevents overparameterization by enforcing that higher interactions represent variations in lower ones.[17]
Parameters in log-linear models often translate directly into interpretable measures such as odds ratios and risk ratios, enhancing their practical utility in analysis.[15] The local odds ratio for adjacent categories i, i+1 and j, j+1 of variables A and B is given by \exp(\lambda_{ij}^{AB} + \lambda_{i+1,j+1}^{AB} - \lambda_{i+1,j}^{AB} - \lambda_{i,j+1}^{AB}), representing the change in odds of one outcome relative to another for a unit shift in categories.[16] Similarly, \exp(\beta) in related GLM contexts denotes the multiplicative change in the expected response for a one-unit increase in a predictor, akin to a risk ratio in count data contexts.[1] These transformations allow parameters to convey substantive effects, such as relative risks between groups, in a scale-free manner.[17]
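In the simplest 2×2 case this reduces to a single cross-product ratio: writing \log(\mu_{ij}) = \mu + \lambda_i^A + \lambda_j^B + \lambda_{ij}^{AB}, the log odds ratio is \log\frac{\mu_{11}\mu_{22}}{\mu_{12}\mu_{21}} = \lambda_{11}^{AB} + \lambda_{22}^{AB} - \lambda_{12}^{AB} - \lambda_{21}^{AB}, since the overall mean and the main-effect terms cancel in the cross-product; the odds ratio therefore depends only on the interaction parameters.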
Applications
Categorical Data Analysis
Log-linear models are widely applied in categorical data analysis to examine relationships among multiple categorical variables through multi-way contingency tables, where cell entries represent observed frequencies. These models treat the cell counts as realizations of independent Poisson random variables, enabling the specification of expected frequencies via a logarithmic link function that captures main effects and interactions additively.[18] This Poisson assumption aligns with the multinomial sampling common in contingency table studies, as the conditional distribution given the margins follows a multinomial form under the same log-linear parameterization.[19]
A key feature of log-linear models in this context is their hierarchical structure, which allows researchers to specify and test models ranging from complete independence among variables to partial associations (e.g., conditional independence given a third variable) and full interaction terms encompassing all higher-order effects. For instance, in a three-way table, a model of mutual independence might include only main effects, while a partial association model could incorporate a two-way interaction alongside main effects to represent conditional dependencies. These hierarchical specifications facilitate stepwise model building, where higher-order terms are included only if supported by the data, promoting parsimonious representations of complex associations in social science datasets.[20] Such structures were extensively developed in the 1970s to address multi-dimensional tables in fields like sociology and ecology, with Stephen E. Fienberg's 1970 work providing foundational methods for analyzing interactions in higher dimensions.[2]
Collapsibility in log-linear models refers to the conditions under which associations observed in a full table preserve their strength when marginalizing over one or more variables, a property crucial for interpreting aggregated data without distortion. Violations of collapsibility can manifest as Simpson's paradox, where marginal associations reverse direction compared to conditional ones, often arising in non-collapsible interaction structures like those in certain hierarchical models. For example, in a three-way table, the partial association between two variables may hold conditionally but not marginally if a third variable induces confounding interactions. Parameter interpretations in these models link interaction terms to log-odds ratios or log-expected frequency deviations, aiding in the assessment of such effects.[21][22]
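Collapsibility can be inspected directly from a table. The following sketch (numpy; the function name is illustrative) compares the marginal A-B odds ratio of a 2×2×2 table with the conditional odds ratios at each level of C; a marked discrepancy, or a reversal of direction as in Simpson's paradox, indicates that the association is not collapsible over C.

```python
import numpy as np

def odds_ratios_2x2x2(counts):
    """Compare marginal and conditional (partial) odds ratios for a 2x2x2 table.

    counts[i, j, k] is the cell count for level i of A, j of B, and k of C.
    Returns the A-B odds ratio in the marginal table (summed over C) and
    the A-B odds ratio within each level of C.
    """
    counts = np.asarray(counts, dtype=float)
    marginal = counts.sum(axis=2)                       # collapse over C
    or_marginal = (marginal[0, 0] * marginal[1, 1]) / (marginal[0, 1] * marginal[1, 0])
    or_conditional = [
        (counts[0, 0, k] * counts[1, 1, k]) / (counts[0, 1, k] * counts[1, 0, k])
        for k in range(counts.shape[2])
    ]
    return or_marginal, or_conditional
```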
Econometrics and Trend Modeling
In econometrics, log-linear models are frequently employed to analyze relationships involving multiplicative growth, such as demand functions or economic expansion, where the specification takes the form \log(y) = \alpha + \beta \log(x) + \epsilon. This double-log transformation allows the coefficient \beta to be interpreted as the elasticity of y with respect to x, quantifying the percentage change in y for a one percent change in x.[23][24] A prominent application occurs in labor economics through wage equations, exemplified by the Mincer equation, which models log wages as a function of education and experience to estimate returns to human capital.[25] Similar log-linear forms are used in epidemiology to examine incidence rates, capturing proportional changes in disease occurrence over time or across populations.[26]
In trend analysis, log-linear models facilitate the modeling of exponential growth patterns, particularly in time series data, since on the logarithmic scale exponential growth appears as a linear trend with a constant percentage change per period. A key example is joinpoint regression, which applies piecewise log-linear segments to detect shifts in trends, such as varying rates of increase in health or economic indicators over time. The log transformation in these models also addresses the heteroscedasticity inherent in multiplicative error structures, where errors proportional to the level of the variable lead to increasing variance; by stabilizing this variance on the log scale, the approach improves the reliability of linear regression assumptions.[27] Such applications extend log-linear principles to generalized linear models for non-categorical outcomes in economics.[28]
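A brief sketch of the double-log specification, using simulated data purely for illustration (the true elasticity is set to -1.5 and recovered by ordinary least squares on the log scale; all variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated data, purely for illustration: a constant-elasticity relationship
# log(q) = alpha + beta * log(p) + e with elasticity beta = -1.5.
n = 500
log_p = rng.normal(0.0, 0.5, n)
log_q = 2.0 - 1.5 * log_p + rng.normal(0.0, 0.2, n)

X = sm.add_constant(log_p)             # intercept plus log-price regressor
fit = sm.OLS(log_q, X).fit()
print(fit.params)                      # slope estimate (the elasticity) is close to -1.5
```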
Estimation and Inference
Maximum Likelihood Estimation
Maximum likelihood estimation (MLE) is the primary method for obtaining parameter estimates in log-linear models, which are generalized linear models (GLMs) assuming a Poisson distribution for cell counts in contingency tables and a logarithmic link function relating the mean to the linear predictor. Under independent Poisson sampling, the likelihood function for observed counts y = (y_1, \dots, y_I) and expected means \mu = (\mu_1, \dots, \mu_I) is given by L(\beta) = \prod_{i=1}^I \frac{\mu_i^{y_i} e^{-\mu_i}}{y_i!}, where \mu_i = \exp(x_i^T \beta) and x_i denotes the design vector for cell i. Maximization of L(\beta) is typically performed by optimizing the log-likelihood \ell(\beta) = \sum_{i=1}^I \left( y_i \log \mu_i - \mu_i - \log y_i! \right), in which the term -\sum_i \log y_i! does not depend on \beta and can be ignored for optimization purposes. The resulting maximum likelihood estimates \hat{\beta} satisfy the score equations derived from the exponential family structure of the Poisson distribution.
Due to the nonlinearity of the log link, closed-form solutions are unavailable except in special cases such as saturated models; instead, iterative numerical methods are employed. The Newton-Raphson algorithm updates parameter estimates via \beta^{(k+1)} = \beta^{(k)} + \left( X^T W X \right)^{-1} X^T (y - \mu^{(k)}), where W is the diagonal matrix of weights \operatorname{diag}(\mu_i^{(k)}), but it can suffer from instability for sparse data. For log-linear models, this is commonly implemented as iteratively reweighted least squares (IRLS), which reframes the problem as weighted least squares on the log scale by linearizing the link function and iteratively adjusting the weights based on current fitted values. IRLS converges to the MLE under standard conditions and is the basis for software implementations such as R's glm function.[2]
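The IRLS update can be written compactly in code. The following is a minimal numpy sketch under the assumptions above (design matrix X including a column of ones, vector of counts y); in practice one would rely on established implementations such as R's glm or loglin, and the sketch omits safeguards such as step-halving for sparse tables.

```python
import numpy as np

def fit_poisson_loglinear(X, y, tol=1e-10, max_iter=50):
    """Fit log(mu) = X beta by iteratively reweighted least squares (IRLS).

    Each step solves a weighted least-squares problem with working response z
    and Poisson working weights mu, which for the canonical log link coincides
    with Newton-Raphson / Fisher scoring.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu            # working response for the log link
        W = mu                             # Poisson working weights (diagonal of W)
        XtWX = X.T @ (W[:, None] * X)
        XtWz = X.T @ (W * z)
        beta_new = np.linalg.solve(XtWX, XtWz)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, np.exp(X @ beta)          # estimates and fitted cell means
```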
Under regularity conditions—such as the model being correctly specified, positive expected cell counts, and the information matrix being positive definite—the MLE \hat{\beta} is consistent, i.e., \hat{\beta} \to_p \beta as sample size increases, and asymptotically normal: \sqrt{n} (\hat{\beta} - \beta) \to_d N(0, I(\beta)^{-1}), where I(\beta) is the Fisher information matrix. These properties enable large-sample inference, including Wald confidence intervals for parameters.[29]
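Continuing the IRLS sketch above, the asymptotic covariance matrix of \hat{\beta} is estimated by (X^T \hat{W} X)^{-1} with \hat{W} = \operatorname{diag}(\hat{\mu}_i) evaluated at the fitted means, from which approximate Wald confidence intervals follow (illustrative helper, assuming the design matrix, fitted means, and estimates from the previous sketch):

```python
import numpy as np
from scipy.stats import norm

def wald_intervals(X, mu_hat, beta_hat, level=0.95):
    """Approximate Wald confidence intervals for the coefficients of a fitted
    Poisson log-linear model, using the inverse information (X^T W X)^{-1}
    with W = diag(mu_hat)."""
    X = np.asarray(X, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    cov = np.linalg.inv(X.T @ (mu_hat[:, None] * X))   # asymptotic covariance of beta_hat
    se = np.sqrt(np.diag(cov))
    z = norm.ppf(0.5 + level / 2.0)                    # e.g. 1.96 for a 95% interval
    return np.column_stack([beta_hat - z * se, beta_hat + z * se])
```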
In certain cases, such as decomposable log-linear models where the generating class forms a simplicial complex, the MLE coincides with solutions from weighted least squares estimation on the logarithmic scale of the sufficient marginal statistics.[2]
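For example, in a three-way table the conditional-independence model [AB][BC] (A independent of C given B) is decomposable, and its maximum likelihood fitted values have the closed form \hat{\mu}_{ijk} = \frac{n_{ij+} \, n_{+jk}}{n_{+j+}}, computed directly from the sufficient marginal totals without iteration.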
Goodness-of-Fit Tests
Goodness-of-fit tests for log-linear models assess the adequacy of the fitted model in reproducing the observed contingency table frequencies, typically using statistics derived from maximum likelihood estimation. These tests compare the observed counts y_i to the expected counts \mu_i under the Poisson assumption inherent to log-linear models.[30]
The deviance statistic, also known as the likelihood ratio chi-square statistic G^2, quantifies the discrepancy between observed and fitted values as D = 2 \sum_i y_i \log \left( \frac{y_i}{\mu_i} \right), where terms with y_i = 0 are taken as zero. Under the null hypothesis of adequate fit and large sample sizes, D approximately follows a chi-squared distribution with degrees of freedom equal to the number of cells minus the number of estimated parameters. A non-significant D (e.g., p-value > 0.05) indicates that the model fits the data well.[30]
The Pearson chi-squared statistic provides an alternative measure of fit, defined as X^2 = \sum_i \frac{(y_i - \mu_i)^2}{\mu_i}. Like the deviance, X^2 is asymptotically chi-squared distributed with the same degrees of freedom for large samples, and the two statistics are asymptotically equivalent under the null hypothesis, though they can differ noticeably when some expected counts are small. Both statistics are equivalent to tests against the saturated model, which perfectly fits the data by estimating a separate parameter for each cell.[30]
Likelihood ratio tests (LRT) extend these assessments to compare nested log-linear models, such as hierarchical structures where one model is a special case of another (e.g., testing for higher-order interactions). The test statistic is the difference in deviances between the reduced and fuller models, D_{\text{reduced}} - D_{\text{full}}, which follows a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters. This approach is useful for model selection in multi-way tables, where significant differences indicate the need for additional terms.[30]
Overdispersion arises in log-linear models when the observed variability exceeds the Poisson assumption of variance equaling the mean (\text{Var}(y_i) = \mu_i), often detected when the deviance or Pearson statistic divided by its degrees of freedom exceeds 1 (e.g., values around 4 suggest substantial overdispersion). To address this, quasi-likelihood methods adjust the variance to \text{Var}(y_i) = \phi \mu_i, where the dispersion parameter \phi is estimated as the Pearson statistic divided by its degrees of freedom; standard errors are then scaled by \sqrt{\phi}, and test statistics are divided by \phi to maintain valid inference without altering the parameter estimates.[31]
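These quantities are straightforward to compute from observed and fitted counts. The following minimal Python sketch (illustrative function name) returns the deviance, the Pearson statistic, their chi-squared p-values, and a Pearson-based dispersion estimate that can be used for the quasi-likelihood adjustment described above.

```python
import numpy as np
from scipy.stats import chi2

def fit_statistics(y, mu_hat, n_params):
    """Deviance (G^2), Pearson X^2, and a dispersion estimate for a fitted
    log-linear model with fitted cell means mu_hat and n_params parameters."""
    y = np.asarray(y, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    df = y.size - n_params
    ratio = np.where(y > 0, y / mu_hat, 1.0)     # empty cells contribute zero
    G2 = 2.0 * np.sum(y * np.log(ratio))
    X2 = np.sum((y - mu_hat) ** 2 / mu_hat)
    phi_hat = X2 / df                            # values well above 1 suggest overdispersion
    return {
        "G2": G2, "X2": X2, "df": df,
        "p_G2": chi2.sf(G2, df), "p_X2": chi2.sf(X2, df),
        "dispersion": phi_hat,
    }
```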
Examples and Case Studies
Two-Way Contingency Table
A two-way contingency table arises when cross-classifying observations by two categorical variables, yielding cell counts that log-linear models can analyze for patterns of independence or association.[32] Consider data from a survey of 1091 adults on gender and belief in an afterlife, presented in the following 2×2 table:
| Gender | Belief: Yes | Belief: No | Total |
|---|---|---|---|
| Female | 435 | 147 | 582 |
| Male | 375 | 134 | 509 |
| Total | 810 | 281 | 1091 |
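For these data, the independence model \log(\mu_{ij}) = \mu + \lambda_i^{\text{Gender}} + \lambda_j^{\text{Belief}} has fitted values \hat{\mu}_{ij} = (\text{row total}_i)(\text{column total}_j)/1091; for example, \hat{\mu}_{11} = 582 \times 810 / 1091 \approx 432.1 for females answering yes. A short Python sketch of the corresponding goodness-of-fit calculation (numpy and scipy assumed available):

```python
import numpy as np
from scipy.stats import chi2

# Observed counts: rows are gender (female, male), columns are belief (yes, no).
observed = np.array([[435.0, 147.0],
                     [375.0, 134.0]])

row_totals = observed.sum(axis=1, keepdims=True)   # 582, 509
col_totals = observed.sum(axis=0, keepdims=True)   # 810, 281
n = observed.sum()                                  # 1091

# Fitted counts under independence: mu_ij = (row total)(column total) / n.
expected = row_totals @ col_totals / n

# Likelihood-ratio (G^2) and Pearson (X^2) statistics on (2-1)(2-1) = 1 df.
G2 = 2.0 * np.sum(observed * np.log(observed / expected))
X2 = np.sum((observed - expected) ** 2 / expected)
df = 1
print(G2, X2, chi2.sf(G2, df), chi2.sf(X2, df))
```

Both statistics come out to roughly 0.16 on one degree of freedom, far below conventional chi-squared critical values, so the independence model describes these counts well; equivalently, the sample odds ratio 435 \times 134 / (147 \times 375) \approx 1.06 is close to 1.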