| Title: | Regression under Interference in Connected Populations |
|---|---|
| Description: | An implementation of generalized linear models (GLMs) for studying relationships among attributes in connected populations, where responses of connected units can be dependent, as introduced by Fritz et al. (2025) <doi:10.1080/01621459.2025.2565851>. 'iglm' extends GLMs for independent responses to dependent responses and can be used for studying spillover in connected populations and other network-mediated phenomena. |
| Authors: | Cornelius Fritz [aut, cre], Michael Schweinberger [aut] |
| Maintainer: | Cornelius Fritz <[email protected]> |
| License: | GPL-3 |
| Version: | 1.2.4 |
| Built: | 2026-06-05 20:33:22 UTC |
| Source: | https://github.com/corneliusfritz/iglm |
Create a list of control parameters for the 'iglm' estimation algorithm.
control.iglm( estimate_model = TRUE, display_progress = FALSE, return_samples = TRUE, offset_nonoverlap = 0, var_method = "Mean-value", non_stop = FALSE, tol = 0.001, max_it = 100, return_x = FALSE, return_y = FALSE, return_z = FALSE, accelerated = TRUE, exact = TRUE )
estimate_model |
(logical) If 'TRUE' (default), the main model parameters are estimated. If 'FALSE', estimation is skipped and only the preprocessing is done. |
display_progress |
(logical) If 'TRUE', display progress messages or a progress bar during estimation. Default is 'FALSE'. |
return_samples |
(logical). If |
offset_nonoverlap |
(numeric) A value added to the linear predictor for dyads not in the 'overlap' set. Default is '0'. |
var_method |
(string) Method for variance estimation. Options are "Mean-value" (default), "Godambe", and "Hessian". The mean-value method is described in Section 3.3 of Fritz et al. (2025), the Godambe method is described in Schmid and Hunter (2023), and the "Hessian" option assumes that the pseudo-likelihood is the correct likelihood. |
non_stop |
(logical) If 'TRUE', the estimation algorithm continues until 'max_it' iterations, ignoring the 'tol' convergence criterion. Default is 'FALSE'. |
tol |
(numeric) The tolerance level for convergence. The estimation stops when the change in coefficients between iterations is less than 'tol'. Default is '0.001'. |
max_it |
(integer) The maximum number of iterations for the estimation algorithm. Default is '100'. |
return_x |
(logical). If |
return_y |
(logical). If |
return_z |
(logical). If |
accelerated |
(logical) If 'TRUE' (default), an accelerated MM algorithm is used, based on a quasi-Newton scheme described in the Supplemental Material of Fritz et al. (2025). |
exact |
(logical) If 'TRUE' (default), the pseudo Fisher information is calculated exactly for assessing the uncertainty of the estimates. |
A list object of class '"control.iglm"' containing the specified control parameters.
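How 'tol', 'max_it', and 'non_stop' interact can be illustrated with a generic fixed-point iteration. The following base R sketch is only an illustration under assumptions: the helper `run_iterations`, the toy update function, and the use of the maximum absolute coefficient change as the convergence measure are hypothetical and not taken from the 'iglm' source.

```r
# Hypothetical helper: iterate `update` until the change in coefficients
# falls below `tol`, or until `max_it` iterations have been carried out.
run_iterations <- function(update, start, tol = 0.001, max_it = 100,
                           non_stop = FALSE) {
  coef <- start
  converged <- FALSE
  it <- 0
  while (it < max_it) {
    it <- it + 1
    new_coef <- update(coef)
    change <- max(abs(new_coef - coef))
    coef <- new_coef
    if (change < tol) {
      converged <- TRUE
      # With non_stop = TRUE the tolerance is ignored and iteration continues.
      if (!non_stop) break
    }
  }
  list(coef = coef, iterations = it, converged = converged)
}

# Toy update (assumption): Babylonian iteration converging to sqrt(2).
toy_update <- function(b) 0.5 * (b + 2 / b)
res <- run_iterations(toy_update, start = 1)
res_full <- run_iterations(toy_update, start = 1, non_stop = TRUE)
```

In this sketch, `res` stops as soon as the change drops below 'tol', whereas `res_full` always uses all 'max_it' iterations.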
Fritz, C., Schweinberger, M., Bhadra, S., and D.R. Hunter (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, to appear.
Schmid, C.S. and D.R. Hunter (2023). Computing Pseudolikelihood Estimators for Exponential-Family Random Graph Models. Journal of Data Science.
A preprocessed dataset containing social ties, physical proximity, and nodal
attributes for a subset of participants in the Copenhagen Networks Study.
The object is provided as an iglm.data class.
data(copenhagen)
The iglm.data provides the following information:
An integer matrix representing the
undirected friendship network ($Z$).
A logical/binomial vector indicating
gender (1 for female, 0 for male).
A numeric vector representing the
log-transformed total call duration in minutes.
A matrix defining the proximity-based constraint space. Pairs are included if their cumulative physical proximity exceeded 24 hours during the observation period.
Boolean TRUE, indicating that the attribute
is treated as exogenous.
The following preprocessing steps were carried out:
Temporal Aggregation: Proximity data (Bluetooth scans) were aggregated into sessions. A session break was defined by any temporal gap exceeding 10 minutes.
Recursive Pruning: A recursive filter removed actors with missing
gender information and actors that were isolated in either the
friendship (z_network) or proximity (neighborhood) networks.
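The two preprocessing steps above can be sketched in base R. This is not the code used to build the dataset: the toy timestamps, the toy networks, and the helper `prune_actors` are hypothetical and only illustrate the stated rules (a session break at gaps exceeding 10 minutes, and repeated removal of actors until none is missing gender or isolated).

```r
# Hypothetical scan timestamps (in minutes) for one pair of participants.
scan_minutes <- c(0, 5, 9, 30, 35, 100)

# A new session starts whenever the gap to the previous scan exceeds 10 minutes.
session_id <- cumsum(c(TRUE, diff(scan_minutes) > 10))

# Recursive pruning: drop actors with missing gender or without ties in
# either network, and repeat until no further actor has to be removed.
prune_actors <- function(z, nb, gender) {
  keep <- seq_along(gender)
  repeat {
    z_sub <- z[keep, keep, drop = FALSE]
    nb_sub <- nb[keep, keep, drop = FALSE]
    ok <- !is.na(gender[keep]) &
      (rowSums(z_sub) + colSums(z_sub)) > 0 &
      (rowSums(nb_sub) + colSums(nb_sub)) > 0
    if (all(ok)) break
    keep <- keep[ok]
  }
  keep
}

# Toy data (assumption): friendship ties 1-2, 2-3, 3-4; proximity ties 1-2, 3-4.
z <- matrix(0, 4, 4)
z[cbind(c(1, 2, 3), c(2, 3, 4))] <- 1
z <- z + t(z)
nb <- matrix(0, 4, 4)
nb[cbind(c(1, 3), c(2, 4))] <- 1
nb <- nb + t(nb)
gender <- c(1, 0, 1, NA)

kept <- prune_actors(z, nb, gender)
```

In the toy data, actor 4 is removed for missing gender, which leaves actor 3 isolated in the proximity network, so actor 3 is removed in the next pass.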
Sapiezynski, P., Stopczynski, A., Lassen, D. D. and Lehmann, S. (2019), Interaction data from the Copenhagen Networks Study. Scientific Data 6(1), 315.
This function generates the directory structure and source files for a new R package
named iglm.userterms (or whatever name is provided in the parameter pkg_name).
This auxiliary package serves as a template for extending the
iglm framework to user-defined sufficient statistics.
By compiling this package, users can link custom C++ implementations of change statistics
directly with the iglm package, enabling seamless integration of new model terms.
create_userterms_skeleton(path = ".", pkg_name = "iglm.userterms")
path |
A character string specifying the path where the package directory
should be created. Defaults to the current working directory ("."). |
pkg_name |
A character string specifying the name of the package to be created. |
The function creates a directory with the name specified in pkg_name
at the specified location.
As an example of a possible statistic, the statistic counting mutual
connections in the network is implemented.
After defining all possible change statistics in the C++ function (this has to include a change for
z_ij (network), x_i (attribute x), and y_i (attribute y), each toggling from 0 to 1),
the function has to be registered using the EFFECT_REGISTER macro.
After compiling the package,
users have to load the package using library(pkg_name) before using it in iglm.
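The idea of a change statistic for the mutual-connection example can be sketched in base R. The generated package implements it in C++; the functions `count_mutual` and `change_mutual` below are hypothetical names used only to illustrate the concept for the network toggle z_ij from 0 to 1.

```r
# Count mutual pairs (z_ij = z_ji = 1) in a directed adjacency matrix
# with an empty diagonal.
count_mutual <- function(z) sum(z * t(z)) / 2

# Change statistic for toggling z_ij from 0 to 1: the count increases by
# one exactly when the reverse tie z_ji is already present.
change_mutual <- function(z, i, j) z[j, i]

# Toy directed network (assumption): ties 2 -> 1 and 3 -> 2.
z <- matrix(0, 3, 3)
z[2, 1] <- 1
z[3, 2] <- 1

# Toggle z_12 from 0 to 1 and compare the statistic before and after.
z_new <- z
z_new[1, 2] <- 1
observed_change <- count_mutual(z_new) - count_mutual(z)
```

Here `observed_change` agrees with `change_mutual(z, 1, 2)`, which is the property a change statistic has to satisfy.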
R package iglm implements generalized linear models (GLMs)
for studying relationships among attributes in connected populations,
where responses of connected units can be dependent.
It extends GLMs for independent responses to dependent responses and can
be used for studying spillover in connected populations and other network-mediated phenomena.
It is based on a joint probability model for dependent
responses ($Y$) and connections ($Z$) conditional on
predictors ($X$).
iglm( formula = NULL, coef = NULL, coef_degrees = NULL, sampler = NULL, control = NULL, name = NULL, file = NULL )
formula |
A model 'formula' object. The left-hand side should be the
name of an 'iglm.data' object available in the calling environment.
See |
coef |
Optional numeric vector of initial coefficients for the structural (non-degrees) terms in 'formula'. If 'NULL', coefficients are initialized to zero. Length must match the number of terms. |
coef_degrees |
Optional numeric vector specifying the initial degrees coefficients. Required if 'formula' includes degrees terms, otherwise should be 'NULL'. Length must match 'n_actor' (for undirected) or '2 * n_actor' (for directed). |
sampler |
An object of class |
control |
An object of class |
name |
Optional character string specifying a name for the model. |
file |
Optional character string specifying a file path to load a
previously saved |
An object of class iglm.object.
The joint probability density is defined by two distinct sets of user-specified features:
$g_i$: A vector of unit-level functions (or "g-terms")
that describe the relationship between an individual actor $i$'s
predictors ($x_i$) and their own response ($y_i$).
$h_{i,j}$: A vector of pair-level functions (or "h-terms")
that specify how the connections ($z$) and responses ($y_i$, $y_j$)
of a pair of units depend on each other and the wider
network structure.
This separation allows the model to simultaneously capture individual-level
behavior (via $g$) and dyadic, network-based dependencies (via $h$),
including local dependence limited to overlapping neighborhoods.
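As a rough sketch only, a joint density built from unit-level and pair-level features can be written in exponential-family form. The notation below (the parameter vectors $\theta_g$ and $\theta_h$, the index sets of the sums, and the omission of reference measures) is an assumption made for illustration and may differ from the exact specification in Fritz et al. (2025).

```latex
f_{\theta}(y, z \mid x) \;\propto\;
\exp\Big( \theta_g^{\top} \sum_{i} g_i(x_i, y_i)
        + \theta_h^{\top} \sum_{i \neq j} h_{i,j}(x, y, z) \Big)
```

Under this reading, the unit-level sum plays the role of the linear predictor of an ordinary GLM, while the pair-level sum introduces dependence among responses and connections.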
This help page documents the various statistics available in 'iglm',
corresponding to the $g$ (attribute-level) and $h$ (pair-level)
components of the joint model. See iglm-terms for details on specifying
all model terms via the formula interface.
Fritz, C., Schweinberger, M., Bhadra, S., and D.R. Hunter (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, to appear.
Schweinberger, M. and M.S. Handcock (2015). Local Dependence in Random Graph Models: Characterization, Properties, and Statistical Inference. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 77, 647-676.
Schweinberger, M. and J.R. Stewart (2020). Concentration and Consistency Results for Canonical and Curved Exponential-Family Models of Random Graphs. The Annals of Statistics, 48, 374-396.
Stewart, J.R. and M. Schweinberger (2025). Pseudo-Likelihood-Based M-Estimation of Random Graphs with Dependent Edges and Parameter Vectors of Increasing Dimension. The Annals of Statistics, to appear.
# Example usage:
# Create an iglm.data object (example)
n_actor <- 50
neighborhood <- matrix(1, nrow = n_actor, ncol = n_actor)
xyz_obj <- iglm.data(
  neighborhood = neighborhood, directed = FALSE,
  type_x = "binomial", type_y = "binomial"
)
# Define ground truth coefficients
gt_coef <- c("edges_local" = 3, "attribute_y" = -1, "attribute_x" = -1)
gt_coef_pop <- rnorm(n = n_actor, -2, 1)
# Define MCMC sampler
sampler_new <- sampler.iglm(
  n_burn_in = 100, n_simulation = 10,
  sampler_x = sampler.net.attr(n_proposals = n_actor * 10),
  sampler_y = sampler.net.attr(n_proposals = n_actor * 10),
  sampler_z = sampler.net.attr(n_proposals = sum(neighborhood > 0) * 10),
  init_empty = FALSE
)
# Create iglm model specification
model_tmp_new <- iglm(
  formula = xyz_obj ~ edges(mode = "local") + attribute_y + attribute_x + degrees,
  coef = gt_coef, coef_degrees = gt_coef_pop,
  sampler = sampler_new,
  control = control.iglm(
    accelerated = FALSE, max_it = 200, display_progress = FALSE
  )
)
# Simulate from the model
model_tmp_new$simulate()
model_tmp_new$set_target(model_tmp_new$get_samples()[[1]])
# Estimate model parameters
model_tmp_new$estimate()
# Model Assessment
model_tmp_new$assess(formula = ~degree_distribution)
model_tmp_new$results$plot(model_assessment = TRUE)
The help pages of iglm describe the model with details on model fitting
and estimation.
Generally, a model is specified via its sufficient statistics,
which can be further decomposed into two parts:
$g_i$: A vector of unit-level functions (or "g-terms")
that describe the relationship between an individual actor $i$'s
predictors ($x_i$) and their own response ($y_i$).
$h_{i,j}$: A vector of pair-level functions (or "h-terms")
that specify how the connections ($z$) and responses ($y_i$, $y_j$)
of a pair of units depend on each other and the wider
network structure.
Each term defines a component of the model's features, which
are a sum of unit-level components, $g_i$, and/or
pair-level components, $h_{i,j}$.
The implemented terms are grouped into three categories:
Attribute Terms: Depend only on individual attributes $x$ or $y$.
Network Terms: Depend only on the connections $z$.
Joint Attribute/Network Terms: Depend on both individual attributes and connections.
degrees: Degrees: Specifies node-level fixed effects. Estimation requires an MM algorithm constraint.
edges(mode = "global"): Edges: Captures the baseline propensity of tie formation, partitioned by structural boundary. Modes: global, local, alocal.
mutual(mode = "global"): Mutual Reciprocity: Evaluates reciprocal tie formation in directed networks. Modes: global, local, alocal.
cov_z(data, mode = "global"): Dyadic Covariate: Exogenous dyadic covariate influence on edge formation. Modes: global, local, alocal.
cov_z_out(data, mode = "global"): Covariate Sender: Exogenous monadic covariate influence on generating an outgoing tie. Modes: global, local, alocal.
cov_z_in(data, mode = "global"): Covariate Receiver: Exogenous monadic covariate influence on receiving an incoming tie. Modes: global, local, alocal.
cov_x(data = v): Nodal Covariate (X): Effect of a unit-level exogenous covariate $v$ on endogenous attribute $x$.
cov_y(data = v): Nodal Covariate (Y): Effect of a unit-level exogenous covariate $v$ on endogenous attribute $y$.
attribute_xy(mode = "global"): Nodal Attribute Interaction (X-Y): Interaction of attributes $x$ and $y$. Modes: global, local, alocal.
attribute_yz(mode = "local"): Attribute Sum (Y-Z): Models the additive effect of $y_i$ and $y_j$ on edge formation within local neighborhoods.
attribute_xz(mode = "local"): Attribute Sum (X-Z): Models the additive effect of $x_i$ and $x_j$ on edge formation within local neighborhoods.
inedges_y(mode = "global"): Attribute In-Degree (Y-Z): Influence of endogenous $y$ on in-degree reception. Modes: global, local, alocal.
outedges_y(mode = "global"): Attribute Out-Degree (Y-Z): Influence of endogenous $y$ on out-degree formation. Modes: global, local, alocal.
inedges_x(mode = "global"): Attribute In-Degree (X-Z): Influence of endogenous $x$ on in-degree reception. Modes: global, local, alocal.
outedges_x(mode = "global"): Attribute Out-Degree (X-Z): Influence of endogenous $x$ on out-degree formation. Modes: global, local, alocal.
attribute_x: Attribute (X): Intercept for attribute $x$.
attribute_y: Attribute (Y): Intercept for attribute $y$.
edges_x_match(mode = "global"): Attribute Match (X-Z): Models homophily/matching on the binary attribute $x$. Modes: global, local.
edges_y_match(mode = "global"): Attribute Match (Y-Z): Models homophily/matching on the binary attribute $y$. Modes: global, local.
spillover_yy_scaled(mode = "global"): Scaled Y-Y-Z Outcome Spillover: Normalizes the $y$-outcome spillover influence by the relevant out-degree topology. Modes: global, local.
spillover_xx_scaled(mode = "global"): Scaled X-X-Z Outcome Spillover: Normalizes the $x$-outcome spillover influence by the relevant out-degree topology. Modes: global, local.
spillover_yx_scaled(mode = "global"): Scaled Y-X-Z Treatment Spillover: Normalizes cross-attribute spillover influence. Modes: global, local.
spillover_xy_scaled(mode = "global"): Scaled X-Y-Z Treatment Spillover: Normalizes cross-attribute spillover influence. Modes: global, local.
gwesp(data, mode = "global", variant = "OSP", decay = 0): Geometrically Weighted Edgewise-Shared Partners: Models triadic closure propensity conditioning on existing edges.
The variant dictates the path constraint: OTP, ITP, OSP, ISP for directed networks; symm for undirected networks.
gwdsp(data, mode = "global", variant = "OSP", decay = 0): Geometrically Weighted Dyadwise-Shared Partners: Models triadic potential irrespective of the closing edge.
The variant dictates the path constraint: OTP, ITP, OSP, ISP for directed networks; symm for undirected networks.
gwdegree(mode = "global", decay = 0): Geometrically Weighted Degree: Captures the degree distribution utilizing an exponential decay parameter.
gwidegree(mode = "global", decay = 0): Geometrically Weighted In-Degree: Captures the in-degree distribution utilizing an exponential decay parameter.
gwodegree(mode = "global", decay = 0): Geometrically Weighted Out-Degree: Captures the out-degree distribution utilizing an exponential decay parameter.
spillover_yc_symm(data = v, mode = "local"): Symmetric Y-C-Z Treatment Spillover: Bidirectional mapping of exogenous covariate and endogenous trait interaction.
spillover_xy(mode = "local"): Directed X-Y-Z Treatment Spillover: Maps cross-attribute treatment assignment.
spillover_yc(mode = "local"): Directed Y-C-Z Treatment Spillover: Exogenous covariate interacting with endogenous trait $y$.
spillover_yx(mode = "local"): Directed Y-X-Z Treatment Spillover: Maps cross-attribute treatment assignment.
spillover_yy(mode = "local"): Symmetric Y-Y-Z Outcome Spillover: Propagates $y$-outcome spillover effects.
spillover_xx(mode = "local"): Symmetric X-X-Z Outcome Spillover: Propagates $x$-outcome spillover effects.
transitive: Transitivity (Local): Indicator evaluating the presence of a local transitive triad configuration.
nonisolates: Non-Isolates: Captures frequency of nodes with degree strictly greater than zero.
isolates: Isolates: Captures frequency of nodes with degree zero.
Attribute Terms:
Below is a detailed description of terms that depend only on nodal attributes:
Network Terms:
Below is a detailed description of terms that depend only on the network structure:
Joint Attribute/Network Terms:
Below is a detailed description of terms that depend on both attributes and the network:
inedges_x-term, inedges_y-term, outedges_x-term, outedges_y-term
spillover_yx-term, spillover_xy-term, spillover_yc-term, spillover_yc_symm-term
Fritz, C., Schweinberger, M., Bhadra, S., and D.R. Hunter (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, to appear.
Schweinberger, M. and M.S. Handcock (2015). Local Dependence in Random Graph Models: Characterization, Properties, and Statistical Inference. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 77, 647-676.
Schweinberger, M. and J.R. Stewart (2020). Concentration and Consistency Results for Canonical and Curved Exponential-Family Models of Random Graphs. The Annals of Statistics, 48, 374-396.
Stewart, J.R. and M. Schweinberger (2025). Pseudo-Likelihood-Based M-Estimation of Random Graphs with Dependent Edges and Parameter Vectors of Increasing Dimension. The Annals of Statistics, to appear.
Creates an 'iglm.data' object, which stores network and attribute data. This function acts as a user-friendly interface to the 'iglm.data' R6 class generator. It handles data input, infers parameters such as the number of actors ('n_actor') and network directedness ('directed') if not explicitly provided, processes network data into a consistent edgelist format, calculates the overlap relation based on an optional neighborhood definition, and performs extensive validation of all inputs.
iglm.data( x_attribute = NULL, y_attribute = NULL, z_network = NULL, neighborhood = NULL, directed = TRUE, n_actor = NA, type_x = "binomial", type_y = "binomial", scale_x = 1, scale_y = 1, fix_x = FALSE, fix_z = FALSE, fix_z_alocal = FALSE, return_neighborhood = TRUE, file = NULL )
x_attribute |
A numeric vector for the first unit-level attribute. |
y_attribute |
A numeric vector for the second unit-level attribute. |
z_network |
A matrix representing the network. Can be a 2-column edgelist or a square adjacency matrix. |
neighborhood |
An optional matrix for the neighborhood representing local dependence. Can be a 2-column edgelist or a square adjacency matrix. A tie in 'neighborhood' between actor i and j indicates that j is in the neighborhood of i, implying dependence between the respective actors. |
directed |
A logical value indicating if 'z_network' is directed. If 'NA' (default), directedness is inferred from the symmetry of 'z_network'. |
n_actor |
An integer for the number of actors in the system. If 'NA' (default), 'n_actor' is inferred from the attributes or network matrices. |
type_x |
Character string for the type of 'x_attribute'. Must be one of '"binomial"', '"poisson"', or '"normal"'. Default is '"binomial"'. |
type_y |
Character string for the type of 'y_attribute'. Must be one of '"binomial"', '"poisson"', or '"normal"'. Default is '"binomial"'. |
scale_x |
A positive numeric value for scaling (e.g., variance for "normal" type). Default is 1. |
scale_y |
A positive numeric value for scaling (e.g., variance for "normal" type). Default is 1. |
fix_x |
(logical) If 'TRUE', the 'x' predictor is held fixed during estimation/simulation (fixed design in regression). Default is 'FALSE'. |
fix_z |
(logical) If 'TRUE', the 'z' network is held fixed during estimation/simulation (fixed network design). Default is 'FALSE'. Setting this to 'TRUE' allows practitioners to estimate autologistic actor attribute models, which were introduced for binary settings by Daraganova and Robins (2013). |
fix_z_alocal |
(logical) If 'TRUE', edges outside the overlap region are fixed, else they are random (default). |
return_neighborhood |
Logical. If 'TRUE' (default) and 'neighborhood' is 'NULL', a full neighborhood (all dyads) is generated implying global dependence. If 'FALSE', no neighborhood is set. |
file |
(character) Optional file path to load a saved 'iglm.data' object state. |
An object of class 'iglm.data' (and 'R6').
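Since 'z_network' and 'neighborhood' accept either a square adjacency matrix or a 2-column edgelist, the relation between the two representations can be sketched in base R. The snippet does not call 'iglm'; the column names and the choice to keep each undirected pair once are assumptions made for illustration.

```r
# Adjacency matrix of a small undirected network with 4 actors.
adjacency <- matrix(c(
  0, 1, 1, 0,
  1, 0, 0, 1,
  1, 0, 0, 1,
  0, 1, 1, 0
), nrow = 4, byrow = TRUE)

# Adjacency matrix -> 2-column edgelist (one row per nonzero entry).
edgelist <- which(adjacency != 0, arr.ind = TRUE)
colnames(edgelist) <- c("from", "to")

# For an undirected network, each pair can be kept once (from < to).
edgelist_undirected <- edgelist[edgelist[, "from"] < edgelist[, "to"], ,
                                drop = FALSE]

# Edgelist -> adjacency matrix, using the edgelist as a matrix index.
rebuilt <- matrix(0, nrow = 4, ncol = 4)
rebuilt[edgelist] <- 1
```

The round trip recovers the original matrix, which is why both input formats can describe the same network.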
Fritz, C., Schweinberger, M., Bhadra, S., and D.R. Hunter (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, to appear.
Daraganova, G., and Robins, G. (2013). Exponential random graph models for social networks: Theory, methods and applications, 102-114. Cambridge University Press.
data("state_twitter")
state_twitter
state_twitter$iglm.data$degree_distribution(prob = FALSE, plot = TRUE)
state_twitter$iglm.data$geodesic_distances_distribution(prob = FALSE, plot = TRUE)
state_twitter$iglm.data$mean_x()
state_twitter$iglm.data$mean_y()
# Generate a small iglm data object either via adjacency matrix or edgelist
tmp_adjacency <- iglm.data(
  z_network = matrix(c(
    0, 1, 1, 0,
    1, 0, 0, 1,
    1, 0, 0, 1,
    0, 1, 1, 0
  ), nrow = 4, byrow = TRUE),
  directed = FALSE, n_actor = 4,
  type_x = "binomial", type_y = "binomial"
)
tmp_edgelist <- iglm.data(
  z_network = tmp_adjacency$z_network,
  directed = FALSE, n_actor = 4,
  type_x = "binomial", type_y = "binomial"
)
tmp_edgelist$mean_z()
tmp_adjacency$mean_z()
The 'iglm.data' class is a container for storing, validating, and analyzing unit-level attributes (x_attribute, y_attribute) and connections (z_network).
x_attribute ('numeric') The vector for the first unit-level attribute.
y_attribute ('numeric') The vector for the second unit-level attribute.
z_network ('matrix') The primary network structure as a 2-column integer edgelist.
neighborhood ('matrix') Read-only. The secondary/neighborhood structure as a 2-column integer edgelist. An empty matrix if not provided.
overlap ('matrix') Read-only. The calculated overlap relation (dyads with shared neighbors in 'neighborhood') as a 2-column integer edgelist. An empty matrix if overlap hasn't been computed or is not available.
directed ('logical') Indicates if the 'z_network' is treated as directed.
n_actor ('integer') The total number of actors (nodes) in the network.
type_x ('character') The specified distribution type for the 'x_attribute'.
type_y ('character') The specified distribution type for the 'y_attribute'.
scale_x ('numeric') The scale parameter associated with the 'x_attribute'.
scale_y ('numeric') The scale parameter associated with the 'y_attribute'.
fix_x ('logical') Indicates if the 'x_attribute' is fixed during estimation/simulation.
fix_z ('logical') Indicates if the 'z_network' is fixed during estimation/simulation.
descriptives ('list') A list storing computed descriptive statistics for the network and attributes.
fix_z_alocal ('logical') Flag indicating whether edges outside the overlap region are treated as fixed.
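The 'overlap' field is described as the set of dyads with shared neighbors in 'neighborhood'. A base R sketch of that idea follows. It is an assumption-laden illustration, not the package's computation: here actors are not counted as their own neighbors, and the toy neighborhood is hypothetical. Setting `diag(nb) <- 1` before the matrix product would additionally make directly neighboring actors overlap.

```r
# Hypothetical neighborhood adjacency matrix for 4 actors:
# symmetric ties 1-2 and 2-3, no self-ties, actor 4 has no neighbors.
nb <- matrix(0, 4, 4)
nb[cbind(c(1, 2), c(2, 3))] <- 1
nb <- nb + t(nb)

# Number of shared neighbors for every pair of actors.
shared <- nb %*% t(nb)
diag(shared) <- 0

# Overlap as a 2-column edgelist of pairs (i < j) with a shared neighbor.
overlap <- which(shared > 0 & upper.tri(shared), arr.ind = TRUE)
```

In this toy example only actors 1 and 3 overlap, because actor 2 is in both of their neighborhoods.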
new()
Create a new 'iglm.data' object, that includes data on two attributes and one network.
iglm.data_generator$new( x_attribute = NULL, y_attribute = NULL, z_network = NULL, neighborhood = NULL, directed = NA, n_actor = NA, type_x = "binomial", type_y = "binomial", scale_x = 1, scale_y = 1, fix_x = FALSE, fix_z = FALSE, fix_z_alocal = TRUE, return_neighborhood = TRUE, file = NULL )
x_attribute: A numeric vector for the first unit-level attribute.
y_attribute: A numeric vector for the second unit-level attribute.
z_network: A matrix representing the network. Can be a 2-column edgelist or a square adjacency matrix.
neighborhood: An optional matrix for the neighborhood representing local dependence. Can be a 2-column edgelist or a square adjacency matrix. A tie in 'neighborhood' between actor i and j indicates that j is in the neighborhood of i, implying dependence between the respective actors.
directed: A logical value indicating if 'z_network' is directed. If 'NA' (default), directedness is inferred from the symmetry of 'z_network'.
n_actor: An integer for the number of actors in the system. If 'NA' (default), 'n_actor' is inferred from the attributes or network matrices.
type_x: Character string for the type of 'x_attribute'. Must be one of '"binomial"', '"poisson"', or '"normal"'. Default is '"binomial"'.
type_y: Character string for the type of 'y_attribute'. Must be one of '"binomial"', '"poisson"', or '"normal"'. Default is '"binomial"'.
scale_x: A positive numeric value for scaling (e.g., variance for "normal" type). Default is 1.
scale_y: A positive numeric value for scaling (e.g., variance for "normal" type). Default is 1.
fix_x: Logical. If 'TRUE', the 'x_attribute' is treated as fixed during model estimation and simulation. Default is 'FALSE'.
fix_z: Logical. If 'TRUE', the 'z_network' is treated as fixed during model estimation and simulation. Default is 'FALSE'.
fix_z_alocal: Logical. If 'TRUE' (default), alocal dyads in the neighborhood are fixed.
return_neighborhood: Logical. If 'TRUE' (default) and 'neighborhood' is 'NULL', a full neighborhood (all dyads) is generated implying global dependence. If 'FALSE', no neighborhood is set.
file: (character) Optional file path to load a saved 'iglm.data' object state.
A new 'iglm.data' object.
set_z_network()
Sets the 'z_network' of the 'iglm.data' object.
iglm.data_generator$set_z_network(z_network)
z_network: A matrix representing the network. Can be a 2-column edgelist or a square adjacency matrix.
The 'iglm.data' object itself ('self'), invisibly.
set_type_x()
Sets the 'type_x' of the 'iglm.data' object.
iglm.data_generator$set_type_x(type_x)
type_x: A character string for the type of 'x_attribute'. Must be one of '"binomial"', '"poisson"', or '"normal"'.
The 'iglm.data' object itself ('self'), invisibly.
set_type_y()
Sets the 'type_y' of the 'iglm.data' object.
iglm.data_generator$set_type_y(type_y)
type_y: A character string for the type of 'y_attribute'. Must be one of '"binomial"', '"poisson"', or '"normal"'.
The 'iglm.data' object itself ('self'), invisibly.
set_scale_x()
Sets the 'scale_x' of the 'iglm.data' object.
iglm.data_generator$set_scale_x(scale_x)
scale_x: A positive numeric value for scaling (e.g., variance for "normal" type).
The 'iglm.data' object itself ('self'), invisibly.
set_scale_y()
Sets the 'scale_y' of the 'iglm.data' object.
iglm.data_generator$set_scale_y(scale_y)
scale_y: A positive numeric value for scaling (e.g., variance for "normal" type).
The 'iglm.data' object itself ('self'), invisibly.
set_x_attribute()
Sets the 'x_attribute' of the 'iglm.data' object.
iglm.data_generator$set_x_attribute(x_attribute)
x_attribute: A numeric vector for the first unit-level attribute.
The 'iglm.data' object itself ('self'), invisibly.
set_y_attribute()
Sets the 'y_attribute' of the 'iglm.data' object.
iglm.data_generator$set_y_attribute(y_attribute)
y_attribute: A numeric vector for the second unit-level attribute.
The 'iglm.data' object itself ('self'), invisibly.
gather()
Gathers the current state of the 'iglm.data' object into a list. This includes all attributes, network, and configuration details necessary to reconstruct the object later.
iglm.data_generator$gather()
A list containing the current state of the 'iglm.data' object.
set_fix_z_alocal()
Sets the option whether alocal edges are fixed or not.
iglm.data_generator$set_fix_z_alocal(fix_z_alocal)
fix_z_alocal: A logical value indicating whether alocal edges should be treated as fixed or not.
delete_isolates()
Deletes isolates from the 'z_network' and updates the attributes and neighborhood accordingly. Isolates are actors that do not have any connections in the 'z_network'. This method identifies such actors, removes them from the attributes and neighborhood, and updates the 'z_network' to reflect the new actor indices.
iglm.data_generator$delete_isolates()
The 'iglm.data' object itself ('self'), invisibly.
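The re-indexing that removing isolates requires can be sketched in base R. The helper `drop_isolates` is hypothetical and works on a plain edgelist and attribute vectors rather than on an 'iglm.data' object; updating the neighborhood is omitted for brevity.

```r
# Hypothetical helper: remove actors without ties and relabel the rest.
drop_isolates <- function(edgelist, x, y) {
  # Actors that appear in at least one tie.
  connected <- sort(unique(as.vector(edgelist)))
  # Map old actor indices to new consecutive indices 1..length(connected).
  new_index <- match(seq_along(x), connected)
  list(
    edgelist = cbind(new_index[edgelist[, 1]], new_index[edgelist[, 2]]),
    x = x[connected],
    y = y[connected]
  )
}

# Toy data (assumption): 5 actors, ties 1-3 and 3-5; actors 2 and 4 are isolates.
edgelist <- rbind(c(1, 3), c(3, 5))
res <- drop_isolates(edgelist, x = c(1, 0, 1, 0, 1), y = c(2, 4, 6, 8, 10))
```

After the call, the remaining actors are relabelled 1, 2, 3, and the attribute vectors contain only their entries.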
save()
Saves the current state of the 'iglm.data' object to a specified file path in RDS format. This includes all attributes, network, and configuration details necessary to reconstruct the object later.
iglm.data_generator$save(file)
file: (character) The file where the object state should be saved. Must have a .rds extension.
The 'iglm.data' object itself ('self'), invisibly.
set_fix_x()
Sets the 'fix_x' of the 'iglm.data' object.
iglm.data_generator$set_fix_x(fix_x)
fix_x: A logical value indicating if 'x_attribute' is fixed or random.
The 'iglm.data' object itself ('self'), invisibly.
set_fix_z()
Sets the 'fix_z' of the 'iglm.data' object.
iglm.data_generator$set_fix_z(fix_z)
fix_z: A logical value indicating if 'z_network' is fixed or random.
The 'iglm.data' object itself ('self'), invisibly.
mean_z()
Calculates the density of the 'z_network'.
iglm.data_generator$mean_z()
A numeric value for the network density.
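Network density is commonly defined as the number of observed ties divided by the number of possible ties. The base R sketch below follows that common definition; whether 'mean_z()' uses exactly this computation is an assumption, and the function name `network_density` is hypothetical.

```r
# Density from an adjacency matrix, ignoring the diagonal.
# Counting ordered pairs in both numerator and denominator gives the same
# value for a symmetric (undirected) matrix as counting unordered pairs.
network_density <- function(adjacency) {
  n <- nrow(adjacency)
  off_diagonal <- row(adjacency) != col(adjacency)
  sum(adjacency != 0 & off_diagonal) / (n * (n - 1))
}

# Undirected toy network with 4 actors and 4 ties.
adjacency <- matrix(c(
  0, 1, 1, 0,
  1, 0, 0, 1,
  1, 0, 0, 1,
  0, 1, 1, 0
), nrow = 4, byrow = TRUE)

dens <- network_density(adjacency)
```

For this network, 4 of the 6 possible undirected ties are present, so the density is 2/3.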
mean_x()
Calculates the mean of the 'x_attribute'.
iglm.data_generator$mean_x()
A numeric value for the mean of 'x_attribute'.
mean_y()
Calculates the mean of the 'y_attribute'.
iglm.data_generator$mean_y()
A numeric value for the mean of 'y_attribute'.
x_distribution()
Calculates the distribution of the 'x_attribute'.
iglm.data_generator$x_distribution( value_range = NULL, prob = TRUE, plot = TRUE )
value_range: (numeric vector) Optional range of values to consider for the distribution. If 'NULL' (default), the range is inferred from the data.
prob: (logical) If 'TRUE' (default), returns probabilities; if 'FALSE', returns frequencies.
plot: (logical) If 'TRUE' (default), plots the distribution using a density plot for continuous data or a bar plot for discrete data.
A numeric vector representing the distribution of 'x_attribute' (invisible).
y_distribution()
Calculates the distribution of the 'y_attribute'.
iglm.data_generator$y_distribution( value_range = NULL, prob = TRUE, plot = TRUE )
value_range(numeric vector) Optional range of values to consider for the distribution. If 'NULL' (default), the range is inferred from the data.
prob(logical) If 'TRUE' (default), returns probabilities; if 'FALSE', returns frequencies.
plot(logical) If 'TRUE' (default), plots the distribution using a density plot for continuous data or a bar plot for discrete data.
A numeric vector representing the distribution of 'y_attribute' (invisible).
edgewise_shared_partner()
Calculates the matrix of edgewise shared partners. This is a two-path matrix (e.g., $A A^T$ or $A^T A$).
iglm.data_generator$edgewise_shared_partner(type = "ALL")
type(character) The type of two-path to calculate for directed
networks. Ignored if network is undirected.
Must be one of:
'"OTP"' (Outgoing Two-Path),
'"ISP"' (Ingoing Shared Partner),
'"OSP"' (Outgoing Shared Partner),
'"ITP"' (Incoming Two-Path),
'"ALL"' (Any one of the above).
Default is '"ALL"'.
A sparse matrix ('dgCMatrix') of shared partner counts.
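The two-path products mentioned above can be sketched with dense base-R matrices. The orientation conventions below follow the usual definitions in the network literature; whether the package uses exactly these orientations, and how it handles the diagonal, are assumptions. The package itself returns sparse matrices.

```r
# Directed toy network: 1 -> 2, 2 -> 3, 1 -> 3
A <- matrix(0, 3, 3)
A[1, 2] <- 1
A[2, 3] <- 1
A[1, 3] <- 1

# Two-path counts for every ordered pair (i, j); conventions are assumptions
otp <- A %*% A        # i -> k -> j
itp <- t(A %*% A)     # j -> k -> i
osp <- A %*% t(A)     # i -> k and j -> k
isp <- t(A) %*% A     # k -> i and k -> j

# A unit is not counted as its own partner
diag(otp) <- 0
diag(itp) <- 0
diag(osp) <- 0
diag(isp) <- 0

# Edgewise counts keep only pairs that are themselves connected
esp_otp <- otp * A
```

Multiplying element-wise by `A` is what distinguishes the edgewise count from a count over all pairs.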
set_neighborhood_overlap()
Sets the neighborhood and overlap matrices.
iglm.data_generator$set_neighborhood_overlap(neighborhood, overlap)
neighborhoodA matrix for a secondary neighborhood. Can be a 2-column edgelist or a square adjacency matrix.
overlapA matrix for the overlap network. Can be a 2-column edgelist or a square adjacency matrix.
None. Updates the internal neighborhood and overlap matrices.
dyadwise_shared_partner()
Calculates the matrix of dyadwise shared partners.
iglm.data_generator$dyadwise_shared_partner(type = "ALL")
type(character) The type of two-path to calculate for directed
networks. Ignored if network is undirected.
Must be one of:
'"OTP"' (Outgoing Two-Path),
'"ISP"' (Ingoing Shared Partner),
'"OSP"' (Outgoing Shared Partner),
'"ITP"' (Incoming Two-Path),
'"ALL"' (Any one of the above).
Default is '"ALL"'.
A sparse matrix ('dgCMatrix') of shared partner counts.
geodesic_distances_distribution()
Calculates the geodesic distance distribution of the symmetrized 'z_network'.
iglm.data_generator$geodesic_distances_distribution( value_range = NULL, prob = TRUE, plot = TRUE )
value_range(numeric vector) A vector 'c(min, max)' specifying the range of distances to tabulate. If 'NULL' (default), the range is inferred from the data.
prob(logical) If 'TRUE' (default), returns a probability distribution (proportions). If 'FALSE', returns raw counts.
plot(logical) If 'TRUE', plots the distribution.
A named vector (a 'table' object) with the distribution of geodesic distances. Includes 'Inf' for unreachable pairs.
geodesic_distances()
Calculates the all-pairs geodesic distance matrix for the symmetrized 'z_network' using a matrix-based BFS algorithm.
iglm.data_generator$geodesic_distances()
A sparse matrix ('dgCMatrix') where 'D[i, j]' is the shortest path distance from i to j. 'Inf' indicates no path.
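A matrix-based breadth-first search can be sketched as follows. This is an illustrative base-R version under stated assumptions, not the package's code: the function name `geodesic_bfs()` is hypothetical, and it returns a dense matrix rather than a 'dgCMatrix'.

```r
# Hypothetical helper: all-pairs shortest path lengths on the symmetrized network
geodesic_bfs <- function(adj) {
  n <- nrow(adj)
  sym <- (adj + t(adj)) > 0   # symmetrize: a tie in either direction counts
  diag(sym) <- FALSE
  dist <- matrix(Inf, n, n)
  diag(dist) <- 0
  reached <- diag(n) > 0      # each node has reached itself
  frontier <- reached         # nodes discovered in the previous step
  step <- 0
  while (any(frontier)) {
    step <- step + 1
    # Neighbours of the current frontier that have not been reached yet
    nxt <- ((frontier * 1) %*% (sym * 1)) > 0 & !reached
    dist[nxt] <- step
    reached <- reached | nxt
    frontier <- nxt
  }
  dist
}

# Directed path 1 -> 2 -> 3 plus an isolated actor 4
adj <- matrix(0, 4, 4)
adj[1, 2] <- 1
adj[2, 3] <- 1
D <- geodesic_bfs(adj)
```

Each pass of the loop expands all frontiers at once with a single matrix product, so the number of passes is bounded by the longest finite geodesic plus one.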
edgewise_shared_partner_distribution()
Calculates the distribution of edgewise shared partners.
iglm.data_generator$edgewise_shared_partner_distribution( type = "ALL", value_range = NULL, prob = TRUE, plot = TRUE )
type(character) The type of shared partner matrix to use. See 'edgewise_shared_partner' for details. Default is '"ALL"'.
value_range(numeric vector) A vector 'c(min, max)' specifying the range of counts to tabulate. If 'NULL' (default), the range is inferred from the data.
prob(logical) If 'TRUE' (default), returns a probability distribution (proportions). If 'FALSE', returns raw counts.
plot(logical) If 'TRUE', plots the distribution.
A named vector (a 'table' object) with the distribution of shared partner counts.
dyadwise_shared_partner_distribution()
Calculates the distribution of dyadwise shared partners.
iglm.data_generator$dyadwise_shared_partner_distribution( type = "ALL", value_range = NULL, prob = TRUE, plot = TRUE )
type(character) The type of shared partner matrix to use. See 'dyadwise_shared_partner' for details. Default is '"ALL"'.
value_range(numeric vector) A vector 'c(min, max)' specifying the range of counts to tabulate. If 'NULL' (default), the range is inferred from the data.
prob(logical) If 'TRUE' (default), returns a probability distribution (proportions). If 'FALSE', returns raw counts.
plot(logical) If 'TRUE', plots the distribution.
A named vector (a 'table' object) with the distribution of shared partner counts.
degree_distribution()
Calculates the degree distribution of the 'z_network'.
iglm.data_generator$degree_distribution( value_range = NULL, prob = TRUE, plot = TRUE )
value_range(numeric vector) A vector 'c(min, max)' specifying the range of degrees to tabulate. If 'NULL' (default), the range is inferred from the data.
prob(logical) If 'TRUE' (default), returns a probability distribution (proportions). If 'FALSE', returns raw counts.
plot(logical) If 'TRUE', plots the degree distribution.
If the network is directed, a list containing two 'table' objects: 'in_degree' and 'out_degree'. If undirected, a single 'table' object with the degree distribution.
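The roles of 'value_range' and 'prob' can be illustrated with a small base-R sketch. The helper `degree_table()` is hypothetical, and the assumption that in- and out-degrees are column and row sums of the adjacency matrix is a convention chosen for this example.

```r
# Hypothetical helper: tabulate a degree sequence over a fixed range
degree_table <- function(deg, value_range = NULL, prob = TRUE) {
  if (is.null(value_range)) value_range <- range(deg)  # infer range from data
  lev <- seq(value_range[1], value_range[2])
  tab <- table(factor(deg, levels = lev))  # zero counts are kept for all levels
  if (prob) tab <- tab / sum(tab)          # proportions instead of raw counts
  tab
}

# Directed toy network with four actors
A <- matrix(0, 4, 4)
A[1, 2] <- 1
A[1, 3] <- 1
A[2, 3] <- 1
out_degree <- rowSums(A)  # ties sent
in_degree  <- colSums(A)  # ties received

dist <- list(
  in_degree  = degree_table(in_degree),
  out_degree = degree_table(out_degree, prob = FALSE)
)
```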
degree()
Calculates the degree sequence(s) of the 'z_network'.
iglm.data_generator$degree()
If the network is directed, a list containing two vectors: 'in_degree_seq' and 'out_degree_seq'. If undirected, a single list containing the vector 'degree_seq'.
spillover_degree_distribution()
Calculates the spillover degree distribution between actors with 'x_attribute == 1' and actors with 'y_attribute == 1'.
iglm.data_generator$spillover_degree_distribution( prob = TRUE, value_range = NULL, plot = TRUE )
prob(logical) If 'TRUE' (default), returns a probability distribution (proportions). If 'FALSE', returns raw counts.
value_range(numeric vector) A vector 'c(min, max)' specifying the range of degrees to tabulate. If 'NULL' (default), the range is inferred from the data.
plot(logical) If 'TRUE', plots the distributions.
A list containing two 'table' objects: 'out_spillover_degree' (from x_i=1 to y_j=1) and 'in_spillover_degree' (from y_i=1 to x_j=1).
plot()
Plot the network using 'igraph'.
Visualizes the 'z_network' using the 'igraph' package. Nodes can be colored by 'x_attribute' and sized by 'y_attribute'. 'neighborhood' edges can be plotted as a background layer.
iglm.data_generator$plot( node_color = "x", node_size = "y", show_overlap = TRUE, layout = igraph::layout_with_fr, network_edges_col = "grey60", neighborhood_edges_col = "orange", main = "", legend_col_n_levels = NULL, legend_size_n_levels = NULL, legend_pos = "right", alpha_neighborhood = 0.2, edge.width = 1, edge.arrow.size = 1, vertex.frame.width = 0.5, coords = NULL, legend_size = 0.5, ... )
node_color(character) Attribute to map to node color. One of '"x"' (default), '"y"', or '"none"'.
node_size(character) Attribute to map to node size. One of '"y"' (default), '"x"', or '"constant"'.
show_overlap(logical) If 'TRUE' (default), plot the 'neighborhood' edges as a background layer.
layoutAn 'igraph' layout function (e.g., 'igraph::layout_with_fr').
network_edges_col(character) Color for the 'z_network' edges.
neighborhood_edges_col(character) Color for the 'neighborhood' edges.
main(character) The main title for the plot.
legend_col_n_levels(integer) Number of levels for the color legend.
legend_size_n_levels(integer) Number of levels for the size legend.
legend_pos(character) Position of the legend (e.g., '"right"').
alpha_neighborhood(numeric) Alpha transparency for neighborhood edges.
edge.width(numeric) Width of the network edges.
edge.arrow.size(numeric) Size of the arrowheads for directed edges.
vertex.frame.width(numeric) Width of the vertex frame.
coords(matrix) Optional matrix of x-y coordinates for node layout.
legend_size(numeric) Scaling factor for the size legend.
...Additional arguments passed to 'plot.igraph'.
A list containing the 'igraph' object ('graph') and the layout coordinates ('coords'), invisibly.
print()
Print a summary of the 'iglm.data' object to the console.
iglm.data_generator$print(digits = 3, ...)
digits(integer) Number of digits to round numeric output to.
...Additional arguments (not used).
The object's private environment, invisibly.
clone()
The objects of this class are cloneable with this method.
iglm.data_generator$clone(deep = FALSE)
deepWhether to make a deep clone.
The 'iglm.object' class encapsulates all components required to define, estimate, and simulate from a generalized linear model under interference. This includes the model formula, coefficients, the underlying network and attribute data (via an 'iglm.data' object), sampler controls, estimation controls, and storage for results.
formula('formula') Read-only. The model formula specifying terms and data object.
coef('numeric') Read-only. The current vector of non-degrees coefficient estimates or initial values.
coef_degrees('numeric' or 'NULL') Read-only. The current vector of degrees coefficient estimates or initial values, or 'NULL' if not applicable.
results('results') Read-only. The results R6 object containing all estimation and simulation outputs.
iglm.data('iglm.data') The associated iglm.data R6 object containing the network and attribute data.
control('control.iglm') The control.iglm object specifying estimation parameters.
sampler('sampler.iglm') The sampler.iglm object specifying MCMC sampling parameters.
name('character') The name of the model.
sufficient_statistics('numeric') Read-only. A named vector of the observed network statistics corresponding to the model terms, calculated on the current 'iglm.data' data.
new()
Internal method to calculate the observed count statistics based on the model formula and the data in the 'iglm.data' object. Populates the 'private$.sufficient_statistics' field.
Internal validation method. Checks the consistency and validity of all components of the 'iglm.object'. Stops with an error if any check fails.
Creates a new 'iglm.object'. This involves parsing the formula, linking the data object, initializing coefficients, setting up sampler and control objects, calculating initial statistics, and validating.
iglm.object.generator$new( formula = NULL, coef = NULL, coef_degrees = NULL, sampler = NULL, control = NULL, name = NULL, file = NULL )
formulaA model 'formula' object. The left-hand side should be the
name of a iglm.data object available in the calling environment.
See iglm-terms for details on specifying the right-hand side terms.
coefA numeric vector of initial coefficients for the terms in the formula (excluding degree coefficients). If 'NULL', coefficients are initialized to zero.
coef_degreesAn optional numeric vector of initial degree coefficients. Should be 'NULL' if the formula does not include degree-correcting terms.
samplerA sampler.iglm object specifying the MCMC sampler
settings. If 'NULL', default settings are used.
controlA control.iglm object specifying estimation control
parameters. If 'NULL', default settings are used.
nameAn optional character string specifying a name for the model, which is used in plots and model assessment.
file(character or 'NULL') If provided, loads the object state from the specified .rds file instead of initializing from parameters.
A new 'iglm.object'.
is_equivalent()
Check if this iglm object is equivalent to another iglm object by comparing their defining features, data, and parameters.
iglm.object.generator$is_equivalent(other, tol = 1e-05, check_results = FALSE)
otherAnother object to compare against.
tolTolerance for numeric comparisons (default is 1e-5).
check_results(logical) If 'TRUE', also requires the estimation results and MCMC samples to match exactly. Default is 'FALSE' (only compares model specification, input data, and initial coefficients).
'TRUE' if the objects are equivalent, otherwise 'FALSE'.
assess()
Performs model assessment by calculating specified network statistics
on the observed network and comparing their distribution to the
distribution obtained from simulated networks based on the current
model parameters. Requires simulations to have been run first (via
the 'simulate()' or 'estimate()' method).
iglm.object.generator$assess(formula, plot = TRUE)
formulaA formula specifying the network statistics to assess
(e.g., '~ degree_distribution() + geodesic_distances_distribution()').
The terms should correspond to methods available in the iglm.data
object whose names end with '_distribution'.
If the term mcmc_diagnostics is included, MCMC diagnostics will also be computed.
plot(logical) If 'TRUE', generates plots comparing observed and simulated statistics. Default is 'TRUE'.
An object of class 'iglm_model_assessment' containing the observed statistics and the distribution of simulated statistics. The result is also stored internally.
print()
Print a summary of the 'iglm.object'. If estimation results are available, they are printed in a standard coefficient table format.
iglm.object.generator$print( digits = 3, rows = c(1, 2), signif.stars = getOption("show.signif.stars"), eps.Pvalue = 1e-04, print.formula = TRUE, print.fitinfo = TRUE, print.coefmat = TRUE, print.call = TRUE, ... )
digits(integer) Number of digits for rounding numeric output.
rowsIf a numeric vector with values between 1 and 5 is provided, only the corresponding columns are printed (1: Estimate, 2: S.E., 3: t-value, 4: Pr(>|t|), 5: Global Count of Sufficient Statistic). Default is 'c(1, 2)' to show only estimates and standard errors.
signif.stars(logical) If 'TRUE', prints significance stars for the coefficients. Default is 'getOption("show.signif.stars")'.
eps.Pvalue(numeric) Tolerance for small p-values. Default is '0.0001'.
print.formula(logical) If 'TRUE' (default), prints the model formula.
print.fitinfo(logical) If 'TRUE' (default), prints information about the estimation results.
print.coefmat(logical) If 'TRUE' (default), prints the coefficient table.
print.call(logical) If 'TRUE' (default), prints the call that generated the object.
...Additional arguments passed to printCoefmat.
plot()
Plot the estimation results, including coefficient convergence paths and model assessment diagnostics if available.
iglm.object.generator$plot( stats = FALSE, trace = FALSE, model_assessment = FALSE )
stats(logical) If 'TRUE', plot the observed vs. simulated statistics from model assessment. Default is 'FALSE'.
trace(logical) If 'TRUE', plot the coefficient convergence paths. Default is 'FALSE'.
model_assessment(logical) If 'TRUE', plot diagnostics from the model assessment (if already carried out). Default is 'FALSE'.
gather()
Gathers all components of the iglm.object into a single list for
easy saving or inspection.
iglm.object.generator$gather()
A list containing all key components of the iglm.object.
This includes the formula, coefficients, sampler, control settings,
preprocessing info, time taken for estimation, count statistics,
results, and the underlying iglm.data data object.
set_name()
Set the name of the iglm.object.
iglm.object.generator$set_name(name)
name(character) The name to assign to the object.
The name of the object as a character string.
set_control()
Set control parameters for model estimation.
iglm.object.generator$set_control(control)
controlA control.iglm object specifying new
control settings.
Invisibly returns 'NULL'.
save()
Save the iglm.object to a file in RDS format.
iglm.object.generator$save(file = NULL)
file(character) File path to save the object to (must be an .rds file).
Invisibly returns 'NULL'.
estimate()
Estimate the model parameters using the specified control settings. Stores the results internally and updates the coefficient fields.
iglm.object.generator$estimate()
If the control settings do not request preprocessed data, a list containing detailed estimation results, invisibly. The list includes the final coefficients, variance-covariance matrix, convergence path, Fisher information, score vector, log-likelihood, and any simulations performed during estimation. Otherwise, a list containing the requested preprocessed data (as a data.frame) and the time needed.
summary()
Provides a summary of the estimation results with the following columns: Estimate, SE, t-value, and Pr(>|t|). Requires the model to have been estimated first.
iglm.object.generator$summary(digits = 2, ...)
digits(integer) Number of digits for rounding numeric output.
...Additional arguments passed to printCoefmat.
Prints the summary to the console and returns 'NULL' invisibly.
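The four summary columns are related in the usual Wald fashion, which can be sketched in base R. The estimates, standard errors, and term names below are hypothetical, and the use of a normal reference distribution for the p-value is an assumption, not a statement about what the package computes.

```r
# Hypothetical estimates and standard errors (not produced by 'iglm')
est <- c(edges = -2.10, spillover_xy = 0.45)
se  <- c(edges = 0.30, spillover_xy = 0.25)

stat <- est / se               # Wald statistic: estimate divided by its S.E.
pval <- 2 * pnorm(-abs(stat))  # two-sided p-value, normal reference (assumption)

coef_table <- cbind(
  Estimate   = est,
  `S.E.`     = se,
  `t-value`  = stat,
  `Pr(>|t|)` = pval
)
printCoefmat(coef_table, digits = 2)
```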
simulate()
Simulate networks from the fitted model or a specified model. Stores
the simulations and/or summary statistics internally. The simulation
is carried out using the internal MCMC sampler described in simulate_iglm.
iglm.object.generator$simulate( only_stats = FALSE, display_progress = TRUE, offset_nonoverlap = 0 )
only_stats(logical) If 'TRUE', only calculate and store summary statistics for each simulation, discarding the network object itself. Default is 'FALSE'.
display_progress(logical) If 'TRUE' (default), display a progress bar during simulation.
offset_nonoverlap(numeric) Offset to apply for non-overlapping dyads during simulation (if applicable to the sampler). This option is useful if the sparsity of edges of units with non-overlapping neighborhoods is known. Default is 0.
A list containing the simulated networks ('samples', as a 'iglm.data.list' if 'only_stats = FALSE') and/or their summary statistics ('stats'), invisibly.
predict()
Calculates predicted values for the nodal covariates (x), the outcome variable (y),
and the network structure (z). The function supports two prediction modes:
marginal (based on Monte Carlo integration over simulated samples) and
conditional (based on the analytical linear predictor and point estimates).
iglm.object.generator$predict( variant = c("conditional", "marginal"), type = c("x", "y", "z") )
variantA character string specifying the type of prediction to generate. Must be one of:
"marginal": Computes predictions by aggregating over the MCMC samples stored
in the internal results. If samples do not exist, self$simulate() is triggered automatically.
"conditional": Computes predictions using the systematic component of the
Generalized Linear Model (GLM). It calculates the linear predictor
(plus offset and degrees terms for the network) and applies the inverse link function.
Defaults to c("conditional", "marginal").
typeA character vector indicating which components to predict. Options are:
"x": Nodal covariates.
"y": Nodal outcome variable.
"z": Dyadic network structure (interaction probabilities).
Defaults to c("x", "y", "z").
Marginal Predictions:
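When variant = "marginal", the function approximates the expected value via Monte Carlo integration:
$\mathbb{E}[Y] \approx \frac{1}{S} \sum_{s=1}^{S} y^{(s)}$,
where $y^{(s)}$ are the realized values from the $s$-th simulation sample (being either attribute x, y or the connections z) and $S$ is the number of simulated samples.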
For the network z, this results in a marginal edge probability matrix averaged over all sampled networks.
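The averaging over simulated networks can be sketched in base R. The simulated networks below are stand-ins drawn with independent edges purely for illustration; in practice they would come from the model's MCMC sampler. All object names are assumptions made for this example.

```r
set.seed(1)
n_actor <- 4
n_sim   <- 200

# Stand-in for simulated networks: independent edges with probability 0.3
samples <- lapply(seq_len(n_sim), function(s) {
  z <- matrix(rbinom(n_actor^2, size = 1, prob = 0.3), n_actor, n_actor)
  diag(z) <- 0  # no self-ties
  z
})

# Monte Carlo estimate: element-wise average over the simulated networks
marginal_z <- Reduce(`+`, samples) / n_sim
```

Each entry of `marginal_z` is the fraction of simulated networks in which the corresponding tie is present.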
Conditional Predictions:
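When variant = "conditional", the function calculates the theoretical mean $\mu$ based on the
estimated coefficients $\hat{\theta}$, with $\eta$ denoting the resulting linear predictor:
For Binomial families: $\mu = 1 / (1 + \exp(-\eta))$ (Logistic).
For Poisson families: $\mu = \exp(\eta)$ (Exponential).
For Gaussian families: $\mu = \eta$ (Identity).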
For the network component z, the linear predictor includes dyadic covariates,
degrees effects (sender/receiver variances), and structural offsets.
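The three inverse link functions named above are available through the family objects in base R's 'stats' package. The sketch below assumes the canonical link for each family and uses hypothetical linear predictor values; it does not call any 'iglm' function.

```r
eta <- c(-2, 0, 1.5)  # hypothetical linear predictor values

mu_binomial <- binomial()$linkinv(eta)  # logistic: 1 / (1 + exp(-eta))
mu_poisson  <- poisson()$linkinv(eta)   # exponential: exp(eta)
mu_gaussian <- gaussian()$linkinv(eta)  # identity: eta
```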
A list containing the requested predictions:
x, y
A matrix or data frame where the first column is the actor ID and subsequent columns represent the predicted mean values.
zA data frame containing the edgelist with columns: sender, receiver,
and prediction (probability or intensity).
The results are also invisibly stored in the internal state private$.results.
set_coefficients()
Manually set the model coefficients to new values. This is useful for sensitivity analyses or applying the model to different scenarios.
iglm.object.generator$set_coefficients(coef, coef_degrees = NULL)
coefA numeric vector of new coefficient values for the non-degrees terms.
coef_degreesA numeric vector of new coefficient values for the degrees terms, if applicable. Must be provided if the model includes degrees effects.
The iglm.object itself, invisibly.
get_samples()
Retrieve the simulated networks stored in the object.
Requires simulate or estimate to have been run first.
iglm.object.generator$get_samples()
A list of iglm.data objects representing
the simulated networks, invisibly. Returns an error if no samples
are available.
set_sampler()
Replace the internal MCMC sampler with a new one. This is useful for changing the sampling scheme without redefining the entire model.
iglm.object.generator$set_sampler(sampler)
samplerA sampler.iglm object.
The iglm.object itself, invisibly.
set_target()
Replace the internal 'iglm.data' data object with a new one. This is useful for applying a fitted model to new observed data. Recalculates count statistics and re-validates the object.
iglm.object.generator$set_target(x)
xAn iglm.data object containing the new observed data.
The iglm.object itself, invisibly.
clone()
The objects of this class are cloneable with this method.
iglm.object.generator$clone(deep = FALSE)
deepWhether to make a deep clone.
Fritz, C., Schweinberger, M., Bhadra, S., and D. R. Hunter (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, to appear.
Stewart, J. R. and M. Schweinberger (2025). Pseudo-Likelihood-Based M-Estimation of Random Graphs with Dependent Edges and Parameter Vectors of Increasing Dimension. Annals of Statistics, to appear.
Schweinberger, M. and M. S. Handcock (2015). Local dependence in random graph models: characterization, properties, and statistical inference. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 77, 647-676.
Creates a new instance of the 'results' R6 class. This class is designed to store various outputs from 'iglm' model estimation and simulation. Users typically do not need to call this constructor directly; it is used internally by the 'iglm_object'.
results(size_coef, size_coef_degrees, file = NULL)
size_coef |
(integer) The number of non-degrees coefficients the object should be initialized to accommodate. |
size_coef_degrees |
(integer) The number of degrees coefficients the object should be initialized to accommodate. |
file |
(character or NULL) Optional file path to load a previously saved 'results' object. If provided, the object will be initialized by loading from this file. |
An object of class 'results' (and 'R6'), initialized with empty or NA structures appropriately sized based on the input dimensions.
The 'results' class stores estimation ('$estimate()') and simulation ('$simulate()') results.
This class is primarily intended for internal use within the 'iglm' framework but provides structured access to the results via the active bindings of the main 'iglm_object'.
coefficients_path('matrix' or 'NULL') Read-only. The path of all estimated coefficients across iterations.
samples('list' or 'NULL') Read-only. A list of simulated 'iglm.data' objects (class 'iglm.data.list').
stats('matrix' or 'NULL') Read-only. Matrix of summary statistics for simulated samples, stored as an 'mcmc' object from 'coda'.
var('matrix' or 'NULL') Read-only. Estimated variance-covariance matrix for non-degrees coefficients.
fisher_degrees('matrix' or 'NULL') Read-only. Fisher information matrix for degrees coefficients.
fisher_nondegrees('matrix' or 'NULL') Read-only. Fisher information matrix for non-degrees coefficients.
score_degrees('numeric' or 'NULL') Read-only. Score vector for degrees coefficients.
score_nondegrees('numeric' or 'NULL') Read-only. Score vector for non-degrees coefficients.
llh('numeric' or 'NULL') Read-only. Vector of log-likelihood values recorded during estimation.
model_assessment('list' or 'NULL') Read-only. Results from model assessment (goodness-of-fit).
estimated('logical') Read-only. Flag indicating if estimation has been completed.
new()
Creates a new 'results' object. Initializes internal fields, primarily setting up an empty matrix for the 'coefficients_path' based on the expected number of coefficients.
results.generator$new(size_coef, size_coef_degrees, file)
size_coef(integer) The number of non-degrees (structural) coefficients in the model.
size_coef_degrees(integer) The number of degrees coefficients in the model (0 if none).
file(character or 'NULL') If provided, loads the results state from the specified .rds file instead of initializing from parameters.
A new 'results' object, initialized to hold results for a model with the specified dimensions.
set_model_assessment()
Stores the results object generated by a model assessment (goodness-of-fit) procedure within this 'results' container.
results.generator$set_model_assessment(res)
resAn object containing the model assessment results, expected to have the class 'iglm_model_assessment'.
The 'results' object itself ('self'), invisibly. Called for its side effect of storing the assessment results.
set_prediction()
Stores prediction results.
results.generator$set_prediction(prediction)
predictionAn object containing the prediction results (a list of class 'iglm.prediction').
gather()
Gathers the current state of the 'results' object into a list for saving or inspection. This includes all internal fields such as coefficient paths, samples, statistics, variance-covariance matrix, Fisher information, score vectors, log-likelihood values, model assessment results, and estimation status.
results.generator$gather()
A list containing all the internal fields of the 'results' object.
save()
Saves the current state of the 'results' object to a specified file path in RDS format. This allows for persisting the results for later retrieval and analysis.
results.generator$save(file)
file(character) The file path where the results state should be saved. Must be a valid character string. The file will be saved in RDS format, so it should end with '.rds'.
The 'results' object itself ('self'), invisibly.
resize()
Resizes the internal storage for the coefficient paths to accommodate a different number of coefficients. This is useful if the model structure changes and the results object needs to be reset.
results.generator$resize(size_coef, size_coef_degrees)
size_coef(integer) The new number of non-degrees coefficients.
size_coef_degrees(integer) The new number of degrees coefficients.
The 'results' object itself ('self'), invisibly.
update()
Updates the internal fields of the 'results' object with new outputs, typically after an estimation run ('$estimate()') or simulation run ('$simulate()'). Allows selectively updating components. Appends to 'coefficients_path' and 'llh' if called multiple times after estimation. Replaces 'samples' and 'stats'.
results.generator$update( coefficients_path = NULL, samples = NULL, var = NULL, fisher_degrees = NULL, fisher_nondegrees = NULL, score_degrees = NULL, score_nondegrees = NULL, llh = NULL, stats = NULL, estimated = FALSE )
coefficients_path(matrix) A matrix where rows represent iterations and columns represent all coefficients (non-degrees then degrees), showing their values during estimation. If provided, appends to any existing path.
samples(list) A list of simulated 'iglm.data' objects (class 'iglm.data.list'). If provided, replaces any existing samples.
var(matrix) The estimated variance-covariance matrix for the non-degrees coefficients. Replaces existing matrix.
fisher_degrees(matrix) The Fisher information matrix for degrees coefficients. Replaces existing matrix.
fisher_nondegrees(matrix) The Fisher information matrix for non-degrees coefficients. Replaces existing matrix.
score_degrees(numeric) The score vector for degrees coefficients. Replaces existing vector.
score_nondegrees(numeric) The score vector for non-degrees coefficients. Replaces existing vector.
llh(numeric) Log-likelihood value(s). If provided, appends to the existing vector of log-likelihoods.
stats(matrix) A matrix of summary statistics from simulations, where rows correspond to simulations and columns to statistics. Replaces or extends the existing matrix and will be turned into a mcmc object from the 'coda' package.
estimated(logical) A flag indicating whether these results come from a completed estimation run. Updates the internal status.
The 'results' object itself ('self'), invisibly. Called for its side effects.
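The append-versus-replace behaviour described above can be sketched with a plain list. This is an illustration under stated assumptions: `update_results()` and the list layout are hypothetical and only mimic the documented semantics for 'coefficients_path', 'llh', and 'stats'.

```r
# Hypothetical container mimicking the append/replace behaviour
res <- list(coefficients_path = NULL, llh = numeric(0), stats = NULL)

update_results <- function(res, coefficients_path = NULL, llh = NULL, stats = NULL) {
  if (!is.null(coefficients_path)) {
    # Append new iterations as additional rows
    res$coefficients_path <- rbind(res$coefficients_path, coefficients_path)
  }
  if (!is.null(llh)) res$llh <- c(res$llh, llh)  # append
  if (!is.null(stats)) res$stats <- stats        # replace
  res
}

# First run: two iterations with two coefficients
res <- update_results(res,
  coefficients_path = matrix(0.1, nrow = 2, ncol = 2),
  llh = c(-120, -110),
  stats = matrix(1, nrow = 4, ncol = 2)
)
# Second run: three more iterations, new simulation statistics
res <- update_results(res,
  coefficients_path = matrix(0.2, nrow = 3, ncol = 2),
  llh = c(-105, -104, -104),
  stats = matrix(2, nrow = 1, ncol = 2)
)
```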
remove_samples()
Clears the stored simulation samples ('.samples') and statistics ('.stats') from the object, resetting it to an empty list. This might be used to save memory or before running new simulations.
results.generator$remove_samples()
The 'results' object itself ('self'), invisibly.
plot()
Generates diagnostic plots for the estimation results. Currently plots:
The log-likelihood path across iterations.
The convergence paths for degrees coefficients (if present).
The convergence paths for non-degrees coefficients.
Optionally, can also trigger plotting of model assessment results if available.
results.generator$plot( trace = FALSE, stats = FALSE, model_assessment = FALSE, ... )
trace(logical) If 'TRUE', plot the trace plots of the estimation (log-likelihood and coefficient paths). Requires the model to be estimated. Default is 'FALSE'.
stats(logical) If 'TRUE', plots the normalized statistics from simulations. Default is 'FALSE'.
model_assessment(logical) If 'TRUE', attempts to plot the results stored in the '.model_assessment' field. Requires model assessment to have been run and a suitable 'plot' method for 'iglm_model_assessment' objects to exist. Default is 'FALSE'.
...Additional fits whose model assessments use identical terms can be supplied through this argument. The argument names are used as legend labels in the model assessment plots.
Requires estimation results ('private$.estimated == TRUE') to plot convergence diagnostics. Requires model assessment results for the model assessment plots.
print()
Prints a concise summary of the contents of the 'results' object, indicating whether various components (coefficients path, variance matrix, Fisher info, score, samples, stats, etc.) are available.
results.generator$print(...)
...Additional arguments (currently ignored).
The 'results' object itself ('self'), invisibly.
clone()
The objects of this class are cloneable with this method.
results.generator$clone(deep = FALSE)
deepWhether to make a deep clone.
Creates an object of class 'sampler.iglm' (and 'R6') which holds all parameters controlling the MCMC sampling process for 'iglm' models. This includes global settings like the number of simulations and burn-in, as well as references to specific samplers for the network ('z') and attribute ('x', 'y') components.
This function provides a convenient way to specify these settings before passing them to the 'iglm' constructor or simulation functions.
sampler.iglm( sampler_x = NULL, sampler_y = NULL, sampler_z = NULL, n_simulation = 100, n_burn_in = 10, init_empty = TRUE, seed = NA, cluster = NULL, file = NULL )
sampler_x |
An object of class 'sampler.net.attr' (created by 'sampler.net.attr()') specifying how to sample the 'x_attribute'. If 'NULL' (default), default 'sampler.net.attr()' settings are used. |
sampler_y |
An object of class 'sampler.net.attr' specifying how to sample the 'y_attribute'. If 'NULL' (default), default settings are used. |
sampler_z |
An object of class 'sampler.net.attr' specifying how to sample the 'z_network' ties *within* the defined neighborhood/overlap region. If 'NULL' (default), default settings are used. |
n_simulation |
(integer) The number of independent samples to generate after the burn-in period. Default: 100. Must be non-negative. |
n_burn_in |
(integer) The number of MCMC iterations to discard at the start for burn-in. Default: 10. Must be non-negative. |
init_empty |
(logical) If 'TRUE' (default), initialize the MCMC chain from an empty state. |
seed |
(integer or 'NA') A single integer seed set once before sampling begins to ensure reproducibility. If 'NA' (default), a random seed is generated automatically. |
cluster |
A parallel cluster object (e.g., from 'parallel::makeCluster()') for parallel simulations. If 'NULL' (default), simulations run sequentially. |
file |
(character or 'NULL') If provided, loads the sampler state from the specified .rds file instead of initializing from parameters. |
An object of class 'sampler.iglm' (and 'R6').
'sampler.net.attr', 'iglm', 'control.iglm'
n_actor <- 50
# Configure the sampler with component samplers scaled to the number of actors
sampler_new <- sampler.iglm(
  n_burn_in = 100, n_simulation = 10, seed = 42,
  sampler_x = sampler.net.attr(n_proposals = n_actor * 10),
  sampler_y = sampler.net.attr(n_proposals = n_actor * 10),
  sampler_z = sampler.net.attr(n_proposals = n_actor^2, tnt = TRUE),
  init_empty = FALSE
)
# Print the configuration and inspect the seed
sampler_new
sampler_new$seed
# Change the number of simulations after construction
sampler_new$set_n_simulation(100)
sampler_new$n_simulation
The 'sampler.iglm' class is an R6 container for specifying and storing
the parameters that control the MCMC (Markov Chain Monte Carlo) sampling
process used in iglm simulations and potentially during estimation.
It includes settings for the number of simulations, burn-in period,
initialization, and
parallelization options. It also holds references to component samplers
(sampler.net.attr objects) responsible for sampling individual parts
(attributes x, y, network z).
sampler_x: ('sampler_net_attr') The sampler configuration object for the x attribute.
sampler_y: ('sampler_net_attr') The sampler configuration object for the y attribute.
sampler_z: ('sampler_net_attr') The sampler configuration object for the z network (overlap region).
n_simulation: ('integer') The number of configurations to simulate.
n_burn_in: ('integer') The number of initial MCMC iterations to discard.
init_empty: ('logical') Whether to initialize simulations from an empty state.
seed: ('integer') Read-only. The random seed used for sampling.
cluster: ('cluster' object or 'NULL') The parallel cluster object being used, or 'NULL'.
new()
Create a new 'sampler.iglm' object. Initializes all sampler settings, using defaults for component samplers ('sampler.net.attr') if not provided, and validates inputs.
sampler.iglm.generator$new( sampler_x = NULL, sampler_y = NULL, sampler_z = NULL, n_simulation = 100, n_burn_in = 10, init_empty = TRUE, seed = NA, cluster = NULL, file = NULL )
sampler_x: An object of class 'sampler.net.attr' controlling sampling for the x attribute. If 'NULL', defaults from 'sampler.net.attr()' are used.
sampler_y: An object of class 'sampler.net.attr' controlling sampling for the y attribute. If 'NULL', defaults from 'sampler.net.attr()' are used.
sampler_z: An object of class 'sampler.net.attr' controlling sampling for the z network (within the defined neighborhood/overlap). If 'NULL', defaults from 'sampler.net.attr()' are used.
n_simulation: (integer) The number of network/attribute configurations to simulate and store after the burn-in period. Default is 100. Must be non-negative.
n_burn_in: (integer) The number of initial MCMC iterations to discard (burn-in) before starting to collect simulations. Default is 10. Must be non-negative.
init_empty: (logical) If 'TRUE' (default), the MCMC chain is initialized from an empty state (e.g., empty network, attributes at mean). If 'FALSE', initialization might depend on the specific sampler implementation (e.g., starting from observed data).
seed: (integer or 'NA') A single integer seed for the random number generator, set once before sampling begins. If 'NA' (default), a random seed is generated automatically.
cluster: A parallel cluster object (e.g., from the 'parallel' package) to use for running simulations in parallel. If 'NULL' (default), simulations are run sequentially.
file: (character or 'NULL') If provided, loads the sampler state from the specified .rds file instead of initializing from parameters.
A new 'sampler.iglm' object.
set_cluster()
Sets the parallel cluster object to be used for simulations.
sampler.iglm.generator$set_cluster(cluster)
cluster: A parallel cluster object from the 'parallel' package.
deactive_cluster()
Deactivates parallel processing for this sampler instance by setting the internal cluster object reference to 'NULL'.
sampler.iglm.generator$deactive_cluster()
The 'sampler.iglm' object itself ('self'), invisibly.
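The following is a minimal sketch of attaching and detaching a cluster, using only the methods documented above. It assumes the 'iglm' package is installed (the block is skipped otherwise); the two-worker cluster and the sampler settings are arbitrary illustrative choices.

```r
has_iglm <- requireNamespace("iglm", quietly = TRUE)
if (has_iglm) {
  library(iglm)
  # Illustrative cluster with two workers
  cl <- parallel::makeCluster(2)
  sampler_par <- sampler.iglm(n_simulation = 20, n_burn_in = 50, seed = 42)
  # Attach the cluster so that later simulations can run in parallel
  sampler_par$set_cluster(cl)
  sampler_par$cluster
  # Drop the reference again before shutting the workers down
  sampler_par$deactive_cluster()
  parallel::stopCluster(cl)
}
```

Detaching the cluster before calling 'parallel::stopCluster()' avoids leaving the sampler with a reference to workers that no longer exist.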
set_n_simulation()
Sets the number of simulations to generate after burn-in.
sampler.iglm.generator$set_n_simulation(n_simulation)
n_simulation: (integer) The number of simulations to set.
None.
set_n_burn_in()
Sets the number of burn-in iterations.
sampler.iglm.generator$set_n_burn_in(n_burn_in)
n_burn_in: (integer) The number of burn-in iterations to set.
None.
set_init_empty()
Sets whether to initialize simulations from an empty state.
sampler.iglm.generator$set_init_empty(init_empty)
init_empty: (logical) 'TRUE' to initialize from empty, 'FALSE' otherwise.
None.
set_x_sampler()
Sets the sampler configuration for the x attribute.
sampler.iglm.generator$set_x_sampler(sampler_x)
sampler_x: An object of class 'sampler_net_attr'.
None.
set_y_sampler()
Sets the sampler configuration for the y attribute.
sampler.iglm.generator$set_y_sampler(sampler_y)
sampler_y: An object of class 'sampler_net_attr'.
None.
set_z_sampler()
Sets the sampler configuration for the z network.
sampler.iglm.generator$set_z_sampler(sampler_z)
sampler_z: An object of class 'sampler_net_attr'.
None.
set_seed()
Sets the random seed for this sampler.
sampler.iglm.generator$set_seed(seed)
seed: (integer) The random seed to set.
None.
print()
Prints a formatted summary of the sampler configuration to the console.
sampler.iglm.generator$print(digits = 3, ...)
digits: (integer) Number of digits for formatting numeric values. Default: 3.
...: Additional arguments (currently ignored).
The 'sampler.iglm' object itself ('self'), invisibly.
gather()
Gathers all data from private fields into a list.
sampler.iglm.generator$gather()
A list containing all information of the sampler.
save()
Save the object's complete state to a directory. This will save the main sampler's settings to a file named 'sampler.iglm_state.rds' within the specified directory, and will also call the 'save()' method for each nested sampler (.x, .y, .z), saving them into the same directory.
sampler.iglm.generator$save(file)
file: (character) Path to the directory where the state files will be saved. The directory will be created if it does not exist.
The object itself, invisibly.
clone()
The objects of this class are cloneable with this method.
sampler.iglm.generator$clone(deep = FALSE)
deep: Whether to make a deep clone.
Creates an object of class 'sampler_net_attr' (and 'R6'). Specifies MCMC sampling parameters for one component (attribute or network) within the 'iglm' simulation framework. Used as input to 'sampler.iglm()'.
sampler.net.attr(n_proposals = 10000, file = NULL, tnt = TRUE)
n_proposals |
(integer) Number of MCMC proposals per sampling update. Default: 10000. |
file |
(character or 'NULL') If provided, loads state from an .rds file. |
tnt |
(logical) If 'TRUE' (default), use Tie-No-Tie sampling. |
An object of class 'sampler_net_attr' (and 'R6').
'sampler.iglm'
# Component sampler with default settings
sampler_comp_default <- sampler.net.attr()
sampler_comp_default
# Component sampler with more proposals and without Tie-No-Tie sampling
sampler_comp_custom <- sampler.net.attr(n_proposals = 50000, tnt = FALSE)
sampler_comp_custom
The 'sampler_net_attr' class is a simple R6 container used within the 'sampler.iglm' class. It holds the MCMC sampling parameters for a single component of the 'iglm' model, such as one attribute (e.g., 'x_attribute') or a part of the network (e.g., 'z_network' within the overlap). It stores the number of proposals and the TNT flag. The random seed is managed centrally by the parent 'sampler.iglm' object.
n_proposals: ('integer') Read-only. Number of MCMC proposals per step.
tnt: ('logical') Read-only. Whether TNT sampling is used.
new()
Create a new 'sampler_net_attr' object.
sampler.net.attr.generator$new(n_proposals = 10000, file = NULL, tnt = TRUE)
n_proposals: (integer) The number of MCMC proposals (iterations) to perform for this specific component during each sampling step. Default is 10000. Must be a non-negative integer.
file: (character or 'NULL') If provided, loads the sampler state from the specified .rds file instead of initializing from parameters.
tnt: (logical) If 'TRUE' (default), use Tie-No-Tie sampling (only relevant when the component is a network).
A new 'sampler_net_attr' object.
print()
Print a summary of the sampler settings for this component.
sampler.net.attr.generator$print(indent = " ")
indent: (character) Indentation string. Default is " ".
The object itself, invisibly.
gather()
Gathers all data into a list.
sampler.net.attr.generator$gather()
A list with 'n_proposals' and 'tnt'.
set_n_proposals()
Sets the number of MCMC proposals.
sampler.net.attr.generator$set_n_proposals(n_proposals)
n_proposals: (integer) Number of proposals.
set_tnt()
Sets whether to use TNT sampling.
sampler.net.attr.generator$set_tnt(tnt)
tnt: (logical) 'TRUE' to use TNT sampling.
save()
Save state to an .rds file.
sampler.net.attr.generator$save(file)
file: (character) File path.
The object itself, invisibly.
clone()
The objects of this class are cloneable with this method.
sampler.net.attr.generator$clone(deep = FALSE)
deep: Whether to make a deep clone.
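As a rough illustration of the setter, 'gather()', and 'save()' methods documented for this class, the sketch below changes a component sampler in place and round-trips it through an .rds file. It assumes the 'iglm' package is installed (the block is skipped otherwise) and that a file written by 'save()' can be read back through the 'file' argument of 'sampler.net.attr()', as the argument descriptions suggest; the proposal counts are arbitrary.

```r
has_iglm <- requireNamespace("iglm", quietly = TRUE)
if (has_iglm) {
  library(iglm)
  comp <- sampler.net.attr(n_proposals = 500, tnt = FALSE)
  # Update the settings in place
  comp$set_n_proposals(1000)
  comp$set_tnt(TRUE)
  # Collect the current settings as a plain list
  str(comp$gather())
  # Save to a temporary .rds file and restore from it
  path <- tempfile(fileext = ".rds")
  comp$save(file = path)
  comp_restored <- sampler.net.attr(file = path)
  comp_restored$n_proposals
  unlink(path)
}
```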
Simulate responses and connections.
simulate_iglm( formula, basis = NULL, coef, coef_degrees = NULL, sampler = NULL, only_stats = TRUE, display_progress = FALSE, offset_nonoverlap = 0, cluster = NULL, fix_x = FALSE, fix_z = FALSE )
formula |
A model 'formula' object. The left-hand side should be the
name of an 'iglm.data' object available in the calling environment.
See |
basis |
An optional 'iglm.data' object to serve as the basis for the simulation. If provided, the simulation starts from the state defined in this object. If 'NULL' (default), the initial state is taken from the 'iglm.data' object referenced in the 'formula'. |
coef |
Numeric vector containing the coefficient values for the structural (non-degrees) terms defined in the 'formula'. |
coef_degrees |
Numeric vector specifying the degrees coefficient values (expansiveness/attractiveness). This is required only if the 'formula' includes degrees terms. Its length must be 'n_actor' (for undirected networks) or '2 * n_actor' (for directed networks), where 'n_actor' is determined from the 'iglm.data' object in the formula. |
sampler |
An object of class 'sampler.iglm' (created by 'sampler.iglm()') specifying the MCMC sampling parameters. This includes the number of simulations ('n_simulation'), burn-in iterations ('n_burn_in'), initialization settings ('init_empty'), and component sampler settings ('sampler_x', 'sampler_y', etc.). If 'NULL' (default), default settings from 'sampler.iglm()' are used. |
only_stats |
(logical). If |
display_progress |
Logical. If 'TRUE', progress messages or a progress bar (depending on the backend implementation) are displayed during simulation. Default is 'FALSE'. |
offset_nonoverlap |
Numeric scalar value passed to the C++ simulator. This value is typically added to the linear predictor for dyads that are not part of the 'overlap' set defined in the 'iglm.data' object, potentially modifying tie probabilities outside the primary neighborhood. Default is '0'. |
cluster |
Optional parallel cluster object created, for example, by 'parallel::makeCluster()'. If provided and valid, the function performs a single burn-in simulation on the main R process, then distributes the remaining 'n_simulation' tasks across the cluster workers using 'parallel::parLapply()'. The master seed is offset for each worker to ensure different random streams. If 'NULL' (default), all simulations are run sequentially in the main R process. |
fix_x |
Logical. If 'TRUE', the simulation holds the 'x_attribute' fixed
at its initial state (from the |
fix_z |
Logical. If 'TRUE', the simulation holds the 'z_network' fixed
at its initial state (from the |
Parallel Execution: When a 'cluster' object is provided, the simulation process is adapted:
A single simulation run (including burn-in specified by 'sampler$n_burn_in') is performed on the master node to obtain a starting state for the parallel chains.
The total number of requested simulations ('sampler$n_simulation') is divided among the cluster workers.
'parallel::parLapply()' is used to run simulations on each worker. Each worker starts from the state obtained after the initial burn-in, performs zero additional burn-in ('n_burn_in = 0' passed to workers), and generates its assigned share of the simulations. Component sampler seeds are offset based on the worker ID to ensure pseudo-independent random number streams.
Results (simulated objects or statistics) from all workers are collected and combined.
This approach ensures that the initial burn-in phase happens only once, saving time.
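The four steps above can be sketched in base R with the 'parallel' package. This is only an illustration of the general pattern (one burn-in on the master, split draws, seed offset per worker, combine), not the package's internal code: 'step_fun' is a hypothetical stand-in for one sampler update, and the state is a single number rather than a network with attributes.

```r
library(parallel)

n_simulation <- 10   # total number of draws requested
n_burn_in <- 5       # burn-in iterations, run once on the master
master_seed <- 42
n_workers <- 2

# Hypothetical one-step update standing in for one sampler iteration
step_fun <- function(state) state + rnorm(1)

# Step 1: a single burn-in run on the master process
set.seed(master_seed)
state <- 0
for (i in seq_len(n_burn_in)) state <- step_fun(state)

# Step 2: divide the requested draws among the workers
chunk_sizes <- rep(n_simulation %/% n_workers, n_workers)
extra <- n_simulation %% n_workers
chunk_sizes[seq_len(extra)] <- chunk_sizes[seq_len(extra)] + 1

# Step 3: each worker starts from the burned-in state, does no further
# burn-in, and uses a seed offset by its worker ID
cl <- makeCluster(n_workers)
tasks <- lapply(seq_len(n_workers), function(w) list(id = w, n = chunk_sizes[w]))
results <- parLapply(cl, tasks, function(task, start, seed, step) {
  set.seed(seed + task$id)
  state <- start
  out <- numeric(task$n)
  for (i in seq_len(task$n)) {
    state <- step(state)
    out[i] <- state
  }
  out
}, start = state, seed = master_seed, step = step_fun)
stopCluster(cl)

# Step 4: collect and combine the results from all workers
draws <- unlist(results)
length(draws)
```

Passing the update function and the starting state as explicit arguments keeps the worker function self-contained, so nothing has to be exported to the workers beforehand.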
A list containing one or two components (depending on 'only_stats'):
If 'only_stats = FALSE', this is a list of length 'sampler$n_simulation' where each element is an 'iglm.data' object representing one simulated draw from the model. The list has the S3 class '"iglm.data.list"'. If 'only_stats = TRUE', this component is omitted.
A numeric matrix with 'sampler$n_simulation' rows and 'length(coef)' columns. Each row contains the features (corresponding to the model terms in 'formula') calculated for one simulation draw. Column names are set to match the term names.
The function stops with an error if:
The length of 'coef' does not match the number of terms derived from 'formula'.
'formula_preprocess' fails.
The 'sampler' object is not of class 'sampler.iglm'.
The C++ backend 'xyz_simulate_cpp' encounters an error.
Helper functions like 'XYZ_to_R' or 'is_cluster_active' are not found.
Warnings may be issued if default sampler settings are used.
iglm for creating the model object,
sampler.iglm for creating the sampler object,
iglm.data for the data object structure.
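A minimal usage sketch, assuming the 'iglm' package is installed (the block is skipped otherwise). The data object mirrors the small example used for 'statistics()'; the coefficient values are arbitrary placeholders, one per formula term, and are not estimates from any model.

```r
has_iglm <- requireNamespace("iglm", quietly = TRUE)
if (has_iglm) {
  library(iglm)
  set.seed(1)
  n_actor <- 10
  # Small illustrative data object with binary attributes and an empty network
  object <- iglm.data(
    z_network = matrix(0, nrow = n_actor, ncol = n_actor),
    x_attribute = rbinom(n_actor, 1, 0.5),
    y_attribute = rbinom(n_actor, 1, 0.5),
    neighborhood = matrix(1, nrow = n_actor, ncol = n_actor),
    directed = FALSE, type_x = "binomial", type_y = "binomial"
  )
  # Short chains so that the example runs quickly
  sampler_small <- sampler.iglm(
    n_burn_in = 10, n_simulation = 5, seed = 42,
    sampler_x = sampler.net.attr(n_proposals = n_actor * 10),
    sampler_y = sampler.net.attr(n_proposals = n_actor * 10),
    sampler_z = sampler.net.attr(n_proposals = n_actor^2)
  )
  sim <- simulate_iglm(
    object ~ edges(mode = "local") + attribute_y + attribute_x,
    coef = c(-1, 0, 0),   # arbitrary values, one per term
    sampler = sampler_small,
    only_stats = TRUE
  )
  # Inspect the top-level structure of the returned list
  str(sim, max.level = 1)
}
```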
This data object is derived from Twitter (X) interactions between U.S. state legislators and is a subset of the data analyzed in Fritz et al. (2025). The data are filtered to include only legislators from 10 states (NY, CA, TX, FL, IL, PA, OH, GA, NC, MI) and are further restricted to the largest connected component based on mention or retweet activity.
This object contains the main iglm.data object and 5
pre-computed dyadic covariates.
data(state_twitter)
A list object containing 6 components. Let N be the number of
legislators in the filtered 10-state subset.
An iglm.data object (which is also a list)
parameterized as follows:
x_attribute: A binary numeric vector of length N.
Value is 1 if the legislator's party is 'Republican',
0 otherwise.
y_attribute: A numeric vector of counts (Poisson type) of length N.
Represents the number of hate speech incidents
(actors_data$number_hatespeech) for each legislator.
z_network: A directed edgelist (2-column matrix) of
size n_edges x 2. A tie (i, j) exists if legislator
i either mentioned or retweeted legislator j.
neighborhood: A directed edgelist (2-column matrix).
Represents the follower network, where a tie (i, j) exists
if legislator i follows legislator j.
Self-loops (diagonal) are included.
An N x N matrix. matrix[i, j] = 1 if legislator i
and legislator j have the same gender, 0 otherwise.
An N x N matrix. matrix[i, j] = 1 if legislator i
and legislator j have the same race, 0 otherwise.
An N x N matrix. matrix[i, j] = 1 if legislator i
and legislator j are from the same state, 0 otherwise.
A 1 x N matrix (a row vector). matrix[1, i] = 1 if
legislator i is 'White', 0 otherwise.
A 1 x N matrix (a row vector). matrix[1, i] = 1 if
legislator i is 'female', 0 otherwise.
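To make the layout of these components concrete, here is a small base R sketch that builds objects of the same shape from made-up nodal attributes. The vectors 'gender' and 'state', the helper 'match_matrix()', and the toy follower matrix are hypothetical and are not taken from the data set.

```r
# Hypothetical attributes for N = 4 legislators
gender <- c("female", "male", "female", "male")
state <- c("NY", "NY", "CA", "TX")

# N x N matching matrix: entry [i, j] is 1 if i and j share the value
match_matrix <- function(v) outer(v, v, "==") * 1

same_gender <- match_matrix(gender)
same_state <- match_matrix(state)

# 1 x N row vector: entry [1, i] is 1 if legislator i is female
is_female <- matrix(as.numeric(gender == "female"), nrow = 1)

# Directed edgelist with self-loops, as used for the neighborhood
follows <- matrix(0, nrow = 4, ncol = 4)
follows[1, 2] <- 1
follows[3, 1] <- 1
diag(follows) <- 1
neighborhood <- which(follows == 1, arr.ind = TRUE)

same_gender
neighborhood
```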
Gopal, Kim, Nakka, Boehmke, Harden, and Desmarais. The National Network of U.S. State Legislators on Twitter. Political Science Research & Methods, forthcoming.
Kim, Nakka, Gopal, Desmarais, Mancinelli, Harden, Ko, and Boehmke (2022). Attention to the COVID-19 pandemic on Twitter: Partisan differences among U.S. state legislators. Legislative Studies Quarterly 47, 1023–1041.
Fritz, C., Schweinberger, M., Bhadra, S., and Hunter, D. R. (2025). A Regression Framework for Studying Relationships among Attributes under Network Interference. Journal of the American Statistical Association, to appear.
Computes the observed statistics (features) for the terms of a model formula.
statistics(formula)
formula |
A model 'formula' object. The left-hand side should be the
name of a |
A named numeric vector. Each element corresponds to a term in the
'formula', and its value is the calculated observed feature
for that term based on the data in the iglm.data object. The names of the
vector match the coefficient names derived from the formula terms.
# Create an iglm.data object
n_actor <- 10
neighborhood <- matrix(1, nrow = n_actor, ncol = n_actor)
type_x <- "binomial"
type_y <- "binomial"
x_attr_data <- rbinom(n_actor, 1, 0.5)
y_attr_data <- rbinom(n_actor, 1, 0.5)
z_net_data <- matrix(0, nrow = n_actor, ncol = n_actor)
object <- iglm.data(
  z_network = z_net_data, x_attribute = x_attr_data,
  y_attribute = y_attr_data, neighborhood = neighborhood,
  directed = FALSE, type_x = type_x, type_y = type_y
)
# Compute the observed statistic for each term in the formula
statistics(object ~ edges(mode = "local") + attribute_y + attribute_x)