From 596670194951c97d790380d50a7c9e5d65215573 Mon Sep 17 00:00:00 2001 From: "Pavel N. Krivitsky" Date: Wed, 6 Nov 2024 16:56:40 +1100 Subject: [PATCH 1/2] More ergm vignette fixes. --- vignettes/ergm.Rmd | 62 +++++++++++++++------------------------------- 1 file changed, 20 insertions(+), 42 deletions(-) diff --git a/vignettes/ergm.Rmd b/vignettes/ergm.Rmd index e6a61884..344d2b2d 100644 --- a/vignettes/ergm.Rmd +++ b/vignettes/ergm.Rmd @@ -103,12 +103,10 @@ where $p$ is the number of terms in the model. From this one can more easily ob The statistics $g(y)$ can be thought of as the "covariates" in the model. In the network modeling context, these represent network features like density, homophily, triads, etc. In one sense, they are like covariates you might use in other statistical models. But they are different in one important respect: these $g(y)$ statistics are functions of the network itself -- each is defined by the frequency of a specific configuration of dyads observed in the network -- so they are not measured by a question you include in a survey (e.g., the income of a node), but instead need to be computed on the specific network you have, after you have collected the data. As a result, every term in an ERGM must have an associated algorithm for computing its value for your network. The `ergm` package in `statnet` includes about 150 term-computing algorithms. We will explore some of these terms in this -tutorial, and links to more information are provided in -[section 3](#model-terms-available-for-ergm-estimation-and-simulation). - +tutorial. You can get the list of all available terms, and the syntax for using them, by typing: ```{r, eval=FALSE} -ergmTerm +? ergmTerm ``` and you can look up help for a specific term, say, `edges`, by typing: ```{r, eval=FALSE} @@ -133,6 +131,23 @@ One key distinction in model terms is worth keeping in mind: terms are either _ An overview and discussion of many of these terms can be found in @MoHa08s. +#### Coding new `ergm` terms + +There is a `statnet` package --- `ergm.userterms` --- +that facilitates the writing of new +`ergm` terms. The package is available [on GitHub](https://github.com/statnet/ergm.userterms), and installing it will +include the tutorial (ergmuserterms.pdf). The tutorial can +also be found in @HuGo13e, +and some introductory slides and installation instructions from the workshop +we teach on coding `ergm` terms can be found +[here](https://statnet.org/workshops/). For the most recent API available for implementing terms, see the Terms API vignette. + +Note that writing up new `ergm` terms requires some knowledge of +C and the ability +to build R from source. While the latter is covered in the tutorial, +the many environments for building R and the rapid changes in +these environments make these instructions obsolete quickly. + #### ERGM probabilities: at the tie-level The ERGM expression for the probability of the entire graph shown above can be re-expressed in terms of the conditional log-odds of a single tie between two actors: @@ -475,43 +490,6 @@ It's a small difference in this case (and a small network, with little missing d MORAL: If you have missing data on ties, be sure to identify them by assigning the "NA" code. This is particularly important if you're reading in data as an edgelist, as all dyads without edges are implicitly set to "0" in this case. - -## 3. Model terms available for *ergm* estimation and simulation - -Model terms are the expressions (e.g. "triangle") -used to represent predictors on the right-hand size of equations used -in: - -* calls to `summary` (to obtain measurements of network statistics -on a dataset) -* calls to `ergm` (to estimate an ergm model) -* calls to `simulate` (to simulate networks from an ergm model -fit) - -Because these terms are not exogeneous measures, but functions of -the dyad states in the network, they must be calculated for -the network that is being modeled. -Many ERGM terms are simple counts of configurations (e.g., edges, nodal degrees, stars, triangles), but others are more complex functions of these configurations (e.g., geometrically weighted degrees and shared partners). In theory, any configuration (or function of configurations) can be a term in an ERGM. In practice, however, these terms have to be constructed before they can be used---that is, one has to explicitly write an algorithm that defines and calculates the network statistic of interest. This is another key way that ERGMs differ from traditional linear and general linear models. - -The terms that can be used in a model also depend on the type of network being analyzed: directed or undirected, one-mode or two-mode ("bipartite"), binary or valued edges. - - -### Coding new `ergm` terms - -There is a `statnet` package --- `ergm.userterms` --- -that facilitates the writing of new -`ergm` terms. The package is available [on GitHub](https://github.com/statnet/ergm.userterms), and installing it will -include the tutorial (ergmuserterms.pdf). The tutorial can -also be found in @HuGo13e, -and some introductory slides and installation instructions from the workshop -we teach on coding `ergm` terms can be found -[here](https://statnet.org/workshops/). For the most recent API available for implementing terms, see the Terms API vignette. - -Note that writing up new `ergm` terms requires some knowledge of -C and the ability -to build R from source. While the latter is covered in the tutorial, -the many environments for building R and the rapid changes in -these environments make these instructions obsolete quickly. ## 4. Assessing convergence for dyad dependent models: MCMC Diagnostics @@ -630,7 +608,7 @@ never produce an interesting network with this density -- this is what we call "model degneracy." For more detailed discussion of model degeneracy in the ERGM context, -see the papers by Mark Handcock referenced [below.](References) +see @Ha03a, @SnPa06n, and @Sc11i. In that worst case scenario, we end up not being able to obtain coefficent estimates, so we can't use the GOF function to identify how the model simulations deviate from the observed data. We can, however, still use the MCMC diagnostics to observe what is happening with the simulation algorithm, and this (plus some experience and intuition about the behavior of `ergm` terms) can help us improve the model specification. From ad155409323c34f2f0087d91138fa460699b7579 Mon Sep 17 00:00:00 2001 From: "Pavel N. Krivitsky" Date: Wed, 6 Nov 2024 16:58:47 +1100 Subject: [PATCH 2/2] Bumped version in NEWS. --- DESCRIPTION | 2 +- inst/NEWS.Rd | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index a1b7492d..03666fb7 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,5 +1,5 @@ Package: ergm -Version: 4.7-7458 +Version: 4.7.4-7458 Date: 2024-11-04 Title: Fit, Simulate and Diagnose Exponential-Family Models for Networks Authors@R: c( diff --git a/inst/NEWS.Rd b/inst/NEWS.Rd index 108ea48f..95ee9d87 100644 --- a/inst/NEWS.Rd +++ b/inst/NEWS.Rd @@ -66,7 +66,7 @@ -\section{Changes in version 4.7.3}{ +\section{Changes in version 4.7.4}{ \subsection{NEW FEATURES}{ \itemize{