Text analysis 1
XiangyunHuang committed Jan 20, 2024
1 parent 695fe4a commit d041127
Showing 18 changed files with 1,237 additions and 114 deletions.
8 changes: 6 additions & 2 deletions .github/workflows/quarto-book-macos.yaml
@@ -11,15 +11,15 @@ jobs:
 runs-on: macos-13
 env:
 GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
-CMDSTAN_VERSION: "2.33.1"
+CMDSTAN_VERSION: "2.34.0"
 RETICULATE_PYTHON_ENV: /opt/.virtualenvs/r-tensorflow
 steps:
 - uses: actions/checkout@v4

 - name: Install Quarto
 uses: quarto-dev/quarto-actions/setup@v2
 with:
-version: 1.4.527
+version: 1.4.545

 - uses: r-lib/actions/setup-r@v2
 with:
@@ -60,6 +60,8 @@ jobs:
 virtualenv -p /usr/bin/python3 $RETICULATE_PYTHON_ENV
 source $RETICULATE_PYTHON_ENV/bin/activate
 pip3 install -r docker/requirements.txt
+python -m spacy download en_core_web_sm
+python -m spacy download zh_core_web_sm
 deactivate
 - name: Install LaTeX packages
@@ -74,6 +76,8 @@ jobs:
 - name: Reinstall R packages from source
 run: |
 install.packages(c("Matrix", "MatrixModels", "rjags", "lme4", "TMB", "glmmTMB"), repos = "https://cran.r-project.org/", type = "source")
+if(!require('remotes')) install.packages('remotes')
+remotes::install_github("tidyverse/ggplot2", ref = remotes::github_pull("5592"))
 shell: Rscript {0}

 - name: Render Book
8 changes: 6 additions & 2 deletions .github/workflows/quarto-book-ubuntu.yaml
@@ -11,15 +11,15 @@ jobs:
 runs-on: ubuntu-22.04
 env:
 GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
-CMDSTAN_VERSION: "2.33.1"
+CMDSTAN_VERSION: "2.34.0"
 RETICULATE_PYTHON_ENV: /opt/.virtualenvs/r-tensorflow
 steps:
 - uses: actions/checkout@v4

 - name: Install Quarto
 uses: quarto-dev/quarto-actions/setup@v2
 with:
-version: 1.4.527
+version: 1.4.545

 - uses: r-lib/actions/setup-r@v2
 with:
@@ -42,6 +42,8 @@ jobs:
 virtualenv -p /usr/bin/python3 $RETICULATE_PYTHON_ENV
 source $RETICULATE_PYTHON_ENV/bin/activate
 pip3 install -r docker/requirements.txt
+python -m spacy download en_core_web_sm
+python -m spacy download zh_core_web_sm
 deactivate
 - name: Setup CmdStan
@@ -71,6 +73,8 @@ jobs:
 - name: Reinstall R packages from source
 run: |
 install.packages(c("Matrix", "MatrixModels", "rjags", "lme4", "TMB", "glmmTMB"), repos = "https://cran.r-project.org/", type = "source")
+if(!require('remotes')) install.packages('remotes')
+remotes::install_github("tidyverse/ggplot2", ref = remotes::github_pull("5592"))
 shell: Rscript {0}

 - name: Setup magick
3 changes: 3 additions & 0 deletions DESCRIPTION
@@ -72,6 +72,7 @@ Imports:
 HistData,
 hglm,
 INLA (>= 23.9.9),
+jiebaR,
 knitr (>= 1.44),
 lars,
 latticeExtra,
@@ -116,12 +117,14 @@ Imports:
 scs,
 sf (>= 1.0.9),
 showtext,
+spacyr,
 spaMM,
 spdep,
 splancs,
 spatstat,
 stars (>= 0.6.0),
 tensorflow,
+text2vec,
 tidycensus,
 tidygraph,
 titanic,
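
The three packages added to `Imports` (jiebaR, spacyr, text2vec) cover Chinese word segmentation, a spaCy interface, and text vectorization. The commit does not show how the book uses them, so the following is only a minimal sketch, assuming jiebaR and text2vec are installed. The two sample sentences and the object names `docs`, `seg`, `tokens`, `it`, `vocab`, and `dtm` are hypothetical. It segments Chinese text with jiebaR and builds a document-term matrix with text2vec.

```r
library(jiebaR)
library(text2vec)

# Two made-up sample documents (hypothetical, not taken from the book).
docs <- c("我喜欢统计建模", "我喜欢文本分析")

# Create a jiebaR segmentation worker with its default dictionary.
seg <- worker()

# Segment each document into a character vector of words.
tokens <- lapply(docs, function(d) segment(d, seg))

# text2vec accepts a list of pre-tokenized documents.
it <- itoken(tokens, progressbar = FALSE)

# Build the vocabulary, then a sparse document-term matrix (one row per document).
vocab <- create_vocabulary(it)
dtm <- create_dtm(it, vocab_vectorizer(vocab))

dim(dtm)
```

Segmenting first and passing the token list to `itoken()` keeps the tokenizer choice separate from the vectorization step, so a different segmenter could be swapped in without changing the rest.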
10 changes: 5 additions & 5 deletions README.md
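@@ -23,12 +23,12 @@
 - Categorical data analysis
 - Data modeling
 - Network analysis (collaboration network of R community developers)
-- Time series analysis (modeling the volatility of Meituan stock returns
-- Spatial analysis (predicting the distribution of nuclear radiation intensity
+- Time series analysis (risk modeling of Meituan stock returns
+- Spatial analysis (predicting the spatial distribution of nuclear radiation intensity
 - Optimization modeling
-- Statistical computing
-- Numerical optimization
-- Optimization problems
+- Statistical computing (the relationship between statistical models and optimization problems)
+- Numerical optimization (linear, nonlinear, constrained, and unconstrained)
+- Optimization problems (the TSP, portfolio problems, and others)
 - Bayesian modeling
 - Probabilistic inference framework
 - Generalized linear models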
4 changes: 2 additions & 2 deletions analyze-areal-data.qmd
@@ -1,4 +1,4 @@
-# Areal data analysis {#sec-analyze-areal-data}
+# Spatial areal data analysis {#sec-analyze-areal-data}

 ## Analysis of the Scottish lip cancer data {#sec-scotland-lip-cancer}

@@ -21,7 +21,7 @@

 The response variable follows a Poisson distribution.

-- BYM-INLA [@blangiardo2013;@moraga2020]
+- BYM-INLA [@blangiardo2013; @moraga2020]
 - BYM-Stan [@morris2019; @donegan2022; @cabral2022]

 The data record the number of lip cancer cases in 56 districts of Scotland from 1975 to 1986, aggregated by district.
7 changes: 4 additions & 3 deletions analyze-spatial-data.qmd
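@@ -1,12 +1,12 @@
-# Predicting the distribution of nuclear radiation intensity {#sec-nuclear-pollution-concentration}
+# Analysis of point-referenced spatial data {#sec-nuclear-pollution-concentration}

 ```{r}
 #| echo: false
 source("_common.R")
 ```

-This chapter belongs to spatial analysis, a very broad field; it gives only a brief introduction using one model and one dataset. The model is the spatial generalized linear mixed model, which is widely used in epidemiology, ecology, environmental science, and related fields, for example to predict the distribution of malaria prevalence or of PM 2.5 concentration within a region. The dataset comes from ecology: the sample size is small, but each sample is costly to collect, sampling follows an experimental design, and the sampling locations are fixed in advance. The rest of the chapter analyzes and models the real data, with the task of predicting the distribution of nuclear radiation intensity on Rongelap Island.
+This chapter belongs to spatial analysis, a very broad field with three main parts: analysis of point-referenced spatial data, spatial point pattern analysis, and analysis of spatial areal data. This chapter gives only a brief introduction to the analysis of point-referenced spatial data, using one model and one dataset. The model is the spatial generalized linear mixed model, which is widely used in epidemiology, ecology, environmental science, and related fields, for example to predict the distribution of malaria prevalence or of PM 2.5 concentration within a region. The dataset comes from ecology: the sample size is small, but each sample is costly to collect, sampling follows an experimental design, and the sampling locations are fixed in advance. The rest of the chapter analyzes and models the real data, with the task of predicting the distribution of nuclear radiation intensity on Rongelap Island.

 ## Data description {#sec-rongelap-data}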

@@ -900,7 +900,8 @@ p3 <- ggplot() +
 theme_bw() +
 labs(x = "横坐标(米)", y = "纵坐标(米)", fill = "辐射强度") +
 theme(
-legend.position = c(0.75, 0.1),
+legend.position = "inside",
+legend.position.inside = c(0.75, 0.1),
 legend.direction = "horizontal",
 legend.background = element_blank()
 )
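
The hunk above replaces a numeric `legend.position` with the pair `legend.position = "inside"` and `legend.position.inside`, which is available in ggplot2 3.5.0 and later. This is presumably why the workflows install ggplot2 from GitHub, although the commit itself does not say so. The following is a minimal sketch, assuming ggplot2 >= 3.5.0 is installed. The data frame `grid` and its columns are hypothetical stand-ins for the book's prediction grid, not the Rongelap data.

```r
library(ggplot2)

# Toy raster data (hypothetical stand-in for the prediction grid).
grid <- expand.grid(x = 1:20, y = 1:20)
grid$value <- with(grid, sin(x / 4) + cos(y / 4))

p <- ggplot(grid, aes(x = x, y = y, fill = value)) +
  geom_raster() +
  theme_bw() +
  theme(
    # "inside" places the legend within the plotting panel ...
    legend.position = "inside",
    # ... at these relative coordinates (0 to 1 along each axis).
    legend.position.inside = c(0.75, 0.1),
    legend.direction = "horizontal",
    legend.background = element_blank()
  )

print(p)
```

Splitting the keyword from the coordinates keeps `legend.position` a plain string in every case, while the numeric placement gets its own theme element.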