Commit
Merge branch 'refactor-network.py' of github.com:dsi-clinic/2024-winter-climate-cabinet-campaign-finance-tracker into refactor-network.py
Showing 6 changed files with 100 additions and 80 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,9 +5,9 @@ | |
1. Collect: Gather key states' political campaign finance report data which should include recipient information, donor information, and transaction information. | ||
2. Transform: Define database schema for storing transaction and entity information and write code to transform and validate raw data to fit appropriate schema. | ||
3. Clean: Perform record linkage and fix likely data entry errors. | ||
4. Classify: Label all entities as fossil fuel, clean energy, or other | ||
5. Graph: Construct a network graph of campaign finance contributions | ||
6. Analyze: Perform analysis on network data and join with other relevant dataset | ||
4. Classify: Label all entities as fossil fuel, clean energy, or other. | ||
5. Graph: Construct a network graph of campaign finance contributions with micro-level and macro-level views. | ||
6. Analyze: Perform analysis on network data and join with other relevant datasets. | ||
|
||
|
||
## Setup | ||
|
@@ -32,24 +32,19 @@ For developing, please use either a Docker dev container or slurm computer clust | |
|
||
### Network Visualization | ||
|
||
# TODO: #101 document what we want to see in the visualization and decide how many types of visual are needed | ||
|
||
The network visualizations and their associated metrics are housed in the `\output` directory, specifically in [this](https://github.com/dsi-clinic/2024-winter-climate-cabinet-campaign-finance-tracker/tree/main/output/network_graphs) folder. The approaches adopted for these visuals are described in [this](https://github.com/dsi-clinic/2024-winter-climate-cabinet-campaign-finance-tracker/blob/main/output/network_graphs/README.md) document. | ||
|
||
## Repository Structure | ||
|
||
### utils | ||
Project python code | ||
Project python code. | ||
|
||
### notebooks | ||
Contains short, clean notebooks to demonstrate analysis. | ||
Contains short, clean notebooks to demonstrate analysis. This is a dynamic folder with notebooks added/removed as per current working processes. | ||
|
||
### data | ||
|
||
Contains details of acquiring all raw data used in repository. If data is small (<50MB) then it is okay to save it to the repo, making sure to clearly document how the data is obtained. | ||
|
||
If the data is larger than 50MB, then you should not add it to the repo and instead document how to get the data in the README.md file in the data directory. | ||
|
||
This [README.md file](/data/README.md) should be kept up to date. | ||
Contains details of acquiring all raw data used in repository. | ||
|
||
### output | ||
This folder is empty by default. The final outputs of make commands will be placed here by default. | ||
|
@@ -73,7 +68,7 @@ Student Email: [email protected] | |
Student Name: Yangge Xu | ||
Student Email: [email protected] | ||
|
||
Student Name: Bhavya Pandey | ||
Student Name: Bhavya Pandey | ||
Student Email: [email protected] | ||
|
||
Student Name: Kaya Lee | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,7 @@ | ||
# Output README | ||
# Output | ||
--- | ||
'deduplicated_UUIDs.csv' : Following record linkage work in the record_linkage pipeline, this file stores all the original UUIDs and indicates the UUIDs to which the deduplicated UUIDs have been matched. | ||
`deduplicated_UUIDs.csv` : Following record linkage work in the record_linkage pipeline, this file stores all the original UUIDs and indicates the UUIDs to which the deduplicated UUIDs have been matched. | ||
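As a rough illustration of how such a mapping file could be consumed, the sketch below loads a UUID mapping and resolves each UUID to its matched counterpart. The column names `original_uuid` and `matched_uuid`, the sample rows, and both helper functions are assumptions made for this example; the actual layout of `deduplicated_UUIDs.csv` may differ.

```python
import csv
import io

# Hypothetical file contents: the real column names may differ.
SAMPLE_CSV = """original_uuid,matched_uuid
a1,a1
a2,a1
b1,b1
"""


def load_uuid_map(csv_file):
    """Return {original_uuid: matched_uuid} from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return {row["original_uuid"]: row["matched_uuid"] for row in reader}


def canonical_uuid(uuid, uuid_map):
    """Resolve a UUID to its deduplicated match, falling back to itself."""
    return uuid_map.get(uuid, uuid)


uuid_map = load_uuid_map(io.StringIO(SAMPLE_CSV))
# "a2" was matched to "a1"; "zz" is absent from the map and is left unchanged.
resolved = [canonical_uuid(u, uuid_map) for u in ["a2", "b1", "zz"]]
```

Falling back to the input UUID keeps records that never went through record linkage usable without special handling.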
|
||
'network_metrics.txt' : Following the network graph creation, this file stores summary metrics about the network, including the 50 nodes of highest centrality (in-degree, out-degree, eigenvector, and betweenness), density, assortativity based on classification, and clustering. | ||
`network_metrics.txt` : Following the network graph creation, this file stores summary metrics about the network, including the 50 nodes of highest centrality (in-degree, out-degree, eigenvector, and betweenness), density, assortativity based on classification, and clustering. | ||
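To make two of these metrics concrete, here is a minimal standard-library sketch that computes density and the top nodes by in-degree and out-degree centrality for a small directed donor-to-recipient graph. The edge list and the `graph_summary` function are hypothetical, and the repository's own code may compute these metrics differently, for example with a graph library. Eigenvector centrality, betweenness, assortativity, and clustering are not covered here.

```python
from collections import Counter

# Hypothetical directed edges: (donor, recipient).
edges = [
    ("donor_a", "candidate_x"),
    ("donor_b", "candidate_x"),
    ("donor_a", "candidate_y"),
]


def graph_summary(edges, top_n=50):
    """Return density and the top_n nodes by in/out-degree centrality."""
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    unique_edges = set(edges)

    # Density of a directed graph without self-loops: m / (n * (n - 1)).
    density = len(unique_edges) / (n * (n - 1)) if n > 1 else 0.0

    in_deg = Counter(dst for _, dst in unique_edges)
    out_deg = Counter(src for src, _ in unique_edges)

    # Degree centrality normalizes each degree by the maximum possible, n - 1.
    scale = n - 1 if n > 1 else 1
    in_centrality = {node: in_deg[node] / scale for node in nodes}
    out_centrality = {node: out_deg[node] / scale for node in nodes}

    def rank(scores):
        # Sort by descending score, breaking ties by node name.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

    return {
        "density": density,
        "top_in": rank(in_centrality),
        "top_out": rank(out_centrality),
    }


summary = graph_summary(edges)
```

With `top_n=50`, the ranked lists correspond to the "50 nodes of highest centrality" for the two degree-based measures.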
|
||
This folder gets populated with output files upon running the `make` commands. The final network visualization graph outputs and metrics are housed here. |