Married or in-union women of reproductive age who have their need for family planning satisfied with modern methods (%)
This repo documents the project's journey in building data engineering pipelines for careful data analysis.
How does access to modern family planning methods vary across different regions and socioeconomic groups?
- Sources: WHO datasets (WHO Indicators).
- Source link: https://www.who.int/data/gho/data/indicators/indicator-details/GHO/married-or-in-union-women-of-reproductive-age-who-have-their-need-for-family-planning-satisfied-with-modern-methods-(-)
- The project begins with data collection from the World Health Organization Relational Data Hub.
- I collected a dataset on women's maternal and reproductive health, specifically the need for family planning satisfied with modern methods (%).
- Ensuring the reliability and relevance of the data was paramount, as it forms the foundation for the depth and accuracy of the analysis.
- This stage focused on data cleaning.
- I prioritized data quality by addressing missing values, nulls, duplicates, and outliers, changing physical data types, and standardizing values with Python.
- This preparation ensures that the data aligns with the analysis goals.
- Modern Family Planning Data Cleaning and Transformation Notebook
- Cleaned Data CSV
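The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names `country` and `value`, the sample rows, and the rule that percentages outside 0-100 count as outliers are all assumptions made for the example.

```python
import pandas as pd

def clean_indicator(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps to a raw indicator table (hypothetical columns)."""
    out = df.copy()
    # Standardize text: trim surrounding whitespace from country names.
    out["country"] = out["country"].str.strip()
    # Change the physical type: percentages stored as text become floats;
    # entries that cannot be parsed become NaN instead of raising.
    out["value"] = pd.to_numeric(out["value"], errors="coerce")
    # Drop rows with a missing country or value, then exact duplicates.
    out = out.dropna(subset=["country", "value"]).drop_duplicates()
    # Assumed outlier rule: a percentage must lie between 0 and 100.
    out = out[out["value"].between(0, 100)]
    return out.reset_index(drop=True)

# Made-up rows that exercise each rule; these are not WHO figures.
raw = pd.DataFrame({
    "country": [" Country A", "Country A", "Country B", "Country B", None, "Country C"],
    "value": ["55.0", "55.0", "n/a", "140", "60", "72.5"],
})
cleaned = clean_indicator(raw)
print(cleaned)
```

Only the rows for Country A and Country C survive: the others are a duplicate, an unparseable value, an out-of-range value, and a row with no country.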
- Step three involves data transformation, where I shaped the data to fit the needs of the analysis.
- This includes normalization to ensure consistency and clarity in data representation, setting the stage for effective modeling.
- Data Normalization Notebook
- Normalized Tables
- Normalizing Period Ranges
Fact Table
This step involved crafting entity-relationship diagrams (ERDs), establishing connections between datasets in PostgreSQL, and assigning primary and foreign keys within each table.
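To illustrate how primary and foreign keys tie dimension tables to a fact table, here is a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so it can run without a database server. The table and column names (`region`, `country`, `fact_family_planning`) and the sample rows are hypothetical; the actual schema is the one documented in the ERD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE region (
    region_id   INTEGER PRIMARY KEY,
    region_name TEXT NOT NULL UNIQUE
);
CREATE TABLE country (
    country_id   INTEGER PRIMARY KEY,
    country_name TEXT NOT NULL UNIQUE,
    region_id    INTEGER NOT NULL REFERENCES region(region_id)
);
CREATE TABLE fact_family_planning (
    country_id INTEGER NOT NULL REFERENCES country(country_id),
    year       INTEGER NOT NULL,
    value_pct  REAL NOT NULL CHECK (value_pct BETWEEN 0 AND 100),
    PRIMARY KEY (country_id, year)
);
""")

# Made-up rows for illustration only.
conn.execute("INSERT INTO region VALUES (1, 'Region X')")
conn.execute("INSERT INTO country VALUES (1, 'Country A', 1)")
conn.execute("INSERT INTO fact_family_planning VALUES (1, 2023, 61.5)")

# A fact row pointing at a country that does not exist is rejected.
try:
    conn.execute("INSERT INTO fact_family_planning VALUES (99, 2023, 50.0)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

# Joining through the keys rebuilds the flat view used for analysis.
rows = conn.execute("""
    SELECT r.region_name, c.country_name, f.year, f.value_pct
    FROM fact_family_planning AS f
    JOIN country AS c ON c.country_id = f.country_id
    JOIN region  AS r ON r.region_id = c.region_id
""").fetchall()
print(rows)
```

The same `CREATE TABLE` statements should carry over to PostgreSQL with little change, although type names and constraint syntax may need adjusting.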
I then carried out exploratory data analysis using Python libraries and explored patterns in the cleaned datasets.
This phase unveils insights and prepares the data for meaningful visualizations.
This analysis shows, at the continent level, the percentage of married or in-union women who can access and use modern
family planning methods to control if and when they have children. The project also includes a geo heat map, provided as an HTML file along with a screenshot of the map.
Access to modern family planning methods varies significantly across regions and socioeconomic groups globally. In the Americas and South-East Asia,
access rates are notably higher, around 72% and 71% respectively. These higher rates may be due to better healthcare infrastructure, services, education, awareness, and availability of technology in those regions.
On the other hand, the Eastern Mediterranean has the lowest percentage at 50.3%. This lower rate is often attributed to cultural barriers and a lack of such facilities. These disparities show the varying levels of support and the challenges different regions face in providing family planning services.
Here are the leading countries in each region. In Africa, Zimbabwe leads with 84.8% of women
having their needs met. Egypt leads in the Eastern Mediterranean with 80%. Europe sees France at the forefront with 95.5%.
The Democratic People's Republic of Korea leads South-East Asia at 89.6%, and in the Western Pacific, China stands out with an impressive 96.6%.
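Regional averages and per-region leaders like those above can be derived from the cleaned table with a `groupby`. The sketch below assumes hypothetical `region`, `country`, and `value` columns and uses made-up numbers rather than the project's data.

```python
import pandas as pd

# Made-up values for illustration; not the WHO figures quoted above.
df = pd.DataFrame({
    "region": ["Region X", "Region X", "Region Y", "Region Y"],
    "country": ["Country A", "Country B", "Country C", "Country D"],
    "value": [60.0, 80.0, 40.0, 50.0],
})

# Mean percentage per region, highest first.
regional_mean = df.groupby("region")["value"].mean().sort_values(ascending=False)

# idxmax returns the row label of the highest value in each region;
# selecting those labels yields one leading country per region.
leaders = df.loc[df.groupby("region")["value"].idxmax(), ["region", "country", "value"]]

print(regional_mean)
print(leaders)
```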
This step involves data visualization with an interactive geographical heat map,
where I transformed complex findings into clear, insightful visual representations.
This ensures that the results are not only understood but also actionable for stakeholders.
Important
Ultimately, the data journey concludes with interpreting the results and weaving them into meaningful conclusions. Through this approach, I ensure that my analysis not only addresses the initial problems but also adds unexpected value to business requirements.
- csv
- os
- matplotlib
- pandas
- pyplot
- numpy
- seaborn
- geopandas
- folium
- time
- selenium (webdriver)
- IPython.display (Image)
Note
Data Flow:
- Data sourced from WHO -> Processed in Jupyter Notebook -> Stored and retrieved from a SQL database.
- Schema Diagram: Detailed in the Engineering_ERD folder.
Tools Used:
- Storage: SQL database for organized data storage and retrieval.
- Processing: Jupyter Notebook (Modern Family Planning Data Cleaning and Transformation.ipynb) for data manipulation and analysis.
Analytical Use Cases
- Access Disparities: Analyzing regional and socioeconomic variations in access to family planning.
Demonstration
- Jupyter Notebook: Demonstrates data retrieval and visualization.
- Visuals: Include geo heat maps and a line graph.
Assumptions:
- When a study period spanned two years (e.g., 2022-2023), it is assumed that the results correspond to 12 months and reflect the latest year (2023).
- The datasets were broken down into 3-year intervals from 2003 to 2023 to allow consistent analysis over time.
- The study covers married and in-union women of reproductive age, assumed to be 15-49 years.
- The same data collection method is assumed across countries.
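The period assumptions above can be sketched as two small helpers. This is an illustration only: the function names are hypothetical, and the exact interval boundaries (2003-2005, 2006-2008, and so on up to 2021-2023) are an assumption based on "3-year intervals from 2003 to 2023".

```python
import re

def period_to_year(period: str) -> int:
    """Return the latest four-digit year found in a period label."""
    years = re.findall(r"\d{4}", period)
    if not years:
        raise ValueError(f"no year found in {period!r}")
    return max(int(y) for y in years)

def three_year_bin(year: int, start: int = 2003) -> str:
    """Label the 3-year interval containing `year`, counting from `start`."""
    if year < start:
        raise ValueError(f"{year} is before the first interval ({start})")
    lower = start + 3 * ((year - start) // 3)
    return f"{lower}-{lower + 2}"

# A two-year period collapses to its latest year, then falls into a bin.
year = period_to_year("2022-2023")
print(year, three_year_bin(year))
```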
Limitations:
- There are more indicators that could have been analyzed to contribute to the overall hypothesis. We focused on 4 key indicators due to time constraints.
- Period data was not standardized across datasets. Some assumptions were needed to standardize it and make the datasets fully comparable.
Ethical Considerations:
- Ensuring the confidentiality and ethical use of data.
- Addressing biases inherent in data collection methods.
Future Work Scope:
- Extended Analysis: Incorporate more indicators for a comprehensive view.
- Data Integration: Enhance the database with additional sources and real-time data.
- Interactive Dashboards: Develop more interactive visualization tools for dynamic data exploration.
- Please refer to the Word file for a summary of the findings.
Folder Structure:
- Extracted Folders: Contains all exported datasets and analysis results.
- Engineering_ERD: ERD for schema and SQL database export.
- Project_Analysis: Findings and summary documents.
How to Run:
- Environment Setup: Ensure you have Python and Jupyter Notebook installed.
- Dependencies: Install the required libraries: numpy, pandas, matplotlib, seaborn.
- Run Notebook: Open .ipynb in Jupyter Notebook and run the cells sequentially.
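Before opening the notebook, a quick check like the one below can confirm that the required libraries are importable. It is a convenience sketch rather than part of the project; the helper name `missing_packages` is hypothetical, and the list covers only the four libraries named in the Dependencies step.

```python
from importlib.util import find_spec

REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Anything reported as missing can be installed with your usual package manager before running the cells.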