Discerning Insights From Data

This is a blog about projects I work on to improve my programming and data analysis skills.

Hi, I'm Mariann! 👋

🚀 About Me

I am currently working towards a degree in Computer Science with Data Analytics. I previously worked in education and have taught in schools in the United States, Germany, South Korea and China. In addition to English, I speak German and survival Chinese. I am particularly interested in education, politics and issues related to social equity.

🔗 Links

🛠 Data Analysis Projects

I have completed several data analysis projects as part of my degree.

Fraud Detection: I used Python to clean and explore a data set comprising information about credit card customers and their purchases. I used Association Rules Mining and Cluster Analysis to create profiles of customers who were more likely to experience credit card fraud. I then used Naive Bayes and Decision Trees to look for a correlation between fraudulent purchases and location.
Health Predictions: I used Python to clean and explore the 2015 New York State Behavioral Risk Factor data. I designed an SQL database to store the data and queried it as needed for analysis. After exploring the data, I constructed research questions to guide the analysis. I then used Linear Regression, Naive Bayes and Hoeffding Trees to look for connections between arthritis severity, doctor’s advice and exercise.
Diabetic Correlations: Python was used to clean and explore data related to diabetes and hospital admissions. I designed research questions that could be answered using the given data. Linear Regression, Logistic Regression, Random Forest and SMO support vector machines were used to look at the connections between diabetes, medications and hospital readmissions.
Health Code Violations: Python was used to create a GUI which analyzed data and displayed visualizations showing the relationship between restaurant health code violations and zip code. MySQL was used to store and manipulate data.
Topic Modelling Podcasts: Python was used to clean and prepare podcast transcripts for analysis. Various topic modelling algorithms were used to explore which topics were most prevalent in podcasts. Different visualization methods, such as word clouds and graphs of the distance between topics, were used to investigate how topic modelling might be used to identify podcasts related to specific needs in education, advertising and horizon scanning. Topics were also used to support the creation of a podcast search engine.

In addition, I have completed one personal project and am working on a second one.

Making Movies Successful: I combined movie data from Kaggle with profit information obtained using the TMDB API. Python was used to clean and analyze the data. I found a connection between content ratings, genre and movie profits, as well as between the number of movies actors had previously appeared in and profits.
Use of Online Platforms In The Time of Covid: This is the project I am currently working on. I am looking to see how the use of online learning platforms changed during 2020. I am also investigating correlations between the use of online platforms and other factors such as school culture, the percentage of students receiving free and reduced lunch, and achievement on standardized tests. Python is used to clean the data and a MySQL database is used to store it. The data is analysed using statistics, linear regression and association rules mining.

Projects Using Java

Coin Sorter: I used Java, JavaFX and CSS to build a GUI which asked the user to enter an amount in cents. It then calculated the possible options for currency exchange as well as the number of bills and coins that the user could choose to have returned. This project also involved defining and implementing test cases to verify that the program worked as intended.
Client_Server: Java was used to simulate a Client and Server running the stop-and-wait and go-back-N network protocols.
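
To give a flavour of the calculation behind the Coin Sorter, here is a minimal sketch of breaking an amount in cents into bills and coins. It is written in Python for brevity rather than the Java used in the project, and it is not the project's actual code: the `DENOMINATIONS` table and the `break_down` function are hypothetical names chosen for illustration, and the US denominations are an assumption.

```python
# Hypothetical denominations in cents, largest first (assumed US currency;
# the denominations used in the actual project are not stated).
DENOMINATIONS = {
    "$1 bill": 100,
    "quarter": 25,
    "dime": 10,
    "nickel": 5,
    "penny": 1,
}


def break_down(cents):
    """Greedily split an amount in cents into bills and coins."""
    if cents < 0:
        raise ValueError("amount must be non-negative")
    result = {}
    remaining = cents
    # Dicts keep insertion order, so this walks from largest to smallest value.
    for name, value in DENOMINATIONS.items():
        count, remaining = divmod(remaining, value)
        if count:
            result[name] = count
    return result


if __name__ == "__main__":
    print(break_down(289))
```

Taking the largest denomination first gives the fewest pieces for this particular set of values. A different set of denominations might need a different approach.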

🛠 Skills

Java: object-oriented programming
Python: scikit-learn, pandas, NumPy, Matplotlib, spaCy, NLTK
SQL
Database design
Machine Learning: Clustering, Association Rules Mining, Classification, Topic Modelling, Choosing appropriate models
R: tidyverse, arules

MariannBea/MariannBea.github.io
