This project generates a synthetic dataset by sampling from several statistical distributions: Normal, Uniform, Exponential, Random Integers, and Binomial. Placing the columns side by side shows how each distribution shapes the data it produces.
The dataset is designed for educational purposes, offering a practical example of how to generate and analyze random data.
- Data Sources: Data is generated using Python libraries such as NumPy and Pandas.
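- Distributions:
  - Normal Distribution: Simulates continuous data with a Gaussian distribution.
  - Uniform Distribution: Provides values spread evenly within a specified range.
  - Exponential Distribution: Models the waiting time between independent events that occur at a constant average rate.
  - Random Integers: Simulates discrete values.
  - Binomial Distribution: Counts successes over a fixed number of binary (success/failure) trials; with a single trial, each value is itself a binary outcome.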
- Statistics: Descriptive statistics including mean, median, and standard deviation are computed.
- Visualizations: Histograms are created to observe the distribution patterns.
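The features above can be sketched in a few lines of NumPy and Pandas. This is a minimal illustration, not the code in App.py: the function name `generate_dataset`, the sample size, the seed, and every distribution parameter (mean 0 and standard deviation 1, range 0 to 1, scale 1, integers 0 to 99, 10 trials with success probability 0.5) are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd


def generate_dataset(n_rows: int = 1000, seed: int = 42) -> pd.DataFrame:
    """Build a synthetic dataset with one column per distribution.

    All parameters here are illustrative assumptions, not the project's values.
    """
    rng = np.random.default_rng(seed)  # seeded generator for reproducible output
    return pd.DataFrame({
        "Normal Distribution": rng.normal(loc=0.0, scale=1.0, size=n_rows),
        "Uniform Distribution": rng.uniform(low=0.0, high=1.0, size=n_rows),
        "Exponential Distribution": rng.exponential(scale=1.0, size=n_rows),
        # `high` is exclusive, so this draws integers from 0 to 99.
        "Random Integers": rng.integers(low=0, high=100, size=n_rows),
        # Each value is the number of successes in 10 trials.
        "Binomial Distribution": rng.binomial(n=10, p=0.5, size=n_rows),
    })


df = generate_dataset()

# Descriptive statistics: one row per statistic, one column per distribution.
summary = df.agg(["mean", "median", "std"])
print(summary)
```

Fixing the seed keeps the output reproducible between runs, which helps when comparing the summary statistics against the theoretical values for each distribution.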
- Python: For data generation and analysis.
- NumPy: For numerical operations and random data generation.
- Pandas: For data manipulation and analysis.
- Matplotlib: For plotting visualizations.
- Seaborn: For enhanced data visualization.
The generated dataset includes the following columns:
- Normal Distribution: Values drawn from a Gaussian distribution.
- Uniform Distribution: Values uniformly distributed between specified limits.
- Exponential Distribution: Values following an exponential distribution.
- Random Integers: Integer values within a specified range.
- Binomial Distribution: Success counts drawn from a binomial distribution, where each count summarizes a fixed number of binary trials.
The project includes histograms for each type of distribution:
- Normal Distribution Histogram: Shows the distribution of values from the Gaussian distribution.
- Uniform Distribution Histogram: Displays the range and frequency of uniformly distributed values.
- Exponential Distribution Histogram: Illustrates the spread of values from the exponential distribution.
- Random Integers Histogram: Visualizes the frequency of discrete integer values.
- Binomial Distribution Histogram: Shows how often each success count occurs across the binary trials.
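As a rough sketch of how such histograms might be drawn with Matplotlib, the example below plots one continuous and one discrete column. The sample data, bin choices, and figure size are assumptions for illustration; the project's own plotting code may differ (for example, it may use Seaborn).

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
samples = {
    "Normal Distribution": rng.normal(size=1000),
    "Binomial Distribution": rng.binomial(n=10, p=0.5, size=1000),
}

fig, axes = plt.subplots(1, len(samples), figsize=(10, 4))
for ax, (name, values) in zip(axes, samples.items()):
    if np.issubdtype(values.dtype, np.integer):
        # Discrete data: one bin centered on each integer value.
        bins = np.arange(values.min(), values.max() + 2) - 0.5
    else:
        # Continuous data: a fixed number of equal-width bins.
        bins = 30
    ax.hist(values, bins=bins, edgecolor="black")
    ax.set_title(name)
    ax.set_xlabel("Value")
    ax.set_ylabel("Frequency")
fig.tight_layout()
```

Centering one bin on each integer avoids the uneven bars that equal-width bins can produce for discrete columns. Calling `fig.savefig(...)` or, with an interactive backend, `plt.show()` would render the result.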
- Run the Script: Execute App.py to generate the dataset and visualizations.
- Explore Visualizations: Use the Streamlit interface to select columns and view histograms.
- Download Data: Use the download button to save the generated dataset as a CSV file.
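For the download step, the dataset has to be serialized to CSV before a widget such as Streamlit's `st.download_button` can serve it. The sketch below shows only that serialization with Pandas, on a small made-up DataFrame; how App.py actually wires up the button is not shown in this README, so treat the details as assumptions.

```python
import io

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Normal Distribution": rng.normal(size=5),
    "Random Integers": rng.integers(0, 100, size=5),
})

# Serialize without the index column and encode to bytes,
# which is a form a download widget can accept as its payload.
csv_bytes = df.to_csv(index=False).encode("utf-8")

# Reading the bytes back confirms the export preserves the data.
restored = pd.read_csv(io.BytesIO(csv_bytes))
print(restored.head())
```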
- Install the necessary Python libraries:
pip install -r requirements.txt
- Distribution Patterns: Analyze how different statistical distributions generate data with varying patterns.
- Data Analysis: Utilize the generated dataset for educational purposes, testing, and further analysis.
This project is licensed under the MIT License - see the LICENSE file for details.
- LinkedIn: Profile
- Contact: Sunny Bibyan