data-exp-vis-2020/Dataformats at master · sdam-elte/data-exp-vis-2020

Datafile-formats.ipynb
File_formats.md
Large_data.md
Large_datasets.ipynb
Pandas_Vtypes.ipynb
Readme.md
generate_timeseries_data.ipynb

Readme.md

How to store data?

Depending on the type and size of the data, there are several formats and libraries to choose from. Other factors also matter when choosing how to store data:

- how fragmented the information stored in the data is
- the amount of metadata
- how the data needs to be accessed (sequentially or randomly)
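
To make the difference between sequential and random access concrete, here is a minimal sketch using only the Python standard library. The record layout (an int32 id followed by a float64 value), the file name and the helper functions are all hypothetical and chosen for illustration. Random access works here only because every record has the same size, so the byte offset of record `i` can be computed directly.

```python
import os
import struct
import tempfile

# Hypothetical fixed-size record: little-endian int32 id + float64 value.
RECORD = struct.Struct("<id")


def write_records(path, n):
    """Write n records (i, i * 0.5) back to back into a binary file."""
    with open(path, "wb") as f:
        for i in range(n):
            f.write(RECORD.pack(i, i * 0.5))


def read_sequential(path, index):
    """Read records one by one from the start until `index` is reached."""
    with open(path, "rb") as f:
        for _ in range(index + 1):
            raw = f.read(RECORD.size)
    return RECORD.unpack(raw)


def read_random(path, index):
    """Jump directly to record `index` using its computed byte offset."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, 1000)
seq = read_sequential(path, 750)
rnd = read_random(path, 750)
print(seq, rnd)
```

Both functions return the same record, but the sequential version reads every record before the requested one, while the random-access version reads only the bytes it needs. Text formats with variable-length lines generally do not allow the second approach without an additional index.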

For smaller datasets (< 1 GB), all of the data can be read into memory, so any operation on it is cheap and the emphasis is on how the data is logically organized. For larger datasets, we need to optimize for fast access and build functions around the storage that reconstruct the logical structure.
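
The contrast between the two regimes can be sketched as follows. This is an illustration under assumptions, not a recipe from the course material: the CSV layout (columns `t` and `value`), the file name and the function names are hypothetical. The first function loads every row before computing, which is fine for small files; the second keeps only a running total, so its memory use does not grow with the file size.

```python
import csv
import os
import tempfile


def write_sample_csv(path, n_rows):
    """Create a hypothetical time series file with columns t and value."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "value"])
        for t in range(n_rows):
            writer.writerow([t, t % 10])


def mean_in_memory(path):
    """Small-data approach: load all rows, then compute the mean."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    return sum(float(r["value"]) for r in rows) / len(rows)


def mean_streaming(path):
    """Large-data approach: read row by row, keeping only running sums."""
    total, count = 0.0, 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += float(row["value"])
            count += 1
    return total / count


csv_path = os.path.join(tempfile.mkdtemp(), "series.csv")
write_sample_csv(csv_path, 1000)
print(mean_in_memory(csv_path), mean_streaming(csv_path))
```

Both functions give the same result; the streaming version trades convenience (no random access to earlier rows) for a constant memory footprint.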

One way to store data is to create a relational database with a database management system (DBMS), e.g. MySQL. This saves space by exploiting the relations within the data and minimizing redundancy. A DBMS can also serve multiple users at the same time. A relational DBMS offers little benefit, however, if the data has no relations to exploit (e.g. a plain time series).
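
As a minimal sketch of how relations reduce redundancy, the example below uses SQLite through Python's built-in `sqlite3` module instead of MySQL, purely so that it runs without a server. The `station` and `measurement` tables and their contents are hypothetical. The station's name and city are stored once and referenced by id from each measurement, rather than being repeated on every row.

```python
import sqlite3

# In-memory SQLite database; a MySQL setup would need a running server.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE station (
        id   INTEGER PRIMARY KEY,
        name TEXT,
        city TEXT
    );
    CREATE TABLE measurement (
        station_id  INTEGER REFERENCES station(id),
        day         TEXT,
        temperature REAL
    );
    """
)

# The station metadata is stored exactly once...
conn.execute("INSERT INTO station VALUES (1, 'Station A', 'Budapest')")

# ...and each measurement only carries the station id.
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [(1, "2020-01-01", 1.5), (1, "2020-01-02", 2.5)],
)

# A join reconstructs the full, denormalized view on demand.
rows = conn.execute(
    """
    SELECT s.name, m.day, m.temperature
    FROM measurement AS m
    JOIN station AS s ON s.id = m.station_id
    ORDER BY m.day
    """
).fetchall()
station_count = conn.execute("SELECT COUNT(*) FROM station").fetchone()[0]
print(rows, station_count)
```

With many measurements per station, the saving comes from not repeating the name and city on every row; the join rebuilds the combined view only when it is needed.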

Topics

File formats
- Historical overview of file formats
- Advantages and disadvantages of various file formats
- Algorithms for searching for data
Open science
- Version control systems
- Publishing with code and data
Handling large datasets
