Found 5 bookmarks
Newest
autodb: Automatic Database Normalisation for Data Frames
autodb: Automatic Database Normalisation for Data Frames
Automatic normalisation of a data frame to third normal form, with the intention of easing the process of data cleaning. (Usage to design your actual database for you is not advised.) Originally inspired by the 'AutoNormalize' library for 'Python' by 'Alteryx' (<a href="https://github.com/alteryx/autonormalize" target="_top"https://github.com/alteryx/autonormalize/a>), with various changes and improvements. Automatic discovery of functional or approximate dependencies, normalisation based on those, and plotting of the resulting "database" via 'Graphviz', with options to exclude some attributes at discovery time, or remove discovered dependencies at normalisation time.
·cran.r-project.org·
autodb: Automatic Database Normalisation for Data Frames
Data Modeling - Relational Databases (SQL) vs Data Lake (File Based) - Confessions of a Data Guy
Data Modeling - Relational Databases (SQL) vs Data Lake (File Based) - Confessions of a Data Guy
Data Modeling is a topic that never goes away. Sometimes I do reminisce about the good ol’ days of Kimball-style data models, it was so simple, straightforward, just the same thing for years. Then Big Data happened, Spark happened. Things just changed. There is a lot of new content coming out around Data Lakes and […]
·confessionsofadataguy.com·
Data Modeling - Relational Databases (SQL) vs Data Lake (File Based) - Confessions of a Data Guy