A Statistician’s R Notebook - Creating Professional Excel Reports with R: A Comprehensive Guide to openxlsx Package
GitPodcast - Repository to Podcast in Seconds
Turn any GitHub repository into an engaging podcast in seconds.
Josiah Parry - Implementing OpenID Connect (OIDC) in R
Problem Statement
S7: a new OO system for R.
Prototypes and sizes
August 2024 Student Housing Update
As of August, 92.8% of beds at the core 175 universities tracked by RealPage were leased for the Fall 2024 semester.
2024-U.S.-Student-Housing-Preleasing-Report.pdf
LIDA | LIDA: Automated Visualizations with LLMs
LIDA is a tool to automatically explore data, generate visualizations and infographics from data using large language models like ChatGPT and GPT4
Explore Your Knowledge Base
Build Elegant R shiny Apps with New ‘Card’ Features Using {card.pro}
Build Robust R shiny Apps with Elegant and Highly Customizable ‘Card’ Features Using card.pro R package
Web Tool
A table in my model records building valuations over time. Is it a slowly-changing dimension table or a fact table?
I'm building a data model for a report that allows users to analyze building valuations over time, and details about buildings and their current leases.
I have a fact table that contains leasing
You have two fact tables that differ only in terms of granularity. Your Fact_Leases table, for example, is a fact table at the granularity of a lease. I can assume this quite safely because it appears the Lease ID column is a primary key. Each row of that table represents a lease.
On the other hand, your ?_Valuations table is a fact table at the granularity of quarter-time-building. That is, each row represents not only a building but also a quarterly time period. One way to tell that this is a fact table: if you had a date-dimension table, you could relate the two on their Quarter columns (although it would be a many-to-many relationship). Your date-DIMENSION table would then be describing the facts of your valuations. (I'd recommend, however, replacing your Quarter column with actual dates and letting the date-dimension table inform the quarters. That's an aside, though.)
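The aside about storing actual dates and letting the date dimension supply the quarter can be sketched roughly as follows. This is a minimal Python illustration, not Power BI or DAX code; the function name `build_date_dimension`, the column names, and the sample valuation dates are all hypothetical.

```python
from datetime import date, timedelta


def build_date_dimension(start, end):
    """Return one row per calendar day, each carrying its derived quarter label."""
    rows = []
    day = start
    while day <= end:
        quarter = (day.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        rows.append({"date": day, "year": day.year, "quarter": f"{day.year}-Q{quarter}"})
        day += timedelta(days=1)
    return rows


# Hypothetical date dimension covering one year.
dim_date = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))

# Hypothetical valuation rows keyed by an actual date instead of a quarter label.
valuation_dates = [date(2024, 3, 31), date(2024, 6, 30)]

# The quarter is looked up from the dimension rather than stored on the fact row.
quarter_by_date = {row["date"]: row["quarter"] for row in dim_date}
labels = [quarter_by_date[d] for d in valuation_dates]
print(labels)
```

With a date key on the fact table, each valuation row matches exactly one date-dimension row, so the relationship becomes one-to-many instead of many-to-many.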
Now, the problem of repeating valuation metrics occurs because you are trying to combine two fact tables at different levels of granularity. When you try to apply the valuations to the Fact_Leases table, which is at the granularity of lease, Power BI (or any BI tool, for that matter) can't understand how to apportion the valuation at the BUILDING level down to the LEASE level of granularity. So it just repeats. And it's important to keep this in mind when developing your reporting. No visualizations built at the context level of lease will be able to include a valuation metric because valuations exist only at a higher level of granularity.
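The repetition described above can be reproduced outside Power BI. The following is a small sketch using Python's built-in `sqlite3` module; the table names, column names, and figures are invented for illustration and are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_leases (lease_id INTEGER PRIMARY KEY, building_id INTEGER, annual_rent REAL);
CREATE TABLE fact_valuations (building_id INTEGER, quarter TEXT, valuation REAL);
-- Building 10 has two leases, building 20 has one (made-up data).
INSERT INTO fact_leases VALUES (1, 10, 50000), (2, 10, 70000), (3, 20, 40000);
INSERT INTO fact_valuations VALUES (10, '2024-Q1', 1000000), (20, '2024-Q1', 600000);
""")

# Joining at lease grain repeats building 10's valuation once per lease.
joined_total = conn.execute("""
    SELECT SUM(v.valuation)
    FROM fact_leases AS l
    JOIN fact_valuations AS v ON v.building_id = l.building_id
    WHERE v.quarter = '2024-Q1'
""").fetchone()[0]

# Aggregating the valuation table at its own grain gives the intended total.
true_total = conn.execute(
    "SELECT SUM(valuation) FROM fact_valuations WHERE quarter = '2024-Q1'"
).fetchone()[0]

print(joined_total)  # 2600000.0 (building 10 counted twice)
print(true_total)    # 1600000.0
```

The join inflates the total because building 10's valuation appears on both of its lease rows, which is why valuation measures are better evaluated at building level or above.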
Some of the more useful Tidyverse functions
R functions for every data engineer using Tidyverse. Tidyverse has long been an amazing collection of R packages, primarily for data engineering and data science. Common among these packages is the …
Build a Docker Image from a Directory or Project
Simple utilities to generate a Dockerfile from a directory or project, build the corresponding Docker image, and push the image to DockerHub.
UNCHARTED DATA: Automating Workflows with GitHub Actions
How to automate data collection and app deployment with GitHub Actions.
Create .Renviron file
Within the get_data.R script of my repository, I extract my EIA API key from my R environment so that I can connect to the EIA API and pull the data needed for my project. In order for this to occur during my workflow, I need to create an .Renviron file within my virtual environment and store the key within that environment.
- name: Create and populate .Renviron file
  run: |
    echo EIA_API_KEY="$EIA_API_KEY" >> ~/.Renviron
  shell: bash
UNCHARTED DATA: Interactive Tooltip Tables
How to include tables in your {ggiraph} tooltips.
UNCHARTED DATA: Introducing the {reactablefmtr} Package
An R package created to make the styling and customization of {reactable} tables easier.
Plotly
UNCHARTED DATA: Using Crosstalk to Add User-Interactivity
Linking an interactive plot and table together with the crosstalk package.
Using Crosstalk to Add User-Interactivity
The goal is to link the reactable table I created to a plotly chart and provide additional filter options that control both the table and the chart.
An important note: in order to use crosstalk, you must create a shared dataset and call that dataset within both plotly and reactable. Otherwise, the plot and the table will not communicate and filter with each other. The code to do this is SharedData$new(dataset).
If you expand the code below, you’ll see that the code to build a table in reactable is quite extensive. I will not go into the details in this post, but do recommend a couple great tutorials that I used to create the interactive table such as this tutorial from Greg Lin, and this from Tom Mock which really helped me understand how to use CSS and Google fonts to enhance the visual appeal of the table (see the “Additional CSS Used for Table” section below for more info).
If you have ever built something in Shiny before, you’ll notice that the crosstalk filters are very similar. You can add a filter to any existing column in the dataset. As you can see in the code below, I used a mixture of filter_checkbox and filter_select depending on how many unique options were available in the column being filtered. My rule of thumb is that if there are more than five options to choose from, it’s probably better to put them into a list in filter_select, as I did with the Division filter, so as not to take up too much space on the page.
For the layout of the data visualization, I used bscols to place the crosstalk filters side-by-side with the interactive plotly chart.
I then placed the reactable table underneath and added a legend to the table using tags from the htmltools package.
The final result is shown below. Feel free to click around the filters and you will notice that both the plot and the table filter accordingly. Another option is to click and drag on the plot, and you will see the table underneath update to match the teams shown.
starschemar: Obtaining Star Schemas from Flat Tables
starschemar
6 Ways Gen AI is improving Data Modelling
GDPR and your right to be deleted
mgramin/awesome-db-tools: Everything that makes working with databases easier
Everything that makes working with databases easier - mgramin/awesome-db-tools
Declarative vs Versioned Workflows | Atlas | Manage your database schema as code
This section introduces two types of workflows that are supported by Atlas
Schema Change Management Tools
Here's a brief history of database schema migration and how modern, open-source solutions can be used so both Devs and Ops can work less and accomplish more.
Yoyo database migrations — yoyo-migrations 9.0.0.dev0 documentation
python-blog/2024/06 - June/postgres_pydantic at main · fbaptiste/python-blog
about_Ref - PowerShell
Describes how to create and use a reference type variable. You can use reference type variables to permit a function to change the value of a variable that is passed to it.
coolbutuseless/yyjsonr: Fast JSON package for R
Fast JSON package for R.
FreeApi.app
A free resource to learn and master API
Design Patterns in R
Build robust and maintainable software with object-oriented design patterns in R. Design patterns abstract and present in neat, well-defined components and interfaces the experience of many software designers and architects over many years of solving similar problems. These are solutions that have withstood the test of time with respect to re-usability, flexibility, and maintainability. R6P provides abstract base classes with examples for a few known design patterns. The patterns were selected by their applicability to analytic projects in R. Using these patterns in R projects has proven effective in dealing with the complexity that data-driven applications possess.