00-INBOX

137 bookmarks

Thesis_GIMAM7_DAJOUD.pdf

·up.raindrop.io·Sep 22, 2025

Spatial Analysis With R.pdf

·up.raindrop.io·Sep 6, 2025

xjhjzb1vzrdncmb4hc3tqg11da2q.pdf

·up.raindrop.io·Aug 21, 2025

MCP Hub - MCP Server Search & Discover & Collection Website

Discover, manage, and share model context protocol (MCP) tools for LLM clients

·mcphub.io·Jun 27, 2025

/etc/inputrc

·linuxfromscratch.org·Jun 21, 2025

DeepWiki | AI documentation you can talk to, for every repo

DeepWiki provides up-to-date documentation you can talk to, for every repo in the world. Think Deep Research for GitHub - powered by Devin.

#ai #tool #dev #llm #tech #github #docs

·deepwiki.com·May 22, 2025

ImageFX - labs.google/fx

Transform text into images and explore with endless imagination.

·labs.google·May 14, 2025

Harlequin: The SQL IDE for Your Terminal.

Harlequin is a drop-in replacement for the DuckDB CLI, SQLite CLI, psql, etc. that brings SQL IDE features to your terminal.

#tool #sql #cli #terminal #shell #command line #opensource #database

·harlequin.sh·May 14, 2025

Building an AI-powered location explorer with Shiny and Claude – WALKER DATA

GIS, demographics, and data science consulting

#shiny #ai #claude #llm #r-development #dev

·walker-data.com·May 7, 2025

Making sense out of Semi-Structured data

Parsing JSON with the Extract Nested Data component within Matillion Data Productivity Cloud connected to Snowflake simplifies the parsing for many semi-structured data patterns. The JSON format has become a more popular format for semi-structured data, primarily because it is more consistent, containing all key:value pairs. JSON handles repeating elements by containing them in an array as a value of a key:value pair. For this article, I am using the same example data set that was used in part one on XML, only this sample data is represented as JSON. I also walk you through how to convert the XML to JSON to simplify parsing XML.

Extract Nested Data

We start by using the Extract Nested Data component, which simplifies parsing semi-structured data. In this example, we're using several of them to traverse the nested elements. First, the JSON file is loaded into a table called donut_json, which contains a single column defined as a variant, "data_value." Next, configure the Columns property of the Extract Nested Data component. I used "Autofill" and let the component identify the structure of the JSON. I have deselected all the columns and chosen to pass through the Item attributes and element values. In the example, I also passed through the Filling element, keeping it a variant for further processing downstream. Since the topping elements are repeating at the first level, the component has flattened toppings into separate rows automatically, so I was able to select the element value level for toppings.

Another property to call out is the Outer Join property on the Configuration tab. Since all of the elements do not exist for every item, I needed to set Outer Join = "Yes." This will retain all the rows for all items, even though only two items have Fillings.

Flatten Variant

The Flatten Variant component is used to flatten arrays. Although the Extract Nested Data component can sometimes be used, the Flatten Variant component lets you explicitly break a column into more rows than the original Extract Nested Data output when you need further granularity. The batter element in this example has two formats, so I have to treat the Batter array differently by using a Flatten Variant component to parse the array of batters into separate rows. The initial Extract Nested Data component created a new row for each item and each topping. From there, we want a new row for each item, topping and batter. I tested the batter element to determine if it's an array by using the IS_ARRAY() function in a Calculator component: IS_ARRAY("items_item-element_batters_batter").

After that, flatten the array into separate rows per batter element before extracting the attributes. Set the Column Flatten property to read the batter array column. In the column mappings, use the flatten alias to map to an output variant column.

Finally, we bring all the rows back together, remove unwanted columns, and write to a new table. The Unite component unions all the rows back together. The Rename component allows us to remove any unwanted fields, like the arrays, and rename and reorder the fields. The Rewrite component writes to a new table. The resulting final pipeline is much simpler than the previous XML one.

Convert XML to JSON

Our example pipeline started with a file that was already in a JSON format. However, if you have an XML file and would like to convert it to JSON inside a pipeline, you can do so with a Snowflake UDF.

Create an Orchestration Pipeline

First, I created a separate Orchestration pipeline that contains a SQL Script component to create the Snowflake UDF. The UDF code calls a Snowflake Snowpark package called "xmltodict." The example XML_to_JSON code is written in Python.

Parse With the Calculator Component

Next, in my Transformation pipeline, I called the procedure in a Calculator component. The parse_json function formats the JSON so it's readable.

Normalizing Semi-Structured Data

Semi-structured files typically contain data that has been nested, and we often want to store that data in a structured format more friendly to analytics and reporting. Many times, as we flatten out deeply nested data, we end up with a multi-join or cartesian join where all upper-level elements of the file are joined with all nested elements of the file.

Real-world examples are often very large when flattened. In these cases, we need to evaluate the data contained in the JSON response and determine the best model to represent the data in different tables.

To split the dimensions into separate tables, the first Extract Nested Data component passes the full element downstream as a variant, so that the different datasets can be split out into separate streams.
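
The row multiplication described above can be sketched in plain Python. This is only an illustration, not Matillion's or Snowflake's implementation: the sample records, the field names, and the helpers `as_list` and `flatten_items` are all hypothetical. It mimics three behaviors from the excerpt: treating a batter that may be either an array or a single object uniformly, keeping items that lack a nested element (the outer-join behavior), and producing one row per item, batter and topping.

```python
from itertools import product

# Hypothetical sample data loosely modeled on the donut example.
# Note the two batter formats: an array for the first item, a single
# object for the second, and no batters at all for the third.
SAMPLE_ITEMS = [
    {
        "id": "0001",
        "name": "Cake",
        "batters": {"batter": [{"type": "Regular"}, {"type": "Chocolate"}]},
        "topping": [{"type": "Glazed"}, {"type": "Sugar"}, {"type": "Maple"}],
    },
    {
        "id": "0002",
        "name": "Raised",
        "batters": {"batter": {"type": "Regular"}},
        "topping": [{"type": "Glazed"}, {"type": "Sugar"}],
    },
    {
        "id": "0003",
        "name": "Old Fashioned",
        "topping": [{"type": "Glazed"}],
    },
]


def as_list(value):
    """Normalize a nested element so it can always be iterated.

    A missing element becomes [None], which keeps the parent row
    (similar in spirit to an outer join); a single object is wrapped
    in a list; an existing list is returned unchanged.
    """
    if value is None:
        return [None]
    return value if isinstance(value, list) else [value]


def flatten_items(items):
    """Return one flat row per (item, batter, topping) combination."""
    rows = []
    for item in items:
        batters = as_list(item.get("batters", {}).get("batter"))
        toppings = as_list(item.get("topping"))
        # The cross product is what makes flattened output grow quickly.
        for batter, topping in product(batters, toppings):
            rows.append(
                {
                    "item_id": item["id"],
                    "item_name": item["name"],
                    "batter": batter["type"] if batter else None,
                    "topping": topping["type"] if topping else None,
                }
            )
    return rows


if __name__ == "__main__":
    for row in flatten_items(SAMPLE_ITEMS):
        print(row)
```

Because every nested array multiplies the row count, three small items already yield nine rows here. That growth is the reason for splitting dimensions into separate tables instead of keeping one wide flattened table.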

#JSON #data #parsing

·matillion.com·May 7, 2025

Shiny App

·shinylive.io·Jan 16, 2025

PostgreSQL 17.2 Documentation

#database #postgres #documentation #data-engineering

·postgresql.org·Jan 9, 2025

PostgreSQL 15.10 Documentation

#database #postgres #documentation

·postgresql.org·Jan 9, 2025

pgAdmin4 Docs

#pgAdmin4 #database #documentation #docs #tool #postgres

·127.0.0.1:62499·Jan 9, 2025

A table in my model records building valuations over time. Is it a slowly-changing dimension table or a fact table?

I'm building a data model for a report that allows users to analyze building valuations over time, and details about buildings and their current leases. I have a fact table that contains leasing

You have two fact tables that differ only in terms of granularity. Your Fact_Leases table, for example, is a fact table at the granularity of a lease. I can assume this quite safely because it appears the Lease ID column is a primary key. Each row of that table represents a lease.

On the other hand, your ?_Valuations table is a fact table at the granularity of quarter-time-building. That is, each row not only represents a building but also a quarter time period. And one way you can sort of know that this is a fact table is by understanding that if you had a date-dimension table, you could relate the two on their Quarter columns (although it would be a many-to-many relationship). Therefore, your date-DIMENSION table would be explaining the facts of your valuations. (I'd recommend, however, replacing your Quarter column with actual dates, and allowing the date-dimension table to inform the quarters. That's an aside, though.)

Now, the problem of repeating valuation metrics occurs because you are trying to combine two fact tables at different levels of granularity. When you try to apply the valuations to the Fact_Leases table, which is at the granularity of lease, Power BI (or any BI tool, for that matter) can't understand how to apportion the valuation at the BUILDING level down to the LEASE level of granularity. So it just repeats. And it's important to keep this in mind when developing your reporting. No visualizations built at the context level of lease will be able to include a valuation metric because valuations exist only at a higher level of granularity.
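
The repetition problem in the answer above can be reproduced with a small Python sketch. All table contents, column names and function names here are hypothetical stand-ins, not the asker's actual model. It shows that joining building-level valuations onto lease-level rows repeats the valuation once per lease, so a naive sum overcounts, and that aggregating leases up to the building grain first avoids this.

```python
from collections import defaultdict

# Hypothetical lease-grain fact rows: one row per lease.
LEASES = [
    {"lease_id": "L1", "building_id": "B1", "rent": 1000},
    {"lease_id": "L2", "building_id": "B1", "rent": 1500},
    {"lease_id": "L3", "building_id": "B2", "rent": 2000},
]

# Hypothetical building-quarter-grain fact rows: one row per building per quarter.
VALUATIONS = [
    {"building_id": "B1", "quarter": "2024-Q1", "valuation": 500_000},
    {"building_id": "B2", "quarter": "2024-Q1", "valuation": 800_000},
]


def join_leases_to_valuations(leases, valuations):
    """Join at the lease grain; each valuation repeats once per lease."""
    return [
        {**lease, **val}
        for lease in leases
        for val in valuations
        if lease["building_id"] == val["building_id"]
    ]


def rent_by_building(leases):
    """Aggregate leases up to the building grain."""
    totals = defaultdict(int)
    for lease in leases:
        totals[lease["building_id"]] += lease["rent"]
    return dict(totals)


joined = join_leases_to_valuations(LEASES, VALUATIONS)

# B1's valuation appears on two lease rows, so it is counted twice.
naive_total = sum(row["valuation"] for row in joined)
true_total = sum(val["valuation"] for val in VALUATIONS)

# Combining the two facts at the coarser (building-quarter) grain is safe.
rents = rent_by_building(LEASES)
building_grain = [
    {**val, "total_rent": rents.get(val["building_id"], 0)} for val in VALUATIONS
]

if __name__ == "__main__":
    print("naive total:", naive_total)
    print("true total:", true_total)
    for row in building_grain:
        print(row)
```

The design point is that measures can be rolled up to a coarser grain but cannot be apportioned down to a finer one without an explicit allocation rule, which matches the answer's conclusion that lease-level visuals cannot carry a valuation metric.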

#data-engineering #stackoverflow #data-model #dimensional-modeling #fact-table #lease #data-warehouse #data-modeling-best-practices

·stackoverflow.com·Jan 5, 2025

Amazon AWS - KnowledgeShop

Amazon AWS AWS Fundamentals Storage and Content Delivery Service Storage Basics S3 (Simple Storage Service) S3 Overview S3 Buckets S3 Objects S3 …

Read Later

·fizalihsan.github.io·Feb 6, 2022

The Docker Everything Bagel™ – Spin Up A Local Data Stack

Use docker compose to create local replicas of the modern data stack with one command.

Read Later

·lakefs.io·Feb 4, 2022

Welcome - Julia Data Science

·juliadatascience.io·Jan 27, 2022

Strategic Intelligence | World Economic Forum

Strategic insights and contextual intelligence from the World Economic Forum. Explore and monitor the issues and forces driving transformational change across economies, industries and systems.

Read Later

·intelligence.weforum.org·Jan 7, 2022

Should I provide a declarative API? (You probably should) | meshcloud

Declarative APIs are becoming more and more popular, especially in the context of Infrastructure as Code. At meshcloud we've implemented a declarative API. In this post I want to provide insights into the process and answer these questions: Does it make sense to provide declarative APIs for all systems? Which use-cases benefit from it and […]

Read Later

·meshcloud.io·Aug 7, 2021

Knowledge Vault Of Digital Notes | Mishacreatrix

Jumpstart your Knowledge Management System with my hand-crafted vault of literature notes.

Read Later

·mishacreatrix.com·Aug 7, 2021

Editors · Babel

Syntax Highlighting

Read Later

·babeljs.io·Aug 7, 2021

Falling Into The Pit of Success

Eric Lippert notes the perils of programming in C++: I often think of C++ as my own personal Pit of Despair Programming Language. Unmanaged C++ makes it so easy to fall into traps. Think buffer overruns, memory leaks, double frees, mismatch between allocator and deallocator, using freed memory, umpteen dozen

Read Later

·blog.codinghorror.com·Aug 7, 2021
