No Clocks

3146 bookmarks
Tidy Schema Validation for Data Frames
Validate data.frames against schemas to ensure that data matches expectations. Define schemas using tidyselect and predicate functions for type consistency, nullability, and more. Schema failure messages can be tailored for non-technical users and are ideal for user-facing applications such as in shiny or plumber.
·whipson.github.io·
Importing spatial data to PostGIS | Microsoft Community Hub
If you're new to PostGIS and want to try it out, one of the first things you'll want to do is to import some data into your database. I want to provide an overview of how to accomplish this. The post provides an introduction to some common data import terminology, details on where to find spatial data and tools for importing data to PostGIS, and specific instructions for how to use each tool to perform the import. For an introduction to PostGIS, see my previous blog post.

Introduction

If you've worked with other data types before, most of the process for importing spatial data will be self-explanatory. However, if you haven't worked with spatial data before, acronyms like CRS and SRID might be confusing.

For data import, you should know the coordinate reference system (https://en.wikipedia.org/wiki/Spatial_reference_system) your data is in and which spatial reference identifier (https://en.wikipedia.org/wiki/Spatial_reference_system#Identifier) number is used to reference that specific CRS. The SRID defines all the parameters of a data set's geographic coordinate system and projection. Using an SRID is convenient because it packs all the information about a map projection into a single number. When creating spatial objects for insertion into the database, the SRID is required.

Where to find spatial data?

Before you can load any data into your database, you need to acquire it. I think that one of the most underrated skills for GIS experts is knowing where to find suitable data. If you are using PostGIS to manage data your organization already has, you don't have to start by searching for new data; you can work with your own. Here are a few data sources you can try out:

https://www.openstreetmap.org/: OpenStreetMap (OSM) is a collaborative project to create a free editable map of the world. Many people refer to OSM as the Wikipedia of maps. For easy ways to get an extract of the data, check https://www.geofabrik.de/data/download.html for Shapefiles or https://download.osmdata.xyz/ for GeoPackages.

[Image: OpenStreetMap data from Africa, showing all the roads in the OSM database. © OpenStreetMap contributors.]

https://www.naturalearthdata.com/: Natural Earth is a public domain map dataset available at 1:10m, 1:50m, and 1:110m scales. Featuring tightly integrated vector and raster data, Natural Earth lets you make a variety of visually pleasing, well-crafted maps with cartography or GIS software. If you need country boundaries, states, or railroads of the world at a very general level, this is your data of choice.

https://freegisdata.rtwilson.com/: This page contains a categorised list of links to over 500 sites providing freely available geographic datasets.

After you have loaded some files to your local disk, you can start looking at the different ways to import them to PostGIS.

Tools for importing data to PostGIS

Although the official documentation (https://postgis.net/docs/using_postgis_dbmanagement.html#loading_geometry_data) recognizes only a couple of ways of loading geometry data, the ways to achieve this are much more diverse. The two recognized by the documentation are formatted SQL statements and the shapefile loader/dumper (shp2pgsql), so we can start with those.

Formatted SQL

Loading with plain SQL is not recommended and probably nobody does it, but going through this method first gives you a good idea of what is happening behind the scenes.
It's important to know how PostGIS handles spatial data types (see http://postgis.net/workshops/postgis-intro/geometries.html): a geometry column can't store plain strings or integers.

For example, a valid insert statement to create and insert an OGC spatial object would be:

    INSERT INTO data.islands (geom, the_name)
    VALUES (ST_GeomFromText('POINT(0 0)', 4326), 'NULL ISLAND');

Another example is a method where you already have latitude and longitude data as text or numbers in your table and you want to build a valid geometry object from those. First you need to add a geometry column to your data:

    SELECT AddGeometryColumn('data', 'islands', 'geom', 4326, 'POINT', 2);

Next you can update the content of the geometry column based on the existing columns:

    UPDATE data.islands
    SET geom = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);

In a nutshell, a simple insert works, but remember to build your geometry in a valid way! For better performance, you should use COPY statements instead of INSERT, as COPY results in much faster bulk loading (https://www.citusdata.com/blog/2017/11/08/faster-bulk-loading-in-postgresql-with-copy/).

shp2pgsql

The most common data format for spatial data has traditionally been the ESRI shapefile. shp2pgsql is a command line tool for importing ESRI shapefiles to the database. Under Unix, you can use the following command to import a new PostGIS table:

    shp2pgsql -s <srid> -c -D -I <path/to/shapefile> <schema>.<table> | \
      psql -d <database> -h <host> -U <user>

On Windows the process is similar (see https://www.bostongis.com/pgsql2shp_shp2pgsql_quickguide.bqg). Additionally, there's the shp2pgsql-gui utility, a small GUI for importing shapefiles to PostGIS.

ogr2ogr

The tool I use most often for loading data to PostGIS is probably ogr2ogr. It is a very powerful tool for converting data to and from PostGIS across a large number of vector formats (https://gdal.org/drivers/vector/index.html). GDAL/OGR is a software package that powers a great variety of geospatial software tools, and ogr2ogr conveniently comes with the QGIS installation; Windows users can access the commands directly through the OSGeo4W shell (https://trac.osgeo.org/osgeo4w/). For a single import of a shapefile, you would run the following command:

    ogr2ogr -f "PostgreSQL" PG:"host=<host> dbname=<database> user=<user> password=<password>" \
      yourdatafile.shp -lco SCHEMA=foo

The parameters used to run the ogr2ogr command here are:
- -f: output file format name
- -lco: layer creation option

Combining ogr2ogr commands with a simple shell loop allows you to load a folder full of shapefiles (the example below uses Windows cmd syntax):

    for %i in (*.shp) do ogr2ogr -update -append -f PostgreSQL PG:"host=<host> port=5432 dbname=<database> user=<user> password=<password> schemas=myshapefiles" %i

This small and simple script has been useful many times. It will create the tables automatically and append all the files to the correct tables.

Loading data with QGIS

QGIS allows a few different approaches when it comes to loading spatial data to PostGIS. In my previous blog post you can find some basic information about QGIS and how to connect to PostGIS. Inside QGIS you can import data through DB Manager, but I recommend using Export to PostgreSQL, as it uses COPY statements in the background rather than INSERTs and is therefore faster (https://geosupportsystem.se/2018/09/03/postgis-uppfoljning-2/). You can load data layers from the project or from disk.

[Screenshot: the Export to PostgreSQL main dialog.]

Besides ogr2ogr, this is another tool for data imports that I use very often, as most of my workflows are strongly QGIS related.
Loading data with Python and psycopg2

If you are integrating PostGIS into an application and want to automate things, GUI workflows are not a viable option. Instead you might want to look into loading data with Python. The most obvious choice for making a connection from Python to PostGIS is psycopg2 (https://wiki.postgresql.org/wiki/Psycopg2_Tutorial). The most important thing to understand when working with psycopg2 is the geometry data type already mentioned earlier. Inserting latitude and longitude values as such into a geometry column will not work; instead you have to build your insert value with the lat and long parameters as follows (a fuller sketch appears at the end of this excerpt):

    ST_SetSRID(ST_MakePoint(%s, %s), 4326)

Other aspects of psycopg2 workflows won't differ much from normal Python-PostgreSQL data pipelines.

raster2pgsql

If you are working with raster data, the benefits of moving your data from a file to a database are not nearly as obvious as with vector data. However, PostGIS offers a raster type (https://postgis.net/docs/RT_reference.html) and a wide range of raster functions for storing and analyzing raster data. See for example this post on rasters in PostGIS 3: http://blog.cleverelephant.ca/2019/08/postgis-3-raster.html. For loading rasters to PostGIS, the two main options are the raster2pgsql loader (https://postgis.net/docs/using_raster_dataman.html#RT_Raster_Loader) or GDAL with Python. Just like shp2pgsql, raster2pgsql comes packaged with a PostGIS bundle installation. The basic structure of a raster2pgsql command is as follows:

    raster2pgsql raster_options_go_here raster_file yourtable > out.sql

Just like shp2pgsql, this outputs a SQL file that you can then run in PostGIS.

Tools for OpenStreetMap data

OpenStreetMap is a slightly specific case, but worth going through separately here, as it is widely used. The OpenStreetMap data structure in a database is by default divided into lines, roads, points and polygons. Depending on the application, you might want to change the style in which the data is loaded, but the default structure works well with many tools and services. Most Linux distributions include osm2pgsql, which is a good generic tool for importing a small piece of OSM data to PostGIS. osm2pgsql (https://github.com/openstreetmap/osm2pgsql) is actively maintained and widely used. A basic way to load the data into PostGIS for rendering would be:

    osm2pgsql --create --database postgres data.osm.pbf

This will load the data from data.osm.pbf into the planet_osm_point, planet_osm_line, planet_osm_roads, and planet_osm_polygon tables. A thorough walkthrough of the osm2pgsql import process can be found at https://www.cybertec-postgresql.com/en/open-street-map-to-postgis-the-basics/.

imposm3 (https://github.com/omniscale/imposm3), written in Go, is designed to create databases that are optimized for rendering. You need a JSON mapping to define the data schema. For a simple import with imposm you could try the following command:

    imposm import -connection postgis://user:password@host/database \
      -mapping mapping.json -read /path/to/osm.pbf -write

Yet anot...
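Tying the psycopg2 section above together, here is a minimal sketch of inserting a point via psycopg2, reusing the data.islands table from the earlier SQL examples; the connection parameters are placeholders, not from the post:

    # Minimal sketch: insert a lon/lat pair as a PostGIS point via psycopg2,
    # letting ST_MakePoint/ST_SetSRID build the geometry server-side.
    import psycopg2

    conn = psycopg2.connect(host="localhost", dbname="gis", user="postgres")
    with conn, conn.cursor() as cur:
        # ST_MakePoint takes longitude first, then latitude.
        cur.execute(
            """INSERT INTO data.islands (geom, the_name)
               VALUES (ST_SetSRID(ST_MakePoint(%s, %s), 4326), %s)""",
            (0.0, 0.0, "NULL ISLAND"),
        )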
·techcommunity.microsoft.com·
Cursor Agent System Prompt (March 2025)
Cursor Agent System Prompt (March 2025), shared as a GitHub Gist.
·gist.github.com·
Code Arena
Code Arena lets developers compare how top LLMs build apps, websites, games, and more. Watch AI models code, refine, and deploy real software live.
·lmarena.ai·
How Cursor (AI IDE) Works
Turning LLMs into coding experts and how to take advantage of them.
·blog.sshh.io·
First Street API
Comprehensive application programming interfaces for quantifying physical climate risk
·docs.firststreet.org·
Flood, Wildfire, Wind and Heat Risk Model Methodology
Nationwide models built off of decades of peer-reviewed research forecast the physical climate risk of flood, wildfire, wind and extreme heat.
·firststreet.org·
Flood Factor® Flood Risk Model Methodology
Nationwide physically-based flood model forecasts how climate change will impact flood risk from rain, streamflow, sea level rise, and storm surge.
·firststreet.org·
What is GIS? | Geographic Information System Mapping Technology
Find the definition of GIS. Learn how this mapping and analysis technology is crucial for making sense of data. Learn from examples and find out why GIS is more important than ever.
·esri.com·
publiclab/leaflet-environmental-layers: Collection of different environmental map layers in an easy to use Leaflet library, similar to https://github.com/leaflet-extras/leaflet-providers#leaflet-providers
Collection of different environmental map layers in an easy to use Leaflet library, similar to https://github.com/leaflet-extras/leaflet-providers#leaflet-providers
·github.com·
🐘 PostgreSQL Power User Cheatsheet: The Guide for DBAs & Developers
🚀 Supercharge your PostgreSQL skills! This cheatsheet covers architecture, performance tuning, advanced SQL, JSONB, MVCC, security, replication & essential commands. A must-have for DBAs & Developers looking to master Postgres.
·cheatsheets.davidveksler.com·
Character encodings: Essential concepts
Introduces a number of basic concepts needed to understand other articles that deal with characters and character encodings.
·w3.org·
pgModeler - PostgreSQL Database Modeler
Open source data modeling tool designed for PostgreSQL. No more DDL commands written by hand. Let pgModeler do the job for you!
·pgmodeler.io·
Leaflet Environmental Layers
Collection of different environmental map layers in an easy to use Leaflet library.
·publiclab.github.io·
Dbdbml
·github.com·
Storing External Requests
I’ve worked in payment systems for a long time, and one common theme is that they make a lot of 3rd party API calls. For example, there are numerous payment networks and banks, fraud and risk checks, onboarding and KYC/KYB (Know Your Customer/Business) services, and more. And then there are the inbound calls as well, such as webhooks and callbacks from integration partners, e-commerce stores, and other outside interactions.
The first step is storing all inbound and outbound API calls. People will often use logs for this, but I think it’s far more valuable to put them in the database instead. This makes them easier to query, easier to aggregate, and easier to join against other data.
You can use one table for both inbound and outbound calls, or separate them, depending on preference, but generally it's useful to store most of the available information, such as the fields below (a sketch of one possible table follows the list):
- URL of the request
- Datetime of the request
- Request body
- Response body
- Response status/code (e.g. 200, 500)
- Total time spent on the request
- Request headers
- Response headers
- Metadata
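As a rough sketch of what such a table could look like — every name here is hypothetical, not taken from the post, and the psycopg2 connection string is an assumption to keep the example self-contained:

    # Hypothetical table for logging inbound and outbound API calls;
    # a 'direction' column lets one table serve both.
    import psycopg2

    DDL = """
    CREATE TABLE IF NOT EXISTS external_requests (
        id               bigserial PRIMARY KEY,
        direction        text NOT NULL,           -- 'inbound' or 'outbound'
        url              text NOT NULL,
        requested_at     timestamptz NOT NULL DEFAULT now(),
        request_body     text,
        request_headers  jsonb,
        response_body    text,
        response_headers jsonb,
        status_code      int,                     -- e.g. 200, 500
        duration_ms      int,                     -- total time spent on the request
        metadata         jsonb                    -- e.g. {"user_id": 1, "order_id": 2}
    );
    """

    with psycopg2.connect("dbname=payments") as conn:  # assumed connection string
        with conn.cursor() as cur:
            cur.execute(DDL)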
For metadata, I like to use a JSON column and add in any metadata that links this request to an entity in the system. For example, it could include user_id, order_id, request_id, etc. As a JSON column, you can include more than one, and even include other complex nested information.
Response headers often include extra debugging information, such as request or trace IDs, version numbers, etc. When asking for 3rd-party support, it's common to provide these values so they can look in their own logging to find your requests.
Rather than try to write code for every API call, it’s often better to hook into the request/response lifecycle in one place and instrument all calls.
The way you do this depends on the language and libraries, but they are generally called interceptors.
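To make the idea concrete, here is a minimal sketch in Python — a hand-rolled wrapper around requests rather than a real interceptor API, reusing the hypothetical external_requests table sketched above:

    # Sketch of the "record before, update after" pattern: insert a row with
    # the request fields before the call, update it with the response fields
    # after. Table/column names are hypothetical.
    import json
    import time

    import psycopg2
    import requests

    conn = psycopg2.connect("dbname=payments")  # assumed connection string

    def instrumented_request(method, url, metadata=None, **kwargs):
        body = kwargs.get("data")
        if body is None and "json" in kwargs:
            body = json.dumps(kwargs["json"])

        # Record the request fields before the outbound call starts, so even
        # a crash or timeout leaves a row behind.
        with conn, conn.cursor() as cur:
            cur.execute(
                """INSERT INTO external_requests
                       (direction, url, request_body, request_headers, metadata)
                   VALUES ('outbound', %s, %s, %s, %s)
                   RETURNING id""",
                (url, body, json.dumps(kwargs.get("headers") or {}),
                 json.dumps(metadata or {})),
            )
            row_id = cur.fetchone()[0]

        started = time.monotonic()
        response = requests.request(method, url, **kwargs)
        elapsed_ms = int((time.monotonic() - started) * 1000)

        # Go back and update the same row with the response fields.
        with conn, conn.cursor() as cur:
            cur.execute(
                """UPDATE external_requests
                      SET response_body = %s, response_headers = %s,
                          status_code = %s, duration_ms = %s
                    WHERE id = %s""",
                (response.text, json.dumps(dict(response.headers)),
                 response.status_code, elapsed_ms, row_id),
            )
        return response

A real interceptor would hook into the HTTP client's own request/response lifecycle (for example, requests transport adapters or httpx event hooks) so that every call is captured without changing call sites.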
Where possible, I like to record the request fields before the outbound call is started (e.g. request body, request headers) and then go back and update the row to store the response fields once the call is completed. There are several advantages over a single write at the end of the request cycle:
·pgrs.net·
mkquartodocs
mkquartodocs extension
·jspaezp.github.io·