No Clocks

2680 bookmarks
REST API in R with plumber
API and R
Nowadays, it’s pretty much expected that software comes with an HTTP API. Every programming language out there offers a way to expose APIs or make GET/POST/PUT requests, including R. In this post, I’ll show you how to create an API using the plumber package. Plus, I’ll give you tips on how to make it more production-ready - I’ll tackle scalability, statelessness, caching, and load balancing. You’ll even see how to consume your API with other tools like Python, curl, and R’s own httr package.
# When an API is started it might take some time to initialize.
# This function blocks the main execution and waits until
# the plumber API is ready to take queries.
wait_for_api <- function(log_path, timeout = 60, check_every = 1) {
  times <- timeout / check_every
  for (i in seq_len(times)) {
    Sys.sleep(check_every)
    # The log file may not exist yet right after the API is launched
    if (file.exists(log_path) &&
        any(grepl(readLines(log_path, warn = FALSE), pattern = "Running plumber API"))) {
      return(invisible())
    }
  }
  stop("Timed out waiting for the plumber API!")
}
Oh, in some examples I am using redis. So, before you dive in, make sure to fire up a simple redis server. At the end of the script, I’ll be turning redis off, so you don’t want to be using it for anything else at the same time. I just want to remind you that this code isn’t meant to be run on a production server.
redis is launched in the background, so you might want to wait a little bit to make sure it’s fully up and running before moving on.
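wait_for_redis <- function(timeout = 60, check_every = 1) {
  times <- timeout / check_every
  for (i in seq_len(times)) {
    Sys.sleep(check_every)
    # redis-cli prints PONG once the server accepts connections
    reply <- suppressWarnings(
      system2("redis-cli", "PING", stdout = TRUE, stderr = TRUE)
    )
    # Compare only the first line, so that empty or multi-line output
    # does not break the if() condition
    if (isTRUE(reply[1] == "PONG")) {
      return(invisible())
    }
  }
  stop("Timed out waiting for redis!")
}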
First off, let’s talk about logging. I try to log as much as possible, especially in critical areas like database accesses, and interactions with other systems. This way, if there’s an issue in the future (and trust me, there will be), I should be able to diagnose the problem just by looking at the logs alone. Logging is like “print debugging” (putting print(“I am here”), print(“I am here 2”) everywhere), but done ahead of time. I always try to think about what information might be needed to make a correct diagnosis, so logging variable values is a must. The logger and glue packages are your best friends in that area.
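As a rough illustration of logging variable values, here is a minimal sketch using the logger package, whose default formatter interpolates {expressions} with glue. The variables user_id and n_rows are hypothetical stand-ins for whatever your endpoint works with.

```r
library(logger)

# Hypothetical values standing in for real request data
user_id <- 42
n_rows <- 1500

# The default logger formatter interpolates {expressions} with glue,
# so variable values end up in the log message
log_info("Fetched {n_rows} rows for user_id {user_id}")
log_error("Query failed for user_id {user_id}")
```

Each message is prefixed with its level and a timestamp, which makes it easier to line up log entries with the moment a problem occurred.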
Next, it might also be useful to add a unique request identifier (I am doing that in the setuuid filter) to be able to track a request across the whole pipeline (since a single request might be passed across many functions). You might also want to add some other identifiers, such as MACHINE_ID - your API might be deployed on many machines, so it could be helpful for diagnosing whether the problem is associated with a specific instance or is a global issue.
In general, you shouldn’t worry too much about the size of the logs. Even if you generate ~10KB per request, it will take 100000 requests to generate 1GB. And for a plumber API, 100000 requests generated in a short time is A LOT. In such a scenario you should look into other languages. And if you have that many requests, you probably have a budget for storing those logs :)
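The estimate above can be checked with a couple of lines of arithmetic. This sketch assumes decimal units (1 KB = 1000 bytes, 1 GB = 1e9 bytes); the variable names are only illustrative.

```r
# ~10 KB of logs per request, in bytes (decimal units assumed)
bytes_per_request <- 10 * 1000
n_requests <- 100000

# Total log volume in GB
total_gb <- bytes_per_request * n_requests / 1e9
total_gb
```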
It might also be a good idea to set up an automatic system to monitor those logs (e.g. Amazon CloudWatch if you are on AWS). In my example I would definitely monitor the "Error when reading key from cache" string. That would give me an indication of any ongoing problems with the API cache.
Speaking of cache, you might use it to save a lot of resources. Caching is a very broad topic with many pitfalls (what to cache, stale cache, etc.), so I won’t spend too much time on it, but you might want to read at least a little bit about it. In my example, I am using the redis key-value store, which allows me to save the result for a given request, and if another request asks for the same data, I can read it from redis much faster.
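The idea can be sketched in a few lines: look the key up first, and only compute and store the value on a miss. To keep the sketch self-contained, a plain R environment stands in for redis here; get_cached and slow_square are hypothetical names, and a real setup would replace the environment lookups with calls to a redis client.

```r
# In-memory stand-in for redis (assumption: a real deployment would
# talk to a redis client here instead of an environment)
cache <- new.env()

# Hypothetical expensive computation
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Return the cached value for `key` if present, otherwise compute and store it
get_cached <- function(key, compute) {
  if (exists(key, envir = cache, inherits = FALSE)) {
    return(get(key, envir = cache))  # cache hit
  }
  value <- compute()                 # cache miss: compute the value
  assign(key, value, envir = cache)  # and store it for the next request
  value
}

get_cached("square:4", function() slow_square(4))  # computed
get_cached("square:4", function() slow_square(4))  # served from the cache
```

Building the key from the request parameters means that two requests asking for the same data map to the same cache entry.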
Note that you could use the memoise package to achieve a similar thing using R only. However, redis might be useful when you are using multiple workers. Then, one cached request becomes available to all other R processes. But if you need to deploy just one process, memoise is fine, and it does not introduce another dependency - which is always a plus.
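For the single-process case, a minimal memoise sketch might look like this. The function slow_square is a hypothetical stand-in for an expensive computation.

```r
library(memoise)

# Hypothetical expensive computation
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# memoise() wraps the function and caches results by argument values
fast_square <- memoise(slow_square)

fast_square(4)  # computed on the first call
fast_square(4)  # returned from the in-process cache
```

The cache lives inside the R process, so each worker would keep its own copy, which is the limitation the redis approach avoids.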
info <- function(req, ...) {
  do.call(
    log_info,
    c(
      list("MachineId: {MACHINE_ID}, ReqId: {req$request_id}"),
      list(...),
      .sep = ", "
    ),
    envir = parent.frame(1)
  )
}
#* Assign a unique identifier to the incoming request
#* https://www.rplumber.io/articles/routing-and-input.html - this is a must read!
#* @filter setuuid
function(req) {
  req$request_id <- UUIDgenerate(n = 1)
  plumber::forward()
}
#* Log some information about the incoming request
#* @filter logger
function(req) {
  if (!grepl(req$PATH_INFO, pattern = "PATH_INFO")) {
    info(
      req,
      "REQUEST_METHOD: {req$REQUEST_METHOD}",
      "PATH_INFO: {req$PATH_INFO}",
      "HTTP_USER_AGENT: {req$HTTP_USER_AGENT}",
      "REMOTE_ADDR: {req$REMOTE_ADDR}"
    )
  }
  plumber::forward()
}
To run the API in the background, one additional file is needed. Here I am creating it using a simple bash script.
library(plumber)
library(optparse)
library(uuid)
library(logger)

MACHINE_ID <- "MAIN_1"
PORT_NUMBER <- 8761

# Set the logging threshold so that every message, down to TRACE, is recorded
log_threshold(TRACE)

pr("tmp/api_v1.R") %>%
  pr_run(port = PORT_NUMBER)
·zstat.pl·
SPA Mode | Remix
From the beginning, Remix's opinion has always been that you own your server architecture. This is why Remix is built on top of the Web Fetch API and can run on any modern runtime via built-in or community-provided adapters. While we believe that having a server provides the best UX/Performance/SEO/etc. for most apps, it is also undeniable that there exist plenty of valid use cases for a Single Page Application in the real world:
SPA Mode is basically what you'd get if you had your own React Router + Vite setup using createBrowserRouter/RouterProvider, but along with some extra Remix goodies:
- File-based routing (or config-based via routes())
- Automatic route-based code-splitting via route.lazy
- <Link prefetch> support to eagerly prefetch route modules
- <head> management via Remix <Meta>/<Links> APIs
SPA Mode tells Remix that you do not plan on running a Remix server at runtime and that you wish to generate a static index.html file at build time and you will only use Client Data APIs for data loading and mutations. The index.html is generated from the HydrateFallback component in your root.tsx route. The initial "render" to generate the index.html will not include any routes deeper than root. This ensures that the index.html file can be served/hydrated for paths beyond / (i.e., /about) if you configure your CDN/server to do so.
·remix.run·
RESTful API Design Best Practices Guide 2024
Guide to RESTful API design best practices in 2024 covering resource-based architecture, stateless communication, client-server separation, URI design, HTTP method usage, security, performance optimization, and more.
·daily.dev·
Schema-driven development in 2021 - 99designs
Schema-driven development is an important concept to know in 2021. What exactly is schema-driven development? What are the benefits of schema-driven development? We will explore the answers to these questions in this article.
·99designs.com·
API Documentation Using Hacker Tools Mitmproxy2swagger
Discover mitmproxy2swagger: A quick solution to generate API documentation, bridging the gap between backend and frontend teams effortlessly in just 2 mins
API documentation is a collection of references, tutorials, documents, or videos that help developers use your API, governed by the OpenAPI Specification (OAS). An API (Application Programming Interface) is a data-sharing technique that helps applications communicate with each other. Not the best definition in the world, but I like to think of an API as a dynamic messenger. APIs can store your message, process it, and also deliver it to multiple people. They are also responsible for the security of your message until it reaches you.
There are a lot of tools on the market used to produce great documentation; Swagger, Postman, Doxygen, ApiDoc, and Document360, just to name a few. However, most developers remain unaware that tools developed for reconnaissance can be useful to developers as well.
mitmproxy2swagger
mitmweb is a component of the mitmproxy project, and it will serve to intercept the requests that will be channeled to the listener port opened at 8080.
Next, we'll need to configure the request source, for which we'll use Postman.
Next, click on the gear icon at the top right corner of the Postman interface to access the settings.
In the settings pop-up, select Proxy and then toggle "Use custom proxy configuration". Here we'll add the proxy listener port so that Postman can channel all requests through our custom proxy from mitmproxy.
·muriithigakuru.hashnode.dev·
Reverse Engineer an API using MITMWEB and POSTMAN and create a Swagger file (crAPI)
Many times when we are trying to pentest an API, we might not get access to the Swagger file or the documentation of the API. Today we will try to create the Swagger file using mitmweb and Postman.
Man in the Middle Proxy (MITMweb)
Run mitmweb through the command line in Kali.
As we can see, it starts to listen on port 8080 for HTTP/HTTPS traffic, and we will make sure that it's running by navigating to the above address, which is localhost at port 8081.
Then we will proxy our traffic through the Burp Suite proxy port, 8080, because we already have mitmweb listening on this port (make sure Burp is closed).
Then we will stop the capture and use mitmproxy2swagger to analyse it.
·medium.com·
Reverse engineering a Web API
Introduction
Most websites or web services have an API in the backend that delivers requested data to its frontend. This can be anything from the Google Search API to delivering a message on Discord. Some people in the gaming community scan a game’s username database for certain available special names, like 3 letter names, to register them. I’ve been asked to write a tool to automate that. To do that I had to reverse engineer the R6DB API. I then could use that API to check for available usernames programmatically. This API has shut down since, likely due to abuse. The method I’m going to show also works on Electron Apps such as Discord by bringing up the DevTools. For any other app, you can use something like Fiddler to intercept the web requests.
·vollragm.github.io·
Package-Wide Variables/Cache in R Packages | R-bloggers
It’s often beneficial to have a variable shared between all the functions in an R package. One obvious example would be the maintenance of a package-wide cache for all of your functions. I’ve encountered this situation multiple times and always forget at least one important step in the process, so I thought I’d document it [...]
·r-bloggers.com·
Agent Protocol
Agent Protocol - The open source communication protocol for AI agents.
·agentprotocol.ai·
Simple Arrays
Provides a toolkit for manipulating arrays in a consistent, powerful, and intuitive manner through the use of broadcasting and a new array class, the rray.
·rray.r-lib.org·
DevContainer.ai
Generate Custom Dev Containers in Seconds with AI
·devcontainer.ai·
Spider: The Web Crawler for AI
Experience cutting-edge web crawling with unparalleled speeds, perfect for LLMs, Machine Learning, and Artificial Intelligence. The fastest and most efficient web scraper tailored for AI applications.
·spider.cloud·
R Coding Style Best Practices - Datanovia
This article describes the essentials of R coding style best practices. It’s based on the tidyverse style guide. Google’s current guide is also derived from the tidyverse style guide. […]
·datanovia.com·
Chatgpt R-programming Prompts • PromptDen
Explore a curated collection of thought-provoking chatbot prompts designed for R-programming enthusiasts. Ignite your coding creativity!
·promptden.com·