Found 141 bookmarks
Newest
UNCHARTED DATA: Using Crosstalk to Add User-Interactivity
UNCHARTED DATA: Using Crosstalk to Add User-Interactivity
Linking an interactive plot and table together with the crosstalk package.
Using Crosstalk to Add User-Interactivity
The goal is to link the reactable table I created to a plotly chart and provide additional filter options that control both the table and the chart.
An important note: in order to use crosstalk, you must create a shared dataset and call that dataset within both plotly and reactable. Otherwise, your dataset will not communicate and filter with eachother. The code to do this is SharedData$new(dataset).
If you expand the code below, you’ll see that the code to build a table in reactable is quite extensive. I will not go into the details in this post, but do recommend a couple great tutorials that I used to create the interactive table such as this tutorial from Greg Lin, and this from Tom Mock which really helped me understand how to use CSS and Google fonts to enhance the visual appeal of the table (see the “Additional CSS Used for Table” section below for more info).
If you have ever built something in Shiny before, you’ll notice that the crosstalk filters are very similar. You can add a filter to any existing column in the dataset. As you can see in the code below, I used a mixture of filter_checkbox and filter_select depending on how many unique options were available in the column you’re filtering. My rule of thumb is if there are more than five options to choose from it’s probably better to put them into a list in filter_select like I did with the Division filtering as to not take up too much space on the page.
For the layout of the data visualization, I used bscols to place the crosstalk filters side-by-side with the interactive plotly chart. I then placed the reactable table underneath and added a legend to the table using tags from the htmltools package. The final result is shown below. Feel free to click around and the filters and you will notice that both the plot and the table will filter accordingly. Another option is to drag and click on the plot and you will see the table underneath mimic the teams shown.
·uncharteddata.netlify.app·
UNCHARTED DATA: Using Crosstalk to Add User-Interactivity
HTTP resources and specifications - HTTP | MDN
HTTP resources and specifications - HTTP | MDN
HTTP was first specified in the early 1990s. Designed with extensibility in mind, it has seen numerous additions over the years; this lead to its specification being scattered through numerous specification documents (in the midst of experimental abandoned extensions). This page lists relevant resources about HTTP.
·developer.mozilla.org·
HTTP resources and specifications - HTTP | MDN
Best Practices
Best Practices
For API designers and writers wishing formalize their API in an OpenAPI Description document.
Keep a Single Source of Truth Regardless of your design approach (design-first or code-first) always keep a single source of truth, i.e., information should not be duplicated in different places. It is really the same concept used in programming, where repeated code should be moved to a common function.
Otherwise, eventually one of the places will be updated while the other won’t, leading to headaches… in the best of cases. For instance, it is also commonplace to use code annotations to generate an OpenAPI description and then commit the latter to source control while the former still lingers in the code. As a result, newcomers to the project will not know which one is actually in use and mistakes will be made. Alternatively, you can use a Continuous Integration test to ensure that the two sources stay consistent.
Add OpenAPI Descriptions to Source Control OpenAPI Descriptions are not just a documentation artifact: they are first-class source files which can drive a great number of automated processes, including boilerplate generation, unit testing and documentation rendering. As such, OADs should be committed to source control, and, in fact, they should be among the first files to be committed. From there, they should also participate in Continuous Integration processes.
Make the OpenAPI Descriptions Available to the Users Beautifully-rendered documentation can be very useful for the users of an API, but sometimes they might want to access the source OAD. For instance, to use tools to generate client code for them, or to build automated bindings for some language. Therefore, making the OAD available to the users is an added bonus for them. The documents that make up the OAD can even be made available through the same API to allow runtime discovery.
There is Seldom Need to Write OpenAPI Descriptions by Hand Since OADs are plain text documents, in an easy-to-read format (be it JSON or YAML), API designers are usually tempted to write them by hand. While there is nothing stopping you from doing this, and, in fact, hand-written API descriptions are usually the most terse and efficient, approaching any big project by such method is highly impractical. Instead, you should try the other existing creation methods and choose the one that better suits you and your team (No YAML or JSON knowledge needed!):
OpenAPI Editors: Be it text editors or GUI editors they usually take care of repetitive tasks, allow you to keep a library of reusable components and provide real-time preview of the generated documentation.
Domain-Specific Languages: As its name indicates, DSL’s are API description languages tailored to specific development fields. A tool is then used to produce the OpenAPI Description. A new language has to be learned, but, in return, extremely concise descriptions can be achieved.
Code Annotations: Most programming languages allow you to annotate the code, be it with specific syntax or with general code comments. These annotations, for example, can be used to extend a method signature with information regarding the API endpoint and HTTP method that lead to it. A tool can then parse the code annotations and generate OADs automatically. This method fits very nicely with the code-first approach, so keep in mind the first advice given at the top of this page when using it (Use a Design-First Approach)…
A Mix of All the Above: It’s perfectly possible to create the bulk of an OpenAPI Description using an editor or DSL and then hand-tune the resulting file. Just be aware of the second advice above (Keep a Single Source of Truth): Once you modify a file it becomes the source of truth and the previous one should be discarded (maybe keep it as backup, but out of the sight and reach of children and newcomers to the project).
Describing Large APIs
Do not repeat yourself (The DRY principle). If the same piece of YAML or JSON appears more than once in the document, it’s time to move it to the components section and reference it from other places using $ref (See Reusing Descriptions. Not only will the resulting document be smaller but it will also be much easier to maintain). Components can be referenced from other documents, so you can even reuse them across different API descriptions!
Split the description into several documents: Smaller files are easier to navigate, but too many of them are equally taxing. The key lies somewhere in the middle. A good rule of thumb is to use the natural hierarchy present in URLs to build your directory structure. For example, put all routes starting with /users (like /users and /users/{id}) in the same file (think of it as a “sub-API”). Bear in mind that some tools might have issues with large files, whereas some other tools might not handle too many files gracefully. The solution will have to take your toolkit into account.
Use tags to keep things organized: Tags have not been described in the Specification chapter, but they can help you arrange your operations and find them faster. A tag is simply a piece of metadata (a unique name and an optional description) that you can attach to operations. Tools, specially GUI editors, can then sort all your API’s operation by their tags to help you keep them organized.
Links to External Best Practices There’s quite a bit of literature about how to organize your API more efficiently. Make sure you check out how other people solved the same issues you are facing now! For example: The API Stylebook contains internal API Design Guidelines shared with the community by some well known companies and government agencies.
Best Practices This page contains general pieces of advice which do not strictly belong to the Specification Explained chapter because they are not directly tied to the OpenAPI Specification (OAS). However, they greatly simplify creating and maintaining OpenAPI Descriptions (OADs), so they are worth keeping in mind.
Use a Design-First Approach Traditionally, two main approaches exist when creating OADs: Code-first and Design-first. In the Code-first approach, the API is first implemented in code, and then its description is created from it, using code comments, code annotations or simply written from scratch. This approach does not require developers to learn another language so it is usually regarded as the easiest one. Conversely, in Design-first, the API description is written first and then the code follows. The first obvious advantages are that the code already has a skeleton upon which to build, and that some tools can provide boilerplate code automatically. There have been a number of heated debates over the relative merits of these two approaches but, in the opinion of the OpenAPI Initiative (OAI), the importance of using Design-first cannot be stressed strongly enough.
The reason is simple: The number of APIs that can be created in code is far superior to what can be described in OpenAPI. To emphasize: OpenAPI is not capable of describing every possible HTTP API, it has limitations. Therefore, unless these descriptive limitations are perfectly known and taken into account when coding the API, they will rear their ugly head later on when trying to create an OpenAPI description for it. At that point, the right fix will be to change the code so that it uses an API which can be actually described with OpenAPI (or switch to Design-first altogether). Sometimes, however, since it is late in the process, it will be preferred to twist the API description so that it matches more or less the actual API. It goes without saying that this leads to unintuitive and incomplete descriptions, that will rarely scale in the future. Finally, there exist a number of validation tools that can verify that the implemented code adheres to the OpenAPI description. Running these tools as part of a Continuous Integration process allows changing the OpenAPI Description with peace of mind, since deviations in the code behavior will be promptly detected. Bottom line: OpenAPI opens the door to a wealth of automated tools. Make sure you use them!
·learn.openapis.org·
Best Practices
SpeCrawler: Generating OpenAPI Specifications from API Documentation Using Large Language Models
SpeCrawler: Generating OpenAPI Specifications from API Documentation Using Large Language Models
In the digital era, the widespread use of APIs is evident. However, scalable utilization of APIs poses a challenge due to structure divergence observed in online API documentation. This underscores the need for automat…
·ar5iv.labs.arxiv.org·
SpeCrawler: Generating OpenAPI Specifications from API Documentation Using Large Language Models
REST API in R with plumber
REST API in R with plumber
API and R Nowadays, it’s pretty much expected that software comes with an HTTP API interface. Every programming language out there offers a way to expose APIs or make GET/POST/PUT requests, including R. In this post, I’ll show you how to create an API using the plumber package. Plus, I’ll give you tips on how to make it more production ready - I’ll tackle scalability, statelessness, caching, and load balancing. You’ll even see how to consume your API with other tools like python, curl, and the R own httr package.
Nowadays, it’s pretty much expected that software comes with an HTTP API interface. Every programming language out there offers a way to expose APIs or make GET/POST/PUT requests, including R. In this post, I’ll show you how to create an API using the plumber package. Plus, I’ll give you tips on how to make it more production ready - I’ll tackle scalability, statelessness, caching, and load balancing. You’ll even see how to consume your API with other tools like python, curl, and the R own httr package
# When an API is started it might take some time to initialize # this function stops the main execution and wait until # plumber API is ready to take queries. wait_for_api <- function(log_path, timeout = 60, check_every = 1) { times <- timeout / check_every for(i in seq_len(times)) { Sys.sleep(check_every) if(any(grepl(readLines(log_path), pattern = "Running plumber API"))) { return(invisible()) } } stop("Waiting timed!") }
Oh, in some examples I am using redis. So, before you dive in, make sure to fire up a simple redis server. At the end of the script, I’ll be turning redis off, so you don’t want to be using it for anything else at the same time. I just want to remind you that this code isn’t meant to be run on a production server.
redis is launched in a background, , so you might want to wait a little bit to make sure it’s fully up and running before moving on.
wait_for_redis <- function(timeout = 60, check_every = 1) { times <- timeout / check_every for(i in seq_len(times)) { Sys.sleep(check_every) status <- suppressWarnings(system2("redis-cli", "PING", stdout = TRUE, stderr = TRUE) == "PONG") if(status) { return(invisible()) } } stop("Redis waiting timed!") }
First off, let’s talk about logging. I try to log as much as possible, especially in critical areas like database accesses, and interactions with other systems. This way, if there’s an issue in the future (and trust me, there will be), I should be able to diagnose the problem just by looking at the logs alone. Logging is like “print debugging” (putting print(“I am here”), print(“I am here 2”) everywhere), but done ahead of time. I always try to think about what information might be needed to make a correct diagnosis, so logging variable values is a must. The logger and glue packages are your best friends in that area.
Next, it might also be useful to add a unique request identifier ((I am doing that in setuuid filter)) to be able to track it across the whole pipeline (since a single request might be passed across many functions). You might also want to add some other identifiers, such as MACHINE_ID - your API might be deployed on many machines, so it could be helpful for diagnosing if the problem is associated with a specific instance or if it’s a global issue.
In general you shouldn’t worry too much about the size of the logs. Even if you generate ~10KB per request, it will take 100000 requests to generate 1GB. And for the plumber API, 100000 requests generated in a short time is A LOT. In such scenario you should look into other languages. And if you have that many requests, you probably have a budget for storing those logs:)
It might also be a good idea to setup some automatic system to monitor those logs (e.g. Amazon CloudWatch if you are on AWS). In my example I would definitely monitor Error when reading key from cache string. That would give me an indication of any ongoing problems with API cache.
Speaking of cache, you might use it to save a lot of resources. Caching is a very broad topic with many pitfalls (what to cache, stale cache, etc) so I won’t spend too much time on it, but you might want to read at least a little bit about it. In my example, I am using redis key-value store, which allows me to save the result for a given request, and if there is another requests that asks for the same data, I can read it from redis much faster.
Note that you could use memoise package to achieve similar thing using R only. However, redis might be useful when you are using multiple workers. Then, one cached request becomes available for all other R processes. But if you need to deploy just one process, memoise is fine, and it does not introduce another dependency - which is always a plus.
info <- function(req, ...) { do.call( log_info, c( list("MachineId: {MACHINE_ID}, ReqId: {req$request_id}"), list(...), .sep = ", " ), envir = parent.frame(1) ) }
#* Log some information about the incoming request #* https://www.rplumber.io/articles/routing-and-input.html - this is a must read! #* @filter setuuid function(req) { req$request_id <- UUIDgenerate(n = 1) plumber::forward() }
#* Log some information about the incoming request #* @filter logger function(req) { if(!grepl(req$PATH_INFO, pattern = "PATH_INFO")) { info( req, "REQUEST_METHOD: {req$REQUEST_METHOD}", "PATH_INFO: {req$PATH_INFO}", "HTTP_USER_AGENT: {req$HTTP_USER_AGENT}", "REMOTE_ADDR: {req$REMOTE_ADDR}" ) } plumber::forward() }
To run the API in background, one additional file is needed. Here I am creating it using a simple bash script.
library(plumber) library(optparse) library(uuid) library(logger) MACHINE_ID <- "MAIN_1" PORT_NUMBER <- 8761 log_level(logger::TRACE) pr("tmp/api_v1.R") %>% pr_run(port = PORT_NUMBER)
·zstat.pl·
REST API in R with plumber
Smarter Single Page Application with a REST API
Smarter Single Page Application with a REST API
How can you build a Single Page Application with a REST API that doesn't have a ton of business logic in the client? Use Hypermedia!
When the Browser is the client consuming HTML, it understands how to render HTML. HTML has a specification. The browser understands how to handle a <form> tag or a <button>. It was driven by the HTML at runtime.
How can you build a smarter Single Page Application with a REST API? The concepts have been since the beginning of the web, yet have somehow lost their way in modern REST API that drives a Single Page Application or Mobile Applications. Here’s how to guide clients based on state by moving more information from design time to runtime.
State If you’re developing more than a CRUD application, you’re likely going to be driven by the state of the system. Apps that have Task Based UIs (hint: go read my post on Decomposing CRUD to Task Based UIs) are guiding users down a path of actions they can perform based on the state of the system. The example throughout this post is the concept of a Product in a warehouse. If we have a tasks that let’s someone mark a Product as no longer being available for sale or it being available for sale, these tasks can be driven by the state of the Product. If the given UI task is “Mark as Available” then the Product must be currently unavailable and we have a quantity on hand that’s greater than zero
History of Clients Taking a step back a bit, web apps were developed initially with just plain HTML (over 20 years ago for me). In its most basic form, a static HTML page contained a <form> that the browser rendered for the user to fill out and submit. The form’s action would point to a URI usually to a script, often written in Perl, in the cgi-bin folder on the webserver. The script would take the form data (sent via POST from the browser) and insert it into a database, send an email, or whatever the required behavior was. As web apps progressed, instead of the HTML being in a static file, it was dynamically created by the server. But it was still just plain HTML. The browser was the client. HTML was the content it’s consuming.
Modern Clients As web apps progressed with AJAX (XMLHttpRequest) instead of using HTML forms, Javascript was used to send the HTTP request. The browsers turned more into the Host of the application which was written in Javascript. Now, Javascript is the client. JSON is the content it’s consuming.
Runtime vs Design Time When the Browser is the client consuming HTML, it understands how to render HTML. HTML has a specification. The browser understands how to handle a <form> tag or a <button>. It was driven by the HTML at runtime.
In modern SPAs consuming JSON, the data itself is unstructured. Each client has to be created uniquely based on the content it receives. This has to be developed at design time when creating the javascript client.
When developing a SPA, you may leverage something like OpenAPI to generate code to use in the SPA/clients to make the HTTP calls to the server. But you must understand as a developer, at design time (when developing) when to make a call to the server.
To use my earlier example of making a product available for sale, if you were developing a server-side rendered HTML web app, you wouldn’t return the form apart of the HTML if the product couldn’t be made available. You would do this because on the server you have the state of the product (fetched from the database). If you’re creating a SPA, you’re likely putting that same logic in your client so you can conditionally show UI elements. It wouldn’t be a great experience for the user to be able to perform an action, then see an error message because the server/api threw a 400 because the product is not in a state to allow it to be available.
Hypermedia Hypermedia is what is used in HTML to tell the Browser what it can do. As I mentioned earlier, a <form> is a hypermedia control.
HTTP APIs The vast majority of modern HTTP APIs serving JSON, do not provide any information in the content (JSON) about what actions or other resources the consuming client (SPA) can take. Meaning, we provide no information at runtime. All of that has to be figured out at design time.
You will still need to know (via OpenAPI) at design time, all the information about the routes you will be calling, and their results, however, you can now have the server return JSON that can guide the client based on state.
·codeopinion.com·
Smarter Single Page Application with a REST API