No Clocks

No Clocks

2764 bookmarks
Newest
LLM Beyond its Core Capabilities as AI Assistants or Agents
LLM Beyond its Core Capabilities as AI Assistants or Agents
Transform your LLM as helpful assistants with function calling
Both OpenAI programing guide and Anyscale Endpoints blog [7] distill down to simple steps: Call the model with the user query and a list of functions defined in the Chat Completions API parameter as tools. The model can choose to call one or more functions; if so, the content will be a stringified JSON object adhering to your custom schema. Parse the string into JSON in your code, and call your function with the provided arguments if they exist. Call the model again by appending the function response as a new message, and let the model summarize the results back to the user. Following the above simple steps, our user_content to the LLM generates three required parameters (location, latitude, longitude) as a JSON object in its response.
Examples and Use Cases of Function Calling in LLM
Apart from the above use cases mentioned in the Open AI programming guide [10], Ben Lorica visually and comprehensively captures use cases of general function calling in LLMs, including the OpenAI Assistant Tools API [11]. Lorica succinctly states that early use cases include applications such as customer service chatbots, data analysis assistants, and code generation tools. Other examples extend to creative, logistical, and operational domains: writing assistants, scheduling agents, summarizing news., etc.
·ai.gopubby.com·
LLM Beyond its Core Capabilities as AI Assistants or Agents
OpenAI Platform - Assistants API
OpenAI Platform - Assistants API
Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.
An Assistant represents an entity that can be configured to respond to a user's messages using several parameters like model, instructions, and tools.
·platform.openai.com·
OpenAI Platform - Assistants API
Prompt and empower your LLM, the tidy way
Prompt and empower your LLM, the tidy way
The tidyprompt package allows users to prompt and empower their large language models (LLMs) in a tidy way. It provides a framework to construct LLM prompts using tidyverse-inspired piping syntax, with a library of pre-built prompt wrappers and the option to build custom ones. Additionally, it supports structured LLM output extraction and validation, with automatic feedback and retries if necessary. Moreover, it enables specific LLM reasoning modes, autonomous R function calling for LLMs, and compatibility with any LLM provider.
·tjarkvandemerwe.github.io·
Prompt and empower your LLM, the tidy way
REST API in R with plumber
REST API in R with plumber
API and R Nowadays, it’s pretty much expected that software comes with an HTTP API interface. Every programming language out there offers a way to expose APIs or make GET/POST/PUT requests, including R. In this post, I’ll show you how to create an API using the plumber package. Plus, I’ll give you tips on how to make it more production ready - I’ll tackle scalability, statelessness, caching, and load balancing. You’ll even see how to consume your API with other tools like python, curl, and the R own httr package.
Nowadays, it’s pretty much expected that software comes with an HTTP API interface. Every programming language out there offers a way to expose APIs or make GET/POST/PUT requests, including R. In this post, I’ll show you how to create an API using the plumber package. Plus, I’ll give you tips on how to make it more production ready - I’ll tackle scalability, statelessness, caching, and load balancing. You’ll even see how to consume your API with other tools like python, curl, and the R own httr package
# When an API is started it might take some time to initialize # this function stops the main execution and wait until # plumber API is ready to take queries. wait_for_api <- function(log_path, timeout = 60, check_every = 1) { times <- timeout / check_every for(i in seq_len(times)) { Sys.sleep(check_every) if(any(grepl(readLines(log_path), pattern = "Running plumber API"))) { return(invisible()) } } stop("Waiting timed!") }
Oh, in some examples I am using redis. So, before you dive in, make sure to fire up a simple redis server. At the end of the script, I’ll be turning redis off, so you don’t want to be using it for anything else at the same time. I just want to remind you that this code isn’t meant to be run on a production server.
redis is launched in a background, , so you might want to wait a little bit to make sure it’s fully up and running before moving on.
wait_for_redis <- function(timeout = 60, check_every = 1) { times <- timeout / check_every for(i in seq_len(times)) { Sys.sleep(check_every) status <- suppressWarnings(system2("redis-cli", "PING", stdout = TRUE, stderr = TRUE) == "PONG") if(status) { return(invisible()) } } stop("Redis waiting timed!") }
First off, let’s talk about logging. I try to log as much as possible, especially in critical areas like database accesses, and interactions with other systems. This way, if there’s an issue in the future (and trust me, there will be), I should be able to diagnose the problem just by looking at the logs alone. Logging is like “print debugging” (putting print(“I am here”), print(“I am here 2”) everywhere), but done ahead of time. I always try to think about what information might be needed to make a correct diagnosis, so logging variable values is a must. The logger and glue packages are your best friends in that area.
Next, it might also be useful to add a unique request identifier ((I am doing that in setuuid filter)) to be able to track it across the whole pipeline (since a single request might be passed across many functions). You might also want to add some other identifiers, such as MACHINE_ID - your API might be deployed on many machines, so it could be helpful for diagnosing if the problem is associated with a specific instance or if it’s a global issue.
In general you shouldn’t worry too much about the size of the logs. Even if you generate ~10KB per request, it will take 100000 requests to generate 1GB. And for the plumber API, 100000 requests generated in a short time is A LOT. In such scenario you should look into other languages. And if you have that many requests, you probably have a budget for storing those logs:)
It might also be a good idea to setup some automatic system to monitor those logs (e.g. Amazon CloudWatch if you are on AWS). In my example I would definitely monitor Error when reading key from cache string. That would give me an indication of any ongoing problems with API cache.
Speaking of cache, you might use it to save a lot of resources. Caching is a very broad topic with many pitfalls (what to cache, stale cache, etc) so I won’t spend too much time on it, but you might want to read at least a little bit about it. In my example, I am using redis key-value store, which allows me to save the result for a given request, and if there is another requests that asks for the same data, I can read it from redis much faster.
Note that you could use memoise package to achieve similar thing using R only. However, redis might be useful when you are using multiple workers. Then, one cached request becomes available for all other R processes. But if you need to deploy just one process, memoise is fine, and it does not introduce another dependency - which is always a plus.
info <- function(req, ...) { do.call( log_info, c( list("MachineId: {MACHINE_ID}, ReqId: {req$request_id}"), list(...), .sep = ", " ), envir = parent.frame(1) ) }
#* Log some information about the incoming request #* https://www.rplumber.io/articles/routing-and-input.html - this is a must read! #* @filter setuuid function(req) { req$request_id <- UUIDgenerate(n = 1) plumber::forward() }
#* Log some information about the incoming request #* @filter logger function(req) { if(!grepl(req$PATH_INFO, pattern = "PATH_INFO")) { info( req, "REQUEST_METHOD: {req$REQUEST_METHOD}", "PATH_INFO: {req$PATH_INFO}", "HTTP_USER_AGENT: {req$HTTP_USER_AGENT}", "REMOTE_ADDR: {req$REMOTE_ADDR}" ) } plumber::forward() }
To run the API in background, one additional file is needed. Here I am creating it using a simple bash script.
library(plumber) library(optparse) library(uuid) library(logger) MACHINE_ID <- "MAIN_1" PORT_NUMBER <- 8761 log_level(logger::TRACE) pr("tmp/api_v1.R") %>% pr_run(port = PORT_NUMBER)
·zstat.pl·
REST API in R with plumber
SPA Mode | Remix
SPA Mode | Remix
From the beginning, Remix's opinion has always been that you own your server architecture. This is why Remix is built on top of the Web Fetch API and can run on any modern runtime via built-in or community-provided adapters. While we believe that having a server provides the best UX/Performance/SEO/etc. for most apps, it is also undeniable that there exist plenty of valid use cases for a Single Page Application in the real world:
SPA Mode is basically what you'd get if you had your own React Router + Vite setup using createBrowserRouter/RouterProvider, but along with some extra Remix goodies: File-based routing (or config-based via routes()) Automatic route-based code-splitting via route.lazy <Link prefetch> support to eagerly prefetch route modules <head> management via Remix <Meta>/<Links> APIs SPA Mode tells Remix that you do not plan on running a Remix server at runtime and that you wish to generate a static index.html file at build time and you will only use Client Data APIs for data loading and mutations. The index.html is generated from the HydrateFallback component in your root.tsx route. The initial "render" to generate the index.html will not include any routes deeper than root. This ensures that the index.html file can be served/hydrated for paths beyond / (i.e., /about) if you configure your CDN/server to do so.
·remix.run·
SPA Mode | Remix
RESTful API Design Best Practices Guide 2024
RESTful API Design Best Practices Guide 2024
Guide to RESTful API design best practices in 2024 covering resource-based architecture, stateless communication, client-server separation, URI design, HTTP method usage, security, performance optimization, and more.
·daily.dev·
RESTful API Design Best Practices Guide 2024
Godspeed Systems
Godspeed Systems
The benefits of schema driven development and single source of truth for microservices or API or event systems w.r.t productivity, maintainability & agility
This article focuses on Schema Driven Development (SDD) and Single Source of Truth (STT) paradigms as two first principles every team must follow. It is an essential read for CTOs, tech leaders and every aspiring 10X engineer out there. While I will touch on SDD mainly, I will talk in brief also about the 8 practices I believe are essential, and why we need them. Later in the blog you will see pratical examples of SDD and STT with screenshots and code snippets as applicable.
What is Schema Driven Development? SDD is about using a single schema definition as the single source of truth, and letting that determine or generate everything else that depends on the schema. For ex. generating CRUD APIs for multiple kinds of event sources and protocols, doing input/output validations in producer and consumer, generating API documentation & Postman collection, starting mock servers and parallel development, generating basic test cases. And as well - sigh of relief that changing in one place will reflect change everywhere else automatically (Single Source of Truth)
SDD helps to speedily kickstart and smoothly manage parallel development across teams, without writing a single custom line of code by hand . It is not only useful for kickstarting the project, but also seamlessly upgrading along with the source schema updates. For ex. If you have a database schema, you can generate CRUD API, Swagger, Postman, Test cases, Graphql API, API clients etc. from the source database schema. Can you imagine the effort and errors saved in this approach? Hint: I once worked in a team of three backend engineers who for three months, only wrote CRUD APIs, validations, documentation and didn't get time to write test cases. We used to share Postman collection over emails.
What are the signs that your team doesn't use SDD? Such teams don't have an "official" source schema. They manually create and manage dependent schemas, APIs, test cases, documentation, API clients etc. as independent activities (while they should be dependent on the source schema). For ex. They handcraft Postman collections and share over email. They handcraft the CRUD APIs for their Graphql, REST, gRpc services.
In this approach you will have Multiple sources of Truth (your DB schema, the user.schema.js file maintained separately, the Express routes & middlewares maintained separately, the Swagger and Postman collections maintained separately, the test cases maintained separately and the API clients created separately. So much redundant effort and increased chances of mistakes! Coupling of schema with code, with event sources setup (Express, Graphql etc). Non-reusability of the effort already done. Lack of standardisation and maintainability - Also every developer may implement this differently based on their style or preference of coding. This means more chaos, inefficiencies and mistakes! And also difficulty to switch between developers.
You will be Writing repetitive validation code in your event source controllers, middleware and clients Creating boilerplatefor authentication & authorisation Manually creating Swagger specs & Postman collection (and maintaining often varying versions across developers and teams, shared across emails) Manually creating CRUD APIs (for database access) Manually writing integration test cases Manually creating API clients
Whether we listen on (sync or async) events, query a database, call an API or return data from our sync event calls (http, graphql, grpc etc) - in all such cases you will be witnessing Redundant effort in maintaining SST derivatives & shipping upgrades Gaps in API, documentation, test cases, client versions Increased work means increase in the probability of errors by 10X Increased work means increased areas to look into when errors happen (like finding needle in haystack) - Imagine wrong data flowing from one microservice to another, and breaking things across a distributed system! You would need to look across all to identify the source of error.
When not following SST, there is no real source of truth This means whenever a particular API has a new field or changed schema, we need to make manual change in five places - service, client(s), service, swagger, postman collection, integration test cases. What if the developer forgets to update the shared Postman collection? Or write validation for the new field in the APIs? Do you now see how versions and shared API collections can often get out of sync without a single source of truth? Can you imagine the risk, chaos, bugs and inefficiencies this can now bring? Before we resume back to studying more about SDD and SST, lets have a quick detour to first understand some basic best practices which I believe are critically important for tech orgs, and why they are important?
The 8 best practices In upcoming articles we will touch upon these 8 best practices. Schema Driven Development & Single Source of Truth (topic of this post) Configure Over Code Security & compliance Decoupled (Modular) Architecture Shift Left Approach Essential coding practices Efficient SDLC: Issue management, documentation, test automation, code reviews, productivity measurement, source control and version management Observability for fast resolution
Why should you care about ensuring best practices? As a tech leader should your main focus be limited to hustling together an MVP and taking it to market? But MVP is just a small first step of a long journey. This journey includes multiple iterations for finding PMF, and then growth, optimisation and sustainability. There you face dynamic and unpredictable situations like changing teams, customer needs, new compliance, new competition etc. Given this, should you lay your foundation keeping in mind the future as well? For ex. maintainability, agility, quality, democratisation & avoiding risks?
·godspeed.systems·
Godspeed Systems
Schema-driven development in 2021 - 99designs
Schema-driven development in 2021 - 99designs
Schema-driven development is an important concept to know in 2021. What exactly is schema-driven development? What are the benefits of schema-driven development? We will explore the answers to these questions in this article.
·99designs.com·
Schema-driven development in 2021 - 99designs
API Documentation Using Hacker Tools Mitmproxy2swagger
API Documentation Using Hacker Tools Mitmproxy2swagger
Discover mitmproxy2swagger: A quick solution to generate API documentation, bridging the gap between backend and frontend teams effortlessly in just 2 mins
API documentation is a collection of references, tutorials, documents, or videos that help developers use your API governed by the Open API Specification(OAS). An API(Application programming interface) is a data-sharing technique that helps applications communicate with each other. Not the best definition in the world but I like to think of an API as a dynamic messenger. They can store your message, process it, and also deliver it to multiple people. They are also responsible for the security of your message until it reaches you.
There are a lot of tools in the market used to produce great documentation; Swagger, Postman, Doxygen, ApiDoc, and Document360 just to name a few. However, most developers remain oblivious to the tools developed for reconnaissance which when you interact with them are useful to developers as well.
mitmproxy2swagger
mitmweb is a component of the mitmproxy project and it will serve to intercept the requests that will be channeled to the listener port opened at 8080
Next, we'll need to configure the requests source for which we'll use Postman
Next, click on the gear icon at the top right corner of the postman interface to access the settings
On the settings pop up select proxy and then toggle use custom proxy configuration Here we'll add the proxy listener port so that Postman can channel all request through out custom proxy from mitmproxy
·muriithigakuru.hashnode.dev·
API Documentation Using Hacker Tools Mitmproxy2swagger
Reverse Engineer an API using MITMWEB and POSTMAN and create a Swagger file (crAPI)
Reverse Engineer an API using MITMWEB and POSTMAN and create a Swagger file (crAPI)
Many times when the we are trying to Pentest an API we might not get access to Swagger file or the documentations of the API, Today we will…
Many times when the we are trying to Pentest an API we might not get access to Swagger file or the documentations of the API, Today we will try to create the swagger file using Mitmweb and Postman.
Man in The Midlle Proxy (MITMweb)
run mitmweb through our command line in Kali
and as we can see it starts to listen on the port 8080 for http/https traffic, and we will make sure that its running by navigating to the above address which is the localhost at port 8081
and then we will proxy our traffic thorugh Burp Suite proxy port 8080 because we already has mitmweb listening for this port (make sure Burp is closed)
and then we will stop the capture and use mitmproxy2swagger to analyse it
·medium.com·
Reverse Engineer an API using MITMWEB and POSTMAN and create a Swagger file (crAPI)
Reverse engineering a Web API
Reverse engineering a Web API
Introduction Most websites or web services have an API in the backend that delivers requested data to its frontend. This can be anything from the Google Search API to delivering a message on Discord. Some people in the gaming community scan a game’s username database for certain available special names, like 3 letter names, to register them. I’ve been asked to write a tool to automate that. To do that I had to reverse engineer the R6DB API. I then could use that API to check for available usernames programmatically. This API has shut down since, likely due to abuse. The method I’m going to show also works on Electron Apps such as Discord by bringing up the DevTools. For any other app, you can use something like Fiddler to intercept the web requests.
·vollragm.github.io·
Reverse engineering a Web API
Package-Wide Variables/Cache in R Packages | R-bloggers
Package-Wide Variables/Cache in R Packages | R-bloggers
It’s often beneficial to have a variable shared between all the functions in an R package. One obvious example would be the maintenance of a package-wide cache for all of your functions. I’ve encountered this situation multiple times and always forget at least one important step in the process, so I thought I’d document it [...]
·r-bloggers.com·
Package-Wide Variables/Cache in R Packages | R-bloggers
Agent Protocol
Agent Protocol
Agent Protocol - The open source communication protocol for AI agents.
·agentprotocol.ai·
Agent Protocol
Simple Arrays
Simple Arrays
Provides a toolkit for manipulating arrays in a consistent, powerful, and intuitive manner through the use of broadcasting and a new array class, the rray.
·rray.r-lib.org·
Simple Arrays
DevContainer.ai
DevContainer.ai
Generate Custom Dev Containers in Seconds with AI
·devcontainer.ai·
DevContainer.ai