Archives (2008 - 2023)

1501 bookmarks
Vagrant - Welcome

Vagrant is a tool for building and distributing virtualized development environments.

By providing automated creation and provisioning of virtual machines using Oracle’s VirtualBox, Vagrant provides the tools to create and configure lightweight, reproducible, and portable virtual environments.

Apdex.org

Apdex is a numerical measure of user satisfaction with the performance of enterprise applications. It converts many measurements into one number on a uniform scale of 0-to-1 (0 = no users satisfied, 1 = all users satisfied). This metric can be applied to any source of end-user performance measurements. If you have a measurement tool that gathers timing data similar to what a motivated end-user could gather with a stopwatch, then you can use this metric. Apdex fills the gap between timing data and insight by specifying a uniform way to measure and report on the user experience.

The index translates many individual response times, measured at the user-task level, into a single number. A Task is an individual interaction with the system, within a larger process. Task response time is defined as the elapsed time between when a user does something (clicks the mouse, hits enter or return, etc.) and when the system (client, network, servers) responds such that the user can proceed with the process. This is the time during which the human is waiting for the system. These individual waiting periods are what define the "responsiveness" of the application to the user.
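
As a rough illustration of how many response times collapse into one number, here is a minimal Python sketch. It assumes the usual Apdex definition, which the description above does not spell out: a target threshold T, samples at or below T counted as satisfied, samples between T and 4T counted as tolerating at half weight, and everything slower counted as frustrated. The function name and sample values are made up for the example.

```python
def apdex(response_times, threshold):
    """Return an Apdex-style score in [0, 1] for response times in seconds.

    Assumed definition: satisfied if t <= T, tolerating if T < t <= 4T
    (counted at half weight), frustrated otherwise (counted as zero).
    """
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical task response times, with a 0.5 second target.
samples = [0.2, 0.4, 0.9, 1.5, 3.0, 6.0]
score = apdex(samples, threshold=0.5)
print(score)  # 2 satisfied, 2 tolerating, 2 frustrated -> 0.5
```

A score of 1 means every sampled task finished within the target, and 0 means every one was slower than four times the target.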

OpenTSDB - A Distributed, Scalable Monitoring System

OpenTSDB is a distributed, scalable Time Series Database (TSDB) written on top of HBase. OpenTSDB was written to address a common need: store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable. Thanks to HBase's scalability, OpenTSDB allows you to collect many thousands of metrics from thousands of hosts and applications, at a high rate (every few seconds). OpenTSDB will never delete or downsample data and can easily store billions of data points. As a matter of fact, StumbleUpon uses it to keep track of hundreds of thousands of time series and collects over 100 million data points per day in their main production cluster.

Imagine having the ability to quickly plot a graph showing the number of active worker threads in your web servers, the number of threads used by your database, and correlate this with your service's latency (example below). OpenTSDB makes generating such graphs on the fly a trivial operation, while manipulating millions of data points for very fine-grained, real-time monitoring.
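
To make the "collect metrics from hosts" idea concrete, the sketch below formats one data point in OpenTSDB's line-based text protocol (`put <metric> <timestamp> <value> <tagk=tagv ...>`). The helper name, the metric name and the host tag are hypothetical, and actually sending the line to a TSD over the network is left out.

```python
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one 'put' line: put <metric> <unix_ts> <value> <tagk=tagv ...>."""
    if not tags:
        # OpenTSDB expects every data point to carry at least one tag.
        raise ValueError("at least one tag is required")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    tag_str = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"put {metric} {ts} {value} {tag_str}"


# Hypothetical metric: active worker threads on one web server.
line = format_put("http.worker.threads.active", 42, {"host": "web01"},
                  timestamp=1288900000)
print(line)  # put http.worker.threads.active 1288900000 42 host=web01
```

Tags such as `host` are what let a later query group or filter the same metric across many machines.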

What and How to Measure Performance « Nick Gerner

Last week I wrote about performance testing Open Site Explorer. But I didn’t write much about how and why to collect the relevant data. In this post I’ll write about the tools I use to collect performance data, how I aggregate it, and a little bit about what those data tell us. This advice applies equally well when running a performance test or during normal production operations of any web application.

I collect three kinds of data:

system performance characteristics; client-side, perceived performance; server-side errors and per-request details

3.4 million page views per day, 92 M per month, one server and Drupal! | DrupalCamp Toronto 2010

In this talk, Khalid of 2bits.com, Inc. will talk about how to scale a Drupal web site with the following statistics:

3.4 million pages per day peak; 92 million page views per month; 189,650 page views per hour peak; 840,000 visits on peak day; 22.96 million visits per month; 52,747 visits per hour peak.

So far, this is the highest traffic we have heard of for a Drupal site.

What is amazing is that this web site runs on a single mid range server ...

We will discuss how to tune the LAMP stack for optimal performance; how to make Drupal performant, yet keep things simple and maintainable; how to monitor the entire hardware and software stack; and lessons learned, do's and don'ts.

redisql - Project Hosting on Google Code

Redisql is a lightweight SQL server built on top of the NoSQL datastore redis. It supports redis data structures and redis commands, and supports (de)normalisation of these data structures (lists, sets, hash tables) to/from SQL tables. Redisql can also easily import/export tables to/from MySQL for data warehousing. Redisql is not only a data storage Swiss Army knife, it is also extremely fast and extremely memory efficient.

Speed is achieved by being an event-driven network server that stores ALL data in RAM and achieves disk persistence by using a spare CPU core to periodically log data changes (i.e. no threads, no locks, no undo logs, serving data over a network at RAM speed).

Storage data structures with very low memory overhead and data compression, via algorithms with insignificant performance hits, greatly increase the amount of data you can fit in RAM.

Your hard disk's swap is utilised when your data can no longer fit in RAM. In this mode, performance is not negatively affected if rarely used data sits idle in swap. Redisql can use 100% of your RAM for storage and still provide disk persistence.

Optimising for the SQL statements most commonly used in OLTP workloads yields a lightweight SQL server designed for low latency at high concurrency (i.e. mind-blowing speed).

mihasya's ishmael at master - GitHub

This is a simple UI to put on top of the data that mk-query-digest outputs. It lets you browse the query report in a more readable fashion. The aim is to display all the information from the report in a readable, navigable way. This tool does not add anything to the mk-query-digest utility itself. It simply displays the data that the utility generates.

Bugfixes without Tests are Anti-fixes — Agile Web Development & Operations

A bugfix without a test is an anti-fix. You heard me – right up there next to the anti-christ himself. After committing the bugfix, the developer thinks they’re ‘Done’ when in reality they’ve just introduced a new bug (and more complexity) into the system.

Bugs are incredibly interesting facts. They are indicative of that rare species – source code that is actually used (remember the Urban Myth that only 20% of your source code is actually used on a daily basis?). If a customer has taken the time to try and get something done with your application, the least you can do is write tests for any bugs they happened to come across. The test is your unspoken agreement with the end-user that this particular bug won’t happen again.
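
One way to read "write tests for any bugs they happened to come across" is as a regression test that reproduces the reported input exactly. The sketch below is purely illustrative: `parse_price` and the imagined bug (a thousands separator making the parse fail) are hypothetical and do not come from the article.

```python
import unittest


def parse_price(text):
    """Hypothetical function after the fix: convert '1,299.50' to 1299.5.

    The imagined bug was that commas were not stripped, so float() raised
    ValueError on any price with a thousands separator.
    """
    return float(text.replace(",", ""))


class ParsePriceRegressionTest(unittest.TestCase):
    def test_thousands_separator(self):
        # The exact input from the imagined bug report: fails before the fix.
        self.assertEqual(parse_price("1,299.50"), 1299.5)

    def test_plain_number_still_works(self):
        # Guards against the fix breaking the case that already worked.
        self.assertEqual(parse_price("42"), 42.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePriceRegressionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Committing the test together with the fix is what keeps that particular bug from quietly returning.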

Yelp's Tron at master - GitHub
Tron is a centralized system for managing periodic batch processes and services across a cluster. If you find cron or fcron to be insufficient for managing complex workflows across multiple computers, Tron might be for you.
InfoQ: Hiring for an Agile Team
Thus, hiring technical people for an Agile team and being hired can be difficult, no matter what the economy is doing. The key lies in identifying a sound process to get the most compatible people on board. People should be hired for talent rather than specific skills that they possess at that particular time.
SystemTap

SystemTap provides free software (GPL) infrastructure to simplify the gathering of information about the running Linux system. This assists diagnosis of a performance or functional problem. SystemTap eliminates the need for the developer to go through the tedious and disruptive instrument, recompile, install, and reboot sequence that may be otherwise required to collect data.

SystemTap provides a simple command line interface and scripting language for writing instrumentation for a live running kernel. We are publishing samples, as well as enlarging the internal "tapset" script library to aid reuse and abstraction.

Among other tracing/probing tools, SystemTap is the tool of choice for complex tasks that may require live analysis, programmable on-line response, and whole-system symbolic access. SystemTap can also handle simple tracing jobs.

MCollective - Overview

The Marionette Collective, aka mcollective, is a framework for building server orchestration or parallel job execution systems.

MCollective’s primary use is to programmatically execute actions on clusters of servers. In this regard it operates in the same space as tools like Func, Fabric or Capistrano.

Because it does not rely on central inventories or tools like SSH, it is more than a fancy SSH “for loop”. MCollective uses modern tools like publish/subscribe middleware and modern philosophies like real-time discovery of network resources using metadata rather than hostnames, delivering a very scalable and very fast parallel execution environment.

The focus is on catering to the needs of enterprises and large deploys. Pluggable authentication, authorization and auditing capabilities set it apart from other tools in this space.

Capistrano Multi Stage Instructions - Box Vault

Features provided in the capistrano-ext gem allow you to set up multiple environments within your deploy.rb file, giving you the ability to run commands such as:

    cap development deploy:migrations

or

    cap production deploy:migrations

and have your application deploy to the different environments properly. This document provides instructions on how to accomplish this.

GNU Parallel - build and execute command lines from standard input in parallel - Accueil [Savannah]
GNU parallel is a shell tool for executing jobs in parallel locally or using remote computers. A job is typically a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables.
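
The "one job per input line" model can be sketched with the Python standard library. This is only an analogy for the idea, not GNU parallel's own interface: the helper name, the host names and the job are all hypothetical, and it runs the jobs in local threads only.

```python
from concurrent.futures import ThreadPoolExecutor


def run_per_line(job, lines, max_workers=4):
    """Apply job to every non-empty input line concurrently.

    Results come back in input order, even if jobs finish out of order.
    """
    items = [line.strip() for line in lines if line.strip()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(job, items))


# Hypothetical input: a list of hosts, one per line, with a blank line mixed in.
hosts = ["web01\n", "web02\n", "\n", "db01\n"]
results = run_per_line(lambda host: f"checked {host}", hosts)
print(results)  # ['checked web01', 'checked web02', 'checked db01']
```
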
aost - Project Hosting on Google Code
The Tellurium Automated Testing Framework (Tellurium) is a UI module-based automated testing framework for web applications. The UI module is a collection of UI elements you group together. Usually, the UI module represents a composite UI object in the format of nested basic UI elements.
Behat - BDD in PHP

Behat is an open source behavior driven development framework for PHP 5.3.

Behat was inspired by Ruby's Cucumber project, and especially its syntax part (Gherkin). It tries to be like Cucumber in its input (feature files) and output (console formatters), but at its core it is built from the ground up on the shoulders of giants: the Symfony Dependency Injection component; the Symfony Event Dispatcher component; the Symfony Console component; the Symfony Finder component.

Unlike other PHP testing frameworks, which test applications inside out, Behat tests applications outside in. This means that Behat works only with your application's input/output. If you want to test your models, use a unit testing framework instead; Behat was created for behavior testing (but can be used for anything).

Also, there's a symfony plugin for Behat, so you can start testing your applications right now.

Celery - The Distributed Task Queue

Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well.

The execution units, called tasks, are executed concurrently on one or more worker servers. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).

Celery is already used in production to process millions of tasks a day.

Celery is written in Python, but the protocol can be implemented in any language. It can also operate with other languages using webhooks.

Ganglia Monitoring System
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.