SRE

237 bookmarks

Three Terraform Mistakes, and How to Avoid Them

Learn about Terraform gotchas, and how to solve them, so that you will hopefully be spared utter despair and panic

#terraform #gcp

·awstip.com·Dec 2, 2022

indent-rainbow - Visual Studio Marketplace

Extension for Visual Studio Code - Makes indentation easier to read

·marketplace.visualstudio.com·Nov 30, 2022

Visualizing Multi Cloud IAM Concepts

AWS, Azure, and GCP IAM visualized, with key concepts

#iam #security #azure #aws #gcp

·julian-wieg.medium.com·Nov 29, 2022

emblem/docs/decisions at main · GoogleCloudPlatform/emblem · GitHub

💠 Emblem Giving is a sample application that demonstrates a serverless architecture with continuous delivery, and trouble recovery. - emblem/docs/decisions at main ·...

·github.com·Nov 28, 2022

How Complex Systems Fail

#systems #theory

·how.complexsystems.fail·Nov 27, 2022

Supporting Data Driven Change With SLOs

In Support of Change

#slo #sli

·medium.com·Nov 23, 2022

The Incident Retrospective Ground Rules | Honeycomb

Join Lex, SRE at Honeycomb, as he describes the incident retrospective process we abide by, and see why he was pleasantly surprised.

#incident_management

·honeycomb.io·Nov 23, 2022

Nóva (@nova@hachyderm.io)

SLA: We promise / SLO: We want / SLI: We have

#slo #sli #sla #poetry

·hachyderm.io·Nov 8, 2022

Dear Console,… - a collection of code snippets to use in the browser console

#javascript #dev_tools

·codepo8.github.io·Nov 6, 2022

mikaelvesavuori/dorametrix: Dorametrix is a serverless web service that helps you calculate your DORA metrics, by inferring your metrics from events you create with webhooks (or manually!).

Dorametrix is a serverless web service that helps you calculate your DORA metrics, by inferring your metrics from events you create with webhooks (or manually!). - mikaelvesavuori/dorametrix: Doram...

#dora

·github.com·Nov 7, 2022

Automate end to end processes and quickly respond to events with Datadog Workflows

Learn how to combine monitoring and workflow automation into a single, streamlined solution with Datadog Workflows.

·datadoghq.com·Nov 2, 2022

Gain visibility and control of your cloud spend with Datadog Cloud Cost Management | Datadog

Unlock visibility into the cloud costs of your teams. Empower engineers to optimize the cost of their services and adopt a culture of cost awareness.

#datadog #cost_management

·datadoghq.com·Nov 1, 2022

What are reasonable SLOs for Kafka? - Ops - Confluent Community

Opinions are my own… These depend on the SLAs you are supporting with your SLIs. But here are a couple of core ones. Controller count: must equal 1, else something is wrong. Under-replicated partitions: a count greater than one is normally an early warning that something is about to go pear-shaped. Depending on your setting for publish acks, this might mean that some publishers might also stop, if min ISR is less than required. Leader elections: these might happen due to...

#kafka #slo

·forum.confluent.io·Oct 26, 2022
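
The two core checks quoted in this excerpt can be sketched as a small Python function. This is only an illustration: the function name and its parameters are hypothetical, and the values would have to come from whatever monitoring system exposes the broker metrics. The thresholds are taken from the excerpt as quoted (controller count must equal 1, under-replicated partitions greater than one is a warning).

```python
def kafka_health_warnings(active_controller_count, under_replicated_partitions):
    """Return a list of warning strings for the two checks named in the excerpt.

    Both arguments are plain integers supplied by the caller; how they are
    collected is outside the scope of this sketch.
    """
    warnings = []
    # The excerpt says the controller count must equal 1, else something is wrong.
    if active_controller_count != 1:
        warnings.append(
            f"controller count is {active_controller_count}, expected exactly 1"
        )
    # The excerpt treats more than one under-replicated partition as an early warning.
    if under_replicated_partitions > 1:
        warnings.append(
            f"{under_replicated_partitions} under-replicated partitions"
        )
    return warnings


if __name__ == "__main__":
    # Hypothetical readings: one controller, three under-replicated partitions.
    for message in kafka_health_warnings(1, 3):
        print("WARNING:", message)
```

An empty list means neither condition fired; stricter thresholds may suit clusters where any under-replication is unacceptable.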

Seeing Like an SRE: Site Reliability Engineering as High Modernism

#sre #thought_piece

·usenix.org·Oct 25, 2022

Best Practices for Local File Parameters | Amazon Web Services

If you have ever passed the contents of a file to a parameter of the AWS CLI, you most likely did so using the file:// notation. By setting a parameter’s value as the file’s path prepended by file://, you can explicitly pass in the contents of a local file as input to a command: aws […]

#aws #aws_cli

·aws.amazon.com·Oct 25, 2022

OpenSLO/OpenSLO: Open specification for defining and expressing service level objectives (SLO)

Open specification for defining and expressing service level objectives (SLO) - OpenSLO/OpenSLO: Open specification for defining and expressing service level objectives (SLO)

#open_slo #specification

·github.com·Oct 25, 2022

Critical User Journeys | Payments Reseller Subscription API | Google Developers

#critical_user_journeys #slo

·developers.google.com·Oct 25, 2022

Google - Site Reliability Engineering

#google_sre_workbook #slo

·sre.google·Oct 25, 2022

Network Monitoring Software by ManageEngine OpManager

ManageEngine OpManager provides easy-to-use Network Monitoring Software that offers advanced Network & Server Performance Management. Download free trial now!

#p95 #95% #95pct #stats

·manageengine.com·Oct 25, 2022

What Does It Mean To Be In The 95Th Percentile? – Problem Solver X

#p95 #95% #95pct #stats

·problemsolverx.com·Oct 25, 2022

What is The 95th Percentile, And Why Does It Matter? – FirstWave

95% of your requests fall at or below this value; the other 5% are the times it is exceeded

#95pct #95% #p95 #stats

·firstwave.com·Oct 25, 2022
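
As a rough illustration of the note above, here is a minimal Python sketch of a 95th percentile using the nearest-rank method. That method is an assumption made for simplicity: the bookmarked article may define the percentile differently, and monitoring tools often interpolate instead. The function name and the sample latencies are made up.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the smallest sample such that at least pct% of samples are <= it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Nearest-rank: 1-based rank = ceil(pct/100 * n).
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (20 samples, two slow outliers).
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 14, 13,
                15, 12, 11, 13, 14, 16, 12, 13, 15, 900]

p95 = percentile_nearest_rank(latencies_ms, 95)
print(p95)  # 19 of the 20 samples are at or below this value
```

With 20 samples the rank is 19, so exactly one sample (the remaining 5%) lies above the returned value.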

Avoiding the 'SLOs as Reliability Theater' trap

#slo

·usenix.org·Oct 25, 2022

Site Reliability Engineering: SLI Implementation Example

The Service Level Indicator is the ongoing measurement of your system that tells you whether you’re meeting your objective

#sli

·oladosu777.medium.com·Oct 23, 2022
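
To make the excerpt concrete, here is a minimal sketch of an SLI as a ratio of good events to total events, compared against an SLO target. It is not taken from the bookmarked article; the function names, the event counts, and the 99.9% target are all hypothetical.

```python
def availability_sli(good_events, total_events):
    """Return the fraction of events that were good (the measured SLI)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    return good_events / total_events


def meets_slo(sli, slo_target):
    """Return True when the measured SLI is at or above the objective."""
    return sli >= slo_target


# Hypothetical measurement window: 100,000 requests, 99,950 of them successful.
sli = availability_sli(99_950, 100_000)
print(f"SLI = {sli:.4%}, meets 99.9% SLO: {meets_slo(sli, 0.999)}")
```

The SLI is what the system is measured to do; the SLO is the threshold it is compared against.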

Message Queueing vs. Event Stream Processing in Azure

Message Queueing vs. Event Stream Processing in Azure.

#event_streams #message_queues #servicebus #azure #messaging

·scale-tone.github.io·Oct 19, 2022

2 Ways to Check TLS Certificate expiration Date with OpenSSL Command - SSLHOW

We can quickly solve TLS or SSL certificate issues by checking the certificate’s expiration from the openssl command line. Today, let us see how to check a certificate’s expiration date in 2 ways. The first is to check the certificate on the remote server. The second is to check the certificate from PEM files. Check […]

#ssl/tls #openssl

·sslhow.com·Oct 17, 2022
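
The bookmarked article uses the openssl command line. As a rough counterpart for the remote-server case, the sketch below uses only Python's standard `ssl` and `socket` modules instead. The function names are hypothetical, and the sample `notAfter` string is an illustrative value in the format that `getpeercert()` returns.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Return whole days between now and a notAfter string such as
    'Jan  5 09:34:43 2018 GMT' (negative if already expired)."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).days


if __name__ == "__main__":
    # Offline example with a fixed "now" so the result is reproducible.
    sample = "Jan  5 09:34:43 2018 GMT"
    fixed_now = datetime(2018, 1, 1, tzinfo=timezone.utc)
    print(days_until_expiry(sample, fixed_now))
    # For a live check, call days_until_expiry(fetch_not_after("example.com")).
```

`fetch_not_after` needs network access and a certificate that validates against the system trust store; `days_until_expiry` works on any string in that date format.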

Getting started with OpenTelemetry for Python

Observability is the ability to measure the internal states of a system by examining its outputs. A...

#open_telemetry #o11y #tracing

·dev.to·Oct 16, 2022

OpenSLO/oslo: CLI tool for the OpenSLO spec

CLI tool for the OpenSLO spec. Contribute to OpenSLO/oslo development by creating an account on GitHub.

#open_slo #slo #tools

·github.com·Oct 14, 2022

OpenSLO/openslo-backstage-plugins: Backstage plugins for OpenSLO

Backstage plugins for OpenSLO. Contribute to OpenSLO/openslo-backstage-plugins development by creating an account on GitHub.

#open_slo #slo

·github.com·Oct 14, 2022

OpenSLO/slogen: tool to create and manage content for reliability tracking from logs/event data.

tool to create and manage content for reliability tracking from logs/event data. - OpenSLO/slogen: tool to create and manage content for reliability tracking from logs/event data.

#slo #open_slo #tools

·github.com·Oct 14, 2022

/bin/bash based SSL/TLS tester: testssl.sh

TLS/SSL security testing with Open Source Software

#tools #cli #ssl/tls

·testssl.sh·Oct 12, 2022
