StatusBacon

Librato

Homepage: https://www.librato.com/
Status Page: https://status.librato.com/


Open Incidents

No open incidents

Scheduled Incidents

No scheduled incidents

Previous Incidents

UI Latency

2017-01-11 23:19:39
Action Date Description
Resolved 2017-01-12 00:57:03 This incident has been resolved.
Monitoring 2017-01-11 23:30:19 A fix has been implemented and we are monitoring the results.
Investigating 2017-01-11 23:19:39 We are currently investigating a problem causing latency in our UI.

We are currently investigating an issue causing latency loading metric tag descriptions in our UI

2017-01-11 22:43:19
Action Date Description
Resolved 2017-01-12 00:58:18 This incident has been resolved.
Monitoring 2017-01-11 23:04:47 We have added capacity and are monitoring the backlog of request traffic.
Investigating 2017-01-11 22:43:19 We are currently investigating this issue.

librato.com dns

2016-12-23 20:22:12
Action Date Description
Resolved 2016-12-23 20:34:33 This incident has been resolved.
Identified 2016-12-23 20:22:12 We have identified a DNS problem affecting the resolution of librato.com.

We are currently investigating an issue affecting our UI.

2016-12-22 17:17:53
Action Date Description
Resolved 2016-12-22 18:36:58 This incident has been resolved.
Monitoring 2016-12-22 17:30:46 A fix has been implemented and we are monitoring the results.
Investigating 2016-12-22 17:17:53 We are currently investigating this issue.

We are currently investigating an issue that is causing increased API latency

2016-12-22 03:56:38
Action Date Description
Resolved 2016-12-22 04:26:08 This incident has been resolved.
Monitoring 2016-12-22 04:04:13 A fix has been implemented and we are monitoring the results.
Investigating 2016-12-22 03:56:38 We are currently investigating this issue.

Problems with redirects and snapshots for a small percentage of customers

2016-12-21 19:43:32
Action Date Description
Resolved 2016-12-21 20:34:54 This incident has been resolved.
Monitoring 2016-12-21 19:43:32 During scheduled maintenance, we introduced problems with snapshots and redirects for a small percentage of customers. We've modified the change and are continuing to closely monitor the affected systems.

Alerts UI Unavailable

2016-12-20 22:43:35
Action Date Description
Resolved 2016-12-20 22:59:48 This incident has been resolved.
Monitoring 2016-12-20 22:53:58 A fix has been implemented and we are monitoring the results.
Investigating 2016-12-20 22:43:35 We are currently investigating an issue that is affecting the Alerts user interface.

Investigating increased latency and failures

2016-12-16 22:19:18
Action Date Description
Resolved 2016-12-17 06:50:12 All systems continue to be fully operational. After some additional investigation and checks we believe the immediate incident to be resolved. We will publish a public post-mortem once we've completed a comprehensive review.
Monitoring 2016-12-17 06:29:56 Alerts with a "stops reporting" threshold are now functioning again. At this point all functionality is operational and we are continuing to monitor the situation.
Update 2016-12-17 06:15:25 We believe we have identified the set of intersecting causes behind this incident. Some initial changes have been applied, and both API and web application performance have improved as a result. We are continuing to monitor performance and investigate. Alerts with a "stops reporting" threshold are still unavailable; all other alert types are functional.
Update 2016-12-17 05:05:12 Still working on reducing latency on API routes.
Update 2016-12-17 04:25:04 We are continuing to work through high latency on the API routes and will keep updating this issue.
Update 2016-12-17 03:40:01 We are continuing to work through high latency on certain API routes. Data submission is still being accepted and the alerting pipeline is isolated from current latency.
Identified 2016-12-17 01:52:23 We are investigating a regression in API behavior that is leading to failures.
Update 2016-12-17 00:52:36 Alerts are processing again and we have isolated problematic resources.
Update 2016-12-16 23:56:20 Alert processing delay has increased; we are still investigating.
Monitoring 2016-12-16 23:31:25 Alert processing has returned to real time. We are continuing to monitor the situation.
Identified 2016-12-16 22:58:36 We have identified the source of the issue and are working on a fix.
Update 2016-12-16 22:26:37 Alerts may be delayed.
Investigating 2016-12-16 22:19:18 We are currently investigating this issue.

Intermittent snapshot failures

2016-12-16 14:50:54
Action Date Description
Resolved 2016-12-16 15:51:13 Snapshots are working again.
Identified 2016-12-16 14:50:54 We have identified intermittent snapshot failures and are investigating.

Degraded UI performance and API delays

2016-12-07 16:24:33
Action Date Description
Resolved 2016-12-07 22:27:56 We have identified the problem and implemented a fix.
Monitoring 2016-12-07 16:53:31 A fix has been implemented and we are monitoring the results.
Investigating 2016-12-07 16:24:33 We are investigating delays in the API and the user interface.