Page publishing slow/failing. Page serving may be affected.

Incident Report for Unbounce

Postmortem

As the sysops manager and ultimately the person at Unbounce responsible for our service uptime, I'd like to start off by apologizing for any problems this outage caused. Our whole team is working hard to prevent this from happening again.

What happened

Between 12:08 AM and 3:40 AM PDT on Monday, August 10th, and again between 12:03 PM and 12:19 PM PDT the same day, Amazon Web Services (AWS), our hosting provider, experienced serious service interruptions in its Simple Storage Service (S3). Unbounce uses S3 as storage for many of our services, both internal and external.

Which services went down?

Although no service went completely offline, page publishing was severely impacted: publishes were often delayed for quite some time or failed altogether.

In a small percentage of cases (about 0.05%), published pages on unbouncepages.com and on custom domains served "404 - Not Found" errors.

Which services stayed up?

All other backend services were functioning normally. The page builder itself was up during the outage (with the exception of page publishing). Visitor stats, form submissions, and click conversions were also unaffected.

How we resolved the issue

The problem was due to a service outside of our direct control. We were only able to monitor the situation and update our status page during the outage.

Short term fixes

We don't have any immediate changes planned. We still believe the choices we've made and the systems we have in place are stable and reliable. Making reactive changes in response to an outage like this may cause more harm than good.

Longer term fixes

From a long-term perspective, we are always looking at ways to improve the reliability and uptime for all our systems. We are currently investigating the possibility of having redundant storage locations for published pages, giving us the ability to turn off one location if it is behaving badly.
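
To illustrate the idea of redundant storage locations, here is a minimal sketch in Python. It is not our implementation: the class names (InMemoryLocation, RedundantPageStore), the in-memory stand-in for a storage location, and the "enabled" switch are all hypothetical, chosen only to show how writes could go to every location while reads fall back to the next location when one misbehaves or has been turned off.

```python
class StorageUnavailable(Exception):
    """Raised when a storage location cannot serve a request."""


class InMemoryLocation:
    """Hypothetical stand-in for one storage location (e.g. one bucket)."""

    def __init__(self, name):
        self.name = name
        self.enabled = True   # operator switch: turn the location off by hand
        self.healthy = True   # simulates whether the provider is responding
        self._objects = {}

    def put(self, key, body):
        if not self.healthy:
            raise StorageUnavailable(self.name)
        self._objects[key] = body

    def get(self, key):
        if not self.healthy:
            raise StorageUnavailable(self.name)
        return self._objects[key]  # raises KeyError if the key is missing


class RedundantPageStore:
    """Writes to all enabled locations; reads from the first that answers."""

    def __init__(self, locations):
        self.locations = list(locations)

    def publish(self, key, body):
        """Return the names of locations that accepted the write."""
        written = []
        for loc in self.locations:
            if not loc.enabled:
                continue
            try:
                loc.put(key, body)
                written.append(loc.name)
            except StorageUnavailable:
                continue  # tolerate a failing location, try the others
        if not written:
            raise StorageUnavailable("no storage location accepted the write")
        return written

    def fetch(self, key):
        """Return the page body from the first enabled location that has it."""
        for loc in self.locations:
            if not loc.enabled:
                continue
            try:
                return loc.get(key)
            except (StorageUnavailable, KeyError):
                continue  # fall through to the next location
        raise StorageUnavailable("no storage location could serve " + key)
```

In a design like this, a location that was turned off or failing during a publish would be missing some pages, so it would need to be resynchronized before being re-enabled. That bookkeeping is deliberately left out of the sketch.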

What are your questions?

If you have questions or concerns, feel free to reach out. My email is mthorpe@unbounce.com; however, I'd ask that you email support@unbounce.com and address it to me so that the rest of the team can stay in the loop.

Sincerely,
Mike Thorpe
Technical Operations Manager

Posted Aug 11, 2015 - 15:14 PDT

Resolved

This issue has now been resolved. All services should now be back to normal. If you have a page that is stuck publishing, please reach out to us at support@unbounce.com.
Posted Aug 10, 2015 - 12:44 PDT

Investigating

We are currently experiencing an issue with page publishing and page serving. You will likely see issues when attempting to publish a page. You may also see issues with pages or page assets loading.

This is an issue with our backend provider, who are currently investigating. We will update here as soon as we have more information.
Posted Aug 10, 2015 - 12:20 PDT