Google Cloud Is Down | Hacker News

		Google Cloud Is Down
		374 points by markoa 1 hour ago | 132 comments
		https://status.cloud.google.com https://status.cloud.google.com/incident/compute/19003
		Status page reports all green; however, the outage is affecting YouTube, Snapchat, and thousands of other users.

boulos 33 minutes ago

Disclosure: I work on Google Cloud (but disclaimer, I'm on vacation and so not much use to you!).

We're having what appears to be a serious networking outage. It's disrupting everything, including unfortunately the tooling we usually use to communicate across the company about outages.

There are backup plans, of course, but I wanted to at least come here to say: you're not crazy, nothing is lost (to those concerns downthread), but there is serious packet loss at the least. You'll have to wait for someone actually involved in the incident to say more.

odiroot 31 minutes ago

> including unfortunately the tooling we usually use to communicate across the company about outages.

There's some irony in that.

boulos 24 minutes ago

Edit: and I agree!

I’m not in SRE so I don’t bother with all the backup modes (direct IRC channel, phone lines, “pagers” with backup numbers). I don’t think the networking SRE folks are as impacted in their direct communication, but they are (obviously) not able to get the word out as easily.

Still, it seems reasonable to me to use tooling for most outages that relies on “the network is fine overall”, to optimize for the common case.

Note: the status dashboard now correctly highlights (Edit: with a banner at the top) that multiple things are impacted because Networking. The Networking outage is the root cause.

marksomnian 22 minutes ago

> the status dashboard now correctly highlights that multiple things are impacted because Networking.

this column of green checkmarks begs to differ: https://i.imgur.com/2TPD9e9.png

pm90 5 minutes ago

This is a person who's trying to help out while on vacation…can we try being more thankful, and not nitpick everything they say?

boulos 16 minutes ago

The banner at the top. Sorry if that wasn’t clear.

SmokeGS 24 minutes ago

>nothing is lost

except time

sdan 21 minutes ago

and a nice Sunday afternoon

digaozao 13 minutes ago

And lots of sales in my case

toomuchtodo 15 minutes ago

And the illusion of superiority over non cloud offerings.

k__ 6 minutes ago

How come?

toomuchtodo 2 minutes ago

There are some who argue that the resiliency of cloud providers beats on prem or self hosted, and yet they’re down just as much or more (GCP, Azure, and AWS all the same).

You want velocity for your dev team? You get that. You want better uptime? Your expectations are gonna have a bad time.

sarim 11 minutes ago

Funny how as soon as I realized that Gmail and Google Sheet aren’t working properly I rushed to HN to figure out what’s going on. I love this community!

ksajadi 55 minutes ago

The GCP status page is worthless: it's always happy and green when production systems are down, and then they might acknowledge something an hour later.

JimboOmega 5 minutes ago

Just like AWS, then. "Some users are experiencing increased error rates" = "Everything has been down for hours"

lionradio 2 minutes ago

I think this might be a static page they are hosting on Akamai?

markoa 20 minutes ago

We're hosting an open global Zoom call for all engineers affected by the outage, join us at https://zoom.us/j/793450725

avocado4 11 minutes ago

One click on this link and it instantly starts streaming your webcam footage to everyone in the chat room.

chrisseaton 6 minutes ago

Use 'Turn off my video when joining meeting' in Zoom.

Sebguer 6 minutes ago

This is a local zoom setting, you can change it.

Causality1 7 minutes ago

Why? Do you expect to be able to do something about it or did you just want somewhere classier than Twitter to complain?

colinbartlett 50 minutes ago

Google Cloud is the number 4 most monitored status page on StatusGator and Google Apps is number 12. In addition, at least 20 other services we monitor seemingly depend on Google Cloud because they all posted issues as soon as Google went down.

It's always interesting to see these outages at large cloud providers spider out across the rest of the internet; a lot of the world depends on Google to stay up.

FPGAhacker 21 minutes ago

I guess we know what steam uses (the store at least).

xNevo 9 minutes ago

No issues for me. Maybe they have a failover mechanism?

hhs 14 minutes ago

"a lot of the world depends on Google to stay up."

Yup, I'm trying to check the Associated Press News right now and it's having trouble connecting to "storage.googleapis.com".

hazeii 22 minutes ago

…and only the paranoid survive?

jshprentz 16 minutes ago

The two Google Cloud networking incidents are:

Incident #19008 began at 2019-06-02 12:48. https://status.cloud.google.com/incident/cloud-networking/19…

Incident #19009 began at 2019-06-02 12:53. https://status.cloud.google.com/incident/cloud-networking/19…

Times are US/Pacific

pm90 10 minutes ago

They both seem to say the same thing….

darkof 1 hour ago

That feeling when you open https://console.cloud.google.com and see that you don't have your Kubernetes clusters and CloudSQL databases, just a CTA to create your first ones.

captn3m0 5 minutes ago

Mumbai region here, and GKE seems to be fine.

vpontis 1 hour ago

Gosh, this was so scary… I thought someone had hacked in and deleted everything…

I hope they come back. This is still pretty scary

jjeaff 13 minutes ago

Same. I was thinking, oh, my db cluster must be having trouble recovering. Couldn't get any response through kubectl. Logged in to the cloud console and it looks all brand new, like I have no clusters set up at all.

Of course, this is 2 weeks after switching everything over from AWS.

stnmtn 55 minutes ago

Same, my manager called me and said "everything is down".

So I wander over to my Firebase console, and there's no database loading. Thank god for Twitter and the people saying they have the same issue, or I would have thought for sure we'd been hacked.

I hope this is a good wake up call for everyone. I know that I'm going to think more about how we do backups and fail-safes

hanniabu 50 minutes ago

And here I thought I was having a bad day with Google Play not loading

dpau 9 minutes ago

my vm instances are all still there, can even log in via SSH in the compute engine tab. looks like they got a reboot 15 min ago. just restarted some processes but lost my progress on about 12hrs of computing time, i'm guessing it's going to be hard to get a refund..

CSDude 2 minutes ago

With Google Cloud incidents, most of the time whole regions fail, and with AWS generally only a region fails. Of course there would be exceptions, but Google Cloud does not make me feel safe as an outsider (and a user of multi-region AWS)

macintux 59 minutes ago

And thus was ruined hundreds or thousands of pleasant Sunday afternoons.

I don’t miss being on pager duty one bit. I see it looming in my headlights, sadly.

xerxes901 56 minutes ago

Spare a thought for the pleasant Australian early Monday mornings too! Always a rude awakening…

newsbinator 13 minutes ago

It's the Queen's birthday, a Monday off here in New Zealand…

… but not for everybody now.

jagtesh 22 minutes ago

Multi-cloud for those times when you really need that level of availability and can afford it.
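
A minimal sketch of what that can look like on the client side, assuming the same service is deployed on two providers: try each endpoint in order and use the first one that passes a health probe. The endpoint URLs and the `probe` callable are hypothetical placeholders, not part of any provider SDK.

```python
from typing import Callable, Sequence


def first_healthy(endpoints: Sequence[str], probe: Callable[[str], bool]) -> str:
    """Return the first endpoint whose probe succeeds, trying them in order."""
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except Exception:
            # Treat any probe error (timeout, DNS failure, ...) as "unhealthy".
            continue
    raise RuntimeError("no healthy endpoint among: " + ", ".join(endpoints))


# Hypothetical endpoints for the same service on two different providers.
ENDPOINTS = ["https://api.gcp.example.com", "https://api.aws.example.com"]


def fake_probe(url: str) -> bool:
    """Stand-in probe that pretends the first provider is unreachable."""
    return "gcp" not in url


print(first_healthy(ENDPOINTS, fake_probe))
```

In a real setup the probe would be an HTTP health check with a short timeout, and the expensive part is keeping data in sync across providers, which this sketch does not address.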

squarefoot 37 minutes ago

And Gmail too doesn't feel very well today.

  [21:55:19] POP< +OK send PASS
  [21:55:19] POP> PASS ********
  [21:55:21] POP< +OK Welcome.
  [21:55:21] POP> STAT
  [21:55:21] POP< -ERR [SYS/TEMP] Temporary system problem. Please try again later.

hazeii 8 minutes ago

IMAP as well – for some considerable time now.

ToniCipriani 25 minutes ago

Pretty much Gsuite is out: https://www.google.com/appsstatus#hl=en&v=status

klon 53 minutes ago

Anyone using both AWS and GCP that can form an opinion on availability of both? As a GCP customer I am not very happy with theirs.

pm90 50 minutes ago

GCP is incredibly bad at communicating when there are problems with their systems. Just terrible. It's only when our apps start to break that we notice something is down, then look at the green dashboard, which is even more infuriating.

jjeaff 4 minutes ago

Their dashboard does show red on GCE and networking right now, for what it's worth. https://status.cloud.google.com/

timdorr 31 minutes ago

AWS is often the same way. No one seems to be good at communicating outage details.

obeattie 5 minutes ago

I really don’t get this. There’s a huge number of complaints about poor communication from companies like Google and AWS during every outage. Yet they remain seemingly indifferent to how much customer trust they are losing, and the competitive edge the first one to get this right could gain.

I speak from experience: at Monzo we have had some pretty horrific and very public outages, yet because we communicate openly and proactively with our customers, people are very understanding and trust that we’re doing our best to fix things. The amount of love we’ve had from our customers during times when we’ve put them in a really bad spot – sometimes without access to their money – has been astonishing.

TallGuyShort 20 minutes ago

I suspect there's a correlation between outages that are easy to detect and communicate and outages that automation can recover from so easily that you hardly notice.

kenhwang 13 minutes ago

AWS has what feels like monthly AZ brownouts (typically degraded performance or other control plane issues) with the yearly-ish regional brown/blackout.

GCP has quarterly-ish global blackouts, and generally on the data plane at that, which makes them significantly more severe.

jjeaff 5 minutes ago

Are there any services that track uptime for various regions and zones from various providers? It's rare that everything goes down and thus the cloud providers pretend they have almost no downtime.

tubaguy50035 47 minutes ago

Obviously we don't know what the extent of the issue is yet, but afaik there has never been an AWS incident that has affected multiple regions where an application had been designed to use them (like using region specific S3 endpoints). GCP and Azure have had issues in multiple regions that would have affected applications designed for multi-region.

votepaunchy 39 minutes ago

> like using region specific S3 endpoints

AWS had the S3 incident affecting all of us-east-1: “Other AWS services in the US-EAST-1 Region that rely on S3 for storage, including the S3 console, Amazon Elastic Compute Cloud (EC2) new instance launches, Amazon Elastic Block Store (EBS) volumes (when data was needed from a S3 snapshot), and AWS Lambda were also impacted while the S3 APIs were unavailable.”

https://aws.amazon.com/message/41926/

WaxProlix 33 minutes ago

There was a massive push after that to have everything regionalized. It's not 100% but it's super close at this point.

tango24 20 minutes ago

That's one region, not the multiple regions that OP mentioned

theevilsharpie 36 minutes ago

I find GCP quicker to post status updates about issues than AWS, but GCP also seems to run into more problems that span across multiple regions.

I'm overall happy with it, but if I needed to run a service with a 99.95% uptime SLA or higher, I wouldn't rely solely on GCP.
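
To put a number on that threshold, here is a small sketch that converts an availability percentage into the downtime it allows. The function name and the 30-day and 365-day periods are illustrative assumptions; real SLAs define their own measurement windows and exclusions.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an availability target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Assumed periods: a 30-day month and a 365-day year.
per_month = allowed_downtime_minutes(99.95, 30)
per_year = allowed_downtime_minutes(99.95, 365)
print(f"99.95% over 30 days:  {per_month:.1f} minutes")
print(f"99.95% over 365 days: {per_year / 60:.2f} hours")
```

Under these assumptions, 99.95% leaves about 21.6 minutes per 30-day month, or roughly 4.4 hours per year, so a single multi-hour outage can use up most of a year's budget.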

w_s_l 23 minutes ago

You know, this reminds me of the bad taste the Google sales team left when I asked for a refund of some charges for resources I was unaware were still running after following a quickstart guide.

AWS refunded me in the first reply on the same day!

GCP sales rep just copy pasted a link to a self support survey that essentially told me, after a series of YES or NO questions that they can't refund me.

So why not just tell your customers like it is? Google Cloud is super strict when it comes to billing. I have called my bank to do a chargeback and put a hold on all future billing with GCP.

I'm now back to AWS and still on a Free Tier. Apparently the $300 Trial with Google Cloud did not include some critical products, AWS Free tier makes it super clear and even still I sometimes leave something running on and discover it in my invoice….

I've yet to receive a reply from Google and it's been a week now.

I do appreciate other products such as Firebase but honestly for infrastructure and for future integration with enterprise customers I feel AWS is more appropriate and mature.

codys 44 minutes ago

Other google services are also affected:

https://www.google.com/appsstatus#hl=en&v=status

https://i.imgur.com/pcqwwA4.png

mixedbit 37 minutes ago

Good that Google+ is up again

different_sort 48 minutes ago

I was playing around this afternoon with appengine, and thought I broke one of my projects when I started getting 502s back.

There appear to be some irregularities on consumer services as well that are certainly related; YouTube was behaving a bit oddly for me.

The impact seems to be cascading down from just GCE to other services as well – that status page certainly does not reflect the reality of the situation. You can't even sign into GCP right now, and things that run on GCE, like appengine, seem impacted.

echelon 37 minutes ago

Nest is down for me right now.

It's amazing how far-reaching outages can be these days.

unicornmama 36 minutes ago

There are dozens of us.

andrewprock 25 minutes ago

Code reuse is a wonderful thing, until it's not.

brown9-2 35 minutes ago

When talking about GCE being down please also mention what regions you are talking about

chupasaurus 31 minutes ago

In this case you're lucky if any are working correctly; the problem is global, with some exceptions.

brown9-2 29 minutes ago

seems to be some comments here of some regions functioning ok, although perhaps it’s not 100% in all regions

sgammon 7 minutes ago

us-central1 us-west1 us-west2

is what I’ve heard so far. east seems to be OK, and Europe too

mandatory 56 minutes ago

Yep, I can no longer see my Cloud SQL database – it's as if I've never created one at all. Really hoping this is just an issue displaying it and that Google hasn't punted my infrastructure and backups.

vpontis 44 minutes ago

Praying isn't working. Now, I'll try sobbing 🙁

Havoc 27 minutes ago

Systematic problem solving. I like it

Exuma 21 minutes ago

https://downdetector.com/

Pretty much every service is down

PerfectElement 42 minutes ago

Just 2 weeks after I migrated a DB cluster from Azure to Google Cloud thinking things would get better.

duality 26 minutes ago

FWIW they might still be.

leowoo91 41 minutes ago

0 issues at compute, reporting for europe-west3-b.

seibelj 24 minutes ago

Not sure if related, but I was going to a BBQ yesterday and myself and 3 other people got lost because Google Maps app glitched out, directing us to the wrong places. If you search twitter for #googlemaps tons of people have the same issue. Surprised no one has posted about it.

ikeboy 11 minutes ago

So that's why YouTube was being weird. I thought it was an extension problem or something.

sdan 28 minutes ago

Github contribution graphs are also gone

benbristow 36 minutes ago

Wondered why Snapchat was being weird today. Thought it was my pi-hole setup blocking something from working, but nope, it's Google!

javabean_ 6 minutes ago

could this be the result of another BGP hack ? cyberwarfare ? I am just speculating here big time.

sgammon 10 minutes ago

So far, the KO list:

GCE, GKE, BQ, Pub/Sub, GAE

asia-south1 us-west1 us-central1 us-west2

jamisonbryant 14 minutes ago

I've noticed problems on GDrive (GSuite) and YouTube as well. Connected?

sdan 19 minutes ago

Weird for Twitter to still be up and fully functioning. I thought they migrated everything to GCP this/last year?

sorenn111 16 minutes ago

Not the main functionality of the service, just lots of data analysis tooling. nothing that end users would notice

sdan 5 minutes ago

Interesting. Thought I had read some posts of them migrating their data, but you could definitely be right.

mtarnovan 26 minutes ago

Is shopify on google cloud? i noticed they are having issues too

brown9-2 19 minutes ago

yes

arach 54 minutes ago

Confirming issues on our end. I'm able to load up my console but when I go to Kubernetes Engine, I don't see my clusters. I'm monitoring closely on twitter

dang 42 minutes ago

https://news.ycombinator.com/item?id=20077275

https://news.ycombinator.com/item?id=20077293

derekhh 46 minutes ago

> We will provide more information by Sunday, 2019-06-02 12:45 US/Pacific.

I'm not seeing anything at 12:47.

timdorr 30 minutes ago

They are updating the root cause issue, which is networking, here: https://status.cloud.google.com/incident/cloud-networking/19…

Next update is in about 25 minutes.

derekhh 45 minutes ago

https://status.cloud.google.com/incident/cloud-networking/19…

404 – Impressive

ryacko 41 minutes ago

Cloud status dashboards seem to be hosted on the same cloud, which doesn't say much about redundancy.

chupasaurus 8 minutes ago

AWS changed internals of Service Health dashboard after they couldn't update it when S3 went down in us-east-1 (https://aws.amazon.com/message/41926/)

edit: wording

pgoodjohn 37 minutes ago

Everything looking normal on our GKE / CloudSQL stuff (eu-west1)

wichert 4 minutes ago

gcloud tells me:

WARNING: The following zones did not respond: us-west2, us-west2-a, southamerica-east1-c, us-west2-b, southamerica-east1, us-east4-b, us-east4, us-east4-a, northamerica-northeast1-c, northamerica-northeast1-b, us-west2-c, southamerica-east1-b, northamerica-northeast1, southamerica-east1-a, northamerica-northeast1-a, us-east4-c. List results may be incomplete.

Luckily for us eu-west1 seems to be working normally.
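
The warning mixes region names and zone names in no particular order, so it can help to group them. The sketch below does that for the text quoted above. The naming rule it relies on (a zone is a region name plus a one-letter suffix) is an assumption inferred from the names in this warning, and `group_unresponsive` is a hypothetical helper, not part of gcloud.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed naming rule, inferred from the names in the warning itself:
# a region looks like "us-west2", a zone is a region plus "-<letter>".
NAME_RE = re.compile(r"^(?P<region>[a-z]+-[a-z]+\d+)(?:-(?P<zone>[a-z]))?$")

WARNING = (
    "WARNING: The following zones did not respond: us-west2, us-west2-a, "
    "southamerica-east1-c, us-west2-b, southamerica-east1, us-east4-b, "
    "us-east4, us-east4-a, northamerica-northeast1-c, "
    "northamerica-northeast1-b, us-west2-c, southamerica-east1-b, "
    "northamerica-northeast1, southamerica-east1-a, "
    "northamerica-northeast1-a, us-east4-c. List results may be incomplete."
)


def group_unresponsive(warning: str) -> Dict[str, List[str]]:
    """Map each region named in the warning to its sorted zone suffixes."""
    # Everything between the colon and the first period is the name list.
    _, _, tail = warning.partition("did not respond:")
    names = [n.strip() for n in tail.split(".")[0].split(",") if n.strip()]

    grouped: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip anything that does not fit the assumed pattern
        zones = grouped[match["region"]]  # creates the key for bare regions too
        if match["zone"]:
            zones.append(match["zone"])
    return {region: sorted(zones) for region, zones in grouped.items()}


for region, zones in sorted(group_unresponsive(WARNING).items()):
    print(region, ",".join(zones))
```

For this warning the grouping comes out as four regions (us-west2, us-east4, southamerica-east1, northamerica-northeast1), each with zones a, b and c listed.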

snisarenko 1 hour ago

The status page took a while to show issues. My app was down, and Twitter knew google cloud was down before the official status page.

40acres 25 minutes ago

Can't wait for the postmortem!

murat124 20 minutes ago

My money is on config push.

tr33house 54 minutes ago

Took me a while to track latency issues to GCP. Wasn't expecting it. This also seems to affect some GAE instances and some of their products like google photos. At least according to my observations

different_sort 48 minutes ago

I see this as well.

maz1b 8 minutes ago

Ironically, I moved all of my objects off GCS today.

landon32 52 minutes ago

u.s. west: all our cloud compute is inaccessible rn…. our API is down, can't ssh into the servers, and also can't see them on the dashboard.

f_martinez 19 minutes ago

We are on region us-east1 and our systems are still up. Specifically, we are on us-east1-b.

horyzen 1 hour ago

Yeah I was having trouble accessing my Gsuite apps, had a couple of 502s, which led me to check HN. While it doesn't give me 502 now, it's abnormally slow.

vinayan3 1 hour ago

Looks like only GCE is down according to the status page now. I'm able to access my console for instances and GKE clusters.

miller_joe 0 minutes ago

They initially opened an incident against GCE (https://status.cloud.google.com/incident/compute/19003) then opened one against Networking (https://status.cloud.google.com/incident/cloud-networking/19…).

The networking incident looks like the one to follow for updates now.

sladey 58 minutes ago

I happened to be initializing a GKE pool upgrade just as this occurred. The upgrade is now stuck according to the console.

The interesting thing is that a couple of minutes before everything went wrong, kubectl returned an "error: You must be logged in to the server (Unauthorized)" error

snisarenko 1 hour ago

My site runs on Google App Engine and it's down as well.

murat124 1 hour ago

GCP has been down since 11:50am and they acked it 35 mins later. They're great at leaving their customers in the dark.

copperx 52 minutes ago

Not much different from AWS, from what I've heard.

msbarnett 34 minutes ago

Yeah, Amazon is the master of having their status page read all Green while half of US-East is in the toilet

numbsafari 44 minutes ago

Definitely the case. Neither is super great at this. One problem is that issues that may 100% impact individual clients may only impact a vanishingly small share of their overall service load. That mismatch between customer and provider experience is one of the ugly aspects of public cloud providers.

discodave 7 minutes ago

That's why AWS is all about their Personal Health Dashboard (PHD). They can post specific issues for your account in there. Also, they get to keep the public page looking nice and green to show to executives of prospective customers.

kazen44 36 minutes ago

Also, it's one which gets hugely understated when people "move to the cloud".

Especially if your business provides B2B services. Stuff like this could make you lose your business, especially if some entity like Google doesn't communicate and, as a result, you do not have an answer for your own customers.

Medium sized private cloud providers are a lot better at this, considering the communication lines are a lot shorter.

sadris 27 minutes ago

So is Youtube.

miller_joe 1 hour ago

It took them 45 minutes to open a status page on this massive outage. I love GCP but that's not great.

nkassis 1 hour ago

The GCE console is also affected; I couldn't send a support ticket, just getting errors.

xerxes901 1 hour ago

Couldn't load the support console to "me too" this one either!

pishpash 5 minutes ago

Let's see if perfect leetcode skills will save the day. /s

bilal4hmed 53 minutes ago

gitlab is slow too

amenod 49 minutes ago

Slow is an understatement… some pages on gitlab.com take minutes to load, and jobs take tens of minutes to start.

EDIT: It's been like that since at least 12h ago though. Not sure if it's connected to Google Cloud?

wenbin 29 minutes ago

btw, Google Analytics realtime is down as well.

bbayer 2 minutes ago

I wasn't aware of the outage and had a small heart attack when I saw a huge drop in visitors. I think other metrics are also affected.

stuck_in_the_ma 1 hour ago

Google Play is also experiencing massive issues.

digaozao 1 hour ago

I can't see any GKE cluster in Brazil, or any VM.

miller_joe 55 minutes ago

I'm seeing that with northamerica-northeast1. I can't access anything over the network in that region and most of the GKE clusters and VMs in that region aren't listed in the console