
View Full Version : Shaw downtown on fire?




phreezee
07-11-2012, 02:24 PM
Hearing that's the case, and some reports of internet down.

ercchry
07-11-2012, 02:26 PM
something blew up on the 7th floor, lots of crap down from a corporate standpoint.

CivicDXR
07-11-2012, 02:29 PM
I work for AHS IT, all clinical apps are down, AHS servers are housed there, and it's affecting the hospitals in a huge way. I'm at Rockyview right now, and they just called an emergency meeting of all managers in the building.

phreezee
07-11-2012, 02:33 PM
I can see Fire and EMS on the scene from my building now.

Mar
07-11-2012, 02:34 PM
Punishment for their shitty services.

nzwasp
07-11-2012, 02:35 PM
Their datacenter for their network is on the 7th floor; I used to work on the 7th floor right outside that room.

kenny
07-11-2012, 02:38 PM
Shocked to see how much stuff is affected from a relatively minor incident like this. 311 is down, computer systems in ambulances are down, 3 radio stations, tons of companies are reporting their phone lines are down and many more saying their websites are not available either.

Surprised so many companies are relying on Shaw to host their services in a building that so many people have access to.

ercchry
07-11-2012, 02:38 PM
A downtown building is being evacuated following reports of an explosion and fire.

The Calgary Fire Department is responding to Shaw Court, 630 3 Ave S.W., for reports of a blaze.

Spokesman Jayson Doyscher said a second alarm fire response is underway but authorities were not able to confirm if an explosion took place nor the size of the fire.

“They’re still investigating at this time,” said Doyscher.

“They’re evacuating the building.”

Three radio stations, Country 105, AM 770 and Q 107, are housed in the building.

Nearby roads are being closed down to traffic.

SHAW employee Colin Morrison said staff originally figured the outage was caused by another series of rolling blackouts.

“When the fire alarms started to go off there was a bit of concern,” Morrison said. “None of our emergency lighting went back on.”

Calgary EMS say they are not treating any patients.

More to come.

Read more: http://www.calgaryherald.com/Downtown+building+evacuated+following+reports+explosion/6918507/story.html#ixzz20Lfr4deF

SmAcKpOo
07-11-2012, 02:43 PM
AHS doesn't have a redundant COLO?...interesting....

nzwasp
07-11-2012, 02:43 PM
No photos of this yet?

1-Bar
07-11-2012, 02:46 PM
Fun times!! Computers in amboolances are functional. Just email and info pushes to electronic paperwork is down.

Mar
07-11-2012, 02:47 PM
I'm surprised they managed to evacuate the place. I was in Nexen testing a version update of some data management software a couple of years ago and the fire alarm went off from someone burning popcorn in the microwave. Not one person got out of their seat.

Disoblige
07-11-2012, 02:49 PM
Originally posted by Mar
I'm surprised they managed to evacuate the place. I was in Nexen testing a version update of some data management software a couple of years ago and the fire alarm went off from someone burning popcorn in the microwave. Not one person got out of their seat.
Large business buildings usually have a fire and emergency evacuation procedure with a stage 1 and stage 2 alert. That way, an entire building isn't evacuated over a silly thing such as popcorn burning.

kenny
07-11-2012, 02:50 PM
Due to an incident and infrastructure damage downtown, the Municipal Emergency Plan has been activated, effective 14:23 hours. ‪#yyc‬

http://twitter.com/cityofcalgary/statuses/223156546660601856

Anomaly
07-11-2012, 02:51 PM
Fire department told us UPS fire, knocked out our MPLS and PtP links....:(

codetrap
07-11-2012, 02:56 PM
Originally posted by Anomaly
Fire department told us UPS fire, knocked out our MPLS and PtP links....:(
Sorry, but I must laugh, and laugh hard. Shaw sux.

kenny
07-11-2012, 02:57 PM
Reading twitter, reports are a generator malfunctioned and exploded causing the sprinkler system to activate on the 7th floor (data centre?).

That would be terrible if the sprinkler system caused everything to fry. Bad planning haha.

Cos
07-11-2012, 02:58 PM
.

1-Bar
07-11-2012, 02:59 PM
Yeah just heard the same thing on the radio as that was the cause. Fun times!!!

nzwasp
07-11-2012, 02:59 PM
which company cos?

I look after a large company office in airdrie

Xtrema
07-11-2012, 03:00 PM
Originally posted by Anomaly
Fire department told us UPS fire, knocked out our MPLS and PtP links....:(

MPLS is down, working off a slow ass redundant link.

nzwasp
07-11-2012, 03:01 PM
Originally posted by kenny
Reading twitter, reports are a generator malfunctioned and exploded causing the sprinkler system to activate on the 7th floor (data centre?).

That would be terrible if the sprinkler system caused everything to fry. Bad planning haha.

Ah well could be some additional slowness for people that want to call the contact center tonight or perhaps chat online to shaw, or maybe pay their bills.

lasimmon
07-11-2012, 03:13 PM
Is that why I hear sirens all afternoon from my office at 4th and 4th?

Really piles on to the hangover.

Anomaly
07-11-2012, 03:14 PM
Originally posted by Xtrema


MPLS is down, working off a slow ass redundant link.

Wish I could say the same, no non shaw redundant links. Our Telus internet is up, so L2L VPN tunnel time. :(

ercchry
07-11-2012, 03:18 PM
with all the water-free fire suppression systems out there for server rooms... they had sprinklers?! :nut:

-relk-
07-11-2012, 03:19 PM
Will people with Shaw as their ISP be without internet tonight?

I really hope not, I wanna play some League when I get home.

Cos
07-11-2012, 03:25 PM
.

nickyh
07-11-2012, 03:31 PM
My company is affected: phones are down, no server connections, nothing is working except our cell phones.......And it's quarter end!
I wonder if this will be resolved tomorrow or if it's a work from home day, as our remote connections work.

nickyh
07-11-2012, 03:33 PM
Originally posted by -relk-
Will people with Shaw as their ISP be without internet tonight?

I really hope not, I wanna play some League when I get home.

I'm at home now and I'm not affected, Shaw is my provider.

Mibz
07-11-2012, 03:33 PM
Our "residential" test links at work are still up and my house phone still works. Looks like it's just business stuff.

My condolences to anybody affected by this. Further condolences to the NOC/DC guys at Shaw.

-relk-
07-11-2012, 03:36 PM
Originally posted by nickyh


I'm at home now and I'm not affected, Shaw is my provider.

:clap:

dexlargo
07-11-2012, 03:47 PM
Originally posted by ercchry
with all the water-free fire suppression systems out there for server rooms... they had sprinklers?! :nut:

I read that the sprinklers were going off on the floor where the fire happened. I would guess outside of the server room. They almost certainly would have had halon or similar system in their server room - they're not rookies.

Actually, reading the Herald Article, it sounds like the fire department is saying the explosion happened in an electrical room on the 13th floor (the top floor?), which would make sense - why would a generator be right in the server room? They probably lost power to the servers. But why would the generators be cutting in? Usually they only switch on when there's a power failure.

Zhariak
07-11-2012, 03:48 PM
Originally posted by ercchry
with all the water-free fire suppression systems out there for server rooms... they had sprinklers?! :nut:

I was just about to say that's retarded if they have water based suppression systems! You'd think the place would have flooded with CO2 or something...

And just to add, all my clients in the d-town core on shaw are down...

mark4091
07-11-2012, 03:51 PM
Improperly vented generator room?

nzwasp
07-11-2012, 03:58 PM
Fire photo yusss


https://p.twimg.com/AxjdhhMCQAAZxMO.jpg

Mibz
07-11-2012, 04:02 PM
Originally posted by dexlargo
But why would the generators be cutting in? Usually they only switch on when there's a power failure.

Victim of our rolling blackouts? Are they still doing them?

lilmira
07-11-2012, 04:04 PM
Damn you Neo

jdmXSI
07-11-2012, 04:12 PM
Just heard the Emergency Broadcast over the radio how you may have difficulties calling 911 in downtown and in surrounding areas on your land line :nut:

Alterac
07-11-2012, 04:13 PM
From what I have gathered, the Mechanical room had an incident, thus crippling the Generation/UPS backup system and causing a fire.

The Fire Department cuts power to any building that is burning, so the entire thing went dark.

AHS, ATB, IBM Colocation, BP Datacentre, etc

All down.


"The system is down, the system is down"
http://cripzthecomic.com/wp-content/uploads/2011/03/strongbad-300x225.png

kenny
07-11-2012, 04:13 PM
Originally posted by jdmXSI
Just heard the Emergency Broadcast over the radio how you may have difficulties calling 911 in downtown and in surrounding areas on your land line :nut:

Only if you are a Shaw Phone customer.

supe
07-11-2012, 04:14 PM
Yeah our office at 5th ave 4th street is without phones and internet.

Our office here on 11th ave and 3rd street is still without phones good thing we went with someone else for internet.

Can't believe a company like shaw doesn't have redundancy built in.

LollerBrader
07-11-2012, 04:23 PM
Originally posted by supe

Can't believe a company like shaw doesn't have redundancy built in.

Well they did... it was part of their redundancy that assploded.

bowlofrice
07-11-2012, 04:29 PM
it was an ac unit on the roof that blew, mainly 12th floor thats affected... they had to kill power to the entire building though, thats why everything is down.. meaning none of our internal emails, phones... pretty much everything is down

speedog
07-11-2012, 04:29 PM
Originally posted by supe
Yeah our office at 5th ave 4th street is without phones and internet.

Our office here on 11th ave and 3rd street is still without phones good thing we went with someone else for internet.

Can't believe a company like shaw doesn't have redundancy built in.

Shaw probably has redundancy built in, but when things happened as Alterac explained earlier in this thread, then any data centre provider would pretty much be hooped. Shut off all the AC and DC power in TELUS' main data centre and see the crap that would happen - it is just a bad situation that had just the right things go wrong in the right sequence.

And if I remember correctly from when I left TELUS almost 5 years ago, TELUS was moving to sprinkler systems in most of their network sites and getting rid of the Halon dump systems - just can't remember why anymore.

toastgremlin
07-11-2012, 04:30 PM
Originally posted by speedog
And if I remember correctly from when I left TELUS almost 5 years ago, TELUS was moving to sprinkler systems in most of their network sites and getting rid of the Halon dump systems - just can't remember why anymore.

Doesn't Halon come in two varieties: incredibly lethal to humans, or just harmful to the environment?

rage2
07-11-2012, 04:34 PM
Originally posted by supe
Can't believe a company like shaw doesn't have redundancy built in.
I deal with a lot of DR stuff at work, and a datacenter like Shaw's would have tons of redundancy built in at every level, except for physical location. Meaning if they host a website for you, don't expect them to have a second datacenter with your website mirrored and ready to go.

Datacenter explosions like this are pretty rare. Having a backup datacenter for redundancy is *very* expensive for most companies to even consider, so a worst case scenario like this usually isn't covered by most companies. It's a very high cost/low risk scenario.

What I'm surprised at though, is that the City of Calgary, and Alberta Health Services does not have redundant datacenters that they can kick in if the primary DC goes down. If anything, critical services like that should be housed in multiple locations. Same goes for some of Shaw's critical infrastructure, ie home phones, and even their website. The city's 911 service isn't hosted with Shaw, I'm curious now if they have redundant backup datacenter in case their primary blew up.

LollerBrader
07-11-2012, 04:34 PM
Brother is in crescent heights and says he can't see any smoke around shaw currently.

Cos
07-11-2012, 04:36 PM
.

supe
07-11-2012, 04:40 PM
Alterac's post came in just before mine so it makes more sense now.

But I also agree, I thought their main office was more an admin office, I would have thought they would run more of the DC type stuff from their campus location.

bowlofrice
07-11-2012, 04:42 PM
that giant cement building near campus is where the data center is supposed to be relocated to when its finished

speedog
07-11-2012, 04:43 PM
If I remember correctly, TELUS' does have data centers that backup/mirror each other, but the cost to have data hosted out of these centers on a fully redundant basis is higher as a result. TELUS does have multiple mirrored data centers across the country too - lots of money was spent to build these places, one or two in B.C., one or two in Alberta and at least a couple in Ontario when I was last there.

speedog
07-11-2012, 04:45 PM
Originally posted by bowlofrice
that giant cement building near campus is where the data center is supposed to be relocated to when its finished

No different than TELUS, they used to host a ton of data center stuff at their main central office location in downtown Calgary (not the TELUS Tower), but that has been migrated over to another location over the past few years.

Mar
07-11-2012, 04:47 PM
Why weren't they hosting at Q9 instead of locally?

speedog
07-11-2012, 04:48 PM
Some people's tweets are hilarious right now - stupid, stupid people. Nothing like blurting out nonsense instead of taking the time to do a bit of reading.

speedog
07-11-2012, 04:50 PM
Originally posted by Mar
Why weren't they hosting at Q9 instead of locally?

Probably because Shaw won the contract.

rage2
07-11-2012, 05:02 PM
Originally posted by Cos
It seems weird to me though that they would have a data center in their head office. Most data centers that I have seen are some random warehouse.
It's ridiculous how much stuff is run out of Shaw court in this city, and not just Shaw owned network/infrastructure. I'll bet a lot of other smaller datacenters downtown are running into problems right now because they're fed their primary and redundant links out of Shaw court.

Years ago, we were with Group Telecom, which was originally Shaw fiberlink and built out of Shaw court. The company was then spun off into Group Telecom, and eventually bought out by Bell. The equipment never moved, and to this day it still sits at Shaw court.

Beyond used to run on there, and there was a day where a tech at Shaw court brushed a cable, and took out all routes to anyone still on the old Group Telecom fiber circuits. Because the circuits are owned by Bell, it took 3 hours for Bell techs to drive there, get access, diagnose, and restore service. Beyond went down for a few hrs that day.

It took us a while to get everything bypassed off of Telus circuits and Shaw court circuits at our offices. I guess today, we're glad we did!

dannie
07-11-2012, 05:10 PM
Keep in mind, Registry services and Alberta Health Care is IBM based too.... All registries are unable to process at this time....

r3ccOs
07-11-2012, 05:11 PM
Shaw's datacenter is one of IBM's primary co-lo's and is a primary datacenter to a number of high profile customers

I can tell you that ATB for one is impacted, and their DR design for their new platform isn't providing much for being "high available for disaster recovery"

that being said, Shaw's facility was one of the better ones in town

Alterac
07-11-2012, 05:12 PM
We are working on multihoming our internet here also, currently we use Rogers (Enmax really), and we are migrating it to Q9 Provided 200mbps fibre link.

Then we shouldnt have problems with our peers going down or anything like that, with the plus side of not having to manage the edge bgp.

Xtrema
07-11-2012, 05:15 PM
Originally posted by Mar
Why weren't they hosting at Q9 instead of locally?

Cost? I think VOD service alone would have a very large storage footprint. But you would think they would have colo up at Barlow office, especially MPLS stuff.

BTW, Q9 is now Bell.


Originally posted by dannie
Keep in mind, Registry services and Alberta Health Care is IBM based too.... All registries are unable to process at this time....

Didn't IBM sell all those to AT&T? Or is that just the NOC portion.


Originally posted by Cos


FAB. If you are with us I'm 4988

Ha, I would have guessed FAB too from your spotting thread.

dannie
07-11-2012, 05:21 PM
Just the NOC

Cos
07-11-2012, 05:27 PM
.

Disoblige
07-11-2012, 05:39 PM
This sucks! No internet or cable as I live downtown. Never thought I'd be so happy I have Telus LTE on my phone.

eblend
07-11-2012, 05:55 PM
My old employer, ERCB (quit last Friday), is right across from the Shaw building, and after many years of trying to retire the mainframe they moved their mainframe services to that Shaw building. So, speaking with some of my old co-workers, their mainframe is down. I bet this would be an all-hands-on-deck situation in many IT orgs affected, so I am sure I would have been pulled into this as well. My new employer, while still gov, doesn't use the mainframe, so we are all peachy :) Guess I picked a good time to leave. Surprised their site is down though; I would assume they would have some offsite provider that could at least host an outage page.

codetrap
07-11-2012, 05:59 PM
Originally posted by dannie
Keep in mind, Registry services and Alberta Health Care is IBM based too.... All registries are unable to process at this time....
That's interesting, because the registries network homes into servers that are located in the BP building here in Calgary, and to my knowledge, those were unaffected.

Oz-
07-11-2012, 06:45 PM
Signal all the Business Continuity specialists.

cam_wmh
07-11-2012, 07:42 PM
http://29.media.tumblr.com/tumblr_lxjm4n4umw1rn1xxfo1_400.gif

narou
07-11-2012, 09:10 PM
Hopefully they get something up for ahs. I got work to do.

Cos
07-11-2012, 09:12 PM
.

hedge
07-11-2012, 09:15 PM
I can't believe some of these places don't have a DR plan in place, we'd be back up in about 5 minutes.

G-ZUS
07-11-2012, 09:18 PM
Originally posted by narou
Hopefully they get something up for ahs. I got work to do.


:werd:

TurboMedic
07-11-2012, 09:37 PM
Originally posted by 1-Bar
Fun times!! Computers in amboolances are functional. Just email and info pushes to electronic paperwork is down.

Yah, that was helpful today........

rage2
07-11-2012, 09:40 PM
Originally posted by hedge
I can't believe some of these places don't have a DR plan in place, we'd be back up in about 5 minutes.
I think the biggest issue is that most companies, city/provincial government is just relying on the Shaw datacenter SLA to make sure services are up, which is the biggest mistake. SLA's are useless, in extended outages such as this, you get refunded some of your fees, which is tiny compared to the $ lost for your business/government.

It's not up to the datacenter/colocation to provide datacenter redundancy. Datacenters go down, even Amazon EC2 out east went down for hours last weekend from intense storms which took out Instagram and Netflix.

The servers/software architecture that's being hosted has to be designed to work in a failover site scenario. Most software isn't designed for this and is impossible to make it failover to an offsite datacenter reliably without dataloss or corruption. If the city was expecting 100% uptime even with a datacenter loss, then that's an oversight right there.

Fact is, as I mentioned earlier, datacenter loss is rare, and when you present a cost/risk/benefit analysis for an offsite mirrored backup to management, it makes very little sense and will rarely get approval at most companies. A SAN is expensive. SAN replication over WAN is retarded. And that's just 1 small piece of the puzzle in a multi-site environment.

With that being said, some of the critical services at the government level should have been designed and hosted in a multi-site environment from the start. I wouldn't fully blame Shaw, I would blame the people responsible for the systems that are down right now.
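To put rough numbers on the SLA and cost/risk argument above, here is a minimal back-of-the-envelope sketch. Every figure in it (outage frequency, hourly loss, hosting fee, SLA credit, DR site cost) is a made-up assumption for illustration, not anything reported for Shaw or its customers.

```python
def expected_annual_loss(years_between_events, outage_hours, loss_per_hour):
    """Average yearly cost of a rare full-site outage (simple expected value)."""
    return outage_hours * loss_per_hour / years_between_events


def sla_credit(monthly_fee, credit_fraction):
    """Refund from the hosting provider for a blown SLA."""
    return monthly_fee * credit_fraction


# Hypothetical inputs, chosen only to show the shape of the argument.
YEARS_BETWEEN_EVENTS = 50        # assume one datacenter loss every 50 years
OUTAGE_HOURS = 24                # assume a one-day outage
LOSS_PER_HOUR = 50_000           # assumed business loss per hour of downtime
MONTHLY_HOSTING_FEE = 20_000     # assumed colo bill
SLA_CREDIT_FRACTION = 0.25       # assumed: 25% of one month refunded
DR_SITE_COST_PER_YEAR = 500_000  # assumed yearly cost of a mirrored second site

single_outage_loss = OUTAGE_HOURS * LOSS_PER_HOUR
credit = sla_credit(MONTHLY_HOSTING_FEE, SLA_CREDIT_FRACTION)
yearly_risk = expected_annual_loss(YEARS_BETWEEN_EVENTS, OUTAGE_HOURS, LOSS_PER_HOUR)

print(f"Loss from one outage:   ${single_outage_loss:,.0f}")
print(f"SLA credit received:    ${credit:,.0f}")
print(f"Expected loss per year: ${yearly_risk:,.0f}")
print(f"DR site cost per year:  ${DR_SITE_COST_PER_YEAR:,.0f}")
```

With these assumed inputs the SLA credit is a rounding error next to the loss from a single outage, and the expected yearly loss is far below the yearly cost of a second site. That is the shape of both halves of the argument: the SLA doesn't protect you, and the DR pitch still looks bad on paper unless the service is critical enough that the loss per hour isn't really a dollar figure.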

hampstor
07-11-2012, 09:42 PM
Originally posted by rage2

I deal with a lot of DR stuff at work, and a datacenter like Shaw's would have tons of redundancy built in at every level, except for physical location. Meaning if they host a website for you, don't expect them to have a second datacenter with your website mirrored and ready to go.

Datacenter explosions like this are pretty rare. Having a backup datacenter for redundancy is *very* expensive for most companies to even consider, so a worst case scenario like this usually isn't covered by most companies. It's a very high cost/low risk scenario.

What I'm surprised at though, is that the City of Calgary, and Alberta Health Services does not have redundant datacenters that they can kick in if the primary DC goes down. If anything, critical services like that should be housed in multiple locations. Same goes for some of Shaw's critical infrastructure, ie home phones, and even their website. The city's 911 service isn't hosted with Shaw, I'm curious now if they have redundant backup datacenter in case their primary blew up.

As it's an IBM colo site, there would be a fair number of IBM mainframes there. AHS and the City of Calgary were big IBM mainframe users...

Over the last year, I discovered many organizations that still operate mainframes do not have a DR site for the mainframes because they can't afford to. I would imagine that any systems that still rely on the mainframes are actually what is down right now. Would make sense since the provinces' registry system is very old.

I'm just guessing of course... I was recently working closely with a team on an IBM mainframe rehost project and this is the first thing that came to my mind when I saw the list of customers down and this being an IBM site.

rage2
07-11-2012, 09:50 PM
Originally posted by hampstor
As it's an IBM colo site, there would be a fair number of IBM mainframes there. AHS and the City of Calgary were big IBM mainframe users...

Over the last year, I discovered many organizations that still operate mainframes do not have a DR site for the mainframes because they can't afford to. I would imagine that any systems that still rely on the mainframes are actually what is down right now. Would make sense since the provinces' registry system is very old.
Not only is it expensive, it's sometimes impossible from an architectural point of view with these legacy systems.

Lucky for us, VMWare has made things a lot easier to do multi-site redundancy for software that wasn't designed for it. Taking the software architectural costs out of the equation is a huge money saver.

dannie
07-11-2012, 10:01 PM
Originally posted by hampstor


As it's an IBM colo site, there would be a fair number of IBM mainframes there. AHS and the City of Calgary were big IBM mainframe users...

Over the last year, I discovered many organizations that still operate mainframes do not have a DR site for the mainframes because they can't afford to. I would imagine that any systems that still rely on the mainframes are actually what is down right now. Would make sense since the provinces' registry system is very old.

I'm just guessing of course... I was recently working closely with a team on an IBM mainframe rehost project and this is the first thing that came to my mind when I saw the list of customers down and this being an IBM site.

You got it. Registries use the IBM mainframe at that site. What surprised us was that registries went down a solid 45 minutes before AHS did. I don't know enough about the systems to go into further detail or why one system took longer to fail. But as of right now, it's still down and there is no current ETA for a fix.

1-Bar
07-11-2012, 10:05 PM
Originally posted by TurboMedic


Yah, that was helpful today........

Haha you were on day shift i presume?!?

M.alex
07-12-2012, 02:13 AM
1.40am ... I could connect back to my work VPN. Yay :clap:

sputnik
07-12-2012, 05:43 AM
Originally posted by ercchry
with all the water-free fire suppression systems out there for server rooms... they had sprinklers?! :nut:

Sprinklers are actually the choice of monitored data centers.

With fire suppression like Inergen, everyone has to evacuate at even the smallest puff of smoke that sets off the system. With sprinklers you can fight the fire with an extinguisher, and only the sprinkler above the fire gets triggered.

Go to almost any high end data center (Q9, Shaw, Telus etc) and you will see sprinklers. Most will only have Inergen for if the fire is too big to handle and those systems are only manually engaged.

Zhariak
07-12-2012, 06:13 AM
Originally posted by rage2

Not only is it expensive, it's sometimes impossible from an architectural point of view with these legacy systems.

Lucky for us, VMWare has made things a lot easier to do multi-site redundancy for software that wasn't designed for it. Taking the software architectural costs out of the equation is a huge money saver.


Vmware HA and DRS for the win yooooo!!!!

Fire? What fire???

Zhariak
07-12-2012, 06:31 AM
UPDATE:

Some of my clients on Shaw biz downtown are coming back up right now as I type this!

eblend
07-12-2012, 06:47 AM
Originally posted by Zhariak



Vmware HA and DRS for the win yooooo!!!!

Fire? What fire???

Both HA and DRS are cluster-based features, underneath a datacenter in vCenter... they won't be doing you much good if the datacenter as a whole is down. Clusters are usually not geographically dispersed and are designed more for a single server failure than the failure of a whole site. Perhaps the VMware reference is more for SRM, which is designed for just this type of scenario, but even then it's only good for VMs, and a lot of places are not 100% virtual, so there would still be some work to be done to bring back their physical server infrastructure and, depending on the network setup, possibly their networking as well. Good to hear Shaw is coming back; at least their customers outside of the core didn't lose their internet, else geeks all over would have gone nuts without their pron..
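A toy model of that cluster-versus-site point, for anyone who wants it spelled out. This is not the VMware API, just plain Python; the host names and the single-site layout are hypothetical.

```python
from typing import Dict, Set


def ha_can_restart(cluster_hosts: Dict[str, str], failed: Set[str]) -> bool:
    """Cluster-level HA can only restart VMs if some host in the cluster survives."""
    return any(host not in failed for host in cluster_hosts)


def hosts_at_site(cluster_hosts: Dict[str, str], site: str) -> Set[str]:
    """All hosts in the cluster that live at the given site."""
    return {host for host, location in cluster_hosts.items() if location == site}


# Hypothetical cluster: every host sits in the same building.
cluster = {
    "esx-01": "shaw-court",
    "esx-02": "shaw-court",
    "esx-03": "shaw-court",
}

single_host_failure = {"esx-02"}
site_failure = hosts_at_site(cluster, "shaw-court")

print("Survives one dead host:", ha_can_restart(cluster, single_host_failure))
print("Survives losing the site:", ha_can_restart(cluster, site_failure))
```

One dead host leaves somewhere to restart the VMs; cutting power to the building leaves nowhere, which is why a site-level tool and a second location are needed for this kind of outage.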

sputnik
07-12-2012, 06:55 AM
Originally posted by speedog
And if I remember correctly from when I left TELUS almost 5 years ago, TELUS was moving to sprinkler systems in most of their network sites and getting rid of the Halon dump systems - just can't remember why anymore.

Halon and Inergen systems aren't preferred because you can't do any remediation work in the DC after it fills the DC with gas. Additionally, Halon (a halomethane) is no longer allowed to be installed or charged because it is an ozone-depleting gas, like CFCs.

With these fire suppression systems you basically have to hope the fire goes out and wait a few hours until you can get in and check for any damage. This is good for smaller server rooms that do not have someone on staff 24/7.

Telus' data center on 10th Ave (west of 14th St SW) has security on staff around the clock and if there is a fire they send one of those guys in with an extinguisher to put out the fire. If the fire is so bad that it is heating up the sprinklers, the wax seal on that sprinkler will melt and shower the localized area, and the water will run into (and drain from) the raised floor. The three Q9 sites throughout the city are the same.

Most fires in a data center will be minor electrical fires due to a faulty part. Usually not much more than a plume of smoke and some sparks from a short. Places like Telus and Q9 also have very strict rules about storing combustible materials like paper or cardboard in customer cages so the likelihood of a fire spreading is pretty low.

sputnik
07-12-2012, 07:12 AM
Originally posted by rage2
I deal with a lot of DR stuff at work, and a datacenter like Shaw's would have tons of redundancy built in at every level, except for physical location. Meaning if they host a website for you, don't expect them to have a second datacenter with your website mirrored and ready to go.

Getting someone's website mirrored at another location is probably the easiest thing to make redundant. Global load balancing on a layer 7 switch and a VM farm and you are pretty much good to go.

The real problem is Shaw provides point-to-point managed links to many of their customers downtown that terminate at Shaw Court. This is redundancy that you really can't virtualize because you are dealing with physical fibre. That said, the last company I worked for had redundant 10G fibre from every office building to both of their data centers. However, the bills for those connections were NOT cheap.
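As a rough illustration of the website half of that, the decision a global load balancer makes boils down to "send traffic to the first site that passes a health check". The sketch below is only that decision logic, with hypothetical site names and a stubbed-out health check; a real layer 7 switch or DNS-based balancer does the probing itself.

```python
from typing import Callable, Optional, Sequence


def pick_site(sites: Sequence[str], is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy site in priority order, or None if all are down."""
    for site in sites:
        if is_healthy(site):
            return site
    return None


# Hypothetical sites in priority order; the health check is a stand-in.
SITES = ["dc-downtown", "dc-suburb"]
currently_down = {"dc-downtown"}

active = pick_site(SITES, lambda site: site not in currently_down)
print("Serving from:", active)
```

None of this helps with the point-to-point links: if the fibre physically terminates in the building that lost power, there is no second site for the health check to fall back to.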


Originally posted by rage2
Datacenter explosions like this are pretty rare. Having a backup datacenter for redundancy is *very* expensive for most companies to even consider, so a worst case scenario like this usually isn't covered by most companies. It's a very high cost/low risk scenario.

Exceptionally rare. To have a generator fail and blow up a UPS would be very scary and not many companies (other than perhaps Q9) are in a position to work around a failure of this magnitude.


Originally posted by rage2
What I'm surprised at though, is that the City of Calgary, and Alberta Health Services does not have redundant datacenters that they can kick in if the primary DC goes down. If anything, critical services like that should be housed in multiple locations. Same goes for some of Shaw's critical infrastructure, ie home phones, and even their website. The city's 911 service isn't hosted with Shaw, I'm curious now if they have redundant backup datacenter in case their primary blew up.

It doesn't surprise me at all.

IT budgets in the public sector (especially health care) are some of the worst you will ever come across. Try convincing the government you want a couple million dollars a year more for infrastructure operations redundancy when there is an MRI or surgery waiting list.

Even medium sized companies (100-500 employees) often don't have the budgets to handle DR at this level either and Calgary is filled with companies like this.

Another issue is also testing. How many companies fail over their systems on a regular basis or run production systems for a period of time from their secondary site/servers?

I worked for a large financial company based out of Toronto and on a monthly basis would run their systems from their Markham DC and then the next month from their downtown Toronto DC. This was the only way that they could be sure that failing over if necessary could happen easily and with minimal downtime.
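A minimal sketch of that kind of monthly rotation, assuming a simple even/odd month rule. The site labels and which site gets which month are made up; the actual schedule that company used isn't something I know.

```python
import datetime

# Hypothetical labels for the two data centers.
SITES = ("markham", "toronto-downtown")


def primary_site_for(day: datetime.date) -> str:
    """Alternate the production site every calendar month."""
    months_elapsed = day.year * 12 + (day.month - 1)
    return SITES[months_elapsed % 2]


for month in (7, 8, 9):
    day = datetime.date(2012, month, 1)
    print(day.strftime("%Y-%m"), "->", primary_site_for(day))
```

Counting months across years instead of using the month number alone keeps the rotation alternating over the December/January boundary, so neither site sits idle for two months in a row.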

Was this incident at Shaw rare? Yes.

Does it surprise me that there were so many customers affected for so long? No.

sputnik
07-12-2012, 07:28 AM
Originally posted by Alterac
We are working on multihoming our internet here also, currently we use Rogers (Enmax really), and we are migrating it to Q9 Provided 200mbps fibre link.

Then we shouldnt have problems with our peers going down or anything like that, with the plus side of not having to manage the edge bgp.

Just make sure you have diverse paths for those fibre connections or a backhoe could take you out in a matter of seconds.

Or at least make sure that they are leasing the fibre loop from Enmax.

The nice thing about Enmax fibre is that it runs next to all of the high voltage lines. So should a backhoe dig up your fibre, the operator probably died doing it.
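The diverse-path check itself is simple if you can get the route details out of your providers: list the physical segments each link runs through and look for overlap. A minimal sketch, with entirely hypothetical segment names:

```python
from typing import List, Sequence


def shared_segments(path_a: Sequence[str], path_b: Sequence[str]) -> List[str]:
    """Physical segments (conduits, building entries, POPs) that both paths use."""
    return sorted(set(path_a) & set(path_b))


# Hypothetical routes for two "redundant" links out of the same office.
primary = ["entry-north", "conduit-3rd-ave", "pop-a"]
backup = ["entry-south", "conduit-3rd-ave", "pop-b"]

overlap = shared_segments(primary, backup)
if overlap:
    print("Not diverse, shared segments:", overlap)
else:
    print("Paths are physically diverse")
```

Here the two links use different building entries and different POPs but still share one conduit, so a single backhoe takes out both.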

TurboMedic
07-12-2012, 07:36 AM
Originally posted by 1-Bar


Haha you were on day shift i presume?!?

Yah, still not up today....waiting for my Epcrs to populate from yesterday!

eblend
07-12-2012, 07:42 AM
Originally posted by sputnik

So should a backhoe dig up your fibre, the operator probably died doing it.

Awesome! ;) Serves him right, not cuz of the fibre, but for digging where Enmax puts its cables without checking.

dexlargo
07-12-2012, 08:40 AM
Originally posted by sputnik
Another issue is also testing. How many companies fail over their systems on a regular basis or run production systems for a period of time from their secondary site/servers?
My work is affected by this, and we do have DR plans - the problem is deciding to implement the plan. Apparently, they have been waiting to get an idea of how long it's going to take Shaw to get back up and running. The decision is being made right now as to whether or not the backup centre is going to be activated for mainframe services - which is all that we're missing.

I'm told once the metaphorical button is pushed and the troops are scrambled, they expect us to run off the backup location within 12 hours. I have no idea why they didn't do this right away - not my department - but I imagine it's very spendy. Probably also because we aren't completely crippled and twiddling our thumbs without our mainframe, but if it's down for more than a day or two, it really will start to impact us.

EDIT - I misunderstood the information I was given, our disaster plan was activated within 1 hour of the outage, the backup site is active. The decision that is being made now is whether to go live with the backup mainframe - not sure what's affecting that decision.

Mibz
07-12-2012, 08:46 AM
Come on people, cloud. Cloud everything. Cloud would've prevented this. All my cloud shit is still up. Everybody should be on the cloud, there's really nothing better than clouding your stuff on the cloud.

Cloud.

So one of our main internet links at work is down. No big deal, we've obviously got another, but it was strange to me that it was fine last night and then had to be taken down this morning. I'm sure there's a valid reason, was just weird :P

EDIT: Ugh, I have to go through a WebWasher on this link. How can I get any work done if I can't go on Facebook?

KLCC
07-12-2012, 08:49 AM
It seems like AHS is still down for the time being....:banghead: :banghead:

** update: rumor floating around the office is that AHS will likely be back online around 2 pm at the earliest

rage2
07-12-2012, 09:11 AM
Originally posted by dexlargo
EDIT - I misunderstood the information I was given, our disaster plan was activated within 1 hour of the outage, the backup site is active. The decision that is being made now is whether to go live with the backup mainframe - not sure what's affecting that decision.
You might as well activate it at this point, even if there's data loss (my guess as to why they're not firing up the backup). From reports I was hearing last night, every floor has tons of water damage. They're trying to figure out what equipment is salvageable at this point. In any case, they're firing up all government systems first, so if you're just a customer, it'll take a while to come back online.

dannie
07-12-2012, 09:12 AM
Registries are still down, little chance of them coming back up this morning. We're being told later this afternoon at the earliest and even that is unlikely.

ercchry
07-12-2012, 09:15 AM
Originally posted by rage2

You might as well activate it at this point, even if there's data loss (my guess as to why they're not firing up the backup). From reports I was hearing last night, every floor has tons of water damage. They're trying to figure out what equipment is salvageable at this point. In any case, they're firing up all government systems first, so if you're just a customer, it'll take a while to come back online.

the woman showed up to work today (barlow) they told her to go home and put on "something she doesnt mind getting dirty" and head downtown :rofl:

mean while all the employees that reside in shaw court are enjoying a paid day off.. during stampede

ercchry
07-12-2012, 09:25 AM
my buddy made this

http://i.imgur.com/A7MFt.jpg

schocker
07-12-2012, 09:33 AM
Originally posted by ercchry
mean while all the employees that reside in shaw court are enjoying a paid day off.. during stampede

Employee 1: Hey I got tickets to doggie do tomorrow!
Employee 2: But we have to work!
Employee 1: Not exactly, watch this
KABOOOOOM!
:rofl:

sputnik
07-12-2012, 09:42 AM
Originally posted by ercchry
mean while all the employees that reside in shaw court are enjoying a paid day off.. during stampede

I doubt that. You can bet that anyone who works on the DC floors will be at work... or they are sleeping right now because they have been up for the past 24-30 hours.

roopi
07-12-2012, 09:45 AM
Originally posted by Mibz
Come on people, cloud. Cloud everything. Cloud would've prevented this. All my cloud shit is still up. Everybody should be on the cloud, there's really nothing better than clouding your stuff on the cloud.

Cloud.


:rofl:

Is Shaw's customer portal not available either? Shaw bill is due today. :thumbsdow

dexlargo
07-12-2012, 09:49 AM
Originally posted by rage2

You might as well activate it at this point, even if there's data loss (my guess as to why they're not firing up the backup). From reports I was hearing last night, every floor has tons of water damage. They're trying to figure out what equipment is salvageable at this point.
Not my decision. I just use the mainframe (actually, not me so much but a lot of what I do goes through it and comes from it, but I do use it directly on occasion.) We're told that no data prior to 1 minute before the outage will be lost, so I don't think data loss is the issue. The only thing I can think of is that it might be a headache to switch back to the normal production system and import all of the data from the backup back into the normal production environment, but I've been out of IT for a while, and never was in the mainframe stuff, so I honestly don't know what the considerations are.


In any case, they're firing up all government systems first, so if you're just a customer, it'll take a while to come back online.
It is the government - so maybe Shaw is telling us that they'll have us back up shortly? I don't know.

ercchry
07-12-2012, 09:52 AM
Originally posted by sputnik


I doubt that. You can bet that anyone who works on the DC floors will be at work... or they are sleeping right now because they have been up for the past 24-30 hours.

sorry, i mean SALES. not everyone in this world works in ICT or whatever... SALES team barlow=cleanup crew, SALES and whoever else she has on her facebook from shaw court=???

anarchy
07-12-2012, 10:06 AM
We're providing regular updates on our Twitter (@ShawHelp) and Facebook (facebook.com/shaw) platforms.

Employees located at Shaw Court are being asked to work from home today.

No firm updates on ETA but the vast majority of customers are up and running. Pretty optimistic about getting things back online by end of day.