Explain to me why websites crash



ExtraSlow
09-19-2016, 10:45 AM
OK, I'm sure this isn't rocket science, but it's been bugging me.

I've heard a lot of stories of websites that get short-term surges in traffic, and that causes them to be down for a while.

I get that bandwidth, or server capacity, could be a limiting factor, but what ACTUALLY happens? Does it mean only a percentage of the people trying to visit the page get through, or is everyone just really slow, or is the whole thing down for a period?

Recently experienced this while trying to sign up for parent-teacher interviews through the CBE website. Surely they have a pretty good idea of how much traffic they can expect? After all, they know how many kids they teach, and how many teachers they employ.

No idea if they staggered the log-in times for different schools. Seems like they probably should.

Since I now run two websites, I'd like to know if this is something I should be worrying about for the future. I'm hoping at some point I'll surpass my current average of about 100 visits a month . . . :whipped:

Anywhoooooo . . . .
talk to me like an idiot, how does this happen?

taemo
09-19-2016, 11:11 AM
Either there are suddenly too many users trying to access your webserver, or your server or network isn't able to handle the amount of traffic.
https://www.youtube.com/watch?v=bL_YPUejMdw

codetrap
09-19-2016, 11:36 AM
Incompetent admins.

-relk-
09-19-2016, 11:36 AM
The above video explains it well.

An easy analogy that works for me is to think of it like a 2-lane tunnel. It can easily handle 100 cars/hour, but throw 10,000 cars from 10 lanes at it in a minute, and traffic will grind to a standstill. Stop the influx of cars and reset everything, and traffic is running smoothly again.

revelations
09-19-2016, 11:41 AM
I'm sure web dev guys could list a dozen reasons why a website crashes (refuses to load).

DDoS is just one.

rage2
09-19-2016, 11:49 AM
Originally posted by codetrap
Incompetent admins.
It's not that simple. You can throw as much hardware at it as you want and it could still fail due to limitations in what's delivering the web pages. It's an easy solution for a static website, but when you start getting into dynamically generated content, where there are components involved for session data management, or database backends for generating said dynamic content, it gets much more complicated. Every single component needs to be able to scale up quickly to avoid becoming a bottleneck, and the code + services need to be able to work in that model efficiently.
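
To make that concrete, here's a rough sketch of what one dynamic page load can touch. This is a made-up Python/Flask example, not anyone's actual stack; every numbered step is a separate component that has to scale:

# Hypothetical Flask app showing the moving parts behind one dynamic page.
from flask import Flask, session
import sqlite3  # stand-in for whatever database backend you actually run

app = Flask(__name__)
app.secret_key = "example-only"  # required for session support

@app.route("/")
def home():
    # 1. Session data management: per-visitor state touched on every request
    #    (a signed cookie here; a shared session store in bigger setups).
    session["visits"] = session.get("visits", 0) + 1

    # 2. Database backend: a query per request to build the content
    #    (assumes content.db exists with a matching row).
    db = sqlite3.connect("content.db")
    body = db.execute("SELECT body FROM pages WHERE slug = ?", ("home",)).fetchone()[0]
    db.close()

    # 3. Rendering: CPU and memory spent assembling the response itself.
    return f"<p>{body}</p><p>Visit #{session['visits']}</p>"

If any one of those steps can't keep up under load, every page is slow, no matter how much hardware sits in front of it.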

Hardware scaling has historically been a problem on the IT side of things: balancing the cost of running infrastructure that can handle the peaks against what to do when you don't have enough hardware for an unpredicted spike. It's not like you can run to a corner store, buy some hardware, and have it running in minutes to handle unanticipated loads. AWS has pretty much solved that problem, though it's a bit of work to make it all run seamlessly.
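
The AWS version of "scale up quickly" is mostly configuration these days. A sketch with boto3 (the group and policy names are made up; target tracking is a real AWS scaling policy type):

# Hypothetical example: tell an Auto Scaling group to add or remove
# instances to keep average CPU around 60%. "web-asg" is a made-up name.
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",
    PolicyName="scale-on-cpu",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,
    },
)

The "bit of work" is everything around it: the app has to be stateless enough that new instances can join and old ones can die without breaking anyone's session.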

G
09-19-2016, 12:16 PM
^what the Boss man said.

Amazon, Microsoft Azure and the like will definitely help you scale more easily, because they take care of all the hardware issues if you use their PaaS.

ExtraSlow
09-19-2016, 12:23 PM
Well, let's look at my company website as an example. It's super simple and has no dynamically generated content. My host (SiteGround) suggests that I'm good for around 25,000 visits per month, or roughly 800 per day. What happens to me if, say, ten thousand people try to access it at around the same time?

rage2
09-19-2016, 12:38 PM
Are you asking what actually happens when it dies from an influx of traffic?

I dunno what SiteGround is using to determine the 25k/month visit rate, but whatever it is, pretty much the same thing happens as you run into bottlenecks. As connections to the site ramp up, each page takes longer to serve due to a lack of bandwidth or a lack of CPU/memory to process the request. As more users try to load pages, get impatient, and press F5, load times keep getting worse, until they surpass a timeout (say, the user's browser's) and requests just error out. Once you're into long load times, a side effect is that concurrent sessions spike up dramatically, since page loads aren't completing fast enough, and you can start hitting server-side limits on concurrent connections, throwing server errors.
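
You can see that death spiral with a back-of-the-napkin queueing formula (M/M/1: average time in the system = 1 / (service rate - arrival rate)). The numbers below are made up, but the shape is exactly what kills sites:

# Toy numbers only: a server that can process 100 requests/second.
service_rate = 100.0

for arrival_rate in (50, 80, 90, 95, 99, 99.9):
    avg_seconds = 1.0 / (service_rate - arrival_rate)
    print(f"{arrival_rate:5.1f} req/s -> avg response {avg_seconds * 1000:8.1f} ms")

# 50 req/s gives a 20 ms response; 99.9 req/s gives a 10 second one.
# Add the impatient F5 presses (extra arrivals) and you go past capacity,
# where the queue only grows until timeouts and connection limits kick in.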

That's a simplistic view of what's happening under the hood. Without looking at your exact setup, can't tell you where the bottlenecks are.

With that being said, static websites are the easiest to fix. Sign on to a caching service such as Cloudflare, and fucking set it to cache everything. If your site is small enough, it's completely free. All the pages will be served from Cloudflare's edge, and your server will only get a couple of hits a day as Cloudflare refreshes its cache. Bonus: it protects you from DDoS attacks as well (if you pay for it on the higher plans). The drawback is that whenever you update your site, you have to wait for the caches to expire, or tell CF to flush the cache for a specific page so it grabs the new content.
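
Your server's half of that setup is basically one response header. A bare-bones sketch using Python's built-in server (the max-age value is just an example; the "cache everything" part itself is configured in Cloudflare's page rules):

# Minimal static file server that tells any shared cache (Cloudflare, etc.)
# it may hold pages for up to an hour.
from http.server import HTTPServer, SimpleHTTPRequestHandler

class CacheFriendlyHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        self.send_header("Cache-Control", "public, max-age=3600")
        super().end_headers()

HTTPServer(("", 8000), CacheFriendlyHandler).serve_forever()

Once the edge has a copy, your origin only sees the occasional refresh, which is why a 100-visits-a-month box can survive a 10,000-visit hour.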

thetransporter
09-19-2016, 09:47 PM
Bad scripts
Insecure scripts
Construction workers cutting fibre optic cables to websites that don't have more than one provider
Or people cutting fibre optic thinking it's copper to steal and sell for food or addictions, again hitting websites that don't have more than one provider

Dell servers had bad capacitors that would pop

DNS records not being updated at the end user's ISP, or getting stuck

A service watcher not working

Sometimes it's hardware... I find even L2 "carrier class" network switches can lock up when they're not even busy, or under DoS attacks

A faulty network cable can lock up a switch or other hardware (even high quality ones)

But honestly, if you really think about it, with all the redundancy in place it's pretty amazing where tech is.
I think it's cool how some servers have two different power supplies; even RAID is cool.

I think all that effort built into networking and computers is underappreciated, and yet you see used servers go for next to nothing.

codetrap
09-19-2016, 10:55 PM
So, to summarize: if you have a good admin that knows what they're doing, plans correctly, and then sets up the environment properly...? I get my response was simplistic. But that doesn't make it untrue.

rage2
09-19-2016, 11:02 PM
Originally posted by codetrap
So, to summarize: if you have a good admin that knows what they're doing, plans correctly, and then sets up the environment properly...? I get my response was simplistic. But that doesn't make it untrue.
What I'm trying to say is that you can have the best admin in the world with unlimited hardware, and they'd still be at the mercy of the web developers' code quality.

ZenOps
09-20-2016, 06:28 AM
Scripting timeouts.

If a website fails to move a minimum amount of data in a specific period of time, the web browser or even the OS may determine that the connection is severed. In that case it usually stops trying, forcing you to press the refresh button if you're that desperate to try again.
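
Any HTTP client shows the same behaviour. A quick illustration with Python's requests library (the URL and the deadlines are just example values):

# timeout=(connect, read): give up if the server doesn't answer in time.
import requests

try:
    r = requests.get("https://example.com/", timeout=(3, 10))
    print(r.status_code)
except requests.exceptions.Timeout:
    # The programmatic version of the browser's error page, and of the
    # user mashing refresh, which only adds more load to a dying server.
    print("timed out")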

codetrap
09-20-2016, 11:21 AM
Totally fair. I suspect in CBE's case, it's shit code, with shit hardware, and shit network.

Xtrema
09-20-2016, 12:03 PM
Originally posted by codetrap
Totally fair. I suspect in CBE's case, it's shit code, with shit hardware, and shit network.

It's called budget constraints.

codetrap
09-20-2016, 12:40 PM
.