Forum performance issues

Started by Guinness, October 11, 2011, 07:28:58 AM


Guinness

Right now, I'm collecting data. For instance, see the screenshot from an offsite monitoring and reporting tool I started using yesterday. It shows the median, etc. access times on a timeplot (times are actually Stockholm time, so +6 from Eastern US time). At least in that small dataset, we can see that the time to access the homepage went up quite a bit around 7 AM in Stockholm (1 AM in Atlanta).

So that's a little odd. Our traffic is more or less infinitesimal in the great scheme of things. At peak we do about 350 hits per hour, which is 5.8 hits per minute. About 1/5 of those are homepage hits, and only about half hit the DB at all. So in practice we're looking at fewer than 3 visits to the database per minute. It's also not just the database, as stuff that doesn't come out of the database, like CSS files, occasionally takes forever too. I've done everything I can to optimize the site for browser caches, but still I see the slowness too.
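For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The function and constant names are made up for illustration; the inputs are just the rough figures quoted above.

```python
# Rough peak-traffic arithmetic using the figures quoted in this post.
PEAK_HITS_PER_HOUR = 350  # approximate peak hits per hour
DB_FRACTION = 0.5         # "about half" of all hits touch the database


def hits_per_minute(hits_per_hour: float) -> float:
    """Convert an hourly hit count into hits per minute."""
    return hits_per_hour / 60.0


def db_hits_per_minute(hits_per_hour: float, db_fraction: float) -> float:
    """Estimate how many hits per minute reach the database."""
    return hits_per_minute(hits_per_hour) * db_fraction


if __name__ == "__main__":
    print(f"hits per minute: {hits_per_minute(PEAK_HITS_PER_HOUR):.1f}")
    print(f"DB hits per minute: {db_hits_per_minute(PEAK_HITS_PER_HOUR, DB_FRACTION):.1f}")
```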

So I've told the webhost support team all that, and I'm waiting to hear back. If it goes on like this, I guess I'll have to go shop for hosting somewhere else (or something).

Anecdotal reports of slowness aren't actually all that useful for solving a problem like this. What's really needed is metrics, which I'm working on collecting. If you are getting slow page renders, and you know how to use Firebug in Firefox or Web Inspector in Chrome, please tell me what object or objects you saw that were slow. Also, if you haven't already taken note of the "page created in" stats at the bottom of the page, please take note of them when reporting slowness. Thanks!



Nobody

OK, I just installed Firebug - so what do you need Guinness?

Most of the slowness is reported as "waiting", and by far the longest wait is for the actual "index.php?..." itself.
Oh and 3 Errors are reported ("$ is not defined" and 2x "jQuery is not defined").
And there is a 404 (missing picture "topnav_bg.gif") and a 400 (bad request "%buttonSrc%").

Guinness

That's about what I expected to see...

The 404 is a bug in the theme. I suppose I could fix that, but it's not slowing stuff down.

The 3 JS errors: same story.

Most of the slowness I see is rendering index.php too.

I've been mining access logs, and it looks like the good bots and Bing have decided to do a full crawl. This is probably not helping, and I suppose could be our real root cause. It's crawling fairly aggressively. I've logged into Bing's (and Google's, and Yahoo's) webmaster tools and tried to adjust their crawl interval, but it'll probably take hours, if not days before they take notice and back off.

I don't want to just block the spiders, as we want to have good ranks in searches, so I'm trying to tread lightly there.
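For the curious, the kind of log mining I mean can be sketched roughly like this in Python. This is only an illustration, not the exact script I ran. It assumes the access log is in Apache "combined" format (user agent as the last quoted field), and the user-agent substrings and function names are my own guesses at what to look for.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, where the timestamp is in
# square brackets and the user agent is the last double-quoted field.
LINE_RE = re.compile(
    r'\[(?P<day>[^:]+):(?P<hour>\d{2}):[^\]]*\].*"(?P<agent>[^"]*)"\s*$'
)

# Assumption: substrings that identify the crawlers of interest.
BOT_MARKERS = {
    "bingbot": "Bing",
    "msnbot": "Bing",
    "googlebot": "Google",
    "slurp": "Yahoo",
}


def classify(agent: str) -> str:
    """Map a user-agent string to a crawler name, or 'other'."""
    lowered = agent.lower()
    for marker, name in BOT_MARKERS.items():
        if marker in lowered:
            return name
    return "other"


def hits_by_bot_and_hour(lines):
    """Count hits per (crawler, hour-of-day) pair; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        counts[(classify(match.group("agent")), match.group("hour"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_bots.py < access.log
    for (bot, hour), n in sorted(hits_by_bot_and_hour(sys.stdin).items()):
        print(f"{hour}:00  {bot:<8} {n}")
```

Sorting the output by hour makes it easy to see whether one crawler's hits line up with the slow periods.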

Guinness

I've made a little config tweak which shouldn't affect anything, but if you have trouble posting or whatever, try to let me know. This post is mainly to make sure I can still post. :)

Guinness

When I get a moment here and there today, I may try a few things that may make the forum slower or unavailable for a few minutes. I apologize in advance.

The webhost has more or less conveyed to me that we're on a shared host and not paying a whole lot, so this problem isn't worth their time. I suppose if whoever else is on the same host starts complaining, something might magically happen, but I'm not holding my breath. So if the few tweaks I've got left in the bag don't significantly improve performance, I'll have to quit being stubborn, eat the relative pittance I've already paid them, and go shopping again.

If I end up moving the forum (again), the time-to-live on the DNS record can be no smaller than 4 hours. This means that you might end up seeing the old forum in maintenance mode for that long, or longer, while various dns servers upstream of you catch up to the change. If that becomes necessary I'll announce it in advance.

Thanks for hanging in there with me.

Carthaginian

Guinness - just let us know what you are doing, and when you are doing it.
If things are unfixable and we do have to move, it's not like everyone can't chip in a few bucks to get things working somewhere new.
So 'ere's to you, Fuzzy-Wuzzy, at your 'ome in old Baghdad;
You're a pore benighted 'eathen but a first-class fightin' man;
We gives you your certificate, an' if you want it signed
We'll come an' 'ave a romp with you whenever you're inclined.

Guinness

Well, it's fast now... Grrr.

I did a last ditch attempt at getting support by sending them examples of the queries that were slow here, but fast on my 8 year old junk extra computer at home. Maybe that got someone's attention. It usually takes 12 hours to get a response to any of my emails to the helpdesk.

At any rate, I've shopped around a little bit. I went with stupid-simple shared hosting to try to keep it simple: I don't want to spend a lot of time adminning stuff like a mail server if I don't have to, so that model worked for me. If I have to go somewhere else, I may end up renting a virtual private server of my own, and moving my personal hosting stuff there too. In that case, separating navalism's costs from my own would be tough.

At any rate, we're talking an amount of money that's below my personal limit for whether or not accounting overhead is worth it, so passing the hat around is probably not necessary.

Nobody

Whatever you did Guinness, seems to work. The page was considerably faster today. Well, relatively speaking at least. But at <5 seconds, that is just on the good side of acceptable - at least for me - although that is still 20 to 50 times slower than before.

Guinness

I think maybe I got their attention. Someone restarted mysql in the last hour...

We'll see how it goes.

Guinness

Signs point to good. Here's a graph of response time as seen from a server in Sweden.

14:30 on that graph is when mysql on the shared host was restarted. Response times since then: much better.

ctwaterman

Hooody... Hoo.... man just getting people to reboot a server seems to be a serious bit of work...

Charles
Just Browsing nothing to See Move Along

Nobody

Response time was excellent after your post yesterday (0.09 seconds), but now it's back to 2 to 5 seconds plus occasional time-outs. Or did you do something a couple of minutes ago?

Guinness

I have access to only limited metrics on the host itself, but between 3 and 4 am my time, I don't see any glaring issues. For instance, there is something that logs any SQL query that takes longer than 1 second, and none of those were logged in that hour. Only a single one was logged at 2am and another at 6 am, but those are both related to scheduled maint. jobs that scan the table holding all the posts in the forum, so I believe that's just normal. Attached is the last two days of access times from the offsite monitor in Sweden.

The other big thing in the monitoring I can see is that the standard deviation of the checks (meaning how much individual checks tend to deviate from the average) is way way down since yesterday afternoon. I take this as a good sign. It means we're getting much more consistent page load times. If I had a choice between 9 loads taking a 10th of a second and one taking 10 seconds, and all 10 taking 1 second, I'd take the latter.
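To put numbers on that trade-off, here is a small sketch using Python's standard `statistics` module. The two lists are just the hypothetical scenarios from the paragraph above, not real monitoring data, and `summarize` is a helper name I made up.

```python
import statistics

# Hypothetical load times in seconds, taken from the example above.
spiky = [0.1] * 9 + [10.0]   # nine fast loads and one very slow one
steady = [1.0] * 10          # every load takes one second


def summarize(times):
    """Return the mean, median and population standard deviation of the load times."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "stdev": statistics.pstdev(times),
    }


if __name__ == "__main__":
    for label, times in (("spiky", spiky), ("steady", steady)):
        s = summarize(times)
        print(f"{label:<7} mean={s['mean']:.2f}s median={s['median']:.2f}s stdev={s['stdev']:.2f}s")
```

The two scenarios have nearly the same mean (about 1.09 seconds versus 1 second), but the spiky one has a standard deviation close to 3 seconds while the steady one has none. That gap is what the monitor is showing as having shrunk.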

One thing I do see from that time: Bing's bot has been crawling the site for the last few days. I asked it to dial it back the day before yesterday, but one of their machines didn't seem to pick that up. It quit finally at 5 am today.



Jefgte

That's working normally for me to open or post now.
About 0.2 or 0.3sec

Tks

Jef
"You French are fighting for money, while we English are fighting for honor!"
"Everyone is fighting for what they miss. "
Surcouf

Guinness

In an attempt to ensure a consistent experience, I've been tinkering with using Amazon's cloudfront service to serve static content from the forum.

Cloudfront is a "content delivery network". The idea is it caches files, distributes them behind the scenes, and then when you request them, they come from the node in their network "closest" to you, which should mean they should come nice and fast.

For now, only the Curve theme has been changed to use this, as well as everyone's avatars. The Avatar files were actually the slowest objects on the site last week, even slower than the homepage or topic list pages. I'm going to let that bake a bit as we say in the business, then see how the results look.

The Cloudfront pricing model is quite generous for little sites. If my math is right, this would only cost me $0.1275 (US) a month. :)