After another short downtime, everything is fixed! I had to recreate an index that got lost in the migration; its absence was slowing down the DB and everything else. No amount of resources was going to help.

But it is fixed now, and all queues are empty or very close to it. We will now downgrade the DB to a more sane level (we upgraded it briefly to keep the system running). It will still be a pretty hefty system for us, and we can always scale back up when needed.

TL;DR: everything is working fine now.

PS: We are now going to work on a staging environment to test upgrades so we can safely start moving the main server through the upgrade cycle. Stay tuned.

So we found the real problem haunting us. Turns out we didn't even need the bigger database. There was just an index that got dropped during the migration. We are working now to put it back in place, at which point things should be back up to their normal speed.

I just upgraded the DB server to 2x the CPU. This seems to fix the underlying issue. The pull queue (not related to most things) is now recovering as well.

Grateful... Gratitude... Thanks... Thank You... Appreciate The Work... Many Thanks @freemo

Also uploaded a smaller size picture as I think you can still see the "thanks" even when auto-enlarged! :D

So image upload works but is slow. Going to look into why now that the queues are mostly caught up.

One last update before I go to bed and disappear for 12 hours.

The backlog has gone from 1.2 mil at its peak earlier today to 0.4 mil now after we reconfigured things. It is steadily going down, and everything should be back in working order before I get up.

One or two people were able to get images loaded after a VERY long wait. So while images still aren't working, it seems very likely to be related to the backlog. In a few hours, when the backlog clears, I expect image uploads should work again. If not, I will check what the problem is in the morning.

Other than that most things appear to be working and everything should be functional soon.

The backlog is about 2/3rds complete. This afternoon it peaked at 1.2 million and now it is ~0.4 million. I just moved to pg_bouncer to speed that up a bit. Looks like I more than doubled the processing speed. Almost there.
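For anyone who wants to check the "2/3rds" figure, here is a minimal sketch of the arithmetic using only the backlog numbers quoted in these updates. The function name `fraction_cleared` is made up for illustration; it is not part of Sidekiq or Mastodon.

```python
def fraction_cleared(peak_jobs: float, current_jobs: float) -> float:
    """Share of the peak backlog that has already been worked off."""
    if peak_jobs <= 0:
        raise ValueError("peak_jobs must be positive")
    return (peak_jobs - current_jobs) / peak_jobs


# Backlog sizes quoted in the updates, in millions of jobs.
peak = 1.2
for current in (0.9, 0.7, 0.6, 0.4):
    share = fraction_cleared(peak, current)
    print(f"{current:.1f}M left -> {share:.0%} of the peak cleared")
```

With a peak of 1.2 million, 0.6 million remaining is the halfway mark and 0.4 million remaining is two thirds cleared, which matches the numbers in the posts.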

WOOT! The sidekiq backlog is now at 0.6 million... so it's 50% of the way through from its peak! Finally I can get some sleep tonight.

So the Sidekiq backlog is still progressing. A few hours ago, at the peak of the problem, our backlog was 1.2 million jobs. Right now it's down to 0.7 million jobs. It is steadily decreasing; there is just a lot to get through from the downtime. We are expecting everything to be back in working order when it's done, which should be around end of day. In the meantime things are still usable, but you may experience very long lag on some actions.

Images still can't be uploaded; we hope this is the same problem.

It boggles my mind how people will hate more vigorously for someone's opinions on how governments should be run or obscure ethical opinions, and yet a person's actual actions, and how they treat others, mean almost nothing.

Like you can be the worst person in the world as long as you're doing it to someone who has some obscure opinion you disagree with... like how are virtual ethics more important than the actual ethics a person acts on?

Since that last tweak our queue has gone from a 1.2 million backlog down to 0.9 million and continues to decrease.

We are also setting up a mechanism that should allow me to speed up the process further soon, but either way we are still on track to be caught up soon. At that point hopefully all functionality will be restored (I found a few issues and fixed them). If not, we can start addressing any remaining problems at that time.

So the queue is going down... it's just a lot of old errors retrying. Gonna try pg_bouncer to up the connections.

@freemo I never doubted you could do it. 🙃

Now about the "24 horus". I know Horus the Falcon, Horus the Elder, Horus the Younger, Horus the Child and Horus of Behdet.
That's only 5. The question is: Who are the other 19? 😉

Insidious question: Did you try to make an upgrade while moving to a new server? :ablobblastoff:

I had a lot of things I wanted to say, but I ultimately forgot them when I couldn't spill them out during the maintenance of qoto.

Several days ago I still didn't think of myself as a Mastodon user, until I noticed qoto was down and I had nowhere to post my words. Thinking it over a bit, I realized that qoto has become my major platform to spill out words. And I'm feeling comfortable with that. What a luxury.

(I just remembered I still have no update for my blog for this month. And the clock is ticking)

As spam comes in, we will block the servers on the . Please be patient; this is happening across the whole fedi, and we are working on better ways to address it.

update.

So I went to bed last night and woke up to find the sidekiq workers were backlogging and we had 800K backlogged jobs. It was due to a misconfiguration that I have now fixed, and it appears the backlog is quickly resolving itself.

If you noticed any weirdness, this should be resolved in the next few hours as the backlog clears.

@freemo thank you for all your work on this. When the dust settles, would it be possible to have a dedicated account to follow, just for maintenance and downtime updates? (Preferably on a different server, in case qoto.org is down)

is back up.

Please keep in mind it will need quite a few hours to handle the backlog from downtime. Tomorrow we are going to split out the workers so they can begin using the scaling, so tomorrow this should be fixed. For the next 24 hours expect things to be a bit slow, and uploading pictures probably won't work until we fix that.

Give it 24 hours and hopefully things will be back to normal at that time.

Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.