Major queue processing improvements.
Due to lack of traffic and motivation, I haven't worked on this in a few months, besides a fix sometime in December for the queue trying to process nonexistent users. However, a 38% increase in users over the last 2 weeks, along with positive feedback, has put the spring back in my step. With ~550 users, my daily updates have been taking an excessive amount of time to complete: at about 2 users/minute, they stretch past 4 hours a day, or more than 1/6th of the day.
My original idea was to create a new queue system with around 5 workers. Whenever a worker needed to scrape something, it would put a request into a new download queue, which would perform the download and return the result while the worker waited. That'd require a bit of a rework of the worker though, so for the meantime, I took some elements of the new worker and altered the current one to process up to 5 different users at a time. I noticed, however, that it wasn't really improving the queue rate; the workers slowed down as there were more of them. I found that the bottleneck was not the scraping, but the number of SQL requests being made. One idea I'd had in the past, but wasn't sure how to implement, was to update only the series that the user has actually changed; there's no sense in continuously processing a show that they watched years ago and won't ever update again. At the time, I could only think of using the last-updated XML tag, which only reflects episode changes. Now, based on what I did with users as a whole to avoid processing inactive users, I hash the show details and compare the hashes to detect changes. I also reduced the number of SQL requests by bundling some of them together. I'm unsure how much this helps on its own, but I think it's a definite improvement. Finally, there's more robust error handling now, to hopefully prevent the queue from clogging. I think at one point an error had clogged the queue up to 5000, which meant almost 2 weeks of lost data.
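To make the hash-and-compare idea concrete, here's a rough sketch in Python. This is not the actual code running the site: the `user_shows` table, its columns, and the function names are all made up for illustration, and it uses SQLite only so the example is self-contained. It hashes each show's details, skips any show whose hash matches the stored one, and bundles the reads and writes into one query each.

```python
import hashlib
import json
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_shows (
    user_id      INTEGER NOT NULL,
    show_id      INTEGER NOT NULL,
    details      TEXT    NOT NULL,
    details_hash TEXT    NOT NULL,
    PRIMARY KEY (user_id, show_id)
)
"""


def show_hash(details: dict) -> str:
    """Return a stable digest of one show's details."""
    # sort_keys keeps the JSON text identical regardless of dict ordering.
    payload = json.dumps(details, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def update_changed_shows(conn: sqlite3.Connection, user_id: int, shows: dict) -> list:
    """Store only the shows whose hash changed; return their ids."""
    # One SELECT for all of the user's stored hashes, instead of one per show.
    stored = dict(conn.execute(
        "SELECT show_id, details_hash FROM user_shows WHERE user_id = ?",
        (user_id,),
    ))

    changed = []
    for show_id, details in shows.items():
        digest = show_hash(details)
        if stored.get(show_id) != digest:  # new show, or details changed
            changed.append(
                (user_id, show_id, json.dumps(details, sort_keys=True), digest)
            )

    # One batched write (in a single transaction) for every changed show.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO user_shows "
            "(user_id, show_id, details, details_hash) VALUES (?, ?, ?, ?)",
            changed,
        )
    return [row[1] for row in changed]
```

Calling `update_changed_shows` twice in a row with the same data returns an empty list the second time, so a show that was finished years ago costs one hash comparison per run instead of a database write.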
In the end, I was able to get the queue rate from 2 users/minute to about 9/minute, completing the daily updates in about an hour. That's 4.5 times the old speed (a 350% increase), not bad. I also added a link to this service's club on MAL and provided an explanation for ratings, HWTW, and Hipster Score on the main page.