From Pete’s Desk is a blog series by Peter Hudson, co-founder and CEO at BitLit.
There’s a piece of advice that I’ve heard in start-up circles: “do stuff that doesn’t scale”… It’s good advice, because it means you can focus on doing anything you can to solve the problem you’ve got right in front of you. Well, this morning there was a story about us on Lifehacker and then one on CNET… and the influx of new users caused more than a bit of smoke to start coming out of our servers.
As of 9:04 AM Pacific Time, we’re working on spinning up new servers on AWS, but there seems to be a limit of 20 EC2 instances that they’ll let you spin up without getting approval for a limit increase on the account. Amazon: I know you guys creep this site… if you’re reading this, increase our EC2 limit so we can actually process all the shelfies that people are sending!
For anybody with a technical bent, our stack is Scala on Play, with ZooKeeper to coordinate the C++ computer vision processes. Since AWS is being fussy about letting us spin up more servers, we’re trying to deploy more shelfie processes on Google Compute Engine.
Thanks for all the support and interest. Cue the Twitter #failwhale… and just because we recently signed The Oatmeal’s publisher, here’s an artist’s impression of a Tumble Beast doing horrible things to our servers:
--- Update as of 14:26 Pacific Time --- We’ve fixed a few things and the shelfie segmenter (the thing that slices up a shelfie into individual books) is now working nicely at scale. But we’re still having some issues with PostgreSQL on RDS. Stay tuned. All the shelfies that are getting uploaded will be processed… the queue is just a bit deeper than we’d like to admit.
--- Update as of 17:31 Pacific Time --- We’ve fixed a scaling and indexing bug in Elasticsearch, so the shelfie queue should start processing a lot faster now… but there are over 20,000 shelfies in the queue, so it might take a while for the backlog to clear. Thanks, everybody, for the positive tweets and encouragement today.
--- Update as of 19:42 Pacific Time --- We’ve fixed a bunch of scaling issues with Elasticsearch, but now it seems like PostgreSQL in our RDS instance on AWS might be a bottleneck. The bad news is that fixing that might mean a 2 to 3 minute server-side shutdown. Fingers crossed we won’t have to do that.
--- Update as of 1:31 AM Pacific Time, Dec 24 --- Things are running much more smoothly now. Marius has been Gandalf the White of Elasticsearch & AWS today. But that said, there’s still a HUGE queue of shelfies that need to get run through the neural networks for spine recognition. We just checked and the queue was over 30,000 shelfies deep… so at the current rate it’ll take up to 12 hours to clear the backlog. So that “your shelfie will take 15 minutes to process” message in the app is going to be wrong for the next 24 hours or so. Thanks again to everybody for the support and tweets and emails. It’s been a completely wild day: more people took shelfies today than took them during the whole of 2014! You’re amazing. We’ll keep working to make BitLit awesome and get you free/cheap bundled ebooks.
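For the technically curious, here is a rough sketch of the back-of-the-envelope math behind that 12-hour estimate. The 30,000 and 12-hour figures come from the update above; the function name, the constant-rate assumption, and the 500-per-hour arrival figure are hypothetical, purely for illustration, and not our actual monitoring code.

```python
def hours_to_clear(backlog: int, processed_per_hour: float,
                   arriving_per_hour: float = 0.0) -> float:
    """Estimate hours until the queue is empty, assuming constant rates."""
    net_rate = processed_per_hour - arriving_per_hour
    if net_rate <= 0:
        # New shelfies arrive at least as fast as we process them.
        return float("inf")
    return backlog / net_rate


# 30,000 queued and "up to 12 hours" implies roughly 2,500 shelfies per hour.
implied_rate = 30_000 / 12

print(hours_to_clear(30_000, implied_rate))  # 12.0
# Hypothetical: if 500 new shelfies per hour keep arriving, the backlog lasts longer.
print(hours_to_clear(30_000, implied_rate, arriving_per_hour=500))  # 15.0
```

The point of the second call is that the estimate only holds if new uploads slow down; any steady stream of new shelfies stretches the time to clear the queue.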
--- Update as of 9:08 AM Pacific Time, Dec 24 --- Things are now running pretty well. The shelfie queue is processing and hopefully we’ll have everything processed and out by the end of the day today (Merry Christmas, here’s your shelfie result and some free ebooks). Thanks again (and again) for all the positive emails, tweets, and comments! It means the world to us to see that you all seem to like what we’ve been working on for the last two years.
--- Update as of 15:54 Pacific Time, Dec 24 --- Things have been running smoothly for most of the day. We were able to move our RDS instance up to a db.m3.xlarge during a lull at around 3:00 this morning. If you were taking a shelfie during that two-minute window, we might have lost it. Sorry about that. But the good news is that the upgrade seems to have solved the bottleneck. The OCR servers have been plowing through the queue of shelfie images, which at its peak was 30,000 deep. It’s now down to around 11,000 and we’ve been staying in front of the load all day. Thanks, everybody, for the support and understanding as we got things to scale up over the last 36 hours. The enthusiasm from readers around the world about BitLit, ebook bundling, and shelfies has been an incredible early Christmas present. Thank you and Merry Christmas to all.
Pete & Marius