Hey there! My name is Pierre Liebenberg. I’m an indie iOS developer, and I built Scan Thing: Scan Anything. I have a background in sociology, education, UI/UX design, and, more recently, software engineering. Currently, I lead a team of iOS developers at Paycom.
I’ve been working on Scan Thing for quite a long time. In fact, I wrote some of the foundations while I was working on my first app, Lexico: A Dyslexia Reading Aid, which I built and released in 2017 when we discovered our youngest son had severe dyslexia. While building Lexico, I found that the app needed a document scanner of some sort to make it easy for parents of children with dyslexia to digitize analog texts in a way that made them usable in Lexico. Lexico’s core functionality relies on PDFs, so a document scanner that simply converts an image into a PDF wouldn’t do the trick. Instead, we needed to run OCR against a scan and embed the recognized text (sized and structured with regard to line height and font size) in an image of the original document – something that no other PDF-scanning app did. Simply extracting the text wouldn’t do either, because, for children’s books, pictures provide important context for young readers. Therefore, we needed a high-fidelity digital version of a physical document where all the text was searchable and selectable.
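The general technique here – a “searchable PDF” with an invisible text layer under the original scan – can be sketched with Apple’s Vision and UIKit APIs. This is a minimal illustration of the idea, not Lexico’s actual implementation; the function name and structure are my own:

```swift
import Vision
import UIKit

// Sketch: recognize text in a scanned image, then render a PDF page with
// the original image on top of invisible (clear) text drawn at the detected
// positions, so the result is searchable and selectable but looks identical
// to the scan. Hypothetical helper, not Lexico's production code.
func makeSearchablePDF(from image: UIImage) throws -> Data {
    guard let cgImage = image.cgImage else { throw CocoaError(.fileReadCorruptFile) }

    // 1. Run OCR on the scan.
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    try VNImageRequestHandler(cgImage: cgImage).perform([request])
    let observations = request.results ?? []

    // 2. Render a single-page PDF: image first, then clear text over it.
    let bounds = CGRect(origin: .zero, size: image.size)
    let renderer = UIGraphicsPDFRenderer(bounds: bounds)
    return renderer.pdfData { context in
        context.beginPage()
        image.draw(in: bounds)
        for observation in observations {
            guard let candidate = observation.topCandidates(1).first else { continue }
            // Vision's normalized boxes use a bottom-left origin; flip to UIKit's.
            let box = observation.boundingBox
            let rect = CGRect(x: box.minX * bounds.width,
                              y: (1 - box.maxY) * bounds.height,
                              width: box.width * bounds.width,
                              height: box.height * bounds.height)
            // Size the font to the detected line height so selection tracks the scan.
            let attributes: [NSAttributedString.Key: Any] = [
                .font: UIFont.systemFont(ofSize: rect.height),
                .foregroundColor: UIColor.clear  // invisible, but still selectable
            ]
            candidate.string.draw(in: rect, withAttributes: attributes)
        }
    }
}
```

Drawing the text in a clear color keeps the page visually identical to the original scan while still exposing the text to search and selection.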
Once I had the document-scanning working in Lexico, I added basic text recognition, because, again, I wanted something that my son could use to quickly grab text from a sign or restaurant menu. I never shipped that code, but kept it in a small custom app for my son. After that, having solved two of the three use cases, it occurred to me that scanning text is one thing, but scanning anything is another thing altogether – and one that was worth pursuing.
The app’s name has been around since the initial idea. It was a working title, but the more I thought about it, the more it made a certain kind of sense. I mean, it scans anything, so giving it a nebulous name (kind of like Swamp Thing) reflected the fact that it’s difficult to pigeonhole the app.
I began trying my hand at image segmentation shortly after I solved the document-scanning problem. I gave up five or six times over the course of the last two years. It’s an incredibly hard problem to solve. I needed to get far more comfortable with machine learning and computer vision than I was at the time.
A large number of late nights and weekends later, I had a working prototype. After that, it was the usual “the last 10% of the app takes 90% of the time” timeline. I spent a lot of time making the UI as snappy, fluid, and minimal as possible. While building Lexico, I learned to simplify UI design so that it translates across abilities, age groups, and languages. As a result, you’ll notice modified UI components in Scan Thing that had their origin in Lexico (the selection rectangles and mode switches are good examples of this). I had to cut a few features for the 1.0 release in order to focus on polishing the remaining features. In particular, I left out flash support, support for the front-facing camera, text-scanning for photo-library images, and support for inserting alternative backgrounds behind scanned objects. At this point, I’ve issued updates to include all of these features, with the exception of the alternative-backgrounds feature.
I didn’t have much of a launch strategy other than to tell family and friends. I did, however, contact MacStories to let them know about the app’s release. They included it in their Club MacStories newsletter – for which I’m endlessly grateful. From there, iphoneblog.de picked it up, wrote a very kind review, and then things got very busy in Germany very quickly. I owe Scan Thing’s German and German-speaking users an enormous debt of gratitude for their support. They’ve been absolutely wonderful.
More recently, I put Scan Thing up on Product Hunt, where it spent part of the day in the #9 position because of the efforts of 200 or so amazing hunters (not least @Shawn Roos of Canva) before going down to #13. That exposure generated a lot of interest and a lot of positive feedback. iPhoneIslam also wrote a very positive review, and I’d like to thank Arabic-speaking users of Scan Thing for their support.
One of Scan Thing’s tentpole features is that it doesn’t require an internet connection, an account, or a subscription. There’s also no data collection or ads, and all the processing happens on the device. As a rule of thumb, I also avoid using third-party frameworks and libraries in my code base. This is partly because I’m stubborn, but also because I don’t like maintaining dependencies, and any code that I let into my code base potentially compromises my users’ privacy.
This approach meant that I had to get comfortable with machine learning and computer vision to the extent that I didn’t have to rely on cloud-based solutions. There is an enormous body of open-source work in the computer-vision space (not to mention learning resources), which I found immensely helpful.
Stanford’s Introduction to Computer Vision course is a great place to start. The lectures are freely available on YouTube. After that, Google’s free machine-learning crash course is another great resource. We ultimately stand on the shoulders of giants, and I’m very grateful for the work done by brilliant engineers and data scientists in the field.
There aren’t any ready-made solutions out there for the kind of thing that Scan Thing does, so a large part of the development process included learning and understanding work done in the open-source space and writing large amounts of proprietary code that make Scan Thing what it is today. In terms of languages, Scan Thing’s development involved Swift, some Objective-C, and some Python.
At the first-party level, Apple’s Core ML and Vision APIs are extremely powerful; I recommend using them if you haven’t already.
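To give a taste of how approachable Vision is: objectness-based saliency, available since iOS 13, highlights the most prominent objects in an image, which is a natural starting point for segmenting “anything” out of a photo. This is a generic, hedged illustration of the API, not Scan Thing’s actual pipeline:

```swift
import Vision

// Sketch: find normalized bounding boxes for the most salient objects in an
// image using Vision's objectness-based saliency request. The observation's
// pixelBuffer is a low-resolution heat map you could threshold into a mask.
// Hypothetical helper for illustration only.
func salientObjectBoxes(in cgImage: CGImage) throws -> [CGRect] {
    let request = VNGenerateObjectnessBasedSaliencyImageRequest()
    try VNImageRequestHandler(cgImage: cgImage).perform([request])
    guard let observation = request.results?.first else { return [] }
    // Each salient object comes back as a rectangle in normalized,
    // bottom-left-origin coordinates (0...1 on both axes).
    return (observation.salientObjects ?? []).map { $0.boundingBox }
}
```

A few lines like these get you surprisingly far before any custom machine learning is needed.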
I think the biggest benefit is that you can work on whatever it is that you’re passionate about. Passion for a subject and success are not, however, synonymous no matter how much work goes into it.
I tend to avoid discussing upcoming features in specifics, but, broadly, I have a number of feature requests that I’d like to implement (document-scanning as the default scanning mode is a big one, as is 360-degree scanning), and I’d like to add the alternative-backgrounds feature sooner rather than later.
I’m also working on a 2.5D side-scroller called Signal, which I’m building in Unity, but that’s a conversation for another time.
I don’t have a traditional computer-science background, as I’m completely self-taught (this is more common than you might think). I learned Swift – and programming in general – by watching Lynda.com courses available through our local public library. I also – I kid you not – used Swift Playgrounds to learn. It’s a great resource, and I highly recommend it for people who are new to, or curious about, programming.
As far as games go, I’ve been absolutely obsessed with Playdead’s Inside ever since they released it.
My favourite and most-used app is Reeder (it’s an RSS reader and the attention to detail in that app is astounding).