Do Not Use a Chainsaw To Slice a Mango
Data cleaning can be a tiresome, difficult and even boring business. Transforming data from its raw form into something that can be analysed is one of the most time-consuming tasks data scientists will ever undertake.
Even well-known data cleaning tools like Alteryx, Data Ladder and WinPure, despite their best efforts, struggle to improve this situation. What's worse is that these tools, great as they are, tend to offer much more than the everyday user actually needs.
Data Cleaning Tools and FOMO
As is the case with most enterprise solutions, data cleaning tools tend to come with a lot of features bundled in, but most of us will probably end up using fewer than 10% of them. It is almost as if these features are only included to cover all bases… and that is a bad way to do business. Sure, a few people reading this post might have use for ALL those features but, in my experience, this is rarely the case.
This FOMO-like behaviour by software developers, of course, leads to high software procurement costs and budget-breaking after-sales training sessions that most people would happily do without.
Besides, most users have simple data cleaning needs, such as removing duplicates from email lists, so being forced to buy these tools (with all their unnecessary extra features) is simply overkill.
Choose the Right Tool for the Job
Flookup is a lightweight data cleaning add-on for Google Sheets. It processes data using a home-brewed fuzzy matching algorithm (Peregrine) that is built on years of experience cleaning some of the worst data you can find anywhere. That experience produced a robust algorithm that is arguably the fastest one online when stacked up against popular approaches like Levenshtein distance, cosine similarity, etc.
… and Flookup achieves all of this without requiring any coding (I mean, data cleaning is hard enough without having to df this or df that!)
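For the curious, here is roughly what the coded route looks like. This is a minimal Python sketch of Levenshtein distance, one of the popular approaches mentioned above, turned into a 0-to-1 similarity score. It is not Peregrine, whose internals are not described in this post, and the function names `levenshtein` and `similarity` are made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits (insert, delete, substitute) needed to turn a into b."""
    # Keep the shorter string in b so each row of the table stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from the first i characters of a to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the edit distance to a score between 0.0 (nothing alike) and 1.0 (identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1 - levenshtein(a, b) / longest
```

Even this small example has to be written, tested and wired up to your data before it cleans anything, which is exactly the overhead a no-code add-on spares you.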
Built for Humans
Flookup enjoys the advantage of running inside Google Sheets, an application that also has numerous built-in functions and solutions that can make data cleaning easier and more user-friendly.
I personally used Flookup in a big name-matching project for 3 years, and it not only increased our efficiency but also reduced overall project costs. Tasks that used to take days were being completed in mere minutes, with an average error rate below 1%.
So, if you are feeling adventurous and want to try a lean yet powerful data cleaning solution, give Flookup a shot.
No, it will not bake you a pizza, but it will certainly make it easier to handle everyday tasks like:
- Removing or highlighting fuzzy duplicates.
- Merging databases.
- Comparing and correcting fuzzy text entries.
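To make "fuzzy duplicates" concrete, here is a small Python sketch that treats two entries as duplicates when they are nearly, but not exactly, the same. It uses `difflib.SequenceMatcher` from the standard library as a stand-in; this is not how Flookup works internally, and the function name `remove_fuzzy_duplicates`, the 0.85 threshold and the sample emails are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def remove_fuzzy_duplicates(entries, threshold=0.85):
    """Keep the first entry of each group of near-identical strings.

    Two entries count as duplicates when their similarity ratio, after
    trimming whitespace and lowercasing, is at or above the threshold.
    """
    kept = []
    for entry in entries:
        normalised = entry.strip().lower()
        is_duplicate = any(
            SequenceMatcher(None, normalised, other.strip().lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(entry)
    return kept


# Hypothetical email list with a case/whitespace variant and a typo.
emails = [
    "jane.doe@example.com",
    "Jane.Doe@example.com ",
    "jane.do@example.com",
    "bob@example.com",
]
cleaned = remove_fuzzy_duplicates(emails)
```

The threshold is the fiddly part: set it too low and distinct people get merged, set it too high and obvious typos slip through.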
Thank you for reading this far. I hope I have written enough to spark your curiosity.
I hope to see you among our 70,000+ users soon.