We often think of data science and machine learning as skills essential to a niche group of researchers, data scientists, and developers. But the world as we know it today revolves around data and algorithms, just as it revolved around programming a decade ago. As data science and algorithms get integrated into all aspects of business across industries, data science, like Microsoft Excel, will become ubiquitous and will serve as a handy tool that makes you better at your job, no matter what your job is. Knowing data science is key to having a bright career in this algoconomy (algorithm-driven economy).
If you are big on New Year's resolutions, make yourself a promise to carve out your place in the algorithm-powered world by becoming data science savvy.
Follow these three resolutions to set yourself up for a bright data-driven career.
In this three-part series, we expand on how data professionals could go about achieving these three resolutions. But the principles behind the ideas are easily transferable to anyone in any job. Think of them as algorithms that can help you achieve your desired professional outcome! You simply need to engineer the features and fine-tune the hyperparameters specific to your industry and job role.
1st Resolution: Learn the building blocks of data science
If you are interested in starting a career in data science or in one that involves data, here is a simple learning roadmap for you to develop your technical skills.
- Start off with learning a data-friendly programming language, one that you find easy and interesting.
- Next, brush up on your statistics skills. Nothing fancy, just your high school math and stats will do nicely.
- Next, learn about algorithms: what they do, what questions they answer, how many types there are, and how to write one.
- Finally, put all that learning into practice by building models on top of the machine learning framework of your choice.
Now let’s see how you can accomplish each of these tasks.
1. Learn Python or any other popular data-friendly programming language you find interesting (Learning period: 1 week – 2 months)
If you see yourself as a data scientist in the near future, knowing a programming language is one of the first things to check off your list. We suggest you learn a data-friendly programming language like Python or R. Python is a popular choice because of its strong, fast, and easy computational capabilities for the Data Science workflow. Moreover, because of a large and active community, the likelihood of finding someone in your team or your organization who knows Python is quite high, which is an added advantage.
“Python has become the most popular programming language for data science because it allows us to forget about the tedious parts of programming and offers us an environment where we can quickly jot down our ideas and put concepts directly into action.” – Sebastian Raschka
We suggest learning the basics from the book Learn Python in 7 Days by Mohit and Bhaskar N. Das. Then you can move on to learning Python specifically for data science with Python Data Science Essentials by Alberto Boschetti.
Additionally, you can learn R, which is a highly useful language when it comes to statistics and data. For learning R, we recommend R Data Science Essentials by Raja B. Koushik. You can learn more about how Python and R stack up against each other in the data science domain here.
Although R and Python are the most popular choices for new developers and aspiring data scientists, you can also use Java for data science, if that is your cup of tea. Scala is another alternative.
2. Brush up on Statistics (Learning period: 1 week – 3 weeks)
While you are training your programming muscle, we recommend that you brush up on basic mathematics (probability and statistics). Remember, you already know enough from your high school days to get started with data science. You just need to refresh your memory with a little practice. A good place to start is with concepts like mean, mode, variance, standard deviation, probability, and kurtosis, among others. Your high school books should be enough to get started. However, you will need a deeper understanding to leverage the full power of data science. We recommend the book Statistics for Data Science by James D. Miller for this.
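To make these concepts concrete, here is a minimal sketch that computes several of them using only Python's standard library. The `describe` helper and the `scores` data are made up for illustration, and the kurtosis shown is the plain population form (fourth central moment divided by squared variance), which is one of several conventions in use.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration; uses population formulas.
    """
    n = len(values)
    mean = statistics.mean(values)
    variance = statistics.pvariance(values)  # population variance
    std_dev = statistics.pstdev(values)      # population standard deviation
    # Kurtosis (non-excess): fourth central moment / variance squared
    fourth_moment = sum((x - mean) ** 4 for x in values) / n
    kurtosis = fourth_moment / variance ** 2
    return {
        "mean": mean,
        "mode": statistics.mode(values),
        "variance": variance,
        "std_dev": std_dev,
        "kurtosis": kurtosis,
    }

# Made-up sample data
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Working through a small example like this by hand first, then checking it against the code, is a quick way to confirm you remember what each measure means.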
3. Learn what machine learning algorithms do and which ones to learn (Learning period: 1 month – 3 months)
Machine Learning is a powerful tool to make predictions based on huge amounts of data. According to a recent study, in the next ten years, ML algorithms are expected to replace a quarter of the jobs across the world, in fields like transport, manufacturing, architecture, healthcare and many others. So the next step in your data science journey is learning about machine learning algorithms.
There are new algorithms popping up almost every day. We’ve collated a list of top ten algorithms that you should learn to effectively design reliable and robust ML systems.
But fear not, you don’t need to know all of them to get started. Start with some basic algorithms that are widely used in real-world applications, like linear regression, naive Bayes, and decision trees.
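As an illustration of the simplest of these, here is a minimal sketch of linear regression with a single input, fitted by ordinary least squares in plain Python. The `fit_line` function and the hours/marks data are hypothetical and chosen so the arithmetic is easy to follow; real projects would normally use a library instead.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    Hypothetical helper for illustration; assumes xs has at least
    two distinct values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: hours studied vs. marks, lying exactly on y = 2x + 50
hours = [1, 2, 3, 4, 5]
marks = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, marks)
predicted = slope * 6 + intercept  # prediction for 6 hours of study
print(slope, intercept, predicted)
```

The question this algorithm answers is "given a numeric input, what numeric output should I expect?", which is a useful way to think about any algorithm you pick up next.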
4. Learn TensorFlow, Keras, or any other popular machine learning framework (Learning period: 1 month – 3 months)
After you have familiarized yourself with some of the machine learning algorithms, it is time to put that learning into practice by building models based on those algorithms. While there are many cloud-based machine learning options with click-based model building features, the best way to learn a skill is to get your hands dirty.
There is a growing range of frameworks that make it easy to build complex models while allowing for high degrees of customization. Here is a list of top 10 deep learning frameworks at your disposal to choose from. Our favorite pick is TensorFlow. It’s Python-based, backed by Google, has very good documentation, and there are tons of tutorials and videos available on the internet to guide you. You can find a comprehensive list of books for learning TensorFlow here.
We also recommend learning Keras, which is a good option if you have some knowledge of Python programming and want to get started with deep learning. Try the book Deep Learning with Keras, by Antonio Gulli and Sujit Pal, to get you started.
If you find learning from multiple sources daunting, just learn from Sebastian Raschka’s book Python Machine Learning.
Once you have got your fundamentals right, it is important to stay relevant through continuous learning and reskilling. Check out part 2, where we explore how you could go about doing this in a systematic and time-efficient manner. In part 3, we look at ways you can own your work and become aware of its outcome.