IBM’s SPSS Modeler provides a powerful, versatile workbench that allows you to build efficient and accurate predictive models in no time. What else separates IBM SPSS Modeler from other enterprise analytics tools out there today? To find out, we talked to two of arguably the most popular members of the SPSS community.
Keith is a career-long practitioner of predictive analytics and data science who has been engaged in statistical modeling, data mining, and mentoring others in this area for more than 20 years. He is also a consultant, an established author, and a speaker. Although his consulting work is not restricted to any one tool, his writing and speaking have made him particularly well known in the IBM SPSS Statistics and IBM SPSS Modeler communities.
Jesus is an independent statistical consultant and has been using SPSS products for over 20 years. With a Ph.D. in Psychometrics from Fordham University, he is a former SPSS Curriculum Team Lead and Senior Education Specialist and has developed numerous SPSS learning courses and trained thousands of users.
In this interview with Packt, Keith and Jesus give us more insights on the Modeler as a tool, the different functionalities it offers, and how to get the most out of it for all your data mining and analytics needs.
Key Interview Takeaways
- IBM SPSS Modeler is easy to get started with but can be a tricky tool to master
- Knowing your business, your dataset and what algorithms you are going to apply are some key factors to consider before building your analytics solution with SPSS Modeler
- SPSS Modeler’s scripting language is Python, and the tool has support for running R code
- IBM SPSS Modeler Essentials helps you effectively learn data mining and analytics, with a focus on working with data rather than on coding
Predictive Analytics has garnered a lot of attention of late, and adopting an analytics-based strategy has become the norm for many businesses. Why do you think this is the case?
Jesus: I think this is happening because everyone wants to make better-informed decisions. Additionally, predictive analytics brings the added benefit of discovering new relationships that you were previously not aware of.
Keith: That’s true, but it’s even more exciting when the models are deployed and are potentially driving automated decisions.
With over 40 years of combined experience in this field, you are master consultants and trainers, with unrivaled expertise when it comes to using the IBM SPSS products. Please share with us the story of your journey in this field. Our readers would also love to know what your day-to-day schedule looks like.
Jesus: When I was in college, I had no idea what I wanted to be. I took courses in many areas; however, I avoided statistics because I thought it would be a waste of time. After all, what else is there to learn other than calculating a mean and plugging it into fancy formulas? (As a kid I loved baseball, so I was very familiar with how to calculate various baseball statistics.) Anyway, I took my first statistics course (where I learned SPSS) since it was a requirement, and I loved it. Soon after, I became a teaching assistant for more advanced statistics courses, and I eventually earned my Ph.D. in Psychometrics, all the while doing statistical consulting on the side. After graduate school, my first job was as an education consultant for SPSS (where I met Keith). I worked at SPSS (and later IBM) for seven years, at first focusing on training customers on statistics and data mining, and then later on developing course materials for our trainings. In 2013, Keith invited me to join him as an IBM partner, so we both trained customers and developed a lot of new and exciting material in both book and video formats. Currently, I work as an independent statistical and data mining consultant, and my daily projects range from analyzing data for customers to training customers so they can analyze their own data to creating books and videos on statistics and data mining.
Keith: Our careers have lots of similarities. My current day to day is similar too. Lately, about a third of my year is spent lecturing and developing curriculum for organizations like TDWI (Transforming Data with Intelligence), The Modeling Agency, and UC Irvine Extension. The majority of my work is in predictive analytics consulting. I especially enjoy projects where I’m brought in early and can help with strategy and planning. Then I take over as coach and mentor to the team until they are self-sufficient. Sometimes building the team is even more exciting than the first project because I know that they will be able to do many more projects in the future.
There is a plethora of predictive analytics tools used today – for desktop and enterprises. IBM SPSS Modeler is one such tool. What advantages does SPSS Modeler have over the others, in your opinion?
Keith: One of our good friends who co-authored the IBM SPSS Modeler Cookbook made an interesting comment about this at a conference. He is unique in that he has done one-day seminars using several different software tools. As you know, it is difficult to present data mining in just one day. He said that only with Modeler is he able to spend some time on each of the CRISP-DM phases of a case study in a day. I think he feels this way because it’s among the easiest options to use. We agree. While it is powerful, and while it takes a whole career to master everything, it is easy to get started.
Are there any prerequisites for using SPSS Modeler? How steep is the learning curve in order to start using the tool effectively?
Keith: Well, the first thing I want to mention is that there are no prerequisites for our Packt video IBM SPSS Modeler Essentials. In that, we assume that you are starting from scratch. For the tool in general, there aren’t any specific prerequisites as such; however, knowing your data and what insights you are looking for always helps.
Jesus: Once you are back at the office, in order to be successful on a data mining project or efficiently utilize the tool, you’ll need to know your business, your data, and the modeling algorithm you are using.
Keith: The other question that we get all the time is how much statistics and machine learning you have to know. Our advice is to start with one or maybe two algorithms and learn them well. Try to stick to algorithms that you know. In our Packt course, we mostly focus on just Decision Trees, which are among the easiest to learn.
What do you think are the 3 key takeaways from your course – IBM SPSS Modeler Essentials?
The 3 key takeaways from this course, we feel, are:
- Start slow. Don’t pressure yourself to learn everything all at once. There are dozens of “nodes” in Modeler. We introduce the most important ones, so start there.
- Be brilliant in the basics. Get comfortable with the software environment. We recommend the best ways to organize your work.
- Don’t rush to Modeling. Remember the Cross Industry Standard Process for Data Mining (CRISP-DM), which we cover in the video. Use it to make sure that you proceed systematically and don’t skip critical steps.
IBM recently announced that SPSS Modeler would be available freely for educational usage. How can one make the most of this opportunity?
Jesus: A large portion of the work that we have done over the past few years has been to train people on how to analyze data. Professors are in a unique position to expose more students to data mining: we teach only those students whose work requires this type of training, whereas professors can reach a much larger group of people. IBM offers several programs that support professors, students, and faculty; for more information visit: https://www-01.ibm.com/software/analytics/spss/academic/
Keith: When seeking out a university class, whether it be classroom or online, ask them if they use Modeler or if they allow you to complete your homework assignments in Modeler. We recognize that R-based classes are very popular now, but you potentially won’t learn as much about data mining. Sometimes too much of the class is spent on coding, so you learn R but learn less about analytics. You want to spend most of the class time actively working with data and producing results.
With the rise of open source languages such as R and Python and their applications in predictive analytics, how do you foresee enterprise tools like SPSS Modeler competing with them?
Keith: Perhaps surprisingly, we don’t think Modeler does compete with R or Python. A lot of folks don’t know that Python is Modeler’s scripting language. Now, that is an advanced feature, and we don’t cover it in the Essentials video, but learning Python actually increases your knowledge of Modeler. And Modeler supports running R code right in a Modeler stream by using the R nodes. So Modeler power users (or future power users) should keep learning R on their to-do list. If you prefer not to use code, you can produce powerful results without learning either by just using Modeler straight out of the box. So, it really is all up to you.
If this interview has sparked your interest in learning more about IBM SPSS Modeler, make sure you check out our video course IBM SPSS Modeler Essentials right away!