As I mentioned in yesterday’s post, one of my personal tech topics that I want to explore in 2017 is data science.  For as long as I can remember, I have loved data.  As a hobbyist in my teens, I was playing with Access and reporting on data.  I eventually migrated to Visual Basic talking to Access… which led to me taking an internship right out of high school where I was QAing data sheets and working with a contractor on an app that was migrating an Access database to a VB front end and SQL Server back end.  That contractor saw my curiosity and excitement around data, and he introduced me to the Oracle database administrator.  Fast forward into my career – lots of fun writing data reports in Crystal Reports and SQL Server Reporting Services and wearing the database administrator hat over many versions of SQL Server!  Moving right along, I ended up writing and supporting web applications that talk to SQL Server back ends.  Nowadays, I’m working at The Software Guild, writing database curriculum for both C# and Java cohorts and encouraging our apprentices to explore databases – amongst other topics.  I get to play with SQL Server and MySQL.

However, as much as I get to play with these tools and data, I’ve been more curious about the topic that is getting a lot of talk – data science.  One of my friends asked what we wanted to learn more about in 2017, and when I mentioned data science, another friend asked if I had met Matthew Renze yet.  While I hadn’t crossed paths with him at that point, I was curious.  He linked me to his courses, which gave me an idea of what to expect with the pre-compiler.  Most of all, I was looking forward to a day of data science at CodeMash, hoping to see what all the talk was about.

Pre-compiler – Practical Data Science with R

With a name like “practical data science”, I went into the pre-compiler expecting to learn how to work with R and put it into practice.  The name of the pre-compiler workshop set the expectations for me quite clearly.  Between the abstract and the pre-reqs, everything was spelled out enough for me to have reasonable expectations going into it.

R and RStudio

In this Practical Data Science with R workshop, we learned about the R language and used RStudio to run through labs on various topics in data science.  I really enjoyed Matthew’s storytelling, weaving a story around a fictitious guy’s ridiculous idea for a space western musical movie.  We played with a movies dataset for many of our labs, looking at the data and seeing why this guy’s musical idea was a bit ridiculous and unwise. For some other labs, we also played with iris data.

Looking at the R language, it made sense to me.  Everything being treated as a vector… I had seen that in other languages before, so it didn’t seem foreign.  The arrows of assignment reminded me of lambda syntax in Java and C#… oh arrows and lambdas and assignments… again, it seemed familiar enough.  The indexing with the ranges reminded me of my adventures with Ruby Koans of CodeMashes past.  Even now, as I recap this, I am realizing that some of the familiarity is due to my past background – surviving engineering and math statistics courses using MATLAB and Maple.  In fact, during the workshop, I mentioned to my friend Victor that I wish I had had this mentality back then, as my advanced math classes might have been more tolerable.  Playing with R reminded me of how much I love analyzing data and building out visualizations.
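
To give a flavor of those three ideas (vectors everywhere, arrow assignment, and indexing with ranges), here is a minimal sketch in R.  The ratings are made-up numbers for illustration only; they are not taken from the workshop’s movies dataset, and the variable names are my own.

```r
# Assignment uses the arrow operator.
# These ratings are hypothetical values, not workshop data.
ratings <- c(7.5, 8.1, 6.4, 9.0, 5.2)

# Even a single number is a vector of length one.
single <- 42
single_length <- length(single)

# Arithmetic applies element by element across the whole vector.
scaled <- ratings * 10

# Indexing is 1-based, and a range such as 2:4 selects a slice.
middle <- ratings[2:4]

# A logical condition can also be used as an index.
high <- ratings[ratings > 7]

print(single_length)
print(scaled)
print(middle)
print(high)
```

Note that the range 2:4 includes both endpoints, so `middle` holds the second through fourth ratings.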

R in Visual Studio

In the workshop, Matthew Renze mentioned that you could also run these labs in Visual Studio.  Of course, I couldn’t resist – running a language that is new to me in a tool I am quite familiar with!  I installed R Tools for Visual Studio and ran through the labs from today in Visual Studio.  I really like that the Ctrl-Enter shortcut to execute code in RStudio carried over into Visual Studio.  The visualizations were neat to see when I ran them in Visual Studio.

Inspiration to Play More

After sitting through the data science workshop today, I realized a lot about myself and my love of data.  I realize that my love of data really hasn’t changed in the past couple of decades – I really do enjoy seeing what all is in a database, how the data relates, the various trends, cleaning it up, understanding why there are certain trends and what the outliers may indicate.  While I had a quick flashback to a younger me who was not happy in the college classes that introduced these concepts, I realized that I still like the visualizations and calculations, and with the right teachers, things aren’t as bad as they once seemed.  Playing with data makes me excited, and today’s workshop reaffirmed that.

This really confirmed it – 2017 will be my year to have fun with data science.