I’ve been vicariously living on the outskirts of data visualization, data analysis, and data science since I graduated from college in May. I decided to take the plunge in creating public visualizations about two weeks ago. But first — storytime!
I have a long history with data. My first experience was in grade school as a human web-scraper, grabbing information off webpages to put in Excel spreadsheets for my dad.
My next notable experiences were in middle school. I got excited about aviation (planes still hold a special place in my heart) and started regularly flying a flight simulator. I wanted a way to track my flight time and various breakouts (daytime, nighttime, instrument, etc.), so my dad helped me design an Excel spreadsheet to do this. This was my introduction to formulas and locking reference cells. Around that time, I wanted a career in air traffic control. So while hunting around at different universities to see their offerings, we developed yet another spreadsheet to analyze and compare them. This work also extended to myriad financial models for personal finance projects.
In college, I was introduced to MATLAB and Mathematica. They became my preferred calculators for homework, projects, and exams that would allow their use. During one of my introductory physics labs, I wrote a MATLAB program to simplify the calculations for a projectile motion problem. I made extensive use of Microsoft Excel for calculations and plotting in my intermediate physics lab (example) and advanced physics lab. However, I returned to MATLAB again to plot and analyze information from a CSV file that was too large to show properly in Excel (figures 19–23). I also used Mathematica to graph some equations (figures 1–5) that were otherwise so complicated as to defy an intuitive understanding.
After I graduated, I looked back at my four years of college and discovered that I enjoyed analyzing and graphing information. As a result, I decided to improve my abilities in both arenas. After reading books (such as Storytelling with Data), listening to podcasts like the Super Data Science Podcast and the How to Get an Analytics Job Podcast, attending the second 2020 Super Data Science Go Virtual Conference, watching and practicing through quite a few LinkedIn Learning courses, and reading plenty of articles (mostly on Medium), I finally felt ready to start creating my own work. You can see a sneak peek of my beginnings here.
First, you need to sign up to create a Tableau Public profile (it’s free).
Once you have created your profile, you’ll receive an email to verify and activate your account. Complete that and set up the rest of your profile. I added a profile photo, links, and a neat little bio.
After your account is activated, if you don’t already have Tableau Desktop, make sure to download Tableau Public (it’s free). It has limitations in data access compared to Tableau Desktop. Also, everything is public, so don’t use Tableau Public for anything confidential.
When your visualization is ready to publish, go to Server -> Tableau Public -> Save to Tableau Public. The page, story, or dashboard you are on when you hit save will be the default view. Once the information has been processed, a browser window will open with the editing field where you’ll be able to add tags and a brief description.
Tableau offers its own free training. You can also learn through books, college courses, boot camps, YouTube videos, or online courses. I learned 98% of what I needed from here. Professor Google was fantastic for everything else, with plenty of YouTube tutorials and articles to explain what I needed help with.
I chose two different data sets to start with. First, as a data nerd, I’ve been tracking my productivity at work. With a little cleanup from my tracking Google Spreadsheet, I created a dashboard of my most prominent activities, along with a text card to explain it all. I added an information icon at the bottom to add more descriptive information without overwhelming the visual. I found the tutorial to create that icon on YouTube.
For the second visualization, I was inspired by a document my university sent out. By law, they’re required to report campus crime statistics. Unfortunately, they chose a boring table to do so (actually a series of tables so they didn’t bleed across PDF pages).
I thought it could be striking to create a graphical representation. It shows some clear patterns, which could indicate larger underlying matters that need to be addressed. Those, unfortunately, aren’t clear in the above table.
Instead of creating one dashboard, I thought this would be best displayed as a story with multiple pages. On the first page, I made a cross between a heatmap and three separate treemaps. These clearly show that some of the tracked offenses happen far more frequently than the others. I plan to add data labels to only the most prominent blocks in a future cleanup, but this is good for now.
As a reminder about treemaps, you can think of them as a squarified version of a pie chart. Since this page was created with the aggregate measures, I added a custom tooltip to show the breakout by location on each square.
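To make the pie chart comparison concrete, here is a minimal sketch of the idea behind a treemap: each block's area is proportional to its share of the total, just like a pie slice. The function name `area_shares` and the category names and counts are made up for illustration; they are not taken from my workbook, and Tableau handles this calculation for you.

```python
def area_shares(counts):
    """Return each category's fraction of the total, which sets its block area."""
    total = sum(counts.values())
    if total == 0:
        # Nothing to draw: every category gets a zero share.
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


# Placeholder categories and counts, invented for illustration only.
shares = area_shares({"Category A": 6, "Category B": 3, "Category C": 1})

for name, share in shares.items():
    print(f"{name}: {share:.0%} of the treemap area")
```

A category with twice the count gets a block with twice the area, which is why the most frequent offenses dominate the view.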
I also created a heatmap table as an alternate means of displaying the same data as the treemaps.
For my first visualization, I exported a scattered Google Spreadsheet as a CSV file for import. There wasn’t much cleaning involved. All the notes attached to each cell were stripped along the way, which is fine for now, although I may need them later.
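If you would rather script the light cleanup on an exported CSV, a minimal sketch using only Python's standard library might look like the following. The column names (`Date`, `Activity`, `Minutes`) and the sample rows are hypothetical placeholders, not my actual tracking sheet, and `clean_rows` is a helper name I made up for this example.

```python
import csv
import io

# Hypothetical export: column names and values are placeholders, not the real sheet.
SAMPLE_EXPORT = """Date,Activity,Minutes
2020-11-02, Email ,45
,,
2020-11-02,Meetings,90
"""


def clean_rows(csv_text):
    """Trim stray whitespace from every cell and drop rows that are entirely blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        trimmed = {key: (value or "").strip() for key, value in row.items()}
        if any(trimmed.values()):
            cleaned.append(trimmed)
    return cleaned


rows = clean_rows(SAMPLE_EXPORT)
print(f"Kept {len(rows)} rows")
```

For a real file, you could pass in the text read from the exported CSV instead of the sample string.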
For the second visualization, the data was presented in a PDF table. I tried copying it into MS Word to create a table before placing it into Excel. However, that resulted in some odd formatting problems, so I wound up copying the data by hand. If you discover any errors (you can download the workbook and look at the included data), please let me know. In the future, as I continue my coding journey, I’ll likely use a package to scrape the data and prepare it in a friendlier format. That will be iteration two on this process!
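As a rough idea of what that "more friendly format" could look like, here is a minimal sketch that reshapes a wide table (one column per location) into a long one (one row per offense and location), a layout that tends to be easier to work with in tools like Tableau. Everything in it is an assumption for illustration: the offense names, location names, and counts are invented placeholders rather than my university's figures, `wide_to_long` is a made-up helper, and it assumes the table has already been typed or scraped into CSV.

```python
import csv
import io

# Placeholder table: offenses, locations, and counts are invented for illustration only.
WIDE_TABLE = """Offense,Location A,Location B
Offense X,3,1
Offense Y,0,2
"""


def wide_to_long(csv_text, id_column="Offense"):
    """Turn one row per offense into one row per (offense, location) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    long_rows = []
    for row in reader:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on each output row instead
            long_rows.append(
                {id_column: row[id_column], "Location": column, "Count": int(value)}
            )
    return long_rows


long_rows = wide_to_long(WIDE_TABLE)
for entry in long_rows:
    print(entry)
```

With the data in this shape, location becomes a single field you can filter or color by, instead of a set of separate columns.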
After walking you through how I got into data analysis and data visualization, I showed you how to create a Tableau Public profile and get started. I also walked you through some of the choices I made in creating my first two Tableau Public visualizations. If you have any questions, suggestions, or comments, please leave them below.
I look forward to hearing from you!