Midterm
Introduction
We’ve all seen or at least heard of Marvel. Whether it’s the Avengers, Fantastic Four, or the X-Men, kids from all over the world play with and idolize these heroes. While their screen time in the MCU has been scarce, the X-Men remain one of the oldest teams in all of comics, with countless issues published since their debut in 1963, chronicling their valiant efforts to protect a world that fears and hates them. I chose to do my project on this team because I was one of those kids. What better way to apply everything that I’ve learned in this class than by connecting it with something that I love?
Sources
The data comes from https://github.com/malcolmbarrett/claremontrun/blob/master/data-raw/character_and_location_data.xlsx. All of it is drawn from Chris Claremont’s 16-year stint on the X-Men, and it tracks very specific character stats, such as explicitly stating “I love you” (and to whom) and visible tears.
With every team comes a weakest link. For this project, I decided to identify the weakest member of the X-Men by measuring effectiveness and vulnerability through these stats: the number of times each character was captured, rendered unconscious, or declared dead, and their total kills.
This data needed a lot of formatting and cleaning, which left me lost and frustrated at times. First, I removed the columns I didn’t need. I then reordered the remaining columns, putting the character names farthest left, followed by the issue number. There were 183 issues during Claremont’s 16-year stint, which made blanking down in OpenRefine very tedious. This wasn’t enough, though: when I imported my data into both Flourish and RawGraphs, nothing came out. The main issue was that too many rows had empty cells where a number should have been. To fix this, I selected all of my stat columns and replaced every blank with 0, so that no cell was left empty.
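The same cleanup could also be scripted rather than done by hand. The following is only a minimal sketch in Python, not the process I actually followed: the column names (`character`, `issue`, `captured`, `rendered_unconscious`, `declared_dead`, `total_kills`) are assumptions that may not match the spreadsheet’s real headers, and the two sample rows are made up for illustration.

```python
import csv
import io

# Hypothetical column names; the real spreadsheet's headers may differ.
KEEP = ["character", "issue", "captured", "rendered_unconscious",
        "declared_dead", "total_kills"]
STAT_COLUMNS = KEEP[2:]


def clean_rows(rows):
    """Keep only the needed columns, character first, and turn blank stats into 0."""
    cleaned = []
    for row in rows:
        out = {
            "character": row["character"].strip(),
            "issue": row["issue"].strip(),
        }
        for col in STAT_COLUMNS:
            value = (row.get(col) or "").strip()
            # A blank cell means the event did not happen in that issue.
            out[col] = int(float(value)) if value else 0
        cleaned.append(out)
    return cleaned


# Made-up sample standing in for a CSV export of the sheet.
sample = io.StringIO(
    "issue,character,captured,rendered_unconscious,declared_dead,total_kills,visible_tears\n"
    "97,Storm,1,,,,2\n"
    "97,Wolverine,,1,,3,\n"
)
cleaned = clean_rows(csv.DictReader(sample))
```

Each cleaned row has the character name first, the issue number second, and a number in every stat column, which is the shape that charting tools generally expect.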
Processes
To identify the weakest link, I used Flourish to create an interactive chart that contrasts each character’s durability with their impact in combat. I filtered the chart by issue number, and each color represents an individual stat. By mapping this out, I could make an evidence-based statement about which characters were sacrificial lambs used to progress the plot and which ones constantly dictated the outcome of the story.
Presentation
I decided to go with a stacked bar chart to show multiple variables together in a quantitative way. I found this to be the most transparent way of showing the results. Unfortunately, I wasn’t able to find a way to show total counts for each character given the number of issues the data covers. I was very satisfied with the color scheme and layout that Flourish generated, so I didn’t make any changes there. It’s simple, easy to read, and not distracting, all things taught to us by Ms. Winton. Embedding an interactive visualization is important because it allows others to explore the data and make claims of their own. Just like Marvel movies, it turns something static into an interactive story.
Significance
This approach to the data revealed that “weakness” is a narrative choice rather than a reflection of a character’s power level. After I counted up all the totals, the data showed Storm as the weakest character by these measures. That result is misleading if taken at face value, because Storm is regarded as one of the strongest members of the X-Men, with the power to control the weather and atmosphere. A more likely explanation is that she is so strong that writers had to nerf her to allow other characters to shine. This project is a good example of distant reading: I analyzed 16 years of data to uncover implicit patterns in how the writers treat their characters.
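Counting up the totals can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact method behind the result above: the column names are hypothetical, the rows use placeholder characters with made-up numbers, and treating “vulnerability” as the sum of captures, knockouts, and declared deaths is just one possible way to combine the stats.

```python
from collections import defaultdict

# Assumed grouping: these three stats count toward vulnerability.
VULNERABILITY = ("captured", "rendered_unconscious", "declared_dead")


def totals_by_character(rows):
    """Sum the per-issue stats into one vulnerability and one kills total per character."""
    totals = defaultdict(lambda: {"vulnerability": 0, "kills": 0})
    for row in rows:
        entry = totals[row["character"]]
        entry["vulnerability"] += sum(row[col] for col in VULNERABILITY)
        entry["kills"] += row["total_kills"]
    return dict(totals)


def rank_most_vulnerable(totals):
    """Order names by highest vulnerability first, breaking ties by fewer kills."""
    return sorted(
        totals,
        key=lambda name: (-totals[name]["vulnerability"], totals[name]["kills"]),
    )


# Placeholder rows with made-up numbers, one row per character per issue.
rows = [
    {"character": "Hero A", "issue": 1, "captured": 1, "rendered_unconscious": 1,
     "declared_dead": 0, "total_kills": 0},
    {"character": "Hero B", "issue": 1, "captured": 0, "rendered_unconscious": 0,
     "declared_dead": 0, "total_kills": 2},
    {"character": "Hero A", "issue": 2, "captured": 0, "rendered_unconscious": 1,
     "declared_dead": 0, "total_kills": 1},
    {"character": "Hero B", "issue": 2, "captured": 1, "rendered_unconscious": 0,
     "declared_dead": 0, "total_kills": 0},
]

totals = totals_by_character(rows)
ranking = rank_most_vulnerable(totals)
```

Keeping vulnerability and kills as separate totals, instead of merging them into a single score, makes it easier to see when a character is both frequently taken down and highly effective in combat.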