MSDS692 Data Science Practicum Author: "Frederick Pletz" YouTube Presentation: https://youtu.be/-YGfAKSknOQ
This project analyzes Magic: The Gathering (MtG) card data to identify the power curve for MtG creatures. Additionally, the creature types and keyword abilities are visualized to compare with the MtG color wheel.
FinalCreatureData.csv - CSV containing cleaned creature data for all "simple" creatures over the last three decades of MtG cards.
Fred Pletz - Final Presentation.ppt - final presentation slides used in the YouTube presentation.
LICENSE
PracticumFinal.Rmd - R Markdown code for the project.
PracticumFinal.docx - Word document generated from the R Markdown file.
MtG is a trading card game where players cast cards that represent spells, summoning creatures or causing effects such as buffing creatures or removing all creatures from play. From a game design perspective, MtG presents a unique challenge because so many card interactions are possible among the over 20,000 unique Magic cards created since the game was first produced in 1993. This project attempts to assess the power curve of Magic cards to identify whether there is a way to roughly calculate the cost of any given card, and therefore judge its relative strength in the game.
For this project, I started by downloading the data from mtgjson.com and doing exploratory analysis in Excel. Because of the project timeline, I filtered to "simple" creature cards to leave time to complete the analysis. The filtering reduced the dataset to roughly 10% of the overall cards, keeping creature cards with fixed power and toughness and no non-keyword mechanics. Cards that are illegal for tournament play and other special cards were also filtered out. After filtering the dataset, I narrowed the columns to those relevant to the specific data science application of calculating mana cost. I kept the date in the hope of having time to calculate power curve changes over time, but unfortunately did not have time to complete that portion of the project. I also added a new data column representing the non-colorless mana cost of each card. Colored mana has to be paid with mana of that specific color, so a player may not have the correct color available to cast a spell. That means the colorless portion of a mana cost is theoretically "cheaper" than the portion that requires a specific color.
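The colored-cost column described above can be illustrated with a minimal Python sketch. This is not the project's R code: the function name `split_mana_cost` is hypothetical, and the braced symbol format (for example `{2}{G}{G}`) is an assumption about how the cost strings are stored. Only plain numeric and single-color symbols are handled, which matches the "simple" scope but is still an assumption.

```python
import re

# Assumed input format: braced symbols such as "{2}{G}{G}".
SYMBOL = re.compile(r"\{([^}]+)\}")
COLORS = {"W", "U", "B", "R", "G"}

def split_mana_cost(mana_cost):
    """Return (total_cost, colored_cost) for a simple mana cost string."""
    total = 0
    colored = 0
    for symbol in SYMBOL.findall(mana_cost):
        if symbol.isdigit():
            # Numeric symbol: payable with mana of any color.
            total += int(symbol)
        elif symbol in COLORS:
            # Colored symbol: must be paid with that specific color.
            total += 1
            colored += 1
        else:
            # Hybrid, variable (X), and other special symbols are out of scope.
            raise ValueError(f"unsupported mana symbol: {symbol}")
    return total, colored

print(split_mana_cost("{2}{G}{G}"))  # (4, 2)
```

Keeping the colored portion as its own column lets a model weigh it separately from the total cost.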
For additional exploratory analysis, I used RapidMiner to run multiple algorithms against the data. This gave a quick preview of the expected end results and served as an early check that the project would produce useful results.
Finally, I loaded the data into R and used a general linear model and a random forest model to create predictive algorithms for determining the cost of a simple creature spell. The general linear model predicted the cost correctly for approximately 60% of cards once its decimal predictions were rounded to the nearest integer for comparison with the actual integer spell costs: 517 predictions matched the actual cost, 151 were higher, and 181 were lower. The heatmap visualization makes it clear that the majority of predictions were not far off from the actual cost, with a few surprising outliers.
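The rounding comparison and the resulting accuracy figure can be reproduced with a short Python sketch. Again, this is not the project's R code, and the function name `tally_predictions` is hypothetical. The last lines only redo the arithmetic on the counts reported above.

```python
from collections import Counter

def tally_predictions(predicted, actual):
    """Round each decimal prediction and compare it with the integer cost."""
    counts = Counter()
    for p, a in zip(predicted, actual):
        rounded = round(p)  # note: Python rounds exact .5 ties to the even integer
        if rounded > a:
            counts["higher"] += 1
        elif rounded < a:
            counts["lower"] += 1
        else:
            counts["accurate"] += 1
    return counts

# Counts reported for the general linear model.
reported = {"higher": 151, "lower": 181, "accurate": 517}
accuracy = reported["accurate"] / sum(reported.values())
print(f"{accuracy:.1%}")  # 60.9%
```

The three counts sum to 849 cards, so 517 correct predictions corresponds to the roughly 60% success rate quoted.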
The Random Forest model showed an even better result, with fewer outliers:
Assessing the word clouds, it became almost comically obvious that there was selection bias in the data set, as all but two mana colors had "flying" as the number one keyword ability. I think that this was primarily due to the reduction in scope to "simple" creatures, and a broader text analysis would likely result in a more accurate picture of each color's focus on the color wheel. However, there were still interesting things brought out from the word clouds:
White: a focus on humans, soldiers, knights, angels, and cats, with abilities centered on life gain, coordination, and combat advantage
Blue: a focus on sea creatures and elementals, with defensive abilities and countering as a key focus
Black: a focus on undead and evil creatures, with abilities that drain life and cause fear
Red: a focus on war-like mythical creatures and abilities that support a fast tempo
Green: a focus on wild creatures and elves, with abilities that counter other colors and overwhelm through superior strength
Uncolored: made up entirely of artifacts, with many inanimate creatures and no specific ability focus
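The per-color observations above rest on how often each keyword appears within a color. A minimal Python sketch of that counting step follows. It is not the project's R code, and both the function name `keyword_counts_by_color` and the sample rows are hypothetical illustrations rather than project data.

```python
from collections import Counter, defaultdict

def keyword_counts_by_color(cards):
    """Count keyword occurrences per color from (color, keywords) pairs."""
    counts = defaultdict(Counter)
    for color, keywords in cards:
        counts[color].update(keywords)
    return counts

# Hypothetical sample rows, for illustration only.
sample = [
    ("White", ["flying", "lifelink"]),
    ("White", ["flying"]),
    ("Green", ["trample"]),
]
counts = keyword_counts_by_color(sample)
print(counts["White"].most_common(1))  # [('flying', 2)]
```

A word cloud then scales each term by these counts, which is why a single dominant keyword such as "flying" stands out so strongly.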
This research provides an initial view of the power curves of MtG. It would benefit greatly from expanding to all MtG cards. That is difficult given the sheer number of unique abilities that are hard to quantify, so short of covering all cards, it would at least be good to see research that includes the more common non-keyword abilities.
It is possible to map MtG spells to an approximate cost, though due to the nature of the game, no calculation will ever give an exact value. This research provides insight into the rough power curve of cards and highlights over- and under-valued spells.
MTGJSON (n.d.) Magic The Gathering cards in portable formats. Retrieved from: http://www.mtgjson.com/.