Introduction

Earning a spot on a D1 college football team and then making it to the NFL is the dream of many high school varsity football players. When I played high school football, I often compared myself to NFL players at my position and wanted to build up my own physique. While the skill a player brings to a position through talent and dedication is likely the primary factor in making an NFL roster, other factors, such as body measurements (height and weight) and college choice, may also play a pivotal role in the selection process. I therefore built an interactive dashboard to help high school varsity football players better understand these factors.

I used three datasets to create this dashboard, all of which are linked above. The first contains NFL player data from the 1940s to 2017, while the second and third contain the geometry data for the players' birth states. The original dataset included a large number of labels for player positions. To simplify classification, I drew on my football knowledge to condense these labels into the primary positions within the offensive, defensive, and special teams units. I then handled the missing values and merged the tables using Python, producing a cleaner format suited to building the visualizations in JavaScript. For details on how the data was cleaned and processed, please refer to the linked source code.
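
As a rough illustration of the label condensing and missing-value handling described above, here is a minimal sketch in plain Python. The mapping table, the field names (`position`, `height`, `weight`), and the choice to drop incomplete rows are all assumptions made for this example; the actual grouping and cleaning rules are in the linked source code and may differ.

```python
# Hypothetical mapping from raw position labels to primary positions.
# The real grouping used for the dashboard may differ.
POSITION_GROUPS = {
    "QB": "QB",
    "HB": "RB", "FB": "RB", "RB": "RB",
    "WR": "WR",
    "TE": "TE",
    "C": "OL", "G": "OL", "T": "OL", "OG": "OL", "OT": "OL",
    "DE": "DL", "DT": "DL", "NT": "DL",
    "LB": "LB", "ILB": "LB", "OLB": "LB", "MLB": "LB",
    "CB": "DB", "FS": "DB", "SS": "DB", "S": "DB",
    "K": "ST", "P": "ST", "LS": "ST",
}


def primary_position(raw_label):
    """Return the primary position for a raw label, or None if unknown.

    Compound labels such as "DE-LB" are resolved by their first component.
    """
    if not raw_label:
        return None
    first = raw_label.strip().upper().split("-")[0]
    return POSITION_GROUPS.get(first)


def clean_players(players):
    """Condense position labels and drop rows with missing values.

    Dropping incomplete rows is an assumption; imputing values would be
    another reasonable way to handle them.
    """
    cleaned = []
    for player in players:
        position = primary_position(player.get("position"))
        if position is None:
            continue
        if player.get("height") is None or player.get("weight") is None:
            continue
        cleaned.append({**player, "position": position})
    return cleaned
```

Calling `clean_players` on a list of player dictionaries returns only the complete rows, each relabeled with one of the primary positions.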

Question 1:

What are the weight and height patterns among players categorized by position?


To answer this question, I have created a scatterplot. Each of the 8 buttons below displays the data for the corresponding position.

Summary of Marks and Channels
Color Scheme: d3.schemeCategory10 (categorical)
Marks: Points representing individual players in the dataset
X position channel: Encodes the player's height (a continuous variable)
Y position channel: Encodes the player's weight (a continuous variable)
Color channel: Encodes the player's position (a categorical variable)

The scatterplot above shows a positive correlation between a player's height and weight across all positions in the NFL. Defensive Linemen (DL), however, appear to display the weakest positive correlation of any position. Quarterbacks (QB) are generally taller than Wide Receivers (WR) and Running Backs (RB), shorter than Tight Ends (TE), and similar in height to Offensive Linemen (OL). Although QBs and OLs are of comparable height, QBs typically weigh less, which aids their mobility, whereas OLs need the additional weight for blocking and protection.
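
The strength of the height and weight relationship for each position can be checked numerically with a Pearson correlation coefficient. The sketch below is one way to do that in plain Python; the field names (`position`, `height`, `weight`) are assumptions for this example, and the dashboard itself may not compute this statistic.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for constant data
    return sxy / math.sqrt(sxx * syy)


def correlation_by_position(players):
    """Group players by position and correlate height with weight."""
    groups = defaultdict(lambda: ([], []))
    for player in players:
        heights, weights = groups[player["position"]]
        heights.append(player["height"])
        weights.append(player["weight"])
    return {
        position: pearson_r(heights, weights)
        for position, (heights, weights) in groups.items()
    }
```

A value near 1 indicates a strong positive relationship, while a value near 0 indicates a weak one.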


Question 2:

Which colleges have produced a higher number of players drafted in the first round?


To answer this question, I have created a bar plot that only includes colleges that have had at least 5 players drafted in the first round of NFL drafts in their history. The bar plot below includes an interactive tooltip that shows the exact data when you hover your mouse over a bar.

Summary of Marks and Channels
Color Scheme: d3.interpolateGreens (sequential, single-hue)
Marks: Bars whose height encodes the number of players drafted from each college.
X position channel: Encodes the colleges (a categorical variable).
Y position channel: Encodes the number of players drafted (a quantitative variable).
Color channel: Encodes the number of players drafted (a quantitative variable).

The bar plot above shows that Ohio State University, the University of Southern California, the University of Miami, the University of Notre Dame, and the University of Alabama are the five universities with the most NFL players drafted in the first round from the 1940s to 2017. This is not surprising, as all five programs have a long history and a well-known team culture. When choosing a D1 college program, these schools are therefore all good options, since they have the history and connections to help players get drafted earlier.
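
The filtering behind this bar plot (keeping only colleges with at least 5 first-round picks) can be sketched as follows. The record fields `college` and `round` are hypothetical names chosen for this example, not necessarily those in the dataset.

```python
from collections import Counter


def first_round_colleges(picks, min_players=5):
    """Count first-round picks per college and keep the frequent ones.

    Returns (college, count) pairs sorted by count, largest first,
    with ties broken alphabetically.
    """
    counts = Counter(
        pick["college"]
        for pick in picks
        if pick.get("round") == 1 and pick.get("college")
    )
    frequent = [(college, n) for college, n in counts.items() if n >= min_players]
    return sorted(frequent, key=lambda pair: (-pair[1], pair[0]))
```

The sorted output maps directly onto the bars, with the count driving both bar height and color.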


Question 3:

Do NFL teams prefer certain positions in the draft?


To answer this question, I have created a heatmap. I assigned a linearly converted draft score based on the draft position, where the higher the score, the earlier a player is drafted. In the heatmap below, each box represents the average draft score of the corresponding team for drafting the corresponding position. The heatmap includes an interactive tooltip that shows the exact average draft score when you hover your mouse over a box.

Summary of Marks and Channels
Color Scheme: d3.interpolateBlues (sequential, single-hue)
Marks: Boxes whose color encodes the average draft score of the corresponding team for the corresponding position.
X position channel: Encodes the player's position (categorical variable).
Y position channel: Encodes the team (categorical variable).
Color channel: Encodes the average draft score (continuous variable).

The heatmap above is consistent with common knowledge about NFL teams. For instance, the Green Bay Packers are known for their reluctance to spend early draft picks on wide receivers to support their long-term quarterbacks. The heatmap reflects this: WR has one of the lowest average draft scores among all positions for the Packers. The heatmap also shows that both the Jacksonville Jaguars and the Cincinnati Bengals tend to select quarterbacks early in the draft. The data used in this visualization ends in 2017 and so does not include them, but this tendency is consistent with the two teams' more recent decisions to draft Trevor Lawrence and Joe Burrow, respectively, with the first overall pick.
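
The exact linear conversion used for the draft score is not spelled out here, so the sketch below assumes one simple form: score = max_pick - pick + 1, which gives the first overall pick the highest score. Both that formula and the `(team, position, pick)` tuple layout are assumptions for illustration only.

```python
from collections import defaultdict


def draft_score(pick, max_pick):
    """Linear score: pick 1 maps to max_pick, pick max_pick maps to 1."""
    if not 1 <= pick <= max_pick:
        raise ValueError("pick must be between 1 and max_pick")
    return max_pick - pick + 1


def average_scores(picks, max_pick):
    """Average draft score for every (team, position) heatmap cell.

    `picks` is an iterable of (team, position, overall_pick) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [score sum, pick count]
    for team, position, pick in picks:
        cell = totals[(team, position)]
        cell[0] += draft_score(pick, max_pick)
        cell[1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}
```

Under this assumed formula, a team that tends to draft a position late ends up with a low average score for that cell, which is what a light box in the heatmap represents.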


Question 4:

Which states have produced the most NFL players?


To answer this question, I created a geomap to display the number of NFL players born in each state. I did this because certain states may have a stronger football culture for middle and high school players, providing them with a greater chance of being recruited by D1 football programs and eventually getting drafted by an NFL team. The geomap also includes an interactive tooltip that shows the exact number of players when you hover your mouse over a state.

Summary of Marks and Channels
Color Scheme: d3.interpolateBlues (sequential, single-hue)
Marks: Geographical map of the US states.
Color channel: Encodes the number of NFL players born in each state. Darker color represents more players.
Size channel: Encodes the number of NFL players born in each state. A larger NFL logo represents more players.

The geomap above shows that Texas, California, and Florida are the birthplaces of the most NFL players. Generally, the southern and midwestern states have a stronger football culture, which correlates with the larger number of NFL players they produce. Anyone who wants to play professional football might therefore consider moving to one of those states and getting exposed to its football culture as early as possible, which could increase their chances of playing professionally in the future.
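
The per-state counts behind the geomap, and their conversion to a 0 to 1 range that a sequential color scale can consume, might be computed along these lines. The `birth_state` field name and the scaling against the largest count are assumptions for this sketch.

```python
from collections import Counter


def players_per_state(players):
    """Count players by birth state, skipping rows with no state."""
    return Counter(
        player["birth_state"] for player in players if player.get("birth_state")
    )


def normalize_counts(counts):
    """Scale counts to [0, 1] relative to the state with the most players."""
    if not counts:
        return {}
    top = max(counts.values())
    return {state: n / top for state, n in counts.items()}
```

The normalized value could drive both the color darkness and the logo size for each state.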


Question 5:

What is the typical salary range for active NFL players?


To answer this question, I created a boxplot to display the salary range for active NFL players. Since salaries can vary by position, the boxplot includes an interactive dropdown menu that allows users to select a specific position and display the corresponding salary range for active players.

Summary of Marks and Channels
Color Scheme: d3.schemeCategory10 (categorical)
Marks: Boxes that encode the 25th and 75th percentiles (first and third quartiles) and the median salary. Whiskers that encode the cutoff for outliers. Points that encode individual players in the dataset.
X position channel: Encodes the salary for NFL players.
Color channel: Encodes the position of NFL players.

The boxplot above shows that salaries do vary by position. Unsurprisingly, quarterbacks (QBs) earn the most of any position, followed by offensive and defensive linemen. It is also not surprising that tight ends (TEs) usually earn more than wide receivers (WRs), because it is harder to find someone who is big and tall yet still has the mobility and athleticism the TE position requires. However, the WR outliers earn far more than the TE outliers, because WRs are much more likely to become stars. The takeaway for youth football players is this: if you have arm talent, try your best to become a QB; if you are big, become a lineman; otherwise, try your best to become the best skill player you can be.
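
The boxplot statistics for one position can be sketched as below. The outlier cutoff is assumed here to be the common 1.5 x IQR rule, with whiskers drawn to the most extreme salaries inside that cutoff; the dashboard may use a different convention.

```python
import statistics


def boxplot_stats(salaries):
    """Quartiles, whisker ends, and outliers for a list of salaries.

    Assumes the 1.5 * IQR rule for the outlier cutoff.
    """
    if len(salaries) < 2:
        raise ValueError("need at least two salaries")
    data = sorted(salaries)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [value for value in data if low_fence <= value <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest salary within the fences
        "whisker_high": inside[-1],  # largest salary within the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

Running this once per position yields the values a dropdown selection would need in order to redraw the box, whiskers, and outlier points.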


Conclusion

That's it for this visualization project! Some key takeaways are: