UFC Racing Bar Chart

GitHub: https://github.com/cinhui/ufc-bar-chart
Technologies used: Python, D3.js, JavaScript

Motivation

The goal of this project was to produce a racing bar chart showing the top 15 UFC fighters with the most UFC fights over time.

Data

The match data was obtained by scraping the results tables from UFC event pages on Wikipedia (as described in an earlier post). The data covers UFC events from Nov 12th, 1993 through May 9th, 2020: 514 events in total (excluding cancelled events) and 5576 matches.

The match data was loaded into a Pandas data frame. The Date and Event columns give the date and the number of the event. Fighter1 and Fighter2 are the two fighters in the match. The Result column tells us whether it was a defeat, draw, or no contest. The Method column tells us how the match concluded, e.g. by knockout, submission, etc. The Round and Time columns give the duration of the match. The Weight class column gives the weight class division of the match.

<class 'pandas.core.frame.DataFrame'>
Int64Index: 5576 entries, 0 to 10
Data columns (total 11 columns):
Date            5576 non-null object
Event           5576 non-null object
Fighter1        5576 non-null object
Fighter2        5576 non-null object
Method          5576 non-null object
Note            5576 non-null object
Result          5576 non-null object
Round           5576 non-null float64
Time            5576 non-null object
Time (secs)     5576 non-null int64
Weight class    5576 non-null object

Since the UFC’s official record book reflects all UFC fights from UFC 28 to the present, we take a subset of the dataset containing only matches from Nov 17th, 2000 onward. UFC 28, which took place on Nov 17th, 2000, was the first UFC event to use the Unified Rules of MMA.

matches_subset_df = matches_df[matches_df['Date'] >='2000-11-17']

Since UFC 28, there have been 483 events and 5314 matches.

Number of matches: 5314
Number of events: 483
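
The filter and the two counts can be sketched with a few pandas calls. The example below runs on a small invented frame rather than the scraped data, and the event names in it are placeholders. It assumes the Date column holds ISO-formatted strings (YYYY-MM-DD), which is what makes a plain string comparison work; parsing the dates first, as done here, is a somewhat safer variant.

```python
import pandas as pd

# Toy stand-in for matches_df; the rows are invented for illustration only.
matches_df = pd.DataFrame({
    "Date": ["1993-11-12", "2000-11-17", "2000-11-17", "2020-05-09"],
    "Event": ["Event A", "Event B", "Event B", "Event C"],
    "Fighter1": ["A", "C", "E", "G"],
    "Fighter2": ["B", "D", "F", "H"],
})

# Parse the dates so the comparison does not depend on the string format.
matches_df["Date"] = pd.to_datetime(matches_df["Date"])

cutoff = pd.Timestamp("2000-11-17")  # date of UFC 28
matches_subset_df = matches_df[matches_df["Date"] >= cutoff]

# Each row is one match; events are counted by their distinct names.
n_matches = len(matches_subset_df)
n_events = matches_subset_df["Event"].nunique()
print("Number of matches:", n_matches)
print("Number of events:", n_events)
```

Counting events with nunique() relies on every event having a distinct name, which is an assumption about the scraped data.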

The next step was to calculate the number of wins, losses, draws, and no contests, along with the total number of matches, for each fighter. This was determined from the Result and Method columns. The logic is outlined below.

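import pandas as pd

records = []  # one entry per fighter per match

for index, row in matches_subset_df.iterrows():
    if row['Result'] == 'def':
        # There was a win and a loss: Fighter1 defeated Fighter2
        outcomes = [(row['Fighter1'], 'win'), (row['Fighter2'], 'loss')]
    elif row['Result'] == 'vs' and row['Method'] == "No Contest":
        # It was a no contest for both fighters
        outcomes = [(row['Fighter1'], 'no contest'), (row['Fighter2'], 'no contest')]
    elif row['Result'] == 'vs' and row['Method'] == "Draw":
        # It was a draw for both fighters
        outcomes = [(row['Fighter1'], 'draw'), (row['Fighter2'], 'draw')]
    else:
        # Unexpected Result/Method combination: report it and stop
        print("Error", index, row['Result'], row['Method'])
        break
    # The result labels ('win', 'loss', ...) are illustrative placeholders
    for fighter, result in outcomes:
        records.append({'fighter': fighter, 'date': row['Date'], 'result': result})

records_df = pd.DataFrame(records, columns=['fighter', 'date', 'result'])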

This generates a data frame in which the rows are fighters and the columns are counts of their wins, losses, draws, no contests, and total matches. This is one of the input files used for creating the bar chart.
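
One way to get from per-match records to per-fighter counts is a cross-tabulation. The sketch below is an assumption about how that aggregation could be done, not the project's actual code. The toy records and the result labels ('win', 'loss', 'draw', 'no contest') are invented for the example.

```python
import pandas as pd

# Toy per-fighter, per-match records; the result labels are assumed.
records_df = pd.DataFrame({
    "fighter": ["A", "B", "A", "C", "B", "C"],
    "date": ["2001-01-01", "2001-01-01", "2001-06-01",
             "2001-06-01", "2002-01-01", "2002-01-01"],
    "result": ["win", "loss", "draw", "draw", "no contest", "no contest"],
})

# Count each result type per fighter; combinations that never occur become 0.
summary_df = pd.crosstab(records_df["fighter"], records_df["result"])

# Make sure all four result columns exist, in a fixed order.
summary_df = summary_df.reindex(
    columns=["win", "loss", "draw", "no contest"], fill_value=0
)

# Total matches is the sum of the four result counts.
summary_df["total"] = summary_df.sum(axis=1)
print(summary_df)
```

The reindex step matters for small subsets: without it, a result type that never occurs would simply be missing from the columns.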

The next step was to extract the list of fighters and assign each one a color for the bar chart based on their most recent weight class. There were a total of 1890 fighters (men and women). This data frame was then written to a JSON file to be used for the bar chart.

import re
records_df = records_df.sort_values(by=['fighter','date'])
# Keep each fighter's most recent record
output_fighter_df = records_df.drop_duplicates('fighter', keep='last')[['fighter','date','weight class']].reset_index(drop=True)
# Strip any parenthetical suffix from the weight class before looking up the color
output_fighter_df['bar_color'] = output_fighter_df['weight class'].apply(lambda x: get_bar_color(re.sub(r' \([^)]*\)', '', x)))
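
The color lookup and the JSON export are not shown in the post, so here is a minimal sketch of what they might look like. The function name bar_color_for, the palette, and the default color are all hypothetical, and the toy weight classes are invented. The real get_bar_color helper may work differently.

```python
import json
import re
import pandas as pd

# Hypothetical palette; the project's real colors are not shown in the post.
WEIGHT_CLASS_COLORS = {
    "Lightweight": "#1f77b4",
    "Welterweight": "#ff7f0e",
}
DEFAULT_COLOR = "#999999"

def bar_color_for(weight_class):
    """Return a color for a weight class, ignoring any parenthetical suffix."""
    base = re.sub(r"\s*\([^)]*\)", "", weight_class).strip()
    return WEIGHT_CLASS_COLORS.get(base, DEFAULT_COLOR)

# Toy fighter list standing in for output_fighter_df.
fighters_df = pd.DataFrame({
    "fighter": ["A", "B", "C"],
    "weight class": ["Lightweight", "Welterweight (170 lb)", "Catchweight (160 lb)"],
})
fighters_df["bar_color"] = fighters_df["weight class"].apply(bar_color_for)

# orient="records" yields a JSON array with one object per fighter.
fighters_json = fighters_df.to_json(orient="records")
parsed = json.loads(fighters_json)
print(parsed)
```

Passing a file path as the first argument of to_json writes the same content to disk instead of returning a string. An array of objects is convenient on the D3 side, but the layout the chart actually expects is not stated in the post.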

Racing Bar Chart with D3.js

The bar chart was created using D3.js and JavaScript. The layout of the racing bar chart is depicted in the diagram below. The x-axis shows the number of matches. The y-axis shows the top 15 fighters sorted by number of matches in decreasing order. The bars represent fighters as they climb towards the top position. The pseudocode is presented below.

// Load data files
// sequence.csv contains a numbered list of the dates and event names for each UFC event in chronological order.
// datafile is a csv file where the rows are the fighters and the columns are the total number of fights at each point in the sequence.
// fighterfile is a json file that contains the color coding for each fighter.

Promise.all([
     d3.csv("sequence.csv"),
     d3.csv(datafile),
     d3.json(fighterfile),
... 

const sequenceStart = 0;
const sequenceEnd = sequenceArray.length;

// Build initial svg layout
...
...

// Use scaleLinear() to transform (or map) data values into position variables on the svg canvas
let x = d3.scaleLinear()
.domain([0, d3.max(sequenceValue, d => d.value)])
.range([margin.left, width-300-margin.right-160]);

let y = d3.scaleLinear()
.domain([max_value, 0])
.range([height-margin.bottom, 1.5*margin.top]);

// Start animation
let ticker = d3.interval(e => { 

     // Call the function to obtain the top 15 fighters with the largest total number of matches at the current sequence (time slice)
     sequenceValue = computeDataSlice();
     // The fighters are sorted in decreasing order and assigned a rank
     sequenceValue.forEach((d,i) => d.rank = i);

     // Update labels and bars
     // Width of the bars is a function of the number of total matches
     ...

     // Increment sequence until the end
     sequence++; 
     if (sequence > sequenceEnd)
          ticker.stop();
}, delayDuration);
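
The post does not show how datafile is built or what computeDataSlice does internally. Since the data preparation was done in Python, the sketch below illustrates the idea in pandas: a running total of matches per fighter across the event sequence, then a top-N slice with ranks at one point in that sequence. The column name sequence, the function compute_data_slice, and the toy records are assumptions made for the example.

```python
import pandas as pd

# Toy records: one row per fighter per match, tagged with the event's
# position in the chronological sequence (invented data).
records_df = pd.DataFrame({
    "fighter": ["A", "B", "A", "C", "A", "B"],
    "sequence": [0, 0, 1, 1, 2, 2],
})

# Matches per fighter per event, then a running total across events.
per_event = pd.crosstab(records_df["fighter"], records_df["sequence"])
cumulative = per_event.cumsum(axis=1)

def compute_data_slice(cumulative, sequence, top_n=15):
    """Return the top_n fighters by cumulative matches at one sequence."""
    column = cumulative[sequence]
    # Drop fighters who have not fought yet, then sort in decreasing order.
    top = (
        column[column > 0]
        .sort_values(ascending=False, kind="mergesort")
        .head(top_n)
    )
    # Rank 0 is the fighter with the most matches.
    return [
        {"name": name, "value": int(value), "rank": rank}
        for rank, (name, value) in enumerate(top.items())
    ]

slice_at_end = compute_data_slice(cumulative, sequence=2, top_n=2)
print(slice_at_end)
```

The cumulative frame has the same shape as the datafile described in the comments above (fighters as rows, one column per point in the sequence), so writing it out with to_csv would be one plausible way to produce that input.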

The bars enter from the bottom left corner of the svg canvas. The x-value as it appears on the screen is the same for each bar. The width is calculated from the total number of matches (scaled to fit the canvas using the scaleLinear function). The y-value is computed from the rank that was assigned after computing the data slice. When a bar transitions, it updates its y-value and width. When a bar exits, it moves off the screen along the y-axis and is removed.

bars
     .enter()
     .append('rect')
     .attr('class', d => `bar ${d.name.replace(/\s/g,'_')}`)
     .attr('x', x(0)+1)
     .attr('width', d => x(d.value)-x(0))
     .attr('y', d => y(max_value+1)+5)
     .attr('height', y(1)-y(0)-barPadding)
     .style('fill', d => d.color)
     .transition()
     .duration(tickDuration)
     .ease(d3.easeLinear)
     .attr('y', d => y(d.rank)+5);

bars
     .transition()
     .duration(tickDuration)
     .ease(d3.easeLinear)
     .attr('width', d => Math.max(0, x(d.value)-x(0)))
     .attr('y', d => y(d.rank)+5);

bars
    .exit()
    .transition()
    .duration(tickDuration)
    .ease(d3.easeLinear)
    .attr('width', d => Math.max(0, x(d.value)-x(0)))
    .attr('y', d => y(max_value))
    .remove();
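
To make the scale arithmetic concrete, here is a small Python sketch of a linear mapping of the kind scaleLinear performs by default (a straight-line interpolation from a domain onto a range, without clamping). The helper make_linear_scale and all the canvas numbers are invented for illustration and do not come from the project.

```python
def make_linear_scale(domain, output_range):
    """Return a function mapping values in `domain` linearly onto `output_range`."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        # Straight-line interpolation between the two range endpoints.
        return r0 + (value - d0) * (r1 - r0) / (d1 - d0)

    return scale

# Invented canvas numbers, chosen only to make the arithmetic easy to follow.
x = make_linear_scale(domain=(0, 40), output_range=(100, 500))
y = make_linear_scale(domain=(15, 0), output_range=(650, 50))

bar_width = x(20) - x(0)   # width of a bar for a fighter with 20 matches
bar_height = y(1) - y(0)   # vertical distance between adjacent ranks
top_bar_y = y(0)           # rank 0 sits at the top of the chart
print(bar_width, bar_height, top_bar_y)
```

Because the y domain runs from the largest rank down to 0, a lower rank number maps to a smaller pixel value, which is why the leading fighter appears at the top.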

Here is the racing bar chart for the top 15 UFC fighters with the greatest total number of matches over time since UFC 28 on Nov 17th, 2000.

Screenshot of Racing Bar Chart.

Full source code can be found in the project GitHub repository.

Related Links

Here are two videos that use these data-driven charts.
