How to win in fantasy football with data science

Matteo Amabili
May 23, 2021

Introduction. In this post I will answer a very important question that (almost) every football fan has asked themselves: how to win in fantasy football. For readers who are unfamiliar with the game, I will first review some of its main rules.

What is fantasy football. In fantasy football you become the president and the coach of a football team participating in a fantasy league. The league owner gives every president the same budget to build a team and assigns a price to each player. At the beginning of the season, you build your team via a draft: you choose 3 goalkeepers, 8 defenders, 8 midfielders, and 6 strikers, for a squad of 25 players. In public leagues, a player can be part of more than one team. On each match day you act as the coach of the team: you choose your lineup, that is, 11 players arranged in a formation, for instance 3–4–3. At the end of the match day, the league owner awards each player a score based on the player's performance plus (or minus) some bonus (or malus). Examples of bonuses are goals or assists, while maluses are yellow or red cards. The score of the team is the sum of the scores of its 11 players. At the end of the season, the winner is the team that obtains the highest total team score, i.e. the sum of its scores over all match days. These rules may change in private leagues, but the idea of the game remains more or less the same. The aim of this post is to propose a data-driven solution to the selection of the best possible team during the draft.
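The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the bonus and malus values, the event names, and the helper names `player_score` and `team_score` are assumptions of mine, not the rules of any particular league.

```python
# Illustrative bonus/malus values: assumptions, not the rules of a real league.
BONUS = {"goal": 3.0, "assist": 1.0}
MALUS = {"yellow_card": 0.5, "red_card": 1.0}


def player_score(rating, events):
    """Return a match-day score: base rating plus bonuses minus maluses.

    `events` maps an event name to how many times it happened in the match.
    """
    score = rating
    for name, count in events.items():
        if name in BONUS:
            score += BONUS[name] * count
        elif name in MALUS:
            score -= MALUS[name] * count
        else:
            raise ValueError(f"unknown event: {name}")
    return score


def team_score(lineup):
    """Sum the scores of the 11 players in the lineup.

    `lineup` is a list of (rating, events) pairs, one per player.
    """
    if len(lineup) != 11:
        raise ValueError("a lineup must contain exactly 11 players")
    return sum(player_score(rating, events) for rating, events in lineup)
```

For example, with these assumed values a striker rated 7.0 who scores twice and gets a yellow card would be worth 7.0 + 2 × 3.0 − 0.5 = 12.5 points.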

How to model fantasy football. The problem of selecting the best possible players can be modeled as a binary optimization problem: given all the players, each with a price, a role, and a performance score, determine the team that maximizes the total score. The performance score of a player is simply his total score in the previous season (2019/20, in my case). Players' current prices and past scores can be found on the league owner's website. The optimization problem can be stated in terms of the following quantities.

The vector x is a binary vector in which each component refers to a player: if the component is equal to 1, that player belongs to the team. The vector P is the past-season performance of each player, so the scalar product between x and P is the total performance of the team. The vector Q represents the price of each player. The ith component of the vector gk is equal to 1 if the ith player is a goalkeeper and 0 otherwise. The vectors df, mf, and st are defined in the same way for defenders, midfielders, and strikers, respectively. Finally, N is the total number of players. The constraints are the fantasy football rules introduced above, i.e. not exceeding the total budget and satisfying the rules on the composition of the team.
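Written out with these symbols, the problem would read roughly as follows. This is a sketch assembled from the definitions above and from the squad composition given in the rules; the symbol B for the total budget is my own notation.

```latex
\begin{aligned}
\max_{x \in \{0,1\}^{N}} \quad & x \cdot P \\
\text{subject to} \quad & x \cdot Q \le B, \\
& x \cdot \mathrm{gk} = 3, \\
& x \cdot \mathrm{df} = 8, \\
& x \cdot \mathrm{mf} = 8, \\
& x \cdot \mathrm{st} = 6.
\end{aligned}
```

The single inequality is the budget rule, and the four equalities fix the number of goalkeepers, defenders, midfielders, and strikers in the squad.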

The solution. For readers familiar with optimization, the fantasy football problem is similar to the knapsack problem. A well-known Python library for optimization problems is pymoo. The optimization is solved using a genetic algorithm (a full explanation of genetic algorithms is beyond the scope of this article). The code below solves the fantasy football problem:

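import numpy as np
# These module paths match pymoo 0.4.x; later releases reorganized them.
from pymoo.model.problem import FunctionalProblem
from pymoo.algorithms.so_genetic_algorithm import GA
from pymoo.factory import get_crossover, get_mutation, get_sampling
from pymoo.optimize import minimize

# P, Q, gk, df, mf and st are NumPy arrays of length N (one entry per player)
# and Budget is the total budget; they are assumed to be loaded beforehand.

# Objective: pymoo minimizes, so the total team score is negated.
objs = [lambda x: -x.dot(P)]

# Inequality constraint (satisfied when <= 0): total price within the budget.
constr_ieq = [lambda x: x.dot(Q) - Budget]

# Equality constraints (satisfied when == 0): composition of the team.
constr_eq = [
    lambda x: x.dot(st) - 6,
    lambda x: x.dot(mf) - 8,
    lambda x: x.dot(df) - 8,
    lambda x: x.dot(gk) - 3,
]

problem = FunctionalProblem(N, objs, constr_ieq=constr_ieq, constr_eq=constr_eq)

# Genetic algorithm configured with operators for binary variables.
algorithm = GA(
    pop_size=200,
    sampling=get_sampling("bin_random"),
    crossover=get_crossover("bin_hux"),
    mutation=get_mutation("bin_bitflip"),
    eliminate_duplicates=True,
    return_least_infeasible=True,
)

# Run the optimization for 2000 generations.
res = minimize(problem, algorithm, ('n_gen', 2000), verbose=True)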

The lists objs, constr_ieq, and constr_eq define the utility function, the inequality constraint, and the equality constraints, respectively. The utility function has a minus sign because in pymoo an optimization is always defined as a minimization problem. GA is the class used to define a genetic algorithm: the operators passed at initialization (see the sampling, crossover, and mutation parameters) are those needed for binary variables. Finally, the function minimize runs the optimization.
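Once the optimization has finished, it is worth checking that the returned vector really describes a valid squad, since a genetic algorithm is a heuristic and the run above may return the least infeasible solution. The sketch below is one way to do that in plain Python. The helper name `summarize_team` and the `names` and `roles` lists are hypothetical: the post encodes roles as the indicator vectors gk, df, mf, and st, while here I assume one role label per player for readability.

```python
# Squad composition from the rules: 3 goalkeepers, 8 defenders, 8 midfielders, 6 strikers.
REQUIRED = {"gk": 3, "df": 8, "mf": 8, "st": 6}


def summarize_team(x, names, roles, prices, scores, budget):
    """Describe the squad encoded by the binary vector `x`.

    All sequences have one entry per player; `roles` holds "gk", "df", "mf" or "st".
    """
    # Indices of the players whose component in x is 1 (or True).
    picked = [i for i, flag in enumerate(x) if flag]
    counts = {role: 0 for role in REQUIRED}
    for i in picked:
        counts[roles[i]] += 1
    cost = sum(prices[i] for i in picked)
    return {
        "players": [names[i] for i in picked],
        "total_score": sum(scores[i] for i in picked),
        "cost": cost,
        "role_counts": counts,
        # Feasible only if the budget and every composition rule are respected.
        "feasible": cost <= budget and counts == REQUIRED,
    }
```

Assuming the optimization returned a solution, it would be called as `summarize_team(res.X, names, roles, Q, P, Budget)`, where `res.X` is the vector found by pymoo.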

Result and conclusions. The above solution has been applied to Serie A, the Italian football league, using data from the 2019–20 season. The next figure reports the best team. Looking at their 2020–21 season performance, some players were a very good choice (for instance Faraoni, who scored 3 goals, a lot for a defender, or Pessina, who was the surprise of the season), while others did not have a very good season (for instance Petagna, who changed team and was no longer a starter).

The presented solution does not fully solve the fantasy football game: it gives you the best possible team based on the previous year's performances. On each match day you still have to choose the best lineup, but everything is in your hands to have a fantastic fantasy football season and to become the champion.
