Project Analysis

FIFA 19 Player Value Prediction & Recruitment Strategy

About Project

We analyze player data from the game FIFA 19. The game includes a large number of players from different countries who play in different competitions, and their in-game abilities are meant to reflect their real-world skills. The game's creators capture this through numeric attributes such as sprint speed, shot power, finishing, and heading. In this project, we try to classify players into playing positions based on these attributes and to predict their in-game market value. Because the game does not always show a player's market value, only the attributes, our model can help determine how much to offer for a player during transfer negotiations in the game. In the real world, a player's abilities (e.g., finishing) carry no numerical rating. We use the fact that the game expresses them numerically to find out which attributes matter for particular playing positions.

Problem Statement

Football clubs face the challenge of selecting the right players from a global pool with varying skill levels, prices, and growth potential. Relying solely on scouts and intuition can lead to overpriced or underperforming signings. Clubs like Manchester United need a data-driven approach to:

  • Predict accurate player market values
  • Identify high-potential, affordable players
  • Support recruitment with concrete metrics

Objective

  • Analyze the FIFA 19 dataset to identify patterns in player performance and valuation.
  • Build a machine learning model to predict the market value of players.
  • Use the model to assist Manchester United in selecting 5 new signings based on specific criteria: young age, high potential, and affordability.

Proposed Solution

    • Cleaned and transformed the FIFA dataset (89 columns, 18k+ players)
    • Engineered new features like Potential Gap
    • Transformed monetary columns: Value, Wage, Release Clause
    Modeling Approaches
    • Linear Regression (with and without PCA)
    • Random Forest
    • Gradient Boosting
    • XGBoost (multiple variants)
    • Support Vector Machines
    Player Filtering Criteria
    • Age less than 27
    • Overall rating greater than 80
    • Positive Potential Gap
    • Value and Release Clause under €80M
    Final Output
    • Matched top players to Manchester United’s position needs: RW, LB, RM, CM, CB
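
The filtering criteria above can be sketched as a simple predicate. This is a minimal illustration in Python, not the project's actual code. It assumes each player is a dictionary whose keys follow the dataset column names mentioned in this write-up (Age, Overall, Potential, Value, Release Clause) and that the monetary fields have already been converted to euros. The players listed are hypothetical.

```python
def potential_gap(player):
    """Potential Gap as defined in this project: Potential minus Overall."""
    return player["Potential"] - player["Overall"]


def is_candidate(player, max_fee=80_000_000):
    """Apply the shortlist criteria: young, highly rated, room to grow, affordable."""
    return (
        player["Age"] < 27
        and player["Overall"] > 80
        and potential_gap(player) > 0
        and player["Value"] < max_fee
        and player["Release Clause"] < max_fee
    )


# Hypothetical records for illustration only (not taken from the dataset).
players = [
    {"Name": "Player A", "Age": 22, "Overall": 83, "Potential": 88,
     "Value": 40_000_000, "Release Clause": 75_000_000},
    {"Name": "Player B", "Age": 29, "Overall": 86, "Potential": 87,
     "Value": 50_000_000, "Release Clause": 70_000_000},   # fails the age rule
    {"Name": "Player C", "Age": 24, "Overall": 85, "Potential": 85,
     "Value": 45_000_000, "Release Clause": 60_000_000},   # no potential gap
    {"Name": "Player D", "Age": 21, "Overall": 82, "Potential": 90,
     "Value": 60_000_000, "Release Clause": 110_000_000},  # clause too high
]

shortlist = [p["Name"] for p in players if is_candidate(p)]
```

Matching the shortlist to the positions RW, LB, RM, CM, and CB would then be a second pass over the surviving players.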

Technologies Used

    Machine Learning Models
    • Linear Regression
    • Gradient Boosting
    • XGBoost
    • Random Forest
    • Support Vector Machines
    Techniques Applied
    • Principal Component Analysis (PCA)
    • Box-Cox Transformation
    • Cross-validation
    • Feature Selection
    Dataset
    • FIFA 19 player dataset
    • 18,207 rows
    • 89 features
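
For reference, the Box-Cox transformation listed above maps a positive value y to (y^λ - 1) / λ, or to log(y) when λ = 0. The sketch below implements that formula and its inverse in plain Python. The source does not say which columns were transformed or which λ was chosen, so the value of `lam` used here is only a placeholder.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a single positive value."""
    if y <= 0:
        raise ValueError("Box-Cox is only defined for positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1) ** (1 / lam)


# Placeholder lambda; in practice it is estimated from the data.
lam = 0.25
value_eur = 2_500_000.0
transformed = box_cox(value_eur, lam)
restored = inverse_box_cox(transformed, lam)
```

Predictions made on the transformed scale need `inverse_box_cox` applied before they can be read as euro amounts.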

Challenges Faced

    • Complex formatting in monetary features (symbols like €, M, K)
    • Data sparsity in categorical fields like position and club
    • Categorical variables needed dummy (one-hot) encoding, and numeric variables needed scaling, before models could be applied
    • Many features were strongly correlated, so multicollinearity had to be managed
    • XGBoost and SVM were computationally intensive on the full dataset
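
As an illustration of the first challenge, monetary strings can be converted to numbers with a small parser. The sketch below is in Python and assumes values look like "€110.5M", "€565K", or "€0". The source confirms only that the symbols €, M, and K appear, so the exact patterns are an assumption. `parse_money` is a hypothetical helper name.

```python
import re

# Multipliers for the suffixes seen in the monetary columns.
_SUFFIX = {"K": 1_000, "M": 1_000_000}

# Optional euro sign, a decimal number, then an optional K or M suffix.
_MONEY = re.compile(r"€?\s*(\d+(?:\.\d+)?)\s*([KM]?)")


def parse_money(text):
    """Convert a string such as '€110.5M' to a float amount in euros.

    Returns None when the input is missing or does not match the pattern.
    """
    if text is None:
        return None
    match = _MONEY.fullmatch(text.strip())
    if match is None:
        return None
    number, suffix = match.groups()
    return float(number) * _SUFFIX.get(suffix, 1)
```

Returning None for unparseable input keeps bad rows visible so they can be dropped or imputed explicitly, instead of silently becoming zero.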

Methodology

    Data Cleaning
    • Removed redundant/low-variance columns
    • Eliminated irrelevant text and image-related data
    Feature Engineering
    • Converted "Value" and "Wage" from string to numeric format
    • Created Potential Gap = Potential - Overall
    Dimensionality Reduction
    • Performed PCA to reduce dimensionality from 89 features to 47 components
    • Applied correlation filtering to remove highly collinear variables
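
Correlation filtering can be sketched as follows. This Python version is a simple greedy variant written for illustration, not necessarily the procedure the project used: a column is kept only if its absolute Pearson correlation with every column already kept stays at or below a threshold. The 0.9 threshold and the sample attribute values are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sd_x * sd_y)


def drop_collinear(columns, threshold=0.9):
    """Return the names of columns to keep, scanning in insertion order."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up attribute values for four players, for illustration only.
columns = {
    "SprintSpeed": [60, 70, 80, 90],
    "Acceleration": [62, 71, 83, 91],  # nearly duplicates SprintSpeed
    "Strength": [80, 60, 85, 65],
}
kept = drop_collinear(columns)
```

Because the scan is order-dependent, the first column of each correlated group survives. Other strategies choose which member of a group to drop by different rules.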
    Modeling
    • Used caret package for model training with 10-fold cross-validation
    • Compared models using RMSE as the performance metric
    Final Deployment
    • Shortlisted 5 ideal signings based on predicted value, availability, and performance fit
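
The model comparison described under Modeling relies on two ideas: RMSE as the error metric and k-fold cross-validation to estimate it. The project used R's caret package for this. The Python sketch below only illustrates the mechanics under simplifying assumptions: folds are contiguous rather than randomly shuffled, and `fit_mean` is a hypothetical baseline that stands in for a real model.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def k_fold_indices(n, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cross_val_rmse(fit, xs, ys, k=10):
    """Average RMSE over k folds; fit(train_x, train_y) must return a predict(x) function."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        predictions = [predict(xs[i]) for i in test]
        scores.append(rmse([ys[i] for i in test], predictions))
    return sum(scores) / len(scores)


def fit_mean(train_x, train_y):
    """Hypothetical baseline: always predict the mean of the training targets."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean
```

Each candidate model would be passed to `cross_val_rmse` in place of `fit_mean`, and the one with the lowest average RMSE would be preferred.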

Result / Outcome

    Best Performing Model
    • XGBoost (Model 4) achieved the lowest RMSE: 0.7217
    Top 5 Suggested Player Signings
    • T. Werner – RW
    • F. Thauvin – RM
    • Jorginho – CM
    • D. Alaba – LB
    • N. Süle – CB
    Transfer Budget
    • Total Budget: €186M
    Business Impact
    • Enabled data-driven recruitment decisions
    • Helped avoid high-fee, high-risk player signings
