🎯 Introduction

The first step in building any machine learning solution is framing the problem correctly. This involves answering several key questions:

Type of learning: Is it a supervised, unsupervised, or reinforcement learning problem?
Features: What input data will the model use?
Labels/targets: If supervised, what are the outputs we want to predict?
Evaluation: What amount of error is acceptable, and what metrics will we use to measure success?

These decisions are highly context-dependent and should consider the nature of the data, the task, and the success criteria.

<aside> 📒

Example:

Suppose you are building a system to predict customer churn for a subscription service. This is a supervised classification problem:

Features: Usage metrics, subscription plan, engagement history
Labels: Whether the customer churned (yes/no)
Metric: Accuracy, F1-score, or AUC depending on the business priority </aside>
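To see why the metric in the churn example depends on the business priority, here is a minimal sketch in plain Python. The labels, the always-"no churn" baseline, and the `model` predictions are made-up illustrative data, not results from a real service.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1 = churned)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: 10 customers, 2 of whom churned.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
baseline = [0] * 10                      # always predicts "no churn"
model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]   # finds both churners, one false alarm

baseline_acc = accuracy(y_true, baseline)   # 0.8
baseline_f1 = f1_score(y_true, baseline)    # 0.0
model_acc = accuracy(y_true, model)         # 0.9
model_f1 = f1_score(y_true, model)          # 0.8
```

On this toy data the baseline looks respectable by accuracy (0.8) even though it never identifies a churner, while F1 exposes the gap (0.0 versus 0.8). If missing a churner is costly, a metric like F1 or AUC is likely the better fit.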

🔄 Reframing

Reframing is the process of changing how we represent the output or the objective of a problem to improve learning, make it more practical, or align it better with business goals. A single task, like building a recommendation system, can be framed in multiple ways—as a classification, regression, or ranking problem—and the choice profoundly affects what the model ultimately optimizes for.

🔀 Regression ↔ Classification

Some problems naturally lend themselves to one framing but benefit from being reframed as the other.

<aside> 📒

Example: A video recommendation system

Classification framing: Predict whether a user will watch a specific video (yes/no).
- Issue: This might prioritize clickbait content that maximizes clicks but not engagement.
Regression framing: Predict the fraction of the video a user is likely to watch.
- Benefit: This encourages the system to recommend videos that users actually enjoy and watch fully. </aside>
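The two framings can be derived from the same interaction log; only the target changes. The sketch below assumes a hypothetical log of `(video_id, seconds_watched, video_length_seconds)` tuples, and the helper names are illustrative.

```python
# Hypothetical interaction log: (video_id, seconds_watched, video_length_seconds).
log = [
    ("clickbait", 12, 600),
    ("tutorial", 540, 600),
    ("short_clip", 25, 30),
    ("skipped", 0, 300),
]


def classification_label(seconds_watched):
    """Classification framing: 1 if the user started the video, else 0."""
    return 1 if seconds_watched > 0 else 0


def regression_target(seconds_watched, length):
    """Regression framing: fraction of the video watched, clipped to [0, 1]."""
    return min(seconds_watched / length, 1.0)


labels = {vid: classification_label(s) for vid, s, _ in log}
targets = {vid: regression_target(s, n) for vid, s, n in log}
```

Under the classification framing, `clickbait` and `tutorial` both receive the label 1 and are indistinguishable. Under the regression framing their targets are 0.02 and 0.9, so a model trained on this target is rewarded for recommending the video that was actually watched.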

<aside> 📒

Example: Predict rainfall

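Regression framing: Predict how much it will rain tomorrow.
- Issue: Rainfall is inherently uncertain, so a single predicted value is often misleading.
Classification framing: Predict discrete ranges of rainfall (0–1 mm, 1–5 mm, 5–10 mm).
- Benefit: The model outputs a probability for each range, which captures the uncertainty that a single point estimate hides. </aside>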
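A minimal sketch of the rainfall reframing: continuous amounts are mapped to range classes, and a distribution over those classes replaces the single number. The observations are invented, the open-ended "10+ mm" class is an added assumption so that every amount has a bucket, and counting frequencies stands in for what a trained classifier would output.

```python
from bisect import bisect_right
from collections import Counter

# Upper edges (in mm) of the ranges from the example. The "10+ mm" class is
# an assumption added here so that larger amounts still map to a bucket.
EDGES = [1.0, 5.0, 10.0]
CLASS_NAMES = ["0-1 mm", "1-5 mm", "5-10 mm", "10+ mm"]


def rainfall_class(mm):
    """Map a rainfall amount to its range index (ranges are [low, high))."""
    return bisect_right(EDGES, mm)


# Hypothetical rainfall amounts observed on days with similar conditions.
observed_mm = [0.0, 0.2, 0.0, 3.5, 7.0, 0.4, 0.0, 2.1]

# Classification view: empirical probability of each range.
counts = Counter(rainfall_class(mm) for mm in observed_mm)
probs = [counts[i] / len(observed_mm) for i in range(len(CLASS_NAMES))]

# Regression view: a single point estimate (the mean).
mean_mm = sum(observed_mm) / len(observed_mm)
```

Here the point estimate is 1.65 mm, yet the most likely outcome is the 0-1 mm range with probability 0.625. The class probabilities make that uncertainty visible.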

🎭 Single-task → Multitask

Reframing isn't just about switching between regression and classification; it can also mean fundamentally redefining the objective of the task. Multitask learning is a powerful form of reframing where, instead of a single target, you train the model on several related targets simultaneously.

<aside> 📒

Example: Movie recommendation

Suppose we have a large movie database with customer ratings from 1 to 5 for every movie a user has watched and rated. Our task is to build a machine learning model that serves movie recommendations to our users.

A single-task classification approach might take a user's watch history and ratings as input to predict the next movie to recommend. This is a simple, direct approach.
However, we could reframe this as a multitask learning problem by training the model to predict not just a specific movie, but also several key characteristics of users who are likely to watch a given movie (e.g., customer segment, income, preferred genre).

Instead of training the model to predict one specific movie at a time, we train it to understand what kind of person would enjoy a given movie (their age group, interests, preferred genres, etc.). Once the model learns these patterns, we can reuse it across many recommendation scenarios, such as trending titles, documentaries, or classics, simply by asking "what type of user would watch this?" We don't need to build and train a new model each time we want to recommend a different category of content.

This flexibility is a core reason why reframing matters. It also helps mitigate unintended biases by steering models away from optimizing for the wrong objective (like clicks over engagement) and improves model utility by aligning the output more closely with the real-world goal.

</aside>

⚙️ The Mechanics of Multitask Learning

Multitask learning allows a model to learn multiple related tasks at the same time. This is often more effective than training a separate model for each task because representations shared across tasks act as a regularizer, which helps the model generalize better and reduces overfitting.
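As a rough illustration of learning several tasks at once, the sketch below runs one shared layer whose output feeds two task-specific heads, then combines the per-task losses into a single training objective. The two tasks (a genre classifier and a rating regressor), the weights, and the loss weights `genre_weight` and `rating_weight` are all hypothetical choices for this example.

```python
import math


def shared_trunk(features, weights):
    """Shared layer: one ReLU unit per weight row, reused by every task head."""
    return [max(0.0, sum(w * x for w, x in zip(row, features))) for row in weights]


def genre_head(hidden, weights):
    """Classification head: softmax probabilities over genres."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in weights]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rating_head(hidden, weights):
    """Regression head: a single predicted rating."""
    return sum(w * h for w, h in zip(weights, hidden))


def multitask_loss(genre_probs, true_genre, rating_pred, true_rating,
                   genre_weight=1.0, rating_weight=0.5):
    """Weighted sum of cross-entropy (genre) and squared error (rating)."""
    genre_loss = -math.log(genre_probs[true_genre])
    rating_loss = (rating_pred - true_rating) ** 2
    return genre_weight * genre_loss + rating_weight * rating_loss


# Hypothetical input and hand-picked weights (no training happens here).
features = [1.0, 2.0]
hidden = shared_trunk(features, [[0.5, 0.0], [0.0, 0.5], [-1.0, 0.0]])
genre_probs = genre_head(hidden, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
rating_pred = rating_head(hidden, [2.0, 3.0, 1.0])
loss = multitask_loss(genre_probs, true_genre=1,
                      rating_pred=rating_pred, true_rating=4.5)
```

Because both heads read the same `hidden` vector, gradients from either loss term would update the shared weights during training, which is where the shared knowledge comes from. The loss weights control how much each task influences that shared representation.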

There are two main approaches to sharing parameters in a multitask learning model: