Process 1.3 - Problem Framing

🎯

Overview

Problem Framing is one of the most critical processes in any ML project. It bridges the gap between what the business needs and what ML can deliver. A poorly framed problem leads to wasted resources, missed expectations, and failed projects.

This process involves translating business requirements into a precise ML problem statement, selecting the appropriate modeling approach, defining inputs and outputs, and establishing clear boundaries for what the solution will and won't do.

💼 Business Problem: "We're losing customers"
→ 🔄 Translation: Analysis & Decomposition
→ 🤖 ML Problem: "Predict churn probability"

📊

ML Problem Types

Understanding the type of ML problem you're solving is fundamental. Each type requires different algorithms, data structures, and evaluation approaches:

🏷️

Classification

Predict discrete categories or labels from input features.

Examples: Spam detection, fraud identification, assigning customers to predefined segments, image recognition

📈

Regression

Predict continuous numerical values from input data.

Examples: Price prediction, demand forecasting, risk scoring, lifetime value estimation

🎯

Clustering

Group similar items together without predefined labels.

Examples: Customer segmentation, anomaly detection, document grouping, grouping shoppers by purchase behavior

⭐

Recommendation

Suggest relevant items based on user behavior or preferences.

Examples: Product recommendations, content suggestions, next-best-action, personalization

⏱️

Time Series

Analyze and predict sequential data over time.

Examples: Sales forecasting, stock prediction, resource planning, trend analysis

🔤

NLP

Process and understand natural language text.

Examples: Sentiment analysis, chatbots, text summarization, entity extraction
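As a rough sketch, the choice among these types can be written as a few coarse questions about the task. The function name `suggest_problem_type`, its parameters, and the `output_kind` values are hypothetical, and the rules are a simplification: NLP describes the input modality rather than the output, so it is left out, and real projects often combine several types.

```python
from typing import Optional


def suggest_problem_type(
    has_labels: bool,
    output_kind: Optional[str] = None,
    time_ordered: bool = False,
) -> str:
    """Suggest a problem type from a few coarse properties of the task.

    output_kind is an illustrative tag for the desired output:
    "category", "number", or "ranked_items".
    """
    if output_kind == "ranked_items":
        return "recommendation"
    if not has_labels:
        # No predefined labels: group similar items instead.
        return "clustering"
    if output_kind == "number":
        # Continuous target; sequential data points to time series methods.
        return "time series" if time_ordered else "regression"
    if output_kind == "category":
        return "classification"
    raise ValueError(f"unknown output_kind: {output_kind!r}")


print(suggest_problem_type(has_labels=True, output_kind="category"))
print(suggest_problem_type(has_labels=True, output_kind="number", time_ordered=True))
print(suggest_problem_type(has_labels=False))
```

A function like this is mainly useful as a conversation starter: writing the questions down makes it obvious when the team disagrees about the desired output.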

⚡

Key Activities

1. Business-to-ML Translation
Convert business objectives into specific ML tasks. Identify what needs to be predicted, classified, or generated, and how it connects to business value.

2. Problem Type Selection
Determine the appropriate ML problem type (classification, regression, clustering, etc.) based on the desired output and available data characteristics.

3. Input/Output Definition
Clearly specify what features (inputs) will be used and what the model should produce (outputs). Define data types, formats, and expected ranges.

4. Scope Delimitation
Establish clear boundaries: what the model will and won't do, which use cases are in scope, and what edge cases should be handled.

5. Hypothesis Formulation
Document key assumptions and hypotheses about the problem. What patterns do you expect to find? What relationships should exist in the data?

6. Constraint Identification
Identify technical, business, and ethical constraints. Consider latency requirements, explainability needs, fairness criteria, and regulatory compliance.
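The Input/Output Definition, Scope Delimitation, and Constraint Identification activities can be captured in a small, checkable record. The sketch below is one possible shape, not a standard schema; the class names, feature names, ranges, and the latency value are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FeatureSpec:
    name: str
    dtype: type          # expected Python type of the raw value
    min_value: float     # inclusive lower bound of the expected range
    max_value: float     # inclusive upper bound of the expected range


@dataclass
class ProblemStatement:
    ml_task: str
    output: str
    inputs: List[FeatureSpec]
    out_of_scope: List[str] = field(default_factory=list)
    max_latency_ms: float = 1000.0  # hypothetical technical constraint

    def violations(self, row: Dict[str, object]) -> List[str]:
        """List the ways a single input row breaks the declared input spec."""
        problems = []
        for spec in self.inputs:
            if spec.name not in row:
                problems.append(f"{spec.name}: missing")
                continue
            value = row[spec.name]
            if not isinstance(value, spec.dtype):
                problems.append(f"{spec.name}: expected {spec.dtype.__name__}")
            elif not spec.min_value <= value <= spec.max_value:
                problems.append(f"{spec.name}: out of range")
        return problems


# Illustrative values only; none of these come from a real project.
churn = ProblemStatement(
    ml_task="Predict churn probability",
    output="probability in [0, 1]",
    inputs=[
        FeatureSpec("tenure_months", int, 0, 600),
        FeatureSpec("monthly_spend", float, 0.0, 10_000.0),
    ],
    out_of_scope=["customers with less than one month of history"],
)
print(churn.violations({"tenure_months": -3}))
# ['tenure_months: out of range', 'monthly_spend: missing']
```

Keeping the spec in code means the same definition can later be reused to validate training data and live requests.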

🧩

Problem Framing Framework

Use this structured approach to systematically frame your ML problem:

📋

The 5W+H Framework for ML

What: What exactly needs to be predicted, classified, or generated?
Who: Who will use this prediction, and how?
When: When is the prediction needed? (Real-time vs. batch)
Where: Where will the model be deployed? (Cloud, edge, mobile)
Why: Why is ML the right solution? What's the alternative?
How: How will success be measured? What's the baseline?
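The six questions can also be kept as a lightweight worksheet that flags what is still unanswered. This is a minimal sketch; the `FiveWH` class and the sample answers are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class FiveWH:
    what: str = ""
    who: str = ""
    when: str = ""
    where: str = ""
    why: str = ""
    how: str = ""

    def unanswered(self) -> List[str]:
        """Return the questions that are still blank, in worksheet order."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


sheet = FiveWH(
    what="Churn probability per customer",
    who="Retention team",
    when="Weekly batch",
)
print(sheet.unanswered())  # ['where', 'why', 'how']
```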

❓

Key Questions to Answer

Problem Definition

What is the core business problem?
Can this be solved with ML?
What's the simplest version of this problem?
How would a human solve this today?

Data & Features

What data is available?
What features might be predictive?
Is labeled data available?
How much data is needed?

Output & Usage

What should the model output?
How will predictions be consumed?
What latency is acceptable?
How often will it be used?

Constraints & Risks

What are the hard constraints?
What could go wrong?
Are there fairness concerns?
What's the cost of errors?
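The question about the cost of errors can be made concrete for a binary decision. Assuming the model outputs calibrated probabilities and correct decisions cost nothing, acting on a case with probability p is worthwhile when (1 - p) * C_FP < p * C_FN, which gives the threshold C_FP / (C_FP + C_FN). The sketch below applies that rule; the function names, costs, and toy data are illustrative assumptions, not measured results.

```python
from typing import Sequence


def cost_optimal_threshold(cost_false_positive: float, cost_false_negative: float) -> float:
    """Probability above which acting has lower expected cost than not acting."""
    if cost_false_positive < 0 or cost_false_negative < 0:
        raise ValueError("costs must be non-negative")
    total = cost_false_positive + cost_false_negative
    if total == 0:
        raise ValueError("at least one cost must be positive")
    return cost_false_positive / total


def total_cost(
    probabilities: Sequence[float],
    labels: Sequence[int],
    threshold: float,
    cost_false_positive: float,
    cost_false_negative: float,
) -> float:
    """Sum the error costs incurred when acting on probabilities >= threshold."""
    cost = 0.0
    for p, y in zip(probabilities, labels):
        predicted = p >= threshold
        if predicted and not y:
            cost += cost_false_positive
        elif not predicted and y:
            cost += cost_false_negative
    return cost


# Toy numbers: a wasted retention offer costs 10, a lost customer costs 90.
threshold = cost_optimal_threshold(10, 90)
probs, labels = [0.05, 0.2, 0.4, 0.8], [0, 0, 1, 1]
print(threshold)                                     # 0.1
print(total_cost(probs, labels, threshold, 10, 90))  # 10.0
print(total_cost(probs, labels, 0.5, 10, 90))        # 90.0
```

When missing a positive case is much more expensive than a false alarm, the cost-based threshold sits well below the default of 0.5, which is why error costs belong in the framing rather than in late-stage tuning.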

📦

Deliverables

📄

Problem Statement Document

Formal ML problem definition with inputs, outputs, and constraints

🎯

Scope Definition

Clear boundaries of what's in and out of scope

💡

Hypothesis Document

Key assumptions and expected patterns to validate

⚠️

Constraints Catalog

Technical, business, and ethical constraints list

🛠️

Recommended Tools

🧠

Miro / Mural

Collaborative problem mapping

📊

Lucidchart

System diagrams & flows

📝

Notion / Confluence

Documentation & collaboration

🔬

Jupyter Notebooks

Exploratory analysis

📋

Google ML Problem Framing

Structured framework guide

🎨

FigJam

Visual brainstorming

💡

Best Practices

✓

Start Simple
Begin with the simplest version of the problem. You can always add complexity later. A working simple model beats a complex model that never ships.
✓

Challenge the Problem
Question whether ML is really needed. Sometimes rule-based systems or simpler analytics can solve the problem more efficiently.
✓

Consider the Human Baseline
Understand how humans currently solve this problem. This provides context for what ML needs to beat and reveals important domain knowledge.
✓

Think About Edge Cases Early
Identify edge cases and exceptions during framing. They often reveal gaps in understanding and prevent surprises during development.
✓

Document Assumptions Explicitly
Every assumption is a potential point of failure. Document them clearly so they can be validated and revisited.
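A simple way to act on "Start Simple" and the baseline idea is to measure a trivial baseline before any model is built. The sketch below uses a majority-class predictor as a stand-in for the baseline; the function names, the `min_gain` margin, and the labels are illustrative assumptions, and accuracy is used only for brevity (imbalanced problems usually need a different metric).

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_baseline_accuracy(
    train_labels: Sequence[Hashable], test_labels: Sequence[Hashable]
) -> float:
    """Accuracy of always predicting the most common training label."""
    if not train_labels or not test_labels:
        raise ValueError("both label sequences must be non-empty")
    majority, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for y in test_labels if y == majority)
    return hits / len(test_labels)


def beats_baseline(model_accuracy: float, baseline_accuracy: float, min_gain: float = 0.02) -> bool:
    """True if the model improves on the baseline by at least min_gain."""
    return model_accuracy - baseline_accuracy >= min_gain


# Toy labels for illustration only.
train = ["stay"] * 8 + ["churn"] * 2
test = ["stay", "stay", "stay", "churn", "stay"]
baseline = majority_baseline_accuracy(train, test)
print(baseline)                        # 0.8
print(beats_baseline(0.90, baseline))  # True
print(beats_baseline(0.81, baseline))  # False
```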

💡 Pro Tips

Reframe if stuck: If a problem seems unsolvable, try reframing it from a different angle.
Decompose complex problems: Break large problems into smaller, independently solvable sub-problems.
Consider proxy targets: If the true target is hard to measure directly, find a correlated metric that is easier to measure and predict.
Plan for iteration: The first framing is rarely perfect. Build in time to revisit and refine.
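Before committing to a proxy target, it helps to check how closely it tracks the true target on the cases where both are known. The sketch below computes a Pearson correlation by hand; the function name and the sample values are hypothetical, and a high correlation on a small sample is a hint rather than proof that the proxy is safe.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Toy values: a proxy (e.g. support tickets) against the true target.
proxy = [1.0, 2.0, 3.0, 4.0]
target = [2.0, 4.0, 6.0, 8.0]
print(round(pearson(proxy, target), 6))
```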

⚠️ Common Pitfalls to Avoid

Solutioning too early: Don't jump to algorithms before fully understanding the problem.
Ignoring constraints: Technical constraints discovered late can derail entire projects.
Overcomplicating: Don't add complexity that brings no business value.
Vague problem statements: "Improve customer experience" is not an ML problem; "predict churn probability" is.

📄

Templates & Resources

📥

Problem Statement Template

Structured template for ML problem definition

📥

5W+H Worksheet

Guided questionnaire for problem framing

📥

Hypothesis Canvas

Visual template for documenting assumptions

📥

ML Problem Type Decision Tree

Guide for selecting the right problem type