🔍 Phase 1 • Process 1.3

Problem Framing

Transform business challenges into well-defined ML problems by identifying the right approach, scope, and constraints for your machine learning solution.

Duration
3-7 Days
Key Roles
Data Scientist, ML Engineer, Product Owner (PO)
Complexity
🔴 High
🎯

Overview

Problem Framing is one of the most critical processes in any ML project. It bridges the gap between what the business needs and what ML can deliver. A poorly framed problem leads to wasted resources, missed expectations, and failed projects.

This process involves translating business requirements into a precise ML problem statement, selecting the appropriate modeling approach, defining inputs and outputs, and establishing clear boundaries for what the solution will and won't do.

💼
Business Problem
"We're losing customers"
🔄
Translation
Analysis & Decomposition
🤖
ML Problem
"Predict churn probability"
📊

ML Problem Types

Understanding the type of ML problem you're solving is fundamental. Each type requires different algorithms, data structures, and evaluation approaches:

🏷️
Classification
Predict discrete categories or labels from input features.
Examples: Spam detection, fraud identification, assigning customers to predefined segments, image recognition
📈
Regression
Predict continuous numerical values from input data.
Examples: Price prediction, demand forecasting, risk scoring, lifetime value estimation
🎯
Clustering
Group similar items together without predefined labels.
Examples: Customer segmentation, document grouping, anomaly detection (points that fit no cluster), product grouping from market basket data
Recommendation
Suggest relevant items based on user behavior or preferences.
Examples: Product recommendations, content suggestions, next-best-action, personalization
⏱️
Time Series
Analyze and predict sequential data over time.
Examples: Sales forecasting, stock price prediction, resource planning, trend analysis
🔤
NLP
Process and understand natural language text.
Examples: Sentiment analysis, chatbots, text summarization, entity extraction
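
As a rough illustration of how the desired output and the available data point toward a problem type, here is a minimal sketch. The function name, its parameters, and the order of the checks are assumptions made for this example, not part of any standard library or of this guide's decision tree. The types also overlap in practice (sentiment analysis is both NLP and classification), so treat the result as a starting point for discussion.

```python
from typing import Optional


def suggest_problem_type(
    has_labels: bool,
    target_kind: Optional[str] = None,  # "category" or "number" (hypothetical vocabulary)
    time_ordered: bool = False,
    ranks_items_for_users: bool = False,
) -> str:
    """Return a first-guess problem type from a few yes/no facts."""
    if ranks_items_for_users:
        return "recommendation"
    if time_ordered and target_kind == "number":
        return "time series"
    if not has_labels:
        # No labeled target available: grouping is the usual fallback.
        return "clustering"
    if target_kind == "category":
        return "classification"
    if target_kind == "number":
        return "regression"
    return "unclear: revisit the problem statement"


print(suggest_problem_type(has_labels=True, target_kind="category"))
print(suggest_problem_type(has_labels=True, target_kind="number", time_ordered=True))
print(suggest_problem_type(has_labels=False))
```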

Key Activities

  • 1
    Business-to-ML Translation
    Convert business objectives into specific ML tasks. Identify what needs to be predicted, classified, or generated, and how it connects to business value.
  • 2
    Problem Type Selection
    Determine the appropriate ML problem type (classification, regression, clustering, etc.) based on the desired output and available data characteristics.
  • 3
    Input/Output Definition
    Clearly specify what features (inputs) will be used and what the model should produce (outputs). Define data types, formats, and expected ranges.
  • 4
    Scope Delimitation
    Establish clear boundaries: what the model will and won't do, which use cases are in scope, and which edge cases must be handled.
  • 5
    Hypothesis Formulation
    Document key assumptions and hypotheses about the problem. What patterns do you expect to find? What relationships should exist in the data?
  • 6
    Constraint Identification
    Identify technical, business, and ethical constraints. Consider latency requirements, explainability needs, fairness criteria, and regulatory compliance.
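
The six activities above can be captured in one structured record so that gaps are visible early. The sketch below is one possible shape, not a prescribed template: the class name, field names, and the churn example values (feature names, scope, constraint) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProblemStatement:
    business_goal: str                  # Activity 1: business objective
    ml_task: str                        # Activity 1: what the model predicts
    problem_type: str                   # Activity 2
    inputs: Dict[str, str]              # Activity 3: feature name -> type/range
    output: str                         # Activity 3
    in_scope: List[str] = field(default_factory=list)      # Activity 4
    out_of_scope: List[str] = field(default_factory=list)  # Activity 4
    hypotheses: List[str] = field(default_factory=list)    # Activity 5
    constraints: List[str] = field(default_factory=list)   # Activity 6

    def missing_sections(self) -> List[str]:
        """Names of sections that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical example values for illustration only.
churn = ProblemStatement(
    business_goal="We're losing customers",
    ml_task="Predict churn probability",
    problem_type="classification",
    inputs={"tenure_months": "int, >= 0", "monthly_spend": "float, >= 0"},
    output="churn probability in [0, 1]",
    in_scope=["existing subscribers"],
    constraints=["explainable to account managers"],
)
print(churn.missing_sections())  # ['out_of_scope', 'hypotheses']
```

Here the check shows that scope exclusions and hypotheses have not been written down yet, which is exactly the kind of gap framing is meant to surface.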
🧩

Problem Framing Framework

Use this structured approach to systematically frame your ML problem:

📋
The 5W+H Framework for ML
  • What: What exactly needs to be predicted, classified, or generated?
  • Who: Who will use this prediction, and how?
  • When: When is the prediction needed? (Real-time vs. batch)
  • Where: Where will the model be deployed? (Cloud, edge, mobile)
  • Why: Why is ML the right solution? What's the alternative?
  • How: How will success be measured? What's the baseline?
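
One lightweight way to use the framework is to treat it as a checklist and flag the questions that still lack an answer. The following is a minimal sketch under that assumption; the `unanswered` helper and the sample answers are invented for illustration.

```python
FIVE_W_H = ("what", "who", "when", "where", "why", "how")


def unanswered(worksheet: dict) -> list:
    """Return the 5W+H questions that have no non-blank answer yet."""
    return [q for q in FIVE_W_H if not str(worksheet.get(q, "")).strip()]


# Hypothetical, partially completed worksheet.
worksheet = {
    "what": "Probability that a customer churns within 30 days",
    "who": "Retention team, via a weekly call list",
    "when": "Batch, once per week",
    "where": "   ",  # whitespace only, so it counts as unanswered
    "why": "Manual rules miss most churners",
}
print(unanswered(worksheet))  # ['where', 'how']
```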

Key Questions to Answer

Problem Definition
  • What is the core business problem?
  • Can this be solved with ML?
  • What's the simplest version of this problem?
  • How would a human solve this today?
Data & Features
  • What data is available?
  • What features might be predictive?
  • Is labeled data available?
  • How much data is needed?
Output & Usage
  • What should the model output?
  • How will predictions be consumed?
  • What latency is acceptable?
  • How often will it be used?
Constraints & Risks
  • What are the hard constraints?
  • What could go wrong?
  • Are there fairness concerns?
  • What's the cost of errors?
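
The question "What's the cost of errors?" becomes concrete once each error type is given a price. The sketch below assumes a binary classifier and hypothetical per-error costs (5 per false positive, 100 per false negative); the error counts for the two candidate models are made up to illustrate the arithmetic.

```python
def total_error_cost(false_positives: int, false_negatives: int,
                     cost_fp: float, cost_fn: float) -> float:
    """Total cost = FP count * cost per FP + FN count * cost per FN."""
    return false_positives * cost_fp + false_negatives * cost_fn


# Hypothetical results for two candidate models on the same evaluation set.
model_a = total_error_cost(false_positives=80, false_negatives=20,
                           cost_fp=5.0, cost_fn=100.0)
model_b = total_error_cost(false_positives=30, false_negatives=40,
                           cost_fp=5.0, cost_fn=100.0)
print(model_a, model_b)  # 2400.0 4150.0
```

Model B makes fewer errors overall (70 vs. 100) yet costs more, because it misses more of the expensive cases. Under these assumed costs, raw error count would be the wrong success measure.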
📦

Deliverables

📄

Problem Statement Document

Formal ML problem definition with inputs, outputs, and constraints

🎯

Scope Definition

Clear boundaries of what's in and out of scope

💡

Hypothesis Document

Key assumptions and expected patterns to validate

⚠️

Constraints Catalog

Technical, business, and ethical constraints list

🛠️

Recommended Tools

🧠
Miro / Mural
Collaborative problem mapping
📊
Lucidchart
System diagrams & flows
📝
Notion / Confluence
Documentation & collaboration
🔬
Jupyter Notebooks
Exploratory analysis
📋
Google ML Problem Framing
Structured framework guide
🎨
FigJam
Visual brainstorming
💡

Best Practices

  • Start Simple
    Begin with the simplest version of the problem. You can always add complexity later. A working simple model beats a complex model that never ships.
  • Challenge the Problem
    Question whether ML is really needed. Sometimes rule-based systems or simpler analytics can solve the problem more efficiently.
  • Consider the Human Baseline
    Understand how humans currently solve this problem. This provides context for what ML needs to beat and reveals important domain knowledge.
  • Think About Edge Cases Early
    Identify edge cases and exceptions during framing. They often reveal gaps in understanding and prevent surprises during development.
  • Document Assumptions Explicitly
    Every assumption is a potential point of failure. Document them clearly so they can be validated and revisited.
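
For "Start Simple" and the baseline idea, a common first check on a classification problem is the accuracy of always predicting the most frequent label. This is a minimal sketch with made-up labels; it swaps in a majority-class baseline as a stand-in, which is not the same thing as the human baseline described above.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical dataset: 90 customers stayed (0), 10 churned (1).
labels = [0] * 90 + [1] * 10
print(majority_baseline_accuracy(labels))  # 0.9
```

A model reporting 90% accuracy on this data would be no better than predicting "no churn" for everyone, which is worth knowing before any modeling starts.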
💡 Pro Tips
  • Reframe if stuck: If a problem seems unsolvable, try a different formulation, such as ranking items instead of predicting exact values.
  • Decompose complex problems: Break large problems into smaller, independently solvable sub-problems.
  • Consider proxy targets: If the true outcome is hard to measure, find a correlated metric that is easier to measure, and verify that it tracks the real goal.
  • Plan for iteration: The first framing is rarely perfect. Build in time to revisit and refine.
⚠️ Common Pitfalls to Avoid
  • Solutioning too early: Don't jump to algorithms before fully understanding the problem.
  • Ignoring constraints: Technical constraints discovered late can derail entire projects.
  • Overcomplicating: Don't add complexity that delivers no business value.
  • Vague problem statements: "Improve customer experience" is not an ML problem.
📄

Templates & Resources

📥

Problem Statement Template

Structured template for ML problem definition

📥

5W+H Worksheet

Guided questionnaire for problem framing

📥

Hypothesis Canvas

Visual template for documenting assumptions

📥

ML Problem Type Decision Tree

Guide for selecting the right problem type