Understanding SAM Clustering Results

Overview

SAM provides comprehensive clustering outputs designed to support both technical analysis and business decision-making. This guide explains how to interpret all 6 quality metrics (Silhouette, Davies-Bouldin, Calinski-Harabasz, Cluster Imbalance, Cluster Separation, Cluster Cohesion) and use them effectively for strategic planning.

Primary Outputs

1. Cluster Assignments (CSV Export)

The CSV export contains cluster labels, quality metrics, and business indicators for each record, ready for strategic analysis.

Standardized Multi-Column Format:

Record_ID | Cluster_Label | Silhouette_Score | Distance_to_Center | Business_Metrics | Feature_Values | Quality_Indicators

Key Features:

  • Cluster Labels: Each record assigned to its optimal cluster
  • Quality Scores: Individual point silhouette scores for validation
  • Business Metrics: Revenue, profit, and operational indicators per record
  • Feature Values: Original and transformed feature values
  • Distance Metrics: Proximity to cluster centers and boundaries
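As a minimal sketch of how the export might be consumed, the following reads a small inline sample with Python's csv module, averages the per-point silhouette by cluster, and lists records whose silhouette is negative. The column names follow the format above; the sample rows, the comma delimiter, and the helper name `summarize` are assumptions for illustration, not actual SAM output.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; a real export would be opened with open(path, newline="").
SAMPLE = """Record_ID,Cluster_Label,Silhouette_Score,Distance_to_Center
S001,0,0.71,1.2
S002,0,0.64,1.5
S003,1,-0.08,3.9
S004,1,0.55,1.1
"""

def summarize(csv_text):
    """Return (mean silhouette per cluster, IDs of records with negative silhouette)."""
    scores = defaultdict(list)
    suspects = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        score = float(row["Silhouette_Score"])
        scores[row["Cluster_Label"]].append(score)
        if score < 0:  # negative: the record sits closer to another cluster
            suspects.append(row["Record_ID"])
    means = {label: sum(vals) / len(vals) for label, vals in scores.items()}
    return means, suspects

means, suspects = summarize(SAMPLE)
print(means, suspects)
```

Records flagged this way are natural candidates for manual review before cluster-level decisions are made.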

2. Visual Analytics (Interactive Charts)

Chart Components:

  • Cluster Separation Plots: 2D/3D visualization of cluster boundaries
  • Silhouette Analysis: Individual point quality assessment
  • Feature Importance: Key distinguishing factors visualization
  • Business Dashboards: Performance metrics per cluster
  • Quality Heatmaps: Cluster separation and cohesion visualization

3. Executive Summary (PDF Report)

The executive PDF report covers cluster performance, visual analytics, business insights, and strategic recommendations.

Multi-Page Professional Report:

  • Title Page: Project overview and generation date
  • Cluster Summary: Model rankings and recommendations
  • Visual Analytics: All charts included with captions
  • Business Insights: Key findings and strategic implications
  • Technical Glossary: Metric definitions and interpretations

4. Advanced Visualization Suite

Task 4.1: Foundational Visualizations

  • Geospatial Distribution Map: Geographic clustering patterns
  • Performance Quadrant: Revenue vs margin scatter plots
  • Persona DNA Radar Chart: Comparative cluster profiles
  • Cluster Summary Table: High-level performance metrics

Task 4.2: Deep-Dive Analytics

  • Assortment Strategy Heatmap: Product mix analysis by cluster
  • Geographic Dominance Matrix: Regional cluster distribution
  • Trend vs Density Analysis: Competitive dynamics visualization
  • Strategic Role Mapping: Business segment classification

Task 4.3: Final Report Generation

  • Professional PDF: Multi-page executive report
  • Chart Integration: All visualizations embedded
  • Business Narratives: AI-generated insights and recommendations
  • Action Plans: Specific strategic recommendations per cluster

Understanding Quality Metrics

Primary Quality Indicators

Silhouette Score

What it measures: How well each point fits in its assigned cluster

  • Range: -1 to 1 (higher is better)
  • Excellent: > 0.7 (Clear cluster separation)
  • Good: 0.5-0.7 (Reasonable separation)
  • Fair: 0.2-0.5 (Weak separation)
  • Poor: < 0.2 (No clear separation)

Business Interpretation:

Silhouette Score = 0.65 means:
• On average, points sit much closer to their own cluster than to the nearest other cluster (0.65 is a mean over all points, not a percentage of points)
• Clear business segments are identifiable
• Suitable for strategic decision-making
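To make the metric concrete, here is a minimal pure-Python sketch of the standard silhouette definition: for each point, a is the mean distance to the other members of its own cluster, b is the lowest mean distance to any other cluster, and the score is (b - a) / max(a, b). The toy points and the use of Euclidean distance are assumptions; the distance measure and implementation SAM uses are not stated in this guide.

```python
import math

def silhouette_samples(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b), Euclidean distance.

    Expects at least two clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum)
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: lowest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores

# Hypothetical toy data: two tight groups far apart
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
scores = silhouette_samples(points, labels)
overall = sum(scores) / len(scores)  # the single figure reported per model
print(round(overall, 3))
```

The overall score is the mean of the per-point values, which is why it describes average fit rather than a share of well-placed points.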

Davies-Bouldin Index

What it measures: Cluster compactness and separation (lower is better)

  • Excellent: < 0.5 (Very compact, well-separated clusters)
  • Good: 0.5-1.0 (Reasonable compactness)
  • Fair: 1.0-2.0 (Moderate quality)
  • Poor: > 2.0 (Poor cluster quality)

Business Interpretation:

Davies-Bouldin = 0.8 means:
• Clusters are reasonably compact and well-separated
• Business segments are distinct and actionable
• Good foundation for strategic planning
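The index can be sketched in a few lines of pure Python using its standard definition: for each cluster, take the worst-case ratio of combined scatter to centroid distance against every other cluster, then average those ratios. The toy data and helper names are hypothetical, and Euclidean distance is an assumption.

```python
import math

def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coord) / len(members) for coord in zip(*members))

def davies_bouldin(points, labels):
    """Mean over clusters of max_j (s_i + s_j) / d_ij; lower is better."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centers = {label: centroid(m) for label, m in clusters.items()}
    # s_i: mean distance from a cluster's members to its centroid
    scatter = {
        label: sum(math.dist(p, centers[label]) for p in m) / len(m)
        for label, m in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)

# Hypothetical toy data: two tight groups far apart
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
db_index = davies_bouldin(points, labels)
print(db_index)
```

Tight clusters (small scatter) that sit far apart (large centroid distance) drive the ratio, and therefore the index, toward zero.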

Calinski-Harabasz Score

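What it measures: Ratio of between-cluster to within-cluster variance (higher is better). The score has no upper bound and grows with the number of records, so the bands below are best read as guidelines for a given dataset size.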

  • Excellent: > 2000 (Strong cluster separation)
  • Good: 1000-2000 (Reasonable separation)
  • Fair: 500-1000 (Moderate separation)
  • Poor: < 500 (Weak separation)
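A minimal sketch of the standard formula, [B / (k - 1)] / [W / (n - k)], where B is the between-cluster and W the within-cluster sum of squares, helps show how the score behaves. The toy data and function names are hypothetical. The second call duplicates every record to illustrate that the score rises with dataset size even when the cluster geometry is unchanged.

```python
def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coord) / len(members) for coord in zip(*members))

def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def calinski_harabasz(points, labels):
    """Between-cluster dispersion over within-cluster dispersion."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    n, k = len(points), len(clusters)
    overall = centroid(points)
    centers = {label: centroid(m) for label, m in clusters.items()}
    # B: size-weighted squared distance of each centroid from the overall mean
    between = sum(len(m) * sq_dist(centers[l], overall) for l, m in clusters.items())
    # W: squared distance of every point from its own centroid
    within = sum(sq_dist(p, centers[l]) for l, m in clusters.items() for p in m)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical toy data: two tight groups far apart
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
ch_small = calinski_harabasz(points, labels)
ch_doubled = calinski_harabasz(points * 2, labels * 2)  # same geometry, twice the records
print(ch_small, ch_doubled)
```

Because of this dependence on record count, the score is most useful for comparing models fitted to the same dataset.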

Simplified Quality Ratings

Cluster Quality Assessment

Our AI automatically grades cluster performance:

  • Excellent (Silhouette > 0.7): High confidence for strategic decisions
  • Good (Silhouette 0.5-0.7): Reliable for operational planning
  • Fair (Silhouette 0.2-0.5): Useful for directional guidance
  • Poor (Silhouette < 0.2): Consider additional data or different approach
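The grading bands above translate directly into a small lookup. This is a sketch of the published thresholds only; how SAM treats values that fall exactly on a boundary is not stated, so the boundary handling here (0.7 counts as Good, 0.2 as Fair) is an assumption, as is the function name.

```python
def quality_grade(silhouette):
    """Map an overall silhouette score to the grade bands listed above."""
    if silhouette > 0.7:
        return "Excellent"
    if silhouette >= 0.5:  # assumption: boundary values fall in the lower band
        return "Good"
    if silhouette >= 0.2:
        return "Fair"
    return "Poor"

for score in (0.73, 0.58, 0.31, 0.12):
    print(score, quality_grade(score))
```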

Confidence Levels

Risk assessment for cluster reliability:

  • High: Clear separation, consistent patterns, strong model fit
  • Medium: Moderate uncertainty, acceptable for most planning
  • Low: High variability, use with caution, consider alternative approaches

Business Intelligence Metrics

Cluster Profiling and Analysis

Cluster Size Distribution

What it measures: Balance and interpretability of cluster sizes

  • Balanced: Similar-sized clusters (ideal for business segments)
  • Skewed: One dominant cluster (may indicate natural business hierarchy)
  • Fragmented: Many small clusters (may need consolidation)
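One simple way to quantify the distribution is to compute each cluster's share of records and the ratio of the largest to the smallest cluster. This is only an illustrative sketch: the ratio is a common imbalance measure, but SAM's own Cluster Imbalance metric may be defined differently, and the three cluster sizes below are hypothetical.

```python
from collections import Counter

def size_profile(labels):
    """Return (share of records per cluster, largest-to-smallest size ratio)."""
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {label: count / total for label, count in counts.items()}
    imbalance = max(counts.values()) / min(counts.values())
    return shares, imbalance

# Hypothetical assignment of 600 stores to three clusters
labels = [1] * 150 + [2] * 200 + [3] * 250
shares, imbalance = size_profile(labels)
print(shares, round(imbalance, 2))
```

A ratio near 1 suggests a balanced solution, while a large ratio points to a skewed or fragmented one.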

Business Performance Metrics

Compare key business indicators across clusters:

Cluster 1: High Performers
• Size: 150 stores (25%)
• Avg Revenue: $2.1M
• Avg Margin: 18.5%
• Growth Rate: +12%

Cluster 2: Growth Opportunities
• Size: 200 stores (33%)
• Avg Revenue: $1.4M
• Avg Margin: 12.3%
• Growth Rate: +8%

Feature Importance Analysis

Identify which variables most distinguish clusters:

  • Revenue Drivers: Key factors driving high performance
  • Risk Indicators: Variables associated with underperformance
  • Growth Factors: Characteristics of high-growth clusters
  • Operational Metrics: Efficiency and productivity indicators

Strategic Segmentation Analysis

Business Segment Classification

Our AI automatically classifies clusters into business segments:

High Performers (Revenue > $2M, Margin > 15%):

  • Strategy: Expansion & Replication
  • Priority: HIGH - Study and replicate success factors
  • Actions: Scale successful practices, invest in growth

Growth Opportunities (Revenue < $1.5M, Margin < 12%):

  • Strategy: Support & Optimization
  • Priority: HIGH - Requires immediate attention
  • Actions: Performance improvement, targeted interventions

New Ventures (Age < 1 year):

  • Strategy: Growth Support
  • Priority: MEDIUM - Monitor maturation progress
  • Actions: Development support, patience for growth

Geographic Clusters (Regional concentration):

  • Strategy: Regional Strategy
  • Priority: MEDIUM - Regional optimization
  • Actions: Local market strategies, regional resources
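The segment rules above can be expressed as a short rule chain. The thresholds come from the list above; the order in which rules are checked, the "Unclassified" fallback, and the `regionally_concentrated` flag are assumptions, since the guide does not say how SAM resolves overlaps or measures regional concentration.

```python
def classify_segment(avg_revenue, avg_margin_pct, avg_age_years,
                     regionally_concentrated=False):
    """Assign a cluster to a business segment from its average indicators."""
    if avg_age_years < 1:  # assumption: age is checked before performance
        return "New Ventures"
    if avg_revenue > 2_000_000 and avg_margin_pct > 15:
        return "High Performers"
    if avg_revenue < 1_500_000 and avg_margin_pct < 12:
        return "Growth Opportunities"
    if regionally_concentrated:
        return "Geographic Clusters"
    return "Unclassified"  # hypothetical fallback, not a SAM segment

print(classify_segment(2_100_000, 18.5, 6.0))
print(classify_segment(1_400_000, 11.0, 4.0))
```

Because the cutoffs are strict, a cluster with $1.4M revenue and a 12.3% margin falls just outside the Growth Opportunities rule as written, so borderline clusters merit a manual look.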

Advanced Quality Metrics

Reliability and Confidence

Model Reliability Score (0-100)

Calculation: Quality-adjusted confidence measure

  • 90-100: Extremely reliable, suitable for critical decisions
  • 70-89: Good reliability, appropriate for most planning
  • 50-69: Moderate reliability, use with additional validation
  • < 50: Low reliability, consider alternative approaches

Cluster Stability Score

What it measures: Consistency of cluster assignments across multiple runs

  • High Stability: Consistent cluster assignments
  • Low Stability: Variable assignments, higher uncertainty
  • Business Impact: Planning confidence and risk assessment
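The guide does not say how SAM computes stability. One common way to compare two runs is the Rand index: the fraction of record pairs that both runs treat the same way (together in both, or apart in both). The sketch below uses that technique as an illustration, with hypothetical label lists.

```python
from itertools import combinations

def rand_index(run_a, run_b):
    """Fraction of record pairs on which two clusterings agree (0 to 1)."""
    pairs = list(combinations(range(len(run_a)), 2))
    agree = sum(
        (run_a[i] == run_a[j]) == (run_b[i] == run_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Hypothetical assignments for six records across three runs
run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [1, 1, 0, 0, 2, 2]  # same grouping, different label names
run_3 = [0, 0, 1, 1, 2, 1]  # one record moved to another cluster

same_grouping = rand_index(run_1, run_2)
one_moved = rand_index(run_1, run_3)
print(same_grouping, one_moved)
```

Comparing pairs rather than raw labels matters because cluster numbering is arbitrary and can change between runs even when the grouping is identical.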

Separation Coefficient

Technical Measure: Average distance between cluster centers / average cluster radius

Business Interpretation:

  • > 2.0: Very clear separation between business segments
  • 1.5-2.0: Good separation, actionable segments
  • < 1.5: Overlapping segments, consider consolidation
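The ratio can be sketched as follows. The guide does not define "cluster radius", so this example assumes it is the mean distance from a cluster's members to its center; the toy data, Euclidean distance, and function names are likewise assumptions.

```python
import math
from itertools import combinations

def centroid(members):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coord) / len(members) for coord in zip(*members))

def separation_coefficient(points, labels):
    """Average center-to-center distance divided by average cluster radius."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centers = {label: centroid(m) for label, m in clusters.items()}
    # Assumed radius: mean distance from a cluster's members to its center
    radii = [
        sum(math.dist(p, centers[label]) for p in m) / len(m)
        for label, m in clusters.items()
    ]
    # Distance between every pair of cluster centers
    gaps = [math.dist(centers[a], centers[b]) for a, b in combinations(centers, 2)]
    return (sum(gaps) / len(gaps)) / (sum(radii) / len(radii))

# Hypothetical toy data: two tight groups far apart
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
coefficient = separation_coefficient(points, labels)
print(coefficient)
```

The sketch needs at least two clusters and a nonzero average radius; a production version would guard against both cases.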

Data Quality Indicators

Cluster Cohesion

Scale: 0-1, where higher values indicate tighter clusters

  • > 0.8: Very cohesive business segments
  • 0.6-0.8: Good cohesion, clear segment identity
  • < 0.6: Loose segments, may need refinement

Cluster Separation

Scale: 0-1, where higher values indicate better separation

  • > 0.7: Clear business segment boundaries
  • 0.5-0.7: Good separation, actionable segments
  • < 0.5: Overlapping segments, consider alternative approaches

Model Performance Comparison

Model Rankings Table

Our executive summary includes a comprehensive comparison:

Model | Quality Grade | Silhouette | Reliability Score | Best Use Case
HDBSCAN | Excellent | 0.73 | 94 | Strategic Segmentation
K-Means | Good | 0.58 | 87 | Operational Clustering
GMM | Excellent | 0.71 | 96 | Risk Assessment

Recommendation Engine

Best Model Selection: Our AI recommends the optimal model based on:

  • Quality Performance: Silhouette score and separation metrics
  • Business Context: Interpretability and actionability requirements
  • Data Characteristics: Shape, size, and complexity factors
  • Computational Efficiency: Processing time and resource requirements

Risk Assessment Framework

High Confidence Scenarios (Use clusters directly)

  • Quality Grade: Excellent
  • Silhouette Score > 0.7
  • Reliability Score > 90
  • Clear business interpretation

Medium Confidence Scenarios (Use with validation)

  • Quality Grade: Good
  • Silhouette Score 0.5-0.7
  • Consider business validation
  • Develop contingency plans

Low Confidence Scenarios (Directional guidance only)

  • Quality Grade: Fair/Poor
  • Silhouette Score < 0.5
  • Focus on general patterns
  • Frequent re-clustering recommended
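The three scenarios can be combined into a single check, sketched below. The thresholds come from the lists above; treating a high silhouette with a reliability score of 90 or lower as Medium is an assumption, since the framework does not cover that combination, and the function name is hypothetical.

```python
def confidence_level(silhouette, reliability):
    """Map silhouette and reliability scores to a confidence scenario."""
    if silhouette > 0.7 and reliability > 90:
        return "High"    # use clusters directly
    if silhouette >= 0.5:
        return "Medium"  # use with business validation
    return "Low"         # directional guidance only

# Scores taken from the model rankings table in this guide
models = {"HDBSCAN": (0.73, 94), "K-Means": (0.58, 87), "GMM": (0.71, 96)}
levels = {name: confidence_level(s, r) for name, (s, r) in models.items()}
print(levels)
```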

AI-Generated Insights

Executive Summaries

What you get: Business-focused analysis for each cluster including:

  • Performance assessment in business terms
  • Key characteristics and distinguishing factors
  • Comparison to other clusters
  • Strategic implications

Example:

"Cluster 1 represents high-performing stores (18% of total) with average revenue of $2.1M and 18.5% margins. These stores are primarily located in urban markets with high customer density. Key success factors include strong inventory management and experienced staff. Strategic recommendation: Replicate these practices in Cluster 2 stores to drive overall performance improvement."

Actionable Recommendations

Categories:

  1. Performance Optimization: Improve underperforming clusters
  2. Growth Strategy: Scale successful cluster practices
  3. Resource Allocation: Distribute resources based on cluster potential
  4. Risk Management: Address cluster-specific challenges

Interpreting Cluster Visualizations

Visual Elements

  • Cluster Colors: Each cluster has a distinct color for easy identification
  • Point Sizes: May indicate business importance (revenue, profit, etc.)
  • Boundaries: Show cluster separation and overlap areas
  • Centers: Highlight cluster centroids and characteristics

Pattern Recognition

  • Cluster Density: Tight vs loose clusters indicate segment cohesion
  • Separation: Clear boundaries vs overlap indicate business segment clarity
  • Outliers: Points far from cluster centers may need special attention
  • Hierarchies: Nested clusters may indicate business sub-segments

Business Insights

  • Segment Identification: Clear business segments for targeted strategies
  • Performance Patterns: Visual correlation between location and performance
  • Growth Opportunities: Underperforming areas with growth potential
  • Risk Assessment: Clusters with high variability or outlier concentration

Common Pitfalls to Avoid

1. Over-Interpreting Low Quality Clusters

  • Problem: Making major decisions on clusters with silhouette < 0.3
  • Solution: Use for directional guidance only

2. Ignoring Business Context

  • Problem: Accepting clusters that don't make business sense
  • Solution: Validate AI insights against business knowledge

3. Misinterpreting Cluster Sizes

  • Problem: Assuming equal cluster sizes are always better
  • Solution: Consider natural business hierarchies and market realities

4. Not Validating Against Business Metrics

  • Problem: Accepting clusters misaligned with business performance
  • Solution: Validate cluster assignments against known business outcomes