Hotspot Analysis Methodology

1. Goal

Identify code files that are both Complex and Frequently Changed. High Churn + High Complexity = Hotspot.

These files are the primary candidates for:

  • Refactoring.
  • High-risk regression testing.
  • "Strangling" (extracting logic to new modules).

2. Methodology

2.1. Metric: Churn (Frequency of Change)

We use git log to count how many times a file has been committed in the last year. Threshold: > 10 commits/year indicates active development.

Command:

git log --pretty=format: --name-only --since="1 year ago" | grep -v '^$' | sort | uniq -c | sort -nr | head -n 50

2.2. Metric: Complexity (Lines of Code)

We use wc -l as a proxy for complexity. It is a crude measure, but in this legacy codebase, files running to thousands of lines are invariably complex "God Classes". Threshold: > 1000 lines indicates a "God Class" or "dumping ground".

Command:

find . -name "*.cs" -not -path "*/obj/*" -not -path "*/bin/*" -print0 | xargs -0 wc -l | sort -nr | head -n 50

3. Intersection (The Hotspots)

We manually cross-reference the top 50 lists. Files appearing in BOTH lists are Critical Hotspots.
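The manual cross-reference can also be scripted. A minimal sketch, assuming the output of the two commands above has been saved to churn.txt and loc.txt (both in "count filename" form; the sample data and filenames below are illustrative, not real results):

```shell
# Illustrative stand-ins for the saved output of sections 2.1 and 2.2.
cat > churn.txt <<'EOF'
42 ./src/OrderManager.cs
17 ./src/Util.cs
EOF
cat > loc.txt <<'EOF'
3200 ./src/OrderManager.cs
1500 ./src/Billing.cs
EOF

# Keep only the filename column, sort, then take the set intersection.
awk '{print $2}' churn.txt | sort > churn_files.txt
awk '{print $2}' loc.txt   | sort > loc_files.txt
comm -12 churn_files.txt loc_files.txt   # files in BOTH lists = hotspots
```

Here comm -12 suppresses lines unique to either file, leaving only the files present in both top-50 lists.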

4. Current Results

See Hotspot Analysis Results.