# Data Import (ETL) Strategy

## 1. Executive Summary: The ETL Workflow

This pipeline transforms fragmented budget data into a unified structure for Zoho Creator.

### High-Level Process
- **Sources** (`csv/root`): Raw CSV exports from various systems. Immutable.
- **Processing** (`etl/`): Python scripts extract, normalize, and reconcile data. See the `etl_integrity` skill for script details.
- **Validation** (`csv/verification/`): Automated integrity checks against Business Rules. See the `etl_integrity` skill for the validation logic.
- **Output** (`csv/zoho_import/unified/`): 6 canonical CSVs ready for import.
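The four stages above can be sketched as a minimal extract → normalize → validate → write loop. This is an illustrative outline only, not the actual `etl/` scripts: the function names, paths, and the normalization/validation rules shown here are assumptions standing in for the real source-specific logic documented in the `etl_integrity` skill.

```python
# Hypothetical sketch of the four pipeline stages; not the real etl/ code.
import csv
from pathlib import Path

SOURCES = Path("csv/root")                 # immutable raw exports
OUTPUT = Path("csv/zoho_import/unified")   # canonical output files

def extract(path: Path) -> list[dict]:
    """Read one raw CSV into a list of row dicts."""
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def normalize(rows: list[dict]) -> list[dict]:
    """Trim whitespace and drop fully empty rows (illustrative rule only)."""
    cleaned = []
    for row in rows:
        row = {k.strip(): (v or "").strip() for k, v in row.items()}
        if any(row.values()):
            cleaned.append(row)
    return cleaned

def validate(rows: list[dict], required: set[str]) -> None:
    """Fail fast on a missing required value (stands in for Business Rules)."""
    for i, row in enumerate(rows, start=2):  # header is line 1
        missing = required - {k for k, v in row.items() if v}
        if missing:
            raise ValueError(f"row {i}: missing {sorted(missing)}")

def write(rows: list[dict], path: Path) -> None:
    """Write the canonical CSV; output is never edited by hand."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
```

Because validation raises instead of patching data, a failed run produces no output file, which matches the pipeline's fix-the-source-then-re-run policy.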
## 2. Reference Skills

- **ETL Integrity Skill**: The "Manual" & "The Law". Contains source-specific transformations, reconciliation logic, and business rules.
## 3. Quick Actions

- **Run Full Pipeline:** `./run_full_etl.sh`
- **Check Health:** `open csv/verification/reports/etl_validation_report.md`
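The two quick actions can be combined into a single health-check wrapper. This is a hedged sketch: it assumes `run_full_etl.sh` exits non-zero on failure and that the validation report flags problems with the word "FAIL" — both conventions are illustrative, not confirmed by the pipeline itself.

```shell
# Hypothetical wrapper around the two quick actions above.
# Assumptions: run_full_etl.sh exits non-zero on failure; the report
# marks problems with the word "FAIL" (illustrative convention).
report="csv/verification/reports/etl_validation_report.md"
status="healthy"

# Run the pipeline if the script is present.
if [ -x ./run_full_etl.sh ]; then
    ./run_full_etl.sh || status="pipeline-failed"
fi

# Scan the report for failure markers, if it exists.
if [ -f "$report" ] && grep -qi "fail" "$report"; then
    status="validation-failed"
fi

echo "etl status: $status"
```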
## 4. Integration Logic (Import Order)

The 6 canonical files must be imported into Zoho Creator in this strict order to maintain referential integrity:

1. Providers (`01_providers.csv`)
2. Streams (`02_streams.csv`)
3. Budget Buckets (`03_budget_buckets.csv`)
4. Budget Versions (`04_budget_versions.csv`)
5. Contracts (`05_contracts.csv`)
6. Invoices (`06_invoices.csv`)
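A small pre-import check can enforce the order above before anything touches Zoho Creator: walk the six files in sequence and fail if any is missing. The helper for collecting foreign-key values is illustrative — the column name passed to it (e.g. `provider_id`) is an assumption about the canonical schema, not a documented field.

```python
# Hypothetical pre-import check mirroring the strict import order above.
# Column names used with referenced_ids() are illustrative assumptions.
import csv
from pathlib import Path

IMPORT_ORDER = [
    "01_providers.csv",
    "02_streams.csv",
    "03_budget_buckets.csv",
    "04_budget_versions.csv",
    "05_contracts.csv",
    "06_invoices.csv",
]

def check_import_order(unified_dir: str) -> None:
    """Verify every canonical file exists, in the order it will be imported."""
    base = Path(unified_dir)
    for name in IMPORT_ORDER:
        path = base / name
        if not path.exists():
            raise FileNotFoundError(f"missing canonical file: {path}")

def referenced_ids(path: Path, column: str) -> set[str]:
    """Collect one column's values, e.g. a child file's foreign keys,
    so they can be compared against the parent file's IDs."""
    with path.open(newline="", encoding="utf-8") as f:
        return {row[column] for row in csv.DictReader(f) if row.get(column)}
```

Importing parents before children means every foreign key a child row carries already resolves to an existing record, which is exactly what the strict order guarantees.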
> [!IMPORTANT]
> **Zero Tolerance Policy:** Never edit output CSVs manually. If validation fails, fix the code or the source data, then re-run the pipeline.