Moldova broke our data pipeline
Summary
A real-world data pipeline failure occurs when an unquoted comma in Moldova, Republic of, causes a Redshift COPY to misparse fields. The post argues that fixing the symptom (renaming Moldova) is insufficient and outlines three approaches: sanitize at the boundary, switch from CSV to a delimiter-safe format, or move to Parquet; it advocates a combined strategy for resilience and emphasizes validating data at ingestion boundaries.