Building a Scalable Ingestion Pipeline with Temporal (Part 1)
Summary
The article describes building a scalable document ingestion pipeline that uses Temporal to orchestrate crawling, embedding, and indexing across multiple sources. It covers architecture patterns such as a three-worker setup, sliding-window backpressure, and using Cloud Storage as a data bus so that large payloads do not pass through Temporal. It also discusses deployment on Google Cloud Run, continue-as-new for long-running workflows, and recommendations for SMB-scale data pipelines.
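To make the sliding-window backpressure idea concrete, here is a minimal sketch in plain Python asyncio. It is not the article's implementation and does not use the Temporal SDK; the names `run_with_sliding_window`, `fake_embed`, and `window_size` are hypothetical, chosen only for illustration. The assumption is that "sliding window" means keeping at most a fixed number of work items in flight and admitting a new one only when an earlier one finishes.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_with_sliding_window(
    items: Iterable[T],
    handler: Callable[[T], Awaitable[R]],
    window_size: int,
) -> List[R]:
    """Process items with at most `window_size` handlers in flight.

    Results are returned in completion order, not input order.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")

    in_flight = set()  # tasks currently running
    results: List[R] = []

    for item in items:
        # Window full: wait for at least one task to finish before admitting more.
        if len(in_flight) >= window_size:
            done, in_flight = await asyncio.wait(
                in_flight, return_when=asyncio.FIRST_COMPLETED
            )
            results.extend(task.result() for task in done)
        in_flight.add(asyncio.ensure_future(handler(item)))

    # Input exhausted: drain the tasks that are still running.
    if in_flight:
        done, _ = await asyncio.wait(in_flight)
        results.extend(task.result() for task in done)
    return results


async def demo():
    """Hypothetical usage: 'embed' ten documents, three at a time."""
    active = 0
    peak = 0

    async def fake_embed(doc_id: int) -> int:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stand-in for a slow embedding call
        active -= 1
        return doc_id * 2

    results = await run_with_sliding_window(range(10), fake_embed, window_size=3)
    return sorted(results), peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a Temporal-based pipeline the same shape would presumably apply to child workflows or activities rather than local coroutines, with the window size tuned to what the downstream embedding and indexing services can absorb.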