Hardwood: A New Parser for Apache Parquet
Summary
Hardwood is a new open-source Apache Parquet parser for Java (Java 21+), designed to minimize dependencies and maximize parsing performance. It offers two APIs (RowReader and ColumnReader), employs a multi-threaded pipeline, and shows substantial speedups over traditional readers in benchmarks. The roadmap includes predicate push-down, a compatibility layer with parquet-java, and writing and CLI capabilities.
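
To make the difference between row-oriented and column-oriented access concrete, here is a minimal sketch in plain Java. It does not use Hardwood and does not reproduce its API: the ColumnChunk and Row records and the two sum methods are hypothetical names invented for illustration, and the data is a small in-memory stand-in for decoded Parquet columns. The assumption is only that a row-style reader hands back one record at a time, while a column-style reader exposes a column's values directly.

```java
public class RowVsColumnSketch {

    // Hypothetical stand-in for decoded columnar data (NOT a Hardwood type):
    // each field of the record holds all values of one column.
    record ColumnChunk(long[] ids, double[] fares) {
        int rowCount() {
            return ids.length;
        }
    }

    // Hypothetical row view assembled from the columns (NOT a Hardwood type).
    record Row(long id, double fare) {}

    // Row-oriented access: build one Row per record, then read the field we need.
    // Convenient, but every column is touched even if only one is used.
    static double sumFaresByRow(ColumnChunk chunk) {
        double total = 0;
        for (int i = 0; i < chunk.rowCount(); i++) {
            Row row = new Row(chunk.ids()[i], chunk.fares()[i]);
            total += row.fare();
        }
        return total;
    }

    // Column-oriented access: iterate over the single column of interest
    // without assembling rows or reading the other columns.
    static double sumFaresByColumn(ColumnChunk chunk) {
        double total = 0;
        for (double fare : chunk.fares()) {
            total += fare;
        }
        return total;
    }

    public static void main(String[] args) {
        ColumnChunk chunk = new ColumnChunk(
                new long[] {1, 2, 3},
                new double[] {10.0, 20.5, 4.5});
        System.out.println(sumFaresByRow(chunk));    // prints 35.0
        System.out.println(sumFaresByColumn(chunk)); // prints 35.0
    }
}
```

Both methods compute the same result. The column-oriented version skips row assembly and never reads the unused column, which is the usual reason a parser offers a column-level API alongside a row-level one.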