Historical Changelog

51.0.0 (2024-03-15)

Full Changelog

Breaking changes:

Remove internal buffering from AsyncArrowWriter (#5484) #5485 [parquet] (tustvold)
Make ArrayBuilder also Sync #5353 [arrow] (dvic)
Raw JSON writer ~10x faster (#5314) #5318 [arrow] (tustvold)

Implemented enhancements:

Prototype Arrow over HTTP in Rust #5496 [arrow]
Add DataType::ListView and DataType::LargeListView #5492 [parquet] [arrow]
Improve documentation around handling of dictionary arrays in arrow flight #5487 [arrow] [arrow-flight]
Better memory limiting in parquet ArrowWriter #5484 [parquet]
Support Creating Non-Nullable Lists and Maps within a Struct #5482 [arrow]
DISCUSSION
Build Scalar with ArrayRef #5459
AsyncArrowWriter doesn't limit underlying ArrowWriter to respect buffer-size #5450 [parquet]
Refine Display implementation for FlightError #5438 [arrow] [arrow-flight]
Better ergonomics for FixedSizeList and LargeList #5372 [arrow]
Update Flight proto #5367 [arrow] [arrow-flight]
Support check similar datatype but with different magnitudes #5358 [arrow]
Buffer memory usage for custom allocations is reported as 0 #5346 [arrow]
Can the ArrayBuilder trait be made Sync? #5344 [arrow]
support cast 'UTF8' to FixedSizeList #5339 [arrow]
Support Creating Non-Nullable Lists with ListBuilder #5330 [arrow]
ParquetRecordBatchStreamBuilder::new() panics instead of erroring out when opening a corrupted file #5315 [parquet]
Raw JSON Writer #5314 [arrow]
Add support for more fused boolean operations #5297 [arrow]
parquet: Allow disabling embed ARROW_SCHEMA_META_KEY added by the ArrowWriter #5296 [parquet]
Support casting strings like '2001-01-01 01:01:01' to Date32 #5280 [arrow]
Temporal Extract/Date Part Kernel #5266 [arrow]
Support for extracting hours/minutes/seconds/etc. from Time32/Time64 type in temporal kernels #5261 [arrow]
parquet: add method to get both the inner writer and the file metadata when closing SerializedFileWriter #5253 [parquet]
Release arrow-rs version 50.0.0 #5234

Fixed bugs:

Empty String Parses as Zero in Unreleased Arrow #5504 [arrow]
Unused import in nightly rust #5476 [parquet] [arrow] [arrow-flight]
Error The data type type List .. has no natural order when using arrow::compute::lexsort_to_indices with list and more than one column #5454 [arrow]
Wrong size assertion in arrow_buffer::builder::NullBufferBuilder::new_from_buffer #5445 [arrow]
Inconsistency between comments and code implementation #5430 [arrow]
OOB access in Buffer::from_iter #5412 [arrow]
Cast kernel doesn't return null for string to integral cases when overflowing under safe option enabled #5397 [arrow]
Make ffi consume variable layout arrays with empty offsets #5391 [arrow]
RecordBatch conversion from pyarrow loses Schema's metadata #5354 [arrow]
Debug output of Time32/Time64 arrays with invalid values has confusing nulls #5336 [arrow]
Removing a column from a RecordBatch drops schema metadata #5327 [arrow]
Panic when read an empty parquet file #5304 [parquet]
How to enable statistics for string columns? #5270 [parquet]
concat::tests::test_string_dictionary_merge failure fails on Mac / has different results in different platforms #5255 [arrow]

Documentation updates:

Minor: Add doc comments to GenericByteViewArray #5512 [arrow] (alamb)
Improve docs for logical and physical nulls even more #5434 [arrow] (alamb)
Add example of converting RecordBatches to JSON objects #5364 [arrow] (alamb)

Performance improvements:

improve float to string cast by ~20%-40% #5401 [arrow] (psvri)

Closed issues:

Add StringViewArray implementation and layout and basic construction + tests #5469 [parquet] [arrow]
Add DataType::Utf8View and DataType::BinaryView #5468 [parquet] [arrow]

Merged pull requests:

Deprecate array_to_json_array #5515 [arrow] (tustvold)
Fix integer parsing of empty strings (#5504) #5505 [arrow] (tustvold)
feat: clarifying comments in struct_builder.rs #5494 #5499 [arrow] (istvan-fodor)
Update proc-macro2 requirement from =1.0.78 to =1.0.79 #5498 [arrow] [arrow-flight] (dependabot[bot])
Add DataType::ListView and DataType::LargeListView #5493 [parquet] [arrow] (Kikkon)
Better document parquet pushdown #5491 [parquet] (tustvold)
Fix NullBufferBuilder::new_from_buffer wrong size assertion #5489 [arrow] (Kikkon)
Support dictionary encoding in structures for FlightDataEncoder, add documentation for arrow_flight::encode::Dictionary #5488 [arrow] [arrow-flight] (thinkharderdev)
Add MapBuilder::with_values_field to support non-nullable values (#5482) #5483 [arrow] (lasantosr)
feat: initial support string_view and binary_view, supports layout and basic construction + tests #5481 [arrow] (ariesdevil)
Add more comprehensive documentation on testing and benchmarking to CONTRIBUTING.md #5478 (monkwire)
Remove unused import detected by nightly rust #5477 [parquet] [arrow] [arrow-flight] (XiangpengHao)
Add RecordBatch::schema_ref #5474 [parquet] [arrow] [arrow-flight] (monkwire)
Provide access to inner Write for parquet writers #5471 [parquet] (tustvold)
Add DataType::Utf8View and DataType::BinaryView #5470 [parquet] [arrow] (XiangpengHao)
Update base64 requirement from 0.21 to 0.22 #5467 [parquet] [arrow] [arrow-flight] (dependabot[bot])
Minor: Fix formatting typo in Field::new_list_field #5464 [arrow] (alamb)
Fix test_string_dictionary_merge (#5255) #5461 [arrow] (tustvold)
Use Vec::from_iter in Buffer::from_iter #5460 [arrow] (Kikkon)
Document parquet writer memory limiting (#5450) #5457 [parquet] (tustvold)
Document UnionArray Panics #5456 [arrow] (Kikkon)
fix: lexsort_to_indices unsupported mixed types with list #5455 [arrow] (alamb)
Refine Display and Source implementation for error types #5439 [arrow] [arrow-flight] (BugenZhao)
Improve debug output of Time32/Time64 arrays #5428 [arrow] (monkwire)
Miri fix: Rename invalid_mut to without_provenance_mut #5418 [arrow] (Jefffrey)
Ensure additions/multiplications when allocating buffers don't overflow #5417 [arrow] (Jefffrey)
Update Flight proto: PollFlightInfo & expiration time #5413 [arrow] [arrow-flight] (Jefffrey)
Add tests for serializing lists of dictionary encoded values to json #5399 [arrow] (jhorstmann)
Return null for overflow when casting string to integer under safe option enabled #5398 [arrow] (viirya)
Propagate error instead of panic for take_bytes #5395 [arrow] (viirya)
Improve like kernel by ~2% #5390 [arrow] (psvri)
Enable running arrow-array and arrow-arith with miri and avoid strict provenance warning #5387 [arrow] (jhorstmann)
Update to chrono 0.4.34 #5385 [arrow] (tustvold)
Return error instead of panic when reading invalid Parquet metadata #5382 [parquet] (mmaitre314)
Update tonic requirement from 0.10.0 to 0.11.0 #5380 [arrow] [arrow-flight] (dependabot[bot])
Update tonic-build requirement from =0.10.2 to =0.11.0 #5379 [arrow] [arrow-flight] (dependabot[bot])
Fix latest clippy lints #5376 [arrow] (tustvold)
feat: utility functions for creating FixedSizeList and LargeList dtypes #5373 [arrow] (universalmind303)
Minor(docs): update master to main for DataFusion/Ballista #5363 (caicancai)
Return an error instead of a panic when reading a corrupted Parquet file with mismatched column counts #5362 [parquet] (mmaitre314)
feat: support casting FixedSizeList with new child type #5360 [arrow] (wjones127)
Add more debugging info to StructBuilder validate_content #5357 [arrow] (viirya)
pyarrow: Preserve RecordBatch's schema metadata #5355 [arrow] (atwam)
Mark Encoding::BIT_PACKED as deprecated and document its compatibility issues #5348 [parquet] (jhorstmann)
Track the size of custom allocations for use via Array::get_buffer_memory_size #5347 [arrow] (jhorstmann)
fix: Return an error on type mismatch rather than panic (#4995) #5341 [parquet] (carols10cents)
Minor: support cast values to fixedsizelist #5340 [arrow] (Weijun-H)
Enhance Time32/Time64 support in date_part #5337 [arrow] (Jefffrey)
feat: add take_record_batch. #5333 [arrow] (RinChanNOWWW)
Add ListBuilder::with_field to support non nullable list fields (#5330) #5331 [arrow] (tustvold)
Don't omit schema metadata when removing column #5328 [arrow] (kylebarron)
Update proc-macro2 requirement from =1.0.76 to =1.0.78 #5324 [arrow] [arrow-flight] (dependabot[bot])
Enhance Date64 type documentation #5323 [arrow] (Jefffrey)
fix panic when decode a group with no child #5322 [parquet] (Liyixin95)
Minor/Doc Expand FlightSqlServiceClient::handshake doc #5321 [arrow] [arrow-flight] (devinjdangelo)
Refactor temporal extract date part kernels #5319 [arrow] (Jefffrey)
Add JSON writer benchmarks (#5314) #5317 [arrow] (tustvold)
Bump actions/cache from 3 to 4 #5308 (dependabot[bot])
Avro block decompression #5306 [arrow] (tustvold)
Result into error in case of endianness mismatches #5301 [arrow] (pangiole)
parquet: Add ArrowWriterOptions to skip embedding the arrow metadata #5299 [parquet] (evenyag)
Add support for more fused boolean operations #5298 [arrow] (RTEnzyme)
Support Parquet Byte Stream Split Encoding #5293 [parquet] (mwlon)
Extend string parsing support for Date32 #5282 [arrow] (gruuya)
Bring some methods over from ArrowWriter to the async version #5251 [parquet] (AdamGS)

50.0.0 (2024-01-08)

Full Changelog

Breaking changes:

Make regexp_match take scalar pattern and flag #5245 [arrow] (viirya)
Use Vec in ColumnReader (#5177) #5193 [parquet] (tustvold)
Remove SIMD Feature #5184 [arrow] (tustvold)
Use Total Ordering for Aggregates and Refactor for Better Auto-Vectorization #5100 [arrow] (jhorstmann)
Allow the zip compute function to operate on Scalar values via Datum #5086 [arrow] (Nathan-Fenner)
Improve C Data Interface and Add Integration Testing Entrypoints #5080 [arrow] (pitrou)
Parquet: read/write f16 for Arrow #5003 [parquet] (Jefffrey)

Implemented enhancements:

Support get offsets or blocks info from arrow file. #5252 [arrow]
Make regexp_match take scalar pattern and flag #5246 [arrow]
Cannot access pen state website on arrow-row #5238 [arrow]
RecordBatch with_schema's error message is hard to read #5227 [arrow]
Support cast between StructArray. #5219 [arrow]
Remove nightly-only simd feature and related code in ArrowNumericType #5185 [arrow]
Use Vec instead of Slice in ColumnReader #5177 [parquet]
Request to Memmap Arrow IPC files on disk #5153 [arrow]
GenericColumnReader::read_records Yields Truncated Records #5150 [parquet]
Nested Schema Projection #5148 [parquet] [arrow]
Support specifying quote and escape in Csv WriterBuilder #5146 [arrow]
Support casting of Float16 with other numeric types #5138 [arrow]
Parquet: read parquet metadata with page index in async and with size hints #5129 [parquet]
Cast from floating/timestamp to timestamp/floating #5122 [arrow]
Support Casting List To/From LargeList in Cast Kernel #5113 [arrow]
Expose a path for converting bytes::Bytes into arrow_buffer::Buffer without copy #5104 [arrow]
API inconsistency of ListBuilder make it hard to use as nested builder #5098 [arrow]
Parquet: don't truncate min/max statistics for float16 and decimal when writing file #5075 [parquet]
Parquet: derive boundary order when writing columns #5074 [parquet]
Support new Arrow PyCapsule Interface for Python FFI #5067 [arrow]
48.0.1 arrow patch release #5050 [parquet] [arrow]
Binary columns do not receive truncated statistics #5037 [parquet]
Re-evaluate Explicit SIMD Aggregations #5032 [arrow]
Min/Max Kernels Should Use Total Ordering #5031 [arrow]
Allow zip compute kernel to take Scalar / Datum #5011 [arrow]
Add Float16/Half-float logical type to Parquet #4986 [parquet]
feat: cast (Large)List to FixedSizeList #5081 [arrow] (wjones127)
Update Parquet Encoding Documentation #5051 [parquet]

Fixed bugs:

json schema inference can't handle null field turned into object field in subsequent rows #5215 [arrow]
Invalid trailing content after Z in timezone is ignored #5182 [arrow]
Take panics on a fixed size list array when given null indices #5169 [arrow]
EnabledStatistics::Page does not take effect on ByteArrayEncoder #5162 [parquet]
Parquet: ColumnOrder not being written when writing parquet files #5152 [parquet]
Parquet: Interval columns shouldn't write min/max stats #5145 [parquet]
cast Utf8 to decimal failure #5127 [arrow]
coerce_primitive not honored when decoding from serde object #5095 [arrow]
Unsound MutableArrayData Constructor #5091 [arrow]
RowGroupReader.get_row_iter() fails with Path ColumnPath not found #5064 [parquet]
cast format 'yyyymmdd' to Date32 give a error #5044 [arrow]

Performance improvements:

ArrowArrayStreamReader imports FFI_ArrowSchema on each iteration #5103 [arrow]

Closed issues:

Working example of list_flights with ObjectStore #5116
object_store Error broken pipe on S3 multipart upload #5106

Merged pull requests:

Update parquet object_store dependency to 0.9.0 #5290 [parquet] (tustvold)
Update proc-macro2 requirement from =1.0.75 to =1.0.76 #5289 [arrow] [arrow-flight] (dependabot[bot])
Enable JS tests again #5287 (domoritz)
Update proc-macro2 requirement from =1.0.74 to =1.0.75 #5279 [arrow] [arrow-flight] (dependabot[bot])
Update proc-macro2 requirement from =1.0.73 to =1.0.74 #5271 [arrow] [arrow-flight] (dependabot[bot])
Update proc-macro2 requirement from =1.0.71 to =1.0.73 #5265 [arrow] [arrow-flight] (dependabot[bot])
Update docs for datatypes #5260 [arrow] (Jefffrey)
Don't suppress errors in ArrowArrayStreamReader #5256 [arrow] (tustvold)
Add IPC FileDecoder #5249 [arrow] (tustvold)
optimize the next function of ArrowArrayStreamReader #5248 [arrow] (doki23)
ci: Fail Miri CI on first failure #5243 (Jefffrey)
Remove 'unwrap' from Result #5241 [parquet] (zeevm)
Update arrow-row docs URL #5239 [arrow] (thomas-k-cameron)
Improve regexp kernels performance by avoiding cloning Regex #5235 [arrow] (viirya)
Update proc-macro2 requirement from =1.0.70 to =1.0.71 #5231 [arrow] [arrow-flight] (dependabot[bot])
Minor: Improve comments and errors for ArrowPredicate #5230 [parquet] (alamb)
Bump actions/upload-pages-artifact from 2 to 3 #5229 (dependabot[bot])
make with_schema's error more readable #5228 [arrow] (shuoli84)
Use try_new when casting between structs to propagate error #5226 [arrow] (viirya)
feat(cast): support cast between struct #5221 [arrow] (my-vegetable-has-exploded)
Add entries to MapBuilder to return both key and value array builders #5218 [arrow] (viirya)
fix(json): fix inferring object after field was null #5216 [arrow] (kskalski)
Support MapBuilder in make_builder #5210 [arrow] (viirya)
impl From<OffsetBuffer<T>> for ScalarBuffer<T> #5203 [arrow] (mbrobbel)
impl From<BufferBuilder<T>> for Buffer #5202 [arrow] (mbrobbel)
impl From<BufferBuilder<T>> for ScalarBuffer<T> #5201 [arrow] (mbrobbel)
feat: Support quote and escape in Csv WriterBuilder #5196 [arrow] (my-vegetable-has-exploded)
chore: simplify cast_string_to_interval #5195 [arrow] (jackwener)
Clarify interval comparison behavior with documentation and tests #5192 [arrow] (alamb)
Add BooleanArray::into_parts method #5191 [arrow] (mbrobbel)
Fix deprecated note for Buffer::from_raw_parts #5190 [arrow] (mbrobbel)
Fix: Ensure Timestamp Parsing Rejects Characters After 'Z' #5189 [arrow] (razeghi71)
Simplify parquet statistics generation #5183 [parquet] (tustvold)
Parquet: Ensure page statistics are written only when configured from the Arrow Writer #5181 [parquet] (AdamGS)
Blockwise IO in IPC FileReader (#5153) #5179 [arrow] (tustvold)
Replace ScalarBuffer in Parquet with Vec (#1849) (#5177) #5178 [parquet] (tustvold)
Bump actions/setup-python from 4 to 5 #5175 (dependabot[bot])
Add LargeListBuilder to make_builder #5171 [arrow] (viirya)
fix: ensure take_fixed_size_list can handle null indices #5170 (westonpace)
Removing redundant as casts in parquet #5168 [parquet] (psvri)
Bump actions/labeler from 4.3.0 to 5.0.0 #5167 (dependabot[bot])
improve: make RunArray displayable #5166 [arrow] (yukkit)
ci: Add cargo audit CI action #5160 [arrow] (Jefffrey)
Parquet: write column_orders in FileMetaData #5158 [parquet] (Jefffrey)
Adding is_null datatype shortcut method #5157 [arrow] (comphead)
Parquet: don't truncate f16/decimal min/max stats #5154 [parquet] (Jefffrey)
Support nested schema projection (#5148) #5149 [arrow] (tustvold)
Parquet: omit min/max for interval columns when writing stats #5147 [parquet] (Jefffrey)
Deprecate Fields::remove and Schema::remove #5144 [arrow] (tustvold)
Support casting of Float16 with other numeric types #5139 [arrow] (viirya)
Parquet: Make MetadataLoader public #5137 [parquet] (AdamGS)
Add FileReaderBuilder for arrow-ipc to allow reading large no. of column files #5136 [arrow] (Jefffrey)
Parquet: clear metadata and project fields of ParquetRecordBatchStream::schema #5135 [parquet] (Jefffrey)
JSON: write struct array nulls as null #5133 [arrow] (Jefffrey)
Update proc-macro2 requirement from =1.0.69 to =1.0.70 #5131 [arrow] [arrow-flight] (dependabot[bot])
Fix negative decimal string #5128 [arrow] (viirya)
Cleanup list casting and support nested lists (#5113) #5124 [arrow] (tustvold)
Cast from numeric/timestamp to timestamp/numeric #5123 [arrow] (viirya)
Improve cast docs #5114 [arrow] (tustvold)
Update prost-build requirement from =0.12.2 to =0.12.3 #5112 [arrow] [arrow-flight] (dependabot[bot])
Parquet: derive boundary order when writing #5110 [parquet] (Jefffrey)
Implementing ArrayBuilder for Box<dyn ArrayBuilder> #5109 [arrow] (viirya)
Fix 'ColumnPath not found' error reading Parquet files with nested REPEATED fields #5102 [parquet] (mmaitre314)
fix: coerce_primitive for serde decoded data #5101 [arrow] (fansehep)
Extend aggregation benchmarks #5096 [arrow] (jhorstmann)
Expand parquet crate overview doc #5093 [parquet] (mmaitre314)
Ensure arrays passed to MutableArrayData have same type (#5091) #5092 [arrow] (tustvold)
Update prost-build requirement from =0.12.1 to =0.12.2 #5088 [arrow] [arrow-flight] (dependabot[bot])
Add FFI from_raw #5082 [arrow] (tustvold)
fix #5044
Enable truncation of binary statistics columns #5076 [parquet] (emcake)

49.0.0 (2023-11-07)

Full Changelog

Breaking changes:

Return row count when inferring schema from JSON #5008 [arrow] (asayers)
Update object_store 0.8.0 #5043 [parquet] (tustvold)

Implemented enhancements:

Cast from integer/timestamp to timestamp/integer #5039 [arrow]
Support casting from integer to binary #5014 [arrow]
Return row count when inferring schema from JSON #5007 [arrow]
FlightSQL
Support RecordBatch::remove_column() and Schema::remove_field() #4952 [arrow]
arrow_json: support binary deserialization #4945 [arrow]
Support StructArray in Cast Kernel #4908 [arrow]
There exists a ParquetRecordWriter proc macro in parquet_derive, but ParquetRecordReader is missing #4772 [parquet]

Fixed bugs:

Regression when serializing large json numbers #5038 [arrow]
RowSelection::intersection Produces Invalid RowSelection #5036 [parquet]
Incorrect comment on arrow::compute::kernels::sort::sort_to_indices #5029 [arrow]

Documentation updates:

chore: Update docs to refer to non deprecated function `partition` #5027 [arrow] (alamb)

Merged pull requests:

Parquet f32/f64 handle signed zeros in statistics #5048 [parquet] (Jefffrey)
Fix serialization of large integers in JSON (#5038) #5042 [arrow] (tustvold)
Fix RowSelection::intersection (#5036) #5041 [parquet] (tustvold)
Cast from integer/timestamp to timestamp/integer #5040 [arrow] (viirya)
doc: update comment on sort_to_indices to reflect correct ordering #5033 [arrow] (westonpace)
Support casting from integer to binary #5015 [arrow] (viirya)
Update tracing-log requirement from 0.1 to 0.2 #4998 [arrow] [arrow-flight] (dependabot[bot])
feat(flight-sql): Allow custom commands in get-flight-info #4997 [arrow] [arrow-flight] (amartins23)
MINOR
Support metadata in SchemaBuilder #4987 [arrow] (tustvold)
feat: support schema change by idx and reverse #4985 [arrow] (fansehep)
Bump actions/setup-node from 3 to 4 #4982 (dependabot[bot])
Add arrow_cast::base64 and document usage in arrow_json #4975 [arrow] (tustvold)
Add SchemaBuilder::remove (#4952) #4964 [arrow] (tustvold)
Add Field::remove(), Schema::remove(), and RecordBatch::remove_column() APIs #4959 [arrow] (Folyd)
Add RecordReader trait and proc macro to implement it for a struct #4773 [parquet] (Joseph-Rance)

48.0.0 (2023-10-18)

Full Changelog

Breaking changes:

Evaluate null_regex for string type in csv (now such values will be parsed as `Null` rather than `""`) #4942 [arrow] (haohuaijin)
fix(csv)!: infer null for empty column. #4910 [arrow] (kskalski)
feat: log headers/trailers in flight CLI + minor fixes #4898 [arrow] [arrow-flight] (crepererum)
fix(arrow-json)!: include null fields in schema inference with a type of Null #4894 [arrow] (kskalski)
Mark OnCloseRowGroup Send #4893 [parquet] (devinjdangelo)
Specialize Thrift Decoding ~40% Faster (#4891) #4892 [parquet] (tustvold)
Make ArrowRowGroupWriter Public and SerializedRowGroupWriter Send #4850 [parquet] (devinjdangelo)

Implemented enhancements:

Allow schema fields to merge with Null datatype #4901 [arrow]
Add option to FlightDataEncoder to always send dictionaries #4895 [arrow] [arrow-flight]
Rework Thrift Encoding / Decoding of Parquet Metadata #4891 [parquet]
Plans for supporting Extension Array to support Fixed shape tensor Array #4890
Implement Take for UnionArray #4882 [arrow]
Check precision overflow for casting floating to decimal #4865 [arrow]
Replace lexical #4774 [arrow]
Add read access to settings in csv::WriterBuilder #4735 [arrow]
Improve the performance of "DictionaryValue" row encoding #4712 [arrow] [arrow-flight]

Fixed bugs:

Should we make blank values and empty string to None in csv? #4939 [arrow]
FlightSQL
Loading page index breaks skipping of pages with nested types #4921 [parquet]
CSV schema inference assumes Utf8 for empty columns #4903 [arrow]
parquet: Field Ids are not read from a Parquet file without serialized arrow schema #4877 [parquet]
make_primitive_scalar function loses DataType Internal information #4851 [arrow]
StructBuilder doesn't handle nulls correctly for empty structs #4842 [arrow]
NullArray::is_null() returns false incorrectly #4835 [arrow]
cast_string_to_decimal should check precision overflow #4829 [arrow]
Null fields are omitted by infer_json_schema_from_seekable #4814 [arrow]

Closed issues:

Support for reading JSON Array to Arrow #4905 [arrow]

Merged pull requests:

Assume Pages Delimit Records When Offset Index Loaded (#4921) #4943 [parquet] (tustvold)
Update pyo3 requirement from 0.19 to 0.20 #4941 [arrow] (crepererum)
Add FileWriter schema getter #4940 [arrow] (haixuanTao)
feat: support parsing for parquet writer option #4938 [parquet] (fansehep)
Export SubstraitPlan structure in arrow_flight::sql (#4932) #4933 [arrow] [arrow-flight] (amartins23)
Update zstd requirement from 0.12.0 to 0.13.0 #4923 [parquet] [arrow] (dependabot[bot])
feat: add method for async read bloom filter #4917 [parquet] (hengfeiyang)
Minor: Clarify rationale for FlightDataEncoder API, add examples #4916 [arrow] [arrow-flight] (alamb)
Update regex-syntax requirement from 0.7.1 to 0.8.0 #4914 [arrow] (dependabot[bot])
feat: document & streamline flight SQL CLI #4912 [arrow] [arrow-flight] (crepererum)
Support Arbitrary JSON values in JSON Reader (#4905) #4911 [arrow] (tustvold)
Cleanup CSV WriterBuilder, Default to AutoSI Second Precision (#4735) #4909 [arrow] (tustvold)
Update proc-macro2 requirement from =1.0.68 to =1.0.69 #4907 [arrow] [arrow-flight] (dependabot[bot])
chore: add csv example #4904 [arrow] (fansehep)
feat(schema): allow null fields to be merged with other datatypes #4902 [arrow] (kskalski)
Update proc-macro2 requirement from =1.0.67 to =1.0.68 #4900 [arrow] [arrow-flight] (dependabot[bot])
Add option to FlightDataEncoder to always resend batch dictionaries #4896 [arrow] [arrow-flight] (alexwilcoxson-rel)
Fix integration tests #4889 (tustvold)
Support Parsing Avro File Headers #4888 (tustvold)
Support parquet bloom filter length #4885 [parquet] (letian-jiang)
Replace lz4 with lz4_flex Allowing Compilation for WASM #4884 [parquet] [arrow] (tustvold)
Implement Take for UnionArray #4883 [arrow] (avantgardnerio)
Update tonic-build requirement from =0.10.1 to =0.10.2 #4881 [arrow] [arrow-flight] (dependabot[bot])
parquet: Read field IDs from Parquet Schema #4878 [parquet] (Samrose-Ahmed)
feat: improve flight CLI error handling #4873 [arrow] [arrow-flight] (crepererum)
Support Encoding Parquet Columns in Parallel #4871 [parquet] (tustvold)
Check precision overflow for casting floating to decimal #4866 [arrow] (viirya)
Make align_buffers as public API #4863 [arrow] (viirya)
Enable new integration tests (#4828) #4862 (tustvold)
Faster Serde Integration ~80% faster #4861 [arrow] (tustvold)
fix: make_primitive_scalar bug #4852 [arrow] (JasonLi-cn)
Update tonic-build requirement from =0.10.0 to =0.10.1 #4846 [arrow] [arrow-flight] (dependabot[bot])
Allow Constructing Non-Empty StructArray with no Fields (#4842) #4845 [arrow] (tustvold)
Refine documentation to Array::is_null #4838 [arrow] (alamb)
fix: add missing precision overflow checking for cast_string_to_decimal #4830 [arrow] (jonahgao)

47.0.0 (2023-09-19)

Full Changelog

Breaking changes:

Make FixedSizeBinaryArray value_data return a reference #4820 [arrow]
Update prost to v0.12.1 #4825 [arrow] [arrow-flight] (tustvold)
feat: FixedSizeBinaryArray::value_data return reference #4821 [arrow] (wjones127)
Stateless Row Encoding / Don't Preserve Dictionaries in RowConverter (#4811) #4819 [arrow] [arrow-flight] (tustvold)
fix: entries field is non-nullable #4808 [arrow] (wjones127)
Fix flight sql do put handling, add bind parameter support to FlightSQL cli client #4797 [arrow] [arrow-flight] (suremarc)
Remove unused dyn_cmp_dict feature #4766 [arrow] (tustvold)
Add underlying std::io::Error to IoError and add IpcError variant #4726 [arrow] [arrow-flight] (alexandreyc)

Implemented enhancements:

Row Format Adaptive Block Size #4812 [arrow]
Stateless Row Conversion #4811 [arrow] [arrow-flight]
Add option to specify custom null values for CSV reader #4794 [arrow]
parquet::record::RowIter cannot be customized with batch_size and defaults to 1024 #4782 [parquet]
DynScalar abstraction (something that makes it easy to create scalar `Datum`s) #4781 [arrow]
Datum is not exported as part of arrow (it is only exported in `arrow_array`) #4780 [arrow]
Scalar is not exported as part of arrow (it is only exported in `arrow_array`) #4779 [arrow]
Support IntoPyArrow for impl RecordBatchReader #4730 [arrow]
Datum Based String Kernels #4595 [arrow] [arrow-flight]

Fixed bugs:

MapArray::new_from_strings creates nullable entries field #4807 [arrow]
pyarrow module can't roundtrip tensor arrays #4805 [arrow]
concat_batches errors with "schema mismatch" error when only metadata differs #4799 [arrow]
panic in cmp kernels with DictionaryArrays: Option::unwrap() on a None value #4788 [arrow]
stream ffi panics if schema metadata values aren't valid utf8 #4750 [arrow]
Regression: Incorrect Sorting of *ListArray in 46.0.0 #4746 [arrow]
Row is no longer comparable after reuse #4741 [arrow]
DoPut FlightSQL handler inadvertently consumes schema at start of Request<Streaming<FlightData>> #4658
Return error when converting schema #4752 [arrow] (wjones127)
Implement PyArrowType for Box<dyn RecordBatchReader + Send> #4751 [arrow] (wjones127)

Closed issues:

Building arrow-rust for target wasm32-wasi failed to compile packed_simd_2 #4717

Merged pull requests:

Respect FormatOption::nulls for NullArray #4836 [arrow] (tustvold)
Fix merge_dictionary_values in selection kernels #4833 [arrow] (tustvold)
Fix like scalar null #4832 [arrow] (tustvold)
More chrono deprecations #4822 [arrow] (tustvold)
Adaptive Row Block Size (#4812) #4818 [arrow] (tustvold)
Update proc-macro2 requirement from =1.0.66 to =1.0.67 #4816 [arrow] [arrow-flight] (dependabot[bot])
Do not check schema for equality in concat_batches #4815 [arrow] (alamb)
fix: export record batch through stream #4806 [arrow] (wjones127)
Improve CSV Reader Benchmark Coverage of Small Primitives #4803 [arrow] (tustvold)
csv: Add option to specify custom null values #4795 [arrow] (vrongmeal)
Expand docstring and add example to Scalar #4793 [arrow] (alamb)
Re-export array crate root (#4780) (#4779) #4791 [arrow] (tustvold)
Fix DictionaryArray::normalized_keys (#4788) #4789 [arrow] (tustvold)
Allow custom tree builder for parquet::record::RowIter #4783 [parquet] (YuraKotov)
Bump actions/checkout from 3 to 4 #4767 (dependabot[bot])
fix: avoid panic if offset index not exists. #4761 [parquet] (RinChanNOWWW)
Relax constraints on PyArrowType #4757 (tustvold)
Chrono deprecations #4748 [arrow] (tustvold)
Fix List Sorting, Revert Removal of Rank Kernels #4747 [arrow] (tustvold)
Clear row buffer before reuse #4742 [arrow] (yjshen)
Datum based like kernels (#4595) #4732 [arrow] [arrow-flight] (tustvold)
feat: expose DoGet response headers & trailers #4727 [arrow] [arrow-flight] (crepererum)
Cleanup length and bit_length kernels #4718 [arrow] (tustvold)

46.0.0 (2023-08-21)

Full Changelog

Breaking changes:

API improvement: batches_to_flight_data forces clone #4656 [arrow]
Add AnyDictionary Abstraction and Take ArrayRef in DictionaryArray::with_values #4707 [arrow] (tustvold)
Cleanup parquet type builders #4706 [parquet] (tustvold)
Take kernel dyn Array #4705 [arrow] (tustvold)
Improve ergonomics of Scalar #4704 [arrow] (tustvold)
Datum based comparison kernels (#4596) #4701 [parquet] [arrow] [arrow-flight] (tustvold)
Improve Array Logical Nullability #4691 [parquet] [arrow] (tustvold)
Validate ArrayData Buffer Alignment and Automatically Align IPC buffers (#4255) #4681 [arrow] (tustvold)
More intuitive bool-to-string casting #4666 [arrow] (fsdvh)
enhancement: batches_to_flight_data use a schema ref as param. #4665 [arrow] [arrow-flight] (jackwener)
fix: from_thrift avoid panic when stats is invalid. #4642 [parquet] (jackwener)
bug: Add some missing field in row group metadata: ordinal, total co… #4636 [parquet] (liurenjie1024)
Remove deprecated limit kernel #4597 [arrow] (tustvold)

Implemented enhancements:

parquet: support setting the field_id with an ArrowWriter #4702 [parquet]
Support references in i256 arithmetic ops #4694 [arrow]
Precision-Loss Decimal Arithmetic #4664 [arrow]
Faster i256 Division #4663 [arrow]
Support concat_batches for 0 columns #4661 [arrow]
filter_record_batch should support filtering record batch without columns #4647 [arrow]
Improve speed of lexicographical_partition_ranges #4614 [arrow]
object_store: multipart ranges for HTTP #4612
Add Rank Function #4606 [arrow]
Datum Based Comparison Kernels #4596 [parquet] [arrow] [arrow-flight]
Convenience method to create DataType::List correctly #4544 [arrow]
Remove Deprecated Arithmetic Kernels #4481 [arrow]
Equality kernel where null==null gives true #4438 [arrow]

Fixed bugs:

Parquet ArrowWriter Ignores Nulls in Dictionary Values #4690 [parquet] [arrow]
Schema Nullability Validation Fails to Account for Dictionary Nulls #4689 [parquet] [arrow]
Comparison Kernels Ignore Nulls in Dictionary Values #4688 [parquet] [arrow]
Casting List to String Ignores Format Options #4669 [arrow]
Double free in C Stream Interface #4659 [arrow]
CI Failing On Packed SIMD #4651 [arrow]
RowInterner::size() much too low for high cardinality dictionary columns #4645 [arrow]
Decimal PrimitiveArray change datatype after try_unary #4644
Better explanation in docs for Dictionary field encoding using RowConverter #4639 [arrow]
List(FixedSizeBinary) array equality check may return wrong result #4637 [arrow]
arrow::compute::nullif panics if NullArray is provided #4634 [arrow]
Empty lists in FixedSizeListArray::try_new is not handled #4623 [arrow]
Bounds checking in MutableBuffer::set_null_bits can be bypassed #4620 [arrow]
TypedDictionaryArray Misleading Null Behaviour #4616 [parquet] [arrow]
bug: Parquet writer missing row group metadata fields such as compressed_size, file offset. #4610 [parquet]
new_null_array generates an invalid union array #4600 [arrow]
Footer parsing fails for very large parquet file. #4592 [parquet]
bug(parquet): Disabling global statistics but enabling for particular column breaks reading #4587 [parquet]
arrow::compute::concat panics for dense union arrays with non-trivial type IDs #4578 [arrow]

Closed issues:

```
object_store
```

Merged pull requests:

Add distinct kernels \#960 \#4438 #4716 [arrow] (tustvold)
Update parquet object_store 0.7 #4715 [parquet] (tustvold)
Support Field ID in ArrowWriter \#4702 #4710 [parquet] (tustvold)
Remove rank kernels #4703 [arrow] (tustvold)
Support references in i256 arithmetic ops #4692 [arrow] (viirya)
Cleanup DynComparator \#2654 #4687 [arrow] (tustvold)
Separate metadata fetch from ArrowReaderBuilder construction \#4674 #4676 [parquet] (tustvold)
cleanup some assert() with error propagation #4673 [parquet] (zeevm)
Faster i256 Division 2-100x \#4663 #4672 [arrow] (tustvold)
Fix MSRV CI #4671 (tustvold)
Fix equality of nested nullable FixedSizeBinary \#4637 #4670 [arrow] (tustvold)
Use ArrayFormatter in cast kernel #4668 [arrow] (tustvold)
Minor: Improve API docs for FlightSQL metadata builders #4667 [arrow] [arrow-flight] (alamb)
Support concat_batches for 0 columns #4662 [arrow] (Dandandan)
fix ownership of c stream error #4660 [arrow] (wjones127)
Minor: Fix illustration for dict encoding #4657 [arrow] (JayjeetAtGithub)
minor: move comment to the correct location #4655 [arrow] (jackwener)
Update packed_simd and run miri tests on simd code #4654 [arrow] (jhorstmann)
impl From<Vec<T>> for BufferBuilder and MutableBuffer #4650 [arrow] (mbrobbel)
Filter record batch with 0 columns #4648 [arrow] (Dandandan)
Account for child Bucket size in OrderPreservingInterner #4646 [arrow] (alamb)
Implement Default,Extend and FromIterator for BufferBuilder #4638 [arrow] (mbrobbel)
fix(select): handle NullArray in nullif #4635 [arrow] (kawadakk)
Move BufferBuilder to arrow-buffer #4630 [arrow] (mbrobbel)
allow zero sized empty fixed #4626 [arrow] (smiklos)
fix: compute_dictionary_mapping use wrong offsetSize #4625 [arrow] (jackwener)
impl FromIterator for MutableBuffer #4624 [arrow] (mbrobbel)
expand docs for FixedSizeListArray #4622 [arrow] (smiklos)
fix(buffer): panic on end index overflow in MutableBuffer::set_null_bits #4621 [arrow] (kawadakk)
impl Default for arrow_buffer::buffer::MutableBuffer #4619 [arrow] (mbrobbel)
Minor: improve docs and add example for lexicographical_partition_ranges #4615 [arrow] (alamb)
Cleanup sort #4613 [arrow] (tustvold)
Add rank function \#4606 #4609 [arrow] (tustvold)
Add more docs and examples for ListArray and OffsetsBuffer #4607 [arrow] (alamb)
Simplify dictionary sort #4605 [arrow] (tustvold)
Consolidate sort benchmarks #4604 [arrow] (tustvold)
Don't Reorder Nulls in sort_to_indices \#4545 #4603 [arrow] (tustvold)
fix(data): create child arrays of correct length when building a sparse union null array #4601 [arrow] (kawadakk)
Use u32 metadata_len when parsing footer of parquet. #4599 [parquet] (Berrysoft)
fix(data): map type ID to child index before indexing a union child array #4598 [arrow] (kawadakk)
Remove deprecated arithmetic kernels \#4481 #4594 [arrow] (tustvold)
Test Disabled Page Statistics \#4587 #4589 [parquet] (tustvold)
Cleanup ArrayData::buffers #4583 [arrow] (tustvold)
Use contains_nulls in ArrayData equality of byte arrays #4582 [arrow] (tustvold)
Vectorized lexicographical_partition_ranges ~80% faster #4575 [arrow] (tustvold)
chore: add datatype new_list #4561 [arrow] (fansehep)

45.0.0 (2023-07-30)

Full Changelog

Breaking changes:

Fix timezoned timestamp arithmetic #4546 [arrow] (alexandreyc)

Implemented enhancements:

Use FormatOptions in Const Contexts #4580 [arrow]
Human Readable Duration Display #4554 [arrow]
BooleanBuilder: Add validity_slice method for accessing validity bits #4535 [arrow]
Support FixedSizeListArray for length kernel #4517 [arrow]
RowConverter::convert that targets an existing Rows #4479 [arrow]

Fixed bugs:

Panic assertion failed: idx < self.len when casting DictionaryArrays with nulls #4576 [arrow]
arrow-arith is_null is buggy with NullArray #4565 [arrow]
Incorrect Interval to Duration Casting #4553 [arrow]
Too large validity buffer pre-allocation in FixedSizeListBuilder::new #4549 [arrow]
Like with wildcards fails to match fields with newlines. #4547 [arrow]
Timestamp Interval Arithmetic Ignores Timezone #4457 [arrow]

Merged pull requests:

refactor: simplify hour_dyn() with time_fraction_dyn() #4588 [arrow] (jackwener)
Move from_iter_values to GenericByteArray #4586 [arrow] (tustvold)
Mark GenericByteArray::new_unchecked unsafe #4584 [arrow] (tustvold)
Configurable Duration Display #4581 [arrow] (tustvold)
Fix take_bytes Null and Overflow Handling \#4576 #4579 [arrow] (tustvold)
Move chrono-tz arithmetic tests to integration #4571 [arrow] (tustvold)
Write Page Offset Index For All-Nan Pages #4567 [parquet] (MachaelLee)
support NullArray in arith/boolean kernel #4566 [arrow] (smiklos)
Remove Sync from arrow-flight example #4564 [arrow] [arrow-flight] (tustvold)
Fix interval to duration casting \#4553 #4562 [arrow] (tustvold)
docs: fix wrong parameter name #4559 [parquet] (SteveLauC)
Fix FixedSizeListBuilder capacity \#4549 #4552 [arrow] (tustvold)
docs: fix wrong inline code snippet in parquet document #4550 [parquet] (SteveLauC)
fix multiline wildcard likes fixes \#4547 #4548 [arrow] (nl5887)
Provide default is_empty impl for arrow::array::ArrayBuilder #4543 [arrow] (mbrobbel)
Add RowConverter::append \#4479 #4541 [arrow] (tustvold)
Clarify GenericColumnReader::read_records #4540 [parquet] (tustvold)
Initial loongarch port #4538 [arrow] (xiangzhai)
Update proc-macro2 requirement from =1.0.64 to =1.0.66 #4537 [arrow] [arrow-flight] (dependabot[bot])
add a validity slice access for boolean array builders #4536 [arrow] (ChristianBeilschmidt)
use new num version instead of explicit num-complex dependency #4532 [arrow] (mwlon)
feat: Support FixedSizeListArray for length kernel #4520 [arrow] (Weijun-H)

44.0.0 (2023-07-14)

Full Changelog

Breaking changes:

Use Parser for cast kernel \#4512 #4513 [arrow] (tustvold)
Add Datum based arithmetic kernels \#3999 #4465 [arrow] (tustvold)

Implemented enhancements:

eq_dyn_binary_scalar should support FixedSizeBinary types #4491 [arrow]
Port Tests from Deprecated Arithmetic Kernels #4480 [arrow]
Implement RecordBatchReader for Boxed trait object #4474 [arrow]
Support Date - Date kernel #4383 [arrow]
Default FlightSqlService Implementations #4372 [arrow] [arrow-flight]

Fixed bugs:

Parquet: AsyncArrowWriter to a file corrupts the footer for large columns #4526 [parquet]
```
object_store
```
Cannot cast string '2021-01-02' to value of Date64 type #4512 [arrow]
Incorrect Interval Subtraction #4489 [arrow]
Interval Negation Incorrect #4488 [arrow]
Parquet: AsyncArrowWriter inner buffer is not correctly limited and causes OOM #4477 [parquet]

Merged pull requests:

Fix AsyncArrowWriter flush for large buffer sizes \#4526 #4527 [parquet] (tustvold)
Cleanup cast_primitive_to_list #4511 [arrow] (tustvold)
Bump actions/upload-pages-artifact from 1 to 2 #4508 (dependabot[bot])
Support Date - Date \#4383 #4504 [arrow] (tustvold)
Bump actions/labeler from 4.2.0 to 4.3.0 #4501 (dependabot[bot])
Update proc-macro2 requirement from =1.0.63 to =1.0.64 #4500 [arrow] [arrow-flight] (dependabot[bot])
Add negate kernels \#4488 #4494 [arrow] (tustvold)
Add Datum Arithmetic tests, Fix Interval Subtraction \#4480 #4493 [arrow] (tustvold)
support FixedSizeBinary types in eq_dyn_binary_scalar/neq_dyn_binary_scalar #4492 [arrow] (maxburke)
Add default implementations to the FlightSqlService trait #4485 [arrow] [arrow-flight] (rossjones)
add num-complex requirement #4482 [arrow] (mwlon)
fix incorrect buffer size limiting in parquet async writer #4478 [parquet] (richox)
feat: support RecordBatchReader on boxed trait objects #4475 [arrow] (wjones127)
Improve in-place primitive sorts by 13-67% #4473 [arrow] (psvri)
Add Scalar/Datum abstraction \#1047 #4393 [arrow] (tustvold)

43.0.0 (2023-06-30)

Full Changelog

Breaking changes:

Simplify ffi import/export #4447 [arrow] (Virgiel)
Return Result from Parquet Row APIs #4428 [parquet] (zeevm)
Remove Binary Dictionary Arithmetic Support #4407 [arrow] (tustvold)

Implemented enhancements:

Request: a way to copy a Row to Rows #4466 [arrow]
Reuse schema when importing from FFI #4444 [arrow]
```
FlightSQL
```
Support NullBuilder #4429 [arrow]

Fixed bugs:

Regression in parquet 42.0.0: Bad parquet column indexes for All Null Columns, resulting in Parquet error: StructArrayReader out of sync on read #4459 [parquet]
Regression in 42.0.0: Parsing fractional intervals without leading 0 is not supported #4424 [arrow]

Documentation updates:

doc: deploy crate docs to GitHub pages #4436 [parquet] [arrow] (xxchan)

Merged pull requests:

Append Row to Rows \#4466 #4470 [arrow] (tustvold)
feat(flight-sql): Allow implementations of FlightSqlService to handle custom actions and commands #4463 [arrow] [arrow-flight] (amartins23)
Docs: Add clearer API doc links #4461 [parquet] [arrow] [arrow-flight] (alamb)
Fix empty offset index for all null columns \#4459 #4460 [parquet] (tustvold)
Bump peaceiris/actions-gh-pages from 3.9.2 to 3.9.3 #4455 (dependabot[bot])
Convince the compiler to auto-vectorize the range check in parquet DictionaryBuffer #4453 [parquet] (jhorstmann)
fix docs deployment #4452 [parquet] [arrow] (xxchan)
Update indexmap requirement from 1.9 to 2.0 #4451 [arrow] (dependabot[bot])
Update proc-macro2 requirement from =1.0.60 to =1.0.63 #4450 [arrow] [arrow-flight] (dependabot[bot])
Bump actions/deploy-pages from 1 to 2 #4449 (dependabot[bot])
Revise error message in From<Buffer> for ScalarBuffer #4446 [arrow] (viirya)
minor: remove useless mut #4443 [parquet] [arrow] (jackwener)
unify substring for binary&utf8 #4442 [arrow] (jackwener)
Casting fixedsizelist to list/largelist #4433 [arrow] (jayzhan211)
feat: support NullBuilder #4430 [arrow] (izveigor)
Remove Float64 -> Float32 cast in IPC Reader #4427 [arrow] (ming08108)
Parse intervals like .5 the same as 0.5 #4425 [arrow] (alamb)
feat: add strict mode to json reader #4421 [arrow] (blinkseb)
Add DictionaryArray::occupancy #4415 [arrow] (tustvold)

42.0.0 (2023-06-16)

Full Changelog

Breaking changes:

Remove 64-bit to 32-bit Cast from IPC Reader #4412 [arrow] (ming08108)
Truncate Min/Max values in the Column Index #4389 [parquet] (AdamGS)
feat(flight): harmonize server metadata APIs #4384 [arrow] [arrow-flight] (roeap)
Move record delimiting into ColumnReader \#4365 #4376 [parquet] (tustvold)
Changed array_to_json_array to take &dyn Array #4370 [arrow] (dadepo)
Make PrimitiveArray::with_timezone consuming #4366 [parquet] [arrow] (tustvold)

Implemented enhancements:

Add doc example of constructing a MapArray #4385 [arrow]
Support millisecond and microsecond functions #4374 [arrow]
Changed array_to_json_array to take &dyn Array #4369 [arrow]
compute::ord kernel for getting min and max of two scalar/array values #4347 [arrow]
Release 41.0.0 of arrow/arrow-flight/parquet/parquet-derive #4346
Refactor CAST tests to use new cast array syntax #4336 [arrow]
pass bytes directly to parquet's KeyValue #4317
PyArrow conversions could return TypeError if provided incorrect Python type #4312 [arrow]
Have array_to_json_array support Map #4297 [arrow]
FlightSQL: Add helpers to create CommandGetXdbcTypeInfo responses `XdbcInfoValue` and builders #4257 [arrow] [arrow-flight]
Have array_to_json_array support FixedSizeList #4248 [arrow]
Truncate ColumnIndex ByteArray Statistics #4126 [parquet]
Arrow compute kernel regards selection vector #4095 [arrow]

Fixed bugs:

Wrongly calculated data compressed length in IPC writer #4410 [arrow]
Take Kernel Handles Nullable Indices Incorrectly #4404 [arrow]
StructBuilder::new Doesn't Validate Builder DataTypes #4397 [arrow]
Parquet error: Not all children array length are the same! when using RowSelection to read a parquet file #4396
RecordReader::skip_records Is Incorrect for Repeated Columns #4368 [parquet]
List-of-String Array panics in the presence of row filters #4365 [parquet]
Fail to read block compressed gzip files with parquet-fromcsv #4173 [parquet]

Closed issues:

Have a parquet file not able to be deduped via arrow-rs, complains about Decimal precision? #4356
Question: Could we move dict_id, dict_is_ordered into DataType? #4325

Merged pull requests:

Fix reading gzip file with multiple gzip headers in parquet-fromcsv. #4419 [parquet] (ghuls)
Cleanup nullif kernel #4416 [arrow] (tustvold)
Fix bug in IPC logic that determines if the buffer should be compressed or not #4411 [arrow] (lwpyr)
Faster unpacking of Int32Type dictionary #4406 [arrow] (tustvold)
Improve take kernel performance on primitive arrays, fix bad null index handling \#4404 #4405 [arrow] (tustvold)
More take benchmarks #4403 [arrow] (tustvold)
Add BooleanBuffer::new_unset and BooleanBuffer::new_set and BooleanArray::new_null constructors #4402 [arrow] (tustvold)
Add PrimitiveBuilder type constructors #4401 [arrow] (tustvold)
StructBuilder Validate Child Data \#4397 #4400 [arrow] (tustvold)
Faster UTF-8 truncation #4399 [parquet] (tustvold)
Minor: Derive Hash impls for CastOptions and FormatOptions #4395 [arrow] (alamb)
Fix typo in README #4394 [arrow] [arrow-flight] (okue)
Improve parquet WriterProperties and ReaderProperties docs #4392 [parquet] (alamb)
Cleanup downcast macros #4391 [arrow] (tustvold)
Update proc-macro2 requirement from =1.0.59 to =1.0.60 #4388 [arrow] [arrow-flight] (dependabot[bot])
Consolidate ByteArray::from_iterator #4386 [arrow] (tustvold)
Add MapArray constructors and doc example #4382 [arrow] (tustvold)
Documentation Improvements #4381 [arrow] (tustvold)
Add NullBuffer and BooleanBuffer From conversions #4380 [arrow] (tustvold)
Add more examples of constructing Boolean, Primitive, String, and Decimal Arrays, and From impl for i256 #4379 [arrow] (alamb)
Add ListArrayReader benchmarks #4378 [parquet] (tustvold)
Update comfy-table requirement from 6.0 to 7.0 #4377 [arrow] (dependabot[bot])
feat: Add microsecond and millisecond kernels #4375 [arrow] (izveigor)
Update hashbrown requirement from 0.13 to 0.14 #4373 [parquet] [arrow] (dependabot[bot])
minor: use as_boolean to resolve TODO #4367 [arrow] (jackwener)
Have array_to_json_array support MapArray #4364 [arrow] (dadepo)
deprecate: as_decimal_array #4363 [arrow] (izveigor)
Add support for FixedSizeList in array_to_json_array #4361 [arrow] (dadepo)
refact: use as_primitive in cast.rs test #4360 [arrow] (Weijun-H)
feat(flight): add xdbc type info helpers #4359 [arrow] [arrow-flight] (roeap)
Minor: float16 to json #4358 [arrow] (izveigor)
Raise TypeError on PyArrow import #4316 [arrow] (wjones127)
Arrow Cast: Fixed Point Arithmetic for Interval Parsing #4291 [arrow] (mr-brobot)

41.0.0 (2023-06-02)

Full Changelog

Breaking changes:

Rename list contains kernels to in_list \#4289 #4342 [parquet] [arrow] (tustvold)
Move BooleanBufferBuilder and NullBufferBuilder to arrow_buffer #4338 [arrow] (tustvold)
Add separate row_count and level_count to PageMetadata \#4321 #4326 [parquet] (tustvold)
Treat legacy TIMESTAMP_X converted types as UTC #4309 [parquet] (sergiimk)
Simplify parquet PageIterator #4306 [parquet] (tustvold)
Add Builder style APIs and docs for FlightData, FlightInfo, FlightEndpoint, Location and Ticket #4294 [arrow] [arrow-flight] (alamb)
Make GenericColumnWriter Send #4287 [parquet] (tustvold)
feat: update flight-sql to latest specs #4250 [arrow] [arrow-flight] (roeap)
feat(api!): make ArrowArrayStreamReader Send #4232 [arrow] (wjones127)

Implemented enhancements:

Make SerializedRowGroupReader::new() Public #4330 [parquet]
Speed up i256 division and remainder operations #4302 [arrow]
export function parquet_to_array_schema_and_fields #4298 [parquet]
FlightSQL: add helpers to create CommandGetCatalogs, CommandGetSchemas, and CommandGetTables requests #4295 [arrow] [arrow-flight]
Make ColumnWriter Send #4286 [parquet]
Add Builder for FlightInfo to make it easier to create new requests #4281 [arrow] [arrow-flight]
Support Writing/Reading Decimal256 to/from Parquet #4264 [parquet]
FlightSQL: Add helpers to create CommandGetSqlInfo responses `SqlInfoValue` and builders #4256 [arrow] [arrow-flight]
Update flight-sql implementation to latest specs #4249 [arrow] [arrow-flight]
Make ArrowArrayStreamReader Send #4222 [arrow]
Support writing FixedSizeList to Parquet #4214 [parquet]
Cast between Intervals #4181 [arrow]
Splice Parquet Data #4155 [parquet]
CSV Schema More Flexible Timestamp Inference #4131 [arrow]

Fixed bugs:

Doc for arrow_flight::sql is missing enums that are Xdbc related #4339 [arrow] [arrow-flight]
concat_batches panics with total_len <= bit_len assertion for records with lists #4324 [arrow]
Incorrect PageMetadata Row Count returned for V1 DataPage #4321 [parquet]
```
parquet
```
ambiguous glob re-exports of contains_utf8 #4289 [parquet] [arrow]
flight_sql_client --header "key: value" yields a value with a leading whitespace #4270 [arrow] [arrow-flight]
Casting Timestamp to date is off by one day for dates before 1970-01-01 #4211 [arrow]

Merged pull requests:

Don't infer 16-byte decimal as decimal256 #4349 [parquet] (tustvold)
Fix MutableArrayData::extend_nulls \#1230 #4343 [arrow] (tustvold)
Update FlightSQL metadata locations, names and docs #4341 [arrow] [arrow-flight] (alamb)
chore: expose Xdbc related FlightSQL enums #4340 [arrow] [arrow-flight] (appletreeisyellow)
Update pyo3 requirement from 0.18 to 0.19 #4335 [arrow] (dependabot[bot])
Skip unnecessary null checks in MutableArrayData #4333 [arrow] (tustvold)
feat: add read parquet by custom rowgroup examples #4332 [parquet] (sundy-li)
Make SerializedRowGroupReader::new() public #4331 [parquet] (burmecia)
Don't split record across pages \#3680 #4327 [parquet] (tustvold)
fix date conversion if timestamp below unixtimestamp #4323 [arrow] (comphead)
Short-circuit on exhausted page in skip_records #4320 [parquet] (tustvold)
Handle trailing padding when skipping repetition levels \#3911 #4319 [parquet] (tustvold)
Use page_size consistently, deprecate pagesize in parquet WriterProperties #4313 [parquet] (alamb)
Add roundtrip tests for Decimal256 and fix issues \#4264 #4311 [parquet] (tustvold)
Expose page-level arrow reader API \#4298 #4307 [parquet] (tustvold)
Speed up i256 division and remainder operations #4303 [arrow] (viirya)
feat(flight): support int32_to_int32_list_map in sql infos #4300 [arrow] [arrow-flight] (roeap)
feat(flight): add helpers to handle CommandGetCatalogs, CommandGetSchemas, and CommandGetTables requests #4296 [arrow] [arrow-flight] (roeap)
Improve docs and tests for `SqlInfoList` #4293 [arrow] [arrow-flight] (alamb)
minor: fix arrow_row docs.rs links #4292 [arrow] (roeap)
Update proc-macro2 requirement from =1.0.58 to =1.0.59 #4290 [arrow] [arrow-flight] (dependabot[bot])
Improve ArrowWriter memory usage: Buffer Pages in ArrowWriter instead of RecordBatch \#3871 #4280 [parquet] (tustvold)
Minor: Add more docstrings in arrow-flight #4279 [arrow] [arrow-flight] (alamb)
Add Debug impls for ArrowWriter and SerializedFileWriter #4278 [parquet] (alamb)
Expose RecordBatchWriter to arrow crate #4277 [arrow] (alexandreyc)
Update criterion requirement from 0.4 to 0.5 #4275 [parquet] [arrow] (dependabot[bot])
Add parquet-concat #4274 [parquet] (tustvold)
Convert FixedSizeListArray to GenericListArray #4273 [arrow] (tustvold)
feat: support 'Decimal256' for parquet #4272 [parquet] (Weijun-H)
Strip leading whitespace from flight_sql_client custom header values #4271 [arrow] [arrow-flight] (mkmik)
Add Append Column API \#4155 #4269 [parquet] (tustvold)
Derive Default for WriterProperties #4268 [parquet] (tustvold)
Parquet Reader/writer for fixed-size list arrays #4267 [parquet] (dexterduck)
feat(flight): add sql-info helpers #4266 [arrow] [arrow-flight] (roeap)
Convert parquet metadata back to builders #4265 [parquet] (tustvold)
Add constructors for FixedSize array types \#3879 #4263 [arrow] (tustvold)
Extract IPC ArrayReader struct #4259 [arrow] (tustvold)
Update object_store requirement from 0.5 to 0.6 #4258 [parquet] (dependabot[bot])
Support Absolute Timestamps in CSV Schema Inference \#4131 #4217 [arrow] (tustvold)
feat: cast between Intervals #4182 [arrow] (izveigor)

40.0.0 (2023-05-19)

Full Changelog

Breaking changes:

Prefetch page index \#4090 #4216 [parquet] (tustvold)
Add RecordBatchWriter trait and implement it for CSV, JSON, IPC and P… #4206 [parquet] [arrow] (alexandreyc)
Remove powf_scalar kernel #4187 [arrow] (tustvold)
Allow format specification in cast #4169 [arrow] (parthchandra)

Implemented enhancements:

ObjectStore with_url Should Handle Path #4199
Support Interval +/- Interval #4178 [arrow]
```
parquet
```
Allow cast to take in a format specification #4168 [arrow]
Support extended pow arithmetic #4166 [arrow]
Preload page index for async ParquetObjectReader #4090 [parquet]

Fixed bugs:

Subtracting Timestamp from Timestamp should produce a Duration not `Timestamp` #3964 [arrow]

Merged pull requests:

Arrow Arithmetic: Subtract timestamps #4244 [arrow] (mr-brobot)
Update proc-macro2 requirement from =1.0.57 to =1.0.58 #4236 [arrow] [arrow-flight] (dependabot[bot])
Fix Nightly Clippy Lints #4233 [arrow] (tustvold)
Minor: use all primitive types in test_layouts #4229 [arrow] (izveigor)
Add close method to RecordBatchWriter trait #4228 [parquet] [arrow] (alexandreyc)
Update proc-macro2 requirement from =1.0.56 to =1.0.57 #4219 [arrow] [arrow-flight] (dependabot[bot])
Feat docs #4215 [parquet] [arrow] (Folyd)
feat: Support bitwise and boolean aggregate functions #4210 [arrow] (izveigor)
Document how to sort a RecordBatch #4204 [arrow] (tustvold)
Fix incorrect cast Timestamp with Timezone #4201 [arrow] (aprimadi)
Add implementation of RecordBatchReader for CSV reader #4195 [arrow] (alexandreyc)
Add Sliced ListArray test \#3748 #4186 [arrow] (tustvold)
refactor: simplify can_cast_types code. #4185 [arrow] (jackwener)
Minor: support new types in struct_builder.rs #4177 [arrow] (izveigor)
feat: add compression info to print_column_chunk_metadata() #4176 [parquet] (SteveLauC)

39.0.0 (2023-05-05)

Full Changelog

Breaking changes:

Allow creating unbuffered streamreader #4165 [arrow] (ming08108)
Cleanup ChunkReader \#4118 #4156 [parquet] (tustvold)
Remove Type from NativeIndex #4146 [parquet] (tustvold)
Don't Duplicate Offset Index on RowGroupMetadata #4142 [parquet] (tustvold)
Return BooleanBuffer from BooleanBufferBuilder #4140 [parquet] [arrow] (tustvold)
Cleanup CSV schema inference \#4129 \#4130 #4133 [parquet] [arrow] (tustvold)
Remove deprecated parquet ArrowReader #4125 [parquet] (tustvold)
refactor: construct StructArray w/ FieldRef #4116 [parquet] [arrow] (crepererum)
Ignore Field Metadata in equals_datatype for Dictionary, RunEndEncoded, Map and Union #4111 [arrow] (izveigor)
Add StructArray Constructors \#3879 #4064 [arrow] (tustvold)

Implemented enhancements:

Release 39.0.0 of arrow/arrow-flight/parquet/parquet-derive next release after 38.0.0 #4170 [arrow] [arrow-flight]
Fixed point decimal multiplication for DictionaryArray #4135 [arrow]
Remove Seek Requirement from CSV ReaderBuilder #4130 [parquet] [arrow]
Inconsistent CSV Inference and Parsing DateTime Handling #4129 [parquet] [arrow]
Support accessing ipc Reader/Writer inner by reference #4121
Add Type Declarations for All Primitive Tensors and Buffer Builders #4112 [arrow]
Support Interval + Timestamp and Interval + Date in addition to Timestamp + Interval and Date + Interval #4094 [arrow]
Enable setting FlightDescriptor on FlightDataEncoderBuilder #3855 [arrow] [arrow-flight]

Fixed bugs:

Parquet Page Index Reader Assumes Consecutive Offsets #4149 [parquet]
Equality of nested data types #4110 [arrow]

Documentation updates:

Improve Documentation of Parquet ChunkReader #4118

Closed issues:

add specific error log for empty JSON array #4105 [arrow]

Merged pull requests:

Prep for 39.0.0 #4171 [arrow] [arrow-flight] (iajoiner)
Support Compression in parquet-fromcsv #4160 [parquet] (suxiaogang223)
feat: support bitwise shift left/right with scalars #4159 [arrow] (izveigor)
Cleanup reading page index \#4149 \#4090 #4151 [parquet] (tustvold)
feat: support bitwise shift left/right #4148 [arrow] (Weijun-H)
Don't hardcode port in FlightSQL tests #4145 [arrow] [arrow-flight] (tustvold)
Better flight SQL example code #4144 [arrow] [arrow-flight] (sundy-li)
chore: clean the code by using as_primitive #4143 [arrow] (Weijun-H)
docs: fix the wrong ln command in CONTRIBUTING.md #4139 (SteveLauC)
Infer Float64 for JSON Numerics Beyond Bounds of i64 #4138 [arrow] (SteveLauC)
Support fixed point multiplication for DictionaryArray of Decimals #4136 [arrow] (viirya)
Make arrow_json::ReaderBuilder method names consistent #4128 [arrow] (tustvold)
feat: add get_{ref, mut} to arrow_ipc Reader and Writer #4122 (sticnarf)
feat: support Interval + Timestamp and Interval + Date #4117 [arrow] (Weijun-H)
Support NullArray in JSON Reader #4114 [arrow] (jiangzhx)
Add Type Declarations for All Primitive Tensors and Buffer Builders #4113 [arrow] (izveigor)
Update regex-syntax requirement from 0.6.27 to 0.7.1 #4107 [arrow] (dependabot[bot])
feat: set FlightDescriptor on FlightDataEncoderBuilder #4101 [arrow] [arrow-flight] (Weijun-H)
optimize cast for same decimal type and same scale #4088 [arrow] (liukun4515)

38.0.0 (2023-04-21)

Full Changelog

Breaking changes:

Remove DataType from PrimitiveArray constructors #4098 [arrow] (tustvold)
Use Into<Arc<str>> for PrimitiveArray::with_timezone #4097 [arrow] (tustvold)
Store StructArray entries in MapArray #4085 [parquet] [arrow] (tustvold)
Add DictionaryArray Constructors \#3879 #4068 [arrow] [arrow-flight] (tustvold)
Relax JSON schema inference generics #4063 [arrow] (tustvold)
Remove ArrayData from Array \#3880 #4061 [arrow] (tustvold)
Add CommandGetXdbcTypeInfo to Flight SQL Server #4055 [arrow] [arrow-flight] (c-thiel)
Remove old JSON Reader and Decoder \#3610 #4052 [parquet] [arrow] (tustvold)
Use BufRead for JSON Schema Inference #4041 [arrow] (WenyXu)

Implemented enhancements:

Support dyn_compare_scalar for Decimal256 #4083 [arrow]
Better JSON Reader Error Messages #4076 [arrow]
Additional data type groups #4056 [arrow]
Async JSON reader #4043 [arrow]
Field::contains Should Recurse into DataType #4029 [arrow]
Prevent UnionArray with Repeated Type IDs #3982 [parquet] [arrow]
Support Timestamp +/- Interval types #3963 [arrow]
First-Class Array Abstractions #3880 [parquet] [arrow] [arrow-flight]

Fixed bugs:

Update readme to remove reference to Jira #4091
OffsetBuffer::new Rejects 0 Offsets #4066 [arrow]
Parquet AsyncArrowWriter not shutting down inner async writer. #4058 [parquet]
Flight SQL Server missing command type.googleapis.com/arrow.flight.protocol.sql.CommandGetXdbcTypeInfo #4054 [arrow] [arrow-flight]
RawJsonReader Errors with Empty Schema #4053 [parquet] [arrow]
RawJsonReader Integer Truncation #4049 [arrow]
Sparse UnionArray Equality Incorrect Offset Handling #4044 [arrow]

Documentation updates:

Write blog about improvements in JSON and CSV processing #4062 [arrow]

Closed issues:

Parquet reader of Int96 columns and coercion to timestamps #4075
Serializing timestamp from int json raw decoder #4069 [arrow]
Support casting to/from Interval and Duration #3998 [arrow]

Merged pull requests:

Fix Docs Typos #4100 [parquet] (rnarkk)
Update tonic-build requirement from =0.9.1 to =0.9.2 #4099 [arrow] [arrow-flight] (dependabot[bot])
Increase minimum chrono version to 0.4.24 #4093 [arrow] (alamb)
Simplify reference to GitHub issues #4092 (bkmgit)
```
Minor
```
Include byte offsets in parquet-layout #4086 [parquet] (tustvold)
feat: Support dyn_compare_scalar for Decimal256 #4084 [arrow] (izveigor)
Add ByteArray constructors \#3879 #4081 [arrow] (tustvold)
Update prost-build requirement from =0.11.8 to =0.11.9 #4080 [arrow] [arrow-flight] (dependabot[bot])
Improve JSON decoder errors \#4076 #4079 [arrow] (tustvold)
Fix Timestamp Numeric Truncation in JSON Reader #4074 [arrow] (tustvold)
Serialize numeric to tape \#4069 #4073 [arrow] (tustvold)
feat: Prevent UnionArray with Repeated Type IDs #4070 [arrow] (Weijun-H)
Add PrimitiveArray::try_new \#3879 #4067 [arrow] (tustvold)
Add ListArray Constructors \#3879 #4065 [arrow] (tustvold)
Shutdown parquet async writer #4059 [parquet] (kindly)
feat: additional data type groups #4057 [arrow] (izveigor)
Fix precision loss in Raw JSON decoder \#4049 #4051 [arrow] (tustvold)
Use lexical_core in CSV and JSON parser ~25% faster #4050 [arrow] (tustvold)
Add offsets accessors to variable length arrays \#3879 #4048 [arrow] (tustvold)
Document Async decoder usage \#4043 \#78 #4046 [arrow] (tustvold)
Fix sparse union array equality \#4044 #4045 [arrow] (tustvold)
feat: DataType::contains support nested type #4042 [arrow] (Weijun-H)
feat: Support Timestamp +/- Interval types #4038 [arrow] (Weijun-H)
Fix object_store CI #4037 (tustvold)
feat: cast from/to interval and duration #4020 [arrow] (Weijun-H)

37.0.0 (2023-04-07)

Full Changelog

Breaking changes:

Fix timestamp handling in cast kernel \#1936 \#4033 #4034 [arrow] (tustvold)
Update tonic 0.9.1 #4011 [arrow] [arrow-flight] (tustvold)
Use FieldRef in DataType \#3955 #3983 [parquet] [arrow] (tustvold)
Store Timezone as Arc<str> #3976 [parquet] [arrow] (tustvold)
Panic instead of discarding nulls converting StructArray to RecordBatch - \#3951 #3953 [parquet] [arrow] (tustvold)
Fix(flight_sql): PreparedStatement has no token for auth. #3948 [arrow] [arrow-flight] (youngsofun)
Add Strongly Typed Array Slice \#3929 #3930 [parquet] [arrow] (tustvold)
Add Zero-Copy Conversion between Vec and MutableBuffer #3920 [arrow] (tustvold)

Implemented enhancements:

Support Decimals cast to Utf8/LargeUtf8 #3991 [arrow]
Support Date32/Date64 minus Interval #3962 [arrow]
Reduce Cloning of Field #3955 [parquet] [arrow] [arrow-flight]
Support Deserializing Serde DataTypes to Arrow #3949 [arrow]
Add multiply_fixed_point #3946 [arrow]
Strongly Typed Array Slicing #3929 [parquet] [arrow]
Make it easier to match FlightSQL messages #3874 [arrow] [arrow-flight]
Support Casting Between Binary / LargeBinary and FixedSizeBinary #3826 [arrow]

Fixed bugs:

Incorrect Overflow Casting String to Timestamp #4033
f16::ZERO and f16::ONE are mixed up #4016 [arrow]
Handle overflow precision when casting from integer to decimal #3995 [arrow]
PrimitiveDictionaryBuilder.finish should use actual value type #3971 [arrow]
RecordBatch From StructArray Silently Discards Nulls #3952 [parquet] [arrow]
I256 Checked Subtraction Overflows for i256::MINUS_ONE #3942 [arrow]
I256 Checked Multiply Overflows for i256::MIN #3941 [arrow]

Closed issues:

Remove non-existent js feature from README #4000 [arrow]
Support take on MapArray #3875 [arrow]

Merged pull requests:

Prep for 37.0.0 #4031 [arrow] [arrow-flight] (iajoiner)
Add RecordBatch::with_schema #4028 [arrow] (tustvold)
Only require compatible batch schema in ArrowWriter #4027 [parquet] (tustvold)
Add Fields::contains #4026 [arrow] (tustvold)
Minor: add methods "is_positive" and "signum" to i256 #4024 [arrow] (izveigor)
Deprecate Array::data \#3880 #4019 [arrow] (tustvold)
feat: add tests for ArrowNativeTypeOp #4018 [arrow] (izveigor)
fix: f16::ZERO and f16::ONE are mixed up #4017 [arrow] (izveigor)
Minor: Float16Tensor #4013 [arrow] (izveigor)
Add FlightSQL module docs and links to arrow-flight crates #4012 [arrow] [arrow-flight] (alamb)
Update proc-macro2 requirement from =1.0.54 to =1.0.56 #4008 [arrow] [arrow-flight] (dependabot[bot])
Cleanup Primitive take #4006 [arrow] (tustvold)
Deprecate combine_option_bitmap #4005 [arrow] (tustvold)
Minor: add tests for BooleanBuffer #4004 [arrow] (izveigor)
feat: support to read/write customized metadata in ipc files #4003 [arrow] (framlog)
Cleanup more uses of Array::data \#3880 #4002 [parquet] [arrow] (tustvold)
Remove js feature from README #4001 [arrow] (akazukin5151)
feat: add the implementation BitXor to BooleanBuffer #3997 [arrow] (izveigor)
Handle precision overflow when casting from integer to decimal #3996 [arrow] (viirya)
Support CAST from Decimal datatype to String #3994 [arrow] (comphead)
Add Field Constructors for Complex Fields #3992 [parquet] [arrow] [arrow-flight] (tustvold)
fix: remove unused type parameters. #3986 [arrow] (youngsofun)
Add UnionFields \#3955 #3981 [parquet] [arrow] (tustvold)
Cleanup Fields Serde #3980 [arrow] (tustvold)
Support Rust structures --> RecordBatch by adding Serde support to RawDecoder \#3949 #3979 [arrow] (tustvold)
Convert string_to_timestamp_nanos to doctest #3978 [arrow] (tustvold)
Fix documentation of string_to_timestamp_nanos #3977 [arrow] (byteink)
add Date32/Date64 support to subtract_dyn #3974 [arrow] (SinanGncgl)
PrimitiveDictionaryBuilder.finish should use actual value type #3972 [arrow] (viirya)
Update proc-macro2 requirement from =1.0.53 to =1.0.54 #3968 [arrow] [arrow-flight] (dependabot[bot])
Async writer tweaks #3967 [parquet] (tustvold)
Fix reading ipc files with unordered projections #3966 [arrow] (framlog)
Add Fields abstraction \#3955 #3965 [parquet] [arrow] [arrow-flight] (tustvold)
feat: cast between Binary/LargeBinary and FixedSizeBinary #3961 [arrow] (Weijun-H)
feat: support async writer \#1269 #3957 [parquet] (ShiKaiWi)
Add ListBuilder::append_value \#3949 #3954 [arrow] (tustvold)
Improve array builder documentation \#3949 #3951 [arrow] (tustvold)
Faster i256 parsing #3950 [arrow] (tustvold)
Add multiply_fixed_point #3945 [arrow] (viirya)
feat: enable metadata import/export through C data interface #3944 [arrow] (wjones127)
Fix checked i256 arithmetic \#3942 \#3941 #3943 [arrow] (tustvold)
Avoid memory copies in take_list #3940 [arrow] (tustvold)
Faster decimal parsing 30-60% #3939 [arrow] (spebern)
Fix: FlightSqlClient panic when execute_update. #3938 [arrow] [arrow-flight] (youngsofun)
Cleanup row count handling in JSON writer #3934 [arrow] (tustvold)
Add typed buffers to UnionArray \#3880 #3933 [arrow] (tustvold)
feat: add take for MapArray #3925 [arrow] (wjones127)
Deprecate Array::data_ref \#3880 #3923 [arrow] (tustvold)
Zero-copy conversion from Vec to PrimitiveArray #3917 [arrow] (tustvold)
feat: Add Commands enum to decode prost messages to strong type #3887 [arrow] [arrow-flight] (stuartcarnie)

36.0.0 (2023-03-24)

Full Changelog

Breaking changes:

Use dyn Array in sort kernels #3931 [arrow] (tustvold)
Enforce struct nullability in JSON raw reader \#3900 \#3904 #3906 [arrow] (tustvold)
Return ScalarBuffer from PrimitiveArray::values \#3879 #3896 [arrow] (tustvold)
Use BooleanBuffer in BooleanArray \#3879 #3895 [arrow] (tustvold)
Seal ArrowPrimitiveType #3882 [arrow] (tustvold)
Support compression levels #3847 [parquet] (spebern)

Implemented enhancements:

Improve speed of parsing string to Times #3919 [arrow]
feat: add comparison/sort support for Float16 #3914
Pinned version in arrow-flight's build-dependencies are causing conflicts #3876
Add compression options levels #3844 [parquet] [arrow]
Use Unsigned Integer for Fixed Size DataType #3815
Common trait for RecordBatch and StructArray #3764 [arrow]
Allow precision loss on multiplying decimal arrays #3689 [arrow]

Fixed bugs:

Raw JSON Reader Allows Non-Nullable Struct Children to Contain Nulls #3904
Nullable field with nested not nullable map in json #3900
parquet_derive doesn't support Vec<u8> #3864 [parquet]
[REGRESSION]
[REGRESSION]
[REGRESSION]
CSV Reader Doesn't set Timezone #3841
PyArrowConvert Leaks Memory #3683 [arrow]

Merged pull requests:

Derive RunArray Clone #3932 [arrow] (tustvold)
Move protoc generation to binary crate, unpin prost/tonic build \#3876 #3927 [arrow] [arrow-flight] (tustvold)
Fix JSON Temporal Encoding of Multiple Batches #3924 [arrow] (doki23)
Cleanup uses of Array::data_ref \#3880 #3918 [parquet] [arrow] (tustvold)
Support microsecond and nanosecond in interval parsing #3916 [arrow] (alamb)
feat: add comparison/sort support for Float16 #3915 [arrow] (izveigor)
Add AsArray trait for more ergonomic downcasting #3912 [parquet] [arrow] (tustvold)
Add OffsetBuffer::new #3910 [arrow] (tustvold)
Add PrimitiveArray::new \#3879 #3909 [arrow] (tustvold)
Support timezones in CSV reader \#3841 #3908 [arrow] (tustvold)
Improve ScalarBuffer debug output #3907 [arrow] (tustvold)
Update proc-macro2 requirement from =1.0.52 to =1.0.53 #3905 [arrow] [arrow-flight] (dependabot[bot])
Re-export parquet compression level structs #3903 [parquet] (tustvold)
Fix parsing timestamps of exactly 32 characters #3902 [arrow] (tustvold)
Add iterators to BooleanBuffer and NullBuffer #3901 [arrow] (tustvold)
Array equality for &dyn Array \#3880 #3899 [arrow] (tustvold)
Add BooleanArray::new \#3879 #3898 [arrow] (tustvold)
Revert structured ArrayData \#3877 #3894 (tustvold)
Fix pyarrow memory leak \#3683 #3893 [arrow] (tustvold)
Minor: add examples for ListBuilder and GenericListBuilder #3891 [arrow] (alamb)
Update syn requirement from 1.0 to 2.0 #3890 (dependabot[bot])
Use of mul_checked to avoid silent overflow in interval arithmetic #3886 [arrow] (Weijun-H)
Flesh out NullBuffer abstraction \#3880 #3885 [parquet] [arrow] (tustvold)
Implement Bit Operations for i256 #3884 [arrow] (tustvold)
Flatten arrow_buffer #3883 [arrow] (tustvold)
Add Array::to_data and Array::nulls \#3880 #3881 [arrow] (tustvold)
Added support for byte vectors and slices to parquet_derive \#3864 #3878 [parquet] (waymost)
chore: remove LevelDecoder #3872 [parquet] (Weijun-H)
Parse timestamps with leap seconds \#3861 #3862 [arrow] (tustvold)
Faster time parsing ~93% faster #3860 [arrow] (tustvold)
Parse timestamps with arbitrary seconds fraction #3858 [arrow] (tustvold)
Add BitIterator #3856 [arrow] (tustvold)
Improve decimal parsing performance #3854 [arrow] (spebern)
Update proc-macro2 requirement from =1.0.51 to =1.0.52 #3853 [arrow] [arrow-flight] (dependabot[bot])
Update bitflags requirement from 1.2.1 to 2.0.0 #3852 [arrow] (dependabot[bot])
Add offset pushdown to parquet #3848 [parquet] (tustvold)
Add timezone support to JSON reader #3845 [arrow] (tustvold)
Allow precision loss on multiplying decimal arrays #3690 [arrow] (viirya)

35.0.0 (2023-03-10)

Full Changelog

Breaking changes:

Add RunEndBuffer \#1799 #3817 [arrow] (tustvold)
Restrict DictionaryArray to ArrowDictionaryKeyType #3813 [arrow] (tustvold)
refactor: assorted FlightSqlServiceClient improvements #3788 [arrow] [arrow-flight] (crepererum)
minor: make Parquet CLI input args consistent #3786 [parquet] (XinyuZeng)
Return Buffers from ArrayData::buffers instead of slice \#1799 #3783 [arrow] (tustvold)
Use NullBuffer in ArrayData \#3775 #3778 [parquet] [arrow] (tustvold)

Implemented enhancements:

Support timestamp/time and date types in json decoder #3834 [arrow]
Support decoding decimals in new raw json decoder #3819 [arrow]
Timezone Aware Timestamp Parsing #3794 [arrow]
Preallocate buffers for FixedSizeBinary array creation #3792 [arrow]
Make Parquet CLI args consistent #3785 [parquet]
Creates PrimitiveDictionaryBuilder from provided keys and values builders #3776 [arrow]
Use NullBuffer in ArrayData #3775 [parquet] [arrow]
Support unary_dict_mut in arth #3710 [arrow]
Support cast <> String to interval #3643 [arrow]
Support Zero-Copy Conversion from Vec to/from MutableBuffer #3516 [arrow]

Fixed bugs:

Timestamp Unit Casts are Unchecked #3833 [arrow]
regexp_match skips first match when returning match #3803 [arrow]
Cast to timestamp with time zone returns timestamp #3800 [arrow]
Schema-level metadata is not encoded in Flight responses #3779 [arrow] [arrow-flight]

Closed issues:

FlightSQL CLI client: simple test #3814 [arrow] [arrow-flight]

Merged pull requests:

refactor: timestamp overflow check #3840 [arrow] (Weijun-H)
Prep for 35.0.0 #3836 [parquet] [arrow] [arrow-flight] (iajoiner)
Support timestamp/time and date json decoding #3835 [arrow] (spebern)
Make dictionary preservation optional in row encoding #3831 [arrow] (tustvold)
Move prettyprint to arrow-cast #3828 [arrow] [arrow-flight] (tustvold)
Support decoding decimals in raw decoder #3820 [arrow] (spebern)
Add ArrayDataLayout, port validation \#1799 #3818 [arrow] (tustvold)
test: add test for FlightSQL CLI client #3816 [arrow] [arrow-flight] (crepererum)
Add regexp_match docs #3812 [arrow] (tustvold)
fix: Ensure Flight schema includes parent metadata #3811 [arrow] [arrow-flight] (stuartcarnie)
fix: regexp_match skips first match #3807 [arrow] (Weijun-H)
fix: change utf8 to timestamp with timezone #3806 [arrow] (Weijun-H)
Support reading decimal arrays from json #3805 [arrow] (spebern)
Add unary_dict_mut #3804 [arrow] (viirya)
Faster timestamp parsing ~70-90% faster #3801 [arrow] (tustvold)
Add concat_elements_bytes #3798 [arrow] (tustvold)
Timezone aware timestamp parsing \#3794 #3795 [arrow] (tustvold)
Preallocate buffers for FixedSizeBinary array creation #3793 [arrow] (maxburke)
feat: simple flight sql CLI client #3789 [arrow] [arrow-flight] (crepererum)
Creates PrimitiveDictionaryBuilder from provided keys and values builders #3777 [arrow] (viirya)
ArrayData Enumeration for Remaining Layouts #3769 [arrow] (tustvold)
Update prost-build requirement from =0.11.7 to =0.11.8 #3767 [arrow] [arrow-flight] (dependabot[bot])
Implement concat_elements_dyn kernel #3763 [arrow] (Weijun-H)
Support for casting Utf8 and LargeUtf8 --> Interval #3762 [arrow] (doki23)
into_inner() for CSV Writer #3759 [arrow] (Weijun-H)
Zero-copy Vec conversion \#3516 \#1176 #3756 [arrow] (tustvold)
ArrayData Enumeration for Primitive, Binary and UTF8 #3749 [arrow] (tustvold)
Add into_primitive_dict_builder to DictionaryArray #3715 [arrow] (viirya)

34.0.0 (2023-02-24)

Full Changelog

Breaking changes:

Infer 2020-03-19 00:00:00 as timestamp not Date64 in CSV \#3744 #3746 [arrow] (tustvold)
Implement fallible streams for FlightClient::do_put #3464 [arrow] [arrow-flight] (alamb)

Implemented enhancements:

Support casting string to timestamp with microsecond resolution #3751
Add datetime/interval/duration into comparison kernels #3729 [arrow]
! not operator overload for SortOptions #3726 [arrow]
parquet: convert Bytes to ByteArray directly #3719 [parquet]
Implement simple RecordBatchReader #3704
Is it possible to implement GenericListArray::from_iter? #3702
take_run improvements #3701 [arrow]
Support as_mut_any in Array trait #3655
Array --> Display formatter that supports more options and is configurable #3638 [parquet] [arrow]
arrow-csv: support decimal256 #3474 [arrow]

Fixed bugs:

CSV reader infers Date64 type for fields like "2020-03-19 00:00:00" that it can't parse to Date64 #3744 [arrow]

Merged pull requests:

Update to 34.0.0 and update changelog #3757 [parquet] [arrow] [arrow-flight] (iajoiner)
Update MIRI for split crates \#2594 #3754 (tustvold)
Update prost-build requirement from =0.11.6 to =0.11.7 #3753 [arrow] [arrow-flight] (dependabot[bot])
Enable casting of string to timestamp with microsecond resolution #3752 [arrow] (gruuya)
Use Typed Buffers in Arrays \#1811 \#1176 #3743 [arrow] (tustvold)
Cleanup arithmetic kernel type constraints #3739 [arrow] (tustvold)
Make dictionary kernels optional for comparison benchmark #3738 [arrow] (tustvold)
Support String Coercion in Raw JSON Reader #3736 [arrow] (rguerreiromsft)
replace for loop by try_for_each #3734 [arrow] (suxiaogang223)
feat: implement generic record batch reader #3733 (wjones127)
[minor]
Add datetime/interval/duration into dyn scalar comparison #3730 [arrow] (viirya)
Using Borrow<Value> on infer_json_schema_from_iterator #3728 [arrow] (rguerreiromsft)
Not operator overload for SortOptions #3727 [arrow] (berkaysynnada)
fix: encoding batch with no columns #3724 [arrow] [arrow-flight] (wangrunji0408)
feat: impl Ord/PartialOrd for SortOptions #3723 [arrow] (crepererum)
Add From<Bytes> for ByteArray #3720 [parquet] (tustvold)
Deprecate old JSON reader \#3610 #3718 [parquet] [arrow] (tustvold)
Add pretty format with options #3717 [arrow] (tustvold)
Remove unreachable decimal take #3716 [arrow] (tustvold)
Feat: arrow csv decimal256 #3711 [arrow] (suxiaogang223)
perf: take_run improvements #3705 [arrow] (askoa)
Add raw MapArrayReader #3703 [arrow] (tustvold)
feat: Sort kernel for RunArray #3695 [arrow] (askoa)
perf: Remove sorting to yield sorted_rank #3693 [arrow] (askoa)
fix: Handle sliced array in run array iterator #3681 [arrow] (askoa)

33.0.0 (2023-02-10)

Full Changelog

Breaking changes:

Use ArrayFormatter in Cast Kernel #3668 [arrow] (tustvold)
Use dyn Array in cast kernels #3667 [arrow] (tustvold)
Return references from FixedSizeListArray and MapArray #3652 [parquet] [arrow] (tustvold)
Lazy array display \#3638 #3647 [parquet] [arrow] (tustvold)
Use array_value_to_string in arrow-csv #3514 [arrow] (JayjeetAtGithub)

Implemented enhancements:

Support UTF8 cast to Timestamp with timezone #3664
Add modulus_dyn and modulus_scalar_dyn #3648 [arrow]
A trait for append_value and append_null on ArrayBuilders #3644
Improve error message "batches[0] schema is different with argument schema" #3628 [arrow]
Specified version of helper function to cast binary to string #3623 [arrow]
Casting generic binary to generic string #3606 [arrow]
Use array_value_to_string in arrow-csv #3483 [arrow]

Fixed bugs:

ArrowArray::try_from_raw Misleading Signature #3684 [arrow]
PyArrowConvert Leaks Memory #3683 [arrow]
Arrow-csv reader cannot produce RecordBatch even if the bytes are necessary #3674
FFI Fails to Account For Offsets #3671 [arrow]
Regression in CSV reader error handling #3656 [arrow]
UnionArray Child and Value Fail to Account for non-contiguous Type IDs #3653 [arrow]
Panic when accessing RecordBatch from pyarrow #3646 [arrow]
Multiplication for decimals is incorrect #3645
Inconsistent output between pretty print and CSV writer for Arrow #3513 [arrow]

Closed issues:

Release 33.0.0 of arrow/arrow-flight/parquet/parquet-derive next release after 32.0.0 #3682
Release 32.0.0 of arrow/arrow-flight/parquet/parquet-derive next release after `31.0.0` #3584 [parquet] [arrow] [arrow-flight]

Merged pull requests:

Move FFI to sub-crates #3687 [arrow] (tustvold)
Update to 33.0.0 and update changelog #3686 [parquet] [arrow] [arrow-flight] (iajoiner)
Cleanup FFI interface \#3684 \#3683 #3685 [arrow] (tustvold)
fix: take_run benchmark parameter #3679 [arrow] (askoa)
Minor: Add some examples to Date*Array and Time*Array #3678 [arrow] (alamb)
Add CSV Decoder::capacity \#3674 #3677 [arrow] (tustvold)
Add ArrayData::new_null and DataType::primitive_width #3676 [arrow] (tustvold)
Fix FFI which fails to account for offsets #3675 [arrow] (viirya)
Support UTF8 cast to Timestamp with timezone #3673 [arrow] (comphead)
Fix Date64Array docs #3670 [arrow] (tustvold)
Update proc-macro2 requirement from =1.0.50 to =1.0.51 #3669 [arrow] [arrow-flight] (dependabot[bot])
Add timezone accessor for Timestamp*Array #3666 [arrow] (tustvold)
Faster timezone cast #3665 [arrow] (tustvold)
feat + fix: IPC support for run encoded array. #3662 [arrow] (askoa)
Implement std::fmt::Write for StringBuilder \#3638 #3659 [arrow] (tustvold)
Include line and field number in CSV UTF-8 error \#3656 #3657 [arrow] (tustvold)
Handle non-contiguous type_ids in UnionArray \#3653 #3654 [arrow] (tustvold)
Add modulus_dyn and modulus_scalar_dyn #3649 [arrow] (viirya)
Improve error message with detailed schema #3637 [arrow] (Veeupup)
Add limit to ArrowReaderBuilder to push limit down to parquet reader #3633 [parquet] (thinkharderdev)
chore: delete wrong comment and refactor set_metadata in Field #3630 [arrow] (chunshao90)
Fix typo in comment #3627 [parquet] (kjschiroo)
Minor: Update doc strings about Page Index / Column Index #3625 [parquet] (alamb)
Specified version of helper function to cast binary to string #3624 [arrow] (viirya)
feat: take kernel for RunArray #3622 [arrow] (askoa)
Remove BitSliceIterator specialization from try_for_each_valid_idx #3621 [arrow] (tustvold)
Reduce PrimitiveArray::try_unary codegen #3619 [arrow] (tustvold)
Reduce Dictionary Builder Codegen #3616 [arrow] (tustvold)
Minor: Add test for dictionary encoding of batches #3608 [arrow-flight] (alamb)
Casting generic binary to generic string #3607 [arrow] (viirya)
Add ArrayAccessor, Iterator, Extend and benchmarks for RunArray #3603 [arrow] (askoa)

32.0.0 (2023-01-27)

Full Changelog

Breaking changes:

Allow StringArray construction with Vec<Option<String>> #3602 [arrow] (sinistersnare)
Use native types in PageIndex \#3575 #3578 [parquet] (tustvold)
Add external variant to ParquetError \#3285 #3574 [parquet] (tustvold)
Return reference from ListArray::values #3561 [arrow] (tustvold)
feat: Add RunEndEncodedArray #3553 [parquet] [arrow] (askoa)

Implemented enhancements:

There should be a From<Vec<Option<String>>> impl for GenericStringArray<OffsetSize> #3599 [arrow]
FlightDataEncoder Optionally send Schema even when no record batches #3591 [arrow-flight]
Use Native Types in PageIndex #3575 [parquet]
Packing array into dictionary of generic byte array #3571 [arrow]
Implement Error::Source for ArrowError and FlightError #3566 [arrow] [arrow-flight]
[FlightSQL]
Arrow CSV writer should not fail when cannot cast the value #3547 [arrow]
Write Deprecated Min Max Statistics When ColumnOrder Signed #3526 [parquet]
Improve Performance of JSON Reader #3441
Support footer kv metadata for IPC file #3432
Add External variant to ParquetError #3285 [parquet]

Fixed bugs:

Nullif of NULL Predicate is not NULL #3589
BooleanBufferBuilder Fails to Clear Set Bits On Truncate #3587 [arrow]
nullif incorrectly calculates null_count, sometimes panics with subtraction overflow error #3579 [arrow]
Warning when using pyarrow #3543 [arrow]
Incorrect row group total_byte_size written to parquet file #3530 [parquet]
Overflow when casting timestamps prior to the epoch #3512 [arrow]

Closed issues:

Panic on Key Overflow in Dictionary Builders #3562 [parquet] [arrow]
Bumping version gives compilation error arrow-array #3525

Merged pull requests:

Add Push-Based CSV Decoder #3604 [arrow] (tustvold)
Update to flatbuffers 23.1.21 #3597 [arrow] (tustvold)
Faster BooleanBufferBuilder::append_n for true values #3596 [arrow] (tustvold)
Support sending schemas for empty streams #3594 [arrow-flight] (alamb)
Faster ListArray to StringArray conversion #3593 [arrow] (tustvold)
Add conversion from StringArray to BinaryArray #3592 [arrow] (tustvold)
Fix nullif null count \#3579 #3590 [arrow] (tustvold)
Clear bits in BooleanBufferBuilder \#3587 #3588 [arrow] (tustvold)
Iterate all dictionary key types in cast test #3585 [arrow] (viirya)
Propagate EOF Error from AsyncRead #3576 [parquet] (Sach1nAgarwal)
Show row_counts also for (FixedLen)ByteArray #3573 [parquet] (bmmeijers)
Packing array into dictionary of generic byte array #3572 [arrow] (viirya)
Remove unwrap on datetime cast for CSV writer #3570 [arrow] (comphead)
Implement std::error::Error::source for ArrowError and FlightError #3567 [arrow] [arrow-flight] (alamb)
Improve GenericBytesBuilder offset overflow panic message \#139 #3564 [arrow] (tustvold)
Implement Extend for ArrayBuilder \#1841 #3563 [arrow] (tustvold)
Update pyarrow method call with kwargs #3560 [arrow] (Frankonly)
Update pyo3 requirement from 0.17 to 0.18 #3557 [arrow] (viirya)
Expose Inner FlightServiceClient on FlightSqlServiceClient \#3551 #3556 [arrow-flight] (tustvold)
Fix final page row count in parquet-index binary #3554 [parquet] (tustvold)
Parquet Avoid Reading 8 Byte Footer Twice from AsyncRead #3550 [parquet] (Sach1nAgarwal)
Improve concat kernel capacity estimation #3546 [arrow] (tustvold)
Update proc-macro2 requirement from =1.0.49 to =1.0.50 #3545 [arrow-flight] (dependabot[bot])
Update pyarrow method call to avoid warning #3544 [arrow] (Frankonly)
Enable casting between Utf8/LargeUtf8 and Binary/LargeBinary #3542 [arrow] (viirya)
Use GHA concurrency groups \#3495 #3538 (tustvold)
set sum of uncompressed column size as row group size for parquet files #3531 [parquet] (sidred)
Minor: Add documentation about memory use for ArrayData #3529 [arrow] (alamb)
Upgrade to clap 4.1 + fix test #3528 [parquet] (tustvold)
Write backwards compatible row group statistics \#3526 #3527 [parquet] (tustvold)
No panic on timestamp buffer overflow #3519 [arrow] (comphead)
Support casting from binary to dictionary of binary #3482 [arrow] (viirya)
Add Raw JSON Reader ~2.5x faster #3479 [arrow] (tustvold)

31.0.0 (2023-01-13)

Full Changelog

Breaking changes:

support RFC3339 style timestamps in arrow-json #3449 [arrow] (JayjeetAtGithub)
Improve arrow flight batch splitting and naming #3444 [arrow-flight] (alamb)
Parquet record API: timestamp as signed integer #3437 [parquet] (ByteBaker)
Support decimal int32/64 for writer #3431 [parquet] (liukun4515)

Implemented enhancements:

Support casting Date32 to timestamp #3504 [arrow]
Support casting strings like '2001-01-01' to timestamp #3492 [arrow]
CLI to "rewrite" parquet files #3476 [parquet]
Add more dictionary value type support to build_compare #3465
Allow concat_batches to take non owned RecordBatch #3456 [arrow]
Release Arrow 30.0.1 maintenance release for `30.0.0` #3455
Add string comparisons starts_with, ends_with, and contains to kernel #3442 [arrow]
make_builder Loses Timezone and Decimal Scale Information #3435 [arrow]
Use RFC3339 style timestamps in arrow-json #3416 [arrow]
ArrayData::get_slice_memory_size or similar #3407 [arrow] [arrow-flight]

Fixed bugs:

Unable to read CSV with null boolean value #3521 [arrow]
Make consistent behavior on zeros equality on floating point types #3509
Sliced batch w/ bool column doesn't roundtrip through IPC #3496 [arrow] [arrow-flight]
take kernel on List array introduces nulls instead of empty lists #3471 [arrow]
Infinite Loop If Skipping More CSV Lines than Present #3469 [arrow]

Merged pull requests:

Fix reading null booleans from CSV #3523 [arrow] (tustvold)
minor fix: use the unified decimal type builder #3522 [parquet] (liukun4515)
Update version to 31.0.0 and add changelog #3518 [parquet] [arrow] [arrow-flight] (iajoiner)
Additional nullif re-export #3515 [arrow] (tustvold)
Make consistent behavior on zeros equality on floating point types #3510 (viirya)
Enable cast Date32 to Timestamp #3508 [arrow] (comphead)
Update prost-build requirement from =0.11.5 to =0.11.6 #3507 [arrow-flight] (dependabot[bot])
minor fix for the comments #3505 [arrow] (liukun4515)
Fix DataTypeLayout for LargeList #3503 [arrow] (viirya)
Add string comparisons starts_with, ends_with, and contains to kernel #3502 [arrow] (snmvaughan)
Add a function to get memory size of array slice #3501 [arrow] (askoa)
Fix IPCWriter for Sliced BooleanArray #3498 [arrow] (crepererum)
Fix: Added support to cast string without time #3494 [arrow] (gaelwjl)
Fix negative interval prettyprint #3491 [arrow] (Jefffrey)
Fixes a broken link in the arrow lib.rs rustdoc #3487 [arrow] (AdamGS)
Refactoring build_compare for decimal and using downcast_primitive #3484 (viirya)
Add tests for record batch size splitting logic in FlightClient #3481 [arrow-flight] (alamb)
change concat_batches parameter to non owned reference #3480 [arrow] (askoa)
feat: add parquet-rewrite CLI #3477 [parquet] (crepererum)
Preserve empty list array elements in take kernel #3473 [arrow] (jonmmease)
Add a test for stream writer for writing sliced array #3472 [arrow] (viirya)
Fix CSV infinite loop and improve error messages #3470 [arrow] (tustvold)
Add more dictionary value type support to build_compare #3466 (viirya)
Add tests for FlightClient::{list_flights, list_actions, do_action, get_schema} #3463 [arrow-flight] (alamb)
Minor: add ticket links to failing ipc integration tests #3461 (alamb)
feat: column_name based index access for RecordBatch and StructArray #3458 [arrow] (askoa)
Support Decimal256 in FFI #3453 [arrow] (viirya)
Remove multiversion dependency #3452 [arrow] (tustvold)
Re-export nullif kernel #3451 [arrow] (tustvold)
Meaningful error message for map builder with null keys #3450 [arrow] (Jefffrey)
Parquet writer v2: clear buffer after page flush #3447 [parquet] (askoa)
Verify ArrayData::data_type compatible in PrimitiveArray::from #3440 [arrow] (tustvold)
Preserve DataType metadata in make_builder #3438 [arrow] (tustvold)
Consolidate arrow ipc tests and increase coverage #3427 [arrow] (alamb)
Generic bytes dictionary builder #3426 [arrow] (viirya)
Minor: Improve docs for arrow-ipc, remove clippy ignore #3421 [arrow] (alamb)
refactor: convert *like_dyn, *like_utf8_scalar_dyn and *like_dict functions to macros #3411 [arrow] (askoa)
Add parquet-index binary #3405 [parquet] (tustvold)
Complete mid-level FlightClient #3402 [arrow-flight] (alamb)
Implement RecordBatch <--> FlightData encode/decode + tests #3391 [arrow] [arrow-flight] (alamb)
Provide into_builder for bytearray #3326 [arrow] (viirya)

30.0.1 (2023-01-04)

Full Changelog

Implemented enhancements:

Generic bytes dictionary builder #3425 [arrow]
Derive Clone for the builders in object-store. #3419
Mid-level ArrowFlight Client #3371 [arrow-flight]
Improve performance of the CSV parser #3338 [arrow]

Fixed bugs:

nullif kernel no longer exported #3454 [arrow]
PrimitiveArray from ArrayData Unsound For IntervalArray #3439 [arrow]
LZ4-compressed PQ files unreadable by Pandas and ClickHouse #3433 [parquet]
Parquet Record API: Cannot convert date before Unix epoch to json #3430 [parquet]
parquet-fromcsv with writer version v2 does not stop #3408 [parquet]

30.0.0 (2022-12-29)

Full Changelog

Breaking changes:

Infer Parquet JSON Logical and Converted Type as UTF-8 #3376 [parquet] (tustvold)
Use custom Any instead of prost_types #3360 [arrow-flight] (tustvold)
Use bytes in arrow-flight #3359 [arrow-flight] (tustvold)

Implemented enhancements:

Add derived implementations of Clone and Debug for ParquetObjectReader #3381 [parquet]
Speed up TrackedWrite #3366 [parquet]
Is it possible for ArrowWriter to write key_value_metadata after write all records #3356 [parquet]
Add UnionArray test to arrow-pyarrow integration test #3346
Document / Deprecate arrow_flight::utils::flight_data_from_arrow_batch #3312 [arrow] [arrow-flight]
[FlightSQL]
Support UnionArray in ffi #3304 [arrow]
Add support for Azure Data Lake Storage Gen2 aka: ADLS Gen2 in Object Store library #3283
Support casting from String to Decimal #3280 [arrow]
Allow ArrowCSV writer to control the display of NULL values #3268 [arrow]

Fixed bugs:

FlightSQL example is broken #3386 [arrow-flight]
CSV Reader Bounds Incorrectly Handles Header #3364 [arrow]
Incorrect output string from try_to_type #3350
Decimal arithmetic computation fails to run because decimal type equality #3344 [arrow]
Pretty print not implemented for Map #3322 [arrow]
ILIKE Kernels Inconsistent Case Folding #3311 [arrow]

Documentation updates:

minor: Improve arrow-flight docs #3372 [arrow] [arrow-flight] (alamb)

Merged pull requests:

Version 30.0.0 release notes and changelog #3406 [parquet] [arrow] [arrow-flight] (alamb)
Ends ParquetRecordBatchStream when polling on StreamState::Error #3404 [parquet] (viirya)
fix clippy issues #3398 (Jimexist)
Upgrade multiversion to 0.7.1 #3396 (viirya)
Make FlightSQL Support HTTPs #3388 [arrow-flight] (viirya)
Fix broken FlightSQL example #3387 [arrow-flight] (viirya)
Update prost-build #3385 [arrow-flight] (tustvold)
Split out arrow-arith \#2594 #3384 [arrow] (tustvold)
Add derive for Clone and Debug for ParquetObjectReader #3382 [parquet] (kszlim)
Initial Mid-level FlightClient #3378 [arrow-flight] (alamb)
Document all features on docs.rs #3377 [arrow] [arrow-flight] (tustvold)
Split out arrow-row \#2594 #3375 [arrow] (tustvold)
Remove unnecessary flush calls on TrackedWrite #3374 [parquet] (viirya)
Update proc-macro2 requirement from =1.0.47 to =1.0.49 #3369 [arrow-flight] (dependabot[bot])
Add CSV build_buffered \#3338 #3368 [arrow] (tustvold)
feat: add append_key_value_metadata #3367 [parquet] (jiacai2050)
Add csv-core based reader \#3338 #3365 [arrow] (tustvold)
Put BufWriter into TrackedWrite #3361 [parquet] (viirya)
Add CSV reader benchmark \#3338 #3357 [arrow] (tustvold)
Use ArrayData::ptr_eq in DictionaryTracker #3354 [arrow] (tustvold)
Deprecate flight_data_from_arrow_batch #3353 [arrow] [arrow-flight] (Dandandan)
Fix incorrect output string from try_to_type #3351 (viirya)
Fix unary_dyn for decimal scalar arithmetic computation #3345 [arrow] (viirya)
Add UnionArray test to arrow-pyarrow integration test #3343 (viirya)
feat: configure null value in arrow csv writer #3342 [arrow] (askoa)
Optimize bulk writing of all blocks of bloom filter #3340 [parquet] (viirya)
Add MapArray to pretty print #3339 [arrow] (askoa)
Update prost-build 0.11.4 #3334 [arrow-flight] (tustvold)
Faster Parquet Bloom Writer #3333 (tustvold)
Add bloom filter benchmark for parquet writer #3323 [parquet] (viirya)
Add ASCII fast path for ILIKE scalar 90% faster #3306 [arrow] (tustvold)
Support UnionArray in ffi #3305 [arrow] (viirya)
Support casting from String to Decimal #3281 [arrow] (viirya)
add more integration test for parquet bloom filter round trip tests #3210 [parquet] (Jimexist)

29.0.0 (2022-12-09)

Full Changelog

Breaking changes:

Minor: Allow Field::new and Field::new_with_dict to take existing String as well as &str #3288 [arrow] (alamb)
update &Option<T> to Option<&T> #3249 [parquet] [arrow] (Jimexist)
Hide *_dict_scalar kernels behind *_dyn kernels #3202 [arrow] (viirya)

Implemented enhancements:

Support writing BloomFilter in arrow_writer #3275 [parquet]
Support casting from unsigned numeric to Decimal256 #3272 [arrow]
Support casting from Decimal256 to float types #3266 [arrow]
Make arithmetic kernels support DictionaryArray of DecimalType #3254 [arrow]
Casting from Decimal256 to unsigned numeric #3239 [arrow]
precision is not considered when cast value to decimal #3223 [arrow]
Use RegexSet in arrow_csv::infer_field_schema #3211 [arrow]
Implement FlightSQL Client #3206 [arrow-flight]
Add binary_mut and try_binary_mut #3143 [arrow]
Add try_unary_mut #3133 [arrow]

Fixed bugs:

Skip null buffer when importing FFI ArrowArray struct if no null buffer in the spec #3290 [arrow]
using ahash compile-time-rng kills reproducible builds #3271 [parquet]
Decimal128 to Decimal256 Overflows #3265 [arrow]
nullif panics on empty array #3261 [arrow]
Some more inconsistency between can_cast_types and cast_with_options #3250 [arrow]
Enable casting between Dictionary of DecimalArray and DecimalArray #3237 [arrow]
new_null_array Panics creating StructArray with non-nullable fields #3226 [arrow]
bool should cast from/to Float16Type as can_cast_types returns true #3221 [arrow]
Utf8 and LargeUtf8 cannot cast from/to Float16 but can_cast_types returns true #3220 [arrow]
Re-enable some tests in arrow-cast crate #3219 [arrow]
Off-by-one buffer size error triggers Panic when constructing RecordBatch from IPC bytes should return an Error #3215 [arrow]
arrow to and from pyarrow conversion results in changes in schema #3136 [arrow]

Documentation updates:

better document when we need LargeUtf8 instead of Utf8 #3228 [arrow]

Merged pull requests:

Use BufWriter when writing bloom filters and limit tests \#3318 #3319 [parquet] (tustvold)
Use take for dictionary like comparisons #3313 [arrow] (tustvold)
Update versions to 29.0.0 and update CHANGELOG #3315 [parquet] [arrow] [arrow-flight] (alamb)
refactor: Merge similar functions ilike_scalar and nilike_scalar #3303 [arrow] (askoa)
Split out arrow-ord \#2594 #3299 [arrow] (tustvold)
Split out arrow-string \#2594 #3295 [arrow] (tustvold)
Skip null buffer when importing FFI ArrowArray struct if no null buffer in the spec #3293 [arrow] (viirya)
Don't use dangling NonNull as sentinel #3289 [arrow] (tustvold)
Set bloom filter on byte array #3284 [parquet] (viirya)
Fix ipc schema custom_metadata serialization #3282 [arrow] (Jefffrey)
Disable const-random ahash feature on non-WASM \#3271 #3277 [parquet] (tustvold)
fix(ffi): handle null data buffers from empty arrays #3276 [arrow] (wjones127)
Support casting from unsigned numeric to Decimal256 #3273 [arrow] (viirya)
Add parquet-layout binary #3269 [parquet] (tustvold)
Support casting from Decimal256 to float types #3267 [arrow] (viirya)
Simplify decimal cast logic #3264 [arrow] (tustvold)
Fix panic on nullif empty array \#3261 #3263 [arrow] (tustvold)
Add BooleanArray::from_unary and BooleanArray::from_binary #3258 [arrow] (tustvold)
Minor: Remove parquet build script #3257 [parquet] (tustvold)
Make arithmetic kernels support DictionaryArray of DecimalType #3255 [arrow] (viirya)
Support List and LargeList in Row format \#3159 #3251 [arrow] (tustvold)
Don't recurse to children in ArrayData::try_new #3248 [arrow] (tustvold)
Validate dictionaries read over IPC #3247 [arrow] (tustvold)
Fix MapBuilder example #3246 [arrow] (tustvold)
Loosen nullability restrictions added in #3205 \#3226 #3244 [arrow] (tustvold)
Better document implications of offsets \#3228 #3243 [arrow] (tustvold)
Add new API to validate the precision for decimal array #3242 [arrow] (liukun4515)
Move nullif to arrow-select \#2594 #3241 [arrow] (tustvold)
Casting from Decimal256 to unsigned numeric #3240 [arrow] (viirya)
Enable casting between Dictionary of DecimalArray and DecimalArray #3238 [arrow] (viirya)
Remove unwraps from 'create_primitive_array' #3232 [arrow] (aarashy)
Fix CI build by upgrading tonic-build to 0.8.4 #3231 [arrow-flight] (viirya)
Remove negative scale check #3230 [arrow] (viirya)
Update prost-build requirement from =0.11.2 to =0.11.3 #3225 [arrow-flight] (dependabot[bot])
Get the round result for decimal to a decimal with smaller scale #3224 [arrow] (liukun4515)
Move tests which require chrono-tz feature from arrow-cast to arrow #3222 [arrow] (viirya)
add test cases for extracting week with/without timezone #3218 [arrow] (waitingkuo)
Use RegexSet for matching DataType #3217 [arrow] (askoa)
Update tonic-build to 0.8.3 #3214 [arrow-flight] (tustvold)
Support StructArray in Row Format \#3159 #3212 [arrow] (tustvold)
Infer timestamps from CSV files #3209 [arrow] (Jefffrey)
fix bug: cast decimal256 to other decimal with no-safe #3208 [arrow] (liukun4515)
FlightSQL Client & integration test #3207 [arrow-flight] (avantgardnerio)
Ensure StructArrays check nullability of fields #3205 [arrow] (Jefffrey)
Remove special case ArrayData equality for decimals #3204 [arrow] (tustvold)
Add a cast test case for decimal negative scale #3203 [arrow] (viirya)
Move zip and shift kernels to arrow-select #3201 [arrow] (tustvold)
Deprecate limit kernel #3200 [arrow] (tustvold)
Use SlicesIterator for ArrayData Equality #3198 [arrow] (viirya)
Add _dyn kernels of like, ilike, nlike, nilike kernels for dictionary support #3197 [arrow] (viirya)
Adding scalar nlike_dyn, ilike_dyn, nilike_dyn kernels #3195 [arrow] (psvri)
Use self capture in DataType #3190 [arrow] (tustvold)
To pyarrow with schema #3188 [arrow] (doki23)
Support Duration in array_value_to_string #3183 [arrow] (psvri)
Support FixedSizeBinary in Row format #3182 [arrow] (tustvold)
Add binary_mut and try_binary_mut #3144 [arrow] (viirya)
Add try_unary_mut #3134 [arrow] (viirya)

28.0.0 (2022-11-25)

Full Changelog

Breaking changes:

StructArray::columns return slice #3186 [parquet] [arrow] (tustvold)
Return slice from GenericByteArray::value_data #3171 [arrow] (tustvold)
Support decimal negative scale #3152 [arrow] (viirya)
refactor: convert Field::metadata to HashMap #3148 [parquet] [arrow] (crepererum)
Don't Skip Serializing Empty Metadata \#3082 #3126 [arrow] (askoa)
Add Decimal128, Decimal256, Float16 to DataType::is_numeric #3121 [arrow] (tustvold)
Upgrade to thrift 0.17 and fix issues #3104 [parquet] [arrow] (Jimexist)
Fix prettyprint for Interval second fractions #3093 [arrow] (Jefffrey)
Remove Option from Field::metadata #3091 [parquet] [arrow] (askoa)

Implemented enhancements:

Add iterator to RowSelection #3172 [parquet]
create an integration test set for parquet crate against pyspark for working with bloom filters #3167 [parquet]
Row Format Size Tracking #3160 [arrow]
Add ArrayBuilder::finish_cloned() #3154 [arrow]
Optimize memory usage of json reader #3150
Add Field::size and DataType::size #3147 [parquet] [arrow]
Add like_utf8_scalar_dyn kernel #3145 [arrow]
support comparison for decimal128 array with scalar in kernel #3140 [arrow]
audit and create a document for bloom filter configurations #3138 [parquet]
Should be the rounding vs truncation when cast decimal to smaller scale #3137 [arrow]
Upgrade chrono to 0.4.23 #3120
Implements more temporal kernels using time_fraction_dyn #3108 [arrow]
Upgrade to thrift 0.17 #3105 [parquet] [arrow]
Be able to parse time formatted strings #3100 [arrow]
Improve "Fail to merge schema" error messages #3095 [arrow]
Expose SortingColumn when reading and writing parquet metadata #3090 [parquet]
Change Field::metadata to HashMap #3086 [parquet] [arrow]
Support bloom filter reading and writing for parquet #3023 [parquet]
API to take back ownership of an ArrayRef #2901 [arrow]
Specialized Interleave Kernel #2864 [arrow]

Fixed bugs:

arithmetic overflow leads to segfault in concat_batches #3123 [arrow]
Clippy failing on master : error: use of deprecated associated function chrono::NaiveDate::from_ymd: use from_ymd_opt() instead #3097 [parquet] [arrow]
Pretty print for interval types has wrong formatting #3092 [arrow]
Field is not serializable with binary formats #3082 [arrow]
Decimal Casts are Unchecked #2986 [arrow]

Closed issues:

Release Arrow 27.0.0 next release after `26.0.0` #3045 [parquet] [arrow] [arrow-flight]
Perf about ParquetRecordBatchStream vs ParquetRecordBatchReader #2916

Merged pull requests:

Improve regex related kernels by up to 85% #3192 [arrow] (psvri)
Derive clone for arrays #3184 [arrow] (tustvold)
Row decode cleanups #3180 [arrow] (tustvold)
Update zstd requirement from 0.11.1 to 0.12.0 #3178 [parquet] [arrow] (dependabot[bot])
Move decimal constants from arrow-data to arrow-schema crate #3177 [arrow] (mbrobbel)
bloom filter part V: add an integration with pytest against pyspark #3176 [parquet] (Jimexist)
Bloom filter config tweaks \#3023 #3175 [parquet] (tustvold)
Add RowParser #3174 [arrow] (tustvold)
Add RowSelection::iter(), Into<Vec<RowSelector>> and example #3173 [parquet] (alamb)
Add read parquet examples #3170 [parquet] (xudong963)
Faster BinaryArray to StringArray conversion ~67% #3168 [arrow] (tustvold)
Remove unnecessary downcasts in builders #3166 [arrow] (tustvold)
bloom filter part IV: adjust writer properties, bloom filter properties, and incorporate into column encoder #3165 [parquet] (Jimexist)
Fix parquet decimal precision #3164 [parquet] (psvri)
Add Row size methods \#3160 #3163 [arrow] (tustvold)
Prevent precision=0 for decimal type #3162 [arrow] (psvri)
Remove unnecessary Buffer::from_slice_ref reference #3161 [arrow] (tustvold)
Add finish_cloned to ArrayBuilder #3158 [arrow] (askoa)
Check overflow in MutableArrayData extend offsets \#3123 #3157 [arrow] (tustvold)
Extend Decimal256 as Primitive #3156 [arrow] (tustvold)
Doc improvements #3155 [arrow] (psvri)
Add collect.rs example #3153 [arrow] (viirya)
Implement Neg for i256 #3151 [arrow] (tustvold)
feat: {Field,DataType}::size #3149 [arrow] (crepererum)
Add like_utf8_scalar_dyn kernel #3146 [arrow] (viirya)
comparison op: decimal128 array with scalar #3141 [arrow] (liukun4515)
Cast: should get the round result for decimal to a decimal with smaller scale #3139 [arrow] (liukun4515)
Fix Panic on Reading Corrupt Parquet Schema \#2855 #3130 [parquet] (psvri)
Clippy parquet fixes #3124 [parquet] [arrow] (psvri)
Add GenericByteBuilder \#2969 #3122 [arrow] (tustvold)
parquet bloom filter part III: add sbbf writer, remove bloom default feature, add reader properties #3119 [parquet] (Jimexist)
Add downcast_array \#2901 #3117 [arrow] (tustvold)
Add COW conversion for Buffer and PrimitiveArray and unary_mut #3115 [arrow] (viirya)
Include field name in merge error message #3113 [arrow] (andygrove)
Add PrimitiveArray::unary_opt #3110 [arrow] (tustvold)
Implements more temporal kernels using time_fraction_dyn #3107 [arrow] (viirya)
cast: support unsigned numeric type to decimal128 #3106 [arrow] (liukun4515)
Expose SortingColumn in parquet files #3103 [parquet] (askoa)
parquet bloom filter part II: read sbbf bitset from row group reader, update API, and add cli demo #3102 [parquet] (Jimexist)
Parse Time32/Time64 from formatted string #3101 [arrow] (Jefffrey)
Cleanup temporal _internal functions #3099 [arrow] (viirya)
Improve schema mismatch error message #3098 [arrow] (askoa)
Fix clippy by avoiding deprecated functions in chrono #3096 [parquet] [arrow] (viirya)
Minor: Add diagrams and documentation to row format #3094 [arrow] (alamb)
Minor: Use ArrowNativeTypeOp instead of total_cmp directly #3087 [arrow] (viirya)
Check overflow while casting between decimal types #3076 [arrow] (viirya)
add bloom filter implementation based on split block sbbf spec #3057 [parquet] (Jimexist)
Add FixedSizeBinaryArray::try_from_sparse_iter_with_size #3054 [arrow] (maxburke)

27.0.0 (2022-11-11)

Full Changelog

Breaking changes:

Recurse into Dictionary value type in DataType::is_nested #3083 [arrow] (tustvold)
early type checks in RowConverter #3080 [arrow] (crepererum)
Add Decimal128 and Decimal256 to downcast_primitive #3056 [arrow] (viirya)
Replace remaining _generic temporal kernels with _dyn kernels #3046 [arrow] (viirya)
Replace year_generic with year_dyn #3041 [arrow] (viirya)
Validate decimal256 with i256 directly #3025 [arrow] (viirya)
Hadoop LZ4 Support for LZ4 Codec #3013 [parquet] (marioloko)
Replace hour_generic with hour_dyn #3006 [arrow] (viirya)
Accept any &dyn Array in nullif kernel #2940 [arrow] (tustvold)

Implemented enhancements:

Row Format: Option to detach/own a row #3078 [arrow]
Row Format: API to check if datatypes are supported #3077 [arrow]
Deprecate Buffer::count_set_bits #3067 [arrow]
Add Decimal128 and Decimal256 to downcast_primitive #3055 [arrow]
Improved UX of creating TimestampNanosecondArray with timezones #3042 [arrow]
Cast decimal256 to signed integer #3039 [arrow]
Support casting Date64 to Timestamp #3037 [arrow]
Check overflow when casting floating point value to decimal256 #3032 [arrow]
Compare i256 in validate_decimal256_precision #3024 [arrow]
Check overflow when casting floating point value to decimal128 #3020 [arrow]
Add macro downcast_temporal_array #3008 [arrow]
Replace hour_generic with hour_dyn #3005 [arrow]
Replace temporal _generic kernels with dyn #3004 [arrow]
Add RowSelection::intersection #3003 [parquet]
I would like to round rather than truncate when casting f64 to decimal #2997 [arrow]
arrow::compute::kernels::temporal should support nanoseconds #2995 [arrow]
Release Arrow 26.0.0 next release after `25.0.0` #2953 [parquet] [arrow] [arrow-flight]
Add timezone offset for debug format of Timestamp with Timezone #2917 [arrow]
Support merge RowSelectors when creating RowSelection #2858 [parquet]

Fixed bugs:

Inconsistent Nan Handling Between Scalar and Non-Scalar Comparison Kernels #3074 [arrow]
Debug format for timestamp ignores timezone #3069 [arrow]
Row format decode loses timezone #3063 [arrow]
binary operator produces incorrect result on arrays with resized null buffer #3061 [arrow]
RLEDecoder Panics on Null Padded Pages #3035 [parquet]
Nullif with incorrect valid_count #3031 [arrow]
RLEDecoder::get_batch_with_dict may panic on bit-packed runs longer than 1024 #3029 [parquet]
Converted type is None according to Parquet Tools when utilizing logical types #3017
CompressionCodec LZ4 incompatible with C++ implementation #2988 [parquet]

Documentation updates:

Mark parquet predicate pushdown as complete #2987 [parquet] (tustvold)

Merged pull requests:

Improved UX of creating TimestampNanosecondArray with timezones #3088 [arrow] (src255)
Remove unused range module #3085 [parquet] (tustvold)
Make intersect_row_selections a member function #3084 [parquet] (tustvold)
Update hashbrown requirement from 0.12 to 0.13 #3081 [parquet] [arrow] (dependabot[bot])
feat: add OwnedRow #3079 [arrow] (crepererum)
Use ArrowNativeTypeOp on non-scalar comparison kernels #3075 [arrow] (viirya)
Add missing inline to ArrowNativeTypeOp #3073 [arrow] (tustvold)
fix debug information for Timestamp with Timezone #3072 [arrow] (waitingkuo)
Deprecate Buffer::count_set_bits \#3067 #3071 [arrow] (tustvold)
Add compare to ArrowNativeTypeOp #3070 [arrow] (tustvold)
Minor: Improve docstrings on WriterPropertiesBuilder #3068 [parquet] (alamb)
Faster f64 inequality #3065 [arrow] (tustvold)
Fix row format decode loses timezone \#3063 #3064 [arrow] (tustvold)
Fix null_count computation in binary #3062 [arrow] (viirya)
Faster f64 equality #3060 [arrow] (tustvold)
Update arrow-flight subcrates \#3044 #3052 [arrow-flight] (tustvold)
Minor: Remove cloning ArrayData in with_precision_and_scale #3050 [arrow] (viirya)
Split out arrow-json \#3044 #3049 [arrow] (tustvold)
Move intersect_row_selections from datafusion to arrow-rs. #3047 [parquet] (Ted-Jiang)
Split out arrow-csv \#2594 #3044 [arrow] (tustvold)
Move reader_parser to arrow-cast \#3022 #3043 [arrow] (tustvold)
Cast decimal256 to signed integer #3040 [arrow] (viirya)
Enable casting from Date64 to Timestamp #3038 [arrow] (gruuya)
Fix decoding long and/or padded RLE data \#3029 \#3035 #3036 [parquet] (tustvold)
Fix nullif when existing array has no nulls #3034 [arrow] (tustvold)
Check overflow when casting floating point value to decimal256 #3033 [arrow] (viirya)
Update parquet to depend on arrow subcrates #3028 [parquet] (tustvold)
Make various i256 methods const #3026 [arrow] (tustvold)
Split out arrow-ipc #3022 [arrow] (tustvold)
Check overflow while casting floating point value to decimal128 #3021 [arrow] (viirya)
Update arrow-flight #3019 [arrow-flight] (tustvold)
Move ArrowNativeTypeOp to arrow-array \#2594 #3018 [arrow] (tustvold)
Support cast timestamp to time #3016 [arrow] (naosense)
Add filter example #3014 [arrow] (tustvold)
Check overflow when casting integer to decimal #3009 [arrow] (viirya)
Add macro downcast_temporal_array #3007 [arrow] (viirya)
Parquet Writer: Make column descriptor public on the writer #3002 [parquet] (pier-oliviert)
Update chrono-tz requirement from 0.7 to 0.8 #3001 [arrow] (dependabot[bot])
Round instead of Truncate while casting float to decimal #3000 [arrow] (waitingkuo)
Support Predicate Pushdown for Parquet Lists \#2108 #2999 [parquet] (tustvold)
Split out arrow-cast \#2594 #2998 [arrow] (tustvold)
arrow::compute::kernels::temporal should support nanoseconds #2996 [arrow] (comphead)
Add RowSelection::from_selectors_and_combine to merge RowSelectors #2994 [parquet] (Ted-Jiang)
Simplify Single-Column Dictionary Sort #2993 [arrow] (tustvold)
Minor: Add entry to changelog for 26.0.0 RC2 fix #2992 (alamb)
Fix ignored limit on lexsort_to_indices #2991 [arrow] (alamb)
Add clone and equal functions for CastOptions #2985 [arrow] (askoa)
minor: remove redundant prefix #2983 [arrow] [arrow-flight] (jackwener)
Compare dictionary decimal arrays #2982 [arrow] (viirya)
Compare dictionary and non-dictionary decimal arrays #2980 [arrow] (viirya)
Add decimal comparison kernel support #2978 [arrow] (viirya)
Move concat kernel to arrow-select \#2594 #2976 [arrow] (tustvold)
Specialize interleave for byte arrays \#2864 #2975 [arrow] (tustvold)
Use unary function for numeric to decimal cast #2973 [arrow] (viirya)
Specialize filter kernel for binary arrays \#2969 #2971 [arrow] (tustvold)
Combine take_utf8 and take_binary \#2969 #2970 [arrow] (tustvold)
Faster Scalar Dictionary Comparison ~10% #2968 [arrow] (tustvold)
Move byte_size from datafusion::physical_expr #2965 [arrow] (avantgardnerio)
Pass decompressed size to parquet Codec::decompress \#2956 #2959 [parquet] (marioloko)
Add Decimal Arithmetic #2881 [arrow] (tustvold)

26.0.0 (2022-10-28)

Full Changelog

Breaking changes:

Cast Timestamps to RFC3339 strings #2934
Remove Unused NativeDecimalType #2945 [arrow] (tustvold)
Format Timestamps as RFC3339 #2939 [arrow] (waitingkuo)
Update flatbuffers to resolve RUSTSEC-2021-0122 #2895 [arrow] (tustvold)
replace from_timestamp by from_timestamp_opt #2894 [arrow] (waitingkuo)

Implemented enhancements:

Optimized way to count the numbers of true and false values in a BooleanArray #2963 [arrow]
Add pow to i256 #2954 [arrow]
Write Generic Code over [Large]BinaryArray and [Large]StringArray #2946 [arrow]
Add Page Row Count Limit #2941 [parquet]
prettyprint to show timezone offset for timestamp with timezone #2937 [arrow]
Cast numeric to decimal256 #2922 [arrow]
Add freeze_with_dictionary API to MutableArrayData #2914 [arrow]
Support decimal256 array in sort kernels #2911 [arrow]
support [+/-]hhmm and [+/-]hh as fixedoffset timezone format #2910 [arrow]
Cleanup decimal sort function #2907 [arrow]
replace from_timestamp by from_timestamp_opt #2892 [arrow]
Move Primitive arity kernels to arrow-array #2787 [arrow]
add overflow-checking for negative arithmetic kernel #2662 [arrow]

Fixed bugs:

Subtle compatibility issue with serve_arrow #2952
error[E0599]: no method named total_cmp found for struct f16 in the current scope #2926 [arrow]
Fail at rowSelection and_then method #2925 [parquet]
Ordering not implemented for FixedSizeBinary types #2904 [arrow]
Parquet API: Could not convert timestamp before unix epoch to string/json #2897 [parquet]
Overly Pessimistic RLE Size Estimation #2889 [parquet]
Memory alignment error in RawPtrBox::new #2882 [arrow]
Compilation error under chrono-tz feature #2878 [arrow]
AHash Statically Allocates 64 bytes #2875 [parquet]
parquet::arrow::arrow_writer::ArrowWriter ignores page size properties #2853 [parquet]

Documentation updates:

Document crate topology \#2594 #2913 [arrow] (tustvold)

Closed issues:

SerializedFileWriter comments about multiple call on consumed self #2935 [parquet]
Pointer freed error when deallocating ArrayData with shared memory buffer #2874
Release Arrow 25.0.0 next release after `24.0.0` #2820 [parquet] [arrow] [arrow-flight]
Replace DecimalArray with PrimitiveArray #2637 [parquet] [arrow]

Merged pull requests:

Fix ignored limit on lexsort_to_indices (#2991) #2991 [arrow] (alamb)
Fix GenericListArray::try_new_from_array_data error message \#526 #2961 [arrow] (tustvold)
Fix take string on sliced indices #2960 [arrow] (tustvold)
Add BooleanArray::true_count and BooleanArray::false_count #2957 [arrow] (tustvold)
Add pow to i256 #2955 [arrow] (viirya)
fix datatype for timestamptz debug fmt #2948 [arrow] (waitingkuo)
Add GenericByteArray \#2946 #2947 [arrow] (tustvold)
Specialize interleave string ~2-3x faster #2944 [arrow] (tustvold)
Added support for LZ4_RAW compression. \#1604 #2943 [parquet] (marioloko)
Add optional page row count limit for parquet WriterProperties \#2941 #2942 [parquet] (tustvold)
Cleanup orphaned doc comments \#2935 #2938 [parquet] (tustvold)
support more fixedoffset tz format #2936 [arrow] (waitingkuo)
Benchmark with prepared row converter #2930 [arrow] (tustvold)
Add lexsort benchmark \#2871 #2929 [arrow] (tustvold)
Improve panic messages for RowSelection::and_then \#2925 #2928 [parquet] (tustvold)
Update required half from 2.0 --> 2.1 #2927 [arrow] (alamb)
Cast numeric to decimal256 #2923 [arrow] (viirya)
Cleanup generated proto code #2921 [arrow-flight] (tustvold)
Deprecate TimestampArray from_vec and from_opt_vec #2919 [parquet] [arrow] (tustvold)
Support decimal256 array in sort kernels #2912 [arrow] (viirya)
Add timezone abstraction #2909 [arrow] (tustvold)
Cleanup decimal sort function #2908 [arrow] (viirya)
Simplify TimestampArray from_vec with timezone #2906 [arrow] (tustvold)
Implement ord for FixedSizeBinary types #2905 [arrow] (maxburke)
Update chrono-tz requirement from 0.6 to 0.7 #2903 [arrow] (dependabot[bot])
Parquet record api support timestamp before epoch #2899 [parquet] (AnthonyPoncet)
Specialize interleave integer #2898 [arrow] (tustvold)
Support overflow-checking variant of negate kernel #2893 [arrow] (viirya)
Respect Page Size Limits in ArrowWriter \#2853 #2890 [parquet] (tustvold)
Improve row format docs #2888 [arrow] (tustvold)
Add FixedSizeList::from_iter_primitive #2887 [arrow] (tustvold)
Simplify ListArray::from_iter_primitive #2886 [arrow] (tustvold)
Split out value selection kernels into arrow-select \#2594 #2885 [arrow] (tustvold)
Increase default IPC alignment to 64 \#2883 #2884 [arrow] (tustvold)
Copying inappropriately aligned buffer in ipc reader #2883 [arrow] (viirya)
Validate decimal IPC read \#2387 #2880 [arrow] (tustvold)
Fix compilation error under chrono-tz feature #2879 [arrow] (viirya)
Don't validate decimal precision in ArrayData \#2637 #2873 [arrow] (tustvold)
Add downcast_integer and downcast_primitive #2872 [arrow] (tustvold)
Filter DecimalArray as PrimitiveArray ~5x Faster \#2637 #2870 [arrow] (tustvold)
Treat DecimalArray as PrimitiveArray in row format #2866 [arrow] (tustvold)

25.0.0 (2022-10-14)

Full Changelog

Breaking changes:

Make DecimalArray as PrimitiveArray #2857 [parquet] [arrow] (viirya)
fix timestamp parsing while no explicit timezone given #2814 [arrow] (waitingkuo)
Support Arbitrary Number of Arrays in downcast_primitive_array #2809 (tustvold)

Implemented enhancements:

Restore Integration test JSON schema serialization #2876 [arrow]
Fix various invalid_html_tags clippy error #2861 [parquet] [arrow] [arrow-flight]
Replace complicated temporal macro with generic functions #2851 [arrow]
Add NaN handling in dyn scalar comparison kernels #2829 [arrow]
Add overflow-checking variant of sum kernel #2821 [arrow]
Update to Clap 4 #2817 [parquet]
Safe API to Operate on Dictionary Values #2797 [arrow]
Add modulus op into ArrowNativeTypeOp #2753 [arrow]
Allow creating of TimeUnit instances without direct dependency on parquet-format #2708 [parquet]
Arrow Row Format #2677 [arrow]

Fixed bugs:

Don't try to infer nulls in CSV schema inference #2859 [arrow]
parquet::arrow::arrow_writer::ArrowWriter ignores page size properties #2853 [parquet]
Introducing ArrowNativeTypeOp made it impossible to call kernels from generics #2839 [arrow]
Unsound ArrayData to Array Conversions #2834 [parquet] [arrow]
Regression: the trait bound for<'de> arrow::datatypes::Schema: serde::de::Deserialize<'de> is not satisfied #2825 [arrow]
convert string to timestamp shouldn't apply local timezone offset if there's no explicit timezone info in the string #2813 [arrow]

Closed issues:

Add pub api for checking column index is sorted #2848 [parquet]

Merged pull requests:

Take decimal as primitive \#2637 #2869 [arrow] (tustvold)
Split out arrow-integration-test crate #2868 [arrow] (tustvold)
Decimal cleanup \#2637 #2865 [parquet] [arrow] (tustvold)
Fix various invalid_html_tags clippy errors #2862 [parquet] [arrow] [arrow-flight] (viirya)
Don't try to infer nullability in CSV reader #2860 [arrow] (Dandandan)
Fix page size on dictionary fallback #2854 [parquet] (thinkharderdev)
Replace complicated temporal macro with generic functions #2850 [arrow] (viirya)
parquet: Add snap option to README #2847 [parquet] (exyi)
Cleanup cast kernel #2846 [arrow] (tustvold)
Simplify ArrowNativeType #2841 [arrow] (tustvold)
Expose ArrowNativeTypeOp trait to make it useful for type bound #2840 [arrow] (viirya)
Add interleave kernel \#1523 #2838 [arrow] (tustvold)
Handle empty offsets buffer \#1824 #2836 [arrow] (tustvold)
Validate ArrayData type when converting to Array \#2834 #2835 [parquet] [arrow] (tustvold)
Derive ArrowPrimitiveType for Decimal128Type and Decimal256Type \#2637 #2833 [arrow] (tustvold)
Add NaN handling in dyn scalar comparison kernels #2830 [arrow] (viirya)
Simplify OrderPreservingInterner allocation strategy ~97% faster \#2677 #2827 [arrow] (tustvold)
Convert rows to arrays \#2677 #2826 [arrow] (tustvold)
Add overflow-checking variant of sum kernel #2822 [arrow] (viirya)
Update Clap dependency to version 4 #2819 [parquet] (jgoday)
Fix i256 checked multiplication #2818 [arrow] (tustvold)
Add string_dictionary benches for row format \#2677 #2816 [arrow] (tustvold)
Add OrderPreservingInterner::lookup \#2677 #2815 [arrow] (tustvold)
Simplify FixedLengthEncoding #2812 [arrow] (tustvold)
Implement ArrowNumericType for Float16Type #2810 [arrow] (tustvold)
Add DictionaryArray::with_values to make it easier to operate on dictionary values #2798 [arrow] (tustvold)
Add i256 \#2637 #2781 [arrow] (tustvold)
Add modulus ops into ArrowNativeTypeOp #2756 [arrow] (HaoYang670)
feat: cast List / LargeList to Utf8 / LargeUtf8 #2588 [arrow] (gandronchik)

24.0.0 (2022-09-30)

Full Changelog

Breaking changes:

Cleanup ArrowNativeType \#1918 #2793 [parquet] [arrow] (tustvold)
Remove ArrowNativeType::FromStr #2775 [arrow] (tustvold)
Split out arrow-array crate \#2594 #2769 [arrow] (tustvold)
Add dyn_arith_dict feature flag #2760 [arrow] (tustvold)
Split out arrow-data into a separate crate #2746 [arrow] (tustvold)
Split out arrow-schema \#2594 #2711 [arrow] (tustvold)

Implemented enhancements:

Include field name in Parquet PrimitiveTypeBuilder error messages #2804 [parquet]
Add PrimitiveArray::reinterpret_cast #2785
BinaryBuilder and StringBuilder initialization parameters in struct_builder may be wrong #2783 [arrow]
Add divide scalar dyn kernel which produces null for division by zero #2767 [arrow]
Add divide dyn kernel which produces null for division by zero #2763 [arrow]
Improve performance of checked kernels on non-null data #2747 [arrow]
Add overflow-checking variants of arithmetic dyn kernels #2739 [arrow]
The binary function should not panic on unequaled array length. #2721 [arrow]

Fixed bugs:

min compute kernel is incorrect with sliced buffers in arrow 23 #2779 [arrow]
try_unary_dict should check value type of dictionary array #2754 [arrow]

Closed issues:

Add back JSON import/export for schema #2762
null casting and coercion for Decimal128 #2761
Json decoder behavior changed from versions 21 to 21 and returns non-sensical num_rows for RecordBatch #2722 [arrow]
Release Arrow 23.0.0 next release after `22.0.0` #2665 [parquet] [arrow] [arrow-flight]

Merged pull requests:

add field name to parquet PrimitiveTypeBuilder error messages #2805 [parquet] (andygrove)
Add struct equality test case \#514 #2791 [arrow] (tustvold)
Move unary kernels to arrow-array \#2787 #2789 [arrow] (tustvold)
Disable test harness for string_dictionary_builder benchmark #2788 [arrow] (tustvold)
Add PrimitiveArray::reinterpret_cast \#2785 #2786 (tustvold)
Fix BinaryBuilder and StringBuilder Capacity Allocation in StructBuilder #2784 (chunshao90)
Fix min/max computation for sliced arrays \#2779 #2780 [arrow] (tustvold)
Fix Backwards Compatible Parquet List Encodings \#1915 #2774 [parquet] (tustvold)
MINOR: Fix clippy for rust 1.64.0 #2772 [parquet] [arrow] (viirya)
MINOR: Fix clippy for rust 1.64.0 #2771 (viirya)
Add divide scalar dyn kernel which produces null for division by zero #2768 [arrow] (viirya)
Add divide dyn kernel which produces null for division by zero #2764 [arrow] (viirya)
Add value type check in try_unary_dict #2755 [arrow] (viirya)
Fix verify_release_candidate.sh for new arrow subcrates #2752 (alamb)
Fix: Issue 2721 : binary function should not panic but return error w… #2750 [arrow] (aksharau)
Speed up checked kernels for non-null data ~1.4-5x faster #2749 [arrow] (Dandandan)
Add overflow-checking variants of arithmetic dyn kernels #2740 [arrow] (viirya)
Trim parquet row selection #2705 [parquet] (tustvold)

23.0.0 (2022-09-16)

Full Changelog

Breaking changes:

Move JSON Test Format To integration-testing #2724 [arrow] (tustvold)
Split out arrow-buffer crate \#2594 #2693 [arrow] (tustvold)
Simplify DictionaryBuilder constructors \#2684 \#2054 #2685 [parquet] [arrow] (tustvold)
Deprecate RecordBatch::concat replace with concat_batches \#2594 #2683 [arrow] (tustvold)
Add overflow-checking variant for primitive arithmetic kernels and explicitly define overflow behavior #2643 [arrow] (viirya)
Update thrift v0.16 and vendor parquet-format \#2502 #2626 [parquet] (tustvold)
Update flight definitions including backwards-incompatible change to GetSchema #2586 [arrow] [arrow-flight] (liukun4515)

Implemented enhancements:

Cleanup like and nlike utf8 kernels #2744 [arrow]
Speedup eq and neq kernels for utf8 arrays #2742 [arrow]
API for more ergonomic construction of RecordBatchOptions #2728 [arrow]
Automate updates to CHANGELOG-old.md #2726
Don't check the DivideByZero error for float modulus #2720 [arrow]
try_binary should not panic on unequaled array length. #2715 [arrow]
Add benchmark for bitwise operation #2714 [arrow]
Add overflow-checking variants of arithmetic scalar dyn kernels #2712 [arrow]
Add divide_opt kernel which produce null values on division by zero error #2709 [arrow]
Add DataType function to detect nested types #2704 [arrow]
Add support of sorting dictionary of other primitive types #2700 [arrow]
Sort indices of dictionary string values #2697 [arrow]
Support empty projection in RecordBatch::project #2690 [arrow]
Support sorting dictionary encoded primitive integer arrays #2679 [arrow]
Use BitIndexIterator in min_max_helper #2674 [arrow]
Support building comparator for dictionaries of primitive integer values #2672 [arrow]
Change max/min string macro to generic helper function min_max_helper #2657 [arrow]
Add overflow-checking variant of arithmetic scalar kernels #2651 [arrow]
Compare dictionary with binary array #2644 [arrow]
Add overflow-checking variant for primitive arithmetic kernels #2642 [arrow]
Use downcast_primitive_array in arithmetic kernels #2639 [arrow]
Support DictionaryArray in temporal kernels #2622 [arrow]
Inline Generated Thrift Code Into Parquet Crate #2502 [parquet]

Fixed bugs:

Escape contains patterns for utf8 like kernels #2745 [arrow]
Float Array should not panic on DivideByZero in the Divide kernel #2719 [arrow]
DictionaryBuilders can Create Invalid DictionaryArrays #2684 [parquet] [arrow]
arrow crate does not build with features = ["ffi"] and default_features = false. #2670 [arrow]
Invalid results with RowSelector having row_count of 0 #2669 [parquet]
clippy error: unresolved import crate::array::layout #2659 [arrow]
Cast the numeric without the CastOptions #2648 [arrow]
Explicitly define overflow behavior for primitive arithmetic kernels #2641 [arrow]
update the flight.proto and fix schema to SchemaResult #2571 [arrow] [arrow-flight]
Panic when first data page is skipped using ColumnChunkData::Sparse #2543 [parquet]
SchemaResult in IPC deviates from other implementations #2445 [arrow] [arrow-flight]

Closed issues:

Implement collect for int values #2696 [arrow]

Merged pull requests:

Speedup string equal/not equal to empty string, cleanup like/ilike kernels, fix escape bug #2743 [arrow] (Dandandan)
Partially flatten arrow-buffer #2737 [arrow] (tustvold)
Automate updates to CHANGELOG-old.md #2732 (iajoiner)
Update read parquet example in parquet/arrow home #2730 [parquet] (datapythonista)
Better construction of RecordBatchOptions #2729 [arrow] (askoa)
benchmark: bitwise operation #2718 [arrow] (liukun4515)
Update try_binary and checked_ops, and remove math_checked_op #2717 [arrow] (HaoYang670)
Support bitwise op in kernel: or,xor,not #2716 [arrow] (liukun4515)
Add overflow-checking variants of arithmetic scalar dyn kernels #2713 [arrow] (viirya)
Add divide_opt kernel which produce null values on division by zero error #2710 [arrow] (viirya)
Add DataType::is_nested() #2707 [arrow] (kfastov)
Update criterion requirement from 0.3 to 0.4 #2706 [parquet] [arrow] (dependabot[bot])
Support bitwise and operation in the kernel #2703 [arrow] (liukun4515)
Add support of sorting dictionary of other primitive arrays #2701 [arrow] (viirya)
Clarify docs of binary and string builders #2699 [arrow] (datapythonista)
Sort indices of dictionary string values #2698 [arrow] (viirya)
Add support for empty projection in RecordBatch::project #2691 [arrow] (Dandandan)
Temporarily disable Golang integration tests re-enable JS #2689 (tustvold)
Verify valid UTF-8 when converting byte array \#2205 #2686 [arrow] (tustvold)
Support sorting dictionary encoded primitive integer arrays #2680 [arrow] (viirya)
Skip RowSelectors with zero rows #2678 [parquet] (askoa)
Faster Null Path Selection in ArrayData Equality #2676 [arrow] (dhruv9vats)
Use BitIndexIterator in min_max_helper #2675 [arrow] (viirya)
Support building comparator for dictionaries of primitive integer values #2673 [arrow] (viirya)
json feature always requires base64 feature #2668 [parquet] (eagletmt)
Add try_unary, binary, try_binary kernels ~90% faster #2666 [arrow] (tustvold)
Use downcast_dictionary_array in unary_dyn #2663 [arrow] (tustvold)
optimize the numeric_cast_with_error #2661 [arrow] (liukun4515)
ffi feature also requires layout #2660 [arrow] (viirya)
Change max/min string macro to generic helper function min_max_helper #2658 [arrow] (viirya)
Fix flaky test test_fuzz_async_reader_selection #2656 [parquet] (thinkharderdev)
MINOR: Ignore flaky test test_fuzz_async_reader_selection #2655 [parquet] (viirya)
MutableBuffer::typed_data - shared ref access to the typed slice #2652 [arrow] (medwards)
Overflow-checking variant of arithmetic scalar kernels #2650 [arrow] (viirya)
support CastOption for casting numeric #2649 [arrow] (liukun4515)
Help LLVM vectorize comparison kernel ~50-80% faster #2646 [arrow] (tustvold)
Support comparison between dictionary array and binary array #2645 [arrow] (viirya)
Use downcast_primitive_array in arithmetic kernels #2640 [arrow] (viirya)
Fully qualifying parquet items #2638 (dingxiangfei2009)
Support DictionaryArray in temporal kernels #2623 [arrow] (viirya)
Comparable Row Format #2593 [arrow] (tustvold)
Fix bug in page skipping #2552 [parquet] (thinkharderdev)
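
Several entries above concern overflow behavior and division by zero in arithmetic kernels (#2641, #2713, #2710, #2650). As a rough illustration of the three behaviors involved, here is a minimal sketch using only the Rust standard library on plain slices. It does not use the arrow kernels, and the helper names `add_wrapping`, `add_checked`, and `divide_opt` below are hypothetical stand-ins, not the crate's API.

```rust
/// Element-wise add that wraps around on overflow
/// (hypothetical helper for illustration, not an arrow kernel).
fn add_wrapping(a: &[i32], b: &[i32]) -> Vec<i32> {
    a.iter().zip(b).map(|(x, y)| x.wrapping_add(*y)).collect()
}

/// Element-wise add that reports overflow as an error instead of wrapping.
fn add_checked(a: &[i32], b: &[i32]) -> Result<Vec<i32>, String> {
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            x.checked_add(*y)
                .ok_or_else(|| format!("overflow adding {x} and {y}"))
        })
        .collect()
}

/// Element-wise divide that yields None (standing in for a null slot)
/// on division by zero, and also on the i32::MIN / -1 overflow case.
fn divide_opt(a: &[i32], b: &[i32]) -> Vec<Option<i32>> {
    a.iter().zip(b).map(|(x, y)| x.checked_div(*y)).collect()
}

fn main() {
    let a = [i32::MAX, 10, 7];
    let b = [1, 0, 2];
    println!("{:?}", add_wrapping(&a, &b));
    println!("{:?}", add_checked(&a, &b));
    println!("{:?}", divide_opt(&a, &b));
}
```

The checked variant trades a little speed for an explicit error, while the `Option`-returning divide keeps the batch going and marks only the offending slots.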

22.0.0 (2022-09-02)

Full Changelog

Breaking changes:

Use total_cmp for floating value ordering and remove nan_ordering feature flag #2614 [arrow] (viirya)
Gate dyn comparison of dictionary arrays behind dyn_cmp_dict #2597 [arrow] (tustvold)
Move JsonSerializable to json module (#2300) #2595 [arrow] (tustvold)
Decimal precision scale datatype change #2532 [parquet] [arrow] (psvri)
Refactor PrimitiveBuilder Constructors #2518 [parquet] [arrow] (psvri)
Refactoring DecimalBuilder constructors #2517 [arrow] (psvri)
Refactor FixedSizeBinaryBuilder Constructors #2516 [parquet] [arrow] (psvri)
Refactor BooleanBuilder Constructors #2515 [arrow] (psvri)
Refactor UnionBuilder Constructors #2488 [arrow] (psvri)
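
The first breaking change above (#2614) switches floating-point ordering to `total_cmp`. To give a feel for what that ordering means, the sketch below uses only the standard library's `f64::total_cmp` (stable since Rust 1.62) rather than the arrow kernels themselves. The helper name `sort_total` is hypothetical.

```rust
use std::cmp::Ordering;

/// Sorts floats by the IEEE 754 total order
/// (hypothetical helper for illustration, not part of arrow).
fn sort_total(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}

fn main() {
    // Force a positive sign bit so the NaN's position in the total order is well defined.
    let nan = f64::NAN.copysign(1.0);

    // partial_cmp cannot order NaN against anything.
    assert_eq!(nan.partial_cmp(&1.0), None);
    // total_cmp places a positive NaN after every other value, including +infinity.
    assert_eq!(nan.total_cmp(&f64::INFINITY), Ordering::Greater);
    // total_cmp also distinguishes -0.0 from +0.0.
    assert_eq!((-0.0_f64).total_cmp(&0.0), Ordering::Less);

    let sorted = sort_total(vec![nan, 1.0, -0.0, 0.0, f64::NEG_INFINITY]);
    println!("{:?}", sorted);
}
```

Because every value, NaN included, has a defined position, a sort based on this ordering is deterministic without a separate NaN-handling feature flag.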

Implemented enhancements:

Add Macros to assist with static dispatch #2635 [arrow]
Support comparison between DictionaryArray and BooleanArray #2617 [arrow]
Use total_cmp for floating value ordering and remove nan_ordering feature flag #2613 [arrow]
Support empty projection in CSV, JSON readers #2603 [arrow]
Support SQL-compliant NaN ordering between DictionaryArray and non-DictionaryArray #2599 [arrow]
Add dyn_cmp_dict feature flag to gate dyn comparison of dictionary arrays #2596 [arrow]
Add max_dyn and min_dyn for max/min for dictionary array #2584 [arrow]
Allow FlightSQL implementers to extend do_get() #2581 [arrow-flight]
Support SQL-compliant behavior on eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #2569 [arrow]
Add sql-compliant feature for enabling sql-compliant kernel behavior #2568
Calculate sum for dictionary array #2565 [arrow]
Add test for float nan comparison #2556 [arrow]
Compare dictionary with string array #2548 [arrow]
Compare dictionary with primitive array in lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #2538 [arrow]
Compare dictionary with primitive array in eq_dyn and neq_dyn #2535 [arrow]
UnionBuilder Create Children With Capacity #2523 [arrow]
Speed up like_utf8_scalar for %pat% #2519 [arrow]
Replace macro with TypedDictionaryArray in comparison kernels #2513 [arrow]
Use same codebase for boolean kernels #2507 [arrow]
Use u8 for Decimal Precision and Scale #2496 [arrow]
Integrate skip row without pageIndex in SerializedPageReader in Fuzz Test #2475 [parquet]
Avoid unnecessary copies in Arrow IPC reader #2437 [arrow]
Add GenericColumnReader::skip_records Missing OffsetIndex Fallback #2433 [parquet]
Support Reading PageIndex with ParquetRecordBatchStream #2430 [parquet]
Specialize FixedLenByteArrayReader for Parquet #2318 [parquet]
Make JSON support Optional via Feature Flag #2300 [arrow]

Fixed bugs:

Casting timestamp array to string should not ignore timezone #2607 [arrow]
ilike_utf8_scalar kernels have incorrect logic #2544 [arrow]
Always validate the array data when creating array in IPC reader #2541 [arrow]
Int96Converter Truncates Timestamps #2480 [parquet]
Error Reading Page Index When Not Available #2434 [parquet]
ParquetFileArrowReader::get_record_reader[_by_column] batch_size overallocates #2321 [parquet]

Documentation updates:

Document All Arrow Features in docs.rs #2633 [arrow]

Closed issues:

Add support for CAST from Interval(DayTime) to Timestamp(Nanosecond, None) #2606 [arrow]
Why do we check for null in TypedDictionaryArray value function #2564 [arrow]
Add the length field for Buffer #2524 [arrow]
Avoid large buffer over-allocation in async reader #2512 [parquet]
Rewriting Decimal Builders using const_generic. #2390 [arrow]
Rewrite Decimal Array using const_generic #2384 [arrow]

Merged pull requests:

Add downcast macros (#2635) #2636 [arrow] (tustvold)
Document all arrow features in docs.rs (#2633) #2634 [arrow] (tustvold)
Document dyn_cmp_dict #2624 [arrow] (tustvold)
Support comparison between DictionaryArray and BooleanArray #2618 [arrow] (viirya)
Cast timestamp array to string array with timezone #2608 [arrow] (viirya)
Support empty projection in CSV and JSON readers #2604 [arrow] (Dandandan)
Make JSON support optional via a feature flag (#2300) #2601 [parquet] [arrow] (tustvold)
Support SQL-compliant NaN ordering for DictionaryArray and non-DictionaryArray #2600 [arrow] (viirya)
Split out integration test plumbing (#2594) (#2300) #2598 [arrow] (tustvold)
Refactor Binary Builder and String Builder Constructors #2592 [parquet] [arrow] (psvri)
Dictionary like scalar kernels #2591 [arrow] (psvri)
Validate dictionary key in TypedDictionaryArray (#2578) #2589 [arrow] (tustvold)
Add max_dyn and min_dyn for max/min for dictionary array #2585 [arrow] (viirya)
Code cleanup of array value functions #2583 [arrow] (psvri)
Allow overriding of do_get & export useful macro #2582 [arrow-flight] (avantgardnerio)
MINOR: Upgrade to pyo3 0.17 #2576 [arrow] (andygrove)
Support SQL-compliant NaN behavior on eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #2570 [arrow] (viirya)
Add sum_dyn to calculate sum for dictionary array #2566 [arrow] (viirya)
struct UnionBuilder will create child buffers with capacity #2560 [arrow] (kastolars)
Don't panic on RleValueEncoder::flush_buffer if empty (#2558) #2559 [parquet] (tustvold)
Add the length field for Buffer and use more Buffer in IPC reader to avoid memory copy. #2557 [arrow] [arrow-flight] (HaoYang670)
Add test for float nan comparison #2555 [arrow] (viirya)
Compare dictionary array with string array #2549 [arrow] (viirya)
Always validate the array data except the `Decimal` when creating array in IPC reader #2547 [arrow] (HaoYang670)
MINOR: Fix test_row_type_validation test #2546 [arrow] (viirya)
Fix ilike_utf8_scalar kernels #2545 [arrow] (psvri)
fix typo #2540 (00Masato)
Compare dictionary array and primitive array in lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn kernels #2539 [arrow] (viirya)
Compare dictionary with primitive array in eq_dyn and neq_dyn #2533 [arrow] (viirya)
Add iterator for FixedSizeBinaryArray #2531 [arrow] (tustvold)
add bench: decimal with byte array and fixed length byte array #2529 [parquet] (liukun4515)
Add FixedLengthByteArrayReader Remove ComplexObjectArrayReader #2528 [parquet] (tustvold)
Split out byte array decoders (#2318) #2527 [parquet] (tustvold)
Use offset index in ParquetRecordBatchStream #2526 [parquet] (thinkharderdev)
Clean the create_array in IPC reader. #2525 [arrow] (HaoYang670)
Remove DecimalByteArrayConvert (#2480) #2522 [parquet] (tustvold)
Improve performance of %pat% (>3x speedup) #2521 [arrow] (Dandandan)
remove len field from MapBuilder #2520 [arrow] (psvri)
Replace macro with TypedDictionaryArray in comparison kernels #2514 [arrow] (viirya)
Avoid large buffer over-allocation in sync reader #2511 [parquet] (Ted-Jiang)
Avoid useless memory copies in IPC reader. #2510 [arrow] (HaoYang670)
Refactor boolean kernels to use same codebase #2508 [arrow] (viirya)
Remove Int96Converter (#2480) #2481 [parquet] (tustvold)

21.0.0 (2022-08-18)

Full Changelog

Breaking changes:

Return structured ColumnCloseResult (#2465) #2466 [parquet] (tustvold)
Push ChunkReader into SerializedPageReader (#2463) #2464 [parquet] (tustvold)
Revise FromIterator for Decimal128Array to use Into instead of Borrow #2442 [parquet] [arrow] (viirya)
Use Fixed-Length Array in BasicDecimal new and raw_value #2405 [arrow] (HaoYang670)
Remove deprecated ParquetWriter #2380 [parquet] (tustvold)
Remove deprecated SliceableCursor and InMemoryWriteableCursor #2378 [parquet] (tustvold)

Implemented enhancements:

add into_inner method to ArrowWriter #2491 [parquet]
Remove byteorder dependency #2472 [parquet]
Return Structured ColumnCloseResult from GenericColumnWriter::close #2465 [parquet]
Push ChunkReader into SerializedPageReader #2463 [parquet]
Support SerializedPageReader::skip_page without OffsetIndex #2459 [parquet]
Support Time64/Time32 comparison #2457 [arrow]
Revise FromIterator for Decimal128Array to use Into instead of Borrow #2441 [parquet]
Support RowFilter within ParquetRecordBatchReader #2431 [parquet]
Remove the field StructBuilder::len #2429 [arrow]
Standardize creation and configuration of parquet --> Arrow readers (ParquetRecordBatchReaderBuilder) #2427 [parquet]
Use OffsetIndex to Prune IO in ParquetRecordBatchStream #2426 [parquet]
Support peek_next_page and skip_next_page in InMemoryPageReader #2406 [parquet]
Support casting from Utf8/LargeUtf8 to Binary/LargeBinary #2402 [arrow]
Support casting between Decimal128 and Decimal256 arrays #2375 [arrow]
Combine multiple selections into the same batch size in skip_records #2358 [parquet]
Add API to change timezone for timestamp array #2346 [arrow]
Change the output of read_buffer Arrow IPC API to return Result<_> #2342 [arrow]
Allow skip_records in GenericColumnReader to skip across row groups #2331 [parquet]
Optimize the validation of Decimal256 #2320 [arrow]
Implement Skip for DeltaBitPackDecoder #2281 [parquet]
Changes to ParquetRecordBatchStream to support row filtering in DataFusion #2270 [parquet]
Add ArrayReader::skip_records API #2197 [parquet]

Fixed bugs:

Panic in SerializedPageReader without offset index #2503 [parquet]
MapArray columns don't handle null values correctly #2484 [arrow]
There is no compiler error when using an invalid Decimal type. #2440 [arrow]
Flight SQL Server sends incorrect response for DoPutUpdateResult #2403 [arrow-flight]
AsyncFileReader No Longer Object-Safe #2372 [parquet]
StructBuilder Does not Verify Child Lengths #2252 [arrow]

Closed issues:

Combine DecimalArray validation #2447 [arrow]

Merged pull requests:

Fix bug in page skipping #2504 [parquet] (thinkharderdev)
Fix MapArrayReader (#2484) (#1699) (#1561) #2500 [parquet] (tustvold)
Add API to Retrieve Finished Writer from Parquet Writer #2498 [parquet] (jiacai2050)
Derive Copy,Clone for BasicDecimal #2495 [arrow] (tustvold)
remove byteorder dependency from parquet #2486 [parquet] (psvri)
parquet-read: add support to read parquet data from stdin #2482 [parquet] (nvartolomei)
Remove Position trait (#1163) #2479 [parquet] (tustvold)
Add ChunkReader::get_bytes #2478 [parquet] (tustvold)
RFC: Simplify decimal (#2440) #2477 [arrow] (tustvold)
Use Parquet OffsetIndex to prune IO with RowSelection #2473 [parquet] (thinkharderdev)
Remove unnecessary Option from Int96 #2471 [parquet] (tustvold)
remove len field from StructBuilder #2468 [arrow] (psvri)
Make Parquet reader filter APIs public (#1792) #2467 [parquet] (tustvold)
enable ipc compression feature for integration test #2462 (liukun4515)
Simplify implementation of Schema #2461 [arrow] (HaoYang670)
Support skip_page missing OffsetIndex Fallback in SerializedPageReader #2460 [parquet] (Ted-Jiang)
support time32/time64 comparison #2458 [arrow] (waitingkuo)
Utf8array casting #2456 [arrow] (psvri)
Remove outdated license text #2455 (alamb)
Support RowFilter within ParquetRecordBatchReader (#2431) #2452 [parquet] (tustvold)
benchmark: decimal builder and vec to decimal array #2450 [arrow] (liukun4515)
Collocate Decimal Array Validation Logic #2446 [arrow] (liukun4515)
Minor: Move From trait for Decimal256 impl to decimal.rs #2443 [arrow] (liukun4515)
decimal benchmark: arrow reader decimal from parquet int32 and int64 #2438 [parquet] (liukun4515)
MINOR: Simplify split_second function #2436 [arrow] (viirya)
Add ParquetRecordBatchReaderBuilder (#2427) #2435 [parquet] (tustvold)
refactor: refine validation for decimal128 array #2428 [arrow] (liukun4515)
Benchmark of casting decimal arrays #2424 [arrow] (viirya)
Test non-annotated repeated fields (#2394) #2422 [parquet] (tustvold)
Fix #2416 Automatic version updates for github actions with dependabot #2417 (iemejia)
Add validation logic for StructBuilder::finish #2413 [arrow] (psvri)
test: add test for reading decimal value from primitive array reader #2411 [parquet] (liukun4515)
Upgrade ahash to 0.8 #2410 [parquet] [arrow] (Dandandan)
Support peek_next_page and skip_next_page in InMemoryPageReader #2407 [parquet] (Ted-Jiang)
Fix DoPutUpdateResult #2404 [arrow-flight] (avantgardnerio)
Implement Skip for DeltaBitPackDecoder #2393 [parquet] (Ted-Jiang)
fix: Don't instantiate the scalar composition code quadratically for dictionaries #2391 [arrow] (Marwes)
MINOR: Remove unused trait and some cleanup #2389 [arrow] (viirya)
Decouple parquet fuzz tests from converter (#1661) #2386 [parquet] (tustvold)
Rewrite Decimal and DecimalArray using const_generic #2383 [parquet] [arrow] (HaoYang670)
Simplify BitReader ~5-10% faster #2381 [parquet] (tustvold)
Fix parquet clippy lints (#1254) #2377 [parquet] (tustvold)
Cast between Decimal128 and Decimal256 arrays #2376 [arrow] (viirya)
support compression for IPC with revamped feature flags #2369 [arrow] (alamb)
Implement AsyncFileReader for Box<dyn AsyncFileReader> #2368 [parquet] (tustvold)
Remove get_byte_ranges where bound #2366 [parquet] (tustvold)
refactor: Make read_num_bytes a function instead of a macro #2364 [parquet] (Marwes)
refactor: Group metrics into page and column metrics structs #2363 [parquet] (Marwes)
Speed up Decimal256 validation based on bytes comparison and add benchmark test #2360 [parquet] [arrow] (liukun4515)
Combine multiple selections into the same batch size in skip_records #2359 [parquet] (Ted-Jiang)
Add API to change timezone for timestamp array #2347 [arrow] (viirya)
Clean the code in field.rs and add more tests #2345 [arrow] (HaoYang670)
Add Parquet RowFilter API #2335 [parquet] (tustvold)
Make skip_records in complex_object_array able to skip across row groups #2332 [parquet] (Ted-Jiang)
Integrate Record Skipping into Column Reader Fuzz Test #2315 [parquet] (Ted-Jiang)

20.0.0 (2022-08-05)

Full Changelog

Breaking changes:

Add more const evaluation for GenericBinaryArray and GenericListArray: add PREFIX and data type constructor #2327 [parquet] [arrow] (HaoYang670)
Make FFI support optional, change APIs to be safe (#2302) #2303 [arrow] (tustvold)
Remove test_utils from default features (#2298) #2299 [arrow] (tustvold)
Rename DataType::Decimal to DataType::Decimal128 #2229 [parquet] [arrow] (viirya)
Add Decimal128Iter and Decimal256Iter and do maximum precision/scale check #2140 [arrow] (viirya)

Implemented enhancements:

Add the constant data type constructors for ListArray #2311 [arrow]
Update FlightSqlService trait to pass session info along #2308 [arrow-flight]
Optimize take_bits for non-null indices #2306 [arrow]
Make FFI support optional via Feature Flag ffi #2302 [arrow]
Mark ffi::ArrowArray::try_new is safe #2301 [arrow]
Remove test_utils from default arrow-rs features #2298 [arrow]
Remove JsonEqual trait #2296 [arrow]
Move with_precision_and_scale to Decimal array traits #2291 [arrow]
Improve readability and maybe performance of string --> numeric/time/date/timestamp cast kernels #2285 [arrow]
Add vectorized unpacking for 8, 16, and 64 bit integers #2276 [parquet]
Use initial capacity for interner hashmap #2273 [arrow]
Impl FromIterator for Decimal256Array #2248 [arrow]
Separate ArrayReader::next_batch with ArrayReader::read_records and ArrayReader::consume_batch #2236 [parquet]
Rename DataType::Decimal to DataType::Decimal128 #2228 [arrow]
Automatically Grow Parquet BitWriter Buffer #2226 [parquet]
Add append_option support to Decimal128Builder and Decimal256Builder #2224 [arrow]
Split the FixedSizeBinaryArray and FixedSizeListArray from array_binary.rs and array_list.rs #2217 [arrow]
Don't Box Values in PrimitiveDictionaryBuilder #2215 [arrow]
Use BitChunks in equal_bits #2186 [arrow]
Implement Hash for Schema #2182 [arrow]
read decimal data type from parquet file with binary physical type #2159 [parquet]
The GenericStringBuilder should use GenericBinaryBuilder #2156 [arrow]
Update Rust version to 1.62 #2143 [parquet] [arrow] [arrow-flight]
Check precision and scale against maximum value when constructing Decimal128 and Decimal256 #2139 [arrow]
Use ArrayAccessor in Decimal128Iter and Decimal256Iter #2138 [arrow]
Use ArrayAccessor and FromIterator in Cast Kernels #2137 [arrow]
Add TypedDictionaryArray for more ergonomic interaction with DictionaryArray #2136 [arrow]
Use ArrayAccessor in Comparison Kernels #2135 [arrow]
Support peek_next_page() and skip_next_page in InMemoryColumnChunkReader #2129 [parquet]
Lazily materialize the null buffer builder for all array builders. #2125 [arrow]
Do value validation for Decimal256 #2112 [arrow]
Support skip_def_levels for ColumnLevelDecoder #2107 [parquet]
Add integration test for scan rows with selection #2106 [parquet]
Support for casting from Utf8/String to Time32 / Time64 #2053 [arrow]
Update prost and tonic related crates #2268 [arrow-flight] (carols10cents)

Fixed bugs:

temporal conversion functions cannot work on negative input properly #2325 [arrow]
IPC writer should truncate string array with all empty string #2312 [arrow]
Error order for comparing Decimal128 or Decimal256 #2256 [arrow]
Fix maximum and minimum for decimal values for precision greater than 38 #2246 [arrow]
IntervalMonthDayNanoType::make_value() does not match C implementation #2234 [arrow]
FlightSqlService trait does not allow impls to do handshake #2210 [arrow-flight]
EnabledStatistics::None not working #2185 [parquet]
Boolean ArrayData Equality Incorrect Slice Handling #2184 [arrow]
Publicly export MapFieldNames #2118 [arrow]

Documentation updates:

Update instructions on How to join the slack #arrow-rust channel -- or maybe try to switch to discord?? #2192

Performance improvements:

Improve speed of writing string dictionaries to parquet by skipping a copy (#1764) #2322 [parquet] [arrow] (tustvold)

Closed issues:

Fix wrong logic in calculate_row_count when skipping values #2328 [parquet]
Support filter for parquet data type #2126 [parquet]
Make skip value in ByteArrayDecoderDictionary avoid decoding #2088 [parquet]

Merged pull requests:

fix: Fix skip error in calculate_row_count. #2329 [parquet] (Ted-Jiang)
temporal conversion functions should work on negative input properly #2326 [arrow] (viirya)
Increase DeltaBitPackEncoder miniblock size to 64 for 64-bit integers (#2282) #2319 [parquet] (tustvold)
Remove JsonEqual #2317 [parquet] [arrow] (viirya)
fix: IPC writer should truncate string array with all empty string #2314 [arrow] (JasonLi-cn)
Pass pull Request<FlightDescriptor> to FlightSqlService impls #2309 [parquet] [arrow-flight] (avantgardnerio)
Speedup take_boolean / take_bits for non-null indices ~4 - 5x speedup #2307 [arrow] (Dandandan)
Add typed dictionary (#2136) #2297 [arrow] (tustvold)
Move with_precision_and_scale to BasicDecimalArray trait #2292 [parquet] [arrow] (viirya)
Replace the fn get_data_type by const DATA_TYPE in BinaryArray and StringArray #2289 [arrow] (HaoYang670)
Clean up string casts and improve performance #2284 [arrow] (alamb)
Add unpack8, unpack16, unpack64 (#2276) ~10-50% faster #2278 [parquet] (tustvold)
Fix bugs in the from_list function. #2277 [arrow] (HaoYang670)
fix: use signed comparator to compare decimal128 and decimal256 #2275 [arrow] (liukun4515)
Use initial capacity for interner hashmap #2272 [parquet] (Dandandan)
Remove fallibility from parquet RleEncoder (#2226) #2259 [parquet] (tustvold)
Fix escaped like wildcards in like_utf8 / nlike_utf8 kernels #2258 [arrow] (daniel-martinez-maqueda-sap)
Add tests for reading nested decimal arrays from parquet #2254 [parquet] (tustvold)
feat: Implement string cast operations for Time32 and Time64 #2251 [arrow] (stuartcarnie)
move FixedSizeList to array_fixed_size_list.rs #2250 [arrow] (HaoYang670)
Impl FromIterator for Decimal256Array #2247 [arrow] (viirya)
Fix max and min value for decimal precision greater than 38 #2245 [arrow] (viirya)
Make Schema::fields and Schema::metadata pub (public) #2239 [arrow] (alamb)
Separate ArrayReader::next_batch with read_records and consume_batch #2237 [parquet] (Ted-Jiang)
Update IntervalMonthDayNanoType::make_value() to conform to specifications #2235 [arrow] (avantgardnerio)
Disable value validation for Decimal256 case #2232 [arrow] (viirya)
Automatically grow parquet BitWriter (#2226) ~10% faster #2231 [parquet] (tustvold)
Only trigger arrow CI on changes to arrow #2227 (alamb)
Add append_option support to decimal builders #2225 [arrow] (bphillips-exos)
Optimized writing of byte array to parquet (#1764) 2x faster #2221 [parquet] (tustvold)
Increase test coverage of ArrowWriter #2220 [parquet] (tustvold)
Update instructions on how to join the Slack channel #2219 (HaoYang670)
Move FixedSizeBinaryArray to array_fixed_size_binary.rs #2218 [arrow] (HaoYang670)
Avoid boxing in PrimitiveDictionaryBuilder #2216 [arrow] (tustvold)
remove redundant CI benchmark check, cleanups #2212 [parquet] (alamb)
Update FlightSqlService trait to proxy handshake #2211 [arrow-flight] (avantgardnerio)
parquet: export json api with serde_json feature name #2209 [parquet] (flisky)
Cleanup record skipping logic and tests (#2158) #2199 [parquet] (tustvold)
Use BitChunks in equal_bits #2194 [arrow] (tustvold)
Fix disabling parquet statistics (#2185) #2191 [parquet] (tustvold)
Change CI names to match crate names #2189 (alamb)
Fix offset handling in boolean_equal (#2184) #2187 [arrow] (tustvold)
Implement Hash for Schema #2183 [arrow] (crepererum)
Let the StringBuilder use BinaryBuilder #2181 [arrow] (HaoYang670)
Use ArrayAccessor and FromIterator in Cast Kernels #2169 [arrow] (viirya)
Split most arrow-specific CI checks into their own workflows (reduce common CI time to 21 minutes) #2168 (alamb)
Remove another attempt to cache target directory in action.yaml #2167 (alamb)
Run actions on push to master, pull requests #2166 (alamb)
Break parquet_derive and arrow_flight tests into their own workflows #2165 (alamb)
parquet reader: Support reading decimals from parquet BYTE_ARRAY type #2160 [parquet] (liukun4515)
Add integration test for scan rows with selection #2158 [parquet] (Ted-Jiang)
Use ArrayAccessor in Comparison Kernels #2157 [arrow] (viirya)
Implement peek_next_page and skip_next_page for InMemoryColumnCh… #2155 [parquet] (thinkharderdev)
Avoid decoding unneeded values in ByteArrayDecoderDictionary #2154 [parquet] (thinkharderdev)
Only run integration tests when arrow changes #2152 (alamb)
Break out docs CI job to its own github action #2151 (alamb)
Do not pretend to cache rust build artifacts, speed up CI by ~20% #2150 (alamb)
Update rust version to 1.62 #2144 [parquet] [arrow] [arrow-flight] (Ted-Jiang)
Make MapFieldNames public (#2118) #2134 [arrow] (tustvold)
Add ArrayAccessor trait, remove duplication in array iterators (#1948) #2133 [arrow] (tustvold)
Lazily materialize the null buffer builder for all array builders. #2127 [arrow] (HaoYang670)
Faster parquet DictEncoder ~20% #2123 [parquet] (tustvold)
Add validation for Decimal256 #2113 [arrow] (viirya)
Support skip_def_levels for ColumnLevelDecoder #2111 [parquet] (Ted-Jiang)
Donate object_store code from object_store_rs to arrow-rs #2081 (alamb)
Improve validate_utf8 performance #2048 [arrow] (tfeda)

19.0.0 (2022-07-22)

Full Changelog

Breaking changes:

Rename DecimalArray/DecimalBuilder to Decimal128Array/Decimal128Builder #2101 [arrow]
Change builder append methods to be infallible where possible #2103 [parquet] [arrow] (jhorstmann)
Return reference from UnionArray::child (#2035) #2099 [arrow] (tustvold)
Remove preserve_order feature from serde_json dependency (#2095) #2098 [parquet] [arrow] (tustvold)
Rename weekday and weekday0 kernels to num_days_from_monday and num_days_since_sunday #2066 [arrow] (alamb)
Remove null_count from write_batch_with_statistics #2047 [parquet] (tustvold)

Implemented enhancements:

Use total_cmp from std #2130 [arrow]
Permit parallel fetching of column chunks in ParquetRecordBatchStream #2110 [parquet]
The GenericBinaryBuilder should use buffer builders directly. #2104 [arrow]
Pass generate_decimal256_case arrow integration test #2093 [arrow]
Rename weekday and weekday0 kernels to num_days_from_monday and days_since_sunday #2065 [arrow]
Improve performance of filter_dict #2062 [arrow]
Improve performance of set_bits #2060 [arrow]
Lazily materialize the null buffer builder of BooleanBuilder #2058 [arrow]
BooleanArray::from_iter should omit validity buffer if all values are valid #2055 [arrow]
FFI_ArrowSchema should set DICTIONARY_ORDERED flag if a field's dictionary is ordered #2049 [arrow]
Support peek_next_page() and skip_next_page in SerializedPageReader #2043 [parquet]
Support FFI / C Data Interface for MapType #2037 [arrow]
The DecimalArrayBuilder should use FixedSizeBinaryBuilder #2026 [arrow]
Enable serialized_reader read specific Page by passing row ranges. #1976 [parquet]

Fixed bugs:

type_id and value_offset are incorrect for sliced UnionArray #2086 [arrow]
Boolean take kernel does not handle null indices correctly #2057 [arrow]
Don't double-count nulls in write_batch_with_statistics #2046 [parquet]
Parquet Writer Ignores Statistics specification in WriterProperties #2014 [parquet]

Documentation updates:

Improve docstrings + examples for as_primitive_array cast functions #2114 [arrow] (alamb)

Closed issues:

Why does serde_json specify the preserve_order feature in arrow package #2095 [arrow]
Support skip_values in DictionaryDecoder #2079 [parquet]
Support skip_values in ColumnValueDecoderImpl #2078 [parquet]
Support skip_values in ByteArrayColumnValueDecoder #2072 [parquet]
Several Builder::append methods returning results even though they are infallible #2071
Improve formatting of logical plans containing subqueries #2059
Return reference from UnionArray::child #2035
support write page index #1777 [parquet]

Merged pull requests:

Use total_cmp from std #2131 [arrow] (Dandandan)
fix clippy #2124 (alamb)
Fix logical merge conflict: match arms have incompatible types #2121 (alamb)
Update GenericBinaryBuilder to use buffer builders directly. #2117 [arrow] (HaoYang670)
Simplify null mask preservation in parquet reader #2116 [parquet] (tustvold)
Add get_byte_ranges method to AsyncFileReader trait #2115 [parquet] (thinkharderdev)
add test for skip_values in DictionaryDecoder and fix it #2105 [parquet] (Ted-Jiang)
Define Decimal128Builder and Decimal128Array #2102 [parquet] [arrow] (viirya)
Support skip_values in DictionaryDecoder #2100 [parquet] (thinkharderdev)
Pass generate_decimal256_case integration test, add DataType::Decimal256 #2094 [parquet] [arrow] (viirya)
DecimalBuilder should use FixedSizeBinaryBuilder #2092 [arrow] (HaoYang670)
Array writer indirection #2091 [parquet] (tustvold)
Remove doc hidden from GenericColumnReader #2090 [parquet] (tustvold)
Support skip_values in ColumnValueDecoderImpl #2089 [parquet] (thinkharderdev)
type_id and value_offset are incorrect for sliced UnionArray #2087 [arrow] (viirya)
Add IPC truncation test case for StructArray #2083 [arrow] (viirya)
Improve performance of set_bits by using copy_from_slice instead of setting individual bytes #2077 [arrow] (jhorstmann)
Support skip_values in ByteArrayColumnValueDecoder #2076 [parquet] (Ted-Jiang)
Lazily materialize the null buffer builder of boolean builder #2073 [arrow] (HaoYang670)
Fix windows CI (#2069) #2070 (tustvold)
Test utf8_validation checks char boundaries #2068 [arrow] (tustvold)
feat(compute): Support doy (day of year) for temporal #2067 [arrow] (ovr)
Support nullable indices in boolean take kernel and some optimizations #2064 [arrow] (jhorstmann)
Improve performance of filter_dict #2063 [arrow] (viirya)
Ignore null buffer when creating ArrayData if null count is zero #2056 [arrow] (jhorstmann)
feat(compute): Support week0 (PostgreSQL behaviour) for temporal #2052 [arrow] (ovr)
Set DICTIONARY_ORDERED flag for FFI_ArrowSchema #2050 [arrow] (viirya)
Generify parquet write path (#1764) #2045 [parquet] (tustvold)
Support peek_next_page() and skip_next_page in serialized_reader. #2044 [parquet] (Ted-Jiang)
Support MapType in FFI #2042 [arrow] (viirya)
Add support of converting FixedSizeBinaryArray to DecimalArray #2041 [arrow] (HaoYang670)
Truncate IPC record batch #2040 [arrow] (viirya)
Refine the List builder #2034 [arrow] (HaoYang670)
Add more tests of RecordReader Batch Size Edge Cases (#2025) #2032 [parquet] (tustvold)
Add support for adding intervals to dates #2031 [arrow] (avantgardnerio)

18.0.0 (2022-07-08)

Full Changelog

Breaking changes:

Fix several bugs in parquet writer statistics generation, add EnabledStatistics to control level of statistics generated #2022 [parquet] (tustvold)
Add page index reader test for all types and support empty index. #2012 [parquet] (Ted-Jiang)
Add Decimal256Builder and Decimal256Array; Decimal arrays now implement BasicDecimalArray trait #2000 [parquet] [arrow] (viirya)
Simplify ColumnReader::read_batch #1995 [parquet] [arrow] (tustvold)
Remove PrimitiveBuilder::finish_dict (#1978) #1980 [arrow] (tustvold)
Disallow cast from other datatypes to NullType #1942 [arrow] (liukun4515)
Add column index writer for parquet #1935 [parquet] (liukun4515)

Implemented enhancements:

Add DataType::Dictionary support to subtract_scalar, multiply_scalar, divide_scalar #2019 [arrow]
Support DictionaryArray in add_scalar kernel #2017 [arrow]
Enable column page index read test for all types #2010 [parquet]
Simplify FixedSizeBinaryBuilder #2007 [arrow]
Support Decimal256Builder and Decimal256Array #1999 [arrow]
Support DictionaryArray in unary kernel #1989 [arrow]
Add kernel to quickly compute comparisons on Arrays #1987 [arrow]
Support DictionaryArray in divide kernel #1982 [arrow]
Implement Into<ArrayData> for T: Array #1979 [arrow]
Support DictionaryArray in multiply kernel #1972 [arrow]
Support DictionaryArray in subtract kernel #1970 [arrow]
Declare DecimalArray::length as a constant #1967 [arrow]
Support DictionaryArray in add kernel #1950 [arrow]
Add builder style methods to Field #1934 [arrow]
Make StringDictionaryBuilder faster #1851 [arrow]
concat_elements_utf8 should accept arbitrary number of input arrays #1748 [arrow]

Fixed bugs:

Array reader for list columns fails to decode if batches fall on row group boundaries #2025 [parquet]
ColumnWriterImpl::write_batch_with_statistics incorrect distinct count in statistics #2016 [parquet]
ColumnWriterImpl::write_batch_with_statistics can write incorrect page statistics #2015 [parquet]
RowFormatter is not part of the public api #2008 [parquet]
Infinite Loop possible in ColumnReader::read_batch For Corrupted Files #1997 [parquet]
PrimitiveBuilder::finish_dict does not validate dictionary offsets #1978 [arrow]
Incorrect n_buffers in FFI_ArrowArray #1959 [arrow]
DecimalArray::from_fixed_size_list_array fails when offset > 0 #1958 [arrow]
Incorrect but ignored metadata written after ColumnChunk #1946 [parquet]
Send + Sync impl for Allocation may not be sound unless Allocation is Send + Sync as well #1944 [arrow]
Disallow cast from other datatypes to NullType #1923 [arrow]

Documentation updates:

The doc of FixedSizeListArray::value_length is incorrect. #1908 [arrow]

Closed issues:

Column chunk statistics of min_bytes and max_bytes return wrong size #2021 [parquet]
```
Discussion
```
Move DecimalArray to a new file #1985 [arrow]
Support DictionaryArray in multiply kernel #1974
close function instead of mutable reference #1969 [parquet]
Incorrect null_count of DictionaryArray #1962 [arrow]
Support multi diskRanges for ChunkReader #1955 [parquet]
Persisting Arrow timestamps with Parquet produces missing TIMESTAMP in schema #1920 [parquet]
Separate get_next_page_header from get_next_page in PageReader #1834 [parquet]

Merged pull requests:

Consistent case in Index enumeration #2029 [parquet] (tustvold)
Fix record delimiting on row group boundaries \#2025 #2027 [parquet] (tustvold)
Add builder style APIs For Field: with_name, with_data_type and with_nullable #2024 [arrow] (alamb)
Add dictionary support to subtract_scalar, multiply_scalar, divide_scalar #2020 [arrow] (viirya)
Support DictionaryArray in add_scalar kernel #2018 [arrow] (viirya)
Refine the FixedSizeBinaryBuilder #2013 [arrow] (HaoYang670)
Add RowFormatter to record public API #2009 [parquet] (FabioBatSilva)
Fix parquet test_common feature flags #2003 [parquet] (tustvold)
Stub out Skip Records API \#1792 #1998 [parquet] [arrow-flight] (tustvold)
Implement Into<ArrayData> for T: Array #1992 [parquet] [arrow] (heyrutvik)
Add unary_cmp #1991 [arrow] (viirya)
Support DictionaryArray in unary kernel #1990 [arrow] (viirya)
Refine FixedSizeListBuilder #1988 [arrow] (HaoYang670)
Move DecimalArray to array_decimal.rs #1986 [arrow] (HaoYang670)
MINOR: Fix clippy error after updating rust toolchain #1984 [parquet] [arrow] [arrow-flight] (viirya)
Support dictionary array for divide kernel #1983 [arrow] (viirya)
Support dictionary array for subtract and multiply kernel #1971 [arrow] (viirya)
Declare the value_length of decimal array as a const #1968 [arrow] (HaoYang670)
Fix the behavior of from_fixed_size_list when offset > 0 #1964 [arrow] (HaoYang670)
Calculate n_buffers in FFI_ArrowArray by data layout #1960 [arrow] (viirya)
Fix the doc of FixedSizeListArray::value_length #1957 [arrow] (HaoYang670)
Use InMemoryColumnChunkReader ~20% faster #1956 [parquet] (tustvold)
Unpin clap \#1867 #1954 [parquet] (tustvold)
Set is_adjusted_to_utc if any timezone set \#1932 #1953 [parquet] [arrow] (tustvold)
Add add_dyn for DictionaryArray support #1951 [arrow] (viirya)
write ColumnMetadata after the column chunk data, not the ColumnChunk #1947 [parquet] (liukun4515)
Require Send+Sync bounds for Allocation trait #1945 [arrow] (jhorstmann)
Faster StringDictionaryBuilder ~60% faster \#1851 #1861 [arrow] (tustvold)
Arbitrary size concat elements utf8 #1787 [arrow] (Ismail-Maj)

17.0.0 (2022-06-24)

Full Changelog

Breaking changes:

Add validation to RecordBatch for non-nullable fields containing null values #1890 [arrow] (andygrove)
Rename ArrayData::validate_dict_offsets to ArrayData::validate_values #1889 [arrow] (frolovdev)
Add Decimal128 API and use it in DecimalArray and DecimalBuilder #1871 [parquet] [arrow] (viirya)
Mark typed buffer APIs safe \#996 \#1027 #1866 [parquet] [arrow] (tustvold)

Implemented enhancements:

add a small doc example showing ArrowWriter being used with a cursor #1927 [parquet]
Support cast to/from NULL and DataType::Decimal #1921 [arrow]
Add Decimal256 API #1913 [arrow]
Add DictionaryArray::key function #1911 [arrow]
Support specifying capacities for ListArrays in MutableArrayData #1884 [arrow]
Explicitly declare the features used for each dependency #1876 [parquet] [arrow] [arrow-flight]
Add Decimal128 API and use it in DecimalArray and DecimalBuilder #1870 [arrow]
PrimitiveArray::from_iter should omit validity buffer if all values are valid #1856 [arrow]
Add from(v: Vec<Option<&[u8]>>) and from(v: Vec<&[u8]>) for FixedSizeBinaryArray #1852 [arrow]
Add Vec-inspired APIs to BufferBuilder #1850 [arrow]
PyArrow integration test for C Stream Interface #1847 [arrow]
Add nilike support in comparison #1845 [arrow]
Split up arrow::array::builder module #1843 [arrow]
Add quarter support in temporal kernels #1835 [arrow]
Rename ArrayData::validate_dictionary_offset to ArrayData::validate_values #1812 [arrow]
Clean up the testing code for substring kernel #1801 [arrow]
Speed up substring_by_char kernel #1800 [arrow]

Fixed bugs:

unable to write parquet file with UTC timestamp #1932 [parquet]
Incorrect max and min decimals #1916 [arrow]
dynamic_types example does not print the projection #1902 [arrow]
log2(0) panicked at 'attempt to subtract with overflow', parquet/src/util/bit_util.rs:148:5 #1901 [parquet]
Final slicing in combine_option_bitmap needs to use bit slices #1899 [arrow]
Dictionary IPC writer writes incorrect schema #1892 [arrow]
Creating a RecordBatch with null values in non-nullable fields does not cause an error #1888 [arrow]
Upgrade regex dependency #1874 [arrow]
Miri reports leaks in ffi tests #1872 [arrow]
AVX512 + simd binary and/or kernels slower than autovectorized version #1829 [arrow]

Documentation updates:

Blog post about arrow 10.0.0 - 16.0.0 #1808
Add README for the compute module. #1940 [arrow] (HaoYang670)
minor: clarify docstring on DictionaryArray::lookup_key #1910 [arrow] (alamb)
minor: add a diagram to docstring for DictionaryArray #1909 [arrow] (alamb)
Closes #1902: Print the original and projected RecordBatch in dynamic_types example #1903 [arrow] (martin-g)

Closed issues:

how read/write REPEATED #1886 [parquet]
Handling Unsupported Arrow Types in Parquet #1666 [parquet]

Merged pull requests:

Set adjusted to UTC if UTC timezone \#1932 #1937 [parquet] (tustvold)
Split up parquet::arrow::array_reader \#1483 #1933 [parquet] (tustvold)
Add ArrowWriter doctest \#1927 #1930 [parquet] (tustvold)
Update indexmap dependency #1929 [arrow] (tustvold)
Complete and fixup split of arrow::array::builder module \#1843 #1928 [arrow] (tustvold)
MINOR: Replace checked_add/sub().unwrap() with +/- #1924 [arrow] (HaoYang670)
Support casting NULL to/from Decimal #1922 [arrow] (liukun4515)
Update half requirement from 1.8 to 2.0 #1919 [arrow] (dependabot[bot])
Fix max and min decimal for max precision #1917 [arrow] (viirya)
Add Decimal256 API #1914 [arrow] (viirya)
Add DictionaryArray::key function #1912 [arrow] (alamb)
Fix misaligned reference and logic error in crc32 #1906 [parquet] (saethlin)
Refine the bit_util of Parquet. #1905 [parquet] (HaoYang670)
Use bit_slice in combine_option_bitmap #1900 [arrow] (jhorstmann)
Issue #1876: Explicitly declare the used features for each dependency in integration_testing #1898 (martin-g)
Issue #1876: Explicitly declare the used features for each dependency in parquet_derive_test #1897 [parquet] (martin-g)
Issue #1876: Explicitly declare the used features for each dependency in parquet_derive #1896 (martin-g)
Issue #1876: Explicitly declare the used features for each dependency in parquet #1895 [parquet] (martin-g)
Minor: Add examples to docstring for weekday #1894 [arrow] (alamb)
Correct nullable in read_dictionary #1893 [arrow] (viirya)
Feature add weekday temporal kernel #1891 [arrow] (nl5887)
Support specifying list capacities for MutableArrayData #1885 [arrow] (jhorstmann)
Issue #1876: Explicitly declare the used features for each dependency in parquet #1881 [parquet] (martin-g)
Issue #1876: Explicitly declare the used features for each dependency in arrow-flight #1880 [arrow-flight] (martin-g)
Split up arrow::array::builder module \#1843 #1879 [arrow] (DaltonModlin)
Fix memory leak in ffi test #1878 [arrow] (viirya)
Issue #1876 - Explicitly declare the used features for each dependency #1877 [arrow] (martin-g)
Fixes #1874 - Upgrade regex dependency to 1.5.6 #1875 [arrow] (martin-g)
Do not print exit code from miri, instead it should be the return value of the script #1873 (jhorstmann)
Update vendored gRPC #1869 [arrow-flight] (tustvold)
Expose BitSliceIterator and BitIndexIterator \#1864 #1865 [arrow] (tustvold)
Exclude some long-running tests when running under miri #1863 [arrow] (jhorstmann)
Add vec-inspired APIs to BufferBuilder \#1850 #1860 [arrow] (tustvold)
Omit validity buffer in PrimitiveArray::from_iter when all values are valid #1859 [arrow] (jhorstmann)
Add two from methods for FixedSizeBinaryArray #1854 [arrow] (HaoYang670)
Clean up the test code of substring kernel. #1853 [arrow] (HaoYang670)
Add PyArrow integration test for C Stream Interface #1848 [arrow] (viirya)
Add nilike support in comparison #1846 [arrow] (MazterQyou)
MINOR: Remove version check from test_command_help #1844 [parquet] (viirya)
Implement UnionArray FieldData using Type Erasure #1842 [arrow] (tustvold)
Add quarter support in temporal #1836 [arrow] (MazterQyou)
speed up substring_by_char by about 2.5x #1832 [arrow] (HaoYang670)
Remove simd and avx512 bitwise kernels in favor of autovectorization #1830 [arrow] (jhorstmann)
Refactor parquet::arrow module #1827 [parquet] (tustvold)
docs: remove experimental marker on C Stream Interface #1821 [arrow] (wjones127)
Separate Page IO from Page Decode #1810 [parquet] (tustvold)

16.0.0 (2022-06-10)

Full Changelog

Breaking changes:

Seal ArrowNativeType and OffsetSizeTrait for safety \#1028 #1819 [arrow] (tustvold)
Improve API for csv::infer_file_schema by removing redundant ref #1776 [arrow] (tustvold)

Implemented enhancements:

List equality method should work on empty offset ListArray #1817 [arrow]
Command line tool for convert CSV to Parquet #1797 [parquet]
IPC writer should write validity buffer for UnionArray in V4 IPC message #1793 [arrow]
Add function for row alignment with page mask #1790 [parquet]
Rust IPC Read should be able to read V4 UnionType Array #1788 [arrow]
combine_option_bitmap should accept arbitrary number of input arrays. #1780 [arrow]
Add substring_by_char kernels for slicing on character boundaries #1768 [arrow]
Support reading PageIndex from column metadata #1761 [parquet]
Support casting from DataType::Utf8 to DataType::Boolean #1740 [arrow]
Make current position available in FileWriter. #1691 [parquet]
Support writing parquet to stdout #1687 [parquet]

Fixed bugs:

Incorrect Offset Validation for Sliced List Array Children #1814 [arrow]
Parquet Snappy Codec overwrites Existing Data in Decompression Buffer #1806 [parquet]
flight_data_to_arrow_batch does not support RecordBatches with no columns #1783 [arrow-flight]
parquet does not compile with features=["zstd"] #1630 [parquet]

Documentation updates:

Update arrow module docs #1840 [arrow] (tustvold)
Update safety disclaimer #1837 [arrow] (tustvold)
Update ballista readme link #1765 (tustvold)
Move changelog archive to CHANGELOG-old.md #1759 (alamb)

Closed issues:

DataType::Decimal Non-Compliant? #1779 [arrow]
Further simplify the offset validation #1770 [arrow]
Best way to convert arrow to Rust native type #1760 [arrow]
Why Parquet is a part of Arrow? #1715 [parquet] [arrow]

Merged pull requests:

Make equals_datatype method public, enabling other modules #1838 [arrow] (nl5887)
```
Minor
```
Update MIRI pin #1828 (tustvold)
Change to use resolver v2, test more feature flag combinations in CI, fix errors \#1630 #1822 [parquet] [arrow] (tustvold)
Add ScalarBuffer abstraction \#1811 #1820 [arrow] (tustvold)
Fix list equal for empty offset list array #1818 [arrow] (viirya)
Fix Decimal and List ArrayData Validation \#1813 \#1814 #1816 [arrow] (tustvold)
Don't overwrite existing data on snappy decompress \#1806 #1807 [parquet] (tustvold)
Rename arrow/benches/string_kernels.rs to arrow/benches/substring_kernels.rs #1805 [arrow] (HaoYang670)
Add public API for decoding parquet footer #1804 [parquet] (tustvold)
Add AsyncFileReader trait #1803 [parquet] (tustvold)
add parquet-fromcsv \#1 #1798 [parquet] (kazuk)
Use IPC row count info in IPC reader #1796 [arrow] (viirya)
Fix typos in the Memory and Buffers section of the docs home #1795 [arrow] (datapythonista)
Write validity buffer for UnionArray in V4 IPC message #1794 [arrow] (viirya)
feat: Add function for row alignment with page mask #1791 [parquet] (Ted-Jiang)
Read and skip validity buffer of UnionType Array for V4 ipc message #1789 [arrow] [arrow-flight] (viirya)
Add Substring_by_char #1784 [arrow] (HaoYang670)
Add ParquetFileArrowReader::try_new #1782 [parquet] (tustvold)
Arbitrary size combine option bitmap #1781 [arrow] (Ismail-Maj)
Implement ChunkReader for Bytes, deprecate SliceableCursor #1775 [parquet] (tustvold)
Access metadata of flushed row groups on write \#1691 #1774 [parquet] (tustvold)
Simplify ParquetFileArrowReader Metadata API #1773 [parquet] (tustvold)
MINOR: Unpin nightly version as packed_simd releases new version #1771 (viirya)
Update comfy-table requirement from 5.0 to 6.0 #1769 [arrow] (dependabot[bot])
Optionally disable validate_decimal_precision check in DecimalBuilder.append_value for interop test #1767 [arrow] (viirya)
Minor: Clean up the code of MutableArrayData #1763 [arrow] (HaoYang670)
Support reading PageIndex from parquet metadata, prepare for skipping pages at reading #1762 [parquet] (Ted-Jiang)
Support casting Utf8 to Boolean #1738 [arrow] (MazterQyou)

15.0.0 (2022-05-27)

Full Changelog

Breaking changes:

Change ArrayDataBuilder::null_bit_buffer to accept Option<Buffer> rather than Buffer #1739 [arrow] (HaoYang670)
Remove null_count from ArrayData::try_new() #1721 [arrow] (HaoYang670)
Change parquet writers to use standard std::io::Write rather than custom ParquetWriter trait \#1717 \#1163 #1719 [parquet] (tustvold)
Add explicit column mask for selection in parquet: ProjectionMask \#1701 #1716 [parquet] (tustvold)
Add type_ids in Union datatype #1703 [parquet] [arrow] (viirya)
Fix Parquet Reader's Arrow Schema Inference #1682 [parquet] [arrow] (tustvold)

Implemented enhancements:

Rename the string kernel to concatenate_elements #1747 [arrow]
ArrayDataBuilder::null_bit_buffer should accept Option<Buffer> as input type #1737 [arrow]
Fix schema comparison for non_canonical_map when running flight test #1730 [arrow]
Add support in aggregate kernel for BinaryArray #1724 [arrow]
Fix incorrect null_count in generate_unions_case integration test #1712 [arrow]
Keep type ids in Union datatype to follow Arrow spec and integrate with other implementations #1690 [arrow]
Support Reading Alternative List Representations to Arrow From Parquet #1680 [parquet]
Speed up the offsets checking #1675 [arrow]
Separate Parquet -> Arrow Schema Conversion From ArrayBuilder #1655 [parquet]
Add leaf_columns argument to ArrowReader::get_record_reader_by_columns #1653 [parquet]
Implement string_concat kernel #1540 [arrow]
Improve Unit Test Coverage of ArrayReaderBuilder #1484 [parquet]

Fixed bugs:

Parquet write failure from record batches when data is nested two levels deep #1744 [parquet]
IPC reader may break on projection #1735 [arrow]
Latest nightly fails to build with feature simd #1734 [arrow]
Trying to write parquet file in parallel results in corrupt file #1717 [parquet]
Roundtrip failure when using DELTA_BINARY_PACKED #1708 [parquet]
ArrayData::try_new cannot always return expected error. #1707 [arrow]
"out of order projection is not supported" after Fix Parquet Arrow Schema Inference #1701 [parquet]
Rust is not interoperable with C++ for IPC schemas with dictionaries #1694 [arrow]
Incorrect Repeated Field Schema Inference #1681 [parquet]
Parquet Treats Embedded Arrow Schema as Authoritative #1663 [parquet]
parquet_to_arrow_schema_by_columns Incorrectly Handles Nested Types #1654 [parquet]
Inconsistent Arrow Schema When Projecting Nested Parquet File #1652 [parquet]
StructArrayReader Cannot Handle Nested Lists #1651 [parquet]
Bug `substring` kernel: The null buffer is not aligned when offset != 0 #1639 [arrow]

Documentation updates:

Parquet command line tool does not install "globally" #1710 [parquet]
Improve integration test document to follow Arrow C++ repo CI #1742 [arrow] (viirya)

Merged pull requests:

Test for list array equality with different offsets #1756 [arrow] (alamb)
Rename string_concat to concat_elements_utf8 #1754 [arrow] (alamb)
Rename the string kernel to concat_elements. #1752 [arrow] (HaoYang670)
Support writing nested lists to parquet #1746 [parquet] (tustvold)
Pin nightly version to bypass packed_simd build error #1743 (viirya)
Fix projection in IPC reader #1736 [arrow] (iyupeng)
cargo install installs not globally #1732 [parquet] (kazuk)
Fix schema comparison for non_canonical_map when running flight test #1731 (viirya)
Add min_binary and max_binary aggregate kernels #1725 [arrow] (HaoYang670)
Fix parquet benchmarks #1723 [parquet] (tustvold)
Fix BitReader::get_batch zero extension \#1708 #1722 [parquet] (tustvold)
Implementation string concat #1720 [arrow] (Ismail-Maj)
Check the length of null_bit_buffer in ArrayData::try_new() #1714 [arrow] (HaoYang670)
Fix incorrect null_count in generate_unions_case integration test #1713 [arrow] (viirya)
Fix: Null buffer accounts for offset in substring kernel. #1704 [arrow] (HaoYang670)
Minor: Refine OffsetSizeTrait to extend num::Integer #1702 [arrow] (HaoYang670)
Fix StructArrayReader handling nested lists \#1651 #1700 [parquet] (tustvold)
Speed up the offsets checking #1684 [arrow] (HaoYang670)

14.0.0 (2022-05-13)

Full Changelog

Breaking changes:

Use bytes in parquet rather than custom Buffer implementation \#1474 #1683 [parquet] (tustvold)
Rename OffsetSize::fn is_large to const OffsetSize::IS_LARGE #1664 [parquet] [arrow] (HaoYang670)
Remove StringOffsetTrait and BinaryOffsetTrait #1645 [arrow] (HaoYang670)
Fix generate_nested_dictionary_case integration test failure #1636 [arrow] [arrow-flight] (viirya)

Implemented enhancements:

Add support for DataType::Duration in ffi interface #1688 [arrow]
Fix generate_unions_case integration test #1676 [arrow]
Add DictionaryArray support for bit_length kernel #1673 [arrow]
Add DictionaryArray support for length kernel #1672 [arrow]
flight_client_scenarios integration test should receive schema from flight data #1669 [arrow]
Unpin Flatbuffer version dependency #1667 [arrow]
Add dictionary array support for substring function #1656 [arrow]
Exclude dict_id and dict_is_ordered from equality comparison of Field #1646 [arrow]
Remove StringOffsetTrait and BinaryOffsetTrait #1644 [arrow]
Add tests and examples for UnionArray::from(data: ArrayData) #1643 [arrow]
Add methods pub fn offsets_buffer, pub fn types_ids_buffer and pub fn data_buffer for ArrayDataBuilder #1640 [arrow]
Fix generate_nested_dictionary_case integration test failure for Rust cases #1635 [arrow]
Expose ArrowWriter row group flush in public API #1626 [parquet]
Add substring support for FixedSizeBinaryArray #1618 [arrow]
Add PrettyPrint for UnionArrays #1594 [arrow]
Add SIMD support for the length kernel #1489 [arrow]
Support dictionary arrays in length and bit_length #1674 [arrow] (viirya)
Add dictionary array support for substring function #1665 [arrow] (sunchao)
Add DecimalType support in new_null_array #1659 [arrow] (yjshen)

Fixed bugs:

Docs.rs build is broken #1695
Interoperability with C++ for IPC schemas with dictionaries #1694
UnionArray::is_null incorrect #1625 [arrow]
Published Parquet documentation missing arrow::async_reader #1617 [parquet]
Files written with Julia's Arrow.jl in IPC format cannot be read by arrow-rs #1335 [arrow]

Documentation updates:

Correct arrow-flight readme version #1641 [arrow-flight] (alamb)

Closed issues:

Make OffsetSizeTrait::IS_LARGE as a const value #1658
Question: Why are there 3 types of OffsetSizeTraits? #1638
Written Parquet file way bigger than input files #1627
Ensure there is a single zero in the offsets buffer for an empty ListArray. #1620
Filtering UnionArray Changes DataType #1595

Merged pull requests:

Fix docs.rs build #1696 [parquet] (alamb)
support duration in ffi #1689 [arrow] (ryan-jacobs1)
fix bench command line options #1685 [parquet] [arrow] (kazuk)
Enable branch protection #1679 (tustvold)
Fix logical merge conflict in #1588 #1678 [parquet] (tustvold)
Fix generate_unions_case for Rust case #1677 [arrow] (viirya)
Receive schema from flight data #1670 (viirya)
unpin flatbuffers dependency version #1668 [arrow] (Cheappie)
Remove parquet dictionary converters \#1661 #1662 [parquet] (tustvold)
Minor: simplify the function GenericListArray::get_type #1650 [arrow] (HaoYang670)
Pretty Print UnionArrays #1648 [arrow] (tfeda)
Exclude dict_id and dict_is_ordered from equality comparison of Field #1647 [arrow] (viirya)
expose row-group flush in public api #1634 [parquet] (Cheappie)
Add substring support for FixedSizeBinaryArray #1633 [arrow] (HaoYang670)
Fix UnionArray is_null #1632 [arrow] (viirya)
Do not assume dictionaries exists in footer #1631 [arrow] (pcjentsch)
Add support for nested list arrays from parquet to arrow arrays \#993 #1588 [parquet] (tustvold)
Add async into doc features #1349 [parquet] (HaoYang670)

13.0.0 (2022-04-29)

Full Changelog

Breaking changes:

Update parquet::basic::LogicalType to be more idiomatic #1612 [parquet] (tfeda)
Fix Null Mask Handling in ArrayData, UnionArray, and MapArray #1589 [arrow] (tustvold)
Replace &Option<T> with Option<&T> in several arrow and parquet APIs #1571 [parquet] [arrow] (tfeda)

Implemented enhancements:

Read/write nested dictionary under fixed size list in ipc stream reader/write #1609 [arrow]
Add support for BinaryArray in substring kernel #1593 [arrow]
Read/write nested dictionary under large list in ipc stream reader/write #1584 [arrow]
Read/write nested dictionary under map in ipc stream reader/write #1582 [arrow]
Implement Clone for JSON DecoderOptions #1580 [arrow]
Add utf-8 validation checking to substring kernel #1575 [arrow]
Support casting to/from DataType::Null in cast kernel #1572 [arrow] (WinkerDu)

Fixed bugs:

Parquet schema should allow scale == precision for decimal type #1606 [parquet]
ListArray::from(ArrayData) dereferences invalid pointer when offsets are empty #1601 [arrow]
ArrayData Equality Incorrect Null Mask Offset Handling #1599
Filtering UnionArray Incorrect Handles Runs #1598
```
Safety
```
```
Safety
```
Union Layout Should Not Support Separate Validity Mask #1590
Incorrect nullable flag when reading maps (test_read_maps fails when `force_validate` is active) #1587 [parquet]
Output of ipc::reader::tests::projection_should_work fails validation #1548 [arrow]
Incorrect min/max statistics for decimals with byte-array notation #1532

Documentation updates:

Minor: Clarify docs on UnionBuilder::append_null #1628 [arrow] (alamb)

Closed issues:

Dense UnionArray Offsets Are i32 not i8 #1597 [arrow]
Replace &Option<T> with Option<&T> in some APIs #1556 [parquet] [arrow]
Improve ergonomics of parquet::basic::LogicalType #1554 [parquet]
Mark the current substring function as unsafe and rename it. #1541 [arrow]
Requirements for Async Parquet API #1473 [parquet]

Merged pull requests:

Nit: use the standard function div_ceil #1629 [arrow] (HaoYang670)
Update flatbuffers requirement from =2.1.1 to =2.1.2 #1622 [arrow] (dependabot[bot])
Fix decimals min max statistics #1621 [parquet] (atefsawaed)
Add example readme #1615 [arrow] (alamb)
Improve docs and examples links on main readme #1614 [arrow] (alamb)
Read/Write nested dictionaries under FixedSizeList in IPC #1610 [arrow] (viirya)
Add substring support for binary #1608 [arrow] (HaoYang670)
Parquet: schema validation should allow scale == precision for decimal type #1607 [parquet] (sunchao)
Don't access and validate offset buffer in ListArray::from(ArrayData) #1602 [arrow] (jhorstmann)
Fix map nullable flag in ParquetTypeConverter #1592 [parquet] (viirya)
Read/write nested dictionary under large list in ipc stream reader/writer #1585 [arrow] (viirya)
Read/write nested dictionary under map in ipc stream reader/writer #1583 [arrow] (viirya)
Derive Clone and PartialEq for json DecoderOptions #1581 [arrow] (alamb)
Add utf-8 validation checking for substring #1577 [arrow] (HaoYang670)
Use Option<T> rather than Option<&T> for copy types in substring kernel #1576 [arrow] (tustvold)
Use little-endian arrow files for projection_should_work #1573 [arrow] (viirya)

12.0.0 (2022-04-15)

Full Changelog

Breaking changes:

Add ArrowReaderOptions to ParquetFileArrowReader, add option to skip decoding arrow metadata from parquet \#1459 #1558 [parquet] (tustvold)
Support RecordBatch with zero columns but non zero row count, add field to RecordBatchOptions \#1536 #1552 [arrow] (tustvold)
Consolidate JSON Reader options and DecoderOptions #1539 [arrow] (alamb)
Update prost, prost-derive and prost-types to 0.10, tonic, and tonic-build to 0.7 #1510 [arrow-flight] (alamb)
Add Json DecoderOptions and support custom format_string for each field #1451 [arrow] (sum12)

Implemented enhancements:

Read/write nested dictionary in ipc stream reader/writer #1565 [arrow]
Support FixedSizeBinary in the Arrow C data interface #1553 [arrow]
Support Empty Column Projection in ParquetRecordBatchReader #1537 [parquet]
Support RecordBatch with zero columns but non zero row count #1536 [arrow]
Add support for Date32/Date64 <--> String/LargeString in cast kernel #1535 [arrow]
Support creating arrays from externally owned memory like Vec or String #1516 [arrow]
Speed up the substring kernel #1511 [arrow]
Handle Parquet Files With Inconsistent Timestamp Units #1459 [parquet]

Fixed bugs:

Error Inferring Schema for LogicalType::UNKNOWN #1557 [parquet]
Read dictionary from nested struct in ipc stream reader panics #1549 [arrow]
filter produces invalid sparse UnionArrays #1547 [arrow]
Documentation for GenericListBuilder is not exposed. #1518 [arrow]
cannot read parquet file #1515 [parquet]
The substring kernel panics when chars > U+0x007F #1478 [arrow]
Hang due to infinite loop when reading some parquet files with RLE encoding and bit packing #1458 [parquet]

Documentation updates:

Improve JSON reader documentation #1559 [arrow] (alamb)
Improve doc string for substring kernel #1529 [arrow] (HaoYang670)
Expose documentation of GenericListBuilder #1525 [arrow] (comath)
Add a diagram to take kernel documentation #1524 [arrow] (alamb)

Closed issues:

Interesting benchmark results of min_max_helper #1400

Merged pull requests:

Fix incorrect into_buffers for UnionArray #1567 [arrow] (viirya)
Read/write nested dictionary in ipc stream reader/writer #1566 [arrow] (viirya)
Support FixedSizeBinary and FixedSizeList for the C data interface #1564 [arrow] (sunchao)
Split out ListArrayReader into separate module \#1483 #1563 [parquet] (tustvold)
Split out MapArray into separate module \#1483 #1562 [parquet] (tustvold)
Support empty projection in ParquetRecordBatchReader #1560 [parquet] (tustvold)
fix infinite loop in not fully packed bit-packed runs #1555 [parquet] (tustvold)
Add test for creating FixedSizeBinaryArray::try_from_sparse_iter failed when given all Nones #1551 [arrow] (alamb)
Fix reading dictionaries from nested structs in ipc StreamReader #1550 [arrow] (dispanser)
Add support for Date32/64 <--> String/LargeString in cast kernel #1534 [arrow] (yjshen)
fix clippy errors in 1.60 #1527 [parquet] [arrow] (alamb)
Mark remove-old-releases.sh executable #1522 (alamb)
Delete duplicate code in the sort kernel #1519 [arrow] (HaoYang670)
Fix reading nested lists from parquet files #1517 [parquet] (viirya)
Speed up the substring kernel by about 2x #1512 [arrow] (HaoYang670)
Add new_from_strings to create MapArrays #1507 [arrow] (viirya)
Decouple buffer deallocation from ffi and allow creating buffers from rust vec #1494 [arrow] (jhorstmann)

11.1.0 (2022-03-31)

Full Changelog

Implemented enhancements:

Implement size_hint and ExactSizedIterator for DecimalArray #1505 [arrow]
Support calculate length by chars for StringArray #1493 [arrow]
Add length kernel support for ListArray #1470 [arrow]
The length kernel should work with BinaryArrays #1464 [arrow]
FFI for Arrow C Stream Interface #1348 [arrow]
Improve performance of DictionaryArray::try_new() #1313 [arrow]

Fixed bugs:

MIRI error in math_checked_divide_op/try_from_trusted_len_iter #1496 [arrow]
Parquet Writer Incorrect Definition Levels for Nested NullArray #1480 [parquet]
FFI: ArrowArray::try_from_raw shouldn't clone #1425 [arrow]
Parquet reader fails to read null list. #1399 [parquet]

Documentation updates:

A small mistake in the doc of BinaryArray and LargeBinaryArray #1455 [arrow]
A small mistake in the doc of GenericBinaryArray::take_iter_unchecked #1454 [arrow]
Add links in the doc of BinaryOffsetSizeTrait #1453 [arrow]
The doc of FixedSizeBinaryArray is confusing. #1452 [arrow]
Clarify docs that SlicesIterator ignores null values #1504 [arrow] (alamb)
Update the doc of BinaryArray and LargeBinaryArray #1471 [arrow] (HaoYang670)

Closed issues:

packed_simd v.s. portable_simd, which should be used? #1492
Cleanup: Use Arrow take kernel Within parquet ListArrayReader #1482 [parquet]

Merged pull requests:

Implement size_hint and ExactSizedIterator for DecimalArray #1506 [arrow] (alamb)
Add StringArray::num_chars for calculating number of characters #1503 [arrow] (HaoYang670)
Workaround nightly miri error in try_from_trusted_len_iter #1497 [arrow] (jhorstmann)
update doc of array_binary and array_string #1491 [arrow] (HaoYang670)
Use Arrow take kernel within ListArrayReader #1490 [parquet] (viirya)
Add length kernel support for List Array #1488 [arrow] (HaoYang670)
Support sort for Decimal data type #1487 [arrow] (yjshen)
Fix reading/writing nested null arrays \#1480 \#1036 \#1399 #1481 [parquet] (tustvold)
Implement ArrayEqual for UnionArray #1469 [arrow] (viirya)
Support the length kernel on Binary Array #1465 [arrow] (HaoYang670)
Remove Clone and copy source structs internally #1449 [arrow] (viirya)
Fix Parquet reader for null lists #1448 [parquet] (viirya)
Improve performance of DictionaryArray::try_new() #1435 [arrow] (jackwener)
Add FFI for Arrow C Stream Interface #1384 [arrow] (viirya)

11.0.0 (2022-03-17)

Full Changelog

Breaking changes:

Replace filter_row_groups with ReadOptions in parquet SerializedFileReader #1389 [parquet] (yjshen)
Implement projection for arrow IPC Reader file / streams #1339 [arrow] [arrow-flight] (Dandandan)

Implemented enhancements:

Fix generate_interval_case integration test failure #1445
Make the doc examples of ListArray and LargeListArray more readable #1433
Redundant if and abs in shift() #1427
Improve substring kernel performance #1422 [arrow]
Add missing value_unchecked() of FixedSizeBinaryArray #1419
Remove duplicate bound check in function shift #1408
Support dictionary array in C data interface #1397
filter kernel should work with UnionArrays #1394 [arrow]
filter kernel should work with FixedSizeListArrays #1393 [arrow]
Add doc examples for creating FixedSizeListArray #1392 [arrow]
Update rust-version to 1.59 #1377
Arrow IPC projection support #1338
Implement basic FlightSQL Server #1386 [arrow-flight] (wangfenjin)

Fixed bugs:

DictionaryArray::try_new ignores validity bitmap of the keys #1429 [arrow]
The doc of GenericListArray is confusing #1424
DeltaBitPackDecoder Incorrectly Handles Non-Zero MiniBlock Bit Width Padding #1417 [parquet]
DeltaBitPackEncoder Pads Miniblock BitWidths With Arbitrary Values #1416 [parquet]
Possible unaligned write with MutableBuffer::push #1410 [arrow]
Integration Test is failing on master branch #1398 [arrow]

Documentation updates:

Rewrite doc of GenericListArray #1450 [arrow] (HaoYang670)
Fix integration doc about build.ninja location #1438 (viirya)

Merged pull requests:

Rewrite doc example of ListArray and LargeListArray #1447 [arrow] (HaoYang670)
Fix generate_interval_case in integration test #1446 [arrow] (viirya)
Fix generate_decimal128_case in integration test #1440 (viirya)
filter kernel should work with FixedSizeListArrays #1434 [arrow] (viirya)
Support nullable keys in DictionaryArray::try_new #1430 [arrow] (jhorstmann)
remove redundant if/clamp_min/abs #1428 [arrow] (jackwener)
Add doc example for creating FixedSizeListArray #1426 [arrow] (HaoYang670)
Directly write to MutableBuffer in substring #1423 [arrow] (viirya)
Fix possibly unaligned writes in MutableBuffer #1421 [arrow] (jhorstmann)
Add value_unchecked() and unit test #1420 [arrow] (jackwener)
Fix DeltaBitPack MiniBlock Bit Width Padding #1418 [parquet] (tustvold)
Update zstd requirement from 0.10 to 0.11 #1415 [parquet] (dependabot[bot])
Set default-features = false for zstd in the parquet crate to support wasm32-unknown-unknown #1414 [parquet] (kylebarron)
Add support for UnionArray in filter kernel #1412 [arrow] (viirya)
Remove duplicate bound check in the function shift #1409 [arrow] (HaoYang670)
Add dictionary support for C data interface #1407 [arrow] (sunchao)
Fix a small spelling mistake in docs. #1406 [arrow] (HaoYang670)
Add unit test to check FixedSizeBinaryArray input all none #1405 [arrow] (jackwener)
Move csv Parser trait and its implementations to utils module #1385 [arrow] (sum12)

10.0.0 (2022-03-04)

Full Changelog

Breaking changes:

Remove existing has_ methods for optional fields in ColumnChunkMetaData #1346 [parquet] (shanisolomon)
Remove redundant has_ methods in ColumnChunkMetaData #1345 [parquet] (shanisolomon)

Implemented enhancements:

Add extract month and day in temporal.rs #1387
Add clone to IpcWriteOptions #1381 [arrow]
Support MapArray in filter kernel #1378 [arrow]
Add week temporal kernel #1375 [arrow]
Improve performance of compare_dict_op #1371 [arrow]
Add support for LargeUtf8 in json writer #1357 [parquet]
Make arrow::array::builder::MapBuilder public #1354 [arrow]
Refactor StructArray::from #1351 [arrow]
Refactor RecordBatch::validate_new_batch #1350 [arrow]
Remove redundant has_ methods for optional column metadata fields #1344 [parquet]
Add write method to JsonWriter #1340 [arrow]
Refactor the code of Bitmap::new #1337 [arrow]
Use DictionaryArray's iterator in compare_dict_op #1329 [arrow]
Add as_decimal_array(arr: &dyn Array) -> &DecimalArray #1312 [arrow]
More ergonomic / idiomatic primitive array creation from iterators #1298 [arrow]
Implement DictionaryArray support in eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #1201 [arrow]

Fixed bugs:

cargo clippy fails on the master branch #1362 [arrow]
ArrowArray::try_from_raw should not assume the pointers are from Arc #1333 [arrow]
Fix CSV Writer::new to accept delimiter and make WriterBuilder::build use it #1328 [arrow]
Make bounds configurable via builder when reading CSV #1327 [arrow]
Add with_datetime_format() to CSV WriterBuilder #1272 [arrow]

Performance improvements:

Improve performance of min and max aggregation kernels without nulls #1373 [arrow]

Closed issues:

Consider removing redundant has_XXX metadata functions in ColumnChunkMetadata #1332

Merged pull requests:

Support extract day and month in temporal.rs #1388 [arrow] (Ted-Jiang)
Add write method to Json Writer #1383 [arrow] (matthewmturner)
Derive Clone for IpcWriteOptions #1382 [arrow] (matthewmturner)
feat: support maps in MutableArrayData #1379 [arrow] (helgikrs)
Support extract week in temporal.rs #1376 [arrow] (Ted-Jiang)
Speed up the function min_max_string #1374 [arrow] (HaoYang670)
Improve performance of dictionary kernels, add benchmark and add take_iter_unchecked #1372 [arrow] (viirya)
Update pyo3 requirement from 0.15 to 0.16 #1369 [arrow] (dependabot[bot])
Update contributing guide #1368 (HaoYang670)
Allow primitive array creation from iterators of PrimitiveTypes as well as `Option` #1367 [arrow] (viirya)
Update flatbuffers requirement from =2.1.0 to =2.1.1 #1364 [arrow] (dependabot[bot])
Fix clippy lints #1363 [parquet] [arrow] (HaoYang670)
Refactor RecordBatch::validate_new_batch #1361 [arrow] (HaoYang670)
Refactor StructArray::from #1360 [arrow] (HaoYang670)
Update flatbuffers requirement from =2.0.0 to =2.1.0 #1359 [arrow] (dependabot[bot])
fix: add LargeUtf8 support in json writer #1358 [arrow] (tiphaineruy)
Add as_decimal_array function #1356 [arrow] (liukun4515)
Publicly export arrow::array::MapBuilder #1355 [arrow] (tjwilson90)
Add with_datetime_format to csv WriterBuilder #1347 [arrow] (gsserge)
Refactor Bitmap::new #1343 [arrow] (HaoYang670)
Remove delimiter from csv Writer #1342 [arrow] (gsserge)
Make bounds configurable in csv ReaderBuilder #1341 [arrow] (gsserge)
ArrowArray::try_from_raw should not assume the pointers are from Arc #1334 [arrow] (viirya)
Use DictionaryArray's iterator in compare_dict_op #1330 [arrow] (viirya)
Implement DictionaryArray support in neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn #1326 [arrow] (viirya)
Arrow Rust + Conbench Integration #1289 (dianaclarke)

9.1.0 (2022-02-19)

Full Changelog

Implemented enhancements:

Exposing page encoding stats #1321
Improve filter performance by special casing high and low selectivity predicates #1288 [arrow]
Speed up DeltaBitPackDecoder #1281 [parquet]
Fix all clippy lints in arrow crate #1255 [arrow]
Expose page encoding ColumnChunkMetadata #1322 [parquet] (shanisolomon)
Expose column index and offset index in ColumnChunkMetadata #1318 [parquet] (shanisolomon)
Expose bloom filter offset in ColumnChunkMetadata #1309 [parquet] (shanisolomon)
Add DictionaryArray::try_new() to create dictionaries from pre existing arrays #1300 [arrow] (alamb)
Add DictionaryArray::keys_iter, and take_iter for other array types #1296 [arrow] (viirya)
Make rle decoder public under experimental feature #1271 [parquet] (zeevm)
Add DictionaryArray support in eq_dyn kernel #1263 [arrow] (viirya)

Fixed bugs:

len is not a parameter of MutableArrayData::extend #1316
module data_type is private in Rust Parquet 8.0.0 #1302 [parquet]
Test failure: bit_chunk_iterator #1294
csv_writer benchmark fails with "no such file or directory" #1292

Documentation updates:

Fix warnings in cargo doc #1268 [parquet] [arrow] (alamb)

Performance improvements:

Vectorize DeltaBitPackDecoder, up to 5x faster decoding #1284 [parquet] (tustvold)
Skip zero-ing primitive nulls #1280 [parquet] (tustvold)
Add specialized filter kernels in compute module up to 10x faster #1248 [parquet] [arrow] (tustvold)

Closed issues:

Expose column and offset index metadata offset #1317
Expose bloom filter metadata offset #1308
Improve ergonomics to construct DictionaryArrays from Key and Value arrays #1299
Make it easier to iterate over DictionaryArray #1295 [arrow]
(WON'T FIX) Don't Intertwine Bit and Byte Aligned Operations in BitReader #1282
how to create arrow::array from streamReader #1278
Remove scientific notation when converting floats to strings. #983

Merged pull requests:

Update the document of function MutableArrayData::extend #1336 [arrow] (HaoYang670)
Fix clippy lint dead_code #1324 [arrow] (gsserge)
fix test bug and ensure that bloom filter metadata is serialized in to_thrift #1320 [parquet] (shanisolomon)
Enable more clippy lints in arrow #1315 [arrow] (gsserge)
Fix clippy lint clippy::type_complexity #1310 [arrow] (gsserge)
Fix clippy lint clippy::float_equality_without_abs #1305 [arrow] (gsserge)
Fix clippy clippy::vec_init_then_push lint #1303 [arrow] (gsserge)
Fix failing csv_writer bench #1293 [arrow] (andygrove)
Changes for 9.0.2 #1291 [parquet] [arrow] [arrow-flight] (alamb)
Fix bitmask creation also for simd comparisons with scalar #1290 [arrow] (jhorstmann)
Fix simd comparison kernels #1286 [arrow] (jhorstmann)
Restrict Decoder to compatible types #1276 #1277 [parquet] (tustvold)
Fix some clippy lints in parquet crate, rename LevelEncoder variants to conform to Rust standards #1273 [parquet] (HaoYang670)
Use new DecimalArray creation API in arrow crate #1249 [arrow] (alamb)
Improve DecimalArray API ergonomics: add iter(), FromIterator, with_precision_and_scale #1223 [arrow] (alamb)

9.0.2 (2022-02-09)

Full Changelog

Breaking changes:

Add Send + Sync to DataType, RowGroupReader, FileReader, ChunkReader. #1264
Rename the function Bitmap::len to Bitmap::bit_len to clarify its meaning #1242 [parquet] [arrow] (HaoYang670)
Remove unused / broken memory-check feature #1222 [arrow] (jhorstmann)
Potentially buffer multiple RecordBatches before writing a parquet row group in ArrowWriter #1214 [parquet] [arrow] (tustvold)

Implemented enhancements:

Add async arrow parquet reader #1154 [parquet] [arrow] (tustvold)
Rename Bitmap::len to Bitmap::bit_len #1233
Extend CSV schema inference to allow scientific notation for floating point types #1215 [arrow]
Write Multiple RecordBatch to Parquet Row Group #1211
Add doc examples for eq_dyn etc. #1202 [arrow]
Add comparison kernels for BinaryArray #1108
impl ArrowNativeType for i128 #1098
Remove Copy trait bound from dyn scalar kernels #1243 [arrow] (matthewmturner)
Add into_inner for IPC FileWriter #1236 [arrow] (yjshen)

Fixed bugs:

Parquet v8.0.0 panics when reading all null column to NullArray #1245 [parquet]
Get Unknown configuration option rust-version when running the rust format command #1240
Bitmap Length Validation is Incorrect #1231 [arrow]
Writing sliced ListArray or MapArray ignore offsets #1226 [parquet]
Remove broken memory-tracking crate feature #1171
Revert making parquet::data_type and parquet::arrow::schema experimental #1244 [parquet] (tustvold)

Documentation updates:

Update parquet crate documentation and examples #1253 [parquet] [arrow] (alamb)
Refresh parquet readme / contributing guide #1252 [parquet] (alamb)
Add docs examples for dynamically compare functions #1250 [arrow] (HaoYang670)
Add Rust Docs examples for UnionArray #1241 [arrow] (HaoYang670)
Improve documentation for Bitmap #1237 [arrow] (alamb)

Performance improvements:

Improve performance for arithmetic kernels with simd feature enabled except for division/modulo #1221 [arrow] (jhorstmann)
Do not concatenate identical dictionaries #1219 [arrow] (tustvold)
Preserve dictionary encoding when decoding parquet into Arrow arrays, 60x perf improvement #171 #1180 [parquet] (tustvold)

Closed issues:

UnalignedBitChunkIterator that iterates through already aligned u64 blocks #1227
Remove unused ArrowArrayReader in parquet #1197 [parquet]

Merged pull requests:

Upgrade clap to 3.0.0 #1261 [parquet] (Jimexist)
Update chrono-tz requirement from 0.4 to 0.6 #1259 [arrow] (dependabot[bot])
Update zstd requirement from 0.9 to 0.10 #1257 [parquet] (dependabot[bot])
Fix NullArrayReader #1245 #1246 [parquet] (tustvold)
dyn compare for binary array #1238 [arrow] (HaoYang670)
Remove arrow array reader #1197 #1234 [parquet] (tustvold)
Fix null bitmap length validation #1231 #1232 [arrow] (tustvold)
Faster bitmask iteration #1228 [parquet] [arrow] (tustvold)
Add non utf8 values into the test cases of BinaryArray comparison #1220 [arrow] (HaoYang670)
Update DECIMAL_RE to allow scientific notation in auto inferred schemas #1216 [arrow] (pjmore)
Fix simd comparison kernels #1286 [arrow] (jhorstmann)
Fix bitmask creation also for simd comparisons with scalar #1290 [arrow] (jhorstmann)

8.0.0 (2022-01-20)

Full Changelog

Breaking changes:

Return error from JSON writer rather than panic #1205 [arrow] (Ted-Jiang)
Remove ArrowSignedNumericType to Simplify and reduce code duplication in arithmetic kernels #1161 [arrow] (jhorstmann)
Restrict RecordReader and friends to scalar types #1132 #1155 [parquet] (tustvold)
Move more parquet functionality behind experimental feature flag #1032 #1134 [parquet] (tustvold)

Implemented enhancements:

Parquet reader should be able to read structs within list #1186 [parquet]
Disable serde_json arbitrary_precision feature flag #1174 [arrow]
Simplify and reduce code duplication in arithmetic.rs #1160 [arrow]
Return Err from JSON writer rather than panic! for unsupported types #1157 [arrow]
Support scalar mathematics kernels for Array and scalar value #1153 [arrow]
Support DecimalArray in sort kernel #1137
Parquet Fuzz Tests #1053
BooleanBufferBuilder Append Packed #1038 [arrow]
parquet Performance Optimization: StructArrayReader Redundant Level & Bitmap Computation #1034 [parquet]
Reduce Public Parquet API #1032 [parquet]
Add from_iter_values for binary array #1188 [arrow] (Jimexist)
Add support for MapArray in json writer #1149 [arrow] (helgikrs)

Fixed bugs:

Empty string arrays with no nulls are not equal #1208 [arrow]
Pretty print a RecordBatch containing Float16 triggers a panic #1193 [arrow]
Writing structs nested in lists produces an incorrect output #1184 [parquet]
Undefined behavior for GenericStringArray::from_iter_values if reported iterator upper bound is incorrect #1144 [arrow]
Interval comparisons with simd feature asserts #1136 [arrow]
RecordReader Permits Illegal Types #1132 [parquet]

Security fixes:

Fix undefined behavior in GenericStringArray::from_iter_values #1145 [arrow] (alamb)
parquet: Optimized ByteArrayReader, Add UTF-8 Validation #1040 #1082 [parquet] [arrow] (tustvold)

Documentation updates:

Update parquet crate readme #1192 [parquet] (alamb)
Document safety justification of some uses of from_trusted_len_iter #1148 [arrow] (alamb)

Performance improvements:

Improve parquet reading performance for columns with nulls by preserving bitmask when possible #1037 #1054 [parquet] [arrow] (tustvold)
Improve parquet performance: Skip levels computation for required struct arrays in parquet #1035 [parquet] (tustvold)

Closed issues:

Generify ColumnReaderImpl and RecordReader #1040 [parquet]
Parquet Preserve BitMask #1037

Merged pull requests:

fix a bug in variable sized equality #1209 [arrow] (helgikrs)
Pin WASM / packed SIMD tests to nightly-2022-01-17 #1204 (alamb)
feat: add support for casting Duration/Interval to Int64Array #1196 [arrow] (e-dard)
Add comparison support for fully qualified BinaryArray #1195 [arrow] (HaoYang670)
Fix in display of Float16Array #1194 [arrow] (helgikrs)
update nightly version for miri #1189 (Jimexist)
feat(parquet): support for reading structs nested within lists #1187 [parquet] (helgikrs)
fix: Fix a bug in how definition levels are calculated for nested structs in a list #1185 [parquet] (helgikrs)
Truncate bitmask on BooleanBufferBuilder::resize: #1183 [parquet] [arrow] (tustvold)
Add ticket reference for false positive in clippy #1181 [arrow] (alamb)
Fix record formatting in 1.58 #1178 [parquet] (tustvold)
Serialize i128 as JSON string #1175 [arrow] (tustvold)
Support DecimalType in sort and take kernels #1172 [arrow] (liukun4515)
Fix new clippy lints introduced in Rust 1.58 #1170 [parquet] [arrow] (alamb)
Fix compilation error with simd feature #1169 [arrow] (jhorstmann)
Fix bug while writing parquet with empty lists of structs #1166 [parquet] (helgikrs)
Use tempfile for parquet tests #1165 [parquet] (tustvold)
Remove left over dev/README.md file from arrow/arrow-rs split #1162 (alamb)
Add multiply_scalar kernel #1159 [arrow] (viirya)
Fuzz test different parquet encodings #1156 [parquet] (tustvold)
Add subtract_scalar kernel #1152 [arrow] (viirya)
Add add_scalar kernel #1151 [arrow] (viirya)
Move simd right out of for_each loop #1150 [arrow] (viirya)
Internal Remove GenericStringArray::from_vec and GenericStringArray::from_opt_vec #1147 [arrow] (alamb)
Implement SIMD comparison operations for types with less than 4 lanes i128 #1146 [arrow] (jhorstmann)
Extends parquet fuzz tests to also test nulls, dictionaries and row groups with multiple pages #1053 #1110 [parquet] (tustvold)
Generify ColumnReaderImpl and RecordReader #1040 #1041 [parquet] (tustvold)
BooleanBufferBuilder::append_packed #1038 #1039 [arrow] (tustvold)

7.0.0 (2022-01-07)

Full Changelog

Arrow

Breaking changes:

pretty_format_batches now returns Result<impl Display> rather than String: #975
MutableBuffer::typed_data_mut is marked unsafe: #1029
UnionArray updated to match the latest Arrow spec, added UnionMode, UnionArray::new() marked unsafe: #885

New Features:

Support for Float16Array types #888
IPC support for UnionArray #654
Dynamic comparison kernels for scalars (e.g. eq_dyn_scalar), including DictionaryArray: #1113

Enhancements:

Added Schema::with_metadata and Field::with_metadata #1092
Support for custom datetime format for inference and parsing csv files #1112
Implement Array for ArrayRef for easier use #1129
Pretty printing display support for FixedSizeBinaryArray #1097
Dependency Upgrades: pyo3, parquet-format, prost, tonic
Avoid allocating vector of indices in lexicographical_partition_ranges #998

Parquet

Fixed bugs:

(parquet) Fix reading of dictionary encoded pages with null values: #1130

Changelog

6.5.0 (2021-12-23)

Full Changelog

092fc64bbb019244887ebd0d9c9a2d3e3a9aebc0 support cast decimal to decimal (#1084) (#1093)
01459762ed18b504e00e7b2818fce91f19188b1e Fix like regex escaping (#1085) (#1090)
7c748bfccbc2eac0c1138378736b70dcb7e26a5b support cast decimal to signed numeric (#1073) (#1089)
bd3600b6483c253ae57a38928a636d39a6b7cb02 parquet: Use constant for RLE decoder buffer size (#1070) (#1088)
2b5c53ecd92468fd95328637a15de7f35b6fcf28 Box RleDecoder index buffer (#1061) (#1062) (#1081)
78721bc1a467177679ad6196b994759cf4d73377 BooleanBufferBuilder correct buffer length (#1051) (#1052) (#1080)
3a5e3541d3a4db61a828011ed95c8539adf1d57c support cast signed numeric to decimal (#1044) (#1079)
000bdb3053098255d43288aa3e8665e8b1892a6c fix(compute): LIKE escape parenthesis (#1042) (#1078)
e0abdb9e62772a2f853974e68e744246e7f47569 Add Schema::project and RecordBatch::project functions (#1033) (#1077)
31911a4d6328d889d98796b896412b3997f73e13 Remove outdated safety example from doc (#1050) (#1058)
71ac8620993a65a7f1f57278c3495556625356b3 Use existing array type in take kernel (#1046) (#1057)
1c5902376b7f7d56cb5249db4f98a6a370ead919 Extract method to drive PageIterator -> RecordReader (#1031) (#1056)
7ca39361f8733b86bc0cef5ed5d74093e2c6b14d Clarify governance of arrow crate (#1030) (#1055)

6.4.0 (2021-12-10)

Full Changelog

049f48559f578243935b6e512d06c4c2df360bf1 Force new cargo and target caching to fix CI (#1023) (#1024)
ef37da3b60f71a52d5ad67e9ca810dca38b29f00 Fix a broken link and some missing styling in the main arrow crate docs (#1013) (#1019)
f2c746a9b968714cfe05d35fcee8658371acd899 Remove out of date comment (#1008) (#1018)
557fc11e3b2a09a680c0cfbf38d27b13101b63fe Remove unneeded rc feature of serde (#990) (#1016)
b28385e096b1cf8f5fb2773d49b160f93d94fbac Docstrings for Timestamp*Array. (#988) (#1015)
a92672e40217670d2566a85d70b0b59fffac594c Add full data validation for ArrayData::try_new() (#1007)
6c8b2936d7b07e1e2f5d1d48eea425a385382dfb Add boolean comparison to scalar kernels for less than, greater than (#977) (#1005)
14d140aeca608a23a8a6b2c251c8f53ffd377e61 Fix some typos in code and comments (#985) (#1006)
b4507f562fb0eddfb79840871cd2733dc0e337cd Fix warnings introduced by Rust/Clippy 1.57.0 (#1004)

6.3.0 (2021-11-26)

Full Changelog

Changes:

7e51df015ce851a5de444ca08b57b38e7ee959a3 add more error test case and change the code style (#952) (#976)
6c570cfe98d6a7a4ec74b139b733c5c72ed10015 Support read decimal data from csv reader if user provide the schema with decimal data type (#941) (#974)
4fa0d4d7f7d9ca0a3da2a6dfe3eae6dc2d51a79a Adding Pretty Print Support For Fixed Size List (#958) (#968)
9d453a3128013c03e8ed854ded76b15cc6f28be4 Fix bug in temporal utilities due to DST being ignored. (#955) (#967)
1b9fd9e3fb2653236513bb7dda5aa2fa14d1d831 Inferring 2. as Float64 for issue #929 (#950) (#966)
e6c5e1c877bd94b3d6e545567f901d9962257cf8 Fix CI for latest nightly (#970) (#973)
c96e8de457442806e18944f0b26dd06ba4cb1aee Fix primitive sort when input contains more nulls than the given sort limit (#954) (#965)
094037d418381584178db1d886cad3b5024b414a Update comfy-table to 5.0 (#957) (#964)
9f635021eee6786c5377c891218c5f88ebce07c3 Fix csv writing of timestamps to show timezone. (#849) (#963)
f7deba4c3a050a52608462ee8a827bb8f6364140 Adding ability to parse float from number with leading decimal (#831) (#962)
59f96e842d05b63882f7ba285c66a9739761cf84 add ilike comparator (#874) (#961)
54023c8a5543c9f9fa4955afa01189029f3e96f5 Remove unpassable cargo publish check from verify-release-candidate.sh (#882) (#949)

6.2.0 (2021-11-12)

Full Changelog

Features / Fixes:

4037933e43cad9e4de027039ce14caa65f78300a Fix validation for offsets of StructArrays (#942) (#946)
1af9ca5d363d870550026a7b1abcb749befbb371 implement take kernel for null arrays (#939) (#944)
320de1c20aefbf204f6888e2ad3663863afeba9f add checker for appending i128 to decimal builder (#928) (#943)
dff14113884ad4246a8cafb9be579ebdb4e1481f Validate arguments to ArrayData::new and null bit buffer and buffers (#810) (#936)
c3eae1ec56303b97c9e15263063a6a13122ef194 fix some warning about unused variables in panic tests (#894) (#933)
e80bb018450f13a30811ffd244c42917d8bf8a62 fix some clippy warnings (#896) (#930)
bde89463b627be3f60b5569d038ca36c434da71d feat(ipc): add support for deserializing messages with nested dictionary fields (#923) (#931)
792544b5fb7b84224ef9745ecb9f330663c14fb4 refactor regexp_is_match_utf8_scalar to try to mitigate miri failures (#895) (#932)
3f0e252811cbb6e3f7c774959787dcfec985d03e Automatically retry failed MIRI runs to work around intermittent failures (#934)
c9a9515c46d560ced00e23ff57cb10a1c97573cb Update mod.rs (#909) (#919)
64ed79ece67141b92dc45b8a1d43cb9d909aa6a9 Mark boolean kernels public (#913) (#920)
8b95fe0bbf03588c5cc00f67365c5b0dac4d7a34 doc example mistype (#904) (#918)
34c5eab4862cab16fdfd5f5ed6c68dce6298dfa4 allow null array to be cast to all other types (#884) (#917)
3c69752e55ed0c58f5a8faed918a22b45cd93766 Fix instances of UB that cause tests to not pass under miri (#878) (#916)
85402148c3af03d0855e81f855715ea98a7491c5 feat(ipc): Support writing dictionaries nested in structs and unions (#870) (#915)
03d95e626cb0e654775fefa77786674ea41be4a2 Fix references to changelog (#905)

6.1.0 (2021-10-29)

Full Changelog

Features / Fixes:

b42649b0088fe7762c713a41a23c1abdf8d0496d implement eq_dyn and neq_dyn (#858) (#867)
01743f3f10a377c1ca857cd554acbf84155766d8 fix: fix a bug in offset calculation for unions (#863) (#871)
8bfff793a23f0e71008c7a9eea7a54d6b913ecff add lt_bool, lt_eq_bool, gt_bool, gt_eq_bool (#860) (#868)
8845e91d4ab584c822e9ee903db7069551b124af fix(ipc): Support serializing structs containing dictionaries (#848) (#865)
620282a0d9fdd2a8ed7e8313d17ba3dec64c80e5 Implement boolean equality kernels (#844) (#857)
94cddcacf785be982e69689291ce034ef00220b4 Cherry pick fix parquet_derive with default features (and fix cargo publish) (#856)
733fd583ddb3dbe6b4d58a809c444ee16ac0eae8 Use kernel utility for parsing timestamps in csv reader. (#832) (#853)
2cc64937a153f632796915d2d9869d5c2a501d28 [Minor] Fix clippy errors with new rust version (1.56) and float formatting with nightly (#845) (#850)

Other:

bfac9e5a027e3bd78b7a1ec90c75a3e385bd66bb Test out new tarpaulin version (#852) (#866)
809350ced392cfc78d8a1a46228d4ffc25dea9ff Update README.md (#834) (#854)
70582f40dd21f5c710c4946266d0563a92b92337 [MINOR] Delete temp file from docs (#836) (#855)
a721e00014015a7e598946b6efb9b1da8080ec85 Force fresh cargo cache key in CI (#839) (#851)

6.0.0 (2021-10-13)

Full Changelog

Breaking changes:

Replace ArrayData::new() with ArrayData::try_new() and unsafe ArrayData::new_unchecked #822 [parquet] [arrow] (alamb)
Update Bitmap::len to return bits rather than bytes #749 [arrow] (matthewmturner)
use sort_unstable_by in primitive sorting #552 [arrow] (Jimexist)
New MapArray support #491 [parquet] [arrow] (nevi-me)

Implemented enhancements:

Improve parquet binary writer speed by reducing allocations #819
Expose buffer operations #808
Add doc examples of writing parquet files using ArrowWriter #788

Fixed bugs:

JSON reader can create null struct children on empty lists #825
Incorrect null count for cast kernel for list arrays #815
minute and second temporal kernels do not respect timezone #500
Fix data corruption in json decoder f64-to-i64 cast #652 [arrow] (xianwill)

Documentation updates:

Doctest for PrimitiveArray using from_iter_values. #694 [arrow] (novemberkilo)
Doctests for BinaryArray and LargeBinaryArray. #625 [arrow] (novemberkilo)
Add links in docstrings #605 [arrow] (alamb)

5.5.0 (2021-09-24)

Full Changelog

Implemented enhancements:

parquet should depend on a small set of arrow features #800
Support equality on RecordBatch #735

Fixed bugs:

Converting from string to timestamp uses microseconds instead of milliseconds #780
Document has no link to RowColumnIter #762
length on slices with null doesn't work #744

5.4.0 (2021-09-10)

Full Changelog

Implemented enhancements:

Upgrade lexical-core to 0.8 #747
append_nulls and append_trusted_len_iter for PrimitiveBuilder #725
Optimize MutableArrayData::extend for null buffers #397

Fixed bugs:

Arithmetic with scalars doesn't work on slices #742
Comparisons with scalar don't work on slices #740
unary kernel doesn't respect offset #738
new_null_array creates invalid struct arrays #734
--no-default-features is broken for parquet #733 [parquet]
Bitmap::len returns the number of bytes, not bits. #730
Decimal logical type is formatted incorrectly by print_schema #713
parquet_derive does not support chrono time values #711
Numeric overflow when formatting Decimal type #710
The integration tests are not running #690

Closed issues:

Question: Is there no way to create a DictionaryArray with a pre-arranged mapping? #729

5.3.0 (2021-08-26)

Full Changelog

Implemented enhancements:

Add optimized filter kernel for regular expression matching #697
Can't cast from timestamp array to string array #587

Fixed bugs:

'Encoding DELTA_BYTE_ARRAY is not supported' with parquet arrow readers #708
Support reading json string into binary data type. #701

Closed issues:

Resolve Issues with prettytable-rs dependency #69 [arrow]

5.2.0 (2021-08-12)

Full Changelog

Implemented enhancements:

Make rand an optional dependency #671
Remove undefined behavior in value method of boolean and primitive arrays #645
Avoid materialization of indices in filter_record_batch for single arrays #636
Add a note about arrow crate security / safety #627
Allow the creation of String arrays from an iterator of &Option<&str> #598
Support arrow map datatype #395

Fixed bugs:

Parquet fixed length byte array columns write byte array statistics #660 [parquet]
Parquet boolean columns write Int32 statistics #659 [parquet]
Writing Parquet with a boolean column fails #657
JSON decoder data corruption for large i64/u64 #653
Incorrect min/max statistics for strings in parquet files #641 [parquet]

Closed issues:

Release candidate verifying script seems work on macOS #640
Update CONTRIBUTING #342

5.1.0 (2021-07-29)

Full Changelog

Implemented enhancements:

Make FFI_ArrowArray empty() public #602
exponential sort can be used to speed up lexico partition kernel #586
Implement sort() for binary array #568
primitive sorting can be improved and more consistent with and without limit if sorted unstably #553

Fixed bugs:

Confusing memory usage with CSV reader #623
FFI implementation deviates from specification for array release #595
Parquet file content is different if ~/.cargo is in a git checkout #589
Ensure output of MIRI is checked for success #581
MIRI failure in array::ffi::tests::test_struct and other ffi tests #580
ListArray equality check may return wrong result #570
cargo audit failed #561
ArrayData::slice() does not work for nested types such as StructArray #554

Documentation updates:

More examples of how to construct Arrays #301

Closed issues:

Implement StringBuilder::append_option #263 [arrow]

5.0.0 (2021-07-14)

Full Changelog

Breaking changes:

Remove lifetime from DynComparator #543 [arrow]
Simplify interactions with arrow flight APIs #376 [arrow-flight]
refactor: remove lifetime from DynComparator #542 [arrow] (e-dard)
use iterator for partition kernel instead of generating vec #438 [arrow] (Jimexist)
Remove DictionaryArray::keys_array method #419 [arrow] (jhorstmann)
simplify interactions with arrow flight APIs #377 [arrow-flight] (garyanaplan)
return reference from DictionaryArray::values() #313 #314 [arrow] (tustvold)

Implemented enhancements:

Allow creation of StringArrays from Vec<String> #519 [arrow]
Implement RecordBatch::concat #461 [arrow]
Implement RecordBatch::slice() to slice RecordBatches #460 [arrow]
Add a RecordBatch::split to split large batches into a set of smaller batches #343
generate parquet schema from rust struct #539 [parquet] (nevi-me)
Implement RecordBatch::concat #537 [arrow] (silathdiir)
Implement function slice for RecordBatch #490 [arrow] (b41sh)
add lexicographically partition points and ranges #424 [arrow] (Jimexist)
allow to read non-standard CSV #326 [arrow] (kazuk)
parquet: Speed up BitReader/DeltaBitPackDecoder #325 [parquet] (kornholi)
ARROW-12343: [Rust] Support auto-vectorization for min/max #9 [arrow] (Dandandan)
ARROW-12411: [Rust] Create RecordBatches from Iterators #7 [arrow] (alamb)

Fixed bugs:

Error building on master - error: cyclic package dependency: package ahash v0.7.4 depends on itself. Cycle #544
IPC reader panics with out of bounds error #541
Take kernel doesn't handle nulls and structs correctly #530 [arrow]
master fails to compile with default-features=false #529
README developer instructions out of date #523
Update rustc and packed_simd in CI before 5.0 release #517
Incorrect memory usage calculation for dictionary arrays #503 [arrow]
sliced null buffers lead to incorrect result in take kernel and probably on other places #502
Cast of utf8 types and list container types don't respect offset #334 [arrow]
fix take kernel null handling on structs #531 [arrow] (bjchambers)
Correct array memory usage calculation for dictionary arrays #505 [arrow] (jhorstmann)
parquet: improve BOOLEAN writing logic and report error on encoding fail #443 [parquet] (garyanaplan)
Fix bug with null buffer offset in boolean not kernel #418 [arrow] (jhorstmann)
respect offset in utf8 and list casts #335 [arrow] (ritchie46)
Fix comparison of dictionaries with different values arrays #332 #333 [arrow] (tustvold)
ensure null-counts are written for all-null columns #307 [parquet] (crepererum)
fix invalid null handling in filter #296 [arrow] (ritchie46)
fix NaN handling in parquet statistics #256 (crepererum)

Documentation updates:

Improve arrow's crate's readme on crates.io #463
Clean up README.md in advance of the 5.0 release #536 [arrow] [arrow-flight] [parquet] (alamb)
fix readme instructions to reflect new structure #524 (marcvanheerden)
Improve docs for NullArray, new_null_array and new_empty_array #240 [arrow] (alamb)

Merged pull requests:

Fix default arrow build #533 [arrow] (alamb)
Add tests for building applications using arrow with different feature flags #532 [arrow] (alamb)
Remove unused futures dependency from arrow-flight #528 [arrow-flight] (alamb)
CI: update rust nightly and packed_simd #525 [arrow] (ritchie46)
Support StringArray creation from String Vec #522 [arrow] (silathdiir)
Fix parquet benchmark schema #513 [parquet] (nevi-me)
Fix parquet definition levels #511 [parquet] (nevi-me)
Fix for primitive and boolean take kernel for nullable indices with an offset #509 [arrow] (jhorstmann)
Bump flatbuffers #499 [arrow] (PsiACE)
implement second/minute helpers for temporal #493 [arrow] (ovr)
special case concatenating single element array shortcut #492 [arrow] (Jimexist)
update docs to reflect recent changes joins and window functions #489 (Jimexist)
Update rand, proc-macro and zstd dependencies #488 [arrow] [arrow-flight] [parquet] (alamb)
Doctest for GenericListArray. #474 [arrow] (novemberkilo)
remove stale comment on ArrayData equality and update unit tests #472 (Jimexist)
remove unused patch file #471 (Jimexist)
fix clippy warnings for rust 1.53 #470 (Jimexist)
Fix PR labeler #468 (Dandandan)
Tweak dev backporting docs #466 (alamb)
Unvendor Archery #459 (kszucs)
Add sort boolean benchmark #457 (alamb)
Add C data interface for decimal128 and timestamp #453 [arrow] (alippai)
Implement the Iterator trait for the json Reader. #451 [arrow] (LaurentMazare)
Update release docs + release email template #450 (alamb)
remove clippy unnecessary wraps suppression in cast kernel #449 (Jimexist)
Use partition for bool sort #448 (Jimexist)
remove unnecessary wraps in sort #445 (Jimexist)
Python FFI bridge for Schema, Field and DataType #439 [arrow] (kszucs)
Update release Readme.md #436 (alamb)
Derive Eq and PartialEq for SortOptions #425 (tustvold)
refactor lexico sort for future code reuse #423 (Jimexist)
Reenable MIRI check on PRs #421 (alamb)
Sort by float lists #420 (medwards)
Fix out of bounds read in bit chunk iterator #416 (jhorstmann)
Doctests for DecimalArray. #414 (novemberkilo)
Add Decimal to CsvWriter and improve debug display #406 (alippai)
MINOR: update install instruction #400 (alippai)
use prettier to auto format md files #398 (Jimexist)
window::shift to work for all array types #388 (Jimexist)
add more tests for window::shift and handle boundary cases #386 (Jimexist)
Implement faster arrow array reader #384 (yordan-pavlov)
Add set_bit to BooleanBufferBuilder to allow mutating bit in index #383 (boazberman)
make sure that only concat preallocates buffers #382 (ritchie46)
Respect max rowgroup size in Arrow writer #381 [parquet] (nevi-me)
Fix typo in release script, update release location #380 (alamb)
Doctests for FixedSizeBinaryArray #378 (novemberkilo)
Simplify shift kernel using new_null_array #370 (Dandandan)
allow SliceableCursor to be constructed from an Arc directly #369 (crepererum)
Add doctest for ArrayBuilder #367 (alippai)
Fix version in readme #365 (domoritz)
Remove superfluous space #363 (domoritz)
Add crate badges #362 (domoritz)
Disable MIRI check until it runs cleanly on CI #360 (alamb)
Only register Flight.proto with cargo if it exists #351 (tustvold)
Reduce memory usage of concat (large)utf8 #348 (ritchie46)
Fix filter UB and add fast path #341 (ritchie46)
Automatic cherry-pick script #339 (alamb)
Doctests for BooleanArray. #338 (novemberkilo)
feature gate ipc reader/writer #336 (ritchie46)
Add ported Rust release verification script #331 (wesm)
Doctests for StringArray and LargeStringArray. #330 (novemberkilo)
inline PrimitiveArray::value #329 (ritchie46)
Enable wasm32 as a target architecture for the SIMD feature #324 (roee88)
Fix undefined behavior in FFI and enable MIRI checks on CI #323 (roee88)
Mutablebuffer::shrink_to_fit #318 [arrow] (ritchie46)
Add simd modulus op #317 (gangliao)
feature gate csv functionality #312 [arrow] (ritchie46)
Remove old release scripts #293 (alamb)
Add Send to the ArrayBuilder trait #291 (Max-Meldrum)
Added changelog generator script and configuration. #289 (jorgecarleitao)
manually bump development version #288 (nevi-me)
Fix FFI and add support for Struct type #287 (roee88)
Fix subtraction underflow when sorting string arrays with many nulls #285 (medwards)
Speed up bound checking in take #281 (Dandandan)
Update PR template by commenting out instructions #278 (nevi-me)
Added Decimal support to pretty-print display utility (#230) #273 (mgill25)
Fix null struct and list roundtrip #270 (nevi-me)
1.52 clippy fixes #267 (nevi-me)
Fix typo in csv/reader.rs #265 (domoritz)
Fix empty Schema::metadata deserialization error #260 (hulunbier)
update datafusion and ballista doc links #259 (Jimexist)
support full u32 and u64 roundtrip through parquet #258 [parquet] (crepererum)
fix parquet max_definition for non-null structs #246 (nevi-me)
Disabled rebase needed until demonstrate working. #243 (jorgecarleitao)
pin flatbuffers to 0.8.4 #239 (ritchie46)
sort_primitive result is capped to the min of limit or values.len #236 (medwards)
Read list field correctly #234 [parquet] (nevi-me)
Fix code examples for RecordBatch::try_from_iter #231 (alamb)
Support string dictionaries in csv reader (#228) #229 (tustvold)
support LargeUtf8 in sort kernel #26 (ritchie46)
Removed unused files #22 (jorgecarleitao)
ARROW-12504: Buffer::from_slice_ref set correct capacity #18 [arrow] (tustvold)
Add GitHub templates #17 (andygrove)
ARROW-12493: Add support for writing dictionary arrays to CSV and JSON #16 [arrow] (tustvold)
ARROW-12426: [Rust] Fix concatenation of arrow dictionaries #15 [arrow] (tustvold)
Update repository and homepage urls #14 [arrow] [arrow-flight] [parquet] (Dandandan)
Added rebase-needed bot #13 (jorgecarleitao)
Added Integration tests against arrow #10 (jorgecarleitao)

4.4.0 (2021-06-24)

Full Changelog

Breaking changes:

migrate partition kernel to use Iterator trait #437 [arrow]
Remove DictionaryArray::keys_array #391 [arrow]

Implemented enhancements:

sort kernel boolean sort can be O(n) #447 [arrow]
C data interface for decimal128, timestamp, date32 and date64 #413
Add Decimal to CsvWriter #405
Use iterators to increase performance of creating Arrow arrays #200 [parquet]

Fixed bugs:

Release Audit Tool RAT is not being triggered #481
Security Vulnerabilities: flatbuffers: read_scalar and read_scalar_at allow transmuting values without unsafe blocks #476
Clippy broken after upgrade to rust 1.53 #467
Pull Request Labeler is not working #462
Arrow 4.3 release: error[E0658]: use of unstable library feature 'partition_point': new API #456
parquet reading hangs when row_group contains more than 2048 rows of data #349
Fail to build arrow #247
JSON reader does not implement iterator #193 [arrow]

Security fixes:

Ensure a successful MIRI Run on CI #227

Closed issues:

sort kernel has a lot of unnecessary wrapping #446

4.3.0 (2021-06-10)

Full Changelog

Implemented enhancements:

Add partitioning kernel for sorted arrays #428 [arrow]
Implement sort by float lists #427 [arrow]
Derive Eq and PartialEq for SortOptions #426 [arrow]
use prettier and github action to normalize markdown document syntax #399
window::shift can work for more than just primitive array type #392
Doctest for ArrayBuilder #366

Fixed bugs:

Boolean not kernel does not take offset of null buffer into account #417
my contribution not merged in 4.2 release #394
window::shift shall properly handle boundary cases #387
Parquet WriterProperties.max_row_group_size not wired up #257
Out of bound reads in chunk iterator #198 [arrow]

4.2.0 (2021-05-29)

Full Changelog

Breaking changes:

DictionaryArray::values() clones the underlying ArrayRef #313 [arrow]

Implemented enhancements:

Simplify shift kernel using null array #371
Provide Arc-based constructor for parquet::util::cursor::SliceableCursor #368
Add badges to crates #361
Consider inlining PrimitiveArray::value #328
Implement automated release verification script #327
Add wasm32 to the list of target architectures of the simd feature #316
add with_escape for csv::ReaderBuilder #315 [arrow]
IPC feature gate #310
csv feature gate #309 [arrow]
Add shrink_to / shrink_to_fit to MutableBuffer #297

Fixed bugs:

Incorrect crate setup instructions #364
Arrow-flight only register rerun-if-changed if file exists #350
Dictionary Comparison Uses Wrong Values Array #332
Undefined behavior in FFI implementation #322
All-null column get wrong parquet null-counts #306 [parquet]
Filter has inconsistent null handling #295

4.1.0 (2021-05-17)

Full Changelog

Implemented enhancements:

Add Send to ArrayBuilder #290 [arrow]
Improve performance of bound checking option #280 [arrow]
extend compute kernel arity to include nullary functions #276
Implement FFI / CDataInterface for Struct Arrays #251 [arrow]
Add support for pretty-printing Decimal numbers #230 [arrow]
CSV Reader String Dictionary Support #228 [arrow]
Add Builder interface for adding Arrays to record batches #210 [arrow]
Support auto-vectorization for min/max #209 [arrow]
Support LargeUtf8 in sort kernel #25 [arrow]

Fixed bugs:

no method named select_nth_unstable_by found for mutable reference &mut [T] #283
Rust 1.52 Clippy error #266
NaNs can break parquet statistics #255 [parquet]
u64::MAX does not roundtrip through parquet #254 [parquet]
Integration tests failing to compile flatbuffer #249 [arrow]
Fix compatibility quirks between arrow and parquet structs #245 [parquet]
Unable to write non-null Arrow structs to Parquet #244 [parquet]
schema: missing field metadata when deserialize #241 [arrow]
Arrow does not compile due to flatbuffers upgrade #238 [arrow]
Sort with limit panics for the limit includes some but not all nulls, for large arrays #235 [arrow]
arrow-rs contains a copy of the "format" directory #233 [arrow]
Fix SEGFAULT/ SIGILL in child-data ffi #206 [arrow]
Read list field correctly in <struct<list>> #167 [parquet]
FFI listarray lead to undefined behavior. #20

Security fixes:

Fix MIRI build on CI #226 [arrow]
Get MIRI running again #224 [arrow]

Documentation updates:

Comment out the instructions in the PR template #277
Update links to datafusion and ballista in README.md #19
Update "repository" in Cargo.toml #12

Closed issues:

Arrow Aligned Vec #268
Umbrella issue for clippy integration #217 [arrow]
Support sort #215 [arrow]
Support stable Rust #214 [arrow]
Remove Rust and point integration tests to arrow-rs repo #211 [arrow]
ArrayData buffers are inconsistent across implementations #207
3.0.1 patch release #204
Document patch release process #202
Simplify Offset #186 [arrow]
Typed Bytes #185 [arrow]
Improve take primitive performance #174
Update assignees in JIRA where missing #160
Merge utils from Parquet and Arrow #32 [arrow] [parquet]
Add benchmarks for Parquet #30 [parquet]
Mark methods that do not perform bounds checking as unsafe #28 [arrow]
Test issue #24 [arrow]
This is a test issue #11
