azure-kusto-parquet-conv/arrow-rs
Michael Spector e7c8571478 Removed fake AWS credentials from arrow-rs test code 2024-06-30 10:39:12 +03:00
.github Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-arith Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-array Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-avro Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-buffer Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-cast Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-csv Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-data Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-flight Added missing files 2024-06-25 17:23:25 +03:00
arrow-integration-test Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-integration-testing Added missing files 2024-06-25 17:23:25 +03:00
arrow-ipc Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-json Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-ord Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-pyarrow-integration-testing Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-row Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-schema Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-select Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
arrow-string Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
conbench Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
dev Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
format Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
object_store Removed fake AWS credentials from arrow-rs test code 2024-06-30 10:39:12 +03:00
parquet Added missing files 2024-06-25 17:23:25 +03:00
parquet_derive Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
parquet_derive_test Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
.asf.yaml Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
.gitattributes Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
.github_changelog_generator Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
.gitignore Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
.gitmodules Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
.pre-commit-config.yaml Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
CHANGELOG-old.md Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
CHANGELOG.md Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
CODE_OF_CONDUCT.md Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
CONTRIBUTING.md Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
Cargo.toml Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
LICENSE.txt Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
NOTICE.txt Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
README.md Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
header Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
pre-commit.sh Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00
rustfmt.toml Support microsecond precision in timestamps 2024-06-25 14:22:29 +03:00

README.md

Native Rust implementation of Apache Arrow and Apache Parquet

Welcome to the Rust implementation of Apache Arrow, the popular in-memory columnar format.

This repo contains the following main components:

| Crate | Description | Latest API Docs | README |
| --- | --- | --- | --- |
| arrow | Core functionality (memory layout, arrays, low-level computations) | docs.rs | (README) |
| arrow-flight | Support for Arrow-Flight IPC protocol | docs.rs | (README) |
| object_store | Support for object store interactions (aws, azure, gcp, local, in-memory) | docs.rs | (README) |
| parquet | Support for Parquet columnar file format | docs.rs | (README) |
| parquet_derive | A crate for deriving RecordWriter/RecordReader for arbitrary, simple structs | docs.rs | (README) |

The API documentation for the current development version in this repo can be found here.

Release Versioning and Schedule

arrow and parquet crates

The Arrow Rust project releases approximately monthly and follows Semantic Versioning.

Due to available maintainer and testing bandwidth, arrow crates (arrow, arrow-flight, etc.) are released on the same schedule with the same versions as the parquet and parquet_derive crates.

Starting June 2024, we plan to release new major versions with potentially breaking API changes at most once a quarter, and release incremental minor versions in the intervening months. See this ticket for more details.

For example:

| Approximate Date | Version | Notes |
| --- | --- | --- |
| Jun 2024 | 52.0.0 | Major, potentially breaking API changes |
| Jul 2024 | 52.1.0 | Minor, NO breaking API changes |
| Aug 2024 | 52.2.0 | Minor, NO breaking API changes |
| Sep 2024 | 53.0.0 | Major, potentially breaking API changes |
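
The cadence in the example schedule above is simple arithmetic, and a minimal sketch may make it concrete. The `planned_release` function and `PlannedRelease` struct are hypothetical names used only for illustration. The sketch assumes exactly one release per month and a major release every third month starting from 52.0.0 in June 2024; actual releases are only approximately monthly and may deviate from this plan.

```rust
/// A planned release in the example schedule (illustrative type, not part of any crate).
#[derive(Debug, PartialEq)]
struct PlannedRelease {
    major: u32,
    minor: u32,
    /// True when the release is a major version and may contain breaking API changes.
    breaking_allowed: bool,
}

/// Maps "months elapsed since June 2024" to the planned version.
/// Assumption: one release per month, one major release every third month.
fn planned_release(months_since_june_2024: u32) -> PlannedRelease {
    let major = 52 + months_since_june_2024 / 3;
    let minor = months_since_june_2024 % 3;
    PlannedRelease {
        major,
        minor,
        breaking_allowed: minor == 0,
    }
}

fn main() {
    let labels = ["Jun 2024", "Jul 2024", "Aug 2024", "Sep 2024"];
    for (offset, label) in labels.iter().enumerate() {
        let r = planned_release(offset as u32);
        println!(
            "{}: {}.{}.0 (breaking changes allowed: {})",
            label, r.major, r.minor, r.breaking_allowed
        );
    }
}
```

Under these assumptions the four offsets reproduce the rows of the example schedule: 52.0.0, 52.1.0, 52.2.0, and 53.0.0.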

object_store crate

The object_store crate is released independently of the arrow and parquet crates and follows Semantic Versioning. We aim to release new versions approximately every 2 months.
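
Because each crate family is versioned separately under Semantic Versioning, it can help to check whether a given upgrade may contain breaking changes. The sketch below is an illustration, not part of any crate: `may_break` is a hypothetical helper, versions are plain `(major, minor, patch)` tuples, and the rules assume Cargo's interpretation of Semantic Versioning, in which the leftmost non-zero component decides compatibility. It also assumes `to` is an upgrade from `from` and does not check ordering.

```rust
/// Returns true when moving from `from` to `to` may include breaking changes.
/// Assumption: Cargo-style compatibility, where the leftmost non-zero
/// component of the version must stay the same for a compatible upgrade.
fn may_break(from: (u64, u64, u64), to: (u64, u64, u64)) -> bool {
    match from {
        // 0.0.z: every release is treated as incompatible with every other.
        (0, 0, _) => from != to,
        // 0.y.z: a change in the minor version is treated as incompatible.
        (0, minor, _) => to.0 != 0 || to.1 != minor,
        // x.y.z with x > 0: only a change in the major version is incompatible.
        (major, _, _) => to.0 != major,
    }
}

fn main() {
    // Version numbers below are examples only.
    println!("52.0.0 -> 52.2.0: {}", may_break((52, 0, 0), (52, 2, 0)));
    println!("52.2.0 -> 53.0.0: {}", may_break((52, 2, 0), (53, 0, 0)));
    println!("0.9.1 -> 0.10.0: {}", may_break((0, 9, 1), (0, 10, 0)));
}
```

The pre-1.0 branch matters for any crate that has not yet published a 1.0 release, since a minor bump there carries the same risk as a major bump elsewhere.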

There are several related crates in different repositories:

| Crate | Description | Documentation |
| --- | --- | --- |
| datafusion | In-memory query engine with SQL support | (README) |
| ballista | Distributed query execution | (README) |
| object_store_opendal | Use opendal as object_store backend | (README) |

Collectively, these crates support a wider array of functionality for analytic computations in Rust.

For example, you can write a SQL query or a DataFrame expression (using the datafusion crate) to read a parquet file (using the parquet crate), evaluate it in memory using Arrow's columnar format (using the arrow crate), and send the results to another process (using the arrow-flight crate).

Generally speaking, the arrow crate offers functionality for using Arrow arrays, and datafusion offers most operations typically found in SQL, including joins and window functions.

You can find more details about each crate in their respective READMEs.

Arrow Rust Community

The dev@arrow.apache.org mailing list serves as the core communication channel for the Arrow community. Instructions for signing up and links to the archives can be found on the Arrow Community page. All major announcements and communications happen there.

The Rust Arrow community also uses the official ASF Slack for informal discussions and coordination. This is a great place to meet other contributors and get guidance on where to contribute. Join us in the #arrow-rust channel and feel free to ask for an invite via:

  1. the dev@arrow.apache.org mailing list
  2. the GitHub Discussions
  3. the Discord channel

The Rust implementation uses GitHub issues as the system of record for new features and bug fixes, and this plays a critical role in the release process.

For design discussions we generally collaborate on Google documents and file a GitHub issue linking to the document.

There is more information in the contributing guide.