Commit graph

43 commits

Author SHA1 Message Date
Michael Spector b189180aa1
Merge pull request #14 from urishapira/cslschame-update
updated cslschama types
2022-02-20 08:45:16 +02:00
Michael Spector 5462c9b3c1
Merge branch 'dev' into cslschame-update 2022-02-20 08:44:24 +02:00
sigorbor dfa72004fd
Update Package.nuspec to 0.1.12
To avoid conflict between Nuget feeds, promote the version
2022-02-15 12:15:57 +02:00
Michael Spector c3541e81aa
Merge pull request #15 from urishapira/gt-parquet-row-groups-data
Get parquet row groups data
2022-02-15 11:06:48 +02:00
urishapira 7bdca360a3 updated package.nuspec 2022-02-15 09:54:24 +02:00
urishapira 06fa6ab965 format 2022-02-15 09:51:38 +02:00
urishapira 2586c53db0 updated schema 2022-02-15 09:41:51 +02:00
urishapira 28b50bf4d9 Added row groups data 2022-02-14 16:35:44 +02:00
urishapira 8baec66f0e updated cslschama types 2022-02-09 16:37:03 +02:00
Michael Spector 9f00a1f95b Update Cargo.lock 2022-01-18 09:15:37 +02:00
sigorbor 23a6f6ad2b
Merge pull request #13 from sigorbor/nuget0.1.10
Promote NuGet package version + cargo fmt
2021-07-05 16:12:52 +03:00
Igor Borodin 8e0b51486f Promote NuGet package version
cargo fmt the code
2021-07-05 15:54:48 +03:00
Michael Spector ed1e7ecffd
Merge pull request #12 from sigorbor/missing_columns_support
Add empty CSV values for missing PQ columns
2021-06-30 11:52:43 +03:00
Igor Borodin 9a6308d861 DOn't require --columns argument 2021-06-29 13:34:24 +03:00
Igor Borodin a14622fe7c Add empty CSV values for missing PQ columns 2021-06-28 16:36:48 +03:00
Michael Spector 16a0031c37 Increment version (badly uploaded package) 2021-03-11 15:31:33 +02:00
Michael Spector a787dc5c4f
Merge pull request #10 from Azure/convert_types
Enable implicit Parquet to Kusto types conversion
2021-03-11 14:24:52 +02:00
Michael Spector 328e336d9f Added configuration option that enables implicit Parquet to Kusto types conversion 2021-03-11 14:19:52 +02:00
Michael Spector ce0ff04e6b Support numeric keys in MAP type 2021-02-10 07:15:06 +02:00
Michael Spector d5172cbbf1
Merge pull request #9 from Azure/cslschema
Added an option for getting CSL schema of Parquet file
2020-12-01 13:17:41 +00:00
Michael Spector 881d4c789b Added an option for getting CSL schema of Parquet file 2020-12-01 15:13:39 +02:00
Michael Spector 38159e93ba
Merge pull request #8 from Azure/csv
Support CSV output format
2020-11-15 07:07:15 +00:00
Michael Spector 64b1046380 Format date logical type as string 2020-11-14 13:06:28 +02:00
Michael Spector 75527e2d25 Use ryu for rendering floats (the same way as serde_json) 2020-11-13 08:42:30 +02:00
Michael Spector f88a5485b3 Eliminate double line terminators 2020-11-12 14:15:40 +02:00
Michael Spector c965c9d313 Fixed escaping of strings with quotes by proper usage 2020-11-12 11:05:06 +02:00
Michael Spector 45e6d85e5e Add support for CSV output format.
When `--csv` argument is used, the utility produces CSV format.
Nested top level elements (arrays, objects) are formatted as JSON.

Using the option alone saves some time already:

```
PS C:\Users\mispecto\Projects\azure-kusto-parquet-conv> Measure-Command { .\target\debug\pq2json.exe ..\..\Downloads\20200809201109710_53ea103e_e9fabfe6_001.parquet > 1 }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 11
Milliseconds      : 361
Ticks             : 113610039
TotalDays         : 0.000131493100694444
TotalHours        : 0.00315583441666667
TotalMinutes      : 0.189350065
TotalSeconds      : 11.3610039
TotalMilliseconds : 11361.0039

PS C:\Users\mispecto\Projects\azure-kusto-parquet-conv> Measure-Command { .\target\debug\pq2json.exe --csv ..\..\Downloads\20200809201109710_53ea103e_e9fabfe6_001.parquet > 1 }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 5
Milliseconds      : 624
Ticks             : 56247514
TotalDays         : 6.51012893518518E-05
TotalHours        : 0.00156243094444444
TotalMinutes      : 0.0937458566666667
TotalSeconds      : 5.6247514
TotalMilliseconds : 5624.7514
```
2020-11-11 16:15:14 +02:00
Michael Spector d6d3fb9fc7
Merge pull request #7 from Azure/upgrade_arrow
Fixed crash when reading a null byte array field.
2020-05-05 13:31:56 +03:00
Michael Spector b4b4a5956b Fixed crash when reading a null byte array field.
The fix is in the new rzheka/arrow version:
https://github.com/rzheka/arrow/pull/4

rzheka/arrow was also updated from the upstream
repository (official Apache Arrow repo).
2020-05-05 12:57:09 +03:00
Michael Spector 000b5d71a3 Use correct name for schema projection 2019-12-03 09:44:22 +02:00
Michael Spector f8493eb414 Increment package version (added --columns option) 2019-12-03 08:47:22 +02:00
Michael Spector f95887bda6
Merge pull request #6 from spektom/columns_projection
Allow selecting specific (top-level) columns from Parquet file
2019-12-02 16:14:35 +02:00
Michael Spector 926d14230d
Merge branch 'dev' into columns_projection 2019-12-02 16:14:22 +02:00
Michael Spector f4b8c7df8b
Merge pull request #5 from spektom/int96_update
Updated Arrow library for better INT96 support
2019-12-02 15:44:34 +02:00
Michael Spector c657251de4 Updated Arrow library for better INT96 support 2019-11-26 13:09:01 +02:00
Michael Spector 576beb446f Allow selecting specific (top-level) columns from Parquet file 2019-11-18 12:15:15 +02:00
Evgeney Ryzhyk 8102e87867
Add timestamp rendering options (#3) 2019-06-25 19:16:13 +03:00
Evgeney Ryzhyk b117a4dc3f
Add nuspec 2019-06-24 23:43:15 +03:00
Evgeney Ryzhyk ef323be562
pq2json tool: initial implementation (#1) 2019-06-24 22:48:18 +03:00
Evgeney Ryzhyk 0225380ae5 README fix 2019-06-24 15:57:04 +03:00
Microsoft Open Source debe79272b Initial commit 2019-06-24 05:35:26 -07:00
Microsoft Open Source 1acf700590 Initial commit 2019-06-24 05:35:25 -07:00
Microsoft GitHub User 3aa02da754
Initial commit 2019-06-24 05:35:22 -07:00