VTX-666: Sync from upstream #51

Merged: 1,994 commits, Sep 26, 2024
7ef6be4
Preallocate for `FixedSizeList` in `concat` (#5862)
judahrand Jun 21, 2024
13c9e90
Add eq benchmark for StringArray/StringViewArray (#5924)
XiangpengHao Jun 21, 2024
9413cd3
Add the ability for Maps to cast to another case where the field name…
HawaiianSpork Jun 22, 2024
86eb191
fix(ipc): set correct row count when reading struct arrays with zero …
kawadakk Jun 23, 2024
02fb714
Update zstd-sys requirement from >=2.0.0, <2.0.10 to >=2.0.0, <2.0.12…
dependabot[bot] Jun 23, 2024
0ea074a
Add `MultipartUpload` blanket implementation for `Box<W>` (#5919)
fsdvh Jun 23, 2024
a35214f
Fix typo in benchmarks (#5935)
alamb Jun 23, 2024
063ac13
row format benches for bool & nullable int (#5943)
korowa Jun 23, 2024
0c3a24d
Implement arrow-row encoding/decoding for view types (#5922)
XiangpengHao Jun 24, 2024
c084342
Better document support for nested comparison (#5942)
tustvold Jun 24, 2024
3139a08
Update quick-xml requirement from 0.32.0 to 0.33.0 in /object_store (…
dependabot[bot] Jun 24, 2024
66bada5
Implement like/ilike etc for StringViewArray (#5931)
XiangpengHao Jun 24, 2024
460fd55
test: Add unit test for extending slice of list array (#5948)
viirya Jun 25, 2024
2323c74
Update quick-xml requirement from 0.33.0 to 0.34.0 in /object_store (…
dependabot[bot] Jun 25, 2024
901fbe8
Minor: fixup contribution guide (#5952)
alamb Jun 25, 2024
0e56fd5
chore(5797): change default data_page_row_limit to 20k (#5957)
wiedld Jun 25, 2024
4b326f6
Improve error message for unsupported nested comparison (#5961)
alamb Jun 26, 2024
45190ab
feat: add max_bytes and min_bytes on PageIndex (#5950)
tshauck Jun 26, 2024
6b03162
Faster primitive arrays encoding into row format (#5858)
korowa Jun 26, 2024
e5604aa
Document process for PRs with breaking changes (#5953)
alamb Jun 26, 2024
1ef22e5
`like` benchmark for StringView (#5936)
alamb Jun 26, 2024
ee55721
Expose `IntervalMonthDayNano` and `IntervalDayTime` and update docs (…
alamb Jun 26, 2024
6bc9514
implement sort for view types (#5963)
XiangpengHao Jun 26, 2024
0a4d8a1
Fix FFI array offset handling (#5964)
tustvold Jun 26, 2024
c5b5eda
Add benchmark for reading binary/binary view from parquet (#5968)
XiangpengHao Jun 28, 2024
a7b4a3b
Add view buffer for parquet reader (#5970)
XiangpengHao Jun 28, 2024
871c999
Handle flight dictionary ID assignment automatically (#5971)
thinkharderdev Jun 29, 2024
a4d2167
Make ObjectStoreScheme public (#5912)
orf Jun 30, 2024
6230435
Add operation in ArrowNativeTypeOp::neg_check error message (#5944) (…
zhao-gang Jul 1, 2024
62c1615
feat: support reading OPTIONAL column in parquet_derive (#5717)
double-free Jul 1, 2024
8e9bdce
Update quick-xml requirement from 0.34.0 to 0.35.0 in /object_store (…
dependabot[bot] Jul 1, 2024
8284e5f
Reduce repo size by removing accumulative commits in CI job (#5982)
Owen-CH-Leung Jul 1, 2024
bb1250c
Minor: fix clippy complaint in parquet_derive (#5984)
alamb Jul 1, 2024
cad5735
Add user defined metadata (#5915)
criccomini Jul 2, 2024
6351674
Provide Arrow Schema Hint to Parquet Reader - Alternative 2 (#5939)
efredine Jul 2, 2024
3b93a4b
WriteMultipart Abort on MultipartUpload::complete Error (#5974)
fsdvh Jul 2, 2024
859c4ad
Implement directly build byte view array on top of parquet buffer (#5…
XiangpengHao Jul 2, 2024
ebc1cb1
fix: error in case of invalid interval expression (#5987)
DDtKey Jul 2, 2024
e61fb62
Add ParquetMetadata::memory_size size estimation (#5965)
alamb Jul 2, 2024
5c6f857
feat(5851): ArrowWriter memory usage (#5967)
wiedld Jul 2, 2024
035b589
Prepare arrow `52.1.0` (#5992)
alamb Jul 2, 2024
e7a0008
Implement dictionary support for reading ByteView from parquet (#5973)
XiangpengHao Jul 3, 2024
1f0b000
implement `DataType::try_form(&str)` (#5994)
samuelcolvin Jul 3, 2024
bed3746
Add additional documentation and examples to DataType (#5997)
alamb Jul 5, 2024
fd5e67d
Automatically cleanup empty dirs in LocalFileSystem (#5978)
fsdvh Jul 6, 2024
a85768d
Add FlightSqlServiceClient::new_from_inner (#6003)
lewiszlw Jul 6, 2024
b9562b9
fix doc ci in latest rust nightly version (#6012)
Rachelint Jul 6, 2024
2b986df
Deduplicate strings/binarys when building view types (#6005)
XiangpengHao Jul 8, 2024
af4d6b6
Fast utf8 validation when loading string view from parquet (#6009)
XiangpengHao Jul 8, 2024
b9e4497
Rename `Schema::all_fields` to `flattened_fields` (#6001)
lewiszlw Jul 8, 2024
8355823
Complete `StringViewArray` and `BinaryViewArray` parquet decoder: im…
XiangpengHao Jul 8, 2024
76fbdbc
Update zstd-sys requirement from >=2.0.0, <2.0.12 to >=2.0.0, <2.0.13…
dependabot[bot] Jul 9, 2024
c47f230
Update clap test (#6028)
tustvold Jul 9, 2024
3ce8e84
Unsafe improvements: core `parquet` crate. (#6024)
veluca93 Jul 9, 2024
cb3babc
Improve performance reading `ByteViewArray` from parquet by removing …
XiangpengHao Jul 10, 2024
826577a
Update quick-xml requirement from 0.35.0 to 0.36.0 in /object_store (…
dependabot[bot] Jul 10, 2024
2424da2
Fix `hashbrown` version in `arrow-array`, remove from `arrow-row` (#6…
mbrobbel Jul 11, 2024
50b1e30
Additional tests for parquet reader utf8 validation (#6023)
alamb Jul 11, 2024
e70c16d
Clean up unused code for view types in offset buffer (#6040)
XiangpengHao Jul 11, 2024
920a944
Move avoid using copy-based buffer creation (#6039)
XiangpengHao Jul 12, 2024
199ce91
Fix 5592: Colon (:) in in object_store::path::{Path} is not handled o…
hesampakdaman Jul 13, 2024
9acc9fa
Minor API adjustments for StringViewBuilder (#6047)
XiangpengHao Jul 15, 2024
0002b4d
Fix typo in GenericByteViewArray documentation (#6054)
progval Jul 15, 2024
074bcb5
Directly decode String/BinaryView types from arrow-row format (#6044)
XiangpengHao Jul 15, 2024
31b8ba0
Add begin/end_transaction methods in FlightSqlServiceClient (#6026)
lewiszlw Jul 15, 2024
6d4e2f2
Implement min max support for string/binary view types (#6053)
XiangpengHao Jul 15, 2024
66390ff
Add parquet `StatisticsConverter` for arrow reader (#6046)
efredine Jul 16, 2024
b2458bd
StringView support in arrow-csv (#6062)
2010YOUY01 Jul 16, 2024
b72098f
Minor: clarify the relationship between `file::metadata` and `format`…
alamb Jul 16, 2024
6ab853d
Do not write `ColumnIndex` for null columns when not writing page sta…
etseidl Jul 16, 2024
62f9e72
Reorganize arrow-flight test code (#6065)
lewiszlw Jul 16, 2024
4978e32
Sanitize error message for sensitive requests (#6074)
tustvold Jul 17, 2024
94652e5
use GCE metadata server env var overrides (#6015)
barronw Jul 17, 2024
41665ea
Correct timeout in comment from 5s to 30s (#6073)
trungda Jul 17, 2024
b44497e
Prepare for object_store `0.10.2` release (#6079)
alamb Jul 17, 2024
9be0eb5
Minor: Improve parquet PageIndex documentation (#6042)
alamb Jul 17, 2024
8a5be13
Enable casting from Utf8View (#6077)
a10y Jul 19, 2024
16915b5
Add PartialEq to ParquetMetaData and FileMetadata (#6082)
adriangb Jul 19, 2024
ee56940
fix panic in `ParquetMetadata::memory_size`: check has_min_max_set be…
Fischer0522 Jul 20, 2024
5de1d5e
Optimize `max_boolean` by operating on u64 chunks (#6098)
simonvandel Jul 22, 2024
658e58f
add benchmark to track performance (#6101)
XiangpengHao Jul 22, 2024
8aa91e5
Make bool_or an alias for max_boolean (#6100)
simonvandel Jul 23, 2024
93e4eb2
Faster `GenericByteView` construction (#6102)
XiangpengHao Jul 23, 2024
af40ea3
Implement specialized min/max for `GenericBinaryView` (`StringView` a…
XiangpengHao Jul 23, 2024
49e714d
Prepare `52.2.0` release (#6110)
alamb Jul 24, 2024
fa2fbfd
added a flush method to IPC writers (#6108)
V0ldek Jul 25, 2024
3ebb033
Fix Clippy for the Rust 1.80 release (#6116)
alamb Jul 25, 2024
1ff4e21
Fix clippy in object_store crate (#6120)
alamb Jul 25, 2024
613e93e
Merge `53.0.0-dev` dev branch to main (#6126)
alamb Jul 26, 2024
b06ffce
Add support for level histograms added in PARQUET-2261 to `ParquetMet…
etseidl Jul 26, 2024
f42d242
Add ArrowError::ArithmeticError (#6130)
andygrove Jul 26, 2024
e815d06
Implement data_part for intervals (#6071)
nrc Jul 27, 2024
705d341
Remove `SchemaBuilder` dependency from `StructArray` constructors (#6…
Rafferty97 Jul 27, 2024
5f5a82c
Remove automatic buffering in `ipc::reader::FileReader` for for consi…
V0ldek Jul 28, 2024
80ed712
Use `LevelHistogram` in `PageIndex` (#6135)
etseidl Jul 29, 2024
11f2bb8
Fix comparison kernel benchmarks (#6147)
samuelcolvin Jul 29, 2024
bd1e76b
Implement exponential block size growing strategy for `StringViewBuil…
XiangpengHao Jul 29, 2024
0e99e3a
improve LIKE regex (#6145)
samuelcolvin Jul 29, 2024
bf9ce47
Improve `LIKE` performance for "contains" style queries (#6128)
samuelcolvin Jul 29, 2024
bf0ea91
improvements to `(i)starts_with` and `(i)ends_with` performance (#6118)
samuelcolvin Jul 30, 2024
6e893b5
Add `BooleanArray::new_from_packed` and `BooleanArray::new_from_u8` (…
chloro-pn Jul 30, 2024
2905ce6
Update object store MSRV to `1.64` (#6123)
alamb Jul 31, 2024
c14ade2
Upgrade protobuf definitions to flightsql 17.0 (#6133) (#6169)
alamb Aug 1, 2024
bf1a9ec
Add additional documentation and examples to ArrayAccessor (#6141)
alamb Aug 1, 2024
ede5a64
Minor: Update release schedule in README (#6125)
alamb Aug 1, 2024
0c3732f
Optimize `take` kernel for `BinaryViewArray` and `StringViewArray` (#…
a10y Aug 2, 2024
01407f4
Minor: improve comments in temporal.rs tests (#6140)
alamb Aug 2, 2024
df59cdd
Support `StringView` and `BinaryView` in CDataInterface (#6171)
a10y Aug 2, 2024
f708f3e
Make object_store errors non-exhaustive (#6165)
tustvold Aug 2, 2024
e6b7944
Update snafu (#5930) (#6070)
alamb Aug 2, 2024
ee6fb87
Update sysinfo requirement from 0.30.12 to 0.31.2 (#6182)
dependabot[bot] Aug 2, 2024
f2de2cd
No longer write Parquet column metadata after column chunks *and* in …
etseidl Aug 2, 2024
36d567b
add filter benchmark for fsb (#6186)
chloro-pn Aug 3, 2024
e6bd74b
Add support for `StringView` and `BinaryView` statistics in `Statisti…
Kev1n8 Aug 3, 2024
191c9d4
Benchmarks for `bool_and` (#6189)
simonvandel Aug 5, 2024
6133d18
Fix typo in documentation of Float64Array (#6188)
mesejo Aug 6, 2024
7f2d9ac
feat(parquet): Implement AsyncFileWriter for `object_store::buffered:…
Xuanwo Aug 6, 2024
2a4f269
Support Parquet `BYTE_STREAM_SPLIT` for INT32, INT64, and FIXED_LEN_B…
etseidl Aug 6, 2024
63a6209
Reduce bounds check in `RowIter`, add `unsafe Rows::row_unchecked` (#…
XiangpengHao Aug 6, 2024
a235b9b
Update zstd-sys requirement from >=2.0.0, <2.0.13 to >=2.0.0, <2.0.14…
dependabot[bot] Aug 6, 2024
d5ed6b9
Add `ThriftMetadataWriter` for writing Parquet metadata (#6197)
adriangb Aug 6, 2024
db239e5
Add (more) Parquet Metadata Documentation (#6184)
alamb Aug 6, 2024
d7c57d0
fix parquet type is_optional comment (#6192)
jp0317 Aug 7, 2024
49840ec
Remove duplicated statistics tests in parquet (#6190)
Kev1n8 Aug 7, 2024
b90c799
fix: interleave docs suggests itself, not take (#6210)
gstvg Aug 8, 2024
12ff1ea
fix: Correctly handle take on dense union of a single selected type (…
gstvg Aug 8, 2024
7f1bae2
Make it clear that StatisticsConverter can not panic (#6187)
alamb Aug 8, 2024
bd75582
Optimize `min_boolean` and `bool_and` (#6144)
simonvandel Aug 8, 2024
ace1401
Add benchmarks for `BYTE_STREAM_SPLIT` encoded Parquet `FIXED_LEN_BYT…
etseidl Aug 8, 2024
e28cf44
fix(arrow): restrict the range of temporal values produced via `data_…
kyle-mccarthy Aug 8, 2024
4bd737d
Support casting between BinaryView <--> Utf8 and LargeUtf8 (#6180)
xinlifoobar Aug 8, 2024
130ba61
feat(object_store): add `PermissionDenied` variant to top-level erro…
kyle-mccarthy Aug 8, 2024
79ffdc4
update BYTE_STREAM_SPLIT documentation (#6212)
etseidl Aug 8, 2024
3e02689
Add time dictionary coercions (#6208)
adriangb Aug 8, 2024
8a66174
use spaces not tabs everywhere (#6217)
samuelcolvin Aug 9, 2024
5c5a94a
Implement specialized filter kernel for `FixedSizeByteArray` (#6178)
chloro-pn Aug 9, 2024
bb363dc
fix: lexsort_to_indices should not fallback to non-lexical sort if th…
viirya Aug 12, 2024
fe03d39
Prepare for object_store `0.11.0` release (#6227)
alamb Aug 12, 2024
a693f0f
Improve interval parsing (#6211)
samuelcolvin Aug 12, 2024
61c0b7d
Add LICENSE and NOTICE files to object_store (#6234)
alamb Aug 13, 2024
3cd8b76
Update changelog for object_store 0.11.0 release (#6238)
alamb Aug 13, 2024
63d49c8
Minor: Remove non standard footer from LICENSE.txt (#6237)
alamb Aug 13, 2024
5868966
Minor: Improve Type documentation (#6224)
alamb Aug 13, 2024
1238bb1
Add "take" workflow for self-assigning tickets, add "how to find issu…
alamb Aug 13, 2024
468a564
Move `ParquetMetadataWriter` to its own module, update documentation …
alamb Aug 13, 2024
c1b3d98
Modest improvement to FixedLenByteArray BYTE_STREAM_SPLIT arrow decod…
etseidl Aug 13, 2024
9c4a7e3
Improve performance of `FixedLengthBinary` decoding (#6220)
etseidl Aug 13, 2024
43b29b9
minor enhance doc for ParquetField (#6239)
mapleFU Aug 13, 2024
3e5c76f
Remove unnecessary null buffer construction when converting arrays to…
etseidl Aug 14, 2024
4295d37
Add examples to `StringViewBuilder` and `BinaryViewBuilder` (#6240)
alamb Aug 14, 2024
2461a16
Implement PartialEq for GenericBinaryArray (#6241)
alamb Aug 14, 2024
69b17ad
parquet Statistics - deprecate `has_*` APIs and add `_opt` functions …
Michael-J-Ward Aug 15, 2024
0f7116b
Minor: Update DateType::Date64 docs (#6223)
alamb Aug 15, 2024
8d1f0f5
feat(object_store): add support for server-side encryption with custo…
jiachengdb Aug 15, 2024
0130af3
Expose bulk ingest in flight sql client and server (#6201)
djanderson Aug 15, 2024
c835f88
docs: Add parquet_opendal in related projects (#6236)
Xuanwo Aug 15, 2024
042d725
Avoid infinite loop in bad parquet by checking the number of rep leve…
jp0317 Aug 15, 2024
4a3422f
Make the bearer token visible in FlightSqlServiceClient (#6254)
ccciudatu Aug 17, 2024
c6bd492
Add tests for bad parquet files (#6262)
alamb Aug 17, 2024
27789d7
Update parquet object_store dependency to 0.11.0 (#6264)
alamb Aug 17, 2024
d7ad4fe
Implement date_part for durations (#6246)
nrc Aug 19, 2024
25d39c1
feat: further TLS options on ClientOptions: #5034 (#6148)
ByteBaker Aug 19, 2024
663a637
Improve documentation for MutableArrayData (#6272)
alamb Aug 20, 2024
c2d2311
Do not print compression level in schema printer (#6271)
ttencate Aug 20, 2024
e5d9816
Add `Statistics::distinct_count_opt` and deprecate `Statistics::disti…
alamb Aug 20, 2024
7655cca
Fix accessing name from ffi schema (#6273)
kylebarron Aug 20, 2024
344ba1d
ci: use octokit to add assignee (#6267)
dsgibbons Aug 20, 2024
23b6ff9
Only add encryption headers for for SSE-C in get. (#6260)
jiachengdb Aug 20, 2024
0bbad36
Minor: move `FallibleRequestStream` and `FallibleTonicResponseStream`…
alamb Aug 20, 2024
6c59b76
Minor: `pub use ByteView` in arrow and improve documentation (#6275)
alamb Aug 20, 2024
1dae743
ci: simplify octokit add assignee (#6280)
dsgibbons Aug 21, 2024
30db5dc
Update tower requirement from 0.4.13 to 0.5.0 (#6250)
dependabot[bot] Aug 21, 2024
56f6942
Fix panic in comparison_kernel benchmarks (#6284)
alamb Aug 21, 2024
2795b94
fix reference in doctest to size_of which is not imported by default …
rtyler Aug 21, 2024
6dd4a5f
Use `unary()` for array conversion in Parquet array readers, speed up…
etseidl Aug 22, 2024
8c956a9
Support writing UTC adjusted time arrays to parquet (#6278)
aykut-bozkurt Aug 23, 2024
f73dbc3
Minor: improve `RowFilter` and `ArrowPredicate` docs (#6301)
alamb Aug 25, 2024
855666d
Specialize Prefix/Suffix Match for `Like/ILike` between Array and Sca…
xinlifoobar Aug 25, 2024
ee2f75a
Err on `try_from_le_slice` (#6295)
samuelcolvin Aug 26, 2024
b711f23
feat(parquet): add union method to RowSelection (#6308)
sdd Aug 27, 2024
dc8427f
Minor: Improve comments on GenericByteViewArray::bytes_iter(), prefix…
alamb Aug 28, 2024
a937869
Update tonic-build requirement from =0.12.0 to =0.12.2 (#6314)
dependabot[bot] Aug 28, 2024
6785170
docs[object_store]: clarify the backoff strategy that is actually imp…
westonpace Aug 29, 2024
1336973
Pass empty vectors as min/max for all null pages when building Column…
etseidl Aug 31, 2024
69e5e5f
Minor: improve filter documentation (#6317)
alamb Aug 31, 2024
acdd27a
Fix writing of invalid Parquet ColumnIndex when row group contains n…
adriangb Aug 31, 2024
6e50503
Derive PartialEq and Eq for parquet::arrow::ProjectionMask (#6330)
thinkharderdev Aug 31, 2024
0c15191
Support zero column `RecordBatch`es in pyarrow integration (use Recor…
Michael-J-Ward Aug 31, 2024
3a1f67f
parquet_derive: Match fields by name, support reading selected fields…
double-free Aug 31, 2024
831a080
Specialize filter for structs and sparse unions (#6304)
gstvg Aug 31, 2024
774b721
Prepare arrow/parquet `53.0.0` release (#6338)
alamb Aug 31, 2024
ffd216d
Workaround new bug in parquet (#6344)
alamb Aug 31, 2024
97ae9d7
fix: azure sas token visible in logs (#6323)
alexwilcoxson-rel Sep 2, 2024
d4be752
fix: clippy warnings from nightly rust 1.82 (#6348)
waynexia Sep 3, 2024
efe867a
[object_store] Propagate env vars as object store client options (#6334)
ccciudatu Sep 4, 2024
6bf2bda
Remove vestigal conbench integration (#6339)
alamb Sep 5, 2024
e92f287
feat: add catalog/schema subcommands to flight_sql_client. (#6332)
nathanielc Sep 5, 2024
1e63281
Benchmark for bit_mask (set_bits) (#6353)
kazuyukitanimura Sep 5, 2024
9ff7d8b
impl `From<Vec<T>>` for `Buffer` (#6355)
mbrobbel Sep 6, 2024
2d25d65
`object_store::GetOptions` derive `Clone` (#6361)
samuelcolvin Sep 6, 2024
6da4793
object_store/delimited: Fix `TrailingEscape` condition (#6265)
Turbo87 Sep 6, 2024
0491294
Add breaking change from `#6043` to `CHANGELOG` (#6354)
mbrobbel Sep 6, 2024
25e1969
Manually run fmt on all files under parquet (#6328)
etseidl Sep 9, 2024
b368437
Update chrono-tz requirement from 0.9 to 0.10 (#6371)
dependabot[bot] Sep 9, 2024
7a5155c
Add support for Utf8View in arrow_string::length (#6345)
Omega359 Sep 9, 2024
704f90b
Add support for BinaryView in arrow_string::length (#6359)
Omega359 Sep 10, 2024
6fb59d0
Improve `GenericStringBuilder` documentation (#6372)
alamb Sep 11, 2024
f050ff7
Update prost-build requirement from =0.13.1 to =0.13.2 (#6350)
dependabot[bot] Sep 11, 2024
e838e62
add "ARROW_VERSION" const (#6379)
samuelcolvin Sep 11, 2024
60ec869
Support StringViewArray interop with python: fix lingering C Data Int…
a10y Sep 12, 2024
f80bc5f
parquet writer: Raise an error when the row_group_index overflows i16…
progval Sep 13, 2024
889a904
impl `From<ScalarBuffer<T>>` for `Buffer` (#6389)
mbrobbel Sep 13, 2024
ba85fa3
Clear string-tracking hash table when ByteView deduplication is enabl…
shanesveller Sep 13, 2024
b4de692
Improve performance of set_bits by avoiding to set individual bits (#…
kazuyukitanimura Sep 15, 2024
341ec35
stop panic in `MetadataLoader` on invalid data (#6367)
samuelcolvin Sep 16, 2024
3490639
Update lexical-core requirement from 0.8 to 1.0 (to resolve RUSTSEC-2…
dariocurr Sep 17, 2024
aad55d5
Remove "NOT YET FULLY SUPPORTED" comment from DataType::Utf8View/Bina…
alamb Sep 17, 2024
5414f1d
Move lifetime of `take_iter` from iterator to its items (#6403)
dariocurr Sep 17, 2024
d7e8702
Derive `Clone` for `object_store::aws::AmazonS3` (#6414)
ethe Sep 18, 2024
f5a6382
fix: binary_mut should work if only one input array has null buffer (…
viirya Sep 18, 2024
e7598a4
Fix encoding/decoding REE Dicts when using streaming IPC (#6399)
brancz Sep 18, 2024
d274b69
fix: Stop losing precision and scale when casting decimal to dictiona…
andygrove Sep 19, 2024
1390283
Rephrase doc comment (#6421)
waynexia Sep 19, 2024
8ab18fd
fix: don't panic in IPC reader if struct child arrays have different …
alexwilcoxson-rel Sep 20, 2024
669d405
Add `set_bits` fuzz test (#6394)
alamb Sep 20, 2024
4683c20
Reduce integration test matrix (#6407)
kou Sep 20, 2024
0a708e5
chore: add docs, part of #37 (#6424)
ByteBaker Sep 20, 2024
e8b9dad
Add RowSelection::skipped_row_count (#6429)
progval Sep 20, 2024
bc6009f
silence warnings (#6432)
etseidl Sep 21, 2024
d727503
object_score: Support Azure Fabric OAuth Provider (#6382)
RobinLin666 Sep 21, 2024
c90713b
perf: Faster decimal precision overflow checks (#6419)
andygrove Sep 21, 2024
d05cf6d
Implement native support StringViewArray for `regexp_is_match` and `r…
tlm365 Sep 21, 2024
b809021
bump arrow-flight msrv to 1.71.1 (#6437)
gstvg Sep 23, 2024
7191f4d
feat: expose HTTP/2 max frame size in `object_store` (#6442)
crepererum Sep 23, 2024
de6a759
chore: add docs, part of #37 (#6433)
ByteBaker Sep 23, 2024
477b9f0
Fix doc "bit width" to "byte width" (#6434)
kylebarron Sep 23, 2024
4ab97f9
Minor: Add some missing documentation to fix CI errors (#6445)
etseidl Sep 24, 2024
a65e14a
Update prost-build requirement from =0.13.2 to =0.13.3 (#6440)
dependabot[bot] Sep 24, 2024
e67f17e
Add `ParquetMetaDataReader` (#6431)
etseidl Sep 24, 2024
43dd5e4
throw arrow error instead of panic (#6456)
goldmedal Sep 25, 2024
4e2b939
Disable rust<>nanoarrow integration test in CI (#6449)
alamb Sep 25, 2024
922a1ff
Add `union_extract` kernel (#6387)
gstvg Sep 25, 2024
62825b2
Add `IpcSchemaEncoder`, deprecate ipc schema functions, Fix IPC not r…
brancz Sep 25, 2024
6137e91
Add additional documentation and builder APIs to `SortOptions` (#6441)
alamb Sep 25, 2024
d48010e
Workaround for missing Parquet page indexes in `ParquetMetadaReader` …
etseidl Sep 25, 2024
2881dbe
Support cast between Durations + between Durations all numeric types …
tisonkun Sep 26, 2024
50e9e49
Update Cargo.toml (#6459)
ashtuchkin Sep 26, 2024
458fb77
Merge remote-tracking branch 'upstream/master' into sync-from-upstream
fsdvh Sep 26, 2024
55751ee
remove dup
fsdvh Sep 26, 2024
0b0b3ad
empty
fsdvh Sep 26, 2024
8 changes: 7 additions & 1 deletion .asf.yaml
@@ -38,4 +38,10 @@ github:
# require branches to be up-to-date before merging
strict: true
# don't require any jobs to pass
contexts: []
contexts: []

# publishes the content of the `asf-site` branch to
# https://arrow.apache.org/rust/
publish:
whoami: asf-site
subdir: rust
9 changes: 3 additions & 6 deletions .gitattributes
@@ -1,6 +1,3 @@
r/R/RcppExports.R linguist-generated=true
r/R/arrowExports.R linguist-generated=true
r/src/RcppExports.cpp linguist-generated=true
r/src/arrowExports.cpp linguist-generated=true
r/man/*.Rd linguist-generated=true

parquet/src/format.rs linguist-generated
arrow-flight/src/arrow.flight.protocol.rs linguist-generated
arrow-flight/src/sql/arrow.flight.protocol.sql.rs linguist-generated
21 changes: 18 additions & 3 deletions .github/actions/setup-builder/action.yaml
@@ -20,8 +20,12 @@ description: 'Prepare Rust Build Environment'
inputs:
rust-version:
description: 'version of rust to install (e.g. stable)'
required: true
required: false
default: 'stable'
target:
description: 'target architecture(s)'
required: false
default: 'x86_64-unknown-linux-gnu'
runs:
using: "composite"
steps:
@@ -51,6 +55,17 @@ runs:
shell: bash
run: |
echo "Installing ${{ inputs.rust-version }}"
rustup toolchain install ${{ inputs.rust-version }}
rustup toolchain install ${{ inputs.rust-version }} --target ${{ inputs.target }}
rustup default ${{ inputs.rust-version }}
echo "CARGO_TARGET_DIR=/github/home/target" >> $GITHUB_ENV
- name: Disable debuginfo generation
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
shell: bash
run: echo "RUSTFLAGS=-C debuginfo=1" >> $GITHUB_ENV
- name: Enable backtraces
shell: bash
run: echo "RUST_BACKTRACE=1" >> $GITHUB_ENV
- name: Fixup git permissions
# https://github.com/actions/checkout/issues/766
shell: bash
run: git config --global --add safe.directory "$GITHUB_WORKSPACE"
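The environment-related steps in this action all rely on one mechanism: appending `KEY=VALUE` lines to the file named by `$GITHUB_ENV`, which the runner then exposes to later steps. A minimal local sketch of that pattern follows, assuming a temporary file stands in for the runner-provided file (outside GitHub Actions no such file exists, so this only illustrates the file format and does not export anything).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: outside GitHub Actions there is no $GITHUB_ENV,
# so a temp file stands in for the runner-provided one.
GITHUB_ENV="$(mktemp)"

# The same appends the action's steps perform.
echo "CARGO_TARGET_DIR=/github/home/target" >> "$GITHUB_ENV"
echo "RUSTFLAGS=-C debuginfo=1" >> "$GITHUB_ENV"
echo "RUST_BACKTRACE=1" >> "$GITHUB_ENV"

# On a real runner, later steps would see each line as an environment
# variable; here we only print the accumulated file.
cat "$GITHUB_ENV"
```

Because each step appends rather than overwrites, the order of the steps does not matter for these three variables.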
11 changes: 9 additions & 2 deletions .github/dependabot.yml
@@ -6,10 +6,17 @@ updates:
interval: daily
open-pull-requests-limit: 10
target-branch: master
labels: [auto-dependencies]
labels: [ auto-dependencies, arrow ]
- package-ecosystem: cargo
directory: "/object_store"
schedule:
interval: daily
open-pull-requests-limit: 10
target-branch: master
labels: [ auto-dependencies, object_store ]
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
open-pull-requests-limit: 10
labels: [auto-dependencies]
labels: [ auto-dependencies ]
200 changes: 111 additions & 89 deletions .github/workflows/arrow.yml
@@ -18,15 +18,34 @@
# tests for arrow crate
name: arrow

concurrency:
group: ${{ github.repository }}-${{ github.head_ref || github.sha }}-${{ github.workflow }}
cancel-in-progress: true

on:
# always trigger
push:
branches:
- master
pull_request:
paths:
- arrow/**
- .github/**
- arrow-arith/**
- arrow-array/**
- arrow-buffer/**
- arrow-cast/**
- arrow-csv/**
- arrow-data/**
- arrow-integration-test/**
- arrow-ipc/**
- arrow-json/**
- arrow-avro/**
- arrow-ord/**
- arrow-row/**
- arrow-schema/**
- arrow-select/**
- arrow-string/**
- arrow/**

jobs:

@@ -36,24 +55,46 @@ jobs:
runs-on: ubuntu-latest
container:
image: amd64/rust
env:
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
RUSTFLAGS: "-C debuginfo=1"
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
submodules: true
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: stable
- name: Test
run: |
cargo test -p arrow
- name: Test --features=force_validate,prettyprint,ipc_compression,ffi,dyn_cmp_dict
run: |
cargo test -p arrow --features=force_validate,prettyprint,ipc_compression,ffi,dyn_cmp_dict
- name: Test arrow-buffer with all features
run: cargo test -p arrow-buffer --all-features
- name: Test arrow-data with all features
run: cargo test -p arrow-data --all-features
- name: Test arrow-schema with all features
run: cargo test -p arrow-schema --all-features
- name: Test arrow-array with all features
run: cargo test -p arrow-array --all-features
- name: Test arrow-select with all features
run: cargo test -p arrow-select --all-features
- name: Test arrow-cast with all features
run: cargo test -p arrow-cast --all-features
- name: Test arrow-ipc with all features
run: cargo test -p arrow-ipc --all-features
- name: Test arrow-csv with all features
run: cargo test -p arrow-csv --all-features
- name: Test arrow-json with all features
run: cargo test -p arrow-json --all-features
- name: Test arrow-avro with all features
run: cargo test -p arrow-avro --all-features
- name: Test arrow-string with all features
run: cargo test -p arrow-string --all-features
- name: Test arrow-ord with all features
run: cargo test -p arrow-ord --all-features
- name: Test arrow-arith with all features
run: cargo test -p arrow-arith --all-features
- name: Test arrow-row with all features
run: cargo test -p arrow-row --all-features
- name: Test arrow-integration-test with all features
run: cargo test -p arrow-integration-test --all-features
- name: Test arrow with default features
run: cargo test -p arrow
- name: Test arrow with all features except pyarrow
run: cargo test -p arrow --features=force_validate,prettyprint,ipc_compression,ffi,chrono-tz
- name: Run examples
run: |
# Test arrow examples
@@ -64,114 +105,95 @@ jobs:
- name: Run non-archery based integration-tests
run: cargo test -p arrow-integration-testing

# test compilaton features
# test compilation features
linux-features:
name: Check Compilation
runs-on: ubuntu-latest
container:
image: amd64/rust
env:
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
RUSTFLAGS: "-C debuginfo=1"
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
submodules: true
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: stable
- name: Check compilation
run: |
cargo check -p arrow
run: cargo check -p arrow
- name: Check compilation --no-default-features
run: |
cargo check -p arrow --no-default-features
run: cargo check -p arrow --no-default-features
- name: Check compilation --all-targets
run: |
cargo check -p arrow --all-targets
run: cargo check -p arrow --all-targets
- name: Check compilation --no-default-features --all-targets
run: |
cargo check -p arrow --no-default-features --all-targets
run: cargo check -p arrow --no-default-features --all-targets
- name: Check compilation --no-default-features --all-targets --features test_utils
run: |
cargo check -p arrow --no-default-features --all-targets --features test_utils

# test the --features "simd" of the arrow crate. This requires nightly Rust.
linux-test-simd:
name: Test SIMD on AMD64 Rust ${{ matrix.rust }}
runs-on: ubuntu-latest
container:
image: amd64/rust
env:
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
RUSTFLAGS: "-C debuginfo=1"
steps:
- uses: actions/checkout@v3
with:
submodules: true
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: nightly
- name: Run tests --features "simd"
run: |
cargo test -p arrow --features "simd"
- name: Check compilation --features "simd"
run: |
cargo check -p arrow --features simd
- name: Check compilation --features simd --all-targets
run: |
cargo check -p arrow --features simd --all-targets
run: cargo check -p arrow --no-default-features --all-targets --features test_utils
- name: Check compilation --no-default-features --all-targets --features ffi
run: cargo check -p arrow --no-default-features --all-targets --features ffi
- name: Check compilation --no-default-features --all-targets --features chrono-tz
run: cargo check -p arrow --no-default-features --all-targets --features chrono-tz


# test the arrow crate builds against wasm32 in stable rust
# test the arrow crate builds against wasm32 in nightly rust
wasm32-build:
name: Build wasm32
runs-on: ubuntu-latest
container:
image: amd64/rust
env:
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
RUSTFLAGS: "-C debuginfo=1"
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
submodules: true
- name: Cache Cargo
uses: actions/cache@v3
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
path: /github/home/.cargo
key: cargo-wasm32-cache3-
- name: Setup Rust toolchain for WASM
run: |
rustup toolchain install nightly
rustup override set nightly
rustup target add wasm32-unknown-unknown
rustup target add wasm32-wasi
- name: Build
run: |
cd arrow
cargo build --no-default-features --features=json,csv,ipc,simd,ffi --target wasm32-unknown-unknown
cargo build --no-default-features --features=json,csv,ipc,simd,ffi --target wasm32-wasi
target: wasm32-unknown-unknown,wasm32-wasi
- name: Build wasm32-unknown-unknown
run: cargo build -p arrow --no-default-features --features=json,csv,ipc,ffi --target wasm32-unknown-unknown
- name: Build wasm32-wasi
run: cargo build -p arrow --no-default-features --features=json,csv,ipc,ffi --target wasm32-wasi

clippy:
name: Clippy
runs-on: ubuntu-latest
container:
image: amd64/rust
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Setup Rust toolchain
uses: ./.github/actions/setup-builder
with:
rust-version: stable
- name: Setup Clippy
run: |
rustup component add clippy
- name: Run clippy
run: |
cargo clippy -p arrow --features=prettyprint,csv,ipc,test_utils,ffi,ipc_compression,dyn_cmp_dict --all-targets -- -D warnings
run: rustup component add clippy
- name: Clippy arrow-buffer with all features
run: cargo clippy -p arrow-buffer --all-targets --all-features -- -D warnings
- name: Clippy arrow-data with all features
run: cargo clippy -p arrow-data --all-targets --all-features -- -D warnings
- name: Clippy arrow-schema with all features
run: cargo clippy -p arrow-schema --all-targets --all-features -- -D warnings
- name: Clippy arrow-array with all features
run: cargo clippy -p arrow-array --all-targets --all-features -- -D warnings
- name: Clippy arrow-select with all features
run: cargo clippy -p arrow-select --all-targets --all-features -- -D warnings
- name: Clippy arrow-cast with all features
run: cargo clippy -p arrow-cast --all-targets --all-features -- -D warnings
- name: Clippy arrow-ipc with all features
run: cargo clippy -p arrow-ipc --all-targets --all-features -- -D warnings
- name: Clippy arrow-csv with all features
run: cargo clippy -p arrow-csv --all-targets --all-features -- -D warnings
- name: Clippy arrow-json with all features
run: cargo clippy -p arrow-json --all-targets --all-features -- -D warnings
- name: Clippy arrow-avro with all features
run: cargo clippy -p arrow-avro --all-targets --all-features -- -D warnings
- name: Clippy arrow-string with all features
run: cargo clippy -p arrow-string --all-targets --all-features -- -D warnings
- name: Clippy arrow-ord with all features
run: cargo clippy -p arrow-ord --all-targets --all-features -- -D warnings
- name: Clippy arrow-arith with all features
run: cargo clippy -p arrow-arith --all-targets --all-features -- -D warnings
- name: Clippy arrow-row with all features
run: cargo clippy -p arrow-row --all-targets --all-features -- -D warnings
- name: Clippy arrow with all features
run: cargo clippy -p arrow --all-features --all-targets -- -D warnings
- name: Clippy arrow-integration-test with all features
run: cargo clippy -p arrow-integration-test --all-targets --all-features -- -D warnings
- name: Clippy arrow-integration-testing with all features
run: cargo clippy -p arrow-integration-testing --all-targets --all-features -- -D warnings
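The workflow now runs the same `cargo test` and `cargo clippy` invocation once per sub-crate. As a rough sketch of how that list could be reproduced locally, the loop below prints the equivalent commands for the crates that appear in both the test and clippy steps. The helper name `print_ci_commands` is hypothetical, and the script only prints the commands instead of running them, so it needs neither a Rust toolchain nor a checkout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Crate names copied from the workflow's per-crate test steps.
crates=(
  arrow-buffer arrow-data arrow-schema arrow-array arrow-select
  arrow-cast arrow-ipc arrow-csv arrow-json arrow-avro
  arrow-string arrow-ord arrow-arith arrow-row arrow-integration-test
)

# Hypothetical helper: print the commands rather than run them.
print_ci_commands() {
  local crate
  for crate in "${crates[@]}"; do
    echo "cargo test -p ${crate} --all-features"
    echo "cargo clippy -p ${crate} --all-targets --all-features -- -D warnings"
  done
}

print_ci_commands
```

Piping the output to `bash` from the repository root would run the commands in order. The `arrow` crate itself and `arrow-integration-testing` are left out because the workflow treats them differently from the other crates.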