Implementation Status

The following tables summarize the features available in the various official Arrow libraries. All libraries currently follow version 1.0.0 of the Arrow format, or later minor versions that are compatible with version 1.0.0. See :doc:`./format/Versioning` for details about versioning. Unless otherwise stated, the Python, R, Ruby and C/GLib libraries follow the C++ Arrow library.

Data Types

Data type (primitive) C++ Java Go JS C# Rust Julia Swift nanoarrow
Null  
Boolean
Int8/16/32/64
UInt8/16/32/64
Float16 ✓ (1) ✓ (2)  
Float32/64
Decimal32            
Decimal64            
Decimal128  
Decimal256  
Date32/64
Time32/64
Timestamp  
Duration  
Interval    
Fixed Size Binary  
Binary
Large Binary (4)  
Utf8
Large Utf8 (4)  
Binary View          
Large Binary View              
Utf8 View          
Large Utf8 View              
Data type (nested) C++ Java Go JS C# Rust Julia Swift nanoarrow
Fixed Size List  
List  
Large List   (4)  
List View            
Large List View              
Struct  
Map  
Dense Union  
Sparse Union  
Data type (special) C++ Java Go JS C# Rust Julia Swift nanoarrow
Dictionary ✓ (3) ✓ (3)  
Extension      
Run-End Encoded              
Canonical Extension types C++ Java Go JavaScript C# Rust Julia Swift
Fixed shape tensor              
Variable shape tensor                
JSON            
Opaque          
UUID            
8-bit Boolean            

Notes:

  • (1) Casting to/from Float16 in Java is not supported.
  • (2) Float16 support in C# is only available when targeting .NET 6+.
  • (3) Nested dictionaries are not supported.
  • (4) C# large array types are provided to help with interoperability with other libraries, but they do not support buffers larger than 2 GiB; an exception is raised when importing an array that is too large.

.. seealso::
   The :ref:`format_columnar` and the
   :ref:`format_canonical_extensions` specifications.


IPC Format

IPC Feature C++ Java Go JS C# Rust Julia Swift nanoarrow
Arrow stream format ✓ (4)
Arrow file format  
Record batches
Dictionaries    
Replacement dictionaries          
Delta dictionaries ✓ (1)   ✓ (1)      
Tensors                
Sparse tensors                
Buffer compression ✓ (3)      
Endianness conversion ✓ (2)   ✓ (2)           ✓ (2)
Custom schema metadata  

Notes:

  • (1) Delta dictionaries are not supported on nested dictionaries.
  • (2) Data with non-native endianness can be byte-swapped automatically when reading.
  • (3) The LZ4 codec is currently quite inefficient. ARROW-11901 tracks improving its performance.
  • (4) The nanoarrow IPC implementation supports only reading IPC streams.

.. seealso::
   The :ref:`format-ipc` specification.
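
The byte-swapping behaviour described in note (2) can be illustrated with a small sketch. This is not the reader code of any Arrow library; it is a standard-library Python illustration that assumes a buffer of 32-bit signed integers, and the helper name ``read_int32_buffer`` and its ``producer_byteorder`` parameter are hypothetical.

```python
import sys
from array import array


def read_int32_buffer(raw: bytes, producer_byteorder: str) -> list:
    """Decode a buffer of 32-bit signed integers.

    `producer_byteorder` is "little" or "big" (hypothetical parameter
    standing in for the endianness a real reader learns from the data).
    """
    values = array("i")
    if values.itemsize != 4:
        # array("i") is 4 bytes wide on mainstream platforms; bail out otherwise.
        raise RuntimeError("this sketch assumes a 4-byte C int")
    # Interpret the bytes using the host's native byte order first.
    values.frombytes(raw)
    # Swap in place only when the producer's byte order differs from the host's.
    if producer_byteorder != sys.byteorder:
        values.byteswap()
    return values.tolist()


# Build the same three values as a big-endian producer would write them.
big_endian_raw = b"".join(
    v.to_bytes(4, "big", signed=True) for v in (1, 2, -7)
)
print(read_int32_buffer(big_endian_raw, "big"))
```

The swap is applied only when the two byte orders differ, so data that already matches the host is read without an extra pass over the buffer.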

Flight RPC

Flight RPC Transport C++ Java Go JS C# Rust Julia Swift
gRPC transport (grpc:, grpc+tcp:)      
gRPC domain socket transport (grpc+unix:)      
gRPC + TLS transport (grpc+tls:)      
UCX transport (ucx:) (1)              
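
The URI schemes listed in the table above identify which transport a Flight location uses. As a minimal sketch, the mapping can be expressed with Python's standard ``urllib.parse``; the ``TRANSPORTS`` table and the ``transport_for`` helper are hypothetical names introduced for illustration and are not part of any Arrow library.

```python
from urllib.parse import urlparse

# Scheme-to-transport mapping copied from the table above (illustrative only).
TRANSPORTS = {
    "grpc": "gRPC transport",
    "grpc+tcp": "gRPC transport",
    "grpc+unix": "gRPC domain socket transport",
    "grpc+tls": "gRPC + TLS transport",
    "ucx": "UCX transport",
}


def transport_for(location: str) -> str:
    """Return the transport name for a Flight location URI."""
    scheme = urlparse(location).scheme
    if scheme not in TRANSPORTS:
        raise ValueError(f"unrecognized Flight URI scheme: {scheme!r}")
    return TRANSPORTS[scheme]


for uri in (
    "grpc://localhost:8815",
    "grpc+tls://example.com:443",
    "grpc+unix:///tmp/flight.sock",
):
    print(uri, "->", transport_for(uri))
```

The host names, port, and socket path in the usage loop are placeholders. Whether a given library accepts a scheme still depends on the support shown in the table.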

Supported features in the gRPC transport:

Flight RPC Feature C++ Java Go JS C# Rust Julia Swift
All RPC methods      
Authentication handlers   ✓ (2)    
Call timeouts        
Call cancellation        
Concurrent client calls (3)      
Custom middleware        
RPC error codes      

Supported features in the UCX transport:

Flight RPC Feature C++ Java Go JS C# Rust Julia Swift
All RPC methods ✓ (4)              
Authentication handlers                
Call timeouts                
Call cancellation                
Concurrent client calls ✓ (5)              
Custom middleware                
RPC error codes              

Notes:

  • (1) The Flight UCX transport was deprecated in the 19.0.0 release.
  • (2) Supported through AspNetCore authentication handlers.
  • (3) Whether a single client can support multiple concurrent calls.
  • (4) Only DoExchange, DoGet, DoPut, and GetFlightInfo are supported.
  • (5) Each concurrent call is a separate connection to the server (unlike gRPC, where concurrent calls are multiplexed over a single connection). This generally provides better throughput but consumes more resources on both the server and the client.

.. seealso::
   The :ref:`flight-rpc` specification.

Flight SQL

.. note::
   Flight SQL is still experimental.

The feature support listed here refers to the client/server libraries only; each database that implements the Flight SQL protocol may or may not support individual features.

Feature C++ Java Go JS C# Rust Julia Swift
BeginSavepoint            
BeginTransaction            
CancelQuery            
ClosePreparedStatement      
CreatePreparedStatement      
CreatePreparedSubstraitPlan            
EndSavepoint            
EndTransaction            
GetCatalogs      
GetCrossReference      
GetDbSchemas      
GetExportedKeys      
GetImportedKeys      
GetPrimaryKeys      
GetSqlInfo      
GetTables      
GetTableTypes      
GetXdbcTypeInfo      
PreparedStatementQuery      
PreparedStatementUpdate      
StatementSubstraitPlan            
StatementQuery      
StatementUpdate      

.. seealso::
   The :doc:`./format/FlightSql` specification.

C Data Interface

Feature C++ Python R Rust Go Java C/GLib Ruby Julia C# Swift nanoarrow
Schema export    
Array export    
Schema import    
Array import    

.. seealso::
   The :ref:`C Data Interface <c-data-interface>` specification.


C Stream Interface

Feature C++ Python R Rust Go Java C/GLib Ruby Julia C# Swift nanoarrow
Stream export      
Stream import      

.. seealso::
   The :ref:`C Stream Interface <c-stream-interface>` specification.


Third-Party Data Formats

Format C++ Java Go JS C# Rust Julia Swift
Avro   R R          
CSV R/W R (2) R/W     R/W R/W  
ORC R/W R (1)            
Parquet R/W R (2) R/W     R/W    

Notes:

  • R = Read supported
  • W = Write supported
  • (1) Through JNI bindings (provided by org.apache.arrow.orc:arrow-orc).
  • (2) Through JNI bindings to Arrow C++ Datasets (provided by org.apache.arrow:arrow-dataset).