From 9a9d44169a2b03918a3aee4c1fb43ee1f95e82d8 Mon Sep 17 00:00:00 2001
From: Antoni Stachowski
Date: Fri, 2 Aug 2024 15:45:16 +0200
Subject: [PATCH] Arrow batches documentation (#1190)

---
 doc.go | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/doc.go b/doc.go
index 095004c38..2ad5fc941 100644
--- a/doc.go
+++ b/doc.go
@@ -633,8 +633,15 @@ of the returned value:
 
 # Arrow batches
 
-You can retrieve data in a columnar format similar to the format a server returns.
-You must use `WithArrowBatches` context, similar to the following:
+You can retrieve data in a columnar format similar to the format a server returns, without transposing it to rows.
+When working with the arrow columnar format in the Go driver, ArrowBatch structs are used. These are structs
+mostly corresponding to data chunks received from the backend. They allow for access to specific arrow.Record structs.
+
+An ArrowBatch can exist in a state where the underlying data has not yet been loaded. The data is downloaded and
+translated only on demand. Translation options are retrieved from a context.Context interface, which is either
+passed from query context or set by the user using WithContext(ctx) method.
+
+In order to access them you must use `WithArrowBatches` context, similar to the following:
 
 	var rows driver.Rows
 	err = conn.Raw(func(x interface{}) error {
@@ -648,6 +655,31 @@ You must use `WithArrowBatches` context, similar to the following:
 
 	... // use Arrow records
 
+This returns []*ArrowBatch.
+
+ArrowBatch functions:
+
+GetRowCount():
+Returns the number of rows in the ArrowBatch. Note that this returns 0 if the data has not yet been loaded,
+irrespective of its actual size.
+
+WithContext(ctx context.Context):
+Sets the context of the ArrowBatch to the one provided. Note that the context will not retroactively apply to data
+that has already been downloaded. For example:
+
+	records1, _ := batch.Fetch()
+	records2, _ := batch.WithContext(ctx).Fetch()
+
+will produce the same result in records1 and records2, irrespective of the newly provided ctx. Contexts worth noting are:
+-WithArrowBatchesTimestampOption
+-WithHigherPrecision
+-WithArrowBatchesUtf8Validation
+described in more detail later.
+
+Fetch():
+Returns the underlying records as *[]arrow.Record. When this function is called, the ArrowBatch checks whether
+the underlying data has already been loaded, and downloads it if not.
+
 Limitations:
 
 1. For some queries Snowflake may decide to return data in JSON format (examples: `SHOW PARAMETERS` or `ls @stage`). You cannot use JSON with Arrow batches context.