- #1100 Refactor position tracking for multi-line JSON files
- #1102 Bumped gp-common-go-libs to v1.0.16
- #1105 Bumped Spring Framework version to 5.3.33
- #1108 Bumped Tomcat version to 9.0.87
- #1111 Adjusted PXF error handling and determining client disconnects
- #1080 Add support for UUID to JDBC profile
- #1081 Add automation test cases for JDBC profile DATE_WIDE_RANGE feature
- #1093 Bump PostgreSQL JDBC driver to 42.7.2
- #1096 Add formatting for Date values
- #1035 Upgrade snappy-java to 1.1.10.4
- #1040 Bump golang.org/x/net from 0.7.0 to 0.17.0 in /cli
- #1044 Upgrade Go toolchain to 1.21.3
- #1058 Add RHEL9 Support
- #1047 Fix pxf register command
- #1013 Bumped Azure Storage dependency to 5.5.0
- #1018 Add pxf.service.kerberos.ticket-renew-window option to pxf-site.xml
- #1019 Add pushdown of NUMERIC and handling of CHAR and VARCHAR predicates for JDBC profile
- #956 Add pxfdelimited_import formatter to support multibyte delimiters for TEXT and CSV profiles
- #960 Add support for years with more than 4 digits in 'date' or 'timestamp'
- #973 Enable write flow for FDW for non-text/csv formats
- #976 Restrict PXF to listen to local requests only
- #979 Add logging to the LineBreakAccessor for the write
- #983 Bump Spring Boot to 2.7.12
- #984 Enable writing data in JSON format using *:json profiles
- #989 Bump snappy to 1.1.10.1
- #967 FDW: Fix for skipping the dropped columns and correctly counting Projection Index
- #978 Added erroring out logic for decimal overflow for ORC
- #949 Support for fixedwidth formatter with new *:fixedwidth PXF profiles
- #954 Update table options names to not include dash character
- #955 Bump jackson-databind from 2.13.4.1 to 2.13.4.2 in /automation
- #940 Introduced options to handle decimal overflow when writing Parquet files
- #857 Changes for supporting PXF for GP7.
- #900 Added error handling for FDW read flow
- #943 Upgrade gp-common-go-libs to 1.0.11
- #872 Enable automation tests to run against PXF FDW extension
- #876 Parquet write array
- #879 Add new SPLIT_BY_FILE option for json profiles
- #880 Allow setting JDBC session-level parallel instructions for Oracle
- #885 Prevent out-of-bounds buffer access
- #886 Update jackson-databind to 2.13.4.1
- #895 Parquet read array
- #903 Bump postgresql from 42.4.1 to 42.4.3
- #875 Fix projection of boolean qualifiers whose attrs are not in SELECT list
- #879 Refactor JsonRecordReader and fix incorrect parsing of JSON objects for multi-line JSON
- #897 Add close connection if statementRead is null
- #870 Relax the requirement that C string length matches Java string length
- #858 Add JsonProtocolHandler to use HdfsFileFragmenter for multi-line JSON
- #818 Add support for writing ORC primitive types
- #836 Add write support for one-dimensional ORC arrays
- #842 Add support for using a PreparedStatement when reading
- #833 Bump aws-java-sdk-s3 from 1.11.472 to 1.12.261
- #838 Upgrade org.xerial.snappy:snappy-java to 1.1.8.4
- #845 Bump postgresql version from 42.3.3 to 42.4.1
- #781 Fix: Add an error message in case of UnsupportedOperationException
- #789 Upgrade to Spring Boot 2.5.12
- #799 Bump jackson-databind from 2.11.0 to 2.12.6.1
- #814 Add data buffer boundary checks to PXF extension
- #815 Upgrade ORC version to 1.6.13 to get a fix for ORC-1065
- #819 Upgrade Hadoop to 2.10.2
- #823 Add unsupported exception in case of hive write
- #788 Replace prefix macro with environment variable in scriptlets
- #794 Fix NPE in Hive ORC vectorized query execution
- #703 Added Support for Avro Logical Types for Readable External Tables
- #707 Enabled Kerberos Constrained Delegation impersonation for secure clusters
- #752 Add support for GPDB6 on RHEL 8
- #754 Add scripts for modifying PXF extension to support gpupgrade
- #738 Fix: Read records correctly from a multi-line JSON file
- #756 Fixed HiveDataFragmenter not closing connections to Hive Metastore
- #760 Update bundled postgresql to 42.3.3
- #720 Redirect PXF stdout and stderr to files in PXF_LOGDIR
- #735 Bumped Log4j2 version to 2.17.1
- #740 Bump go version to 1.17.6
- #741 Improved performance of iterating over a list of fragments
- #710 Allow skipping the header for *:text:multi profiles
- #719 Add explicit UnsupportedException for Hive transactional tables
- #721 Set default MySQL fetchSize to Integer.MIN_VALUE
- #726 pxf-hive: Catch TTransportException when working with metastore client
- #727 Bump log4j2 version to 2.16.0
- #662 Remove the error-causing check
- #670 Upgrade Spring Boot to 2.5.2, Gradle to 7
- #675 pxf-hdfs: support for reading lists from ORC files
- #688 Introduced operation retries when GSS connection failures are encountered
- #687 Enhanced logging to include fragment info and minor alignment changes
- #689 external-table: add simple cURL debug callback function
- #680 Fix CURLOPT_RESOLVE optimization
- #691 Added dataSource to Fragmenter Cache key, made cache expiration configurable
- #696 Enum and bytea types are now handled properly for external tables using FORMAT "TEXT" or FORMAT "CSV"
- #697 Fixed NullPointerException in GSS failure handling retry logic
- #633 Upgrade go dependencies
- #636 Add support for reading and writing arrays in AVRO
- #638 Report exception class if there's no exception message
- #640 Support reading JSON arrays and objects into Greenplum text columns
- #644 Allow configuring connection timeout for data uploads to PXF
- #624 Deprecated HiveVectorizedORC profile should follow vectorized execution path
- #626 Hive connector should clone SerDe properties per fragment
- #627 Fix NullPointerException for ORC textMapper function when the column is repeating
- #630 Fix the inconsistency between row count in external table and ORC file
- #404 Migrate PXF to Spring Boot
- #491 Remove invalid GemFireXD Profile
- #486 Serialize fragment metadata using kryo instead of json for better optimization
- #457 Convert PXF-CLI to use go modules instead of dep
- #498 Support pushing predicates of type varchar
- #506 Restore FDW build
- #470 Add support for reading ORC without Hive
- #500 Add the InOperatorTransformer TreeVisitor (transform IN operator into chain of ORs)
- #514 Improve logging of read stats (ms instead of ns)
- #512 Encode header values for custom headers, add disable_ppd option for PXF FDW extension, and add pushing predicates of type varchar down for the PXF FDW extension
- #495 Hive profile names now split "protocol" and "format"
- #521 Bump Hadoop version to 2.10.1
- #538 Add createParent option for SequenceFile during PXF write
- #535 Add shortnames and "uncompressed" option for text compression codecs
- #548 Update PXF CLI to support PXF on master
- #546 Add Prometheus metrics endpoint
- #555 Remove fragmenter call from PXF FDW extension
- #557 Log OOM issues to PXF_LOGDIR
- #542 Pass data encoding and database encoding from PXF client to server
- #568 Support different charsets in PXF FDW extension
- #573 Add trace and table headers to the request
- #572 Add custom tags for MVC
- #575 Add charset to Console and RollingFile appender
- #576 Log empty profile message at INFO level
- #569 Bump PXF external-table extension to 2.0
- #574 Add application property for configuring logging level
- #577 Enhance MDC with PXF context
- #579 Report fragments.sent PXF metric
- #571 Add PXF version header to request
- #583 Report records.sent metric
- #586 Report records.received metric
- #595 Add bytes monitoring to PXF
- #604 Log error messages with context
- #519 Update the error message when capacity exceeded in PXF
- #554 Fix different encoding when using LineRecordReader
- #553 Hardcode replicas in fragment
- #549 pxfbridge: Return early when context->current_fragment is NULL
- #526 pxf-hive: properly escape strings in complex data types
- #480 Enable predicate pushdown for Hive profile when accessing Parquet backed tables
- #477 CLI: Add --skip-register flag for pxf [cluster] init
- #474 Optimize hive metadata
- #467 Clarify the pxf.fs.basePath description
- #472 Enable column projection for Parquet files read via Hive profile
- #461 Increase default maximumPoolSize property in jdbc-site.xml
- #469 Specify Hive schema column names and types in HiveAccessor when creating RecordReaders
- #456 Add support for File Storage (Attached to every segment host)
- #453 Remove Configuration from SessionId
- #453 Release UGI if there is an error during the filter execution
- #451 Support dropping columns in PXF writable external tables
- #451 Support dropping columns in PXF readable external tables
- #445 pxf.service.user.name is commented out by default in pxf-site.xml
- #460 Avro: fixing NullPointerException for writing NULL values for SMALLINT and BYTEA columns
- #418 Parquet performance improvements for write
- #433 Parquet Write: Fix physical and logical storage for DATE types
- #435 Upgrade from Tomcat 7.0.100 to 7.0.105
- #439 Add lib/native directory in PXF_CONF
- #392 Add support for Avro BZip2 and XZ Compression Codecs
- #395 Bump com.fasterxml.jackson.core:jackson-* version from 2.9.x to 2.11.0
- #410 Allow skipping the header for *:text profiles
- #421 Deprecate THREAD_SAFE custom option
- #382 Add missing dependency for Hive profile when accessing CSV files
- #415 Hive: Report the correct error message from HiveMetaStoreClientCompatibility1xx
- #416 Fix performance issues when writing wide CSV/TEXT rows
- #383 Avro: support writing SMALLINT to Avro
- #341 Create external-table directory and PXF RPM
- #360 Add additional information regarding the pxf.service.user.name property
- #358 Fix Glob Patterns for Hadoop-Compatible FileSystems
- #336 Add User Option to allow invalid input paths
- #333 Hive: Support column projection on Hive Profiles
- #324 Parquet: Right trim char types during insert
- #348 CVE-2020-10672: Upgrade jackson-databind to version 2.9.10.4
- #343 Upgrade Hive to 2.3.7 and support Java 11 for Hive profiles
- #332 Hive: Add missing transitive dependency when reading parquet files
- #331 pxf-cli: Update gp-common-go-libs to latest
- #315 Bump greenplum-db/gp-common-go-libs to latest
- #308 Fix the "ESCAPE 'OFF' is not processed correctly on PXF side" error
- #306 Fix JAVA_HOME from $PXF_CONF/conf/pxf-env.sh being overridden
- #302 pxf cluster sync: support deletion of extraneous files
- #304 Harden PXF's Tomcat configuration
- #300 Implement pxf cluster restart
- #295 (parquet-refactor) Use the first version of Guava that has a stable Cache API
- #287 Update GCS connector jar in automation/prod
- #286 Parquet record filter
- #305 pxf cluster init: enforce JAVA_HOME is set
- #299 Log org.apache.parquet at WARN level
- #292 Fix compilation with JDK 11.
- #290 Log ClientAbortException at debug level
- #276 Refactor filter parser code (Non-user facing)
- #283 Fix parquet write decimal
- #280 Introduced pxf.session.user property and JDBC conn pool qualifier
- #248 Enable Avro write
- #261 Add Support for impersonation per server
- #257 Add Support for Kerberized Hive 3
- #254 Merge pxf impersonation jdbc
- #247 Add support for multiple Kerberized Hadoop and Hive servers
- #268 Fallback on using LineRecordReader when reading encrypted files
- #253 Improve performance of HdfsFileFragmenter
- #251 Fix regression in *:text:multi profiles when using wildcards
- #243 Upgrade jackson libraries to 2.9.10
- #235 Certify support for Hive 3.1
- #226 Add OR and NOT support for JDBC filter pushdown
- #236 Upgrade Tomcat to version 7.0.96
- #230 Add support for Hive 2 (Up to Hive 2.3.6)
- PXF does not support Hive when running Java 11. As a workaround, run PXF on Java 8.
- #228 Make JDBC profile not fail when MAPR JAR files override default Hadoop ones
- #217 CLI: reset on standby master and don't allow cluster init without PXF_CONF set
- #224 Fix cloud access when Kerberized Hadoop is present
- Enable multinode testing against GCP dataproc. Run automation tests against Hadoop 2.9.2 and Hive 2.3.5
- #211 Preserve error when re-throwing IOException
- #187 Implement S3 Select
- #189 Implement cluster reset command
- #191 Support config option to specify the server configuration directory
- #198 Support serializing a list of OneFields to CSV
- #201 Add JDK11 to PXF docker base dev image
- #202 Support NOT and OR operators for S3 Select
- #203 Support reading and writing of timestamp with time zone for Parquet
- #206 Add S3 Select support for Parquet using S3-SELECT=AUTO
- #207 Enable support for PXF server to run with Java-11
- #212 Use format options for S3 Select
- #213 Rename S3-SELECT option -> S3_SELECT
- #193 Fix uncompressed write parquet
- #170 JDBC: Query data from ranges outside of partition range
- #192 JsonResolver: throw BadRecordException on bad JSON
- #196 Purge codehaus from codebase, and replace it with fasterXML library
- #182 Run a named query that ends with semicolon
- #188 Support for FDW
- #186 Enable JDBC Connection Pooling
- #183 Upgrade postgres driver to version 42.2.5
- #180 Upgrade the Postgres JDBC Driver version
- #176 Upgrade jackson 2 version from 2.9.8 -> 2.9.9
- #171 Enable JDBC connection to Hive and JDBC-specific user impersonation per server
- #178 Make pxf threads configurable
- #172 Support user-specific configuration in server
- #169 HdfsFileFragmenter
- #162 Kill JVM/Tomcat on OutOfMemoryError
- #160 JDBC: Optimize resolver for INSERT queries
- #161 Fix Oracle timestamp wrapping
- #157 JDBC profile can execute a query to read data from an external DB
- #158 Add support to read multiline files as a single row
- #156 JDBC statement properties including fetch size and timeout
- #151 PXF cli: fix regression with version command
- #152 Ensure pxf version can be run before pxf init
- #147 Reverse direction of rsync in pxf sync command
- #144 Remove support for Logical operator NOT with Hive Partition Filtering. NOT is an unsupported logical operator
- #138 Hive partition filtering with support for all Logical Operators
- #134 pxf cluster: stop checking that hostname is master
- #150 Add debug statements for the JDBC connection
- #149 pxf cli: Add cluster status command
- #142 Allow configuration of JDBC transaction isolation. Implements #130
- #145 Added integration test for JDBC session parameters
- #135 Cache Fragmenter calls to improve memory consumption during the fragmenter call
- #136 pxf-cli: Support sync and init on standby master
- #141 pxf-api: Fix BaseConfigurationFactory logging
- #118 PXF-JDBC: Enable external database configuration and connection settings modification. Implements #129
- #133 Add Changelog
- pxf-cli: Use rsync on master host
- #115 PXF no longer expects the path to contain transaction and segment IDs during write. PXF will now construct the write path for Hadoop-Compatible FileSystems to include transaction and segment IDs.
- #119 Remove PXF-Ignite plugin. The Ignite plugin is removed in favor of Ignite's JDBC driver.
- Adds more visibility to external contributors by exposing Pull Request pipelines. It allows external contributors to debug issues when submitting Pull Requests.
- #116 Always use doAs for Kerberos with Hive, add request's hive-site.xml to HiveConf explicitly. Fixes issues with Kerberized Hive, where UGI was not being set.
- #80 Throw IOException when fs.mkdirs() returns false
- Improve Documentation
- #114 Enable file-based configuration for JDBC plugin
- #113 Added PARQUET_VERSION parameter and tests
- #112 Support additional parquet write config options
- #111 Fixed propagation of write exception from the JDBC plugin
- #110 Parquet column projection
- #108 Enabled column projection pushdown for JDBC profile
- #101 Update logging configuration to limit Hadoop INFO logging
- Add descriptive message when JAVA_HOME is not set
- #98 PXF-JDBC: quote column names
- #95 Enable license generation for PXF
- #94 Column projection support changes
- #92 Enhanced unit test for repeated primitive Parquet types
- #91 Create new groups for hive and hbase tests
- #89 Implement optimized version of isTextForm
- #88 Support Parquet repeated primitive types serialized into JSON
- #87 Updated library versions with security issues
- #86 Remove Parquet fragmenter; defer schema read to accessor
- Performance Tests
- #81 Upgrade to hadoop version 2.9.2
- #77 Add MapR Support for HDFS
No changelog for this release.
Changelog needed here.