Hosting a database cluster in the cloud with Aurora gives users a set of features and configurations for maximizing performance and availability, such as database failover. However, most existing drivers either do not support these features or cannot take full advantage of them.
The main idea behind the AWS Advanced Python Driver is to add a software layer on top of an existing Python driver that enables the enhancements brought by Aurora, without requiring users to change how they work with their databases and existing Python drivers.
In an Amazon Aurora database cluster, failover is a mechanism by which Aurora automatically repairs the cluster status when a primary DB instance becomes unavailable. It achieves this goal by electing an Aurora Replica to become the new primary DB instance, so that the DB cluster can provide maximum availability to a primary read-write DB instance. The AWS Advanced Python Driver is designed to understand the situation and coordinate with the cluster in order to provide minimal downtime and allow connections to be very quickly restored in the event of a DB instance failure.
Although Aurora is able to provide maximum availability through the use of failover, existing client drivers do not currently support this functionality. This is partially due to the time required for the DNS of the new primary DB instance to be fully resolved in order to properly direct the connection. The AWS Advanced Python Driver allows customers to continue using their existing community drivers in addition to having the AWS Advanced Python Driver fully exploit failover behavior by maintaining a cache of the Aurora cluster topology and each DB instance's role (Aurora Replica or primary DB instance). This topology is provided via a direct query to the Aurora DB, essentially providing a shortcut to bypass the delays caused by DNS resolution. With this knowledge, the AWS Advanced Python Driver can more closely monitor the Aurora DB cluster status so that a connection to the new primary DB instance can be established as fast as possible.
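The topology cache described above can be pictured with a minimal sketch. This is not the driver's implementation: `HostInfo`, `TopologyCache`, `fetch_topology`, and `refresh_ms` are hypothetical names chosen for illustration, and the real driver obtains the topology by querying the Aurora database rather than through an injected callable.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class HostInfo:
    host: str
    is_writer: bool  # True for the primary DB instance, False for an Aurora Replica


class TopologyCache:
    """Hypothetical cache: keeps the last known topology and refreshes it when stale."""

    def __init__(self,
                 fetch_topology: Callable[[], List[HostInfo]],
                 refresh_ms: int = 30_000,
                 clock: Callable[[], float] = time.monotonic):
        # fetch_topology is an assumption standing in for a direct query to the database.
        self._fetch_topology = fetch_topology
        self._refresh_sec = refresh_ms / 1000.0
        self._clock = clock
        self._hosts: List[HostInfo] = []
        self._fetched_at: Optional[float] = None

    def get_hosts(self) -> List[HostInfo]:
        now = self._clock()
        # Re-query only on first use or once the cached entry is older than the refresh interval.
        if self._fetched_at is None or now - self._fetched_at >= self._refresh_sec:
            self._hosts = self._fetch_topology()
            self._fetched_at = now
        return self._hosts

    def get_writer(self) -> Optional[HostInfo]:
        # The writer is looked up from the cached roles, with no DNS resolution involved.
        return next((h for h in self.get_hosts() if h.is_writer), None)
```

Because the role of each instance is cached, finding the new primary after a failover only requires a topology refresh rather than waiting for DNS to propagate.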
Since a database failover is usually detected only when a network or connection timeout is reached, the AWS Advanced Python Driver introduces an enhanced, customizable way to identify a database outage more quickly.
Enhanced Failure Monitoring (EFM) is a feature available from the Host Monitoring Connection Plugin that periodically checks the connected database host's health and availability. If a database host is determined to be unhealthy, the connection is aborted (and potentially routed to another healthy host in the cluster).
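As a rough illustration of periodic health checking, the sketch below marks a host unhealthy after a number of consecutive failed probes. It is a simplified assumption about how such monitoring could work, not the plugin's actual logic: `HostHealthMonitor` and `probe` are hypothetical, and the argument name `failure_detection_count` is borrowed from the driver's parameter list only to suggest the idea.

```python
from typing import Callable


class HostHealthMonitor:
    """Hypothetical monitor: a host is unhealthy after N consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], failure_detection_count: int = 3):
        # probe is an assumption standing in for a lightweight check against the host.
        self._probe = probe
        self._failure_detection_count = failure_detection_count
        self._consecutive_failures = 0

    def check_once(self) -> bool:
        """Run one probe and return True while the host is still considered healthy."""
        if self._probe():
            # Any successful probe resets the failure streak.
            self._consecutive_failures = 0
        else:
            self._consecutive_failures += 1
        return self._consecutive_failures < self._failure_detection_count
```

In a real monitor, `check_once` would be called on a timer from a background thread, and an unhealthy result would trigger aborting the connection.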
The AWS Advanced Python Driver also works with RDS provided databases that are not Aurora.
Please visit this page for more information.
To start using the driver with Psycopg, pass Psycopg's connect function to the `AwsWrapperConnection#connect` method as shown in the following example:

```python
from aws_advanced_python_wrapper import AwsWrapperConnection
from psycopg import Connection

# Wrap Psycopg's connect function and enable the failover plugin.
with AwsWrapperConnection.connect(
        Connection.connect,
        "host=database.cluster-xyz.us-east-1.rds.amazonaws.com dbname=db user=john password=pwd",
        plugins="failover",
        wrapper_dialect="aurora-pg",
        autocommit=True
) as awsconn:
    awscursor = awsconn.cursor()
    awscursor.execute("SELECT aurora_db_instance_identifier()")
    # Iterate over the result rows; an earlier fetchone() call would consume
    # the only row and leave nothing to print.
    for record in awscursor:
        print(record)
```
The `AwsWrapperConnection#connect` method accepts the connection configuration through both the connection string and the keyword arguments.
You can pass the connection configuration entirely through the connection string, entirely through the keyword arguments, or through a mixture of both.
To use the driver with MySQL Connector/Python, see the following example:
```python
from aws_advanced_python_wrapper import AwsWrapperConnection
from mysql.connector import Connect

# Wrap MySQL Connector/Python's Connect function and enable the failover plugin.
with AwsWrapperConnection.connect(
        Connect,
        "host=database.cluster-xyz.us-east-1.rds.amazonaws.com database=db user=john password=pwd",
        plugins="failover",
        wrapper_dialect="aurora-mysql",
        autocommit=True
) as awsconn:
    awscursor = awsconn.cursor()
    awscursor.execute("SELECT @@aurora_server_id")
    # Iterate over the result rows; an earlier fetchone() call would consume
    # the only row and leave nothing to print.
    for record in awscursor:
        print(record)
```
For more details on how to download the AWS Advanced Python Driver, minimum requirements to use it, and how to integrate it within your project and with your Python driver of choice, please visit the Getting Started page.
The following table lists the connection properties used with the AWS Advanced Python Wrapper.
Parameter | Documentation Link |
---|---|
`auxiliary_query_timeout_sec` | Driver Parameters |
`topology_refresh_ms` | Driver Parameters |
`cluster_id` | Driver Parameters |
`cluster_instance_host_pattern` | Driver Parameters |
`wrapper_dialect` | Dialects, and whether you should include it. |
`wrapper_driver_dialect` | Driver Dialect, and whether you should include it. |
`plugins` | Connection Plugin Manager |
`auto_sort_wrapper_plugin_order` | Connection Plugin Manager |
`profile_name` | Connection Plugin Manager |
`connect_timeout` | Network Timeouts |
`socket_timeout` | Network Timeouts |
`tcp_keepalive` | Network Timeouts |
`tcp_keepalive_time` | Network Timeouts |
`tcp_keepalive_interval` | Network Timeouts |
`tcp_keepalive_probes` | Network Timeouts |
`enable_failover` | Failover Plugin |
`failover_mode` | Failover Plugin |
`cluster_instance_host_pattern` | Failover Plugin |
`failover_cluster_topology_refresh_rate_sec` | Failover Plugin |
`failover_reader_connect_timeout_sec` | Failover Plugin |
`failover_timeout_sec` | Failover Plugin |
`failover_writer_reconnect_interval_sec` | Failover Plugin |
`failure_detection_count` | Host Monitoring Plugin |
`failure_detection_enabled` | Host Monitoring Plugin |
`failure_detection_interval_ms` | Host Monitoring Plugin |
`failure_detection_time_ms` | Host Monitoring Plugin |
`monitor_disposal_time_ms` | Host Monitoring Plugin |
`iam_default_port` | IAM Authentication Plugin |
`iam_host` | IAM Authentication Plugin |
`iam_region` | IAM Authentication Plugin |
`iam_expiration` | IAM Authentication Plugin |
`secrets_manager_secret_id` | Secrets Manager Plugin |
`secrets_manager_region` | Secrets Manager Plugin |
`secrets_manager_endpoint` | Secrets Manager Plugin |
`reader_host_selector_strategy` | Connection Strategy |
`db_user` | Federated Authentication Plugin |
`idp_username` | Federated Authentication Plugin |
`idp_password` | Federated Authentication Plugin |
`idp_endpoint` | Federated Authentication Plugin |
`iam_role_arn` | Federated Authentication Plugin |
`iam_idp_arn` | Federated Authentication Plugin |
`iam_region` | Federated Authentication Plugin |
`idp_name` | Federated Authentication Plugin |
`idp_port` | Federated Authentication Plugin |
`rp_identifier` | Federated Authentication Plugin |
`iam_host` | Federated Authentication Plugin |
`iam_default_port` | Federated Authentication Plugin |
`iam_token_expiration` | Federated Authentication Plugin |
`http_request_connect_timeout` | Federated Authentication Plugin |
`ssl_secure` | Federated Authentication Plugin |
Technical documentation regarding the functionality of the AWS Advanced Python Driver will be maintained in this GitHub repository. Since the AWS Advanced Python Driver requires an underlying Python driver, please refer to the individual driver's documentation for driver-specific information. To find all the documentation and concrete examples on how to use the AWS Advanced Python Driver, please refer to the AWS Advanced Python Driver Documentation page.
This driver currently does not support switchover in Amazon RDS Blue/Green Deployments. In order to execute a Blue/Green deployment with the driver, please ensure your application is coded to retry the database connection. Retry will allow the driver to re-establish a connection to an available database instance. Without a retry, the driver will not be able to identify an available database instance after blue/green switchover has occurred.
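One way to code the retry mentioned above is a small helper that re-invokes a connect callable until it succeeds or the attempts run out. This is a minimal sketch under stated assumptions: `connect_with_retry` and all of its parameters are hypothetical and are not part of the driver. The `connect` callable would be whatever your application already uses to open a connection, for example a lambda wrapping `AwsWrapperConnection.connect(...)`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T],
                       max_attempts: int = 5,
                       delay_sec: float = 1.0,
                       retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Hypothetical helper: call connect() until it succeeds or attempts are exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except retry_on:
            # Re-raise the last error once the retry budget is spent.
            if attempt == max_attempts:
                raise
            sleep(delay_sec)
    raise AssertionError("unreachable")
```

In practice you would narrow `retry_on` to the connection errors raised by your underlying driver and pick a delay that suits your switchover window.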
When connecting to Aurora MySQL clusters, it is recommended to use the Python implementation of the MySQL Connector/Python driver by setting the `use_pure` connection argument to `True`.
The AWS Advanced Python Driver internally calls the MySQL Connector/Python's `is_connected` method to verify the connection. The MySQL Connector/Python's C extension uses a network blocking implementation of the `is_connected` method.
In the event of a network failure where the host can no longer be reached, the `is_connected` call may hang indefinitely and will require users to forcibly interrupt the application.
The official MySQL Connector/Python offers a Python implementation and a C implementation of the driver that can be toggled using the `use_pure` connection argument.
The IAM Authentication Plugin is incompatible with the Python implementation of the driver due to its 255-character password limit.
The IAM Authentication Plugin generates a temporary AWS IAM token to authenticate users. Passing this token to the Python implementation of the driver will result in error messages similar to the following:
Error occurred while opening a connection: int1store requires 0 <= i <= 255
or
struct.error: ubyte format requires 0 <= number <= 255
To avoid this error, we recommend you set `use_pure` to `False` when using the IAM Authentication Plugin.
However, as noted in the MySQL Connector/Python C Extension section, doing so may cause the application to hang indefinitely if there is a network failure.
Unfortunately, due to conflicting limitations, you will need to decide if using the IAM plugin is worth this risk for your application.
In the AWS Advanced Python Driver test suite, we have two methods of simulating database/network failure:
- method 1: initiate failover using the AWS RDS SDK.
- method 2: use a test dependency called Toxiproxy.
Toxiproxy creates proxy network containers that sit between the test and the actual database. The test code connects to the proxy containers instead of the database. Toxiproxy can then be used to disable network activity between the driver and the database via the proxy. This network failure simulation is stricter than method 1. It does not allow any communication between the driver and the database, and does not give the database a chance to break off any connections.
We have observed different behavior when testing with method 1 vs method 2. With method 1, the server failover is detected without unexpected side effects. With method 2, we have observed some side effects. We have only observed these side effects when using Toxiproxy, and have never observed them in a real-world scenario.
Psycopg and MySQL Connector/Python do not provide client-side query timeouts. When querying a host that has been disabled via Toxiproxy, the query will hang indefinitely until the host is re-enabled. Whether this behavior would ever occur in a real world scenario is uncertain. We reached out to the Psycopg team, who indicated they have not seen this issue (see full discussion here). They also believe that Toxiproxy tests for stricter conditions than would occur in a real-world scenario. However, we will list the side effects we have noticed during testing due to this behavior:
1. The EFM plugin and `socket_timeout` connection parameter use helper threads to execute queries used to detect if the network is working properly. These helper threads are executed with a timeout so that failure can be detected if the network is down. If the host was disabled using Toxiproxy, although the timeout will return control to the main thread, the helper thread will be stuck in a loop waiting on results from the server. It will be stuck until the host is re-enabled with Toxiproxy. There is no mechanism to cancel a running thread from another thread in Python, so the thread will consume resources until the host is re-enabled. Using an Event to signal the thread to stop is not an option, as the loop occurs inside the underlying driver code. Note that although the helper thread will be stuck, control will still be returned to the main thread so that the driver is still usable.
2. As a consequence of side effect 1, if a query is executed against a host that has been disabled with Toxiproxy, the Python program will not exit if the host is not re-enabled. The EFM helper threads mentioned in side effect 1 are run using a ThreadPoolExecutor. Although the ThreadPoolExecutor implementation uses daemon threads, it also joins all threads at Python exit. Because the helper thread is stuck in this scenario, the Python application will hang waiting for the thread to join. This behavior has only been observed when using the MySQL Connector/Python driver.
3. As a consequence of side effect 1, if a query is executed against a host that has been disabled with Toxiproxy, and the host is still disabled when the Python program exits, a segfault may occur. This occurs because the helper thread is stuck in a loop attempting to read a connection pointer. When the program is exiting, the pointer is destroyed. The helper thread may try to read from the pointer after it is destroyed, leading to a segfault. This behavior has only been observed when using the Psycopg driver.
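The helper-thread behavior described in these side effects can be reproduced with the standard library alone. The sketch below is an illustration, not the driver's code: `blocked_query` and `host_reenabled` are hypothetical stand-ins for a driver call stuck waiting on the server and for re-enabling the host in Toxiproxy.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError

# Hypothetical stand-in for "the host has been re-enabled in Toxiproxy".
host_reenabled = threading.Event()


def blocked_query() -> str:
    # Stand-in for a driver call that blocks while waiting on the server.
    host_reenabled.wait()
    return "ok"


executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(blocked_query)

try:
    future.result(timeout=0.2)
    timed_out = False
except FutureTimeoutError:
    # The timeout hands control back to the calling thread...
    timed_out = True

# ...but the worker thread is still blocked inside blocked_query().
still_running = future.running()
# A future that is already running cannot be cancelled.
cancel_succeeded = future.cancel()

# Unblock the worker so this example can exit cleanly. If the worker stayed
# blocked, interpreter shutdown would wait on the stuck thread.
host_reenabled.set()
executor.shutdown(wait=True)
```

Here the example controls the blocking call and can release it with an `Event`. In the scenario described above the blocking loop lives inside the underlying driver, so no such signal is available.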
If you encounter a bug with the AWS Advanced Python Driver, we would like to hear about it. Please search the existing issues to see if others are also experiencing the issue before reporting the problem in a new issue. GitHub issues are intended for bug reports and feature requests.
When opening a new issue, please fill in all required fields in the issue template to help expedite the investigation process.
For all other questions, please use GitHub discussions.
- Set up your environment by following the directions in the Development Guide.
- To contribute, first make a fork of this project.
- Make any changes on your fork. Make sure you are aware of the requirements for the project (e.g. do not require Python 3.7 if we are supporting Python 3.8 and higher).
- Create a pull request from your fork.
- Pull requests need to be approved and merged by maintainers into the main branch.
Note: Before making a pull request, run all tests and verify everything is passing.
The project source code is written using the PEP 8 Style Guide, and the style is strictly enforced in our automation pipelines. Any contribution that does not respect/satisfy the style will automatically fail at build time.
The `aws-advanced-python-wrapper` has a regular monthly release cadence. A new release will occur during the last week of each month. However, if there are no changes since the latest release, then a release will not occur.
The `aws-advanced-python-wrapper` is tested against the following Community and Aurora database versions in our test suite:
Database | Versions |
---|---|
MySQL | 8.4.0 |
PostgreSQL | 16.2 |
Aurora MySQL | - LTS version, see here for more details. - Latest release, as shown on this page. |
Aurora PostgreSQL | - LTS version, see here for more details. - Latest release, as shown on this page. |
The `aws-advanced-python-wrapper` is compatible with MySQL 5.7 and MySQL 8.0 as per MySQL Connector/Python.
Warning: Due to recent internal changes to connection handling in the `v9.0.0` MySQL Connector/Python driver, the AWS Advanced Python Wrapper is not recommended for use with `v9.0.0`. The AWS Advanced Python Wrapper will be updated in the future for compatibility with `v9.0.0` of the community driver.
This software is released under the Apache 2.0 license.