Using MySQL Connector/Python for connecting to MySQL DB

Using MySQL Connector/Python

In the previous blog, you installed MySQL Connector/Python and made sure that the module worked. However, printing the version string for the connector is hardly very exciting, so this blog will begin the journey through the features of the two legacy APIs.



The mysql.connector module includes the implementation of the Python Database API, which is defined in PEP249. This includes the option to use the C Extension while using the same API. 

This article goes through the ins and outs of creating and configuring connections to MySQL. Creating a connection is simple and is the first thing you will learn. There is a little more to the connection than just creating it, though. The rest of the blog will discuss how to configure the connection, including tips to avoid hardcoding the username and password into the application. The article finishes with a discussion of other connection-related options, with a particular focus on the character set.

Creating the Connection from Python

It has taken some work to get to this point, but now you are ready to connect to MySQL from Python for the first time. This section will go through the syntax of creating the connection, the most common connection options, examples of creating a connection, reconfiguring a connection, and some best practices for connections.

Syntax

There are several ways to create the connection. Four of them are

  • The mysql.connector.connect() function: This is the most flexible connection method. It provides a uniform way of creating connections using the C Extension or enabling the connection pooling and the failover-related options. This function works as a wrapper that returns an object of the appropriate class depending on the settings.
  • The MySQLConnection() constructor
  • The MySQLConnection.connect() method: It requires first instantiating the MySQLConnection class without arguments and then creating the connection.
  • The same as before using the MySQLConnection.connect() method, but with the difference that the MySQLConnection.config() method is called explicitly to configure the connection.

The MySQLConnection class is the pure Python implementation. Alternatively, the CMySQLConnection class can be used, which provides implementation of the C Extension backend to the Python Database API.

All of the methods end up with the same connection object, and they all take the connections options as keyword arguments. This means that you can choose whatever way to create the connection that works best for the program. However, since the mysql.connector.connect() function is the most powerful, it is the preferred way to connect because it makes it easier to switch between the pure Python and C Extension implementations or to enable connection pooling or failover.

Tip

Creating a connection using the mysql.connector.connect() function gives access to all connection-related features.

Figure 1 shows the basic flow of using the four ways to create a connection. The red (dark grey) boxes are called directly from the application code, and the yellow (light grey) boxes are called by the last method called indirectly. The figure uses the MySQLConnection class; however, the same applies if the CMySQLConnection class is used.

 The flow of creating a connection MySQL

 Figure 1 The flow of creating a connection

The leftmost route is the one using the mysql.connector.connect() function. The Python program calls the function with the connection arguments, and the function then handles the rest. The figure assumes a MySQLConnection connection (using the pure Python implementation) is created, but the function can also return a CMySQLConnection object if the C Extension is used. The basic syntax for the mysql.connector.connect() function is

db = mysql.connector.connect(**kwargs)

The route second from the left has the Python program send the connections arguments to the constructor when instantiating the MySQLConnection class. This triggers the constructor to call the connect() method, which in turn calls the config() method. The syntax when using the MySQLConnection class is

db = mysql.connector.MySQLConnection(**kwargs)

In the third route from the left, the MySQLConnection class is first instantiated and then the connect() method is explicitly invoked. The code syntax becomes

db = mysql.connector.MySQLConnection()
db.connect(**kwargs)

Finally, in the rightmost route, all steps are done explicitly. Notice that the order of calling the connect() and config() methods becomes reversed in this case compared with the three other ways of creating the connection. The syntax is

db = mysql.connector.MySQLConnection()
db.config(**kwargs)
db.connect()

Before creating some real connections, it is necessary to take a look at the most common options used when creating a connection.

Common Connection Options

The most commonly used options for specifying how to connect to MySQL, whom to authenticate as, and which password to use are summarized in Table 1.

Table 1 Common Connection-Related Options

Argument

Default Value

Description

host

127.0.0.1

The hostname of the host, where the MySQL instance you want to connect to is installed. The default is to connect to the loopback (that is the local host).

port

3306

The port on which MySQL is listening. Port 3306 is the standard MySQL port.

unix_socket

 

On Linux and Unix, it is possible to connect to a MySQL instance on the local host by using a Unix socket. Specify the path to the socket file.

user

 

The username of the application user. Do not include the @ and the following hostname; that is for the test user created in this article. Just specify pyuser.

password

 

The password with which to authenticate. For the test user, this would be Py@pp4Demo.

ssl_ca

 

The path to the file containing the SSL certificate authority (CA).

ssl_cert

 

The path to the file containing the public SSL certificate.

ssl_cipher

 

The SSL cipher to use for the connection. You can get a list of valid ciphers by connecting to MySQL using SSL and executing the query SHOW GLOBAL STATUS LIKE ‘Ssl_cipher_list’; The current cipher in use can be determined through the Ssl_cipher session status variable.

ssl_disabled

 

Force a non-SSL connection.

ssl_key

 

The path to the file containing the private SSL key.

ssl_verify_cert

False

Whether MySQL Connector/Python should verify the certificate used by MySQL Server against the CA specified with the ssl_ca option.

The option names may seem familiar if you have, for example, used the MySQL command-line client. That is not a coincidence. Using these options, it is possible to demonstrate how to create connections.

Tip

There are a number of example programs in this article. All example programs that appear in a listing are available for download. See the discussion of example programs in this article for more information about using the example programs.

Connection Examples

It is time to combine the four ways of creating a connection as well as the most common connection option to create source code examples for creating MySQL Connector/Python connections. Listing 1 shows how to connect using the four ways of creating the connection. The examples are in the same order as when they were discussed earlier in the section.

import mysql.connector
connect_args = {
  "host": "127.0.0.1",
  "port": 3306,
  "user": "pyuser",
  "password": "Py@pp4Demo",
};
# ---- connect() function ----
db1 = mysql.connector.connect(
  **connect_args
)
print(
  "MySQL connection ID for db1: {0}"
  .format(db1.connection_id)
)
db1.close()
# ---- Explicit MySQLConnection ----
db2 = mysql.connector.MySQLConnection(
  **connect_args
)
print(
  "MySQL connection ID for db2: {0}"
  .format(db2.connection_id)
)
db2.close()
# ---- Two steps manually ----
db3 = mysql.connector.MySQLConnection()
db3.connect(**connect_args)
print(
  "MySQL connection ID for db3: {0}"
  .format(db3.connection_id)
)
db3.close()
# ---- All three steps manually ----
db4 = mysql.connector.MySQLConnection()
db4.config(**connect_args)
db4.connect()
print(
  "MySQL connection ID for db4: {0}"
  .format(db4.connection_id)
)
db4.close()

Listing 1 Examples of Connecting to MySQL

The four connections use the same connection options. Once the connection is created, the connection ID (from the MySQL Server side) of the connection is printed using the connection_id property of the connection. Finally, the connection is closed using the close() method. It is best practice to always explicitly close the connection when the application is done with it.

Tip

Always close the connection when you are done with it. Closing the connection ensures a clean disconnect from MySQL Server. It may also take some time before the server kills the connection; in the meantime, it occupies one of the available connections.

The output is similar to the following sample except the connection IDs will be different:

MySQL connection ID for db1: 13
MySQL connection ID for db2: 14
MySQL connection ID for db3: 15
MySQL connection ID for db4: 16

The config() method can also be invoked for an existing connection. Let’s discuss how to reconfigure the connection and reconnect next.

Reconfiguration and Reconnect

It is not commonly done, but it is possible to reconfigure an existing connection and reconnect. In this context, reconfiguring means potentially changing all options including the MySQL Server instance that the application is connected to. When such changes are made, it is necessary to explicitly tell MySQL Connector/Python to reconnect.

To reconfigure a connection, use the config() method in the same way as you did before the initial connection was made. Once the new, desired configuration has been created, call the reconnect() method if any of the configuration changes require a new connection. Calling reconnect() closes the old connection and creates a new one with the new configuration. Listing 2 shows an example of reconfiguring a connection.

import mysql.connector
initial_args = {
  "host": "127.0.0.1",
  "port": 3306,
  "user": "pyuser",
  "password": "Py@pp4Demo",
};
# Create initial connection
db = mysql.connector.connect(
  **initial_args
)
print(
  "Initial MySQL connection ID ...: {0}"
  .format(db.connection_id)
)
new_args = {
  "host": "<your_IP_goes_here_in_quotes>",
};
db.config(**new_args)
db.reconnect()
print(
  "New MySQL connection ID .......: {0}"
  .format(db.connection_id)
)
db.close()

 Listing 2 Reconfiguring a Connection

This example requires that there is a second user account on the same MySQL instance. The user is identical to the existing pyuser@localhost connection, but is defined to connect from the public IP address (replace with the IP address of your computer or if the IP address resolves with the hostname):

mysql> CREATE USER pyuser@'<your_IP_goes_here_in_quotes>'
                   IDENTIFIED BY 'Py@pp4Demo';
Query OK, 0 rows affected (0.84 sec)

It is also necessary that any firewall allows the connection.

As you can see from the example, it is not necessary to change all of the configuration options. Those that are not explicitly set in the new set of options keep their old value. The output of the program is (except for the IDs)

Initial MySQL connection ID ...: 21
New MySQL connection ID .......: 22

The last topic of the section is a few best practices regarding connections.

Connection Best Practices

There are a few best practices that are good to follow when it comes to connections. It is always difficult with best practices because all applications have unique requirements. So, the suggestions will focus on the technical side of MySQL Connector/Python.

The main best practices are

  • Always close the connection when you are done with it. This has already been discussed.
  • Use SSL (TLS) to encrypt the connections. This is particularly important if you are connecting to a remote host and even more so if the connection is over an insecure network. An exception is when you use a Unix socket for the connection as that is always considered a secure connection even without using SSL.
  • Do not hardcode the configuration into the source code. This particularly applies to the password.

Note

While MySQL uses ssl_ as the prefix for the options related to encrypting the connection, in reality TLS is used.

In MySQL 8.0 and in some builds of MySQL 5.7, SSL is enabled by default using a self-signed certificate, and MySQL Connector/Python will use an encrypted connection by default.

The examples thus far have had a major flaw: they hardcoded not only where to connect to, but also the username and particularly the password. This makes the code harder to maintain, and it is also a security concern because the password is visible to anyone with access to the source code. Hardcoding the connection options also means that either the development and production system must share connection options or the deployment procedure needs to change the source code to update the configuration parameters.

Caution

Never store the password in the source code.

Neither is a good option, so an alternative solution must be found. The next section will discuss an alternative: using configuration files.

 

Configuration Files

The method of specifying the connection options directly in with the call to create the connection to MySQL can be very useful for doing quick tests, but it is neither practical nor secure (for the password) to do in real applications. This section will start out discussing some alternatives and then will go into details of using MySQL configuration files.

Alternatives to Hardcoding the Configuration

There are several ways to avoid hardcoding the connection configuration into the source code. There are pros and cons for each method, so it is not a matter of one size fits all. Four methods will be discussed:

  • Asking for the information interactively
  • Using environment variables
  • Reading the information from the application’s own configuration file or as a command-line argument
  • Using a MySQL configuration file.

The interactive approach is great if you are writing a program that can be used by different users, so it is not known who the program will connect as. It is also the safest way to pass the password to the program. It is, however, not very convenient for more daemon-like processes to require starting the process manually each time it is necessary to restart it.

Environment variables can be used to specify options for the session. Child processes will inherit the environment of the parent, so the environment variables can be used to pass settings to a child process, such as from a shell to the application. It can be a great way to configure applications without requiring files or parsing the options on the command line. For example, it is a common way to configure applications running inside containers such as Docker.1

There are some downsides to using environment variables. When automatically starting processes, it will be necessary to store the environment variables in a file, which means it ends up being an alternative format for a configuration file. The environment is also in general long-lived; for example, if the application starts new processes on its own, it will by default pass on its environment, including potential secret information such as the password. The environment variables may also be readable by users with high privileges. So, care should be taken when using environment variables.

Using the application’s own configuration file or providing the options as command-line arguments means that all of the configuration is done in one place. In this case, the MySQL options are treated the same way as other options, and all that is required when writing the code is to pass the options and their values to the MySQL connection.

Caution

Be very careful with passwords as command-line options. It may be possible for other users on the host to see the arguments passed to the program, such as by using the ps command on Linux. So, it is recommended not to specify the password as a command-line argument.

There is another way, however. MySQL Connector/Python has native support for reading MySQL configuration files. Some of the reasons for using this approach over the application’s own configuration file are that the application may not need a configuration file except for the MySQL-related options, or there may be different owners of the application configuration and the MySQL configuration. The latter can happen if the developer is responsible for defining the behavior of the application itself but the database administrator is in charge of the MySQL-specific options.

Since this blog is about using the MySQL Connector/Python features rather than general Python programming, the only one of the four options to be discussed in more detail is the one using a MySQL configuration file.

Using MySQL Configuration Files

MySQL uses the INI file format for its configuration files. The following is a simple example for use with MySQL Connector/Python using the same configuration as previously in this article:

[connector_python]
user     = pyuser
host     = 127.0.0.1
port     = 3306
password = Py@pp4Demo

There are two connection options that control the use MySQL configuration files:

  • option_files: This option specifies the path to one or more configuration files to read. The value can either be a string or a list of strings. There is no default value.
  • option_groups: This option specifies which option groups to read from. The option group is specified as a name between square brackets; in the example configuration, the option group is connector_python. The value is a list of strings with the names of the group. The default is to read from the client and connector_python groups.

By convention, MySQL configuration files are called my.ini on Microsoft Windows and my.cnf on other platforms. There are no requirements for the file name or file extension from a functional point of view.

An important feature to be aware of is that the option_groups option does not treat all groups equally. Specifically, the connector_python group is special because all options in this group must be valid or a ValueError exception will be raised. For other groups, unknown options are ignored. The reason to ignore unknown options is that several programs may read the same option groups. For example, the client group is also read by the mysql command-line client and other MySQL client programs.

Listing 3 shows an example of connecting to MySQL with the connection options read from the my.ini file located in the same directory as the program.

 

import mysql.connector
db = mysql.connector.connect(
  option_files="my.ini")
print(__file__ + " - single config file:")
print(
  "MySQL connection ID for db: {0}"
  .format(db.connection_id)
)
db.close()

 Listing 3 Using a MySQL Configuration File

The output is similar to the previous examples printing the ID for the connection, for example:

listing_2_3.py - single config file:
MySQL connection ID for db: 35

In some cases, you may want to split the MySQL configuration into several files. For example, say several applications need to connect to the same MySQL backend, so they share the host and port information, but each application uses different credentials for the connection. Continuing the example, the two files, my_shared.ini and my_app_specific.ini, can be created with the following content:

my_shared.ini:
[client]
   host     = 127.0.0.1
   port     = 3306
   my_app_specific.ini:
[connector_python]
   user     = pyuser
   password = Py@pp4Demo

The only change required to the test program is to change the value of option_ files into a list. To demonstrate how the option_groups option can be set, it is also added to the program. The resulting source code can be seen in Listing 4.

import mysql.connector
db = mysql.connector.connect(
  option_files = [
    "my_shared.ini",
    "my_app_specific.ini"
  ],
  option_groups = [
    "client",
    "connector_python"
  ]
)
print(__file__ + " - two config files:")
print(
  "MySQL connection ID for db: {0}"
  .format(db.connection_id)
)
db.close()

Listing 4 Using Multiple Configuration Files

The output is (except for the ID, which will change from execution to execution) the following:

listing_2_4.py - two config files:
MySQL connection ID for db: 42

 One final consideration is path names. If a relative path is specified, it is the directory where Python is executed that is used as the base directory. Take, for example, the following command to execute a program (the ID will in general be different):

PS C:\MySQL> python Source/test.py
MySQL connection ID for db: 56

his is executed while C:\MySQL is the current working directory. If test.py has option_files="my.ini", then the my.ini file must be located in C:\MySQL.

Another observation is that for Microsoft Windows it is optional whether backslashes (\) or forward slashes (/) are used to separate the path components (directories).

This concludes the discussion of configuration files. The final topic of the article is the remaining options that MySQL Connector/Python supports for the connection.

 

General Configuration

Thus far the only configuration options that have been discussed are the ones required to specify where to connect to, whom to connect as, and whether to use SSL. There are several other options that are more related to the behavior of the application. These options are the topic of this section.

The options that were not included in the list of connection options earlier in the blog are summarized in Table 2 through Table 5 with one table for each of the following option types: connection, character set, query behavior, and warnings.

Connection

There are more connection options than were discussed in the “Creating the Connection” section. They are not as commonly used but can be required for some use cases. The options are summarized in Table 2. Some of the options will be discussed in more detail after the table.

Table 2 Less Common Connection-Related Options

Name

Default Value

Description

auth_plugin

 

Which authentication plugin to use. This is, for example, required when using MySQL Connector/Python 2.1 to connect to MySQL Server 8.0 because the server’s default authentication plugin is not supported by old MySQL Connector/Python versions.

client_flags

 

An alternative way to configure several options through flags.

compress

False

When enabled, the network traffic is compressed.

connection_timeout

 

How long to wait before timing out when creating the connection.

converter_class

 

Specifies a custom converter class for converting the raw row data to Python types.

failover

 

Tuple of dictionaries specifying alternative MySQL Server instances to fail over to if the primary connection fails. This is only supported using the mysql.connector.connect() function.

force_ipv6

False

When True, IPv6 is always used when possible.

pool_name

Auto generated

The name of a connection pool. By default, the name is generated by joining the values of the host , port , user , and database connection options. The name can be at most pooling.CNX_POOL_MAXNAMESIZE (defaults to 64) characters long and is allowed to use alphanumeric characters as well as the following characters: ., _, :, -, *, $, and #. This is only supported using the mysql.connector.connect() function or by instantiating the pooling.MySQLConnectionPool constructor class directly.

pool_reset_session

True

When True, the session variables are reset when the connection is returned to the pool. This is only supported using the mysql.connector.connect() function or by instantiating the pooling.MySQLConnectionPool constructor class directly.

pool_size

5

The number of connections to hold in the pool. The value must be at least 1 and at most pooling.CNX_POOL_MAXSIZE (defaulting to 32). This is only supported using the mysql.connector.connect() function or by instantiating the pooling.MySQLConnectionPool constructor class directly.

use_pure

False

When True, the pure Python implementation of the connector is used. When False, the C Extension is used. If the option is not specified, the default is to use the C Extension if it is installed; otherwise it falls back on the pure Python implementation. This is only supported using the mysql.connector.connect() function. In most cases, it is recommended to use the C Extension.

The compress option can be used to reduce the amount of network traffic by compressing the data transferred between the application and MySQL Server (and vice versa) at the cost of additional computational resources. This can be particularly useful if large SQL statements are sent to the server or large query results are returned to the application and the application is installed on a remote host.

Four options that deserve a little more attention are the failover and pool options. The failover option can be used to define one or more MySQL Server instances that MySQL Connector/Python will fail over to if the connection to the primary instance fails. Each alternative MySQL Server instance is specified as a dictionary in a tuple or list. The pool options set up a connection pool that the application can request connections from.

The client_flags option can be used to set several options. The list of options that is available can be determined using the get_full_info() method of the ClientFlag constants:

from mysql.connector.constants import ClientFlag
print("\n".join(
  sorted(ClientFlag.get_full_info())
))

 The output from Connector/Python 8.0.11 can be seen in Listing 5. The name of the client flag is first listed, followed by a description of what the flag controls. Most of the flags also have dedicated options, but there are a few additional flags such as INTERACTIVE that can only be set through the client_flags option.

CAN_HANDLE_EXPIRED_PASSWORDS : Don't close the connection for a connection with expired password
COMPRESS : Can use compression protocol
CONNECT_ARGS : Client supports connection attributes
CONNECT_WITH_DB : One can specify db on connect
DEPRECATE_EOF : Client no longer needs EOF packet
FOUND_ROWS : Found instead of affected rows
IGNORE_SIGPIPE : IGNORE sigpipes
IGNORE_SPACE : Ignore spaces before ''
INTERACTIVE : This is an interactive client
LOCAL_FILES : Can use LOAD DATA LOCAL
LONG_FLAG : Get all column flags
LONG_PASSWD : New more secure passwords
MULTI_RESULTS : Enable/disable multi-results
MULTI_STATEMENTS : Enable/disable multi-stmt support
NO_SCHEMA : Don't allow database.table.column
ODBC : ODBC client
PLUGIN_AUTH : Client supports plugin authentication
PLUGIN_AUTH_LENENC_CLIENT_DATA : Enable authentication response packet to be larger than 255 bytes
PROTOCOL_41 : New 4.1 protocol
PS_MULTI_RESULTS : Multi-results in PS-protocol
REMEMBER_OPTIONS :
RESERVED : Old flag for 4.1 protocol
SECURE_CONNECTION : New 4.1 authentication
SESION_TRACK : Capable of handling server state change information
SSL : Switch to SSL after handshake
SSL_VERIFY_SERVER_CERT :
TRANSACTIONS : Client knows about transactions

Listing 5 List of Client Flags

In order to configure client_flags, specify a list of the flags that should be enabled or disabled. To enable the flag, just specify the name of the flag; to disable the flag, prepend a minus sign. Listing 6 shows an example to tell the connection it is an interactive connection, but it cannot handle expired passwords.

import mysql.connector
from mysql.connector.constants import ClientFlag
connect_args = {
  "host": "127.0.0.1",
  "port": 3306,
  "user": "pyuser",
  "password": "Py@pp4Demo",
  "client_flags": [
    ClientFlag.INTERACTIVE,
    -ClientFlag.CAN_HANDLE_EXPIRED_PASSWORDS
  ]
};
db = mysql.connector.connect(
  **connect_args
)
print(__file__ + " - Client flags:")
print(
  "MySQL connection ID for db: {0}"
  .format(db.connection_id)
)
db.close()

Listing 6 Using Client Flags in the Connection

This gives the following output (again except for the values of the ID):

listing_2_6.py - Client flags:
MySQL connection ID for db: 60

The use_pure option can be used to specify whether the C Extension or the pure Python implementation of the connector will be used. The C Extension provides better performance than the pure implementation, particularly when working with large result sets and prepared statements. On the other hand, the pure Python implementation is supported on more platforms, has a few more features, and it is easier to modify the source code. The C Extension is the default in versions 8.0.11 and later when it is installed, whereas earlier versions used the pure Python implementation by default.

The C Extension can also be used by importing the _mysql_connector module instead of the usual mysql.connector module.

The other connection options will not be discussed in any more detail. Instead, focus will be moved to the character set options.

Character Set

The character set defines how characters are encoded. In the early days of the Internet, the ASCII character set was often used. ASCII uses seven bits for each character, which is space efficient but it means there are only 128 different characters available. This works reasonably well for plain text in English, but it is missing characters for other languages. Over the years, various other character sets have been used, such as the Latin character sets.

The locale-specific character sets help support all languages, but with the downside that different encodings are required for different languages. One response to that is the Unicode Transformation Format (UTF) encodings; UTF-8 in particular has become popular. UTF-8 uses a variable number of bytes to store characters. The original 128 ASCII characters have the same encoding in UTF-8; other characters use two to four bytes.

Until and including MySQL Server 5.7, the default character set for the server side was Latin1, but this changed in MySQL 8.0 when utf8mb4 became the default. The mb4 suffix indicates that up to four bytes are used for each character (mb = multi-byte). The reason this is required is that utf8 in MySQL previously has meant up to three bytes per character is supported per character. However, a three-byte UTF-8 implementation misses out on several emojis and it has been deprecated, so it is better to use the four-byte variant. The default character set for Connector/Python until version 8.0.12 is utf8, which is a three-byte implementation of UTF-8 (called utf8 or utf8mb3 in MySQL Server). Starting from version 8.0.12 the default is utf8mb4 as in MySQL Server.

There is also the concept of collation to consider. The collation defines how to compare two characters or character sequences with each other, such as whether ä and a should be considered the same character in comparisons and whether ss is considered equal to ß (German sharp s). The collation also defines the sorting order of characters and whether the comparison is case sensitive or not. Each character set has a default collation, but it is also possible to explicitly request a collation.

Tip

Unless you have specific country requirements, the default collation in MySQL Server is often a good choice when choosing utf8 or utf8mb4 as the character set.

The character sets and collations that are available in MySQL do not, in general, change much between versions. However, one of the major changes for MySQL Server 8.0 is the addition of a range of UCA 9.0.0 collations. Information about the available character sets and their default collation can be found using the CHARACTER_SETS table in the Information Schema, as shown in Listing 7.

mysql> SELECT CHARACTER_SET_NAME AS Name,
              DEFAULT_COLLATE_NAME
         FROM information_schema.CHARACTER_SETS
        ORDER BY CHARACTER_SET_NAME;
+----------+----------------------+
| Name     | DEFAULT_COLLATE_NAME |
+----------+----------------------+
| armscii8 | armscii8_general_ci  |
| ascii    | ascii_general_ci     |
| big5     | big5_chinese_ci      |
| binary   | binary               |
...
| ujis     | ujis_japanese_ci     |
| utf16    | utf16_general_ci     |
| utf16le  | utf16le_general_ci   |
| utf32    | utf32_general_ci     |
| utf8     | utf8_general_ci      |
| utf8mb4  | utf8mb4_0900_ai_ci   |
+----------+----------------------+
41 rows in set (0.00 sec)

Listing 7 Character Set Collations in MySQL 8.0.11

Similarly, the collations available for a specific character set can be determined using the COLLATIONS table. Listing 8 shows the output for the utf8mb4 character set.

mysql> SELECT COLLATION_NAME, IS_DEFAULT
         FROM information_schema.COLLATIONS
        WHERE CHARACTER_SET_NAME = 'utf8mb4';
+----------------------------+------------+
| COLLATION_NAME             | IS_DEFAULT |
+----------------------------+------------+
| utf8mb4_general_ci         |            |
| utf8mb4_bin                |            |
| utf8mb4_unicode_ci         |            |
...
| utf8mb4_0900_ai_ci         | Yes        |
| utf8mb4_de_pb_0900_ai_ci   |            |
| utf8mb4_is_0900_ai_ci      |            |
| utf8mb4_lv_0900_ai_ci      |            |
...
| utf8mb4_vi_0900_as_cs      |            |
| utf8mb4_ja_0900_as_cs      |            |
| utf8mb4_ja_0900_as_cs_ks   |            |
| utf8mb4_0900_as_ci         |            |
| utf8mb4_ru_0900_ai_ci      |            |
| utf8mb4_ru_0900_as_cs      |            |
+----------------------------+------------+
73 rows in set (0.00 sec)

 

Listing 8 The Collations Available for the utf8mb4 Character Set

The output shows the 73 collations that are available for utf8mb4 in MySQL Server 8.0.11. The collation names consist of several parts:

  • The character set name
  • Which country the collation is for (for example ja for Japan) or whether it is of a more general nature
  • Modifiers (accents): These are not present for all collations. Examples are ai for accent insensitive, as for accent sensitive, ci for case insensitive, and cs for case sensitive.

Tip

The topic of character sets and collations in MySQL is large. For a deeper discussion, see https://dev.mysql.com/doc/refman/en/charset.html and references therein.

There are three options related to characters sets and collations for MySQL Connector/Python. These are summarized in Table 3.

Table 3 Character Set-Related Options

Name

Default Value

Description

charset

utf8mb4

The character set used for the connection. In MySQL Connector/Python 8.0.11 and earlier, the default is utf8. In most cases, it is recommended to use utf8mb4.

collation

utf8mb4_general_ci

The collation to use for comparisons and ordering of strings. In many cases, the default can be used. The default value for MySQL Connector/Python 8.0.11 and earlier is utf8_general_ci. In MySQL Server 8.0, the default collation for the utf8mb4 character set is utf8mb4_0900_ai_ci which is often a good choice unless specific requirements exist.

use_unicode

True

Whether to return strings in query results as Python Unicode literals. The default is True, and this is usually also the best value to use.

Listing 9 shows an example of configuring the character set-related options.

import mysql.connector
connect_args = {
  "host": "127.0.0.1",
  "port": 3306,
  "user": "pyuser",
  "password": "Py@pp4Demo",
  "charset": "utf8mb4",
  "collation": "utf8mb4_unicode_ci",
  "use_unicode": True
};
db = mysql.connector.connect(
  **connect_args)
print(__file__ + " - Setting character set:")
print(
  "MySQL connection ID for db: {0}"
  .format(db.connection_id)
)
db.close()

It is possible to change the character set and collation used by the connection after the initial connection to MySQL Server has been made. The best way to do this is to change the charset and collation properties of the connection using the set_charset_collation() method as demonstrated in Listing 10. Notice that unlike the rest of the examples, this example first instantiates the MySQLConnection class to be able to print the initial character set and collation before creating the connection.

import mysql.connector
db = mysql.connector.MySQLConnection()
# Print banner and initial settings
print(
  "{0:<9s}   {1:<7s}   {2:<18s}".format(
    "Stage", "charset", "collation"
  )
)
print("-" * 40)
print(
  "{0:<9s}   {1:<7s}   {2:<18s}".format(
    "Initial", db.charset, db.collation
  )
)
# Create the connection
connect_args = {
  "host": "127.0.0.1",
  "port": 3306,
  "user": "pyuser",
  "password": "Py@pp4Demo"
};
db.connect(**connect_args)
# The connection does not change the
# settings
print(
  "{0:<9s}   {1:<7s}   {2:<18s}".format(
    "Connected",
    db.charset, db.collation
  )
)
# Change only the character set
db.set_charset_collation(
  charset = "utf8mb4"
)
print(
  "{0:<9s}   {1:<7s}   {2:<18s}".format(
    "Charset", db.charset, db.collation
  )
)
# Change only the collation
db.set_charset_collation(
  collation = "utf8mb4_unicode_ci"
)
print(
  "{0:<9s}   {1:<7s}   {2:<18s}".format(
    "Collation",
    db.charset, db.collation
  )
)
# Change both the character set and
# collation
db.set_charset_collation(
  charset   = "latin1",
  collation = "latin1_general_ci"
)
print(
  "{0:<9s}   {1:<7s}   {2:<18s}".format(
    "Both", db.charset, db.collation
  )
)
db.close()

Listing 10 Changing the Character Set of a Connection

As you can see from the example, the character set and collation properties can be used even before the connection has been established. However, it is not possible to use the set_charset_collation() method to change the character set or collation until the connection has been established.

Note

Always use the set_charset_collation() method to change the character set and/or collation for a connection. Compared with executing SET NAMES as an SQL statement directly, it ensures that Connector/Python knows which settings are used for converting bytearrays into Python strings, the character set and collation choice are validated against those known by Connector/Python, and the C Extension settings are kept in sync.

Establishing the connection will not change the value of the charset and collation properties. The character set can be changed on its own, in which case the collation is set to the default for the character set. In this case, the character set is set to utf8mb4, so the default character set is utf8mb4_general_ci.

The collation can also be set separately, and finally both the character set and collation are set. The output of executing the program in Listing 10 using version 8.0.11 is

Stage       charset   collation
----------------------------------------
Initial     utf8      utf8_general_ci
Connected   utf8      utf8_general_ci
Charset     utf8mb4   utf8mb4_general_ci
Collation   utf8mb4   utf8mb4_unicode_ci
Both        latin1    latin1_general_ci

 

If you are using MySQL Connector/Python 8.0.12 or later, the character set and collation for Initial and Connected are utf8mb4 and utf8mb4_general_ci.

Query Behavior

There are several options that control how queries behave. These range from defining whether features are allowed over transaction configuration to defining how MySQL Connector/Python will handle the results. The options are listed in Table 4.

Table 4 Query-Related Options

Name

Default Value

Description

allow_local_infile

True

Whether the LOAD DATA LOCAL INFILE statement is allowed.

autocommit

False

When True, an implicit COMMIT is executed after each query.

buffered

False

When True, the result set is fetched immediately and buffered in the application.

consume_results

False

When True, query results are fetched automatically if there are unfetched rows and a new query is executed.

database

 

Which database (schema) to use as the default for queries where the database name is not explicitly given for a table.

raw

False

By default, result values are converted to Python types when cursors are used. When setting this option to True, the results are returned without conversion.

sql_mode

(Server default)

The SQL mode used when executing queries. See https://dev.mysql.com/doc/refman/en/sql-mode.html.

time_zone

 

When set, timestamps are converted to that time zone instead of using the server-side time zone.

 

Warnings

It is very important that warnings and errors are handled in the correct way. Failure to do so can result in corrupted or lost data. There are two options that control how MySQL Connector/Python handles warnings when you use a cursor. The options are shown in Table 5.

Table 5 Warning-Related Options for Cursors

Name

Default Value

Description

get_warnings

False

When set to True, warnings are automatically fetched after each query.

raise_on_warnings

False

When set to True, warnings cause an exception to be raised.

 

Summary

This article went through how to create and configure a connection from a Python program to a MySQL Server database. The following topics were discussed:

  • Four different ways to establish the connection including the initial configuration. The mysql.connector.connect() function is the most flexible of the four methods.
  • The configuration options.
  • Best practices for connections: Close connections, use SSL/TLS to encrypt the traffic, do not hardcode connection options (particularly the password) in the source code.
  • MySQL configuration files.
  • Character sets.

It is all well and good to be able to create the connection to the database, but it is not very useful unless you can execute queries.

 

 

Вас заинтересует / Intresting for you:

Connecting with a MySQL databa...
Connecting with a MySQL databa... 2285 views Valerij Sun, 26 Aug 2018, 11:27:52
MySQL Database optimization ba...
MySQL Database optimization ba... 2500 views Stepan Ushakov Fri, 05 Oct 2018, 17:15:28
MySQL Installation manual (bin...
MySQL Installation manual (bin... 1312 views Александров Попков Wed, 05 Dec 2018, 14:36:19
Determining characteristics of...
Determining characteristics of... 1297 views Valerij Sun, 26 Aug 2018, 12:52:37