Differentiate between BINARY/CHAR

MySQL provides `BINARY` and `VARBINARY` data types. These are treated as strings with the `binary` character set. As far as MySQL is concerned, the only real difference between binary and text fields is the character set (although this is a bit of a simplification).

This is why, when using  `ColumnTypes()`, the `mysql` driver returns `CHAR` or `VARCHAR`; there's no notion of a `BINARY` field type internally.

### Motivation
The use case I have is for an application that ingests data from multiple sources and writes it to a common format. This requires identifying column types in the input database (which is MySQL, in this case) to ensure they can be supported in the common format.

This common format supports UTF-8 strings, but not arbitrary binary strings. Therefore, I'd like to be able to identify whether a column that the driver says is `CHAR` is actually a text field, and not a byte slice.

### Possible Solutions
I can see two possible approaches that would benefit my use case here:
1. Provide access to a column's character set
2. Switch over character set to return `BINARY` or `VARBINARY` when appropriate

#### Option 1: Provide access to a column's character set
The benefit of this option is that it opens the way for users of the driver to do more than just distinguish between `BINARY` and `CHAR`. One could potentially handle different character sets in different ways, which I can imagine could be useful with unusually-configured databases.

The major disadvantage is that I'm not aware of an existing interface in the stdlib 'sql' package for returning this sort of information. Doing so would require providing some new MySQL specific interface for getting this information out of the driver. Not impossible, but could be tricky to get right.

#### Option 2: Switch over character set to return `BINARY` or `VARBINARY` when appropriate
This option should be relatively straightforward to implement. I believe the packet received by the driver already contains information about the character set for string types. We could store this information when reading field information from a packet, and later switch over it when we need to know the column type.

The character set for binary fields is `63`, as evidenced here: https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_basic_character_set.html.

This option would require some modest changes to `fields.go` and `packets.go` to provide the extra logic required. I believe this sort of approach was discussed in https://github.com/go-sql-driver/mysql/pull/667, with little opposition, but the decision was to wait for some feedback from users before implementing.

### Conclusion
I suggest option 2 be implemented, as it requires the smallest change and solves the use case I presented above. Option 1 could still be implemented later if it was found to be useful.

Subject to any feedback on what I've proposed above, I'll try implementing option 2 with the aim of producing a suitable PR.

Let me know if you have any questions or comments or if you think I've missed anything. 


_Note, there is also a similar issue with the `TEXT` and `BLOB` data types (and their variants). The difference here is that these fields are read as `BLOB`s by default (whether they are actually `TEXT` or `BLOB`.  The fix should be more or less the same in this case._



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Differentiate between BINARY/CHAR #723

Motivation

Possible Solutions

Option 1: Provide access to a column's character set

Option 2: Switch over character set to return `BINARY` or `VARBINARY` when appropriate

Conclusion

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Differentiate between BINARY/CHAR #723

Description

Motivation

Possible Solutions

Option 1: Provide access to a column's character set

Option 2: Switch over character set to return BINARY or VARBINARY when appropriate

Conclusion

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Option 2: Switch over character set to return `BINARY` or `VARBINARY` when appropriate