
Client hung after end of results during heavy load #285

Closed
@encryptio

Description

I'm running many concurrent transactions (half read, half write, often with statements that take large write locks on rows/index pages) with tx_isolation=serializable, and rerunning any transaction in which a statement returns a *MySQLError with Number == ER_LOCK_DEADLOCK (1213). Deadlock errors happen very often (~0.015 retries per completed transaction), which is expected for the kind of queries and load I'm putting on the system.
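
Roughly, the retry wrapper looks like this (a simplified sketch of the pattern, not the actual production code; the function name and the retry cap are made up):

```go
package main

import (
	"database/sql"
	"errors"

	"github.com/go-sql-driver/mysql"
)

// runTxWithRetry sketches the retry-on-deadlock pattern described above: run
// the work inside a transaction, and rerun the whole transaction whenever any
// statement (or the commit) fails with ER_LOCK_DEADLOCK (1213).
func runTxWithRetry(db *sql.DB, work func(tx *sql.Tx) error) error {
	const maxRetries = 20 // arbitrary cap for this sketch
	for i := 0; i < maxRetries; i++ {
		tx, err := db.Begin()
		if err != nil {
			return err
		}
		if err = work(tx); err == nil {
			err = tx.Commit()
		} else {
			tx.Rollback()
		}
		if myErr, ok := err.(*mysql.MySQLError); ok && myErr.Number == 1213 {
			continue // deadlock detected: retry the whole transaction
		}
		return err // success (nil) or a non-retryable error
	}
	return errors.New("transaction retry limit exceeded")
}
```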

This code runs fine for a while, but occasionally (more often under heavy load) a client connection gets stuck waiting for an extra packet at the end of a result set that the server doesn't think it needs to send. Unfortunately, this happens a hell of a lot under heavy load (once every couple of minutes at a sustained 100 transactions per second).

The client is blocked (forever) on:

1 @ 0x4237d6 0x42e4c6 0x42de2a 0x52c314 0x52c370 0x52d6f0 0x53cac5 0x4bceac 0x4bcfef 0x4c0ad1 0x4c4347 0x4c879f 0x49f3b8 0x49ec6f 0x490a1d 0x48ca2f 0x406ce3 0x406b56 0x48eaad 0x48e865 0x4066d7 0x4239a0
#   0x42e4c6    netpollblock+0xa6                           /usr/lib/golang/src/pkg/runtime/netpoll.goc:280
#   0x42de2a    net.runtime_pollWait+0x6a                       /usr/lib/golang/src/pkg/runtime/netpoll.goc:116
#   0x52c314    net.(*pollDesc).Wait+0x34                       /usr/lib/golang/src/pkg/net/fd_poll_runtime.go:81
#   0x52c370    net.(*pollDesc).WaitRead+0x30                       /usr/lib/golang/src/pkg/net/fd_poll_runtime.go:86
#   0x52d6f0    net.(*netFD).Read+0x2a0                         /usr/lib/golang/src/pkg/net/fd_unix.go:204
#   0x53cac5    net.(*conn).Read+0xc5                           /usr/lib/golang/src/pkg/net/net.go:122
#   0x4bceac    github.com/go-sql-driver/mysql.(*buffer).fill+0x1dc         /home/croll/local/go/src/github.com/go-sql-driver/mysql/buffer.go:55
#   0x4bcfef    github.com/go-sql-driver/mysql.(*buffer).readNext+0x3f          /home/croll/local/go/src/github.com/go-sql-driver/mysql/buffer.go:76
#   0x4c0ad1    github.com/go-sql-driver/mysql.(*mysqlConn).readPacket+0x61     /home/croll/local/go/src/github.com/go-sql-driver/mysql/packets.go:30
#   0x4c4347    github.com/go-sql-driver/mysql.(*mysqlConn).readUntilEOF+0x27       /home/croll/local/go/src/github.com/go-sql-driver/mysql/packets.go:641
#   0x4c879f    github.com/go-sql-driver/mysql.(*mysqlRows).Close+0x6f          /home/croll/local/go/src/github.com/go-sql-driver/mysql/rows.go:53
#   0x49f3b8    database/sql.(*Rows).Close+0x78                     /usr/lib/golang/src/pkg/database/sql/sql.go:1580
#   0x49ec6f    database/sql.(*Rows).Next+0xdf                      /usr/lib/golang/src/pkg/database/sql/sql.go:1511
#   0x490a1d    go.codemobs.com/gom/db/mysql.ScanRowsToObjects+0x4d         /home/croll/local/go/src/go.codemobs.com/gom/db/mysql/map.go:104
#   0x48ca2f    go.codemobs.com/gom/db/mysql.(*Ctx).Get+0x2ff               /home/croll/local/go/src/go.codemobs.com/gom/db/mysql/ctx.go:35
#   0x406ce3    main.scheduleJob+0xb3                           /home/croll/local/go/src/go.codemobs.com/vps/transcode-master/scheduler.go:100
#   0x406b56    main.scheduleJobs+0x3c6                         /home/croll/local/go/src/go.codemobs.com/vps/transcode-master/scheduler.go:85
#   0x48eaad    go.codemobs.com/gom/db/mysql.(*MySQLBackend).runTxWithRetryCount+0x14d  /home/croll/local/go/src/go.codemobs.com/gom/db/mysql/db.go:81
#   0x48e865    go.codemobs.com/gom/db/mysql.(*MySQLBackend).RunTx+0x65         /home/croll/local/go/src/go.codemobs.com/gom/db/mysql/db.go:54
#   0x4066d7    main.schedulerLoop+0x57                         /home/croll/local/go/src/go.codemobs.com/vps/transcode-master/scheduler.go:26
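
For reference, the code at the top of that trace (Ctx.Get / ScanRowsToObjects) is doing nothing more exotic than a query-and-scan loop; a minimal equivalent sketch (table and column names made up, same imports as the sketch above):

```go
// scanPendingJobs sketches the call pattern that is blocked in the trace
// above: iterate a result set to its end, at which point database/sql closes
// the Rows and the driver drains remaining packets in readUntilEOF, waiting
// for the terminating EOF packet.
func scanPendingJobs(tx *sql.Tx) ([]int64, error) {
	rows, err := tx.Query("SELECT id FROM jobs WHERE state = ?", "pending")
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var ids []int64
	for rows.Next() { // the final Next() call is where the goroutine hangs
		var id int64
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}
```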

But the server thinks the connection is idle:

mysql> show processlist;
+--------+------+-----------------+--------+---------+------+-------+------------------+
| Id     | User | Host            | db     | Command | Time | State | Info             |
+--------+------+-----------------+--------+---------+------+-------+------------------+
| 162017 | root | localhost:45301 | master | Sleep   |   14 |       | NULL             |
| 162024 | root | localhost       | NULL   | Query   |    0 | NULL  | show processlist |
+--------+------+-----------------+--------+---------+------+-------+------------------+

It's not specific to the scheduler transactions either; it appears to happen at random, roughly in proportion to the number of rows I expect each transaction to lock.

I suspect either mysqld or the Go MySQL driver has a bug somewhere in its protocol handling, but I don't know where to go from here.
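
The frame the client is stuck in, readUntilEOF, appears to just keep reading packets until it sees an EOF marker; paraphrasing packets.go from memory (not the exact source), it is something like:

```go
// Paraphrase of the driver's readUntilEOF (packets.go:641 in the trace above);
// the real source may differ in detail. It discards packets until one starting
// with the EOF marker (0xfe) arrives or the read fails. If the server believes
// the result set is already complete, no further packet ever comes and the
// loop blocks forever inside net.Conn.Read.
func (mc *mysqlConn) readUntilEOF() error {
	for {
		data, err := mc.readPacket()
		if err == nil && data[0] != 0xfe { // 0xfe == EOF packet header
			continue // row packet: discard and keep reading
		}
		return err // EOF packet seen, or a read error
	}
}
```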

Using go-sql-driver/mysql as of 9543750 (Merge pull request #280 from hyandell/master). I can reproduce this reliably with MySQL 5.5.37 and 10.0.14-MariaDB, with both Go 1.2.2 and Go 1.3.3, and have not found any combination of versions on which it does not lock up.
