Description
Describe the bug
Sometimes the query may return wrong results under high load, on Non-Windows clients.
To reproduce
- .NET Core
- MSSQL instance
- Client is not running on Windows (reproduced on MacOS or Linux)
- Query uses async reader methods
- High load that saturates the connection pool and causes timeouts.
Bug is seen more often when MultipleActiveResultSets=True
(MARS) is in the connection string. It might be prevented by disabling MARS (unclear at this point).
The issue is hard to reproduce and rare, therefore this project was created with the code that attempts to simulate the situation.
This program starts 2000 concurrent connections that runs select @Id as Id
statement. Each Id
is different, and the query result should always return
the id that was queried, yet under certain circumstances that's not the case. It looks like
sometimes the reader returns the stale data from random previous connection that experienced the timeout.
There are more details in the linked project.
Network latency can be more easily simulated using this method here: https://github.com/trainline/SqlClient659
Expected behavior
Under no circumstances a select statement should return a different result.
Further technical details
Microsoft.Data.SqlClient version: 4.8.1 or earlier
.NET target: netcoreapp2.0 - netcoreapp3.1
SQL Server version: SQL Server Web Edition 13.00.5426.0.v1, SQL Server Express Edition 14.00.3281.6.v1
Operating system: macOS 10.15.5, Docker container
Workaround
Do not use this SQL client on linux/docker.
Always check the id of the returned result and retry or crash if it is wrong.
To decrease the likelihood of this or maybe prevent it (unclear at this point), do not use MARS feature, if you are using .NET Core with MSSQL database, ensure that the connection string does not contain MultipleActiveResultSets=True
.