Maximilian Riegler

EF Core 5 Upgrade - collation shenanigans

Posted on efcoredotnetcsharpsql

Today I stumbled over a pretty weird problem at work, where I already initially thought it must've been from either a framework or a database upgrade.

I was in the middle of doing some code first database entity modelling, creating a migration for it and updating the database, when I got struck by the following error from my dev database:

errno: 150 "Foreign key constraint is incorrectly formed"

This of course made me check all navigation properties on my entity, if there maybe was some type mismatch, but it all looked good. Then I looked at the error message and generated migration SQL again and there it was:

    `Id` char(36) COLLATE ascii_general_ci NOT NULL,
    `OtherEntityId` char(36) COLLATE ascii_general_ci NOT NULL,
    `CreatedOn` datetime(6) NULL,

ascii_general_ci ...? I didn't remember us switching to ASCII, I mean we were using utf8mb4_general_ci everywhere, right?

The DbContextModelSnapshot.cs didn't reveal a whole lot regarding that, only some utf8mb4 sprinkled here and there, so I checked the database - surely we don't use ASCII anywhere.

FROM information_schema.columns 
| NULL               |      931 |
| ascii_general_ci   |        1 |
| utf8mb4_bin        |        7 |
| utf8mb4_general_ci |      377 |
4 rows in set (0.004 sec)

Looked like I already had one column with the wrong collation in my dev database and it was from an unfinished migration. So that means this problem existed for quite a while, I just never saw it happening, because I had no database updates with the type Guid on a navigation property involved.

After some more research online I found out, that apparently since we moved to .NET 5.0 we also reeled in this breaking change in our EF Core tools as described here by the Github user lauxjpn with this wonderful solution:

// The following applies the "utf8mb4" charset recursively only to columns, without changing the default charset of
// the database or its tables.
modelBuilder.HasCharSet("utf8mb4", DelegationModes.ApplyToColumns);

// The following disables the newly introduced default collation 'ascii_general_ci' for character based Guid columns.

So that's what I tried and surely enough I got a new migration with a LOT of overhead, where every Guid property in our database layer has been explicitly set to the utf8mb4_general_ci collation. But that seems to be expected and supposedly does no harm.