-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Description
Question
On TPH tables, there are separate columns for each derived type with indexes looking something like this (using PostgreSQL):
CREATE UNIQUE INDEX "IX_BaseType_DerivedTypeId" ON public."BaseTypes" ("DerivedTypeId")I'm wondering what the reasons are for these not being made partial/filtered by default:
CREATE UNIQUE INDEX "IX_BaseType_DerivedTypeId" ON public."BaseTypes" ("DerivedTypeId")
WHERE "DerivedTypeId" IS NOT NULL(see posted code for a slightly more complete example and actual generated SQL from EF 9.0.2)
Having partial (ie filtered) indices substantially reduces their size on large/wide tables with no apparent (?) downside. As far as I can tell, the only thing that will be impacted is if one were to do something like
var results = context.Entity.Where(x => x.ForeignEntityId == null).ToList();which on large tables - even with a non-partial index - tend to perform poorly due to low selectivity and might easily cause the planner to opt for a full table scan anyway.
We have several wide TPH tables in our application with >100M rows, and it feels a bit cumbersome to manually override each and every index EF generates here to avoid the severe bloat that comes with having them non-partial. We're currently using ModelBuilder to handle this automatically per entity, but the code seems fairly complex/brittle, so we're on the lookout for a better/more idiomatic approach.
I've tried looking around the various ef/npgsql repos for any discussion on this, but have come up short. Would greatly appreciate any advice/pointers here.
(Originally posted on npgsql/efcore.pg#3735)
Your code
public class RootEntity
{
public int Id { get; set; }
public SubEntityA A { get; set; } = null!;
public SubEntityB B { get; set; } = null!;
public SubEntityC C { get; set; } = null!;
}
public abstract class BaseEntity
{
public int Id { get; set; }
}
public class SubEntityA : BaseEntity;
public class SubEntityB : BaseEntity;
public class SubEntityC : BaseEntity;
// ...
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<RootEntity>().HasOne(x => x.A).WithOne().HasForeignKey<SubEntityA>("RootEntityId").IsRequired();
modelBuilder.Entity<RootEntity>().HasOne(x => x.B).WithOne().HasForeignKey<SubEntityB>("RootEntityId").IsRequired();
modelBuilder.Entity<RootEntity>().HasOne(x => x.C).WithOne().HasForeignKey<SubEntityC>("RootEntityId").IsRequired();
}
/*
This yields the following SQL in PostgreSQL:
*/
CREATE TABLE "BaseEntities" (
"Id" integer GENERATED BY DEFAULT AS IDENTITY,
"Discriminator" character varying(13) NOT NULL,
"RootEntityId" integer,
"SubEntityB_RootEntityId" integer,
"SubEntityC_RootEntityId" integer,
CONSTRAINT "PK_BaseEntities" PRIMARY KEY ("Id"),
CONSTRAINT "FK_BaseEntities_RootEntities_RootEntityId" FOREIGN KEY ("RootEntityId") REFERENCES "RootEntities" ("Id") ON DELETE CASCADE,
CONSTRAINT "FK_BaseEntities_RootEntities_SubEntityB_RootEntityId" FOREIGN KEY ("SubEntityB_RootEntityId") REFERENCES "RootEntities" ("Id") ON DELETE CASCADE,
CONSTRAINT "FK_BaseEntities_RootEntities_SubEntityC_RootEntityId" FOREIGN KEY ("SubEntityC_RootEntityId") REFERENCES "RootEntities" ("Id") ON DELETE CASCADE
);
-- Note there's no filter defined; each index contains the full amount
-- of rows in the table (so even if 99.9% of the table consists is SubEntityB rows,
-- all indices are equally large in terms of bytes, wasting RAM/shared_buffers)
CREATE UNIQUE INDEX "IX_BaseEntities_RootEntityId" ON "BaseEntities" ("RootEntityId");
CREATE UNIQUE INDEX "IX_BaseEntities_SubEntityB_RootEntityId" ON "BaseEntities" ("SubEntityB_RootEntityId");
CREATE UNIQUE INDEX "IX_BaseEntities_SubEntityC_RootEntityId" ON "BaseEntities" ("SubEntityC_RootEntityId");Stack traces
Verbose output
EF Core version
9.0.2
Database provider
Npgsql.EntityFrameworkCore.PostgreSQL
Target framework
.NET10
Operating system
MacOS Tahoe 26.3
IDE
Rider 2025.3.1