Given the following schema
CREATE TABLE categories ( id UNIQUEIDENTIFIER PRIMARY KEY, name NVARCHAR(50) ); CREATE TABLE [group] ( id UNIQUEIDENTIFIER PRIMARY KEY ); CREATE TABLE logger ( id UNIQUEIDENTIFIER PRIMARY KEY, group_id UNIQUEIDENTIFIER, uuid CHAR(17) ); CREATE TABLE data ( id UNIQUEIDENTIFIER PRIMARY KEY, logger_uuid CHAR(17), category_name NVARCHAR(50) );
And the following rules
- Each
data
record references alogger
and acategory
- Each
logger
will always have agroup
- Each
group
can have multiplelogger
s
What would be the optimum way to achieve the a resultset like
category_id | logger_group_count -------------------------------- 12345 4 67890 2 ..... ...
i.e. count the no. of groups for each category where a logger has recorded data?
As an initial stab I came up with:
SELECT g.id, COUNT(DISTINCT(a.id)) AS logger_group_count FROM categories g LEFT OUTER JOIN data d ON d.category_name = g.name INNER JOIN logger s ON s.uuid = d.logger_uuid INNER JOIN group a ON a.id = s.group_id GROUP BY g.id
But is extremely slow (~45s), data
has 400k+ records – here’s the query plan.