I have these jobs that run in parallel that call a series of queries on my DB. They’re batched so each job is associated with a group of “bar” objects. So in the example below, temp1
is just a temporary table like:
CREATE TEMPORARY TABLE temp1 (id int(11) NOT NULL, PRIMARY KEY (id))
The problem I’m having is queries like this are deadlocking:
explain update foo join temp1 on foo.bar_id = temp1.id set foo.a = null;
Gives the output (please excuse the formatting):
id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 UPDATE foo NULL ALL bar_id NULL NULL NULL 222939 100.00 Using where 1 SIMPLE temp1 NULL eq_ref PRIMARY PRIMARY 4 db.foo.bar_id 1 100.00 Using index
So as I understand it, this will lock every row while updating because of that second line. As a sort of band aid, if I put the IDs to update in a temp table everything works fine and the deadlocks stop:
create temporary table temp2 as ( select foo.id from foo join temp1 on foo.bar_id = temp1.id ); explain update foo join temp2 on foo.id = temp2.id set foo.a = null;
Gives:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra 1 SIMPLE temp2 NULL ALL NULL NULL NULL NULL 1135 100.00 NULL 1 UPDATE foo NULL eq_ref PRIMARY PRIMARY 4 db.temp2.id 1 100.00 NULL
So not only does this fix my problem in practice, that result also looks a lot better.
This is where my MySQL knowledge fails me. I don’t understand why the first query is doing that full scan. I thought it was because it’s not using an index but I tried stuff like force index for join (bar_id)
to no avail.
Is there something better I can do here? I’d at least like to understand what the problem is. Thanks!