Let’s say I have the following two tables:
FILE_TBL ID (primary key) (some other stuff) TAGS_TBL ID (primary key) FILE_ID (references FILE_TBL.FILE_ID, indexed) TAG (varchar, indexed)
This is a many-to-one relationship; that is a single file can have any number of tags. Let’s say I’m searching for all files that have the tag “finance”. There are 3 ways I could do this:
Using a subquery:
select * from FILE_TBL where ID in (select FILE_ID from TAGS_TBL where TAG = 'finance')
Using a join with only on:
select FILE_TBL.* from FILE_TBL join TAGS_TBL on TAGS_TBL.FILE_ID = FILE_TBL.ID and TAGS_TBL.TAG = 'finance'
Using a join with with where:
select FILE_TBL.* from FILE_TBL join TAGS_TBL on TAGS_TBL.FILE_ID = FILE_TBL.ID where TAGS_TBL.TAG = 'finance'
Of these 3 methods, which one would you expect to perform the best on PostgreSQL for Aurora (frontend is based on Postgres 9.6.1, backend is amazon proprietary thing), and why? Note that I’m not actually selecting any data from TAGS_TBL
or using anything from it elsewhere in the query.
The “where in” version with the subquery seems the most readable and straightforward, but my gut says that it’s also the slowest.
Thanks!