Speed up bulk-exists checks with Python sets
Posted on Sat 01 August 2020 in Better Django
The .exists() method of Django's ORM is very useful. However, if you need to call it in bulk (think hundreds of thousands of times or more), it becomes a strain on your database. Let's say you are fetching records from an external system, and if a record doesn't exist locally, you need to do something:
for item in external_records:
    if not Data.objects.filter(external_id=item.id).exists():
        pass  # Do something
If you are working with a large quantity of records, this will flood your DB with one query per record. Instead, you could first pull all external_id values from the DB:
existing_ids = Data.objects.all …
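The line above is cut off in this excerpt, so what follows is only a minimal sketch of the general idea, not necessarily the author's exact code. It assumes the IDs are loaded once into a Python set and each external record is then checked against that set in memory. ExternalRecord and find_missing are hypothetical names used for illustration, and a literal set stands in for the database query so the example runs without a Django project.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class ExternalRecord:
    """Hypothetical stand-in for a record fetched from the external system."""
    id: int


def find_missing(external_records: Iterable[ExternalRecord],
                 existing_ids: Set[int]) -> List[ExternalRecord]:
    """Return the records whose id is not present in existing_ids."""
    # Set membership is an in-memory hash lookup, so no query is issued per record.
    return [item for item in external_records if item.id not in existing_ids]


# In a Django project the set could be built with a single query, for example:
#   existing_ids = set(Data.objects.values_list("external_id", flat=True))
# Here a literal set stands in for that query result (an assumption for the demo).
existing_ids = {1, 2, 3}
external_records = [ExternalRecord(id=n) for n in (2, 3, 4, 5)]

missing = find_missing(external_records, existing_ids)
print([item.id for item in missing])  # prints [4, 5]
```

The trade-off is memory: every external_id is held in the Python process at once, which is usually fine for plain integer or short string keys but worth measuring for very large tables.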