Speed up bulk-exists check with python sets

Posted on Sat 01 August 2020 in Better Django

The .exists() method of Django's ORM is very useful. However, if you need to call it in bulk (think hundreds of thousands of times or more), it becomes a strain on your database. Let's say you are fetching records from an external system, and if a record doesn't exist locally, you need to do something:

for item in external_records:
    if not Data.objects.filter(external_id=item.id).exists():
        pass  # Do something

If you are working with a large quantity of records, this issues one query per record and floods your DB. Instead, you could first pull all external_id values from the DB:

existing_ids = Data.objects.all …
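As a rough illustration of the idea, here is a minimal sketch that uses plain Python stand-ins instead of a real Django model. The ExternalRecord class and the find_missing helper are hypothetical names invented for this example, and the values_list call in the comment is just one way the ids could be loaded in a single query.

```python
from dataclasses import dataclass


@dataclass
class ExternalRecord:
    # Hypothetical stand-in for a record fetched from the external system.
    id: int


def find_missing(external_records, existing_ids):
    """Return the records whose id does not appear in existing_ids."""
    # Build the set once; each membership test is then O(1) on average.
    existing = set(existing_ids)
    return [item for item in external_records if item.id not in existing]


# With Django, the ids could be loaded in one query, for example:
#   existing_ids = Data.objects.values_list("external_id", flat=True)
# Here a plain list stands in for that queryset (an assumption for the demo).
existing_ids = [1, 2, 3, 5, 8]
external_records = [ExternalRecord(id=n) for n in range(1, 7)]

missing = find_missing(external_records, existing_ids)
print([record.id for record in missing])  # → [4, 6]
```

The trade-off is memory: every id is held in the set at once, which is usually fine for integers but worth checking for very large tables.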

Continue reading

Early Exit

Posted on Sun 12 July 2020 in Better Python

You have almost certainly come across the following pattern:

if post_data:
    thing = post_data.get("thing")
    if thing is not None:
        setting = get_user_setting(user, thing)
        if setting is not None:
            permission = get_user_permission(user)
            if permission is True:
                if user.is_superuser:
                    return DataPoint.objects.all()
                elif user.is_staff:
                    return DataPoint.objects.filter(user=user)
                else:
                    return DataPoint.objects.filter(user=user, thing=thing)
            else:
                return "Permission denied"
        else:
            return "Setting not found"
    else:
        return "Thing not specified"
else:
    return "No data posted"

What is good about the code above? It follows the thought process that most developers would have: which conditions …
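For comparison, here is a sketch of the same checks written with early exits (guard clauses). It is only an illustration: the helper bodies are hypothetical stubs, the user is a SimpleNamespace, and the function returns the filter arguments as a dict in place of the DataPoint querysets so that it runs without Django.

```python
from types import SimpleNamespace


def get_user_setting(user, thing):
    # Hypothetical stub: look the setting up on the fake user object.
    return user.settings.get(thing)


def get_user_permission(user):
    # Hypothetical stub: the fake user carries its own permission flag.
    return user.allowed


def datapoint_filters(post_data, user):
    """Return an error message, or the filter kwargs for the query."""
    if not post_data:
        return "No data posted"

    thing = post_data.get("thing")
    if thing is None:
        return "Thing not specified"

    if get_user_setting(user, thing) is None:
        return "Setting not found"

    if get_user_permission(user) is not True:
        return "Permission denied"

    # Every precondition holds from here on; only the happy path remains.
    if user.is_superuser:
        return {}  # stands in for DataPoint.objects.all()
    if user.is_staff:
        return {"user": user.name}
    return {"user": user.name, "thing": thing}


user = SimpleNamespace(
    name="pete",
    settings={"alerts": "on"},
    allowed=True,
    is_superuser=False,
    is_staff=False,
)
print(datapoint_filters({"thing": "alerts"}, user))
```

Each failure case sits next to the condition that triggers it, and the nesting depth never exceeds one level.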


Continue reading

Faster updating of dictionaries

Posted on Thu 09 July 2020 in Better Python

When updating a dictionary, no one would object to this:

ages["Pete"] = 42

But what about this?

ages["Tom"] = 29
ages["Linda"] = 37
ages["Pam"] = 62
ages["Coraline"] = 21

Still fine, right? There is another way of writing this, using the .update() method.

ages.update(
    {
        "Tom": 29,
        "Linda": 37,
        "Pam": 62,
        "Coraline": 21,
    }
)

Which one do you prefer? Would you say it's a matter of taste? I always preferred the .update() method when there were at least three updates to make. But one day I wondered about the performance of each approach. Let's see if one is slower than …
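One way to run such a comparison yourself is with the standard library's timeit module. This is a minimal sketch, not necessarily the benchmark used in the post; the two function names are made up for the example, and the timings depend on your interpreter and machine, so no expected numbers are shown.

```python
import timeit


def assign_individually():
    # One item assignment per key.
    ages = {}
    ages["Tom"] = 29
    ages["Linda"] = 37
    ages["Pam"] = 62
    ages["Coraline"] = 21
    return ages


def assign_with_update():
    # Build a temporary dict literal and merge it in one call.
    ages = {}
    ages.update(
        {
            "Tom": 29,
            "Linda": 37,
            "Pam": 62,
            "Coraline": 21,
        }
    )
    return ages


# Both approaches must produce the same dictionary for a fair comparison.
assert assign_individually() == assign_with_update()

for func in (assign_individually, assign_with_update):
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.4f}s for 100,000 runs")
```

Running each function many times smooths out noise; repeating the whole measurement a few times gives a better sense of how stable the difference is.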


Continue reading