Developers / Platform

Salesforce SOQL: Queries and Limits

By Alex Crisp

Salesforce Object Query Language (not to be confused with SQL) – or SOQL (commonly pronounced sock-el or soak-el) – is the query language that powers our interactions with the Salesforce database. While it is commonly used by developers when writing Apex, its uses go far beyond that, allowing both admins and developers to retrieve data from within the org and produce powerful reports on the data when exporting the results.

While most developers have a basic understanding of how to perform queries (and most admins will most likely be able to follow the basic syntax), SOQL can do a lot more than simply retrieving records from the database. Today, let’s explore what makes up a SOQL query and how we can take our queries to the next level.

Anatomy of a Query

The basic syntax of a query looks relatively straight forward; a SELECT statement here and a few fields there, and you have a basic query. However, the syntax has a lot more detail to it than may be initially present when looking at a simple query. 

We can break down queries into different blocks (commonly called clauses), which define which data should be retrieved, and how it should be formatted.

At a minimum, all queries contain a SELECT clause: this is the part of the query that determines which fields to be returned, followed by the FROM keyword to state the object which is being queried.

The SELECT clause can then be followed by one or more of the optional clauses which can drastically alter the results:

  • WHERE clause: used to filter out the records to be returned by providing logical operators which can be combined using the AND or OR operators.
  • TYPEOF clause: used to provide fields to select when referencing polymorphic lookups, e.g. to query Tasks which are related to a specific object.
  • WITH clause: used to specify additional query behavior; this could be enforcing field level permissions via WITH SECURITY_ENFORCED or to specify a data category when querying knowledge articles or questions.
  • ORDER BY clause: does what it says on the tin and orders the query results in a specific way.
  • GROUP BY clause: this clause drastically changes the behavior of the query. Rather than returning individual records as results, the GROUP BY query changes the query into an aggregate query, returning data based on the collective dataset rather than from individual rows. Using a GROUP BY clause enables us to use the HAVING clause, which we shall go into more detail shortly.

Relational Queries

Another powerful feature of SOQL is the ability to span relationships within our queries. This can be as simple as retrieving a parent account’s name, or something more complex, such as retrieving the contacts related to an account. 

These relationship queries come in two types:

Child to Parent

  • These go up a relationship tree, referencing fields of a parent object when we query the child object. 
  • These can go up 5 levels, referencing fields on grandparent, great grandparent, and so on. 
  • These types of relationships are accessed via dot notation, and it is key to remember this uses the relationship name which for custom relationships always ends with __r.

Parent to Child

  • These go down the relationship tree, referencing records and fields of an object which is a child of the query record.
  • These queries can only go down one level. i.e. you could query an account’s contacts, but not also records related to those contacts.
  • These are written via sub queries in the SELECT clause, with the FROM being the relationship name (e.g. Contacts or My_Custom_Objects__r).

Aggregate Queries

Aggregate queries are like magic. They can take massive and complex datasets and condense them down into a few rows, giving you exactly the data you need without having to do any of the hard work yourself! These should be used when you care more about the dataset as a whole, rather than individual records.

They do this through the combined use of the GROUP BY clause and what we call Aggregate Functions that are placed into the SELECT clause.

We can use the following aggregate functions depending on our specific use case:

  • COUNT() / COUNT(fieldName) – Counts the number of rows returned by a query. Usually used with the Id field, or other lookups.
  • COUNT_DISTINCT(fieldName) – Counts the number of unique, nonnull, values for the query.
  • AVG(fieldName) – Returns the average value for a field based on the rows returned by the query.
  • MIN(fieldName) – Returns the minimum value for a field based on the query. Can even be used on picklists, which returns based on the picklist sort order.
  • MAX(fieldName) – Returns the maximum value for a field based on the query. Can even be used on picklists, which returns based on the picklist sort order.
  • SUM(fieldName) – Returns the numeric sum of a field based on the query results.

The GROUP BY clause also has two siblings which can further augment the power of our query while also providing the same data as the standard GROUP BY clause – through also returning various subtotals. Useful when you need to see the breakdown of the data, and not just the grand totals.

GROUP BY ROLLUP(fieldName, fieldName2, ….)

  • Used with multiple field aggregates (e.g. count the number of opportunities in stages and sum their total amount).
  • Takes our aggregate query and also returns the subtotals for each category.
  • Includes a subtotal row for each combination of fields (calculated left to right in the provided fields).
  • Field ordering is important!
  • Has a final grand total row.

The above query would return the following:

AccountStageAmount
Universal ContainersQualifications125
Universal ContainersClosed Won800
Universal Containersnull925
ACMEProposal/Price Quote2500
ACMEClosed Lost500
ACMEnull3000
nullnull3925

GROUP BY CUBE(fieldName, fieldName2, ….)

  • Used with multiple field aggregates (e.g. count the number of opportunities in stages and sum their total amount).
  • Gives us an aggregate for all combinations of group fields.
  • Includes more information than a GROUP BY ROLLUP.
  • Field ordering doesn’t matter.
  • Useful for producing cross tabular reports.

The above query would return the following:

AccountStageAmount
nullnull3925
nullQualification125
Universal ContainersQualification125
Universal ContainersClosed Won800
Universal Containersnull925
ACMEProposal/Price Quote2500
ACMEClosed Lost500
ACMEnull3000

These queries can then be further filtered down through the use of the HAVING clause. For example, let’s say we have an object which we use to tag contacts with specific keywords. We wish to find all the contacts which have tags for ‘United Kingdom’, ‘Salesforce Developer’, and ‘Available’; we can use the following query to find exactly those contacts, without having to do any of the hard work ourselves!

The HAVING clause can also be appended to our GROUP BY ROLLUP and GROUP BY CUBE queries to further customise the exact results returned.

Limits and Limitations

As with most things of Salesforce, SOQL comes with its own set limits. With SOQL queries, these limits can change depending on the context on which you’re running the query, but first, let’s talk about its universal limits:

  • 100k character limit: the entire query, including all clauses, must be less than 100k characters in length.
  • 4000-character limit for strings in the WHERE clause: this is for individual strings, not the WHERE clause as a whole (e.g. a concatenation of Ids).
  • No more than 55 child-to-parent relationships.
  • No more than 5 chained parent-to-child relationships, e.g. “Contact.Account.Owner.Name” is valid with 3 relationships, however “Contact.Account.Parent.Custom_Lookup__r.Owner.Manager.Name” is not, as it traverses 6 relationships.
  • No more than 20 parent-to-child relationships.
  • Parent-to-child relationships can only be performed on the object in the FROM clause, e.g. if querying Accounts, you could return child Contacts, but not child records of those contacts.
  • 120 seconds timeout for queries, however the processing of those results can take up to 30 minutes before timing out.

When working with SOQL, you’ll most likely be using it within Apex, which has a few more restrictions around how you can use SOQL:

  • Maximum 100 queries in a synchronous transaction
  • Maximum 200 queries in an asynchronous transaction 
  • Maximum 50000 rows returned per transaction

Note: Aggregate queries also count towards this limit, however their usage is calculated differently. Each row returned by the query counts as 1 row, regardless of whether the data has been aggregated from several records.

For example, if the following query returned 7 rows “SELECT COUNT(Id), AccountId FROM Contact GROUP BY AccountId”, it would consume only 7 from this limit, regardless of the number of counted contacts.

Summary

SOQL queries are a powerful tool for both developers writing code and for admins wishing to find complex information about an org’s dataset and understanding what makes up a query and how we can extend a queries functionality through the use of various clauses can turn some complex data manipulation into a trivial task.

Exploring your data through some of the more advanced functionalities of Salesforce SOQL can provide greater insights into your data or enable quick and easy customizations within code. Hopefully the above walkthrough of the features and anatomy of a SOQL query can help you in writing your next query for code, or in quickly and easily producing that report that will wow your boss.

The Author

Alex Crisp

Alex is CTO for Seven20, an ISV providing a CRM/ATS built on Salesforce. He has a wealth of experience building everything and anything on the Salesforce platform.

Comments:

    Krishnamurthy
    February 04, 2023 1:07 pm
    Awesome refresher on SOQL. Looking forward to more such dev content.

Leave a Reply