Back
August 17, 2022
6
min read

Six Tricks For Writing Complex SQL Queries

By
Eric Sammer
Share this post

SQL is incredibly powerful, but it can be hard to remember what output you get from different parts of a SQL query. Here's my mental cheat-sheet for writing and reading non-trivial SQL queries.

The Tricks

1. The absence of 'group by' means you will get one output record per input, after predicates like 'where' clauses are applied.

2. If a query has a 'group by', you will get one record per group.

3. 'Over' clauses generate a field for each output record. They operate on a window defined by either a) the 'partition by' expression - which works like a 'group by' - or b) a range of records defined by 'rows between'.

4. Subqueries can be put in a bunch of different places in SQL. They can also appear in predicates in the 'where' clause, where their output acts as part of the filtering condition.

5. In a select list, a subquery (usually) runs once, before the outer query, and its output is inserted as a field.

6. In a 'from', a subquery's output acts as an input table. You can alias the subquery and reference its fields.

Here are a few examples of SQL tricks demonstrating these tricks, written for clarity not performance:

Example: Tweet Likes / All Likes

Example: Current Order Price vs Average Over The Last 10

Example: HTTP Errors By Path And Status, And By Path

There are lots (and lots) of intermediate SQL bits that can greatly simplify life. I've oversimplified a bunch of things and focused on some common cases. Performance can vary greatly based on the DBMS, data, indexes, and config, but hopefully, this helps! Join me on the Decodable community slack with questions, suggestions or comments!


You can get started with Decodable for free - our developer account includes enough for you to build a useful pipeline and - unlike a trial - it never expires.

Learn more:

Join the community Slack

📫 Email signup 👇

Did you enjoy this issue of Checkpoint Chronicle? Would you like the next edition delivered directly to your email to read from the comfort of your own home?

Simply enter your email address here and we'll send you the next issue as soon as it's published—and nothing else, we promise!

👍 Got it!
Oops! Something went wrong while submitting the form.
Eric Sammer

Eric Sammer is a data analytics industry veteran who has started two companies, Rocana (acquired by Splunk in 2017), and Decodable. He is an author, engineer, and leader on a mission to help companies move and transform data to achieve new and useful business results. Eric is a speaker on topics including data engineering, ML/AI, real-time data processing, entrepreneurship, and open source. He has spoken at events including the RTA Summit and Current, on podcasts with Software Engineering Daily and Sam Ramji, and has appeared in various industry publications.

SQL is incredibly powerful, but it can be hard to remember what output you get from different parts of a SQL query. Here's my mental cheat-sheet for writing and reading non-trivial SQL queries.

The Tricks

1. The absence of 'group by' means you will get one output record per input, after predicates like 'where' clauses are applied.

2. If a query has a 'group by', you will get one record per group.

3. 'Over' clauses generate a field for each output record. They operate on a window defined by either a) the 'partition by' expression - which works like a 'group by' - or b) a range of records defined by 'rows between'.

4. Subqueries can be put in a bunch of different places in SQL. They can also appear in predicates in the 'where' clause, where their output acts as part of the filtering condition.

5. In a select list, a subquery (usually) runs once, before the outer query, and its output is inserted as a field.

6. In a 'from', a subquery's output acts as an input table. You can alias the subquery and reference its fields.

Here are a few examples of SQL tricks demonstrating these tricks, written for clarity not performance:

Example: Tweet Likes / All Likes

Example: Current Order Price vs Average Over The Last 10

Example: HTTP Errors By Path And Status, And By Path

There are lots (and lots) of intermediate SQL bits that can greatly simplify life. I've oversimplified a bunch of things and focused on some common cases. Performance can vary greatly based on the DBMS, data, indexes, and config, but hopefully, this helps! Join me on the Decodable community slack with questions, suggestions or comments!


You can get started with Decodable for free - our developer account includes enough for you to build a useful pipeline and - unlike a trial - it never expires.

Learn more:

Join the community Slack

📫 Email signup 👇

Did you enjoy this issue of Checkpoint Chronicle? Would you like the next edition delivered directly to your email to read from the comfort of your own home?

Simply enter your email address here and we'll send you the next issue as soon as it's published—and nothing else, we promise!

Eric Sammer

Eric Sammer is a data analytics industry veteran who has started two companies, Rocana (acquired by Splunk in 2017), and Decodable. He is an author, engineer, and leader on a mission to help companies move and transform data to achieve new and useful business results. Eric is a speaker on topics including data engineering, ML/AI, real-time data processing, entrepreneurship, and open source. He has spoken at events including the RTA Summit and Current, on podcasts with Software Engineering Daily and Sam Ramji, and has appeared in various industry publications.