JOIN keyword
QuestDB supports the following types of joins: INNER
, LEFT (OUTER)
, CROSS
,
ASOF
and SPLICE
. FULL
joins are not yet implemented and are on our
roadmap. All supported join types can be combined in a single SQL statement;
QuestDB SQL's optimizer determines the best execution order and algorithms.
There are no known limitations on size of tables or sub-queries used in joins and there are no limitations on number of joins either.
#
SyntaxColumns from joined tables are combined in single row. Columns with the same name originating from different tables will be automatically aliased to create a unique column namespace of the resulting set.
Though it is usually preferable to explicitly specify join conditions, QuestDB
will analyze WHERE
clauses for implicit join conditions and will derive
transient join conditions where necessary.
tip
When tables are joined on column that has the same name in both tables you can
use the ON (column)
shorthand.
#
Execution orderJoin operations are performed in order of their appearance in a SQL query. The following query performs a join on a table with one million rows based on a column from a smaller table with one hundred rows:
The performance of this query can be improved by rewriting the query as follows:
#
Implicit joinsIt is possible to join two tables using the following syntax:
The type of join as well as the column will be inferred from the WHERE
clause,
and may be either an INNER
or CROSS
join. For the example above, the
equivalent explicit statement would be:
#
(INNER) JOIN(INNER) JOIN
is used to return rows from 2 tables where the records on the
compared column have matching values in both tables. JOIN
is interpreted as
INNER JOIN
by default, making the INNER
keyword implicit.
The following query will return the movieId
and the average rating from table
ratings
. It will also add a column for the title
from table movies
. The
corresponding title will be identified based on the movieId
in the ratings
table matching an id
in the movies
table.
#
LEFT (OUTER) JOINLEFT OUTER JOIN
or simply LEFT JOIN
will return all records from the
left table, and if matched, the records of the right table. When there is no
match for the right table, it will return NULL
values in right table fields.
The general syntax is as follows:
An OUTER JOIN
query can also be used to select all rows in the left table that
do not exist in the right table.
#
CROSS JOINCROSS JOIN
will return the Cartesian product of the two tables being joined
and can be used to create a table with all possible combinations of columns. The
following query will return all possible combinations of starters
and
deserts
:
note
CROSS JOIN
does not have an ON
clause.
#
ASOF JOINASOF
joins are used on time series data to join two tables based on timestamp
where timestamps do not exactly match. For a given record at a given timestamp,
it will return the corresponding record in the other table at the closest
timestamp prior to the timestamp in the first table.
note
To be able to leverage ASOF JOIN
, both joined table must have a designated
timestamp
column. To designate a column as timestamp
, please refer to the
CREATE TABLE section.
ASOF
join is performed on tables or result sets that are ordered by time. When
table is created as ordered by time order of records is enforced and timestamp
column name is in table metadata. ASOF
join will use timestamp column from
metadata.
Given the following tables:
ts | ask |
---|---|
2019-10-17T00:00:00.000000Z | 100 |
2019-10-17T00:00:00.200000Z | 101 |
2019-10-17T00:00:00.400000Z | 102 |
ts | bid |
---|---|
2019-10-17T00:00:00.100000Z | 101 |
2019-10-17T00:00:00.300000Z | 102 |
2019-10-17T00:00:00.500000Z | 103 |
An ASOF JOIN
query can look like the following:
The above query returns these results:
timebid | bid | ask |
---|---|---|
2019-10-17T00:00:00.100000Z | 101 | 100 |
2019-10-17T00:00:00.300000Z | 102 | 101 |
2019-10-17T00:00:00.500000Z | 103 | 102 |
Note that there is no ASKS
at timestamp 2019-10-17T00:00:00.100000Z
. The
ASOF JOIN
will look for the value in the BIDS
table that has the closest
timestamp prior to or equal to the target timestamp.
In case tables do not have designated timestamp column, but data is in chronological order, timestamp columns can be specified at runtime:
The query above assumes that there is only one instrument in BIDS
and ASKS
tables and therefore does not use the optional ON
clause. If both tables store
data for multiple instruments ON
clause will allow you to find bids for asks
with matching instrument value:
caution
ASOF
join does not check timestamp order, if data is not in chronological
order, the join result is non-deterministic.
#
LT JOINLT
join is very similar to ASOF
, except that it searches for the last row
from the right table strictly before the row from the left table. There will be
one or no rows joined from the right table per each row from the left table.
Consider the following tables:
ts | ask |
---|---|
2019-10-17T00:00:00.000000Z | 100 |
2019-10-17T00:00:00.300000Z | 101 |
2019-10-17T00:00:00.400000Z | 102 |
ts | bid |
---|---|
2019-10-17T00:00:00.000000Z | 101 |
2019-10-17T00:00:00.300000Z | 102 |
2019-10-17T00:00:00.500000Z | 103 |
An LT JOIN
can be built using the following query:
The query above returns the following results:
timebid | timeask | bid | ask |
---|---|---|---|
2019-10-17T00:00:00.000000Z | NULL | 101 | NULL |
2019-10-17T00:00:00.300000Z | 2019-10-17T00:00:00.000000Z | 102 | 100 |
2019-10-17T00:00:00.500000Z | 2019-10-17T00:00:00.400000Z | 103 | 102 |
note
LT
join is often useful to join a table to itself in order to get preceding
values for every row.
#
SPLICE JOINSPLICE JOIN
is a full ASOF JOIN
. It will return all the records from both
tables. For each record from left table splice join will find prevailing record
from right table and for each record from right table - prevailing record from
left table.
Considering the following tables.
ts | ask |
---|---|
2019-10-17T00:00:00.000000Z | 100 |
2019-10-17T00:00:00.200000Z | 101 |
2019-10-17T00:00:00.400000Z | 102 |
ts | bid |
---|---|
2019-10-17T00:00:00.100000Z | 101 |
2019-10-17T00:00:00.300000Z | 102 |
2019-10-17T00:00:00.500000Z | 103 |
A SPLICE JOIN
can be built as follows:
This query returns the following results:
timebid | bid | ask |
---|---|---|
null | null | 100 |
2019-10-17T00:00:00.100000Z | 101 | 100 |
2019-10-17T00:00:00.100000Z | 101 | 101 |
2019-10-17T00:00:00.300000Z | 102 | 101 |
2019-10-17T00:00:00.300000Z | 102 | 102 |
2019-10-17T00:00:00.500000Z | 103 | 102 |
Note that the above query does not use the optional ON
clause. In case you
need additional filtering on the two tables, the ON
clause can be used as
follows: