What's New in PostgreSQL 14?

October 05, 2021

· Chris Castle

PostgreSQL 14, released on September 30, 2021, introduces many new features and improvements. Two I find particularly interesting are JSON syntax conveniences and the multirange data type. But there are also several noticeable performance improvements that many applications will benefit from without any code changes.

PostgreSQL can store and manipulate JSON data. There are JSON and JSONB data types. They are similar, but most of the time you probably want to use JSONB. The JSON data type is stored as a string whereas JSONB is stored in a decomposed binary format that allows PostgreSQL to query and manipulate nested structures inside the JSON more efficiently. JSONB also allows you to create indexes based on nested values. I'll be using JSONB exclusively here.

Let's create a table and put some JSON data in it.

sql

CREATE TABLE jsonb_example (
  id bigserial,
  data JSONB
);

INSERT INTO
  jsonb_example (data)
VALUES
  ('{"a": {"b": ["foo", "hello"]}}'),
  ('{"a": {"b": ["chris", "render"]}}');

Here's the table we just created.

sql

SELECT * from jsonb_example;

 id |               data
----+-----------------------------------
  1 | {"a": {"b": ["foo", "hello"]}}
  2 | {"a": {"b": ["chris", "render"]}}
(2 rows)

Cool. But what if we want to query nested values within the JSON? Here's how you used to have to do it with PostgreSQL 13.

sql

SELECT data #> '{a,b,1}'
FROM jsonb_example;

 ?column?
----------
 "hello"
 "render"
(2 rows)

The #> is an operator that tells PostgreSQL to return a JSON sub-object. PostgreSQL has a lot of other operators for working with JSON like #>>, ->, and ->>.

The {a,b,1} is a path into the JSON object that says

Grab the value of property a: {"b": ["foo", "hello"]}
Within that, give me the value of property b: ["foo", "hello"]
Within that array, give me the 2nd element: "hello"

With PostgreSQL 14 you can now access JSON sub-objects in a way that more closely matches how other programming languages reference JSON values. This new functionality is called jsonb subscripting. Here's an example returning the same data as above.

sql

SELECT data['a']['b'][1]
FROM jsonb_example;

   data
----------
 "hello"
 "render"
(2 rows)

We can use the same syntax to filter data with a WHERE clause.

sql

SELECT *
FROM jsonb_example
WHERE data['a']['b'][1] = '"render"';

 id |               data
----+-----------------------------------
  2 | {"a": {"b": ["chris", "render"]}}
(1 row)

And we can update values within the JSON object using the same syntax.

sql

UPDATE jsonb_example SET data['a']['b'][0] = '"bar"';

SELECT *
FROM jsonb_example;

 id |              data
----+---------------------------------
  1 | {"a": {"b": ["bar", "hello"]}}
  2 | {"a": {"b": ["bar", "render"]}}
(2 rows)

This new syntax closely matches how you might access nested values in a JSON object with JavaScript.

javascript

const obj = {"a": {"b": ["foo", "hello"]}};
obj['a']['b'][1]; // evaluates to 'hello'

PostgreSQL has had range data types since version 9.2. Now there are multirange data types.

A range data type is useful when you want a single value to represent a range of dates, time, or numbers. For example, you might use a range data type to represent meeting schedules or radio spectrum allocation. With a range data type, you only need a single column to store what would otherwise require two columns--start of range and end of range.

Prior to PostgreSQL 14, that range could only be a single contiguous range. Now with multiranges, multiple discontiguous (but not overlapping) ranges can be stored as a single value.

Let's look at a meeting scheduling example derived from the PostgreSQL documentation.

sql

CREATE TABLE reservation (room int, during tsrange);

INSERT INTO reservation VALUES
    (1108, tsrange('2021-01-01 14:30', '2021-01-01 15:30'));

SELECT * FROM reservation;

 room |                    during
------+-----------------------------------------------
 1108 | ["2021-01-01 14:30:00","2021-01-01 15:30:00")

This row represents a room reservation for a meeting from 14:30 to 15:30 on January 1, 2021. But what if instead of a one hour meeting, we were planning an all-day meeting with a lunch break in the middle? What would a row for that reservation look like?

Prior to PostgreSQL 14's multirange support, it would have to be stored as two rows.

sql

INSERT INTO reservation VALUES
    (1108, tsrange('2021-01-01 09:00', '2021-01-01 12:00')),
    (1108, tsrange('2021-01-01 13:00', '2021-01-01 17:00'));

SELECT * FROM reservation;

 room |                    during
------+-----------------------------------------------
 1108 | ["2021-01-01 09:00:00","2021-01-01 12:00:00")
 1108 | ["2021-01-01 13:00:00","2021-01-01 17:00:00")
(2 rows)

But then we'd need to add a meeting_id column so that we know these two rows represent the same meeting.

With multiranges we only need one row to represent this meeting.

sql

DROP TABLE reservation;
CREATE TABLE reservation (room int, during tsmultirange);

INSERT INTO reservation VALUES
    (1108, tsmultirange(
      tsrange('2021-01-01 09:00', '2021-01-01 12:00'),
      tsrange('2021-01-01 13:00', '2021-01-01 17:00')
    ));

SELECT * FROM reservation;

 room |                                            during
------+-----------------------------------------------------------------------------------------------
 1108 | {["2021-01-01 09:00:00","2021-01-01 12:00:00"),["2021-01-01 13:00:00","2021-01-01 17:00:00")}
(1 row)

A lot has happened behind the scenes to improve PostgreSQL performance. However, PostgreSQL 14's handling of multiple client connections will likely be the performance improvement noticeable to the most people. The folks at pganalyze did some great performance testing using pgbench to verify this. At 5000 active connections, they saw about a 20% improvement in throughput over PostgreSQL 13. At 10,000 connections, that improvement went up to 50%. Of course, these results are dependent on your hardware, schema, data, and queries, so you may not see the same results, but the consensus seems to be that PostgreSQL's handling of concurrent connections has improved significantly.

There are many more new features and improvements, which you can read about in the PostgreSQL 14 release announcement or compare across versions in the feature matrix (deselect all versions except 14 and the one you want to compare to and select "Hide unchanged features").

I'd be remiss not to mention that Render was the first cloud provider I know to support PostgreSQL 14, making it available on the same day PostgreSQL 14 was released and giving our customers access to all these exciting improvements in just a few clicks. Try it out and see how it works for you.

A screenshot of the Render Dashboard while creating a new PostgreSQL 14 database

What's New in PostgreSQL 14?

JSON Conveniences

Multirange Data Type

Improved Handling of Concurrent Connections

And Much More

Render Supports PostgreSQL 14