Ultimate PostgreSQL Slug Function

Written on 13 May 2019

Having deployed a fair few web apps, APIs and utilities, I’ve grown more familiar and comfortable with PostgreSQL. As that familiarity has grown, I’ve started moving more functionality into the database; particularly using triggers/functions on UPDATE/INSERT as well as pre-creating JSON(B) objects that can effectively skip the application layer logic and be sent straight to the client.

Slug generation is low hanging fruit where PostgreSQL can excel. Moving this functionality to the database gives you the guarantee that no matter how rows are inserted/updated, they will get a consistent slug.

What is a slug?!

If you’re reading this, you probably know, but a slug is a URL/SEO friendly representation of a string. If for example you wrote an article:

The World’s “Best” Cafés!

You’d want a slug function to create a representation that looks a bit like:

the-worlds-best-cafes

Granted browsers can probably handle the accented “e” but it’s good to be over conservative for backwards compatibility.

PostgreSQL slug function

The most comprehensive slug function I’ve seen for Postgres (and in general) is by ianks on Github. Slug functions are one of those things that never seems to have a definitive “best and final” version, but this has been very reliable for me.

It’s verbose so you can feel comfortable modifying it, it’s pretty robust and it leverages the unaccent extension in Postgres to help with transcoding/transliterating non-ASCII characters (i.e. it’ll convert é => e for you).

One limitation of this function, though, is that it expands quotes (single and double) into a hyphen which - for me - is overkill. I’ve subsequently tweaked the function:

CREATE EXTENSION IF NOT EXISTS "unaccent";

CREATE OR REPLACE FUNCTION slugify("value" TEXT)
RETURNS TEXT AS $$
  -- removes accents (diacritic signs) from a given string --
  WITH "unaccented" AS (
    SELECT unaccent("value") AS "value"
  ),
  -- lowercases the string
  "lowercase" AS (
    SELECT lower("value") AS "value"
    FROM "unaccented"
  ),
  -- remove single and double quotes
  "removed_quotes" AS (
    SELECT regexp_replace("value", '[''"]+', '', 'gi') AS "value"
    FROM "lowercase"
  ),
  -- replaces anything that's not a letter, number, hyphen('-'), or underscore('_') with a hyphen('-')
  "hyphenated" AS (
    SELECT regexp_replace("value", '[^a-z0-9\\-_]+', '-', 'gi') AS "value"
    FROM "removed_quotes"
  ),
  -- trims hyphens('-') if they exist on the head or tail of the string
  "trimmed" AS (
    SELECT regexp_replace(regexp_replace("value", '\-+$', ''), '^\-', '') AS "value"
    FROM "hyphenated"
  )
  SELECT "value" FROM "trimmed";
$$ LANGUAGE SQL STRICT IMMUTABLE;

Now you can run the earlier example:

-- Having to escape the quote here to demo...
select slugify('The World''s "Best" Cafés!');

And you get:

| slugify               |
|-----------------------|
| the-worlds-best-cafes |

Using the slug function

Typically you will only want to create a slug when a record is created. If you update the title of something, you probably want to preserve the slug as it may well be part of your websites’ URL structure. You could update your slug each time, but you’d need an audit table to redirect people from the previous slug(s) to the new ones.

Depending on your schema, you can have a function and trigger for each table that requires a slug, OR if you have consistent column names over tables you can create a generic function that faithfully assumes you’ve got a title and a slug column:

CREATE FUNCTION public.set_slug_from_title() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
  NEW.slug := slugify(NEW.title);
  RETURN NEW;
END
$$;

This function returns a trigger, whilst the slugify function returns text. The NEW keyword above is effectively referencing the table row we are updating.

Note that the above function will happily generate duplicate slugs. You could append an ID, hashid or some other bit of random text to shoot for uniqueness.

And finally, to add this trigger to your table(s)…

CREATE TRIGGER "t_news_insert" BEFORE INSERT ON "news" FOR EACH ROW WHEN (NEW.title IS NOT NULL AND NEW.slug IS NULL)
EXECUTE PROCEDURE set_slug_from_title();

You can now INSERT into news a new row with a title set and when you commit, you’ll see a slug appear that is a URL safe, sensible representation of the title.

I’ve forked the original gist and put it here.

Copyright © 2006 - 2019 Kester Dobson

WIP