The Montoya Herald, a weblog about Blueprint, jQuery, design, music and life, publishing on the web since September 2005. Written by Christian Montoya: developer, designer and entrepreneur.

The Montoya Herald — ChristianMontoya.com

Search

I Recommend

Genesis Rocket

Like What I Do?

My Amazon.com Wish List

On this domain

Elsewhere

MySQL GROUP_CONCAT (this query is insane)

Posted on September 14, 2007.

Last night I was working on a Facebook application for Facebook App Reviews (which I redesigned two nights ago). As you might already know, F.A.R. is a Wordpress-powered site so it has a complex database structure that holds all the posts, categories, and tags. There's a chance that I could have built the application inside of Wordpress but I decided not to do that since it probably would have been complicated and I needed the code for this app to be as flexible and lightweight as possible. So the challenge was as so: get all the reviews from the database, along with their permalinks, ratings, and tags. The first few items were easy; each post has a title and a permalink (which is called 'guid,' a permalink in RSS), and I can join the posts with the wp_post2cat table and just filter the category by ID. For the purposes of my site, posts to categories is a 1:1 relationship, so no problem there. I also join the posts with the wp_postmeta table to get the ratings, which is also 1:1.

The problem came with the tags… this is a one to many relationship, but I wanted only 1 row per review. Doing a standard join across with wp_post2tag and wp_tags table would return multiple rows for each review, with each review holding one single tag. It would look like this:

College Toolkit - external-services
College Toolkit - school
College Toolkit - useful
I Kick You - friends
I Kick You - games

See? Not what I wanted. Now, one solution might have been to use one query to get an array of all the reviews and then loop through this array to do a query for each review and get an array of all the tags that belong to it, but this is exactly what I was trying to avoid since I would have ended up with n+1 queries for n reviews. So, I took my problem to Google in the hopes that I would find an elegant solution.

And I did! The solution is GROUP_CONCAT, which allows you to join all the results from a field that correspond to one value from another field, so just as it sounds, you can concatenate all the fields in a one-to-many relationship and effectively get one row per review. The query requires that you specify which field is the unique 'one' field in the relationship (using GROUP BY) and that you join the 'many' field in the relationship with a SEPARATOR. If that all doesn't make sense just look at this here example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
SELECT 
  wp_posts.post_title AS title, 
  wp_posts.guid AS url, 
  wp_postmeta.meta_value AS rating,
  GROUP_CONCAT(
    wp_tags.tag 
    ORDER BY wp_tags.tag 
    SEPARATOR ", ")
  AS tags
FROM 
  wp_posts, 
  wp_post2cat, 
  wp_postmeta, 
  wp_post2tag, 
  wp_tags 
WHERE 
  wp_posts.ID = wp_post2cat.post_ID 
  AND wp_post2cat.category_id = 1 
  AND wp_posts.ID = wp_postmeta.post_id
  AND wp_postmeta.meta_key = "rating" 
  AND wp_posts.ID = wp_post2tag.post_id 
  AND wp_post2tag.tag_id = wp_tags.tag_ID
GROUP BY wp_posts.ID 
ORDER BY post_date DESC

That's a really long query, but it's perfect because the result is structured exactly as I want it and at this point I have all the data necessary for the reviews without the need for any more queries. To give you an idea, here's what I end up with (omitting 'url' and 'rating' fields):

College Toolkit - external-services, school, useful
I Kick You - friends, games

Isn't that cool? If you disagree, that just means that I am a total nerd and you are normal.

And by the way, Facebook App Reviews has its own Facebook application now… if you have a Facebook account, check it out.

P.S.: if you see any ways I can optimize this query further, let me know! I'm not a SQL expert by any means (just pretty good at it).

Get a trackback link

5 Comments

  1. Jordan T. Cox on September 14, 2007

    That's actually fairly awesome. I wonder if it could be mangled to actually render out the link display for tag listings… oh, what fun!

  2. Christian Montoya on September 14, 2007

    I think I know what you are talking about, but I wouldn't need to do anything in the query for that. All the tag links are of the form:

    base-url / tags / tag-name

    so with the names of the tags I already have the URLs.

  3. MattW on September 15, 2007

    WARNING: GROUP_CONCAT has a max length - I think the default is 256 or 1024. You can alter it by executing:

    SET GROUP_CONCAT_MAX_LEN=65535

    (or pick a number)

  4. Christian Montoya on September 15, 2007

    Thanks MattW, that's really good advice.

  5. Brianblocker on July 19, 2009

    Worth noting is that group_concat works funky with integers. If the items to be concatenated are all integers, you could end up with the results being [BLOB].

    If you have this issue, google how to select an INT as CHAR.

    Nice article!

Leave a comment

Use Markdown or basic HTML. For posting code, use Postable. Please keep comments respectful and on topic.