Introduction to JQ
Background: Fingers, Head, and Google
Whenever I reach a stopping point in my work, I use a bash alias called gwip¹ to create a ‘work in progress’ commit. It happens without conscious thinking on my part. The same way my fingers know the vim key bindings, they know gwip.
For other actions, I know how they work, but I have to think about them every time. They are in my head, not my fingers.²
However, some things never stick in my head, nor my fingers, and I have to google them every time. jq is one of these.
I know it’s a powerful tool, but I always end up back at Google and then copying and pasting a solution from somewhere. So I solve my problem but never learn the tool.
It’s time to fix that. In this article, I’m going to go over the basic building blocks of jq in enough depth that you will be able to understand how jq works. Of course, you still might occasionally need to head to Google to find a function name or check your syntax, but at least you’ll have a firm grounding in the basics.
What Is JQ
jq is a lightweight, command-line JSON processor. I install it with brew (brew install jq), but it’s a single portable executable, so it’s easy to install on Linux, Windows, or macOS. To use it, you construct one or more filters, and it applies those filters to a JSON document.
The simplest filter is the identity filter (.), which returns all its input:
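As a minimal sketch, the identity filter can be tried like this. The input is a made-up document chosen only for illustration, not the article's original listing:

```shell
# Hypothetical input document, chosen only for illustration.
echo '{"name":"jq","tags":["json","cli"]}' | jq '.'
# The output is the same document, pretty-printed:
# {
#   "name": "jq",
#   "tags": [
#     "json",
#     "cli"
#   ]
# }
```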
This filter is handy for just pretty-printing a JSON document.³ I’m going to ignore the pretty-printing and jump right into using jq to transform JSON documents.
Using JQ to Select Elements
I’m going to use jq to filter the data returned by the GitHub repository API. The data I get back by default looks like this:
jq lets us treat the JSON document as an object and select elements inside of it.
Here is how I filter the JSON document to select the value of the name key:
Similarly, for selecting the value of the owner key:
You can drill in as far as you want like this:
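The three selections above can be sketched as follows. The document is a hypothetical, heavily trimmed stand-in for a repository API response, so its field values are assumptions rather than real data:

```shell
# Hypothetical, trimmed stand-in for a GitHub repository API response.
repo='{"name":"example-repo","owner":{"login":"octocat","id":1}}'

echo "$repo" | jq '.name'         # "example-repo"
echo "$repo" | jq -c '.owner'     # {"login":"octocat","id":1}
echo "$repo" | jq '.owner.login'  # "octocat"
```

Each extra .key step drills one level deeper into the object returned by the previous step.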
What I Learned: Object Identifier-Index
jq lets you select elements in a JSON document like it’s a JavaScript object. Just start with . (for the whole document) and drill down to the value you want. It ends up looking something like this:
Using JQ to Select Arrays
If you curl the GitHub Issues API, you will get back an array of issues:
To get a specific element in the array, give jq an index:
Side Note: Array Indexing in jq
Array indexing has some helpful convenience syntax.
You can select ranges:
You can select one-sided ranges:
Also, you can use negatives to select from the end:
You can use the array index with the object index:
And you can use [] to get all the elements in the array. For example, here is how I would get the titles of the issues returned by my API request:
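Here is a rough sketch of the indexing forms described in this section. It assumes a small hypothetical array in place of the real Issues API response, with only number and title fields:

```shell
# Hypothetical array standing in for the GitHub Issues API response.
issues='[{"number":1,"title":"First"},{"number":2,"title":"Second"},{"number":3,"title":"Third"},{"number":4,"title":"Fourth"}]'

echo "$issues" | jq -c '.[0]'     # {"number":1,"title":"First"}
echo "$issues" | jq -c '.[1:3]'   # a range: the elements at index 1 and 2
echo "$issues" | jq -c '.[:2]'    # a one-sided range: the first two elements
echo "$issues" | jq -c '.[-1]'    # {"number":4,"title":"Fourth"}
echo "$issues" | jq '.[0].title'  # "First"
echo "$issues" | jq '.[].title'   # every title, one per line
```

The -c flag only switches to compact output so that each result fits on one line.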
What I Learned: Array-Index
jq lets you select the whole array ([]), a specific element ([3]), or ranges ([2:5]), and combine these with the object index if needed.
It ends up looking something like this:
Side Note: Removing Quotes From JQ Output
The -r option in jq gives you raw strings if you need that.
The -j option (for join) works like -r but also omits the newline after each output, so the results are joined together.
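A small sketch of the difference between the default output, -r, and -j, using a hypothetical array of strings:

```shell
# Hypothetical array of strings.
echo '["bug","docs"]' | jq '.[]'     # "bug" and "docs", quoted, one per line
echo '["bug","docs"]' | jq -r '.[]'  # bug and docs, unquoted, one per line
echo '["bug","docs"]' | jq -j '.[]'  # bugdocs, with no newline between or after
```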
Putting Elements in an Array using jq
Once you start using the array index to select elements, you have a new problem. The data returned won’t be a valid JSON document. In the example above, the issue titles were newline-delimited:
In fact, whenever you ask jq to return an unwrapped collection of elements, it prints them each on a new line. You can see this by explicitly asking jq to ignore its input and instead return two numbers:
You can resolve this the same way you would turn the text 1,2 into an array in JavaScript: by wrapping it in an array constructor [ ... ].
Similarly, to put a generated collection of results into a JSON array, you wrap it in an array constructor [ ... ].
My GitHub issue title filter (.[].title) then becomes [ .[].title ] like this:
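The steps in this section can be sketched as below. The sketch assumes the -n (--null-input) flag as the way to make jq ignore its input, and uses a hypothetical two-issue array:

```shell
# -n (--null-input) tells jq not to read any input.
jq -n '1,2'       # prints 1 and 2 on separate lines
jq -n -c '[1,2]'  # prints [1,2]

# Hypothetical issues array.
issues='[{"number":1,"title":"First"},{"number":2,"title":"Second"}]'
echo "$issues" | jq '.[].title'         # "First" and "Second" on separate lines
echo "$issues" | jq -c '[ .[].title ]'  # ["First","Second"]
```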
Now I have a valid JSON document.
What I Learned: Array Constructors
If your jq query returns more than one element, they will be returned newline-delimited.
To turn these values into a JSON array, what you do is similar to creating an array in JavaScript: you wrap the values in an array constructor ([...]).
It ends up looking something like this:
Using jq to Select Multiple Fields
The GitHub issues API has a lot of details I don’t care about. I want to select multiple fields from the returned JSON document and leave the rest behind.
The easiest way to do this is using , to specify multiple filters:
But this returns the results of one selection after the other: all the titles first, then all the numbers. To group the values by issue instead, I can factor out the array selector:
This refactoring uses a pipe (|), which I’ll talk about shortly, and runs my object selectors (.title and .number) on each array element.
If you wrap the query in the array constructor you get this:
But this still isn’t the JSON document I need. To get these values into a proper JSON object, I need an object constructor { ... }.
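To make the ordering difference concrete, here is a sketch of the three queries from this section run against a hypothetical two-issue array:

```shell
# Hypothetical issues array.
issues='[{"number":1,"title":"First"},{"number":2,"title":"Second"}]'

# Two independent selections: all titles, then all numbers.
echo "$issues" | jq '.[].title, .[].number'         # "First" "Second" 1 2 (one per line)

# Array selector factored out: title and number for each issue in turn.
echo "$issues" | jq '.[] | .title, .number'         # "First" 1 "Second" 2 (one per line)

# The same query wrapped in an array constructor.
echo "$issues" | jq -c '[ .[] | .title, .number ]'  # ["First",1,"Second",2]
```

The result is valid JSON, but a flat array loses the pairing between each title and its number.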
Putting Elements Into an Object Using jq
Let’s look at some simple examples before showing how my GitHub query can use an object constructor.
A Little Example
I have an array that contains my name (["Adam","Gordon","Bell"]), and I want to turn it into a JSON object like this:
I can select the elements I need using array indexing like this:
To wrap those values into the shape I need, I can replace the values with the array indexes that return them:
Or on a single line like this:
This syntax is the same syntax for creating an object in a JSON document. The only difference is you can use the object and array queries you’ve built up as the values.
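A sketch of this example follows. The key names first_name and last_name are assumptions made for illustration, since the target object is not shown here:

```shell
# Select the two elements by index.
echo '["Adam","Gordon","Bell"]' | jq '.[0], .[2]'
# "Adam"
# "Bell"

# Use the same index expressions as values in an object constructor.
# The key names first_name and last_name are hypothetical.
echo '["Adam","Gordon","Bell"]' | jq -c '{ "first_name": .[0], "last_name": .[2] }'
# {"first_name":"Adam","last_name":"Bell"}
```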
Back To GitHub
Returning to my GitHub API problem, to wrap the number and the title up into an object I use the object constructor like this:
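One way this query might look, assuming a hypothetical issues array with an extra state field that should be left behind:

```shell
# Hypothetical issues array; "state" stands in for the fields I don't care about.
issues='[{"number":1,"title":"First","state":"open"},{"number":2,"title":"Second","state":"open"}]'

echo "$issues" | jq -c '[ .[] | { title: .title, number: .number } ]'
# [{"title":"First","number":1},{"title":"Second","number":2}]
```

Each issue becomes its own object, and the outer array constructor keeps the whole result a single valid JSON document.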
What I Learned: Object Constructors
To put the elements you’ve selected back into a JSON document, you can wrap them in an object constructor { ... }.
If you were building up a JSON object out of several selectors, it would end up looking something like this:
Which is the same syntax for an object in a JSON document, except with jq you can use filters as values.⁴
Sorting and Counting With JQ
The next problem I have is that I want to summarize some of this JSON data. Each issue returned by GitHub has a collection of labels:
jq Built-in Functions
If I want those labels in alphabetical order I can use the built-in sort function. It works like this:
This is similar to how I would sort an array in JavaScript:
Other built-ins that mirror JavaScript functionality are available, like length, reverse, and tostring, and they can all be used in a similar way:
If I can combine these built-ins with the selectors I’ve built up so far, I’ll have solved my label sorting problem. So I’ll show that next.
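As a quick sketch of the built-ins named above, here they are applied to hypothetical inputs (a small array of label names and a bare number):

```shell
# Hypothetical label names.
echo '["bug","docs","api"]' | jq -c 'sort'     # ["api","bug","docs"]
echo '["bug","docs","api"]' | jq 'length'      # 3
echo '["bug","docs","api"]' | jq -c 'reverse'  # ["api","docs","bug"]

# tostring turns a JSON value into a string.
echo '42' | jq 'tostring'                      # "42"
```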
What I Learned: jq Built-Ins
jq has many built-in functions. There are probably too many to remember, but the built-ins tend to mirror JavaScript functions, so give those a try before heading to the jq manual, and you might get lucky.⁵
Pipes and Filters
Before I can use sort to sort the labels from my GitHub API request, I need to explain how pipes and filters work in jq.
jq is a filter in the UNIX command line sense. You pipe (|) a JSON document to it, and it filters it and outputs it to standard out. I could easily use this feature to chain together jq invocations like this:
This is a wordy, though simple, way to determine the length of a string in a JSON document. You can use this same idea to combine various jq built-in functions with the features I’ve shown so far. But there is an easier way. You can use pipes inside of jq, and conceptually they work just like shell pipes:
Here are some more examples:
.title | length will return the length of the title
.number | tostring will return the issue number as a string
.[] | .key will return the values of the key named key in the array (this is equivalent to .[].key)
This means that sorting my labels array is simple. I can just change .labels to .labels | sort:
And if you want just a label count that is easy as well:
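The shell-pipe and jq-pipe versions can be compared with a sketch like this one. It assumes a hypothetical issue whose labels are plain strings, which is simpler than the label objects a real API response may contain:

```shell
# Hypothetical issue; labels are simplified to plain strings.
issue='{"title":"Fix parser","labels":["help wanted","bug"]}'

# Chaining two jq processes with a shell pipe...
echo "$issue" | jq '.title' | jq 'length'  # 10

# ...gives the same result as one jq program with an internal pipe.
echo "$issue" | jq '.title | length'       # 10

# Sorting and counting the labels.
echo "$issue" | jq -c '.labels | sort'     # ["bug","help wanted"]
echo "$issue" | jq '.labels | length'      # 2
```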
What I Learned: Pipes and Filters
Everything in jq is a filter that you can combine with pipes (|). This mimics the behavior of a UNIX shell.
You can use the pipes and the jq built-ins to build complicated transformations from simple operations.
It ends up looking something like this:
Maps and Selects Using JQ
The issues list I was looking at has many low-quality issues in it.⁶ Let’s say I want to grab all the items that are labeled. This would let me skip all the drive-by fix-my-problem issues.
Unfortunately, it’s impossible to do this with the GitHub API unless you specify all the possible labels in your query. However, I can easily do this query on the command line by filtering the results with jq. To do so, I’m going to need a couple more jq functions.
My query so far looks like this:
The first thing I can do is simplify it using map.
map(...) lets you unwrap an array, apply a filter, and then rewrap the results back into an array. You can think of it as a shorthand for [ .[] | ... ], and it comes up quite a bit in my experience, so it’s worth committing to memory.
I can combine that with a select statement that looks like this:
select is a built-in function that takes a boolean expression and only returns elements that match. It’s similar to the WHERE clause in a SQL statement or array filter in JavaScript.
Like map, I find select comes up quite a bit, so while you may have to come back to this article or google it the first few times you need it, with luck, it will start to stick in your memory after that.
Putting this all together looks like this:
This uses three object indexes, two maps, two pipes, a length function, and a select predicate. But if you’ve followed along, this should all make sense. It’s all just composing together filters until you get the result you need.
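A query matching that description might look like the sketch below. It is an assumption about the shape of the final query, run against a hypothetical two-issue array in which only the first issue has a label:

```shell
# Hypothetical issues array; the second issue has no labels.
issues='[{"number":1,"title":"First","labels":[{"name":"bug"}]},{"number":2,"title":"Second","labels":[]}]'

# map(f) is shorthand for [ .[] | f ].
echo "$issues" | jq -c 'map(.title)'       # ["First","Second"]
echo "$issues" | jq -c '[ .[] | .title ]'  # ["First","Second"]

# Build a summary object per issue, then keep only the labeled ones.
echo "$issues" | jq -c 'map({ title: .title, number: .number, labels: (.labels | length) }) | map(select(.labels > 0))'
# [{"title":"First","number":1,"labels":1}]
```

The parentheses around .labels | length are optional in recent jq versions but make the grouping explicit.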
Now let’s talk about how you can put this knowledge into practice.
In Review
What I Learned
Here is what I’ve learned so far:
jq lets you select elements by starting with a . and accessing keys and arrays like it’s a JavaScript object (which it is). This feature uses the object and array indexes that jq creates for a JSON document, and they look like this:
jq programs can contain object constructors { ... } and array constructors [ ... ]. You use these when you want to wrap back up something you’ve pulled out of a JSON document using the above indexes:
jq contains built-in functions (length, sort, select, map) and pipes (|), and you can compose these together just like you can combine pipes and filters at the command line:
Next Steps for Mastering jq
Reading about (or writing about) a tool is not enough to master it. Action is needed. Here is my process for cementing this knowledge:
1. Complete the jq Tutorial
jq-tutorial is not a tutorial at all, but a collection of around 20 interactive exercises that test your knowledge of jq. I’ve found it extremely helpful.
2. Try Using Your Memory First
Whenever I need to extract data or transform a JSON document, I try to do it first without looking anything up. If memory fails me, sometimes jqterm, which has auto-completion, is helpful. Often, I still need to look something up, but science has shown that repeated retrieval yields retention. So over time, my retention should improve.
3. Use It
If you don’t use a tool, you will never master it. So when I have a task that can be solved using jq, then it is what I use. At least for the next little while, even if there is an easier way to do it. Whether it’s exploring a REST API or looking at docker inspect results, JSON is everywhere, so opportunities abound.
4. Learn More
Lastly, to deepen my knowledge, I’m learning about recursive descent, declaring variables, defining functions, and other advanced features found in the manual. Of course, these things rarely come up, but after writing all this, I’m hooked on this tool.
Doing all this isn’t necessary, but if you follow me in some of these steps, I think using jq will become second nature.
Conclusion
So far I’ve only covered the basics of jq. The jq query language is a full programming language and you can do lots of exciting things with it. You can convert from JSON to CSV. You can define your own functions and even find primes with jq:
# Denoting the input by $n, which is assumed to be a positive integer,
# eratosthenes/0 produces an array of primes less than or equal to $n:
def eratosthenes:
  # erase/1 is not defined in the listing as it appears here; this helper is an
  # assumed definition that sets .[i * j] to false for every integer j > 1.
  def erase(i):
    if .[i] then reduce range(2; (1 + length) / i) as $j (.; .[i * $j] = false)
    else .
    end;
  (. + 1) as $n
  | (($n|sqrt) / 2) as $s
  | [null, null, range(2; $n)]
  | reduce (2, 1 + (2 * range(1; $s))) as $i (.; erase($i))
  | map(select(.));
However, simple stuff – like selecting elements, filtering by key, or value – is often all you need.
I hope this helps make jq more approachable and that you no longer have to go to Google every time you want to query a JSON document.⁷