jq select elements with array not containing string

Tags: jq
By : Thomas
Source: Stackoverflow.com

Now, this is somewhat similar to "jq: select only an array which contains element A but not element B", but it somehow doesn't work for me (which is likely my fault)... ;-)

So here's what we have:

[ {
        "employeeType": "student",
        "cn": "dc8aff1",
        "uid": "dc8aff1",
        "ou": [
            "4210910 #Abg",
            "4210910 Abgang",
            "4240115 5",
            "4240115 5\/5"
        ] }, {
        "employeeType": "student",
        "cn": "160f656",
        "uid": "160f656",
        "ou": [
            "4210910 3",
            "4210910 3a"
        ] } ]

I'd like to select all elements where ou does not contain a specific string, say "4210910 3a" or - which would be even better - where ou does not contain any member of a given list of strings.

By : Thomas


When the input is likely to change, you should make it a parameter to your filter rather than hardcoding it. Also, using contains might not work for you in general. It is applied recursively, and strings are compared by substring, so a partial match counts as a match, which is probably not what you want.

For example:

["10", "20", "30", "40", "50"] | contains(["0"])

is true
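To make the difference concrete, here is a minimal sketch (assuming jq 1.5 or later is on your PATH) that runs the substring-based contains next to an exact membership test written with any/2 on the same array. The variable name result is just for illustration.

```shell
# contains/1 compares strings by substring; any/2 with == tests exact equality.
result=$(jq -nc '["10", "20", "30", "40", "50"] | [contains(["0"]), any(.[]; . == "0")]')
echo "$result"   # prints [true,false]
```

The first value is true because "0" is a substring of "10"; the second is false because no element is exactly "0".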

I would write it like this:

$ jq --argjson ex '["4210910 3a"]' 'map(select(all(.ou[]; $ex[]!=.)))' input.json
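Since $ex is a JSON array, the same command also covers the "list of strings" case from the question. The following is only a sketch: it assumes jq 1.5 or later, uses a trimmed copy of the sample data (only cn and ou) in a scratch file, and appends `| .cn` so that the output stays short.

```shell
# Hypothetical scratch file holding a trimmed copy of the sample data.
input=$(mktemp)
cat > "$input" <<'EOF'
[
  {"cn": "dc8aff1", "ou": ["4210910 #Abg", "4210910 Abgang", "4240115 5", "4240115 5/5"]},
  {"cn": "160f656", "ou": ["4210910 3", "4210910 3a"]}
]
EOF

# One forbidden string: only the second record is dropped.
one=$(jq -c --argjson ex '["4210910 3a"]' \
  'map(select(all(.ou[]; $ex[] != .)) | .cn)' "$input")

# Two forbidden strings, one found in each record: nothing is left.
two=$(jq -c --argjson ex '["4210910 3a", "4240115 5"]' \
  'map(select(all(.ou[]; $ex[] != .)) | .cn)' "$input")

echo "$one"   # prints ["dc8aff1"]
echo "$two"   # prints []
rm -f "$input"
```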

This response addresses the case where .ou is an array and we are given another array of forbidden strings.

For clarity, let's define a filter, intersectq(a;b), that will return true iff the arrays have an element in common:

def intersectq(a;b):
  any(a[]; . as $x | any( b[]; . == $x) );

This is effectively a loop-within-a-loop, but because of the semantics of any/2, the computation will stop once a match has been found. (*)

Assuming $ex is the list of exceptions, then the filter we could use to solve the problem would be:

map(select(intersectq(.ou; $ex) | not))

For example, we could use an invocation along the lines suggested by Jeff:

$ jq --argjson ex '["4210910 3a"]' -f myfilter.jq input.json
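For completeness, here is one way the pieces might be laid out on disk. This is a sketch under assumptions: jq 1.5 or later, a scratch directory standing in for the current directory, and a trimmed version of the sample data. myfilter.jq holds the definition followed by the main filter, and $ex is supplied on the command line.

```shell
# Hypothetical scratch directory standing in for the working directory.
workdir=$(mktemp -d)

# The filter file: the definition of intersectq/2, then the main program.
cat > "$workdir/myfilter.jq" <<'EOF'
def intersectq(a; b):
  any(a[]; . as $x | any(b[]; . == $x));

map(select(intersectq(.ou; $ex) | not))
EOF

# Trimmed sample data (only cn and ou are kept here).
cat > "$workdir/input.json" <<'EOF'
[
  {"cn": "dc8aff1", "ou": ["4210910 #Abg"]},
  {"cn": "160f656", "ou": ["4210910 3", "4210910 3a"]}
]
EOF

kept=$(jq -c --argjson ex '["4210910 3a"]' -f "$workdir/myfilter.jq" "$workdir/input.json")
echo "$kept"   # prints [{"cn":"dc8aff1","ou":["4210910 #Abg"]}]
rm -rf "$workdir"
```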

Now you might ask: why use the any-within-any double loop rather than .[]-within-all double loop? The answer is efficiency, as can be seen using debug:

$ jq -n '[1,2,3] as $a | [1,1] as $b | all( $a[]; ($b[] | debug) != .)'

$ jq -n '[1,2,3] as $a | [1,1] as $b | all( $a[]; . as $x | all( $b[]; debug | $x != .))'

(*) Footnote

Of course intersectq/2 as defined here is still O(m*n) and thus inefficient, but the main point of this post is to highlight the drawback of the .[]-within-all double loop.

By : peak
