yq (source) - Yertl

Back to documentation
#!/usr/bin/env perl
package yq;
our $VERSION = '0.038';
# ABSTRACT: Filter YAML through a command-line program

use ETL::Yertl;
use Pod::Usage::Return qw( pod2usage );
use Getopt::Long qw( GetOptionsFromArray );
use ETL::Yertl::Command::yq;

$|++; # no buffering

sub main {
    my ( $class, @argv ) = @_;

    my %opt = (
        verbose => $ENV{YERTL_VERBOSE},
    );

    GetOptionsFromArray( \@argv, \%opt,
        'help|h',
        'verbose|v+',
        'version',
        'xargs|x',
    );
    return pod2usage(0) if $opt{help};
    if ( $opt{version} ) {
        print "yq version $yq::VERSION (Perl $^V)\n";
        return 0;
    }

    eval {
        ETL::Yertl::Command::yq->main( @argv, \%opt );
    };
    if ( $@ ) {
        return pod2usage( "ERROR: $@" );
    }
    return 0;
}

exit __PACKAGE__->main( @ARGV ) unless caller(0);

=head1 SYNOPSIS

    yq [-vx] <script> [<file>...]

    yq -h|--help|--version

=head1 DESCRIPTION

This program takes a stream of YAML documents (on STDIN or file arguments),
applies a filter, then writes the results to STDOUT.

=head1 ARGUMENTS

=head2 script

The script to run. For the script syntax, see L<SYNTAX>.

=head2 <file>

A YAML file to filter. The special file "-" refers to STDIN. If no files are
specified, filter STDIN.

=head1 OPTIONS

=head2 -v | --verbose

Set verbose mode to print out some internal program messages on STDERR to help
with debugging.

=head2 -x | --xargs

xargs mode. When the filter returns only a single item, simply print it out without
using the serializer. This allows single values to be piped into other programs that
may not know how to deal with serialized data, like xargs.

=head2 -h | --help

Show this help document.

=head2 --version

Print the current yq and Perl versions.

=head1 SYNTAX

=head2 EXPRESSIONS

An C<EXPRESSION> is allowed to be either a L<FILTER>, L<VALUE>, or a L<COMPARISON>.

=head2 FILTERS

Filters select a portion of the incoming documents. Filters can be combined
to reach deep inside the documents you're working with.

=over

=item .

Returns the entire document, unfiltered. Useful in if/then statements.

    # INPUT
    foo: bar
    baz: fuzz

    $ yq .
    foo: bar
    baz: fuzz

=item .key

Extract a single item out of a hash.

    # INPUT
    foo:
        bar: baz
        fizz: fuzz

    $ yq .foo
    bar: baz
    fizz: fuzz

    $ yq .foo.fizz
    fuzz

=item .[#]

Extract a single item out of an array.

    # INPUT
    - foo: fuzz
    - bar: buzz
    - baz:
        good: bad
        new: old

    $ yq .[1]
    bar: buzz

    $ yq .[2]
    baz:
        good: bad
        new: old

    $ yq .[2].baz
    good: bad
    new: old

    $ yq .[2].baz.new
    old

=item []

Use [] with no index to flatten an array.

    # INPUT
    - foo: fuzz
    - bar: buzz

    $ yq '.[]'
    foo: fuzz
    ---
    bar: buzz

=item $.

C<$.> is the whole original document before any pipe operators changed
what part of the document we're working with.

=item Assignment

Filters can be assigned new values. The new value can be another filter
or any L</VALUES>.

    # INPUT
    foo: fuzz
    bar: baz
    ---
    foo: buzz
    bar: taz

    $ yq '.foo = 2'
    foo: 2
    bar: baz
    ---
    foo: 2
    bar: taz

Multiple assignments can be combined with C<|>:

    $ yq '.foo = .bar | .bar = "foo"'
    foo: baz
    bar: foo
    ---
    foo: taz
    bar: foo

=back

=head2 VALUES

=over

=item 'STRING' "STRING"

Both single- and double-quoted strings are allowed. Using \ will escape
the string delimiter.

=item { KEY: EXPRESSION, ... }

The hash constructor. C<KEY> may be any C<FILTER> or a bare value.

    # INPUT
    foo: bar
    baz: fuzz
    ---
    foo: 1
    baz: 2

    $ yq '{ bar: .foo, .baz: foo }'
    bar: bar
    fuzz: foo
    ---
    2: foo
    bar: 1

=item [ EXPRESSION, ... ]

The array constructor.

    # INPUT
    foo: bar
    baz: fuzz
    ---
    foo: 1
    baz: 2

    $ yq '[ .foo, .baz ]'
    - bar
    - fuzz
    ---
    - 1
    - 2

=item empty

The special value empty suppresses printing of a document. Normally,
an undefined document will show up in the output as "--- ~". If your
filter instead yields empty, the document will not be printed at all.

This is especially useful in conditionals:

    # INPUT
    foo: bar
    baz: fuzz

    $ yq 'if .foo eq bar then . else empty'
    foo: bar
    baz: fuzz

    $ yq 'if .foo eq buzz then . else empty'
    $

... though see the C<grep()> function for a shorter way of writing this.

=item Values

Any bareword that is not recognized as a syntax element is treated as a value.
These barewords may only contain letters, numbers, and underscore.

B<NOTE>: This may be subject to change to only allow quoted strings and bare
numbers in a future version.

=back

=head2 COMPARISONS

=over

=item eq

String equals comparison. Returns true if both sides are equal to each other
when treated as a string.

The two sides may be L<FILTERS> or L<VALUES>.

    # INPUT
    foo: bar
    baz: fuzz
    buzz: fuzz

    $ yq '.foo eq bar'
    true

    $ yq '.baz eq .buzz'
    true

    $ yq '.baz eq bar'
    false

YAML treats the string "true" as a true value, and the string "false" as a
false value.

=item ne

String not equals comparison. Returns true if one side is not equal to the
other side when compared as a string.

The two sides may be L<FILTERS> or L<VALUES>.

    # INPUT
    foo: bar
    baz: fuzz
    buzz: fuzz

    $ yq '.foo eq bar'
    true

    $ yq '.baz eq .buzz'
    true

    $ yq '.baz eq bar'
    false

YAML treats the string "true" as a true value, and the string "false" as a
false value.

=item ==

Numeric equals comparison. Returns true if both sides are equal to each other
when treated as numbers. If one of the items is not a number, will print a
warning to STDERR but try to compare anyway.

The two sides may be L<FILTERS> or L<VALUES>.

    # INPUT
    one: 1
    two: 2
    uno: 1

    $ yq '.one == 1'
    true

    $ yq '.one == 2'
    false

    $ yq '.one == .uno'
    true

=item !=

Numeric not equals comparison. Returns true if both sides are equal to each
other when treated as numbers. If one of the items is not a number, will print
a warning to STDERR but try to compare anyway.

The two sides may be L<FILTERS> or L<VALUES>.

    # INPUT
    one: 1
    two: 2
    uno: 1

    $ yq '.two != 1'
    true

    $ yq '.two != 2'
    false

    $ yq '.one != .uno'
    false

=item > / >=

Numeric greater-than (or equal-to) comparison. Returns true if the left side is
greater than (or equal-to) the right side. If one of the items is not a number,
will print a warning to STDERR but try to compare anyway.

The two sides may be L<FILTERS> or L<VALUES>.

    # INPUT
    one: 1
    two: 2
    uno: 1

    $ yq '.two >= 1'
    true

    $ yq '.two > 2'
    false

    $ yq '.one >= .uno'
    true

=item < / <=

Numeric less-than (or equal-to) comparison. Returns true if the left side is
less than (or equal-to) the right side. If one of the items is not a number,
will print a warning to STDERR but try to compare anyway.

The two sides may be L<FILTERS> or L<VALUES>.

    # INPUT
    one: 1
    two: 2
    uno: 1

    $ yq '.two <= 1'
    false

    $ yq '.two < 2'
    false

    $ yq '.one <= .uno'
    true

=back

=head2 FUNCTIONS

=over

=item length( EXPRESSION )

Returns the length of the thing returned by EXPRESSION. Depending on what type
of thing that is:

    string/number   - Returns the number of characters
    array           - Returns the number of items
    hash            - Returns the number of pairs

If EXPRESSION is missing, gives the length of the entire document (C<length(.)>).
Returns a number suitable for assignment.

Although length() takes an expression, certain constructs are redundant:

    length( keys( EXPRESSION ) ) -> length( EXPRESSION )
    # length() works on hashes

A future version may optimize these away, or warn you of their redundancy.

    # INPUT
    foo:
        one: 1
        two: onetwothreefourfive
        three: 3
    baz: [ 3, 2, 1 ],

    $ yq 'length(.)'
    2

    $ yq 'length'
    2

    $ yq 'length( .foo )'
    3

    $ yq 'length( .baz )'
    3

    $ yq 'length( .foo.two )'
    19

    $ yq '{ l: length( .foo.two ) }'
    l: 19

=item keys( EXPRESSION )

Return the keys of the hash or the indicies of the array returned by EXPRESSION.
If EXPRESSION is missing, gives the keys of the entire document (C<keys(.)>).

Returns an array suitable for assignment.

    # INPUT
    foo:
        one: 1
        two: 2
        three: 3
    baz: [ 3, 2, 1 ]

    $ yq 'keys( .foo )'
    - one
    - two
    - three

    $ yq 'keys( .baz )'
    - 0
    - 1
    - 2

    $ yq 'keys( . )'
    - foo
    - baz

    $ yq 'keys'
    - foo
    - baz

    $ yq '{ k: keys( .foo ) }'
    k:
        - one
        - two
        - three

=item each( EXPRESSION )

Return a list of key/value pairs for the hash or array given by
EXPRESSION. If EXPRESSION is missing, gives the key/value pairs of the
entire document (C<each(.)>).

    # INPUT
    foo:
        one: 1
        two: 2
        three: 3
    baz: [ 3, 2, 1 ]

    $ yq 'each'
    ---
    key: foo
    value:
      one: 1
      two: 2
      three: 3
    ---
    key: baz
    value: [ 3, 2, 1 ]

    $ yq 'each( .foo )'
    ---
    key: one
    value: 1
    ---
    key: two
    value: 2
    ---
    key: three
    value: 3

    $ yq 'each( .baz )'
    ---
    key: 0
    value: 3
    ---
    key: 1
    value: 2
    ---
    key: 2
    value: 1

The documents created by each can be piped to further filters.

=item grep( EXPRESSION )

If C<EXPRESSION> is true, return the current document. Otherwise, return C<empty>.

This is exactly the same as:

    if EXPRESSION then . else empty

=item select( EXPRESSION )

Another name for C<grep()> to match C<jq>'s syntax.

=item group_by( EXPRESSION )

Group incoming documents based on the result of C<EXPRESSION>, yielding a single
document containing a hash of arrays.

    # INPUT
    ---
    foo: 'bar'
    baz: 1
    ---
    foo: 'bar'
    baz: 2
    ---
    foo: 'baz'
    baz: 3

    $ yq 'group_by( .foo )'
    bar:
        - foo: bar
          baz: 1
        - foo: bar
          baz: 2
    baz:
        - foo: baz
          baz: 3

NOTE: If you are filtering a lot of documents, this will consume a lot of memory.

=item parse_time( EXPRESSION, FORMAT )

Parse the date/time string in C<EXPRESSION> and return the number of
seconds since the UNIX epoch (1970-01-01 00:00:00). C<FORMAT> is
optional, and will try to guess what format the date/time is in.

C<FORMAT> may be one of the following:

=over

=item iso

An ISO8601 date/time string with a 4-digit year. 2-digit years, week
numbers, and year days are not (yet) supported.

    2017-01-01T00:00:00
    2017-01-01 00:00:00
    2017-01-01
    20170101000000
    201701010000
    20170101

=item apache

The date/time format used by the Common Log Format (Apache HTTP logs).

    01/Jan/2017:00:00:00

=back

    # INPUT
    timestamp: 2017-01-01 00:00:00

    $ yq '.timestamp = parse_time( .timestamp, "iso" )'
    timestamp: 1483228800

    $ yq '.timestamp = parse_time( .timestamp )'
    timestamp: 1483228800

=head2 CONDITIONALS

=over

=item if EXPRESSION then TRUE_FILTER else FALSE_FILTER

If the C<EXPRESSION> is true, return the result of C<TRUE_FILTER>, otherwise
return the result of C<FALSE_FILTER>.


    # INPUT
    foo: bar
    baz: fuzz

    $ yq 'if .foo eq bar then .baz else .foo'
    fuzz

    $ yq 'if .foo eq buzz then .baz else .foo'
    bar

    $ yq 'if .foo then .baz'
    fuzz

    $ yq 'if .herp then .derp else .'
    foo: bar
    baz: fuzz

The C<else FALSE_FILTER> is optional and defaults to returning undefined.

=back

=head2 COMBINATORS

Combinators combine multiple expressions to yield one or more documents in the
output stream.

=over

=item ,

Multiple EXPRESSIONS may be separated by commas to yield multiple documents in the
output.

    # INPUT
    foo: bar
    baz: fuzz

    $ yq '.foo, .baz'
    bar
    ---
    fuzz

=item |

Multiple EXPRESSIONS may be separated by pipes to give the output of the left
expression as the input of the right expression (much like how shell pipes
work).

    # INPUT
    foo: bar
    baz: fuzz
    pop: more
    ---
    foo: buzz
    baz: fizz
    pop: jump

    $ yq '{ foo: .foo, val: .pop } | group_by( .foo )'
    bar:
        - foo: bar
          val: more
    buzz:
        - foo: buzz
          val: jump

The above example can be useful to avoid C<group_by> memory issues when dealing
with very large streams: Reduce the size of the working document by keeping
only the keys you want, then group those documents.

=back

=head1 ENVIRONMENT VARIABLES

=over 4

=item YERTL_FORMAT

Specify the default format Yertl uses between commands. Defaults to C<yaml>. Can be
set to C<json> for interoperability with other programs.

=item YQ_VERBOSE

Set the verbosity level. Useful when running the tests.

=back