NAME

ygrok - Build YAML by parsing lines of plain text

SYNOPSIS

    ygrok [-l|--loose] <pattern> [<file>...]
    ygrok --pattern [<pattern_name> [<pattern>]]
    ygrok -h|--help|--version

DESCRIPTION

This program reads lines of plain text and converts each line that matches the given pattern into a document whose fields are the named parts of the match.

ARGUMENTS

pattern

The pattern to match with. Any line that does not match the pattern will be ignored.

See "PATTERNS" below for pattern syntax.

file

A file to read. The special file "-" refers to STDIN. If no files are specified, read STDIN.
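The three ways of supplying input described above can be sketched as follows. The file name words.txt and its contents are made up for illustration; WORD is one of the built-in patterns, and the field name "word" is an arbitrary choice.

```shell
# Hypothetical sample input: two single words and one line that is not a single word.
printf 'alpha\nbeta\nnot a single word\n' > words.txt

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # Read from a named file.
    ygrok '%{WORD:word}' words.txt

    # Read from STDIN implicitly (no file argument).
    ygrok '%{WORD:word}' < words.txt

    # Read from STDIN explicitly with the special file "-".
    cat words.txt | ygrok '%{WORD:word}' -
fi
```

The third sample line is not a single word, so it should not match the pattern as a whole line and is expected to be ignored.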

OPTIONS

-l|--loose

Match anywhere in the line. By default, the pattern must match the entire line. With this option, the pattern may match any part of the line, but each line is still matched at most once.
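A minimal sketch of the difference, using a made-up log line and the built-in NET.IPV4 pattern. The grep commands at the end are only a rough analogy for whole-line versus anywhere matching and say nothing about how ygrok is implemented.

```shell
# Hypothetical log line with an IPv4 address embedded in other text.
line='connection from 192.168.0.10 refused'

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # Default: the pattern must cover the whole line, so this line is
    # expected to be ignored and produce no document.
    printf '%s\n' "$line" | ygrok '%{NET.IPV4:ip}'

    # Loose: the pattern may match anywhere in the line.
    printf '%s\n' "$line" | ygrok --loose '%{NET.IPV4:ip}'
fi

# Rough analogy only: grep -x demands a whole-line match, while plain
# grep accepts a match anywhere in the line.
whole=$(printf '%s\n' "$line" | grep -cxE '[0-9]+(\.[0-9]+){3}' || true)
anywhere=$(printf '%s\n' "$line" | grep -cE '[0-9]+(\.[0-9]+){3}' || true)
echo "whole-line matches: $whole, anywhere matches: $anywhere"
```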

--pattern

View, add, and edit patterns. With no arguments, shows all the patterns. With only pattern_name, shows that specific pattern or pattern category. With both pattern_name and pattern, adds a custom pattern under that name, which can then be used in later patterns.

    # Show all patterns
    ygrok --pattern

    # Show all "NET" patterns
    ygrok --pattern NET

    # Show the "NET.HOSTNAME" pattern
    ygrok --pattern NET.HOSTNAME

    # Add a new pattern (in /etc/hosts the address comes before the name)
    ygrok --pattern HOSTS_LINE '%{NET.IPV4:ip}\s+%{NET.HOSTNAME:host}'

    # Use the new pattern
    ygrok '%{HOSTS_LINE}' < /etc/hosts

-h | --help

Show this help document.

--version

Print the current ygrok and Perl versions.

PATTERNS

A pattern describes an entire line and splits that line into fields.

A named ygrok match has the format: %{PATTERN_NAME:field_name}. PATTERN_NAME is one of the available patterns, listed below. field_name is the name of the field that receives the matched text.

Apart from these named matches, the pattern is an ordinary Perl regular expression, so any Perl regular expression syntax will work. Any named capture, (?<field_name>PATTERN), also becomes a field of the document.
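The two ways of naming a field can be combined in one pattern, as in this sketch. The file settings.txt, its "key=value" contents, and the field names "key" and "value" are all hypothetical.

```shell
# Hypothetical input: two key=value lines and one comment line.
printf 'retries=3\ntimeout=2.5\n# comment\n' > settings.txt

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # %{WORD:key} uses a built-in pattern; (?<value>\S+) is a plain
    # Perl named capture. Both are expected to become fields.
    ygrok '%{WORD:key}=(?<value>\S+)' settings.txt
fi
```

The comment line does not fit the pattern, so it should be ignored. The single quotes keep the shell from interpreting the braces, parentheses, and backslash.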

BUILT-IN PATTERNS

The built-in patterns are common patterns that are always available.

Simple Patterns

WORD

A single word, \b\w+\b.

DATA

A non-greedy section of data, .*?.

INT

An integer, positive or negative.

NUM

A floating-point number, positive or negative, with optional exponent.
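As a sketch, the simple patterns can be chained to split a whitespace-separated record. The file counts.txt, its contents, and the field names are invented for this example.

```shell
# Hypothetical input: a word, an integer, and a floating-point number per line.
printf 'widgets 12 0.75\ngadgets -3 1.5e3\n' > counts.txt

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # Each named match is expected to become one field of the document.
    ygrok '%{WORD:name} %{INT:count} %{NUM:ratio}' counts.txt
fi
```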

Date/Time Patterns

DATE.MONTH

A full or abbreviated month name for the "C" locale (January (Jan), February (Feb), and so on).

DATE.ISO8601

An ISO 8601 date/time.

DATE.HTTP

An RFC 822 date/time, used by HTTP.

DATE.SYSLOG

A syslog date, like "Jan 01 01:23:45".

Operating System Patterns

OS.USER

A username.

OS.PROCNAME

A process name.

Networking Patterns

NET.IPV4

An IPv4 address.

NET.IPV6

An IPv6 address.

NET.HOSTNAME

A network host name. Either an RFC 1101 domain or an IPv4 or IPv6 address.

URL Patterns

URL

A full URL with scheme.

URL.PATH

The path part of a URL.

Log File Patterns

LOG.HTTP_COMMON

The Apache Common Log Format.

LOG.HTTP_COMBINED

The Apache Combined Log Format.

LOG.SYSLOG

The syslog format (RFC 3164).
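A log pattern can be used on its own, without a field name, in the same way the --pattern examples use a custom pattern. The sketch below assumes a file named access.log holding one line in Common Log Format (the well-known sample line from the Apache documentation) and one unrelated line.

```shell
# Hypothetical log file: one Common Log Format line and one unrelated line.
printf '%s\n' \
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326' \
    'this line is not a log entry' > access.log

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # The unrelated line should not match and is expected to be ignored.
    ygrok '%{LOG.HTTP_COMMON}' access.log
fi
```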

POSIX Command Output Patterns

POSIX.LS

Parse the output of "ls -l".

POSIX.PS

Parse the output of "ps".

POSIX.PSX

Parse the output of "ps x" and "ps w".

POSIX.PSU

Parse the output of "ps u".
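These patterns are meant to be fed directly from the corresponding command, as in this sketch. Listing the root directory is an arbitrary choice made so that the listing is never empty.

```shell
# Capture a long listing; "/" is an arbitrary, always non-empty directory.
listing=$(ls -l /)

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # Parse the captured "ls -l" output.
    printf '%s\n' "$listing" | ygrok '%{POSIX.LS}'

    # Parse "ps" output straight from the pipe.
    ps | ygrok '%{POSIX.PS}'
fi
```

Any line that does not match the pattern is ignored, which would typically include the "total" line that "ls -l" prints first.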

ENVIRONMENT VARIABLES

YERTL_FORMAT

Specifies the default format that Yertl commands use to pass documents between each other. Defaults to yaml. Set it to json for interoperability with other programs.
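A short sketch of setting the variable, either for the whole shell session or for a single command. The input word and the field name "word" are arbitrary.

```shell
# Use JSON for every Yertl command started from this shell.
export YERTL_FORMAT=json

# Skip quietly when ygrok is not on PATH.
if command -v ygrok >/dev/null 2>&1; then
    # Picks up the exported value.
    printf 'alpha\n' | ygrok '%{WORD:word}'

    # Override the format for this one command only.
    printf 'alpha\n' | YERTL_FORMAT=yaml ygrok '%{WORD:word}'
fi
```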