NAME
ygrok - Build YAML by parsing lines of plain text
SYNOPSIS
ygrok [-l|--loose] <pattern> [<file>...]
ygrok --pattern [<pattern_name> [<pattern>]]
ygrok -h|--help|--version
DESCRIPTION
This program parses lines of plain text and converts each matching line into a document (YAML by default).
ARGUMENTS
pattern
The pattern to match with. Any line that does not match the pattern will be ignored.
See the full documentation for pattern syntax.
file
A file to read. The special file "-" refers to STDIN. If no files are specified, read STDIN.
OPTIONS
-l|--loose
Match anywhere in the line. Normally, the pattern must match the full line; with this option it may match any part of the line, but still only once per line.
--pattern
View, add, and edit patterns. With no arguments, shows all the patterns. With pattern_name, shows the specific pattern or pattern category. With both pattern_name and pattern, adds a custom pattern that can then be used in future patterns.
# Show all patterns
ygrok --pattern
# Show all "NET" patterns
ygrok --pattern NET
# Show the "NET.HOSTNAME" pattern
ygrok --pattern NET.HOSTNAME
# Add a new pattern
ygrok --pattern HOSTS_LINE '%{NET.HOSTNAME:host} %{NET.IPV4:ip}'
# Use the new pattern
ygrok '%{HOSTS_LINE}' < /etc/hosts
-h | --help
Show this help document.
--version
Print the current ygrok and Perl versions.
PATTERNS
A pattern is a match for the entire line (unless --loose is set), splitting the line into fields.
A named ygrok match has the format %{PATTERN_NAME:field_name}. The PATTERN_NAME is one of the available patterns, listed below. The field_name is the field in which to store the matched data.
Additionally, the pattern is a Perl regular expression, so any regular expression syntax will work. Any named captures, written (?<field_name>PATTERN), will be part of the document.
BUILT-IN PATTERNS
The built-in patterns are common patterns that are always available.
- Simple Patterns
  - WORD: A single word (\b\w+\b).
  - DATA: A non-slurpy section of data (.*?).
  - INT: An integer, positive or negative.
  - NUM: A floating-point number, positive or negative, with optional exponent.
- Date/Time Patterns
  - DATE.MONTH: A full or abbreviated month name for the "C" locale (January (Jan), February (Feb), etc.).
  - DATE.ISO8601: An ISO8601 date/time.
  - DATE.HTTP: An RFC822 date/time, used by HTTP.
  - DATE.SYSLOG: A syslog date, like "Jan 01 01:23:45".
- Operating System Patterns
  - OS.USER: A username.
  - OS.PROCNAME: A process name.
- Networking Patterns
  - NET.IPV4: An IPv4 address.
  - NET.IPV6: An IPv6 address.
  - NET.HOSTNAME: A network host name. Either an RFC1101 domain or an IPv4 or IPv6 address.
- URL Patterns
  - URL: A full URL with scheme.
  - URL.PATH: The path part of a URL.
- Log File Patterns
  - LOG.HTTP_COMMON: The Apache Common Log Format.
  - LOG.HTTP_COMBINED: The Apache Combined Log Format.
  - LOG.SYSLOG: The syslog format (RFC 3164).
- POSIX Command Output Patterns
  - POSIX.LS: Parse the output of ls -l.
  - POSIX.PS: Parse the output of ps.
  - POSIX.PSX: Parse the output of ps x and ps w.
  - POSIX.PSU: Parse the output of ps u.
ENVIRONMENT VARIABLES
- YERTL_FORMAT: Specify the default format Yertl uses between commands. Defaults to "yaml". Can be set to "json" for interoperability with other programs.
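A minimal sketch of setting the variable for a shell session follows. The ygrok invocation is left commented out because it needs ygrok installed, and the log path is a hypothetical example.

```shell
# Export once so every Yertl command started from this shell uses JSON.
export YERTL_FORMAT=json

# Hypothetical usage (path is an example only):
# ygrok '%{LOG.SYSLOG}' /var/log/syslog
```

To change the format for a single command instead, prefix that command with the assignment, as in YERTL_FORMAT=json ygrok ...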