=head1 NAME

tablizer - Manipulate tabular output of other programs

=head1 SYNOPSIS

    Usage:
      tablizer [regex,...] [-r file] [flags]
    
    Operational Flags:
      -c, --columns string               Only show the speficied columns (separated by ,)
      -v, --invert-match                 select non-matching rows
      -n, --numbering                    Enable header numbering
      -N, --no-color                     Disable pattern highlighting
      -H, --no-headers                   Disable headers display
      -s, --separator <string>           Custom field separator (maybe char, string or :class:)
      -k, --sort-by <int|name>           Sort by column (default: 1)
      -z, --fuzzy                        Use fuzzy search [experimental]
      -F, --filter <field[!]=reg>        Filter given field with regex, can be used multiple times
      -T, --transpose-columns string     Transpose the speficied columns (separated by ,)
      -R, --regex-transposer </from/to/> Apply /search/replace/ regexp to fields given in -T
      -j, --json                         Read JSON input (must be array of hashes)
      -I, --interactive                  Interactively filter and select rows
      -g, --auto-headers                 Generate headers if there are none present in input
      -x, --custom-headers a,b,...       Use custom headers, separated by comma

    Output Flags (mutually exclusive):
      -X, --extended                     Enable extended output
      -M, --markdown                     Enable markdown table output
      -O, --orgtbl                       Enable org-mode table output
      -S, --shell                        Enable shell evaluable output
      -Y, --yaml                         Enable yaml output
      -J, --jsonout                      Enable JSON output
      -C, --csv                          Enable CSV output
      -A, --ascii                        Default output mode, ascii tabular
      -P, --template <tpl>               Enable template mode with template <tpl>
      -L, --hightlight-lines             Use alternating background colors for tables
      -o, --ofs <char>                   Output field separator, used by -A and -C. 
      -y, --yank-columns                 Yank specified columns (separated by ,) to clipboard,
                                         space separated

    Sort Mode Flags (mutually exclusive):
      -a, --sort-age                     sort according to age (duration) string
      -D, --sort-desc                    Sort in descending order (default: ascending)
      -i, --sort-numeric                 sort according to string numerical value
      -t, --sort-time                    sort according to time string

    Other Flags:
      -r  --read-file <file>             Use <file> as input instead of STDIN
          --completion <shell>           Generate the autocompletion script for <shell>
      -f, --config <file>                Configuration file (default: ~/.config/tablizer/config)
      -d, --debug                        Enable debugging
      -h, --help                         help for tablizer
      -m, --man                          Display manual page
      -V, --version                      Print program version


=head1 DESCRIPTION

Many  programs generate  tabular output.   But sometimes  you need  to
post-process these tables, you may need  to remove one or more columns
or you  may want to filter  for some pattern (See  L<PATTERNS>) or you
may need the  output in another program and need  to parse it somehow.
Standard unix tools such as awk(1), grep(1) or column(1) may help, but
sometimes it's a tedious business.

Let's take  the output of  the tool  kubectl.  It contains  cells with
withespace and they do not separate columns by TAB characters. This is
not easy to process.

You can use B<tablizer> to do these and more things.

B<tablizer>  analyses the  header  fields of  a  table, registers  the
column positions of  each header field and separates  columns by those
positions.

Without any options it reads its input from C<STDIN>, but you can also
specify a  file as a  parameter. If you want  to reduce the  output by
some regular expression,  just specify it as its  first parameter. You
may also  use the  B<-v> option  to exclude all  rows which  match the
pattern. Hence:

   # read from STDIN
   > kubectl get pods | tablizer

   # read a file
   > tablizer -r filename

   # search for pattern in a file (works like grep)
   > tablizer regex -r filename

   # search for pattern in STDIN
   > kubectl get pods | tablizer regex

The output looks like the original  one. You can add the option B<-n>,
then every header field will have a numer associated with it, e.g.:

   NAME(1) READY(2) STATUS(3) RESTARTS(4) AGE(5)

These numbers denote the column and  you can use them to specify which
columns you want to have in your output (see L<COLUMNS>:

   > kubectl get pods | tablizer -c1,3

You can specify the numbers in any order but output will always follow
the original order.

However, you may also just use the header names instead of numbers,
eg:

   > kubectl get pods | tablizer -cname,status

You can also use regular expressions with B<-c>, eg:

   > kubectl get pods | tablizer -c '[ae]'  

By  default tablizer  shows  a  header containing  the  names of  each
column.  This can  be disabled using the B<-H> option.   Be aware that
this only affects tabular output modes.  Shell, Extended, Yaml and CSV
output modes always use the column names.

By  default, if  a  B<pattern>  has been  speficied,  matches will  be
highlighted. You can disable this behavior with the B<-N> option.

Use the  B<-k> option to specify  by which column to  sort the tabular
data  (as in  GNU  sort(1)).  The default  sort  column  is the  first
one. You can specify column numbers or names. Column numbers start
with 1, names are case insensitive. You can specify multiple columns
separated by comma to sort, but the type must be the same. For example
if you want to sort numerically, all columns must be numbers. If you
use column numbers, then be aware, that these are the numbers before
column extraction. For example if you have a table with 4 columns and
specify C<-c4>, then only 1 column (the fourth) will be printed,
however if you want to sort by this column, you'll have to specify
C<-k4>.

The default  sort order  is ascending.  You can  change this  to
descending order using the option B<-D>.  The default sort order is by
alphanumeric string, but there are other sort modes:

=over

=item B<-a --sort-age>

Sorts duration strings like "1d4h32m51s".

=item B<-i --sort-numeric>

Sorts numeric fields.

=item B<-t --sort-time>

Sorts timestamps.

=back

Finally the  B<-d> option  enables debugging  output which  is mostly
useful for the developer.

=head2 SEPARATOR

The option B<-s> can be a single character, in which case the CSV
parser will be invoked. You can also specify a string as
separator. The string will be interpreted as literal string unless it
is a valid go regular expression. For example:

    -s '\t{2,}\'

is being used as a regexp and will match two or more consecutive tabs.

    -s 'foo'

on the other hand is no regular expression and will be used literally.

To make live easier, there are a couple of predefined regular
expressions, which you can specify as classes:

=over

* 		:tab:      

Matches a tab and eats spaces around it.

*		:spaces:

Matches 2 or more spaces.

*		:pipe:

Matches a pipe character and eats spaces around it.

*		:default:

Matches 2 or more spaces or tab. This is the default separator if none
is specified.

*		:nonword:

Matches a non-word character.

*		:nondigit:

Matches a non-digit character.

*		:special:

Matches one or more special chars like brackets, dollar sign, slashes etc.

*		:nonprint:

Matches one or more non-printable characters.


=back

=head2 PATTERNS AND FILTERING

You can reduce  the rows being displayed by using  one or more regular
expression patterns.   The regexp  language being used  is the  one of
GOLANG,     refer    to     the    syntax     cheat    sheet     here:
L<https://pkg.go.dev/regexp/syntax>.

If you  want to  read a more  comprehensive documentation  about the
topic and have perl installed you can read it with:

    perldoc perlre

Or read it online: L<https://perldoc.perl.org/perlre>. But please note
that the GO regexp engine does NOT support all perl regex terms,
especially look-ahead and look-behind.

If you want to supply flags to a regex, then surround it with slashes
and append the flag. The following flags are supported:

    i => case insensitive
    ! => negative match

Example for a case insensitive search:

    > kubectl get pods -A | tablizer "/account/i"

If you use the C<!> flag, then the regex match will be negated, that
is, if a line in the input matches the given regex, but C<!> is
supplied, tablizer will NOT include it in the output.

For example, here we want to get all lines matching "foo" but not
"bar":

    cat table | tablizer foo '/bar/!'

This would match a line "foo zorro" but not "foo bar".

The flags can also be combined.

You  can also use  the experimental  fuzzy search  feature by  providing the
option B<-z>, in which case the  pattern is regarded as a fuzzy search
term, not a regexp.

Sometimes you want to  filter by one or more columns.  You can do that
using the B<-F> option. The option can be specified multiple times and
has the following format:

    fieldname=regexp

Fieldnames (== columns headers) are case insensitive.

If you specify more than one filter, both filters have to match (AND
operation).

These field filters can also be negated:

    fieldname!=regexp

If the option B<-v> is specified, the filtering is inverted.

=head2 INTERACTIVE FILTERING

You can also use the interactive mode, enabled with C<-I> to filter
and select rows. This mode is complementary, that is, other filter
options are still being respected.

To enter e filter, hit C</>, enter a filter string and finish with
C<ENTER>. Use C<SPACE> to select/deselect rows, use C<a> to select all
(visible) rows.

Commit your selection with C<q>. The selected rows are being fed to
the requested output mode as usual. Abort with C<CTRL-c>, in which
case the results of the interactive mode are being ignored and all
rows are being fed to output.

=head2 COLUMNS

The  parameter  B<-c>  can  be  used  to  specify,  which  columns  to
display.  By default  tablizer numerizes  the header  names and  these
numbers can  be used to specify  which header to display,  see example
above.

However, beside  numbers, you  can also  use regular  expressions with
B<-c>, also  separated by comma. And  you can mix column  numbers with
regexps.

Lets take this table:

        PID TTY          TIME CMD
      14001 pts/0    00:00:00 bash
      42871 pts/0    00:00:00 ps
      42872 pts/0    00:00:00 sed

We want to see only the CMD column and use a regex for this:

    > ps | tablizer -s '\s+' -c C
    CMD(4)
    bash
    ps
    tablizer
    sed

where "C" is our regexp which matches CMD.

If a column specifier doesn't look like a regular expression, matching
against header  fields will  be case  insensitive. So,  if you  have a
field with  the name C<ID> then  these will all match:  C<-c id>, C<-c
Id>. The same rule applies to the options C<-T> and C<-F>.


=head2 TRANSPOSE FIELDS USING REGEXPS

You can manipulate field contents using regular expressions. You have
to tell tablizer which field[s] to operate on using the option C<-T>
and the search/replace pattern using C<-R>. The number of columns and
patterns must match.

A search/replace pattern consists of the following elements:

    /search-regexp/replace-string/

The separator can be any valid character. Especially if you want to
use a regexp containing the C</> character, eg:

    |search-regexp|replace-string|

Example:
    
    > cat t/testtable2
    NAME  DURATION
    x     10
    a     100
    z     0
    u     4
    k     6
    
    > cat t/testtable2 | tablizer -T2 -R '/^\d/4/' -n
    NAME    DURATION 
    x       40      
    a       400     
    z       4       
    u       4       
    k       4  

=head2 OUTPUT MODES

There might be cases  when the tabular output of a  program is way too
large  for your  current  terminal but  you still  need  to see  every
column.   In such  cases the  B<-o extended>  or B<-X>  option can  be
useful which enables I<extended mode>. In  this mode, each row will be
printed vertically,  header left,  value right,  aligned by  the field
widths. Here's an example:

    > kubectl get pods | ./tablizer -o extended
        NAME: repldepl-7bcd8d5b64-7zq4l  
       READY: 1/1    
      STATUS: Running  
    RESTARTS: 1 (71m ago)  
         AGE: 5h28m

You can  of course  still use  a regex  to reduce  the number  of rows
displayed.

The option B<-o shell>  can be used if the output  has to be processed
by the shell,  it prints variable assignments for each  cell, one line
per row:

    > kubectl get pods | ./tablizer -o extended ./tablizer -o shell
    NAME="repldepl-7bcd8d5b64-7zq4l" READY="1/1" STATUS="Running" RESTARTS="9 (47m ago)" AGE="4d23h" 
    NAME="repldepl-7bcd8d5b64-m48n8" READY="1/1" STATUS="Running" RESTARTS="9 (47m ago)" AGE="4d23h" 
    NAME="repldepl-7bcd8d5b64-q2bf4" READY="1/1" STATUS="Running" RESTARTS="9 (47m ago)" AGE="4d23h" 

You can use this in an eval loop.

Beside normal  ascii mode  (the default) and  extended mode  there are
more output modes available: B<orgtbl>  which prints an Emacs org-mode
table and  B<markdown> which prints  a Markdown table,  B<yaml>, which
prints  yaml encoding  and B<CSV>  mode, which  prints a  comma separated
value file.

A special output mode ist the B<Template> mode, activated with the
option C<--template>. Template language is the Golang template
language: L<https://pkg.go.dev/text/template>. You can also use lot's
of additional functions from:
L<https://masterminds.github.io/sprig/>. Here's an example:

    > kubectl get pods | tablizer --template "{{.name}} is {{.status}}"
    alertmanager-kube-prometheus-alertmanager-0 is Running
    grafana-fcc54cbc9-bk7s8 is Running


You can use header names as variables.

=head2 PUT FIELDS TO CLIPBOARD

You can let tablizer put fields to the clipboard using the option
C<-y>. This best fits the use-case when the result of your filtering
yields just one row. For example:

    cloudctl cluster ls | tablizer -yid matchbox

If "matchbox" matches one cluster, you can immediately use the id of
that cluster somewhere else and paste it. Of course, if there are
multiple matches, then all id's will be put into the clipboard
separated by one space.

=head2 ENVIRONMENT VARIABLES

B<tablizer> supports  certain environment variables which  use can use
to  influence   program  behavior.   Commandline  flags   have  always
precedence over environment variables.

=over

=item <T_HEADER_NUMBERING> - enable numbering of header fields, like B<-n>.

=item <T_COLUMNS> - comma separated list of columns to output, like B<-c>

=item <NO_COLORS> - disable colorization of matches, like B<-N>

=back

=head2 COMPLETION

Shell completion for command line options  can be enabled by using the
B<--completion>  flag. The  required  parameter is  the  name of  your
shell. Currently supported are: bash, zsh, fish and powershell.

Detailed instructions:

=over

=item Bash:

   source <(tablizer --completion bash)

To load completions for each session, execute once:

  # Linux:
  $ tablizer --completion bash > /etc/bash_completion.d/tablizer

  # macOS:
  $ tablizer --completion bash > $(brew --prefix)/etc/bash_completion.d/tablizer

=item Zsh:

If shell completion is not already enabled in your environment,
you will need to enable it.  You can execute the following once:

  echo "autoload -U compinit; compinit" >> ~/.zshrc

To load completions for each session, execute once:

  $ tablizer --completion zsh > "${fpath[1]}/_tablizer"

You will need to start a new shell for this setup to take effect.

=item fish:

   tablizer --completion fish | source

To load completions for each session, execute once:

   tablizer --completion fish > ~/.config/fish/completions/tablizer.fish

=item PowerShell:

   tablizer --completion powershell | Out-String | Invoke-Expression

To load completions for every new session, run:

   tablizer --completion powershell > tablizer.ps1

and source this file from your PowerShell profile.

=back

=head1 CONFIGURATION AND COLORS

YOu can put certain configuration values into a configuration file in
HCL format. By default tablizer looks for
C<$HOME/.config/tablizer/config>, but you can provide one using the
parameter C<-f>.

In the configuration the following variables can be defined:

    BG             = "lightGreen"
    FG             = "white"
    HighlightBG    = "lightGreen"
    HighlightFG    = "white"
    NoHighlightBG  = "white"
    NoHighlightFG  = "lightGreen"
    HighlightHdrBG = "red"
    HighlightHdrFG = "white"

The following color definitions are available:

black, blue,  cyan, darkGray, default, green,  lightBlue, lightCyan,
lightGreen,   lightMagenta,   lightRed,   lightWhite,   lightYellow,
magenta, red, white, yellow

The Variables B<FG> and B<BG> are being used to highlight matches. The
other *FG and *BG variables are for colored table output (enabled with
the C<-L> parameter).

Colorization can be turned off completely either by setting the
parameter C<-N> or the environment variable B<NO_COLOR> to a true value.


=head1 BUGS

In order to report a bug, unexpected behavior, feature requests
or to submit a patch, please open an issue on github:
L<https://codeberg.org/scip/tablizer/issues>.

=head1 LICENSE

This software is licensed under the GNU GENERAL PUBLIC LICENSE version 3.

Copyright (c) 2022-2024 by Thomas von Dein

This software uses the following GO modules:

=over 4

=item repr (https://github.com/alecthomas/repr)

Released under the MIT License, Copyright (c) 2016 Alec Thomas

=item cobra (https://github.com/spf13/cobra)

Released under the Apache 2.0 license, Copyright 2013-2022 The Cobra Authors

=item dateparse (github.com/araddon/dateparse)

Released under the MIT License, Copyright (c) 2015-2017 Aaron Raddon

=item color (github.com/gookit/color)

Released under the MIT License, Copyright (c) 2016 inhere

=item tablewriter (github.com/olekukonko/tablewriter)

Released under the MIT License, Copyright (c) 201 by Oleku Konko

=item yaml (gopkg.in/yaml.v3)

Released under the MIT License, Copyright (c) 2006-2011 Kirill Simonov

=item bubble-table (https://github.com/Evertras/bubble-table)

Released under the MIT License, Copyright (c) 2022 Brandon Fulljames

=back

=head1 AUTHORS

Thomas von Dein B<tom AT vondein DOT org>

=cut