Compare commits

..

7 Commits

Author     SHA1         Message                                                               Date
           4ce6c30f54   fix short usage formatting                                            2025-10-09 23:16:07 +02:00
T.v.Dein   ec0b210167   add some handy builtin character classes as split separators (#84)   2025-10-09 23:03:57 +02:00
           253ef8262e   fix builder go version                                                2025-10-08 10:36:09 +02:00
           da48994744   fix comment                                                           2025-10-06 23:27:48 +02:00
           39f06fddc8   md fix                                                                2025-10-06 23:02:28 +02:00
T.v.Dein   50a9378d92   use column order of -c when specified (#81)                           2025-10-06 22:55:04 +02:00
T.v.Dein   35b726fee4   Fix json parser (#80)                                                 2025-10-06 22:54:31 +02:00
                        * fix #77: parse floats and nils as well and convert them to string
14 changed files with 358 additions and 69 deletions

View File

@@ -15,7 +15,7 @@ jobs:
       - name: Set up Go
         uses: actions/setup-go@v6
         with:
-          go-version: 1.22.11
+          go-version: 1.24.0
       - name: Build the executables
         run: ./mkrel.sh tablizer ${{ github.ref_name}}

View File

@@ -65,7 +65,7 @@ clean:
 	rm -rf $(tool) releases coverage.out
 
 test: clean
-	go test -cover ./... $(OPTS)
+	go test -count=1 -cover ./... $(OPTS)
 
 singletest:
 	@echo "Call like this: 'make singletest TEST=TestPrepareColumns MOD=lib'"

View File

@@ -192,10 +192,9 @@ hesitate to ask me about it, I'll add it.
 ## Documentation
 
 The documentation is provided as a unix man-page. It will be
-automatically installed if you install from source. However, you can
-read the man-page online:
-
-https://github.com/TLINDEN/tablizer/blob/main/tablizer.pod
+automatically installed if you install from source.
+
+[However, you can read the man-page online](https://github.com/TLINDEN/tablizer/blob/main/tablizer.pod).
 
 Or if you cloned the repository you can read it this way (perl needs
 to be installed though): `perldoc tablizer.pod`.

View File

@@ -27,13 +27,26 @@ import (
     "github.com/hashicorp/hcl/v2/hclsimple"
 )
 
-const DefaultSeparator string = `(\s\s+|\t)`
-const Version string = "v1.5.7"
-const MAXPARTS = 2
+const (
+    Version  = "v1.5.9"
+    MAXPARTS = 2
+)
 
-var DefaultConfigfile = os.Getenv("HOME") + "/.config/tablizer/config"
+var (
+    DefaultConfigfile = os.Getenv("HOME") + "/.config/tablizer/config"
+    VERSION           string // maintained by -x
 
-var VERSION string // maintained by -x
+    SeparatorTemplates = map[string]string{
+        ":tab:":      `\s*\t\s*`,      // tab but eats spaces around
+        ":spaces:":   `\s{2,}`,        // 2 or more spaces
+        ":pipe:":     `\s*\|\s*`,      // one pipe eating spaces around
+        ":default:":  `(\s\s+|\t)`,    // 2 or more spaces or tab
+        ":nonword:":  `\W`,            // word boundary
+        ":nondigit:": `\D`,            // same for numbers
+        ":special:":  `[\*\+\-_\(\)\[\]\{\}?\\/<>=&$§"':,\^]+`, // match any special char
+        ":nonprint:": `[[:^print:]]+`, // non printables
+    }
+)
 
 // public config, set via config file or using defaults
 type Settings struct {

@@ -356,6 +369,13 @@ func (conf *Config) ApplyDefaults() {
     if conf.OutputMode == Yaml || conf.OutputMode == CSV {
         conf.Numbering = false
     }
+
+    if conf.Separator[0] == ':' && conf.Separator[len(conf.Separator)-1] == ':' {
+        separator, ok := SeparatorTemplates[conf.Separator]
+        if ok {
+            conf.Separator = separator
+        }
+    }
 }
 
 func (conf *Config) PreparePattern(patterns []*Pattern) error {

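The new SeparatorTemplates map plus the lookup added to ApplyDefaults() mean that a separator written as :class: is swapped for its regular expression before any parsing happens. A standalone sketch of that pattern (a trimmed-down template map for illustration, not the cfg package itself):

    package main

    import (
        "fmt"
        "regexp"
    )

    // trimmed-down copy of the SeparatorTemplates idea from the diff above;
    // the real map carries more classes
    var templates = map[string]string{
        ":tab:":     `\s*\t\s*`,
        ":pipe:":    `\s*\|\s*`,
        ":default:": `(\s\s+|\t)`,
    }

    // resolve expands a ":class:" name into its regexp, as ApplyDefaults now does;
    // anything else is passed through untouched
    func resolve(sep string) string {
        if len(sep) > 1 && sep[0] == ':' && sep[len(sep)-1] == ':' {
            if tpl, ok := templates[sep]; ok {
                return tpl
            }
        }
        return sep
    }

    func main() {
        re := regexp.MustCompile(resolve(":pipe:"))
        fmt.Printf("%q\n", re.Split("alpha | beta | delta", -1)) // ["alpha" "beta" "delta"]
    }
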
View File

@@ -123,7 +123,7 @@ func Execute() {
         "Use alternating background colors")
     rootCmd.PersistentFlags().StringVarP(&ShowCompletion, "completion", "", "",
         "Display completion code")
-    rootCmd.PersistentFlags().StringVarP(&conf.Separator, "separator", "s", cfg.DefaultSeparator,
+    rootCmd.PersistentFlags().StringVarP(&conf.Separator, "separator", "s", cfg.SeparatorTemplates[":default:"],
         "Custom field separator")
     rootCmd.PersistentFlags().StringVarP(&conf.Columns, "columns", "c", "",
         "Only show the speficied columns (separated by ,)")

View File

@@ -7,7 +7,7 @@ const shortusage = `tablizer [regex,...] [-r file] [flags]
 -T col,... transpose specified columns            -n numberize columns
 -R /from/to/ apply replacement to columns in -T   -N do not use colors
 -y col,... yank columns to clipboard              -H do not show headers
 --ofs char output field separator                 -s specify field separator
 -r file read input from file                      -z use fuzzy search
 -f file read config from file                     -I interactive filter mode
 -d debug

View File

@@ -14,7 +14,7 @@ SYNOPSIS
     -n, --numbering              Enable header numbering
     -N, --no-color               Disable pattern highlighting
     -H, --no-headers             Disable headers display
-    -s, --separator <string>     Custom field separator
+    -s, --separator <string>     Custom field separator (maybe char, string or :class:)
     -k, --sort-by <int|name>     Sort by column (default: 1)
     -z, --fuzzy                  Use fuzzy search [experimental]
     -F, --filter <field[!]=reg>  Filter given field with regex, can be used multiple times

@@ -141,6 +141,57 @@ DESCRIPTION
     Finally the -d option enables debugging output which is mostly useful
     for the developer.
 
+SEPARATOR
+    The option -s can be a single character, in which case the CSV parser
+    will be invoked. You can also specify a string as separator. The string
+    will be interpreted as literal string unless it is a valid go regular
+    expression. For example:
+
+        -s '\t{2,}\'
+
+    is being used as a regexp and will match two or more consecutive tabs.
+
+        -s 'foo'
+
+    on the other hand is no regular expression and will be used literally.
+
+    To make live easier, there are a couple of predefined regular
+    expressions, which you can specify as classes:
+
+    * :tab:
+      Matches a tab and eats spaces around it.
+
+    * :spaces:
+      Matches 2 or more spaces.
+
+    * :pipe:
+      Matches a pipe character and eats spaces around it.
+
+    * :default:
+      Matches 2 or more spaces or tab. This is the default separator if
+      none is specified.
+
+    * :nonword:
+      Matches a non-word character.
+
+    * :nondigit:
+      Matches a non-digit character.
+
+    * :special:
+      Matches one or more special chars like brackets, dollar sign,
+      slashes etc.
+
+    * :nonprint:
+      Matches one or more non-printable characters.
+
 PATTERNS AND FILTERING
     You can reduce the rows being displayed by using one or more regular
     expression patterns. The regexp language being used is the one of

@@ -458,7 +509,7 @@ Operational Flags:
     -n, --numbering              Enable header numbering
     -N, --no-color               Disable pattern highlighting
     -H, --no-headers             Disable headers display
-    -s, --separator <string>     Custom field separator
+    -s, --separator <string>     Custom field separator (maybe char, string or :class:)
     -k, --sort-by <int|name>     Sort by column (default: 1)
     -z, --fuzzy                  Use fuzzy search [experimental]
     -F, --filter <field[!]=reg>  Filter given field with regex, can be used multiple times

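The new SEPARATOR section states that a single-character -s switches tablizer to its CSV parser, while longer strings act as literals or regular expressions. For illustration only, the single-character path could look like this in Go (assuming encoding/csv or something equivalent handles it; this is not tablizer's own code):

    package main

    import (
        "encoding/csv"
        "fmt"
        "strings"
    )

    func main() {
        // hypothetical single-character separator handed to Go's CSV reader
        sep := ';'

        r := csv.NewReader(strings.NewReader("NAME;READY;STATUS\npod-1;1/1;Running"))
        r.Comma = sep

        rows, err := r.ReadAll()
        if err != nil {
            panic(err)
        }

        fmt.Println(rows) // [[NAME READY STATUS] [pod-1 1/1 Running]]
    }
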
View File

@@ -22,7 +22,7 @@ import (
     "fmt"
     "os"
     "regexp"
-    "sort"
+    "slices"
     "strconv"
     "strings"

@@ -30,16 +30,6 @@ import (
     "github.com/tlinden/tablizer/cfg"
 )
 
-func contains(s []int, e int) bool {
-    for _, a := range s {
-        if a == e {
-            return true
-        }
-    }
-
-    return false
-}
-
 func findindex(s []int, e int) (int, bool) {
     for i, a := range s {
         if a == e {

@@ -172,48 +162,32 @@ func PrepareColumnVars(columns string, data *Tabdata) ([]int, error) {
         }
     }
 
-    // deduplicate: put all values into a map (value gets map key)
-    // thereby removing duplicates, extract keys into new slice
-    // and sort it
-    imap := make(map[int]int, len(usecolumns))
-
-    for _, i := range usecolumns {
-        imap[i] = 0
-    }
-
-    // fill with deduplicated columns
-    usecolumns = nil
-    for k := range imap {
-        usecolumns = append(usecolumns, k)
-    }
-
-    sort.Ints(usecolumns)
-
-    return usecolumns, nil
+    // deduplicate columns, preserve order
+    deduped := []int{}
+
+    for _, i := range usecolumns {
+        if !slices.Contains(deduped, i) {
+            deduped = append(deduped, i)
+        }
+    }
+
+    return deduped, nil
 }
 
 // prepare headers: add numbers to headers
 func numberizeAndReduceHeaders(conf cfg.Config, data *Tabdata) {
-    numberedHeaders := []string{}
+    numberedHeaders := make([]string, len(data.headers))
     maxwidth := 0 // start from scratch, so we only look at displayed column widths
 
+    // add numbers to headers if needed, get widest cell width
     for idx, head := range data.headers {
        var headlen int
 
-        if len(conf.Columns) > 0 {
-            // -c specified
-            if !contains(conf.UseColumns, idx+1) {
-                // ignore this one
-                continue
-            }
-        }
-
        if conf.Numbering {
-            numhead := fmt.Sprintf("%s(%d)", head, idx+1)
-            headlen = len(numhead)
-            numberedHeaders = append(numberedHeaders, numhead)
+            newhead := fmt.Sprintf("%s(%d)", head, idx+1)
+            numberedHeaders[idx] = newhead
+            headlen = len(newhead)
        } else {
-            numberedHeaders = append(numberedHeaders, head)
            headlen = len(head)
        }

@@ -222,7 +196,24 @@ func numberizeAndReduceHeaders(conf cfg.Config, data *Tabdata) {
         }
     }
 
-    data.headers = numberedHeaders
+    if conf.Numbering {
+        data.headers = numberedHeaders
+    }
+
+    if len(conf.UseColumns) > 0 {
+        // re-align headers based on user requested column list
+        headers := make([]string, len(conf.UseColumns))
+
+        for i, col := range conf.UseColumns {
+            for idx := range data.headers {
+                if col-1 == idx {
+                    headers[i] = data.headers[col-1]
+                }
+            }
+        }
+
+        data.headers = headers
+    }
 
     if data.maxwidthHeader != maxwidth && maxwidth > 0 {
         data.maxwidthHeader = maxwidth

@@ -234,17 +225,17 @@ func reduceColumns(conf cfg.Config, data *Tabdata) {
     if len(conf.Columns) > 0 {
         reducedEntries := [][]string{}
-        var reducedEntry []string
 
         for _, entry := range data.entries {
-            reducedEntry = nil
+            var reducedEntry []string
 
-            for i, value := range entry {
-                if !contains(conf.UseColumns, i+1) {
-                    continue
+            for _, col := range conf.UseColumns {
+                col--
+
+                for idx, value := range entry {
+                    if idx == col {
+                        reducedEntry = append(reducedEntry, value)
+                    }
                 }
-
-                reducedEntry = append(reducedEntry, value)
             }
 
             reducedEntries = append(reducedEntries, reducedEntry)

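Two behavioural changes land in this file: requested columns are now deduplicated while keeping their command-line order, and headers are re-picked in that same order (commit 50a9378d92, "use column order of -c when specified"). A minimal self-contained sketch of both patterns, with simplified names and without the nested index guard used in the diff:

    package main

    import (
        "fmt"
        "slices"
    )

    // dedupe keeps the first occurrence of each column number and preserves
    // the order in which the user listed them (the old map+sort version did not)
    func dedupe(cols []int) []int {
        deduped := []int{}
        for _, c := range cols {
            if !slices.Contains(deduped, c) {
                deduped = append(deduped, c)
            }
        }
        return deduped
    }

    // reorder picks headers by the 1-based column list, so -c 3,1 shows
    // column 3 first, then column 1
    func reorder(headers []string, usecolumns []int) []string {
        out := make([]string, len(usecolumns))
        for i, col := range usecolumns {
            out[i] = headers[col-1]
        }
        return out
    }

    func main() {
        cols := dedupe([]int{3, 1, 3})
        fmt.Println(cols)                                               // [3 1]
        fmt.Println(reorder([]string{"NAME", "READY", "STATUS"}, cols)) // [STATUS NAME]
    }
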
View File

@@ -19,6 +19,7 @@ package lib
 import (
     "fmt"
+    "slices"
     "testing"
 
     "github.com/stretchr/testify/assert"

@@ -38,7 +39,7 @@ func TestContains(t *testing.T) {
     for _, tt := range tests {
         testname := fmt.Sprintf("contains-%d,%d,%t", tt.list, tt.search, tt.want)
         t.Run(testname, func(t *testing.T) {
-            answer := contains(tt.list, tt.search)
+            answer := slices.Contains(tt.list, tt.search)
 
             assert.EqualValues(t, tt.want, answer)
         })

@@ -72,7 +73,8 @@ func TestPrepareColumns(t *testing.T) {
     }
 
     for _, testdata := range tests {
-        testname := fmt.Sprintf("PrepareColumns-%s-%t", testdata.input, testdata.wanterror)
+        testname := fmt.Sprintf("PrepareColumns-%s-%t",
+            testdata.input, testdata.wanterror)
         t.Run(testname, func(t *testing.T) {
             conf := cfg.Config{Columns: testdata.input}
             err := PrepareColumns(&conf, &data)

View File

@@ -25,6 +25,7 @@ import (
     "fmt"
     "io"
     "log"
+    "math"
     "regexp"
     "strings"

@@ -222,6 +223,32 @@ func parseRawJSON(conf cfg.Config, input io.Reader) (Tabdata, error) {
                 row[idxmap[currentfield]] = val
             }
         }
+    case float64:
+        var value string
+
+        // we set precision to 0 if the float is a whole number
+        if val == math.Trunc(val) {
+            value = fmt.Sprintf("%.f", val)
+        } else {
+            value = fmt.Sprintf("%f", val)
+        }
+
+        if !haveheaders {
+            row = append(row, value)
+        } else {
+            row[idxmap[currentfield]] = value
+        }
+
+    case nil:
+        // we ignore here if a value shall be an int or a string,
+        // because tablizer only works with strings anyway
+        if !haveheaders {
+            row = append(row, "")
+        } else {
+            row[idxmap[currentfield]] = ""
+        }
+
     case json.Delim:
         if val.String() == "}" {
             data = append(data, row)

@@ -240,6 +267,8 @@ func parseRawJSON(conf cfg.Config, input io.Reader) (Tabdata, error) {
             haveheaders = true
         }
         isjson = true
+    default:
+        fmt.Printf("unknown token: %v type: %T\n", t, t)
     }
 
     iskey = !iskey

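The added float64 and nil cases let the JSON parser accept numeric and null values by converting them to strings, which is all tablizer handles internally. The float formatting rule is easy to check in isolation; a small sketch of the same logic:

    package main

    import (
        "fmt"
        "math"
    )

    // floatToCell mirrors the float64 case above: whole numbers drop the
    // fraction, everything else keeps the default %f precision
    func floatToCell(val float64) string {
        if val == math.Trunc(val) {
            return fmt.Sprintf("%.f", val)
        }
        return fmt.Sprintf("%f", val)
    }

    func main() {
        fmt.Println(floatToCell(12))     // 12
        fmt.Println(floatToCell(34.222)) // 34.222000
    }
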
View File

@@ -34,7 +34,7 @@ var input = []struct {
 }{
     {
         name:      "tabular-data",
-        separator: cfg.DefaultSeparator,
+        separator: cfg.SeparatorTemplates[":default:"],
         text: `
 ONE    TWO    THREE
 asd    igig   cxxxncnc

@@ -148,7 +148,7 @@ asd igig
 19191 EDD 1 X`
 
     readFd := strings.NewReader(strings.TrimSpace(table))
-    conf := cfg.Config{Separator: cfg.DefaultSeparator}
+    conf := cfg.Config{Separator: cfg.SeparatorTemplates[":default:"]}
     gotdata, err := wrapValidateParser(conf, readFd)
 
     assert.NoError(t, err)

@@ -180,6 +180,38 @@ func TestParserJSONInput(t *testing.T) {
             expect:    Tabdata{},
         },
+        {
+            // contains nil, int and float values
+            name:      "niljson",
+            wanterror: false,
+            input: `[
+  {
+    "NAME": "postgres-operator-7f4c7c8485-ntlns",
+    "READY": "1/1",
+    "STATUS": "Running",
+    "RESTARTS": 0,
+    "AGE": null,
+    "X": 12,
+    "Y": 34.222
+  }
+]`,
+            expect: Tabdata{
+                columns: 7,
+                headers: []string{"NAME", "READY", "STATUS", "RESTARTS", "AGE", "X", "Y"},
+                entries: [][]string{
+                    []string{
+                        "postgres-operator-7f4c7c8485-ntlns",
+                        "1/1",
+                        "Running",
+                        "0",
+                        "",
+                        "12",
+                        "34.222000",
+                    },
+                },
+            },
+        },
         {
             // one field missing + different order
             // but shall not fail

@@ -282,6 +314,58 @@ func TestParserJSONInput(t *testing.T) {
     }
 }
 
+func TestParserSeparators(t *testing.T) {
+    list := []string{"alpha", "beta", "delta"}
+
+    tests := []struct {
+        input string
+        sep   string
+    }{
+        {
+            input: `🎲`,
+            sep:   ":nonprint:",
+        },
+        {
+            input: `|`,
+            sep:   ":pipe:",
+        },
+        {
+            input: `  `,
+            sep:   ":spaces:",
+        },
+        {
+            input: " \t ",
+            sep:   ":tab:",
+        },
+        {
+            input: `-`,
+            sep:   ":nonword:",
+        },
+        {
+            input: `//$`,
+            sep:   ":special:",
+        },
+    }
+
+    for _, testdata := range tests {
+        testname := fmt.Sprintf("parse-%s", testdata.sep)
+        t.Run(testname, func(t *testing.T) {
+            header := strings.Join(list, testdata.input)
+            row := header
+            content := header + "\n" + row
+
+            readFd := strings.NewReader(strings.TrimSpace(content))
+            conf := cfg.Config{Separator: testdata.sep}
+            conf.ApplyDefaults()
+
+            gotdata, err := wrapValidateParser(conf, readFd)
+
+            assert.NoError(t, err)
+            assert.EqualValues(t, [][]string{list}, gotdata.entries)
+        })
+    }
+}
+
 func wrapValidateParser(conf cfg.Config, input io.Reader) (Tabdata, error) {
     data, err := Parse(conf, input)

View File

@@ -292,6 +292,7 @@ func TestPrinter(t *testing.T) {
             conf.UseSortByColumn = []int{testdata.column}
         }
 
+        conf.Separator = cfg.SeparatorTemplates[":default:"]
         conf.ApplyDefaults()
 
         // the test checks the len!

View File

@@ -133,7 +133,7 @@
 .\" ========================================================================
 .\"
 .IX Title "TABLIZER 1"
-.TH TABLIZER 1 "2025-10-01" "1" "User Commands"
+.TH TABLIZER 1 "2025-10-09" "1" "User Commands"
 .\" For nroff, turn off justification. Always turn off hyphenation; it makes
 .\" way too many mistakes in technical documents.
 .if n .ad l

@@ -152,7 +152,7 @@ tablizer \- Manipulate tabular output of other programs
 \&    \-n, \-\-numbering              Enable header numbering
 \&    \-N, \-\-no\-color               Disable pattern highlighting
 \&    \-H, \-\-no\-headers             Disable headers display
-\&    \-s, \-\-separator <string>     Custom field separator
+\&    \-s, \-\-separator <string>     Custom field separator (maybe char, string or :class:)
 \&    \-k, \-\-sort\-by <int|name>     Sort by column (default: 1)
 \&    \-z, \-\-fuzzy                  Use fuzzy search [experimental]
 \&    \-F, \-\-filter <field[!]=reg>  Filter given field with regex, can be used multiple times

@@ -293,6 +293,62 @@ Sorts timestamps.
 .PP
 Finally the \fB\-d\fR option enables debugging output which is mostly
 useful for the developer.
+.SS "\s-1SEPARATOR\s0"
+.IX Subsection "SEPARATOR"
+The option \fB\-s\fR can be a single character, in which case the \s-1CSV\s0
+parser will be invoked. You can also specify a string as
+separator. The string will be interpreted as literal string unless it
+is a valid go regular expression. For example:
+.PP
+.Vb 1
+\&    \-s \*(Aq\et{2,}\e\*(Aq
+.Ve
+.PP
+is being used as a regexp and will match two or more consecutive tabs.
+.PP
+.Vb 1
+\&    \-s \*(Aqfoo\*(Aq
+.Ve
+.PP
+on the other hand is no regular expression and will be used literally.
+.PP
+To make live easier, there are a couple of predefined regular
+expressions, which you can specify as classes:
+.Sp
+.RS 4
+* :tab:
+.Sp
+Matches a tab and eats spaces around it.
+.Sp
+* :spaces:
+.Sp
+Matches 2 or more spaces.
+.Sp
+* :pipe:
+.Sp
+Matches a pipe character and eats spaces around it.
+.Sp
+* :default:
+.Sp
+Matches 2 or more spaces or tab. This is the default separator if none
+is specified.
+.Sp
+* :nonword:
+.Sp
+Matches a non-word character.
+.Sp
+* :nondigit:
+.Sp
+Matches a non-digit character.
+.Sp
+* :special:
+.Sp
+Matches one or more special chars like brackets, dollar sign, slashes etc.
+.Sp
+* :nonprint:
+.Sp
+Matches one or more non-printable characters.
+.RE
 .SS "\s-1PATTERNS AND FILTERING\s0"
 .IX Subsection "PATTERNS AND FILTERING"
 You can reduce the rows being displayed by using one or more regular

View File

@@ -13,7 +13,7 @@ tablizer - Manipulate tabular output of other programs
     -n, --numbering              Enable header numbering
     -N, --no-color               Disable pattern highlighting
     -H, --no-headers             Disable headers display
-    -s, --separator <string>     Custom field separator
+    -s, --separator <string>     Custom field separator (maybe char, string or :class:)
     -k, --sort-by <int|name>     Sort by column (default: 1)
     -z, --fuzzy                  Use fuzzy search [experimental]
     -F, --filter <field[!]=reg>  Filter given field with regex, can be used multiple times

@@ -153,6 +153,62 @@ Sorts timestamps.
 Finally the B<-d> option enables debugging output which is mostly
 useful for the developer.
 
+=head2 SEPARATOR
+
+The option B<-s> can be a single character, in which case the CSV
+parser will be invoked. You can also specify a string as
+separator. The string will be interpreted as literal string unless it
+is a valid go regular expression. For example:
+
+    -s '\t{2,}\'
+
+is being used as a regexp and will match two or more consecutive tabs.
+
+    -s 'foo'
+
+on the other hand is no regular expression and will be used literally.
+
+To make live easier, there are a couple of predefined regular
+expressions, which you can specify as classes:
+
+=over
+
+* :tab:
+
+Matches a tab and eats spaces around it.
+
+* :spaces:
+
+Matches 2 or more spaces.
+
+* :pipe:
+
+Matches a pipe character and eats spaces around it.
+
+* :default:
+
+Matches 2 or more spaces or tab. This is the default separator if none
+is specified.
+
+* :nonword:
+
+Matches a non-word character.
+
+* :nondigit:
+
+Matches a non-digit character.
+
+* :special:
+
+Matches one or more special chars like brackets, dollar sign, slashes etc.
+
+* :nonprint:
+
+Matches one or more non-printable characters.
+
+=back
+
 =head2 PATTERNS AND FILTERING
 
 You can reduce the rows being displayed by using one or more regular