Determine the end-of-line format, tabs, bom, and nul characters
- For help, run `chars -h`
chars v2.4.0
Determine the end-of-line format, tabs, bom, and nul
https://github.com/jftuga/chars
Usage:
chars [filename or file-glob 1] [filename or file-glob 2] ...
-F when used with -f, only display a list of failed files, one per line
-b examine binary files
-c add comma thousands separator to numeric values
-e string
exclude based on regular expression; use .* instead of *
-f string
fail with OS exit code=100 if any of the included characters exist; ex: -f crlf,nul,bom8
-j output results in JSON format; can't be used with -l; does not honor -t or -c
-l int
shorten files names to a maximum of this length
-t append a row which includes a total for each column
-v display version and then exit
Notes:
Use - to read a file from STDIN
On Windows, try: chars * -or- chars */* -or- chars */*/*
- macOS: `brew update; brew install jftuga/tap/chars`
- Binaries for Linux, macOS and Windows are provided in the releases section.
- Run `chars` with no additional cmd-line switches
  - Only report files in the current directory
  - Report text files only since `-b` is not used
PS C:\chars> .\chars.exe *
+-----------------+------+-----+-----+------+------+-------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | BYTESREAD |
+-----------------+------+-----+-----+------+------+-------+-----------+
| .goreleaser.yml | 0 | 59 | 0 | 0 | 0 | 0 | 1066 |
| LICENSE | 0 | 21 | 0 | 0 | 0 | 0 | 1068 |
| README.md | 0 | 92 | 0 | 0 | 0 | 0 | 3510 |
| chars.go | 0 | 246 | 328 | 0 | 0 | 0 | 6477 |
| go.mod | 0 | 10 | 2 | 0 | 0 | 0 | 188 |
| go.sum | 0 | 6 | 0 | 0 | 0 | 0 | 533 |
| testfile1 | 0 | 22 | 0 | 3223 | 0 | 1 | 6448 |
+-----------------+------+-----+-----+------+------+-------+-----------+
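To make the column meanings concrete, here is a minimal Python sketch of how such counts could be derived from raw bytes. It is not the tool's implementation (that lives in `chars.go`). The function name `count_chars` is hypothetical, and the sketch assumes, as the sample tables suggest, that `LF` counts only line feeds not preceded by a carriage return and that `BOM8` / `BOM16` are `0` or `1` depending on whether the data starts with a UTF-8 or UTF-16 byte order mark.

```python
def count_chars(data: bytes) -> dict:
    """Count line endings, tabs, NULs and BOMs in a byte string (illustrative only)."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one LF; subtract so that only bare LFs remain
    # (assumed semantics, inferred from the sample output).
    lf = data.count(b"\n") - crlf
    return {
        "crlf": crlf,
        "lf": lf,
        "tab": data.count(b"\t"),
        "nul": data.count(b"\x00"),
        # A BOM is only meaningful at the very start of the data.
        "bom8": int(data.startswith(b"\xef\xbb\xbf")),
        "bom16": int(data.startswith((b"\xff\xfe", b"\xfe\xff"))),
        "bytesRead": len(data),
    }
```

For example, `count_chars(b"a\tb\r\nc\n")` reports one `crlf`, one `lf`, one `tab`, and seven bytes read. A UTF-16 encoded text file typically shows a large `nul` count, as `testfile1` does above, because most ASCII characters are stored with a zero byte.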
- Run `chars` with `-e` and `-l` cmd-line switches
  - Only report files starting with `p` in the `C:\Windows\System32` directory
  - Exclude all files matching `perf.*dat`
  - Shorten filenames to a maximum length of `32`
PS C:\chars> .\chars.exe -e perf.*dat -l 32 C:\Windows\System32\p*
+----------------------------------+------+----+-----+------+------+-------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | BYTESREAD |
+----------------------------------+------+----+-----+------+------+-------+-----------+
| C:\Windows\System32\pcl.sep | 11 | 0 | 0 | 0 | 0 | 0 | 150 |
| C:\Windows\System32\perfmon.msc | 1933 | 0 | 0 | 0 | 0 | 0 | 145519 |
| C:\Windows\Sys...tmanagement.msc | 1945 | 0 | 0 | 0 | 0 | 0 | 146389 |
| C:\Windows\System32\pscript.sep | 2 | 0 | 0 | 0 | 0 | 0 | 51 |
| C:\Windows\Sys...eryprovider.mof | 0 | 61 | 0 | 2073 | 0 | 1 | 4148 |
+----------------------------------+------+----+-----+------+------+-------+-----------+
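Because `-e` takes a regular expression rather than a glob, `perf.*dat` reads as "perf, then anything, then dat". The sketch below illustrates the difference with Python's `re` module. The tool itself is written in Go, so its regex dialect may differ in details, and the unanchored matching used here is an assumption. The helper `is_excluded` and the file names are hypothetical.

```python
import re

def is_excluded(filename: str, pattern: str) -> bool:
    """Return True if the regular expression matches anywhere in the name."""
    return re.search(pattern, filename) is not None

names = ["perfc009.dat", "perfmon.msc", "pcl.sep"]

# Regex form: ".*" stands for "any run of characters".
kept = [n for n in names if not is_excluded(n, r"perf.*dat")]
print(kept)  # ['perfmon.msc', 'pcl.sep']

# Glob-style "perf*dat" is still a valid regex, but it means
# "per", zero or more "f", then "dat", so it does not match here.
print(is_excluded("perfc009.dat", r"perf*dat"))  # False
```

This is why the help text recommends `.*` in place of a bare `*`.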
- Pipe STDIN to `chars`
- Use JSON output, with `-j`
$ curl -s https://example.com/ | chars -j
[
{
"filename": "STDIN",
"crlf": 0,
"lf": 46,
"tab": 0,
"bom8": 0,
"bom16": 0,
"nul": 0,
"bytesRead": 1256
}
]
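The JSON form is convenient for scripts. As a minimal sketch, the following Python function lists the entries whose count for a given field is non-zero. It assumes only the JSON shape shown above (a list of objects keyed by `filename`, `crlf`, `lf`, and so on); the function name `files_with` is hypothetical.

```python
import json

def files_with(json_text: str, field: str) -> list:
    """Return (filename, count) pairs for entries where `field` is non-zero."""
    return [
        (entry["filename"], entry[field])
        for entry in json.loads(json_text)
        if entry[field] > 0
    ]

# Sample taken from the output shown above.
sample = """[{"filename": "STDIN", "crlf": 0, "lf": 46, "tab": 0,
             "bom8": 0, "bom16": 0, "nul": 0, "bytesRead": 1256}]"""

print(files_with(sample, "lf"))    # [('STDIN', 46)]
print(files_with(sample, "crlf"))  # []
```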
- Fail when certain characters are detected, with `-f`
  - OS exit code on a `-f` failure is always `100`
  - `-f` is a comma-delimited list containing: `crlf`, `lf`, `tab`, `nul`, `bom8`, `bom16`
$ chars -f lf,tab /etc/group ; echo $?
+------------+------+----+-----+-----+------+-------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | BYTESREAD |
+------------+------+----+-----+-----+------+-------+-----------+
| /etc/group | 0 | 58 | 0 | 0 | 0 | 0 | 795 |
+------------+------+----+-----+-----+------+-------+-----------+
100
- Fail when certain characters are detected, with `-f`
- Only output failed file names, with `-F`
$ chars -f lf,tab -F /etc/gr* ; echo $?
/etc/group
/etc/group.bak
100
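The fixed exit code makes `-f` easy to use from a script or CI job. The Python sketch below wraps the `-f ... -F` invocation shown above. It assumes `chars` is on the `PATH` and that a run with no failures exits with `0`; the function name `failed_files` and the injectable `run` parameter are hypothetical conveniences, not part of the tool.

```python
import subprocess

FAIL_EXIT_CODE = 100  # documented exit code for a -f failure

def failed_files(forbidden, paths, run=subprocess.run):
    """Run `chars -f <list> -F <paths>` and return the files reported as failing."""
    cmd = ["chars", "-f", ",".join(forbidden), "-F", *paths]
    result = run(cmd, capture_output=True, text=True)
    if result.returncode == FAIL_EXIT_CODE:
        # With -F, the output is one failed file name per line.
        return result.stdout.splitlines()
    if result.returncode != 0:
        # Any other non-zero code is treated as an unexpected error (assumption).
        raise RuntimeError(f"chars exited with code {result.returncode}")
    return []
```

A call such as `failed_files(["lf", "tab"], ["/etc/group"])` would then return a list of offending paths, or an empty list when nothing matched. Paths are passed to `chars` as given, since no shell expansion happens here.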
- Output to JSON, with `-j`
- Use `-e` to exclude filenames starting with `go`, such as `go.mod` and `go.sum`
- Use `jq` to output to `CSV` containing two columns: `filename`, `tab`
  - Only include files that contain `tab` characters
$ chars -e '^go' -j * | jq -r '.[] | select(.tab > 0) | [.filename,.tab] | @csv'
"case.go",80
"chars.go",475
- Output totals, with `-t`
- Output commas in numeric values, with `-c`
- Exclude files containing `.g*`, with `-e`
PS C:\chars> .\chars.exe -t -c -e "\.g.*" *
+-----------------+------+-----+-----+-----+------+-------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | BYTESREAD |
+-----------------+------+-----+-----+-----+------+-------+-----------+
| LICENSE | 0 | 21 | 0 | 0 | 0 | 0 | 1,068 |
| README.md | 0 | 178 | 4 | 0 | 0 | 0 | 6,656 |
| STATUS.md | 0 | 50 | 0 | 0 | 0 | 0 | 3,055 |
| go.mod | 0 | 11 | 3 | 0 | 0 | 0 | 214 |
| go.sum | 0 | 9 | 0 | 0 | 0 | 0 | 795 |
| TOTALS: 5 files | 0 | 269 | 7 | 0 | 0 | 0 | 11,788 |
+-----------------+------+-----+-----+-----+------+-------+-----------+
- YMMV when piping to `STDIN` under Windows
  - Under `cmd`, instead of `type input.txt | chars`, use `<` redirection when possible: `chars < input.txt`
  - Under a recent version of `powershell`, use `Get-Content -AsByteStream input.txt | chars` instead of just `Get-Content input.txt | chars`
- `cmd` and `powershell` will skip `BOM` characters; these 2 fields will both report a value of `0`
- `cmd` and `powershell` will skip `NUL` characters; this field will report a value of `0`
- `cmd` will convert `LF` to `CRLF` for `UTF-16` encoded files
- `powershell` will convert `LF` to `CRLF`
- Piping from programs such as `curl` will return `LF` characters under `cmd`, but `CRLF` under `powershell`
  - Under powershell, consider using `curl --output`
- Case folding on Windows is somewhat implemented in `case.go`.
  - This program attempts case-insensitive filename matching since this is the expected behavior on Windows.
  - It is hard-coded to `English`.
- Newline - `CRLF` vs `LF`
- Tab key
- Null character
- Byte order mark - `BOM-8` vs `BOM-16`
- ellipsis - Go module to insert an ellipsis into the middle of a long string to shorten it
- tablewriter - ASCII table in golang
- /u/skeeto and /u/petreus provided code review and suggestions