# csv.sh

parse CSV files with pure POSIX shell!

CSV files have weird quoting rules, so parsing them with `awk` or `cut` won't cut it on its own. but we can convert them to a format that shell utilities like:

## the short way

(see `csv-min.sh`)

to convert from a csv file to lines of tab-separated values:

```
LC_ALL=C sed -n 's/'"$(printf "\r")"'$//;s/\\/\\\\/g;s/'"$(printf "\t")"'/\\t/g;H;x;h;s/^\n//;s/\n/\\n/g;s/,/,,/g;s/$/,/;s/^/,/;s/,\([^",]*\("[^"]*\(""[^"]*\)*"[^",]*\)*\),/\1'"$(printf "\t")"'/g;/,$/d;s/.$//;s/,,/,/g;s/"\([^"]*\(""[^"]*\)*\)"/\1/g;s/""/"/g;p;s/.*//;h'
```

tabs, newlines, and backslashes are escaped into `\t`, `\n`, and `\\`, respectively.

> → `foo,bar,"baz ""quuz"" \etc`
>
> ← `foo[TAB]bar[TAB]baz "quuz" \\etc`

now you can parse with regular shell tools:

- `cut -f2`
- `awk -F'\t' '{print $2 + $3}'`
- etc.

to convert back to CSV:

```
LC_ALL=C sed 's/"/""/g;s/'"$(printf "\t")"'/","/g;s/^/"/;s/$/"/;s/\\\\/& /g;s/\\n/\n/g;s/\\t/'"$(printf "\t")"'/g;s/\\\\ /\\/g;'
```

(this program doesn't output CR LFs, but you can modify it to! `sed 's/$/'"$(printf "\r")"'/'`)

## what?

see `csv.sh`.

## disclaimer

you shouldn't trust yourself to verify a CSV parser, let alone trust me to write one! CSV is an amalgamation of formats, loosely described by [RFC 4180](https://www.rfc-editor.org/rfc/rfc4180). i try to be slightly more lenient than RFC 4180, and i tried my parser on output from a variety of programs, but i don't guarantee correctness for weird files.

## license

made by [Natalia Posting](https://equa.space/) in 2023.

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted.

THE SOFTWARE IS PROVIDED “AS IS” AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.