Friday, July 31, 2009

Sort or search unix file by using tab as delimiter?

I have a Unix file with the following format:





one\TABtwo three\TABI love NBA\N


world\TABfour five\TABI love NFL\N





How can I search the second column using grep, or sort the lines by the second column? I couldn't type a tab in Cygwin or in a Unix shell.

In Unix, under the Bash shell, you can type a literal tab character by pressing Ctrl+V and then Tab.





e.g. you can use grep as follows:





grep "<Ctrl+V,TAB>two" myfile.txt





You can try cut as follows:





cut -f2 -d"<Ctrl+V,TAB>" myfile.txt





which will return the entire second column. Tab is already cut's default delimiter, so cut -f2 myfile.txt gives the same result.
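If typing Ctrl+V, Tab is awkward (for example in a script, or in a terminal that intercepts the keystroke), the tab can be produced with printf and kept in a variable instead. The following is only a sketch: the sample file is a hypothetical temporary copy of the two lines from the question, and the variable names are made up for illustration. In Bash, tab=$'\t' should work as an equivalent shorthand.

```shell
# Hypothetical sample file with the same layout as in the question.
sample=$(mktemp)
printf 'one\ttwo three\tI love NBA\n'   >  "$sample"
printf 'world\tfour five\tI love NFL\n' >> "$sample"

# Store a literal tab character without having to type one.
tab=$(printf '\t')

# Lines in which a column other than the first starts with "two".
grep_out=$(grep "${tab}two" "$sample")

# Second column only. Tab is cut's default delimiter; -d is passed
# here just to make the choice explicit.
cut_out=$(cut -d "$tab" -f2 "$sample")

printf '%s\n' "$grep_out" "$cut_out"
rm -f "$sample"
```

Note that the grep pattern only requires a tab in front of "two", so it would also match a third column starting with "two"; awk is the better tool when the match has to be tied to one specific column.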





You can also use awk...

Reply: Typically, awk is good for filtering column-based output, e.g.:





awk '$4 == 1 { print }'





will print all lines where the 4th column is equal to 1.





and:





awk 'BEGIN { total = 0 } { total += $5; print $5 } END { print "total: ", total }'





will print each value in column 5, followed by the sum of all of them.





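By default, awk treats any run of spaces or tabs as a column delimiter. When the columns themselves contain spaces, as in the file in the question, tell awk to split on tabs only with -F '\t'.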
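Applied to the file from the question, the difference between awk's default splitting and tab-only splitting can be sketched like this. The sample file is a hypothetical temporary copy of the two lines in the question, and the variable names are illustrative only.

```shell
# Hypothetical sample file with the same layout as in the question.
sample=$(mktemp)
printf 'one\ttwo three\tI love NBA\n'   >  "$sample"
printf 'world\tfour five\tI love NFL\n' >> "$sample"

# Default splitting: spaces also separate fields, so $2 is only the
# first word of the second tab-delimited column.
default_col=$(awk '{ print $2 }' "$sample")

# Tab-only splitting: $2 is the whole second column.
tab_col=$(awk -F '\t' '{ print $2 }' "$sample")

# Search the second column only, and print the first column of each hit.
hits=$(awk -F '\t' '$2 ~ /five/ { print $1 }' "$sample")

printf '%s\n' "$default_col" "$tab_col" "$hits"
rm -f "$sample"
```

Because the pattern is tested against $2 alone, a "five" appearing in any other column would not produce a match.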





The Unix sort utility also understands columns. By default it separates fields at blanks (spaces and tabs); to treat only tabs as delimiters, pass a literal tab to its -t option.





sort -k2n foo





will sort the file foo based on the numerical value of column 2.
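For the file in the question, where the second column is text that contains spaces, a tab-delimited sort might look like the sketch below. The sample file is a hypothetical temporary copy of the question's two lines, and -k2,2 is used so that the sort key stops at the end of column 2 rather than running to the end of the line.

```shell
# Hypothetical sample file with the same layout as in the question.
sample=$(mktemp)
printf 'one\ttwo three\tI love NBA\n'   >  "$sample"
printf 'world\tfour five\tI love NFL\n' >> "$sample"

# A literal tab for sort's -t option (in Bash, $'\t' also works).
tab=$(printf '\t')

# Sort on the second tab-delimited column only.
sorted=$(sort -t "$tab" -k2,2 "$sample")

printf '%s\n' "$sorted"
rm -f "$sample"
```

Here "four five" sorts before "two three", so the line starting with "world" comes out first.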





If you need to type a tab in the shell, you typically press Ctrl+V first and then Tab, which inserts a literal tab character. You also need to enclose that tab in quotes on the command line so the shell does not treat it as whitespace between arguments.

