bash - Join and delete lines based on pattern
I have a file with 200,000+ lines. The lines are grouped: each group starts with a row beginning with "image", followed by one row beginning with "histo", and then at least one, possibly multiple, rows beginning with "frag". I need to:
1. Delete the row that starts with "histo".
2. Join each "frag" line to the preceding "image" row. Here is an example:
>image ...data1...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>image ...data2...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>frag ...data3...
>frag ...data4...
The result needs to look like this:
>image ...data1... frag ...data1...
>image ...data1... frag ...data2...
>image ...data2... frag ...data1...
>image ...data2... frag ...data2...
>image ...data2... frag ...data3...
>image ...data2... frag ...data4...
There can be many frag lines before the next image line starts. I'm on a Mac, so I can use pretty much any tool.
I tried the following, but it combines multiple frag lines onto a single image line:
awk '/^image/{if(NR>1)print a; a=$0} /^(frag)/{a=a" "$0}' input.txt > output.txt
That results in this:
image ...data1... frag ...data1... frag ...data2...
This works:
sed 's/>//' input.txt|awk '/^image/{a=$0;next;} /^frag/{print ">"a,$0}'
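The sed step strips the leading ">" so the awk patterns can match at the start of the line. On an image line, awk stores the line in a and the "next" statement skips straight to the next input line, so the frag pattern is never tested against image lines, which speeds up processing. Histo lines match neither pattern and are simply dropped.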
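As a possible variation, the same idea can be sketched as a single awk pass without sed. This assumes every line really does begin with a literal ">" as in the example above; the file name sample_input.txt and the variable name img are made up for illustration.

```shell
# Hypothetical sample file, built from the example in the question.
cat > sample_input.txt <<'EOF'
>image ...data1...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>image ...data2...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
EOF

# One awk pass: remember the latest image line, then print it once per frag line.
result=$(awk '
  /^>image/ { img = $0; next }            # store the image line (leading ">" kept), skip other checks
  /^>frag/  { print img, substr($0, 2) }  # drop the ">" from the frag line and append it
' sample_input.txt)                       # histo lines match neither rule and are dropped

printf '%s\n' "$result"
```

Matching on "^>image" and "^>frag" directly avoids the extra sed process, and substr($0, 2) removes only the first character of each frag line.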