bash - Join and delete lines based on pattern
I have a file with 200,000+ lines. The lines are grouped: each group begins with a row that starts with "image", followed by one row that starts with "histo", and then at least one, possibly multiple, rows that start with "frag". I need to:
 1. Delete every row that starts with "histo".
 2. Join each "frag" line to the preceding "image" row.
Here is an example:
>image ...data1...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>image ...data2...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>frag ...data3...
>frag ...data4...
The result needs to look like this:
>image ...data1... frag ...data1...
>image ...data1... frag ...data2...
>image ...data2... frag ...data1...
>image ...data2... frag ...data2...
>image ...data2... frag ...data3...
>image ...data2... frag ...data4...
There can be many frag lines before the next image line starts. I am on a Mac, so I can use pretty much any tool.
I tried the following, but it combines multiple frag lines onto a single image line:
awk '/^image/{if(NR>1)print a; a=$0} /^(frag)/{a=a" "$0}' input.txt > output.txt
That results in this:
image ...data1... frag ...data1... frag ...data2...
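This works:
sed 's/>//' input.txt | awk '/^image/{a=$0;next;} /^frag/{print ">"a,$0}'
sed strips the leading ">" from each line so the awk patterns can anchor on image and frag. awk then remembers the most recent image line and prints it, prefixed with ">", in front of every frag line. The histo lines match neither pattern, so they are dropped. The next statement skips the frag pattern check when the line is an image line, which speeds up processing.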
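To try the idea on a small input first, here is a minimal sketch that does the same job in awk alone, without the separate sed step. The function name join_frags and the file name sample.txt are made up for illustration, and the sample data only imitates the shape described in the question.

```shell
# Hypothetical helper: pair every frag line with the most recent image line.
join_frags() {
  awk '
    { sub(/^>/, "") }               # drop a leading ">" if the line has one
    /^image/ { img = $0; next }     # remember the current image line
    /^frag/  { print ">" img, $0 }  # emit image + frag on one output line
  ' "$1"                            # histo lines match nothing and are skipped
}

# Small sample file in the shape described in the question (name is an assumption).
printf '%s\n' \
  '>image A' \
  '>histo numbers 0 0 1 1 0 1 0' \
  '>frag a1' \
  '>frag a2' \
  '>image B' \
  '>histo numbers 0 0 1 1 0 1 0' \
  '>frag b1' > sample.txt

join_frags sample.txt
```

For this sample it prints one line per frag row: ">image A frag a1", ">image A frag a2" and ">image B frag b1". For the real file, redirect the output, for example join_frags input.txt > output.txt.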