bash - Join and delete lines based on pattern


I have a file with 200,000+ lines. The lines are grouped: each group begins with a row that starts with "image", followed by one row that starts with "histo", and then at least one, possibly multiple, rows that start with "frag". I need to:
1. Delete each row that starts with "histo".
2. Join each "frag" line to the preceding "image" row. Here is an example:

>image ...data1...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>image ...data2...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>frag ...data3...
>frag ...data4...

The result needs to look like this:

>image ...data1... frag ...data1...
>image ...data1... frag ...data2...
>image ...data2... frag ...data1...
>image ...data2... frag ...data2...
>image ...data2... frag ...data3...
>image ...data2... frag ...data4...

There can be many frag lines before the next image line starts. I'm on a Mac, so I can use pretty much any tool.

I tried the following, but it combines multiple frag lines onto a single image line.

awk '/^image/{if(NR>1)print a; a=$0} /^(frag)/{a=a" "$0}' input.txt > output.txt

That results in this:

image ...data1... frag ...data1... frag ...data2...

This works:

sed 's/>//' input.txt|awk '/^image/{a=$0;next;} /^frag/{print ">"a,$0}' 

The next statement skips the frag pattern check when the line is an image line, which speeds up processing. The histo lines match neither pattern, so they are never printed.
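The same idea can be sketched as a single awk call with no separate sed step. This is only a sketch: it assumes every line starts with a literal ">" as in the example, and the names join_frags and img are made up for illustration.

```shell
# Hypothetical helper: reads the named files, or stdin when none are given.
join_frags() {
  awk '
    { sub(/^>/, "") }                    # strip the leading ">" (assumed present, as in the example)
    /^image/ { img = $0; next }          # remember the current image line and move on
    /^frag/  { print ">" img, $0 }       # one output line per frag, prefixed by its image line
  ' "$@"
}

# Usage (file names as in the question):
#   join_frags input.txt > output.txt
```

Because histo lines match neither pattern, they are dropped without needing a rule of their own. Each frag line is printed on its own, so nothing accumulates across a group.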

