bash - Join and delete lines based on pattern -


I have a file with 200,000+ lines. The lines are grouped: each group begins with a row that starts with "image", followed by one row that starts with "histo", and then at least one, possibly multiple, rows that start with "frag". I need to:
1. Delete the row that starts with "histo".
2. Join each "frag" line to the previous "image" row. Here is an example.

>image ...data1...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>image ...data2...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>frag ...data3...
>frag ...data4...

The result needs to look like this:

>image ...data1... frag ...data1...
>image ...data1... frag ...data2...
>image ...data2... frag ...data1...
>image ...data2... frag ...data2...
>image ...data2... frag ...data3...
>image ...data2... frag ...data4...

It is possible to have many frag lines before the next image line starts. I am using a Mac, so I can use pretty much any tool.

I tried the following, but it combines multiple frag lines into a single image line.

awk '/^image/{if(NR>1)print a; a=$0} /^(frag)/{a=a" "$0}' input.txt > output.txt

That results in this:

image ...data1... frag ...data1... frag ...data2...

This works:

sed 's/>//' input.txt|awk '/^image/{a=$0;next;} /^frag/{print ">"a,$0}' 

The next statement skips the frag pattern check when the line is an image line, which speeds up the process.
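The same idea can probably be done in a single awk call without the sed step. This is only a sketch, assuming every line really starts with a literal ">" as in the example above; the variable names sample, result and img are made up for illustration, and a real run would read input.txt instead of the inline sample.

```shell
# Small inline sample in the same layout as the question (illustrative data only).
sample='>image ...data1...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...
>frag ...data2...
>image ...data2...
>histo numbers 0 0 1 1 0 1 0
>frag ...data1...'

# Remember the most recent image line, then print it in front of each frag line.
# substr($0, 2) drops the leading ">" from the frag line.
# histo lines match neither pattern, so they are never printed.
result=$(printf '%s\n' "$sample" | awk '
  /^>image/ { img = $0; next }
  /^>frag/  { print img, substr($0, 2) }
')

printf '%s\n' "$result"
```

On the real file the awk program would be run as awk '...' input.txt > output.txt. A quick sanity check is that the number of lines in output.txt should equal the number of lines in input.txt that start with ">frag".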

