linux - replace every nth occurrence of a pattern using awk
I'm trying to replace every nth occurrence of a string in a text file.
Background: I have a huge BibTeX file (called in.bib) containing hundreds of entries, each beginning with "@". Every entry has a different number of lines. I want to write a string (e.g. "#") right before every (let's say) 6th occurrence of "@" so that, in a second step, I can use csplit to split the huge file at "#" into files containing 5 entries each.
The problem is to find and replace every fifth "@".
Since I need this repeatedly, the suggested answer in "Printing with sed or awk a line following a matching pattern" won't do the job. Again, I am not looking for one matching place but for many of them.
What I have so far:
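As a rough sketch of that second step, assuming GNU csplit is available (the -z and -b options and the '{*}' repeat count are GNU extensions), the split could look like this. The file name marked.bib, the part_ prefix, and the sample entries are made up for illustration.

```shell
# Hypothetical sample: seven tiny entries with a "#" marker line after
# every third entry (three instead of five keeps the sample short).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
    '@book{k1,}' '@book{k2,}' '@book{k3,}' '#' \
    '@book{k4,}' '@book{k5,}' '@book{k6,}' '#' \
    '@book{k7,}' > marked.bib

# Split before every line that consists only of "#".
#   -s     do not print the size of each output file
#   -z     drop empty output files (GNU)
#   -f     prefix for the output file names
#   -b     printf-style suffix format (GNU)
#   '{*}'  repeat the pattern as often as it matches (GNU)
csplit -s -z -f part_ -b '%02d.bib' marked.bib '/^#$/' '{*}'

ls part_*.bib
```

Every chunk after the first still begins with the "#" marker line.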
awk '/^@/ && v++%5 {sub(/^@/, "\n#\n@")} {print > "out.bib"}' in.bib
This replaces the 2nd through 5th occurrences (and no more). (By the way, I found and adapted this solution from "sed replace every nth occurrence". Initially, it was meant to replace every second occurrence, which it does.)
And, second:
awk -v p="@" -v n="5" '$0~p{i++}i==n{sub(/^@/, "\n#\n@")}{print > "out.bib"}' in.bib
This replaces the 5th occurrence and nothing else. (Adapted from the solution in "display n'th match of grep".)
What I need (and am not able to write) is, IMHO, a loop. Would a loop do the job? Something like:
for (i = 1; i <= 200; i * 5) <find "@"> and <replace with "\n#\n@"> print
The material I have looks like this:
@article{karamanic_jedno_2007,
    title = {jedno kosova, dva srbije},
    journal = {ulaznica: journal for culture, art and social issues},
    author = {karamanic, slobodan},
    year = {2007}
}

@inproceedings{blome_eigene_2008,
    title = {das eigene, das andere und ihre vermischung. zur rolle von sexualität und reproduktion im rassendiskurs des 19. jahrhunderts},
    comment = {the rest of the lines is snipped off here for usability -- as in the following entries. the original entries may have a different number of lines.}
}

@book{doring_inter-agency_2008,
    title = {inter-agency coordination in united nations peacebuilding}
}

@book{reckwitz_subjekt_2008,
    address = {bielefeld},
    title = {subjekt}
}
What I want is every sixth entry looking like this:
#
@book{reckwitz_subjekt_2008,
    address = {bielefeld},
    title = {subjekt}
}
Thanks for your help.
Your code is almost right; I modified it slightly.
To replace every nth occurrence, you need a modulo expression.
Written with explicit brackets so it is easier to read, the expression you need is ((i % n) == 0):
awk -v p="@" -v n="5" ' $0~p { i++ } ((i%n)==0) { sub(/^@/, "\n#\n@") }{ print }' in.bib > out.bib
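One caveat, offered as a hedged sketch and not as a definitive fix: the command above counts every line that contains "@" anywhere, and it puts the marker before the 5th, 10th, ... entry, so the first chunk ends up with only four entries. Assuming every entry starts with "@" at the beginning of a line and each chunk should hold exactly n entries, a variant could count only those lines and mark entries n+1, 2n+1, and so on. The sample entries generated below are made up for illustration.

```shell
# Build a small hypothetical sample file with 12 three-line entries.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
i=1
while [ "$i" -le 12 ]; do
    printf '@book{key_%s,\n    title = {title %s}\n}\n' "$i" "$i"
    i=$((i + 1))
done > in.bib

# Count only lines that start with "@". Before entries n+1, 2n+1, ...
# print an empty line and a "#" line, so every chunk holds n entries.
awk -v n=5 '
    /^@/ { i++; if (i > 1 && (i - 1) % n == 0) print "\n#" }
    { print }
' in.bib > out.bib

# Report how many markers were written and which entry follows the first one.
markers=$(grep -c '^#$' out.bib)
first_marked=$(awk 'prev == "#" { print; exit } { prev = $0 }' out.bib)
echo "markers: $markers"
echo "first marked entry: $first_marked"
```

With 12 entries and n=5, the markers land before the 6th and 11th entries. Lines that merely contain "@" somewhere in a field are not counted.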