python - Arrange multiple similar data records efficiently
The data file shown here is a measurement record exported from an instrument.
I have uploaded it here; anyone interested can download it.
background
    sample record-1
    fid1, fid2, front_temperature, laser, laserlow, pressure, mode
    -925 284 1452 315 143 16653 -28500
    -924 281 1462 322 136 16641 -28628
    -920 281 1455 311 139 16649 -28756
    -923 279 1454 312 139 16636 -28884
    ......
    sample record-2
    fid1, fid2, front_temperature, laser, laserlow, pressure, mode
    -925 284 1452 315 143 16653 -28500
    ......
    ......

Generally, the file contains several records of different samples, in the order of the testing routine, and the data for all of these samples are recorded in the same format.
my attempt
If there is only one sample in the data file (in *.txt format), I can arrange the data file into a pandas.DataFrame, so that I can handle the data for further analysis in Python.
My code is shown here:
    import pandas as pd

    # the whole data file has several sample records inside
    with open("record.txt") as f:
        mylist = f.read().splitlines()

    ## the record of each sample is 803 lines long
    lines = mylist[0:803]

    ### the sample name is extracted from the third line
    sample_name = lines[2]

    ### for each sample, the measurement record is saved in several aspects,
    ### regarded as columns here
    columns = lines[22].split()

    ### generate empty columns for saving the data record later
    df = {columns[0][:-1]: [], columns[1][:-1]: [], columns[2][:-1]: [],
          columns[3][:-1]: [], columns[4][:-1]: [], columns[5][:-1]: [],
          columns[6][:-1]: []}  #### though a dumb method

    ## data extracting
    ### the valid data record of sample 1 starts from line 23
    for i in range(0, len(lines[23:]), 1):
        for j in range(0, len(columns), 1):
            df[columns[j][:-1]].append(lines[23 + i].split()[j])

    pd.DataFrame(df)
The result looks like this:
my target
With the code above, I can deal with a data file that contains one sample. But when several samples are present in the record text, I couldn't find a clue for dealing with them efficiently.
Here is an illustration of my target: generate a dict of DataFrames that saves the records of all samples.
Any advice would be appreciated!
I think you are looking for this:
    import pandas as pd

    # the whole data file has several sample records inside
    with open("record.txt", 'r') as f:
        mylist = f.read().splitlines()

    dataset = []
    while True:
        try:
            ## the record of each sample is 803 lines long
            lines, mylist = mylist[0:803], mylist[803:]  # this splits the list!!

            ### the sample name is extracted from the third line
            sample_name = lines[2]

            ### for each sample, the measurement record is saved in several aspects,
            ### regarded as columns here
            columns = lines[22].split()

            ### generate empty columns for saving the data record later
            df = {columns[0][:-1]: [], columns[1][:-1]: [], columns[2][:-1]: [],
                  columns[3][:-1]: [], columns[4][:-1]: [], columns[5][:-1]: [],
                  columns[6][:-1]: []}  #### though a dumb method

            ## data extracting
            ### the valid data record of this sample starts from line 23
            for i in range(0, len(lines[23:]), 1):
                for j in range(0, len(columns), 1):
                    df[columns[j][:-1]].append(lines[23 + i].split()[j])
        except IndexError:
            # nothing (or only a fragment) is left in mylist, so stop
            break
        df = pd.DataFrame(df)
        dataset.append(df)
Now dataset[0] should contain the DataFrame of sample 1.
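Since the stated target was a dict of DataFrames keyed by sample, here is a minimal sketch of that variant. It assumes what the question describes: every record is exactly 803 lines long, the sample name sits on the third line, and the comma-separated column header sits on line 23 of each record. The function names split_records and to_dataframes are made up for illustration, and pandas is imported inside to_dataframes only so that the splitting step can be tried without it.

```python
def split_records(lines, record_len=803, name_idx=2, header_idx=22):
    """Split fixed-length records into {sample_name: {column: [values]}}.

    Assumes every record has the same length and layout; values are kept
    as strings, exactly as they appear in the file.
    """
    samples = {}
    for start in range(0, len(lines), record_len):
        record = lines[start:start + record_len]
        if len(record) <= header_idx:
            break  # trailing fragment too short to contain a header line
        name = record[name_idx].strip()
        # "fid1, fid2, ..., mode" -> ["fid1", "fid2", ..., "mode"]
        columns = [c.strip() for c in record[header_idx].split(",")]
        rows = [r.split() for r in record[header_idx + 1:]]
        # skip blank or malformed rows that do not match the header width
        rows = [r for r in rows if len(r) == len(columns)]
        # note: a repeated sample name would overwrite the earlier record
        samples[name] = {
            col: [row[k] for row in rows] for k, col in enumerate(columns)
        }
    return samples


def to_dataframes(samples):
    """Turn the output of split_records into {sample_name: DataFrame}."""
    import pandas as pd  # local import: only needed for this last step

    return {
        name: pd.DataFrame(cols).apply(pd.to_numeric)
        for name, cols in samples.items()
    }
```

With mylist read from record.txt as above, to_dataframes(split_records(mylist)) would give one numeric DataFrame per sample name. Splitting the header on commas instead of slicing off the last character also keeps the final column name intact when it has no trailing comma.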