Fix Numbering On Csv Files That Have Deleted Lines
Solution 1:
If you need to update the count, then you have to read the file twice: once to count the rows you are keeping, and once to write them out. While writing the matched lines, you can keep a separate counter to rewrite the first column:
import csv
import os
import re

numbered = re.compile(r'N\d+').match

# fns is assumed to be the list of CSV file names to process;
# the 'out' directory must already exist
for fn in fns:
    # first pass: count the numbered rows that survive the DIF filter
    # (newline='' is what the csv module expects on Python 3)
    with open(fn, newline='') as f:
        count = sum(1 for row in csv.reader(f)
                    if row and not any(r.strip() == 'DIF' for r in row)
                    and numbered(row[0]))
    # second pass: reopen for filtering
    with open(fn, newline='') as src, open(os.path.join('out', fn), 'w', newline='') as f:
        counter = 0
        w = csv.writer(f)
        for row in csv.reader(src):
            if row and 'Count' in row[0].strip():
                row = ['Count', count]
            if row and not any(r.strip() == 'DIF' for r in row):  # remove DIF
                if numbered(row[0]):
                    counter += 1
                    row[0] = 'N%d' % counter
                w.writerow(row)
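To see which first-column values the numbered helper accepts, here is a small sketch. The wrapper name is_numbered and the sample cells are only for illustration; the pattern is the same one used above.

```python
import re

# Same pattern as above; match() only anchors at the start of the string.
numbered = re.compile(r'N\d+').match


def is_numbered(cell):
    """Return True when a first-column cell looks like N1, N2, ..."""
    return numbered(cell) is not None


# Header-like cells start with N but are not followed by a digit,
# so they are left alone by the renumbering step.
for cell in ['N1', 'N10', 'NUMBER', 'Name bunch of stuff', 'Count']:
    print(cell, is_numbered(cell))
```

Because match() does not anchor at the end, a cell such as N1abc would also be accepted; add a trailing $ to the pattern if that matters for your files.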
Solution 2:
Your question is a little unclear. I think you want N to be updated with the number relative to its position in the updated list. I am assuming you are on Windows.
Since it appears that you are not using row dictionaries, I am going to do it a little differently:
import glob

my_files = glob.glob('c:\\thedirectory\\orsubdirectorywhereyourfilesare\\*.csv')
for each_file in my_files:
    initial = open(each_file).readlines()
    no_diff = [row for row in initial if 'DIF' not in row]
    newCount = len(no_diff) - no_diff.index('NUMBER,ITEM\n') - 1  # you might have to tweak this
    outList = []
    counter = 0
    for row in no_diff:
        if 'Count' in row:
            new_row = 'Count ' + str(newCount) + '\n'  # '\n' is a newline character
            outList.append(new_row)
        elif row.startswith('NUMBER'):
            outList.append(row)
        elif row.startswith('Name'):
            outList.append(row)
        elif row.startswith('N'):
            print(counter)
            row_end = row.split(',')[-1]
            row_begin = 'N' + str(counter + 1)
            new_row = row_begin + ',' + row_end
            outList.append(new_row)
            counter += 1
        else:
            outList.append(row)
    outref = open(each_file, 'w')  # reopen for writing; this overwrites the original file
    outref.writelines(outList)
    outref.close()
I copied this into a file:
'Name bunch of stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'Count 11 \n'
'NUMBER,ITEM\n'
'N1,Shoe\n'
'N2,Heel\n'
'N3,Tee\n'
'N4,Polo\n'
'N5,Sneaker\n'
'N6,DIF\n'
'N7,DIF\n'
'N8,DIF\n'
'N9,DIF\n'
'N10,Heel\n'
'N11,Tee'
I ran the code above (which I had to tweak) and got this result:
'Name bunch of stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'header stuff stuff \n'
'Count 7\n'
'NUMBER,ITEM\n'
'N1,Shoe\n'
'N2,Heel\n'
'N3,Tee\n'
'N4,Polo\n'
'N5,Sneaker\n'
'N6,Heel\n'
'N7,Tee'
Now, the other approaches here and on your second question are definitely more elegant, but elegance only comes after you really understand the code. There are too many moving parts in my opinion. You need to:
- read a file
- handle parts of the file
- write it back out
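The three steps above can be kept apart as three small functions, which makes each one easy to test on its own. This is only a sketch: the function names are made up, the header line is assumed to be exactly NUMBER,ITEM as in my sample, and every numbered row is assumed to contain a comma.

```python
def read_lines(path):
    # Step 1: read the whole file into a list of lines.
    with open(path) as f:
        return f.readlines()


def renumber(lines, header='NUMBER,ITEM\n'):
    # Step 2: drop the DIF lines, then fix the Count line and the N numbers.
    kept = [line for line in lines if 'DIF' not in line]
    total = len(kept) - kept.index(header) - 1  # rows below the header
    out, counter = [], 0
    for line in kept:
        if line.startswith('Count'):
            out.append('Count %d\n' % total)
        elif line.startswith('N') and line[1:2].isdigit():
            counter += 1
            # keep everything after the first comma unchanged
            out.append('N%d,%s' % (counter, line.split(',', 1)[1]))
        else:
            out.append(line)
    return out


def write_lines(path, lines):
    # Step 3: write the result back out.
    with open(path, 'w') as f:
        f.writelines(lines)


# Trying the middle step on a small in-memory sample, no files needed.
sample = ['Name x\n', 'Count 4\n', 'NUMBER,ITEM\n',
          'N1,Shoe\n', 'N2,DIF\n', 'N3,Tee\n', 'N4,Heel']
print(renumber(sample))
```

For a real file you would chain them, for example write_lines(path, renumber(read_lines(path))), ideally on a copy first since this overwrites the file.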
If you add in regular expressions and csv handling, then you multiply the areas where you can get into trouble. Those are great tools and I use them often, but not when you are just starting to learn how to program in Python. Otherwise, look at csv.DictReader if your header is not too messy.
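For reference, a rough sketch of the csv.DictReader route. It assumes the column header line is exactly NUMBER,ITEM as in my sample and that everything above it is preamble to skip; the function name read_items and the in-memory sample are only for illustration.

```python
import csv


def read_items(text, header='NUMBER,ITEM'):
    """Skip the preamble, then read the data rows as dictionaries."""
    lines = text.splitlines()
    start = lines.index(header)  # raises ValueError if the header is missing
    # DictReader takes any iterable of lines; the first one supplies the keys.
    reader = csv.DictReader(lines[start:])
    return [row for row in reader if row['ITEM'] != 'DIF']


sample = "Name stuff\nCount 4\nNUMBER,ITEM\nN1,Shoe\nN2,DIF\nN3,Tee\n"
# Renumber on the way out, using the position in the filtered list.
for i, row in enumerate(read_items(sample), start=1):
    print('N%d,%s' % (i, row['ITEM']))
```

The preamble and the Count line are not handled here; you would still have to write those back yourself, which is why this only pays off if the header is simple.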