Parsing WhatsApp Messages: How To Parse Multiline Texts
I have a file of WhatsApp messages which I want to save in CSV format. The file looks like this: [04/02/2018, 20:56:55] Name1: Messages to this chat and calls are now secured w
Solution 1:
Here is something that should work to append the extra text of a multiline message to the previous line.
It checks whether the regex match fails; if it does, the line is written to the file without a leading newline (\n),
so it is simply appended to the previous line in the file.
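import re
from datetime import datetime

# A new message looks like "[dd/mm/yyyy, hh:mm:ss] Sender: text"
pattern = re.compile(r'\[([^\]]+)\] ([^:]+): (.*)')

start = True
with open('chat.txt', "r") as infile, open("Output.txt", "w") as outfile:
    for line in infile:
        # Drop the trailing newline so continuation lines can be appended
        line = line.rstrip('\n')
        match = pattern.match(line)
        if match:
            time, sender, text = match.groups()
            date = datetime.strptime(time, '%d/%m/%Y, %H:%M:%S')
            new_line = str(date) + ',' + sender + ',' + text
            # Every message except the first one starts on a new line
            if not start:
                new_line = '\n' + new_line
            outfile.write(new_line)
        else:
            # No timestamp/sender found: this line continues the previous message
            outfile.write(' ' + line)
        start = False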
It also looks like you weren't writing new lines to the file even when the regex worked, so I added that in too.
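Since the goal is a CSV file and message text can itself contain commas, a variation on the same idea is to collect each message (with its continuation lines) into a row and let the standard `csv` module handle the quoting. The sketch below is only one way to do that: the names `HEADER`, `parse_messages` and `write_csv` are made up for illustration, and it assumes every message starts with a header of the form `[dd/mm/yyyy, hh:mm:ss] Sender: text`, as in the sample above.

```python
import csv
import re
from datetime import datetime

# Assumed header shape: "[dd/mm/yyyy, hh:mm:ss] Sender: text"
HEADER = re.compile(r'\[(\d{2}/\d{2}/\d{4}, \d{2}:\d{2}:\d{2})\] ([^:]+): (.*)')


def parse_messages(lines):
    """Yield [date, sender, text] rows, folding continuation lines into the previous row."""
    current = None
    for raw in lines:
        line = raw.rstrip('\n')
        match = HEADER.match(line)
        if match:
            # A new message begins, so the previous one is complete
            if current is not None:
                yield current
            stamp, sender, text = match.groups()
            date = datetime.strptime(stamp, '%d/%m/%Y, %H:%M:%S')
            current = [str(date), sender, text]
        elif current is not None:
            # No header: append this line to the text of the current message
            current[2] += ' ' + line
    if current is not None:
        yield current


def write_csv(lines, outfile):
    """Write the parsed rows to outfile; csv.writer quotes fields that contain commas."""
    csv.writer(outfile).writerows(parse_messages(lines))
```

To use it on the files from the answer, open `chat.txt` for reading and the output file for writing with `newline=''` (as the `csv` documentation recommends), then call `write_csv(infile, outfile)`. Lines that appear before the first recognizable header are skipped in this sketch.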