Non-consuming Regular Expression Split In Python
How can a string be split on a separator expression while leaving that separator on the preceding string?
>>> text = 'This is an example. Is it made up of more than once s
Solution 1:
>>> re.split(r"(?<=[.?!]) ", text)
['This is an example.', 'Is it made up of more than once sentence?', 'Yes, it is.']
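The crucial thing is the use of a look-behind assertion, written (?<=...). It only checks that a '.', '?' or '!' comes right before the space without consuming it, so the space is the only thing removed and the punctuation stays on the preceding string.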
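To make the difference concrete, here is a small sketch that compares a plain (consuming) split with the look-behind version. The sample text is pieced together from the output shown above, and the variable names consumed and kept are only illustrative.

```python
import re

# Sample text assumed from the output shown in Solution 1
text = "This is an example. Is it made up of more than once sentence? Yes, it is."

# Consuming split: the punctuation is part of the separator, so it is lost
consumed = re.split(r"[.?!] ", text)

# Non-consuming split: the look-behind only asserts that punctuation
# precedes the space; the space alone is the separator
kept = re.split(r"(?<=[.?!]) ", text)

print(consumed)
print(kept)
```

Note that the final '.' survives in both cases, simply because no space follows it.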
Solution 2:
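import re

text = "This is an example.A particular case.Made up of more "\
       "than once sentence?Yes, it is.But no blank !!!That's"\
       " a problem ????Yes.I think so! :)"

# Look-behind split: only splits where a blank follows the punctuation
for x in re.split(r"(?<=[.?!]) ", text):
    print(repr(x))

print('\n')

# findall: a run of non-punctuation ending in one punctuation mark,
# or a trailing run of non-punctuation at the end of the string
for x in re.findall(r"[^.?!]*[.?!]|[^.?!]+(?=\Z)", text):
    print(repr(x))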
Result:
"This is an example.A particular case.Made up of more than once sentence?Yes, it is.But no blank !!!That's a problem ????Yes.I think so!"
':)'


'This is an example.'
'A particular case.'
'Made up of more than once sentence?'
'Yes, it is.'
'But no blank !'
'!'
'!'
"That's a problem ?"
'?'
'?'
'?'
'Yes.'
'I think so!'
' :)'
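As a side note, on Python 3.7 and later re.split also accepts patterns that can match an empty string, so the look-behind on its own can act as the separator when no blank follows the punctuation. A minimal sketch, assuming Python 3.7+ and a shortened version of the text above (the name pieces is just for illustration):

```python
import re

# Shortened sample, assumed for illustration
text = "Yes, it is.But no blank !!!That's a problem ?Yes.I think so! :)"

# Zero-width separator: split right after every '.', '?' or '!'.
# Requires Python 3.7+, where re.split can split on empty matches.
# Empty pieces are filtered out, since a text that ends with
# punctuation would otherwise produce a trailing ''.
pieces = [s for s in re.split(r"(?<=[.?!])", text) if s]

for piece in pieces:
    print(repr(piece))
```

This gives the same kind of pieces as the findall approach, with each punctuation mark kept at the end of its piece.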
EDIT
Also
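import re

text = "! This is an example.A particular case.Made up of more "\
       "than once sentence?Yes, it is.But no blank !!!That's"\
       " a problem ????Yes.I think so! :)"

# The capturing group makes re.split return the separators as well:
# [text, sep, text, sep, ..., text]
res = re.split('([.?!])', text)
# Glue each piece of text back onto the separator that follows it
print([''.join(res[i:i+2]) for i in range(0, len(res), 2)])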
gives
['!', ' This is an example.', 'A particular case.', 'Made up of more than once sentence?', 'Yes, it is.', 'But no blank !', '!', '!', "That's a problem ?", '?', '?', '?', 'Yes.', 'I think so!', ' :)']
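The same idea can be wrapped in a small function. This is only a sketch: split_keep_separator and its separators parameter are hypothetical names, not part of the re module. Unlike the snippet above, it also drops empty pieces, which appear when the text ends with a separator.

```python
import re

def split_keep_separator(text, separators=".?!"):
    """Hypothetical helper: split text after any single character
    in `separators`, keeping that character on the preceding piece."""
    # Build a capturing character class so re.split returns the separators
    pattern = "([" + re.escape(separators) + "])"
    parts = re.split(pattern, text)
    # parts alternates text, separator, text, ...; join them pairwise
    joined = ["".join(parts[i:i + 2]) for i in range(0, len(parts), 2)]
    # Drop empty pieces (e.g. the trailing '' when text ends with a separator)
    return [s for s in joined if s]

print(split_keep_separator("! This is an example.A particular case."))
```

Passing a different separators string, for example ";", splits on that character instead.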