Python: Strip Everything But Spaces And Alphanumeric
I have a large string with brackets, commas, and similar characters. I want to strip all of those characters but keep the spacing. How can I do this? As of now I am using strippedList = re.sub(r'
Solution 1:
re.sub(r'([^\s\w]|_)+', '', origList)
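The `|_` alternative is there because `\w` already counts the underscore as a word character, so `[^\s\w]` alone would leave underscores in place. A minimal sketch of the difference, using a made-up sample string (`orig` is illustrative and not from the question):

```python
import re

orig = "[foo_bar], (baz)  qux_1!"  # hypothetical sample input

# [^\s\w] matches anything that is neither whitespace nor a word character;
# the extra |_ also removes underscores, which \w would otherwise keep.
without_underscore_rule = re.sub(r'[^\s\w]+', '', orig)
with_underscore_rule = re.sub(r'([^\s\w]|_)+', '', orig)

print(without_underscore_rule)
print(with_underscore_rule)
```

In both cases the runs of spaces are left untouched, which is what the question asks for.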
Solution 2:
A bit faster implementation:
import re
pattern = re.compile(r'([^\s\w]|_)+')
strippedList = pattern.sub('', value)
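Compiling mainly pays off when the same pattern is applied many times, for example across a list of strings. A sketch under that assumption; `strip_punctuation`, `PATTERN`, and the sample list are hypothetical names chosen for illustration:

```python
import re

# Compile once, then reuse the compiled object for every string.
PATTERN = re.compile(r'([^\s\w]|_)+')

def strip_punctuation(value):
    """Remove everything except whitespace, letters, and digits."""
    return PATTERN.sub('', value)

samples = ["[a, b]", "c_d (e)"]  # hypothetical inputs
cleaned = [strip_punctuation(s) for s in samples]
print(cleaned)
```

The `re` module also caches recently used patterns internally, so the gain over calling `re.sub` directly is likely to be modest.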
Solution 3:
The regular-expression based versions might be faster (especially if you switch to using a compiled expression), but I like this for clarity:
import string
"".join([c for c in origList if c in string.ascii_letters + string.digits + string.whitespace])
It's a bit weird with the join() call, but I think that is pretty idiomatic Python for converting a list of characters into a string.
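The same filter can also be written without the `string` module, using the built-in `str.isalnum()` and `str.isspace()` methods. This is an alternative sketch, not the answer's exact method: on Python 3 these methods are Unicode-aware, so accented letters are kept too. `orig_list` is a hypothetical sample.

```python
orig_list = "[héllo, wörld] (42)"  # hypothetical sample input

# Keep a character only if it is alphanumeric or whitespace.
stripped = "".join(c for c in orig_list if c.isalnum() or c.isspace())
print(stripped)
```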
Solution 4:
Demonstrating what characters you will get in the result:
>>> import re
>>> s = ''.join(chr(i) for i in range(256))  # all possible bytes
>>> re.sub(r'[^\s\w_]+', '', s)  # what will remain
'\t\n\x0b\x0c\r 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz'
Docs: re.sub, Regex HOWTO: Matching Characters, Regex HOWTO: Repeating Things
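Two things are worth noting about this demonstration. The pattern `[^\s\w_]` keeps the underscore, unlike Solutions 1 and 2. The output shown also appears to come from Python 2 byte strings: on Python 3, `str` patterns are Unicode-aware by default, so `\w` and `\s` also match non-ASCII letters and spaces among the first 256 code points. Passing `re.ASCII` should restore the ASCII-only behaviour. A small sketch to check this (the variable names are illustrative):

```python
import re

s = ''.join(chr(i) for i in range(256))  # first 256 code points

# ASCII-only matching for \s and \w, as in the output shown above.
ascii_only = re.sub(r'[^\s\w_]+', '', s, flags=re.ASCII)
# Default Python 3 behaviour: Unicode-aware \s and \w.
unicode_aware = re.sub(r'[^\s\w_]+', '', s)

print(repr(ascii_only))
# The Unicode-aware result keeps extra characters such as chr(233) ('é').
print(len(unicode_aware) > len(ascii_only))
```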