String Count With Overlapping Occurrences
What's the best way to count the number of occurrences of a given string, including overlaps, in Python? This is one way:
def function(string, str_to_search_for):
    count = 0
Solution 1:
Well, this might be faster since it does the comparing in C:
def occurrences(string, sub):
    count = start = 0
    while True:
        start = string.find(sub, start) + 1
        if start > 0:
            count += 1
        else:
            return count
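The loop above finds overlaps because each search restarts one character past the start of the previous match, not past its end. Here is a minimal sketch of the same idea that yields the match positions instead of only counting them. The name overlapping_positions is made up for illustration and is not part of the original answer.

```python
def overlapping_positions(string, sub):
    """Yield the start index of every occurrence of sub, overlaps included."""
    start = string.find(sub)
    while start != -1:
        yield start
        # Resume one character after the last match start (not after its end),
        # which is what allows overlapping matches to be found.
        start = string.find(sub, start + 1)


print(list(overlapping_positions('1011101111', '11')))  # [2, 3, 6, 7, 8]
```

Taking len() of that list gives the same count as occurrences().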
Solution 2:
>>> import re
>>> text = '1011101111'
>>> len(re.findall('(?=11)', text))
5
If you didn't want to load the whole list of matches into memory (which would almost never be a problem), you could do this if you really wanted:
>>> sum(1 for _ in re.finditer('(?=11)', text))
5
As a function (re.escape makes sure the substring doesn't interfere with the regex):
>>> def occurrences(text, sub):
...     return len(re.findall('(?={0})'.format(re.escape(sub)), text))
...
>>> occurrences(text, '11')
5
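To see why the re.escape call matters, here is a small sketch with a substring that happens to be a regex metacharacter. The sample text 'a.b.a.b' is made up for illustration.

```python
import re

text = 'a.b.a.b'
sub = '.'

# Unescaped, '.' is a wildcard, so the lookahead matches before every character.
unescaped = len(re.findall('(?={0})'.format(sub), text))

# Escaped, '.' only matches a literal dot.
escaped = len(re.findall('(?={0})'.format(re.escape(sub)), text))

print(unescaped, escaped)  # 7 3
```

Only the escaped version counts the three literal dots.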
Solution 3:
You can also try the third-party regex module (installable from PyPI), which supports overlapping matches.
import regex as re

def count_overlapping(text, search_for):
    return len(re.findall(search_for, text, overlapped=True))

count_overlapping('1011101111', '11')  # 5
Solution 4:
Python's str.count counts non-overlapping substrings:
In [3]: "ababa".count("aba")
Out[3]: 1
Here are a few ways to count overlapping sequences; I'm sure there are many more :)
Look-ahead regular expressions
How to find overlapping matches with a regexp?
In [10]: re.findall("a(?=ba)", "ababa")
Out[10]: ['a', 'a']
Generate all substrings
In [11]: data = "ababa"
In [17]: sum(1 for i in range(len(data)) if data.startswith("aba", i))
Out[17]: 2
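For a side-by-side comparison with str.count, the startswith expression can be wrapped in a function. This is just a sketch, and the name count_overlapping_startswith is made up for illustration.

```python
def count_overlapping_startswith(data, sub):
    """Count occurrences of sub in data, overlaps included."""
    # str.startswith(sub, i) tests for a match at offset i without slicing data.
    return sum(1 for i in range(len(data)) if data.startswith(sub, i))


data = "ababa"
print(data.count("aba"))                          # 1 (non-overlapping)
print(count_overlapping_startswith(data, "aba"))  # 2 (overlapping)
```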
Solution 5:
def count_substring(string, sub_string):
    count = 0
    for pos in range(len(string)):
        if string[pos:].startswith(sub_string):
            count += 1
    return count
This could be the easiest way.
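If you want some reassurance that the different approaches on this page agree, a brute-force cross-check is cheap to write. This is only a sketch: the function names are made up, the startswith version uses an offset instead of a slice, and the empty substring is left out on purpose because the approaches do not all return the same count for it.

```python
import re
from itertools import product


def count_find(string, sub):
    # str.find style: restart the search one character past each match start.
    count = start = 0
    while True:
        start = string.find(sub, start) + 1
        if start > 0:
            count += 1
        else:
            return count


def count_lookahead(string, sub):
    # Regex style: zero-width lookahead, with the substring escaped.
    return len(re.findall('(?={0})'.format(re.escape(sub)), string))


def count_startswith(string, sub):
    # startswith style: test every start position.
    return sum(1 for pos in range(len(string)) if string.startswith(sub, pos))


# Compare the three on every string over 'ab' up to length 6
# and every non-empty substring over 'ab' up to length 3.
mismatches = []
for n in range(7):
    for chars in product('ab', repeat=n):
        s = ''.join(chars)
        for m in range(1, 4):
            for sub_chars in product('ab', repeat=m):
                sub = ''.join(sub_chars)
                counts = {count_find(s, sub),
                          count_lookahead(s, sub),
                          count_startswith(s, sub)}
                if len(counts) != 1:
                    mismatches.append((s, sub))

print(mismatches)  # [] when all three approaches agree
```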