Counting The Number Of Unique Words In A List
Solution 1:
The best way to solve this is to use the set collection type. A set is a collection in which all elements are unique. Therefore:
unique = set([ 'one', 'two', 'two'])
len(unique) # is 2
You can use a set from the outset, adding words to it as you go:
unique.add('three')
This will throw out any duplicates as they are added. Or, you can collect all the elements in a list and pass the list to the set() function, which will remove the duplicates at that time. The example I provided above shows this pattern:
unique = set([ 'one', 'two', 'two'])
unique.add('three')
# unique now contains {'one', 'two', 'three'}
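As a rough sketch of the "add as you go" approach, assuming the words come from splitting a string on whitespace (the sample text and the variable names are made up for illustration):

```python
# Hypothetical input; any iterable of words would work the same way.
text = "one two two three one"

unique = set()
for word in text.split():
    unique.add(word)  # adding a word that is already present has no effect

print(len(unique))  # 3
```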
Solution 2:
You have many options for this. I recommend a set, but you can also use a Counter, which counts how many times each word shows up, or you can look at the number of keys in the dictionary you made.
Set
You can convert the list to a set, where all elements have to be unique. Duplicate elements are discarded:
helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']
helloSet = set(helloString) #=> {'doing', 'how', 'are', 'world', 'you', 'hello', 'today'} (a set; order may vary)
uniqueWordCount = len(set(helloString)) #=> 7
Here's a link to further reading on sets
Counter
You can also use a Counter, which additionally tells you how often each word was used, if you still need that information.
from collections import Counter
helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']
counter = Counter(helloString)
len(counter) #=> 7
counter["world"] #=> 2
Loop
At the end of your loop, you can check the len of count. Also, you mistyped helloString as words:
uniqueWordCount = 0
helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']
count = {}
for word in helloString:
    if word in count:
        count[word] += 1
    else:
        count[word] = 1
len(count) #=> 7
Solution 3:
You can use collections.Counter
helloString = ['hello', 'world', 'world']
from collections import Counter
c = Counter(helloString)
print("There are {} unique words".format(len(c)))
print('They are')
for k, v in c.items():
    print(k)
I know the question doesn't specifically ask for this, but to maintain the order in which the words first appear:
helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    pass
c = OrderedCounter(helloString)
print("There are {} unique words".format(len(c)))
print('They are')
for k, v in c.items():
    print(k)
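On Python 3.7 and later, plain dicts keep insertion order and Counter is a dict subclass, so the extra class may not be needed there. A minimal sketch, assuming "order" means order of first appearance (the name ordered_unique is just for illustration):

```python
from collections import Counter

helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']

# dict.fromkeys keeps one key per word, in order of first appearance (Python 3.7+).
ordered_unique = list(dict.fromkeys(helloString))
print(len(ordered_unique))  # 7
print(ordered_unique)       # ['hello', 'world', 'how', 'are', 'you', 'doing', 'today']

# A plain Counter iterates in the same first-seen order on Python 3.7+.
c = Counter(helloString)
print(list(c) == ordered_unique)  # True
```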
Solution 4:
In your current code you can either increment uniqueWordCount in the else case where you already set count[word], or just look up the number of keys in the dictionary: len(count).
If you only want to know the number of unique elements, then take the length of the set: len(set(helloString))
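The increment-in-else option could look roughly like this. The loop is a reconstruction that assumes the question builds a count dictionary over helloString; it is not quoted from the question itself:

```python
helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']

count = {}
uniqueWordCount = 0
for word in helloString:
    if word in count:
        count[word] += 1
    else:
        count[word] = 1
        uniqueWordCount += 1  # first time this word has been seen

print(uniqueWordCount)        # 7
print(len(count))             # 7, the same number without a separate counter
print(len(set(helloString)))  # 7
```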
Solution 5:
I would do this using a set.
def stuff(helloString):
    hello_set = set(helloString)
    return len(hello_set)