
How To Find Elements That Are Common To All Lists In A Nested List?

I have a large nested list and each list within the nested list contains a list of numbers that are formatted as floats. However every individual list in the nested list is the sa

Solution 1:

You can use reduce and set.intersection:

>>> reduce(set.intersection, map(set, nested_list))
set([2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0])

Use itertools.imap instead of map for a more memory-efficient solution: it builds each set lazily instead of creating a full list of sets first.
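On Python 3, reduce lives in functools and map is already lazy, so itertools.imap is no longer needed. A minimal sketch under that assumption, using a small made-up sample (the names nested_list and common are only illustrative):

```python
from functools import reduce  # reduce is not a builtin on Python 3

# Small made-up sample, not the data from the question.
nested_list = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 3.0, 4.0, 5.0],
    [1.0, 2.0, 3.0, 4.0],
]

# map() returns a lazy iterator on Python 3, so each inner list is turned
# into a set only when reduce() asks for the next item.
common = reduce(set.intersection, map(set, nested_list))
print(sorted(common))  # [2.0, 3.0, 4.0]
```

Note that reduce raises TypeError on an empty nested_list unless an initial value is supplied.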

Timing Comparisons:

>>>from itertools import imap, islice
>>>lis = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]
>>>%timeit set.intersection(*map(set, lis))
100000 loops, best of 3: 12.5 us per loop
>>>%timeit set.intersection(*(set(e) for e in lis))
10000 loops, best of 3: 14.4 us per loop
>>>%timeit reduce(set.intersection, map(set, lis))
10000 loops, best of 3: 12.8 us per loop
>>>%timeit reduce(set.intersection, imap(set, lis))
100000 loops, best of 3: 13.1 us per loop
>>>%timeit set.intersection(set(lis[0]), *islice(lis, 1, None))
100000 loops, best of 3: 10.6 us per loop


>>>lis = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]*1000
>>>%timeit set.intersection(*map(set, lis))
10 loops, best of 3: 16.4 ms per loop
>>>%timeit set.intersection(*(set(e) for e in lis))
10 loops, best of 3: 15.8 ms per loop
>>>%timeit reduce(set.intersection, map(set, lis))
100 loops, best of 3: 16.3 ms per loop
>>>%timeit reduce(set.intersection, imap(set, lis))
10 loops, best of 3: 13.8 ms per loop
>>>%timeit set.intersection(set(lis[0]), *islice(lis, 1, None))
100 loops, best of 3: 8.4 ms per loop


>>>lis = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]*10**5
>>>%timeit set.intersection(*map(set, lis))
1 loops, best of 3: 1.92 s per loop
>>>%timeit set.intersection(*(set(e) for e in lis))
1 loops, best of 3: 2.17 s per loop
>>>%timeit reduce(set.intersection, map(set, lis))
1 loops, best of 3: 2.14 s per loop
>>>%timeit reduce(set.intersection, imap(set, lis))
1 loops, best of 3: 1.52 s per loop
>>>%timeit set.intersection(set(lis[0]), *islice(lis, 1, None))
1 loops, best of 3: 913 ms per loop

Conclusion:

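Steven Rumbalski's solution (Solution 3, the islice variant timed last in each run) is the fastest in all three runs, and its lead widens on larger inputs: 10.6 us versus 12.5 to 14.4 us for 4 lists, 8.4 ms versus 13.8 to 16.4 ms for 4,000 lists, and 913 ms versus 1.52 to 2.17 s for 400,000 lists.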
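The %timeit lines above need IPython. Outside IPython, a comparable measurement can be sketched with the standard timeit module. This is a rough Python 3 harness, not the original benchmark: the candidates dict, its labels, and the repeat counts are assumptions, and absolute numbers will differ by machine and Python version.

```python
import timeit
from functools import reduce
from itertools import islice

# Same four inner lists as in the timings above, repeated 1000 times.
lis = [
    [float(n) for n in range(1, 16)],  # 1.0 .. 15.0
    [float(n) for n in range(2, 15)],  # 2.0 .. 14.0
    [float(n) for n in range(1, 15)],  # 1.0 .. 14.0
    [float(n) for n in range(2, 16)],  # 2.0 .. 15.0
] * 1000

# Hypothetical labels; imap is omitted because map is already lazy on Python 3.
candidates = {
    "unpack map": lambda: set.intersection(*map(set, lis)),
    "unpack genexp": lambda: set.intersection(*(set(e) for e in lis)),
    "reduce": lambda: reduce(set.intersection, map(set, lis)),
    "first set + islice": lambda: set.intersection(set(lis[0]), *islice(lis, 1, None)),
}

# Check that every variant returns the same answer before timing it.
results = {name: fn() for name, fn in candidates.items()}

# Best of 3 runs, 5 calls per run, reported per call.
timings = {
    name: min(timeit.repeat(fn, number=5, repeat=3)) / 5
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print("%-20s %.2f ms per loop" % (name, seconds * 1e3))
```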

Solution 2:

Try this, it's the simplest solution:

set.intersection(*map(set, nested_list))

Or, if you prefer a generator expression (note that argument unpacking with * still builds every set before the call, so any saving in memory is small at best):

set.intersection(*(set(e) for e in nested_list))
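One edge case worth handling: with an empty nested_list the unpacked call has no arguments and raises TypeError. A minimal Python 3 sketch of a guarded wrapper follows; the function name common_elements and the choice to return an empty set for empty input are assumptions, not part of the original answer.

```python
def common_elements(nested_list):
    """Return the set of values present in every inner list."""
    if not nested_list:
        # set.intersection() with no arguments raises TypeError, so an
        # empty input is mapped to an empty set here (assumed convention).
        return set()
    return set.intersection(*map(set, nested_list))


print(common_elements([[1.0, 2.0], [2.0, 3.0]]))  # {2.0}
print(common_elements([]))                        # set()
```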

Solution 3:

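Ashwini Chaudhary's solution is elegant, but could be quite inefficient for large inputs because it creates many intermediate sets. set.intersection accepts arbitrary iterables as its other arguments, so only the first list needs to be converted to a set. If your nested_list is large, do this: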

>>> import itertools
>>> set.intersection(set(nested_list[0]), *itertools.islice(nested_list, 1, None))
set([2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0])
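The same idea can be written as a method call on the first set, which makes it easier to see that the remaining lists are passed through unconverted. A short Python 3 sketch using the sample data from this page (the names first, rest, and common are illustrative); it assumes nested_list is non-empty, since nested_list[0] raises IndexError otherwise.

```python
from itertools import islice

nested_list = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0],
    [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
]

# Only the first inner list is converted to a set explicitly.
first = set(nested_list[0])

# islice skips index 0 without copying the outer list.
rest = islice(nested_list, 1, None)

# intersection() accepts plain lists (any iterables) as its arguments.
common = first.intersection(*rest)
print(sorted(common))
```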

Solution 4:

Count the occurrences of each element across the sets built from the lists in nested_list; if an element's count equals the number of lists in nested_list, it is common to all of them. You do not need the set conversion if no inner list contains repeated numbers.

nested_list = [[1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0],
              [2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0]]

from collections import Counter
result = [val for val, cnt in Counter([x for t in nested_list for x in set(t)]).items() if cnt == len(nested_list)]
print(result)

#  [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]
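To see why the set conversion matters when an inner list repeats a value, here is a small Python 3 sketch. The function common_by_count and its dedupe flag are hypothetical names introduced only for this illustration, and the data is made up.

```python
from collections import Counter


def common_by_count(nested_list, dedupe=True):
    """Return values whose count equals the number of inner lists."""
    counts = Counter(
        x
        for inner in nested_list
        # With dedupe=True each inner list contributes a value at most once.
        for x in (set(inner) if dedupe else inner)
    )
    return [val for val, cnt in counts.items() if cnt == len(nested_list)]


data = [[1.0, 1.0, 2.0], [2.0, 3.0]]
print(common_by_count(data))                # [2.0]
print(common_by_count(data, dedupe=False))  # [1.0, 2.0]
```

Without deduplication, 1.0 is counted twice from the first list alone and is wrongly reported as common to both lists.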
