
Filtering Products Based On Description Scenarios And Status In Python Pandas

Let's say I have the following product descriptions in a Pandas DataFrame. I would like to keep all product descriptions of products that satisfy the following condition: For ever

Solution 1:

Use:

#create a dictionary of scenarios, keyed 'scenario1', 'scenario2', ...
d = {'scenario{}'.format(k):v for k, v in enumerate(scenario_descriptions, 1)}

#unique id for reindex
uniq_id = df['id'].unique()

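def f(x):
    #check that the group contains every description of the current scenario
    #(v is the global loop variable set in the for loop below)
    c = set(x['description']) >= set(v)
    #check that every status is 4, 5 or 6
    d = x['status'].isin([4,5,6]).all()
    return (c & d)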

d1 = {}
for k, v in d.items():
    #filter df by scenario first to remove irrelevant rows
    a = df[df['description'].isin(v)]
    #call groupby with the custom function
    b = a.groupby('id').apply(f)
    #add missing ids, fill them with False
    #and store the result in the output dictionary
    d1[k] = b.reindex(uniq_id, fill_value=False)

print (d1)
{'scenario1': id
1    False
2    False
dtype: bool, 'scenario4': id
1    False
2    False
dtype: bool, 'scenario5': id
1     True
2    False
dtype: bool, 'scenario3': id
1     True
2    False
dtype: bool, 'scenario2': id
1     True
2    False
dtype: bool}

#reduce dict to DataFrame and check at least one True per row
m = pd.concat(d1, axis=1).any(axis=1)
print (m)
id
1     True
2    False
dtype: bool

#last filtering
df = df[df['id'].isin(m.index[m])]
print (df)
    id description  status
0    1      world1       1
1    1      world2       4
2    1      world3       1
3    1      world4       4
4    1      world5       4
5    1      world6       4
6    1      world7       1
7    1      world8       4
8    1      world9       4
9    1     world10       4
10   1     world11       4
11   1     world12       4
12   1     world13       4
13   1     world14       4
14   1     world15       1
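
The steps of Solution 1 can be tried end to end with a small self-contained sketch. The DataFrame and the two scenarios below are hypothetical toy data, not the data from the question, and the helper name `matches` is made up for illustration. The sketch also swaps `groupby(...).apply(f)` for an explicit loop over the groups so that the scenario is passed as an argument instead of being read from a global variable.

```python
import pandas as pd

# Hypothetical toy data (NOT the DataFrame from the question).
df = pd.DataFrame({
    'id':          [1, 1, 1, 2, 2, 2],
    'description': ['world1', 'world2', 'world3', 'world1', 'world2', 'world3'],
    'status':      [1, 4, 4, 4, 1, 4],
})

# Hypothetical scenarios: each one lists descriptions that must all be present.
scenario_descriptions = [['world1', 'world2'], ['world2', 'world3']]


def matches(group, wanted):
    """True if the group has every wanted description and all statuses are 4, 5 or 6."""
    has_all = set(wanted) <= set(group['description'])
    status_ok = group['status'].isin([4, 5, 6]).all()
    return bool(has_all and status_ok)


uniq_id = df['id'].unique()
results = {}
for i, wanted in enumerate(scenario_descriptions, 1):
    # Keep only the rows that belong to this scenario.
    relevant = df[df['description'].isin(wanted)]
    # One boolean per id that still has rows after the filter.
    per_id = pd.Series(
        {key: matches(g, wanted) for key, g in relevant.groupby('id')},
        dtype=bool,
    )
    # Ids that dropped out of the filter count as False.
    results['scenario{}'.format(i)] = per_id.reindex(uniq_id, fill_value=False)

# An id is kept if at least one scenario is satisfied.
mask = pd.concat(results, axis=1).any(axis=1)
kept = df[df['id'].isin(mask[mask].index)]
print(kept)
```

With this toy data only id 1 survives: its `world2` and `world3` rows both have status 4, which satisfies the second scenario, while id 2 has a status 1 row in each scenario.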

Solution 2:

Use

In [260]: product_descriptions.groupby('id').filter(
     ...:   lambda x: all(any(w in x.description.values for w in L)
     ...:                 for L in scenario_descriptions))
Out[260]:
    id description  status
0    1      world1       1
1    1      world2       4
2    1      world3       1
3    1      world4       4
4    1      world5       4
5    1      world6       4
6    1      world7       1
7    1      world8       4
8    1      world9       4
9    1     world10       4
10   1     world11       4
11   1     world12       4
12   1     world13       4
13   1     world14       4
14   1     world15       1
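
For readers who want to run the `groupby(...).filter(...)` approach outside an IPython session, here is a minimal sketch. The data and the helper name `covers_every_scenario` are hypothetical. Note that, as written in Solution 2, the lambda keeps an id when it has at least one description from every scenario, and it does not look at the `status` column, so it is not necessarily equivalent to Solution 1.

```python
import pandas as pd

# Hypothetical toy data (NOT the DataFrame from the question).
product_descriptions = pd.DataFrame({
    'id':          [1, 1, 2, 2],
    'description': ['world1', 'world3', 'world1', 'world5'],
    'status':      [4, 4, 4, 4],
})

# Hypothetical scenarios, each a list of acceptable descriptions.
scenario_descriptions = [['world1', 'world2'], ['world3', 'world4']]


def covers_every_scenario(group):
    """True if the group has at least one description from each scenario."""
    present = set(group['description'])
    return all(
        any(w in present for w in scenario)
        for scenario in scenario_descriptions
    )


# filter() keeps all rows of every group for which the function returns True.
filtered = product_descriptions.groupby('id').filter(covers_every_scenario)
print(filtered)
```

Here id 1 is kept because it has `world1` (first scenario) and `world3` (second scenario); id 2 is dropped because none of its descriptions appear in the second scenario.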
