Filtering Products Based On Description Scenarios And Status In Python Pandas
Let's say I have the following product descriptions in a Pandas DataFrame. I would like to keep all product descriptions of products that satisfy the following condition: For ever
Solution 1:
Use:
# create a dictionary of scenarios
d = {'scenario{}'.format(k): v for k, v in enumerate(scenario_descriptions, 1)}
#unique id for reindex
uniq_id = df['id'].unique()
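def f(x):
    # v is the current scenario's list of descriptions, read from the loop below
    # check whether the group contains every description of the scenario
    c = set(x['description']) >= set(v)
    # check whether every status is 4, 5 or 6
    d = x['status'].isin([4, 5, 6]).all()
    return c & d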
d1 = {}
for k, v in d.items():
    # filter df by scenario first to remove irrelevant rows
    a = df[df['description'].isin(v)]
    # call groupby with the custom function
    b = a.groupby('id').apply(f)
    # add missing ids and fill them with False
    # output to dictionary
    d1[k] = b.reindex(uniq_id, fill_value=False)
print (d1)
{'scenario1': id
1    False
2    False
dtype: bool, 'scenario4': id
1    False
2    False
dtype: bool, 'scenario5': id
1     True
2    False
dtype: bool, 'scenario3': id
1     True
2    False
dtype: bool, 'scenario2': id
1     True
2    False
dtype: bool}
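The check c in the function f relies on Python's set comparison: A >= B is True only when A contains every element of B. A minimal sketch of that test alone, using made-up description lists (scenario, complete and partial are hypothetical names, not part of the original data):

```python
# Hypothetical scenario and two hypothetical products' descriptions.
scenario = ["world2", "world4"]
complete = ["world1", "world2", "world4"]
partial = ["world1", "world2"]

# set(a) >= set(b) is the superset test: every element of b is in a.
has_all_complete = set(complete) >= set(scenario)
has_all_partial = set(partial) >= set(scenario)

print(has_all_complete, has_all_partial)
```

Because Solution 1 first filters the rows down to the scenario's own descriptions, the superset test there amounts to asking whether none of the scenario's descriptions is missing for that id.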
#reduce dict to DataFrame and check at least one True per row
m = pd.concat(d1, axis=1).any(axis=1)
print (m)
id
1     True
2    False
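The reduction step can be tried in isolation. The sketch below assumes two hypothetical per-scenario boolean Series (the values are invented for illustration) and shows how pd.concat with axis=1 turns the dictionary into one column per scenario, and how any(axis=1) collapses it to one flag per id:

```python
import pandas as pd

# Hypothetical per-scenario results, indexed by product id.
ids = pd.Index([1, 2], name="id")
results = {
    "scenario1": pd.Series([False, False], index=ids),
    "scenario2": pd.Series([True, False], index=ids),
}

# Concatenating a dict of Series along axis=1 gives one column per key.
table = pd.concat(results, axis=1)

# True for every id that passes at least one scenario.
mask = table.any(axis=1)

# Indexing the index with the boolean mask keeps only the passing ids.
passing_ids = [int(i) for i in mask.index[mask]]
print(passing_ids)
```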
#last filtering
df = df[df['id'].isin(m.index[m])]
print (df)
id description status
0 1 world1 1
1 1 world2 4
2 1 world3 1
3 1 world4 4
4 1 world5 4
5 1 world6 4
6 1 world7 1
7 1 world8 4
8 1 world9 4
9 1 world10 4
10 1 world11 4
11 1 world12 4
12 1 world13 4
13 1 world14 4
14 1 world15 1
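For experimenting with the idea end to end, here is a compact variant sketch. It is not the code above: it passes the scenario to the helper explicitly instead of reading the loop variable v, and it loops over the groups rather than calling apply. The sample data, the helper name passes, and the names valid_status and keep_ids are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; the question's real data is not reproduced here.
df = pd.DataFrame({
    "id":          [1, 1, 1, 2, 2, 2],
    "description": ["world1", "world2", "world3", "world1", "world2", "world3"],
    "status":      [1, 4, 5, 1, 4, 2],
})
scenario_descriptions = [["world1", "world2"], ["world2", "world3"]]
valid_status = [4, 5, 6]


def passes(group, wanted):
    """True if the group has every wanted description, each with a valid status."""
    rows = group[group["description"].isin(wanted)]
    has_all = set(rows["description"]) >= set(wanted)
    all_valid = rows["status"].isin(valid_status).all()
    return bool(has_all and all_valid)


# Keep an id if it satisfies at least one scenario.
keep_ids = [
    int(pid)
    for pid, group in df.groupby("id")
    if any(passes(group, wanted) for wanted in scenario_descriptions)
]

result = df[df["id"].isin(keep_ids)]
print(result)
```

In this made-up data, id 1 satisfies the second scenario (world2 and world3 both have a valid status), while id 2 fails both, so only the rows of id 1 remain.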
Solution 2:
Use:
In [260]: product_descriptions.groupby('id').filter(
...: lambda x: all(any(w in x.description.values for w in L)
...: for L in scenario_descriptions))
Out[260]:
id description status
0 1 world1 1
1 1 world2 4
2 1 world3 1
3 1 world4 4
4 1 world5 4
5 1 world6 4
6 1 world7 1
7 1 world8 4
8 1 world9 4
9 1 world10 4
10 1 world11 4
11 1 world12 4
12 1 world13 4
13 1 world14 4
14 1 world15 1
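As written, the lambda in Solution 2 keeps a group when, for every scenario list, at least one of its descriptions occurs in the group; it does not look at the status column. A small self-contained sketch of the same groupby(...).filter(...) call, on invented data chosen so that one id passes and one does not:

```python
import pandas as pd

# Hypothetical data: id 1 matches both scenarios, id 2 only the first.
product_descriptions = pd.DataFrame({
    "id":          [1, 1, 2],
    "description": ["world1", "world3", "world1"],
    "status":      [4, 4, 4],
})
scenario_descriptions = [["world1", "world2"], ["world3", "world4"]]

# filter() keeps all rows of every group for which the function returns True.
kept = product_descriptions.groupby("id").filter(
    lambda x: all(
        any(w in x["description"].values for w in L)
        for L in scenario_descriptions
    )
)
print(kept)
```

If the status condition from Solution 1 is also needed, it would have to be added to the lambda; the sketch deliberately leaves the original logic unchanged.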