Adding New Column With Condition
I would need to manage a data frame by adding more columns. My sample of data, with headers `Date` and `Sentence`, is:
Date Sentence
28 Jan who.c
30 Jan house.a
02 Feb eurolet.it
I woul
Solution 1:
Can you convert your `original` and `country` lists into dicts?
original = [('a', 'apartment'), ('b', 'bungalow'), ('c', 'church')]
original = {x:y for x,y in original}
country = [('UK', 'United Kingdom'), ('IT', 'Italy'), ('DE', 'Germany'), ('H', 'Holland'), ..., ('F', 'France'), ('S', 'Spain')]
country = {x:y for x,y in country}
Now you can perform the same task as:
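# Look up the text after the last dot (sen[-1] would only give the final character, e.g. 't' for 'eurolet.it').
# The country keys are upper-case, so the suffix is upper-cased for that lookup.
df['Tp'] = df['Sentence'].apply(lambda sen: original.get(sen.rsplit('.', 1)[-1], country.get(sen.rsplit('.', 1)[-1].upper(), 'unknown')))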
In your code, the number of elements in `conditions` must be the same as in `choices` (and, by extension, in `original` and `country`).
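The code from the question is not shown here, so the following is only a sketch of the length requirement, assuming it used `numpy.select`. The `lookup` dict is a shortened, made-up table for illustration; `conditions` and `choices` are built from it in the same order, so they always have the same length.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Sentence": ["who.c", "house.a", "eurolet.it"]})

# Shortened lookup table, for illustration only (not the full lists from the question).
lookup = {"a": "apartment", "c": "church", "it": "Italy"}

# One boolean condition per key and one choice per key, in the same order.
conditions = [df["Sentence"].str.endswith("." + key) for key in lookup]
choices = list(lookup.values())

df["Tp"] = np.select(conditions, choices, default="unknown")

# Dropping one choice makes the two lists differ in length, which np.select rejects.
try:
    np.select(conditions, choices[:-1], default="unknown")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Building both lists from one dict is just one way to keep them in step; the dict-based lookup in this answer avoids the two parallel lists altogether.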
Solution 2:
First, let's make some dictionaries from your tuples and combine them:
country = {k.lower() : v for (k,v) in country}
og = {k : v for (k,v) in original}
country.update(og)
print(country)
{'uk': 'United Kingdom',
'it': 'Italy',
'de': 'Germany',
'h': 'Holland',
'f': 'France',
's': 'Spain',
'a': 'apartment',
'b': 'bungalow',
'c': 'church'}
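One thing to watch when combining: `update` silently overwrites any key that exists in both dictionaries, so a suffix shared by a lower-cased country code and a building code would lose one meaning. A minimal sketch of a check before merging, using shortened lists and the hypothetical names `country_map`, `original_map` and `combined`:

```python
# Shortened versions of the lists from the question, for illustration only.
country = [("UK", "United Kingdom"), ("IT", "Italy"), ("S", "Spain")]
original = [("a", "apartment"), ("b", "bungalow"), ("c", "church")]

country_map = {k.lower(): v for k, v in country}
original_map = dict(original)

# Keys present in both mappings would be overwritten by a plain update().
overlap = country_map.keys() & original_map.keys()
if overlap:
    raise ValueError(f"ambiguous suffixes: {sorted(overlap)}")

# No overlap, so merging is safe and the order of the two dicts does not matter.
combined = {**country_map, **original_map}
```

With the sample keys there is no overlap, so the merge goes through unchanged.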
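Then let's split on the full stop and keep the piece at the highest position, so that earlier full stops in your text are ignored and only the final element is used (this relies on every row containing the same number of full stops). Finally, we use `.map` to associate your values.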
df['value'] = df["Sentence"].str.split(".", expand=True).stack().reset_index(1).query(
"level_1 == level_1.max()"
)[0].map(country)
print(df)
Date Sentence value
0 28 Jan who.c church
1 30 Jan house.a apartment
2 02 Feb eurolet.it Italy
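If the rows can contain different numbers of full stops, a per-row split from the right may be simpler than the stack and query chain. This is an alternative sketch, not the approach used above; the fourth row and the shortened `country` dict are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Date": ["28 Jan", "30 Jan", "02 Feb", "03 Feb"],
        # The last row is hypothetical: it has two full stops instead of one.
        "Sentence": ["who.c", "house.a", "eurolet.it", "my.house.b"],
    }
)

# Shortened combined lookup, keyed by lower-case suffix.
country = {"it": "Italy", "a": "apartment", "b": "bungalow", "c": "church"}

# Split each string once, from the right, and keep the last piece.
suffix = df["Sentence"].str.rsplit(".", n=1).str[-1]

# Suffixes missing from the dict become NaN.
df["value"] = suffix.map(country)
```

Because the split is done row by row, each sentence is judged by its own last piece regardless of how many full stops the other rows have.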