Python users know how convenient it is to create a data frame from a dictionary. For example:
import pandas as pd
df = pd.DataFrame({'Key': ['a', 'b', 'c', 'd'], 'Value': [1, 2, 3, 4]})
This works well when each value is a list (or dict) holding multiple entries. However, you may run into the error "ValueError: If using all scalar values, you must pass an index" when you try to convert the following dictionary to a data frame.
dict_test = {
    'bacon': 'pig',
    'pulled pork': 'pig',
    'pastrami': 'cow',
    'honey ham': 'pig',
    'nova lox': 'salmon'
}
df = pd.DataFrame.from_dict(dict_test)
Why is that?
When pandas creates a data frame from a dictionary, it expects each value to be list-like (a list, array, or Series) or a dict. If every value is a scalar, pandas cannot tell how many rows to build, so you also need to supply an index. In this example, the values are 'pig' instead of ['pig'].
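To see the rule in action, here is a minimal sketch that reproduces the error and then supplies the index pandas asks for. The two-entry dictionary and the variable names (scalars, error_message, one_row) are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical two-entry dictionary whose values are all scalars.
scalars = {'bacon': 'pig', 'pastrami': 'cow'}

# With only scalar values, pandas has no way to infer the number of rows.
try:
    pd.DataFrame(scalars)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Passing an index with one label tells pandas to build exactly one row.
one_row = pd.DataFrame(scalars, index=[0])
print(one_row.shape)
```

Supplying an index is the most direct answer to the error message, but it gives a single wide row with one column per key, which is often not the shape you want. The fixes in the next section cover the alternatives.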
How to fix it:
1. Change the data so that each value is a list:
dict_test = {
    'bacon': ['pig'],
    'pulled pork': ['pig'],
    'pastrami': ['cow'],
    'honey ham': ['pig'],
    'nova lox': ['salmon']
}
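2. Convert the dictionary's items into a list of (key, value) pairs. In Python 3.x, dict.items() returns a view, so wrap it in list(). Pass the pairs to the DataFrame constructor rather than from_dict, which rejects the columns argument under its default orientation.
pd.DataFrame(list(dict_test.items()), columns=['food', 'animal'])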
3. Specify the orientation as 'index', so that the keys become the row labels:
pd.DataFrame.from_dict(dict_test, orient='index')
4. Build a Series first, then convert it to a data frame:
s = pd.Series(dict_test, name='animal')
s.index.name = 'Food'
df = pd.DataFrame(s)
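The four fixes do not produce the same layout, so it helps to compare them side by side. The sketch below runs each one on a shortened, three-entry copy of the dictionary. The names wide, pairs, by_index and from_series are made up for illustration.

```python
import pandas as pd

# Shortened copy of the example dictionary (scalar values).
dict_test = {
    'bacon': 'pig',
    'pastrami': 'cow',
    'nova lox': 'salmon',
}

# Fix 1: wrap each scalar in a list -> one row, one column per key.
wide = pd.DataFrame({key: [value] for key, value in dict_test.items()})

# Fix 2: list of (key, value) pairs -> one row per key, two named columns.
pairs = pd.DataFrame(list(dict_test.items()), columns=['food', 'animal'])

# Fix 3: keys become the index; the single value column is labelled 0.
by_index = pd.DataFrame.from_dict(dict_test, orient='index')

# Fix 4: go through a Series -> keys as a named index, one named column.
s = pd.Series(dict_test, name='animal')
s.index.name = 'Food'
from_series = pd.DataFrame(s)

for label, frame in [('wide', wide), ('pairs', pairs),
                     ('by_index', by_index), ('from_series', from_series)]:
    print(label, frame.shape, list(frame.columns))
```

Fix 1 gives a single wide row, while fixes 2 to 4 give one row per key. Fix 2 keeps the keys as an ordinary column, whereas fixes 3 and 4 put them in the index. Which one to use depends on whether you want to look rows up by food name or treat the food name as regular data.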