Method details:
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
“Unpivots” a DataFrame from wide format to long format, optionally leaving identifier variables set.
How should “unpivot” be understood? It simply means turning columns into rows.
This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.
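As a minimal sketch of the default behaviour described above, the following uses a small made-up table (the `name`, `math`, and `physics` columns are assumptions for illustration, not from the pandas docs). Only `id_vars` is passed, so every remaining column is unpivoted into the two columns `variable` and `value`.

```python
import pandas as pd

# Hypothetical wide table: one row per student, one column per subject.
wide = pd.DataFrame({
    'name': ['Ann', 'Bob'],
    'math': [90, 75],
    'physics': [85, 60],
})

# Only id_vars is given, so all other columns become measured variables.
long = pd.melt(wide, id_vars=['name'])

print(long)
print(list(long.columns))  # ['name', 'variable', 'value']
print(len(long))           # 4 = 2 rows x 2 measured columns
```

Each original row appears once per measured column, which is why the long table has rows x measured-columns entries.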
frame : DataFrame
id_vars : tuple, list, or ndarray, optional
Column(s) to use as identifier variables. In effect, these act as the primary key.
value_vars : tuple, list, or ndarray, optional
Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars. In other words, this chooses which fields get turned into rows, and the result contains data from those fields only.
var_name : scalar
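Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’. In other words: after unpivoting, what should the column that collects the original column names be called? The default is variable.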
value_name : scalar, default ‘value’.
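Name to use for the ‘value’ column. In other words: after unpivoting, what should the column that collects the values from under those original columns be called? The default is value.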
col_level : int or string, optional
If columns are a MultiIndex then use this level to melt.
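To make the naming parameters concrete, here is a small sketch on made-up data (the `store`, `q1`, `q2` columns and the names `quarter` and `amount` are illustrative assumptions). It shows `var_name` and `value_name` renaming the two output columns, and `col_level` selecting which level of a MultiIndex header is used for melting.

```python
import pandas as pd

# Hypothetical sales table: one row per store, one column per quarter.
sales = pd.DataFrame({
    'store': ['s1', 's2'],
    'q1': [10, 30],
    'q2': [20, 40],
})

# var_name / value_name replace the default 'variable' / 'value' headers.
long_sales = pd.melt(sales, id_vars=['store'], value_vars=['q1', 'q2'],
                     var_name='quarter', value_name='amount')
print(long_sales)

# Give the same data a two-level column header, then melt on level 0.
multi = sales.copy()
multi.columns = [['store', 'q1', 'q2'], ['id', 'x', 'y']]
level0 = pd.melt(multi, col_level=0, id_vars=['store'], value_vars=['q1'])
print(level0)
```

With `col_level=0`, the labels passed to `id_vars` and `value_vars` are looked up in the first level of the column MultiIndex.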
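This is really the same idea as in my earlier data warehouse blog post on Hive data warehouse table design (short-wide tables + tall-narrow tables).
Example from the official documentation: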
import pandas as pd
df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
                   'B': {0: 1, 1: 3, 2: 5},
                   'C': {0: 2, 1: 4, 2: 6}})
df

pd.melt(df, id_vars=['A'], value_vars=['B'])

pd.melt(df, id_vars=['A'], value_vars=['B','C'])
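The results of the two calls are not shown in this post, so the following self-contained sketch reruns them on the same DataFrame and records the printed output as comments. The variable names `only_b` and `b_and_c` are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
                   'B': {0: 1, 1: 3, 2: 5},
                   'C': {0: 2, 1: 4, 2: 6}})

# Unpivot only column B; column C is dropped from the result.
only_b = pd.melt(df, id_vars=['A'], value_vars=['B'])
print(only_b)
#    A variable  value
# 0  a        B      1
# 1  b        B      3
# 2  c        B      5

# Unpivot B and C; the rows for B come first, then the rows for C.
b_and_c = pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
print(b_and_c)
#    A variable  value
# 0  a        B      1
# 1  b        B      3
# 2  c        B      5
# 3  a        C      2
# 4  b        C      4
# 5  c        C      6
```

Note that a column left out of both `id_vars` and `value_vars` (here `C` in the first call) does not appear in the output at all.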

