Overview
I have time series data keyed by date (YYYY-MM-DD), with the columns returns, pnl, and number of trades:
date returns pnl no_trades
1998-01-01 0.01 0.05 5
1998-01-02 -0.04 0.12 2
...
2010-12-31 0.05 0.25 3
Now I would like to show horizontal bar charts with
a) the average of the returns
b) sum of the pnls
by:
1) year, i.e. 1998, 1999, ..., 2010
2) quarter across all years, i.e. Q1 (YYYY-01-01 to YYYY-03-31), Q2, .., Q4
Additionally, the total number of trades for each group in 1) and 2) should appear as a number next to each horizontal bar.
So in my opinion there need to be two separate steps:
1) Get the data into the right format
2) Feed the data to the plot, then overlay multiple plots.
Sample data:
from datetime import datetime
import numpy as np
import pandas as pd
start = datetime(1998, 1, 1)
end = datetime(2001, 12, 31)
dates = pd.date_range(start, end, freq = 'D')
df = pd.DataFrame(np.random.randn(len(dates), 3), index = dates,
columns = ['returns', 'pnl', 'no_trades'])
So that could be two horizontal bar charts for year and quarter each:
1) one for returns: bar chart, number in the middle of the bar, sum of no_trades at the end of the bar
2) one for pnl: bar chart, number in the middle of the bar, sum of no_trades at the end of the bar
Plus a dotted vertical line across the bars showing the average returns and pnl.
I could do it in Excel (in effect by adding columns with the respective view and then building a pivot chart), but I would prefer an automated way that I can reproduce (or at least understand) in Python.
edit: as discussed in the comment below, this is how far I've got; however, I am not sure whether this is the fastest approach with regard to 1). I am currently working on 2).
# Assumes 'date' is a datetime64 column; the value column is selected after grouping
df_ret_year = df.groupby(df['date'].dt.year)[['returns']].mean()
df_ret_quarter = df.groupby(df['date'].dt.quarter)[['returns']].mean()
df_pnl_year = df.groupby(df['date'].dt.year)[['pnl']].sum()
df_pnl_quarter = df.groupby(df['date'].dt.quarter)[['pnl']].sum()
# Trade counts come from 'no_trades', not 'pnl'
df_trades_year = df.groupby(df['date'].dt.year)[['no_trades']].sum()
df_trades_quarter = df.groupby(df['date'].dt.quarter)[['no_trades']].sum()
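The six group-bys above can probably be collapsed into one named aggregation per grouping key. The following is only a minimal sketch under some assumptions: the dates are the DataFrame index, as in the sample data (so `df.index.year` takes the place of `df['date'].dt.year`), `no_trades` is generated as random integers rather than normal noise, and the helper name `summarize` and the result column names `avg_returns`, `total_pnl` and `total_trades` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data in the shape described above; all values are random placeholders.
rng = np.random.default_rng(0)
dates = pd.date_range("1998-01-01", "2001-12-31", freq="D")
df = pd.DataFrame(
    {
        "returns": rng.normal(0.0, 0.02, len(dates)),
        "pnl": rng.normal(0.0, 0.1, len(dates)),
        "no_trades": rng.integers(0, 10, len(dates)),
    },
    index=dates,
)


def summarize(frame, key):
    """Hypothetical helper: mean of returns, sum of pnl, sum of no_trades per group."""
    return frame.groupby(key).agg(
        avg_returns=("returns", "mean"),
        total_pnl=("pnl", "sum"),
        total_trades=("no_trades", "sum"),
    )


by_year = summarize(df, df.index.year)        # one row per calendar year
by_quarter = summarize(df, df.index.quarter)  # one row per quarter, all years pooled

print(by_year)
print(by_quarter)
```

Each result is a single DataFrame with one row per group, so the bar values and the trade counts for the labels stay aligned on the same index.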
Solution
from datetime import datetime
import numpy as np
import pandas as pd
start = datetime(1998, 1, 1)
end = datetime(2001, 12, 31)
dates = pd.date_range(start, end, freq = 'D')
Create the DataFrame with a (year, quarter) MultiIndex:
index = pd.MultiIndex.from_tuples([(thing.year, thing.quarter) for thing in dates])
df = pd.DataFrame(np.random.randn(len(dates), 3), index = index,
columns = ['returns', 'pnl', 'no_trades'])
Then you can group by year, quarter or year and quarter:
gb_yr = df.groupby(level=0)
gb_qtr = df.groupby(level=1)
gb_yr_qtr = df.groupby(level=[0, 1])
>>>
>>> # yearly means
>>> gb_yr.mean()
returns pnl no_trades
1998 0.080989 -0.019115 0.142576
1999 -0.040881 -0.005331 0.029815
2000 -0.036227 -0.100028 -0.009175
2001 0.097230 -0.019342 -0.089498
>>>
>>> # quarterly means across all years
>>> gb_qtr.mean()
returns pnl no_trades
1 0.036992 0.023923 0.048497
2 0.053445 -0.039583 0.076721
3 0.003891 -0.016180 0.004619
4 0.007145 -0.111050 -0.054988
>>>
>>> # means by year and quarter
>>> gb_yr_qtr.mean()
returns pnl no_trades
1998 1 -0.062570 0.139856 0.105288
2 0.044946 -0.008685 0.200393
3 0.152209 0.007341 0.119093
4 0.185858 -0.211401 0.145347
1999 1 0.085799 0.072655 0.054060
2 0.111595 0.002972 0.068792
3 -0.194506 -0.093435 0.107210
4 -0.161999 -0.001732 -0.109851
2000 1 0.001543 -0.083488 0.174226
2 -0.064343 -0.158431 -0.071415
3 -0.036334 -0.037008 -0.068717
4 -0.045669 -0.121640 -0.069474
2001 1 0.123592 -0.032138 -0.140982
2 0.121582 0.005810 0.109115
3 0.094194 0.058382 -0.139110
4 0.050388 -0.109429 -0.185975
>>>
>>> # operate on single columns
>>> gb_yr['pnl'].sum()
1998 -6.976917
1999 -1.945935
2000 -36.610206
2001 -7.060010
Name: pnl, dtype: float64
>>> # plotting
>>> from matplotlib import pyplot as plt
>>> gb_yr.mean().plot()
>>> plt.show()
>>> plt.close()
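The line plot above does not yet cover the horizontal bars, the labels, or the dotted line asked for in the question. One possible way to draw them is sketched below. It is not part of the original answer and rests on a few assumptions: matplotlib 3.4 or newer (for `Axes.bar_label`), dates kept as a plain `DatetimeIndex`, integer trade counts, and a dotted line drawn at the mean of the plotted bar values. The helper name `plot_hbar` is hypothetical.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Sample data: dates as index, random placeholder values.
rng = np.random.default_rng(0)
dates = pd.date_range("1998-01-01", "2001-12-31", freq="D")
df = pd.DataFrame(
    {
        "returns": rng.normal(0.0, 0.02, len(dates)),
        "pnl": rng.normal(0.0, 0.1, len(dates)),
        "no_trades": rng.integers(0, 10, len(dates)),
    },
    index=dates,
)


def plot_hbar(ax, values, trades, title):
    """Hypothetical helper: bars with value labels, trade counts and a mean line."""
    labels = [str(label) for label in values.index]
    bars = ax.barh(labels, values.to_numpy())
    # Bar value in the middle of each bar.
    ax.bar_label(bars, labels=[f"{v:.4f}" for v in values], label_type="center")
    # Summed number of trades at the outer end of each bar.
    ax.bar_label(bars, labels=[f"n={int(n)}" for n in trades], padding=4)
    # Dotted vertical line at the average of the plotted bar values (an assumption).
    ax.axvline(values.mean(), linestyle=":", color="black")
    ax.set_title(title)
    return bars


# Top row: grouped by year. Bottom row: grouped by quarter across all years.
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
for row, (name, key) in enumerate(
    [("year", df.index.year), ("quarter", df.index.quarter)]
):
    grouped = df.groupby(key)
    trades = grouped["no_trades"].sum()
    plot_hbar(axes[row, 0], grouped["returns"].mean(), trades, f"Mean returns by {name}")
    plot_hbar(axes[row, 1], grouped["pnl"].sum(), trades, f"Total pnl by {name}")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure. If the dotted line should instead mark the overall daily average, pass that value to `ax.axvline` in place of `values.mean()`.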