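I'm 调皮雨, a blogger on 靠谱客. This post, collected during recent development work, walks through time-series forecasting with XGBoost (workflow, code, and evaluation). I found it useful and am sharing it here as a reference.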

Overview

Adapted from the reference.

Raw data

The data has just two columns: a timestamp and the electricity consumption:

Datetime,PJME_MW
2002-12-31 01:00:00,26498.0
2002-12-31 02:00:00,25147.0
2002-12-31 03:00:00,24574.0
2002-12-31 04:00:00,24393.0
2002-12-31 05:00:00,24860.0
2002-12-31 06:00:00,26222.0
2002-12-31 07:00:00,28702.0
2002-12-31 08:00:00,30698.0
...
2018-01-01 19:00:00,44343.0
2018-01-01 20:00:00,44284.0
2018-01-01 21:00:00,43751.0
2018-01-01 22:00:00,42402.0
2018-01-01 23:00:00,40164.0
2018-01-02 00:00:00,38608.0

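Preparing the training and test sets

Split the data into training and test sets at 2015-01-01:

import pandas as pd

# Load the hourly data, parsing the first column as a DatetimeIndex
pjme = pd.read_csv('PJME_hourly.csv', index_col=[0], parse_dates=[0])
split_date = '2015-01-01'
# Rows up to and including the split timestamp go to training, later rows to testing
pjme_train = pjme.loc[pjme.index <= split_date].copy()
pjme_test = pjme.loc[pjme.index > split_date].copy()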
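One detail of this split is easy to miss: the string '2015-01-01' is compared as the timestamp 2015-01-01 00:00:00, so the midnight row lands in the training set and the rest of that day in the test set. The sketch below shows this on a tiny hand-made frame; the `demo` names and the consumption values are made up for illustration and are not taken from the PJME data.

```python
import pandas as pd

# Hypothetical five-hour series straddling the split date (values are invented).
index = pd.to_datetime([
    "2014-12-31 22:00:00",
    "2014-12-31 23:00:00",
    "2015-01-01 00:00:00",
    "2015-01-01 01:00:00",
    "2015-01-01 02:00:00",
])
demo = pd.DataFrame(
    {"PJME_MW": [30000.0, 29000.0, 28000.0, 27000.0, 26500.0]},
    index=index,
)

split_date = "2015-01-01"  # compared as 2015-01-01 00:00:00
demo_train = demo.loc[demo.index <= split_date]
demo_test = demo.loc[demo.index > split_date]

# The midnight row belongs to the training set; later hours go to the test set.
print(len(demo_train), len(demo_test))
print(demo_train.index.max(), demo_test.index.min())
```

Because the split is by time rather than random, every training row precedes every test row, which is what a forecasting evaluation needs.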

Feature construction:

def create_features(df, label=None):
    """Derive calendar features from the DatetimeIndex of df."""
    df['date'] = df.index  # index: DatetimeIndex
    df['hour'] = df['date'].dt.hour  # dt: DatetimeProperties, hour: Series
    df['day_of_week'] = df['date'].dt.dayofweek  # Monday=0 ... Sunday=6
    df['quarter'] = df['date'].dt.quarter
    df['month'] = df['date'].dt.month
    df['year'] = df['date'].dt.year
    df['day_of_year'] = df['date'].dt.dayofyear
    df['day_of_month'] = df['date'].dt.day
    # Series.dt.weekofyear was removed in pandas 2.0; isocalendar().week replaces it
    df['week_of_year'] = df['date'].dt.isocalendar().week.astype(int)
    X = df[['hour', 'day_of_week', 'quarter', 'month', 'year', 'day_of_year', 'day_of_month', 'week_of_year']]
    if label:
        y = df[label]
        return X, y
    return X

# Training set
X_train, y_train = create_features(pjme_train, label='PJME_MW')
# Test set
X_test, y_test = create_features(pjme_test, label='PJME_MW')
X_train:
                     hour  day_of_week  quarter  month  year  day_of_year  day_of_month  week_of_year
Datetime
2002-12-31 01:00:00     1            1        4     12  2002          365            31             1
2002-12-31 02:00:00     2            1        4     12  2002          365            31             1
2002-12-31 03:00:00     3            1        4     12  2002          365            31             1
2002-12-31 04:00:00     4            1        4     12  2002          365            31             1
2002-12-31 05:00:00     5            1        4     12  2002          365            31             1
...
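To make the meaning of each feature column concrete, the first row of X_train can be recomputed with only the Python standard library. This is a cross-check under stated assumptions, not part of the original pipeline: the helper name `calendar_features` is hypothetical, and it assumes the same conventions pandas uses (Monday as day 0, ISO 8601 week numbers).

```python
from datetime import datetime


def calendar_features(ts):
    """Return the eight calendar features for a single timestamp."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),            # Monday=0 ... Sunday=6
        "quarter": (ts.month - 1) // 3 + 1,     # months 1-3 -> 1, ..., 10-12 -> 4
        "month": ts.month,
        "year": ts.year,
        "day_of_year": ts.timetuple().tm_yday,
        "day_of_month": ts.day,
        "week_of_year": ts.isocalendar()[1],    # ISO 8601 week number
    }


# First timestamp of the data set.
row = calendar_features(datetime(2002, 12, 31, 1))
print(row)
```

Note that week_of_year is 1 for 2002-12-31 because, under ISO 8601, that Tuesday already belongs to the first week of 2003.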

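Model -> training -> prediction

import xgboost as xgb

# Model
# early_stopping_rounds is set on the estimator (XGBoost 1.6+);
# older releases took it as an argument to fit() instead
reg = xgb.XGBRegressor(n_estimators=1000, early_stopping_rounds=50)
# Training
reg.fit(X_train, y_train, eval_set=[(X_train, y_train), (X_test, y_test)])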
[0]	validation_0-rmse:29710.4	validation_1-rmse:28762.5
Multiple eval metrics have been passed: 'validation_1-rmse' will be used for early stopping.
Will train until validation_1-rmse hasn't improved in 50 rounds.
[1]	validation_0-rmse:26822.6	validation_1-rmse:25892.2
[2]	validation_0-rmse:24211.2	validation_1-rmse:23286.6
[3]	validation_0-rmse:21885.1	validation_1-rmse:20967.5
[4]	validation_0-rmse:19780.3	validation_1-rmse:18868.5
...
[195]	validation_0-rmse:2844.33	validation_1-rmse:3754.45
[196]	validation_0-rmse:2842.94	validation_1-rmse:3754.73
[197]	validation_0-rmse:2840.57	validation_1-rmse:3754.88
[198]	validation_0-rmse:2838.73	validation_1-rmse:3754.71
[199]	validation_0-rmse:2837.81	validation_1-rmse:3753.66
Stopping. Best iteration:
[149]	validation_0-rmse:2923.17	validation_1-rmse:3712.2
# Prediction
y_pred = reg.predict(X_test)
[28804.365 27663.098 27125.912 ... 34988.7
32725.598 31440.66 ]
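
The training log above ends at round 199 and reports round 149 as the best iteration, which is what a patience of 50 rounds would produce. The stopping rule can be sketched in plain Python as follows. The function name `early_stopping` and all score values are hypothetical; this illustrates the rule only and is not XGBoost's internal implementation.

```python
def early_stopping(scores, patience):
    """Return (best_iteration, stop_iteration) for a lower-is-better metric.

    Training stops at the first round where the metric has gone
    `patience` rounds without improving on its best value.
    """
    best_iter = 0
    best_score = float("inf")
    for i, score in enumerate(scores):
        if score < best_score:
            best_score, best_iter = score, i
        elif i - best_iter >= patience:
            return best_iter, i
    # The metric kept improving often enough; all rounds were used.
    return best_iter, len(scores) - 1


# Toy validation curve: best value at round 2, no improvement afterwards.
toy = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.3]
print(early_stopping(toy, patience=3))  # (2, 5)

# Invented curve that improves until round 149 and then plateaus.
plateau = [1000.0 - i for i in range(150)] + [900.0] * 100
print(early_stopping(plateau, patience=50))  # (149, 199)
```

The second call reproduces the shape of the log: the best round is 149, and training halts 50 rounds later at round 199.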

Evaluation

RMSE: root mean square error, the square root of the mean squared difference between the predictions and the actual values.
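
As a minimal sketch of that definition, the function below computes RMSE with only the standard library. The name `rmse` and the three sample values are illustrative assumptions; in the workflow above the inputs would be y_test and y_pred.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented values: the errors are -1, 0 and 2, so the mean squared error is 5/3.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
score = rmse(actual, predicted)
print(round(score, 4))  # 1.291
```

RMSE is expressed in the same unit as the target, which makes it directly comparable to the consumption values in the PJME_MW column.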
