Chen Shawn's Blogs



Plotting curves with a variance band using matplotlib

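Reinforcement learning papers often show a convergence curve surrounded by a lightly shaded band. For a long time I was unsure what this band actually means, and its meaning seems to differ from paper to paper.

For example, in some papers the band fluctuates sharply over time and shows the standard deviation of the results across different random seeds, whereas in most Berkeley and OpenAI papers the band is fairly smooth and appears to show a moving average of the standard deviation.

Only when reading the TD3 paper did I notice that the caption of Figure 5 states how the plot was drawn: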

The shaded region represents half a standard deviation of the average evaluation over 10 trials. Curves are smoothed uniformly for visual clarity.
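The quantities named in that caption can be sketched in a few lines. This is only a minimal illustration under my own assumptions, not the TD3 authors' code: the helper names `band_over_trials` and `smooth_uniform` are hypothetical, the population standard deviation is used (the caption does not say which estimator was applied), and "smoothed uniformly" is read here as an equal-weight moving average.

```python
from statistics import mean, pstdev


def band_over_trials(trials):
    """Per-step mean and half standard deviation across trials.

    trials: list of equal-length lists, one evaluation curve per random seed.
    """
    steps = list(zip(*trials))  # regroup the values by time step
    avg = [mean(s) for s in steps]
    # Assumption: population standard deviation; the caption does not specify
    half_std = [pstdev(s) / 2.0 for s in steps]
    return avg, half_std


def smooth_uniform(values, window=3):
    """Equal-weight moving average; the window shrinks at both edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Three hypothetical trials with four evaluations each
trials = [[0.0, 1.0, 2.0, 3.0],
          [2.0, 3.0, 4.0, 5.0],
          [4.0, 5.0, 6.0, 7.0]]
avg, half_std = band_over_trials(trials)
smoothed = smooth_uniform(avg, window=3)
```

The shaded region then spans `avg[i] - half_std[i]` to `avg[i] + half_std[i]` at each step. Whether to smooth the band edges as well as the mean curve is a design choice that the caption leaves open.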

matplotlib.pyplot.fill_between

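The key function for this kind of plot is matplotlib.pyplot.fill_between. See the official documentation for the parameter details; the function signature is as follows: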

matplotlib.pyplot.fill_between(x,
                               y1,
                               y2=0,
                               where=None,
                               interpolate=False,
                               step=None, *,
                               data=None, **kwargs)

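The function fills the area between two curves with a specified color. To draw the plot, we only need to compute the variance band by hand and then fill it with this function.

Note that the alpha parameter should be set to a small value. This lowers the opacity of the filled region, so that the curve drawn earlier is not covered by the fill.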
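As a quick illustration of the call itself, here is a minimal sketch. The curve and the band edges are made-up values, and the `Agg` backend is selected only so that the snippet also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window is opened
import matplotlib.pyplot as plt

x = list(range(10))
y = [0.5 * v for v in x]        # hypothetical curve
lower = [v - 1.0 for v in y]    # hypothetical lower edge of the band
upper = [v + 1.0 for v in y]    # hypothetical upper edge of the band

fig, ax = plt.subplots()
line, = ax.plot(x, y, color="tab:blue")
# alpha=0.2 keeps the fill translucent so the curve stays visible
band = ax.fill_between(x, lower, upper, color="tab:blue", alpha=0.2)
```

Using the same color for the curve and the band, with only alpha differing, makes it clear which band belongs to which curve when several runs share one figure.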

Implementation

import numpy as np
import matplotlib.pyplot as plt


# reward_mean and reward_std are assumed to be computed beforehand from the
# reward summary scalars collected during training
def plot_with_variance(reward_mean, reward_std, color='yellow', savefig_dir=None):
    """Plot the mean reward curve with a shaded band of half a standard deviation.

    reward_mean: list or array, the means of the reward summary scalars collected during training
    reward_std: list or array, the corresponding standard deviations
    savefig_dir: if not None, this must be a str giving the file path to save the figure to
    """
    # Convert to arrays so that the arithmetic below also works on plain lists
    reward_mean = np.asarray(reward_mean, dtype=float)
    half_reward_std = np.asarray(reward_std, dtype=float) / 2.0
    lower = reward_mean - half_reward_std
    upper = reward_mean + half_reward_std
    plt.figure()
    xaxis = np.arange(len(reward_mean))
    plt.plot(xaxis, reward_mean, color=color)
    # A small alpha keeps the band translucent so the mean curve stays visible
    plt.fill_between(xaxis, lower, upper, color=color, alpha=0.2)
    plt.grid()
    plt.xlabel('Episode')
    plt.ylabel('Average reward')
    plt.title('The convergence of rewards')
    if isinstance(savefig_dir, str):
        plt.savefig(savefig_dir, format='svg')
    plt.show()