Why read/write on a blocking socket can fail with errno EAGAIN

Overview

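For a long time I believed that only non-blocking sockets could produce EAGAIN, meaning the socket is not currently readable or writable and the call should simply be retried. A recent error in our redis module showed that this belief was wrong.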

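What happened: hiredis's redisConnectWithTimeout and redisContextSetTimeout put the socket connected to redis-server in blocking mode and set read and write timeouts on it; our project uses a timeout of 50 ms. Over the following days the logs showed hiredis reporting "Resource temporarily unavailable". This was puzzling at first, because that message corresponds to EAGAIN, which I thought could only be reported in non-blocking mode. Here is how I worked it out: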

1. Look at hiredis's read/write functions

/* Use this function to handle a read event on the descriptor. It will try
 * and read some bytes from the socket and feed them to the reply parser.
 *
 * After this function is called, you may use redisContextReadReply to
 * see if there is a reply available. */
int redisBufferRead(redisContext *c) {
    char buf[1024*16];
    int nread;

    /* Return early when the context has seen an error. */
    if (c->err)
        return REDIS_ERR;

    nread = read(c->fd,buf,sizeof(buf));
    if (nread == -1) {
        if ((errno == EAGAIN && !(c->flags & REDIS_BLOCK)) || (errno == EINTR)) {
            /* Try again later */
        } else {
            /* Branch A1 */
            __redisSetError(c,REDIS_ERR_IO,NULL);
            return REDIS_ERR;
        }
    } else if (nread == 0) {
        /* Branch A2 */
        __redisSetError(c,REDIS_ERR_EOF,"Server closed the connection");
        return REDIS_ERR;
    } else {
        if (redisReaderFeed(c->reader,buf,nread) != REDIS_OK) {
            __redisSetError(c,c->reader->err,c->reader->errstr);
            return REDIS_ERR;
        }
    }
    return REDIS_OK;
}
/* Write the output buffer to the socket.
 *
 * Returns REDIS_OK when the buffer is empty, or (a part of) the buffer was
 * successfully written to the socket. When the buffer is empty after the
 * write operation, "done" is set to 1 (if given).
 *
 * Returns REDIS_ERR if an error occurred trying to write and sets
 * c->errstr to hold the appropriate error string.
 */
int redisBufferWrite(redisContext *c, int *done) {
    int nwritten;

    /* Return early when the context has seen an error. */
    if (c->err)
        return REDIS_ERR;

    if (sdslen(c->obuf) > 0) {
        nwritten = write(c->fd,c->obuf,sdslen(c->obuf));
        if (nwritten == -1) {
            if ((errno == EAGAIN && !(c->flags & REDIS_BLOCK)) || (errno == EINTR)) {
                /* Try again later */
            } else {
                /* Branch B1 */
                __redisSetError(c,REDIS_ERR_IO,NULL);
                return REDIS_ERR;
            }
        } else if (nwritten > 0) {
            if (nwritten == (signed)sdslen(c->obuf)) {
                sdsfree(c->obuf);
                c->obuf = sdsempty();
            } else {
                sdsrange(c->obuf,nwritten,-1);
            }
        }
    }
    if (done != NULL) *done = (sdslen(c->obuf) == 0);
    return REDIS_OK;
}
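Note that the error log shows hiredis set the error type to REDIS_ERR_IO with errstr "Resource temporarily unavailable", so the error can only have come from the branches marked A1 and B1. Digging further: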
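The condition guarding those two branches can be isolated into a small sketch. The names demo_should_retry and DEMO_BLOCK are hypothetical stand-ins (DEMO_BLOCK plays the role of hiredis's REDIS_BLOCK flag bit; its value here is an assumption for illustration). It shows that EAGAIN is only treated as "try again later" when the context is not marked as blocking; on a blocking context the same errno falls through to the REDIS_ERR_IO branch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the REDIS_BLOCK flag bit (value assumed). */
#define DEMO_BLOCK 0x1

/* Mirrors the condition used in redisBufferRead/redisBufferWrite above:
 * returns true when a failed read/write is silently retried later,
 * false when it would be reported as an I/O error (branch A1 or B1). */
bool demo_should_retry(int err, int flags) {
    return (err == EAGAIN && !(flags & DEMO_BLOCK)) || (err == EINTR);
}
```

For example, demo_should_retry(EAGAIN, DEMO_BLOCK) is false, which is exactly the case that produced the log entry.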

2. Look at how the error is set

void __redisSetError(redisContext *c, int type, const char *str) {
    size_t len;

    c->err = type;
    if (str != NULL) {
        len = strlen(str);
        len = len < (sizeof(c->errstr)-1) ? len : (sizeof(c->errstr)-1);
        memcpy(c->errstr,str,len);
        c->errstr[len] = '\0';
    } else {
        /* Only REDIS_ERR_IO may lack a description! */
        assert(type == REDIS_ERR_IO);
        /* This is the path taken here: errstr is filled in from errno. */
        strerror_r(errno,c->errstr,sizeof(c->errstr));
    }
}
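When the description is NULL, this function fills c->errstr from errno via strerror_r. Since the resulting string was "Resource temporarily unavailable", errno must have been EAGAIN.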
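That reverse lookup from message to errno can be checked mechanically. The sketch below is only an illustration: demo_errno_from_message is a hypothetical helper, the scan range of 1 to 255 is an arbitrary assumption, and the exact message text depends on the C library (the string from the log is what glibc on Linux produces).

```c
#include <errno.h>
#include <string.h>

/* Scans errno values 1..255 and returns the first one whose strerror()
 * text equals msg, or 0 if none matches. Not thread-safe, because
 * strerror() may use a shared static buffer. */
int demo_errno_from_message(const char *msg) {
    for (int e = 1; e < 256; e++) {
        if (strcmp(strerror(e), msg) == 0)
            return e;
    }
    return 0;
}
```

On a typical Linux system, passing the message from the log should give back EAGAIN.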

3. Why do blocking read and write end up with errno set to EAGAIN?

1) What did we do to the socket? We set a timeout.

int redisContextSetTimeout(redisContext *c, const struct timeval tv) {
    if (setsockopt(c->fd,SOL_SOCKET,SO_RCVTIMEO,&tv,sizeof(tv)) == -1) {
        __redisSetErrorFromErrno(c,REDIS_ERR_IO,"setsockopt(SO_RCVTIMEO)");
        return REDIS_ERR;
    }
    if (setsockopt(c->fd,SOL_SOCKET,SO_SNDTIMEO,&tv,sizeof(tv)) == -1) {
        __redisSetErrorFromErrno(c,REDIS_ERR_IO,"setsockopt(SO_SNDTIMEO)");
        return REDIS_ERR;
    }
    return REDIS_OK;
}
2) How do SO_RCVTIMEO and SO_SNDTIMEO on a socket affect read/write? Here is what the man page (socket(7)) says:

SO_RCVTIMEO and SO_SNDTIMEO
Specify the receiving or sending timeouts until reporting an error. The argument is a struct timeval. If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, or EINPROGRESS (for connect(2)) just as if the socket was specified to be nonblocking. If the timeout is set to zero (the default) then the operation will never timeout. Timeouts only have effect for system calls that perform socket I/O (e.g., read(2), recvmsg(2), send(2), sendmsg(2)); timeouts have no effect for select(2), poll(2), epoll_wait(2), and so on.

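Now it is clear: with SO_RCVTIMEO or SO_SNDTIMEO set, a blocking read/write that transfers no data before the timeout expires returns -1 with errno set to EAGAIN.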
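This behaviour is easy to reproduce without redis. The following is a minimal sketch, assuming a Linux-like system: demo_timed_read_errno is a hypothetical helper that creates a blocking AF_UNIX socket pair, sets SO_RCVTIMEO on one end, and reads from it while the peer never writes. It is not how hiredis is structured; it only isolates the socket behaviour.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Reads from a blocking socket that has a receive timeout and no pending
 * data. Returns the errno left by read(), 0 if read() unexpectedly did not
 * fail, or -1 if the setup failed. timeout_ms must be positive, because a
 * zero timeout means "block forever". */
int demo_timed_read_errno(int timeout_ms) {
    int sv[2];
    if (timeout_ms <= 0)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int result = -1;
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0) {
        char buf[16];
        /* The socket is still in blocking mode; the peer never writes. */
        ssize_t n = read(sv[0], buf, sizeof(buf));
        result = (n == -1) ? errno : 0;
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Calling demo_timed_read_errno(50) mirrors the 50 ms timeout from our project; on Linux the call should return EAGAIN after roughly that delay, even though O_NONBLOCK was never set.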


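While tracking this down, a colleague suggested that O_NODELAY (presumably meaning the O_NDELAY file status flag) could also make write fail with EAGAIN. O_NDELAY is an older spelling of non-blocking mode. On Linux it has the same value as O_NONBLOCK, so a write that cannot proceed returns -1 with errno set to EAGAIN; on some older System V-derived systems, O_NDELAY instead makes the call return 0 rather than -1. Either way, it does not apply to this case, because hiredis does not set that flag on the socket here.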
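How a non-blocking write reports "cannot write right now" can be checked directly. The sketch below is an illustration under assumptions: demo_nonblocking_write_full is a hypothetical helper, it uses O_NONBLOCK on an AF_UNIX socket pair, and it describes what a Linux-like system does rather than every platform.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fills the send buffer of a non-blocking socket whose peer never reads.
 * Stores the return value of the first failing write() in *last_ret and
 * returns the errno of that call. Returns 0 (with *last_ret == 0) if the
 * setup failed or no write ever failed. */
int demo_nonblocking_write_full(long *last_ret) {
    int sv[2];
    int result = 0;
    *last_ret = 0;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return 0;

    int fl = fcntl(sv[0], F_GETFL, 0);
    if (fl != -1 && fcntl(sv[0], F_SETFL, fl | O_NONBLOCK) != -1) {
        char chunk[4096];
        memset(chunk, 'x', sizeof(chunk));
        /* Bounded loop so the sketch always terminates. */
        for (int i = 0; i < 100000; i++) {
            ssize_t n = write(sv[0], chunk, sizeof(chunk));
            if (n == -1) {
                result = errno;
                *last_ret = -1;
                break;
            }
        }
    }
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

On Linux the failing call is expected to return -1 with errno set to EAGAIN, not 0.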


