Somewhat related, I'm running into a gRPC latency issue in https://github.com/grpc/grpc-go/issues/8436

If the request payload exceeds a certain size, the response latency goes from network RTT to double or triple that.

Definitely something wrong with either TCP or HTTP/2 windowing, as it doesn't send the full request without getting an ACK from the server first. But neither the gRPC windowing config options nor the Linux tcp_wmem/rmem settings help. Sending a one-byte request every few hundred milliseconds fixes it by keeping the gRPC channel / TCP connection active. Nagle / slow start is disabled.
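For anyone wanting to automate that workaround: below is a minimal sketch of the warm-up ping, assuming grpc-go and a server that registers the standard gRPC health service (the 200 ms interval is just a guess from the observed idle window, and the target address is made up). As I understand it, grpc-go's built-in HTTP/2 keepalive (keepalive.ClientParameters) enforces a 10 s minimum ping interval, which is likely too coarse for a few-hundred-ms idle window, hence the tiny RPC instead.

    // Sketch: fire a tiny RPC periodically so the TCP connection
    // underneath the gRPC channel never goes idle. Assumes the server
    // registers the standard gRPC health service.
    package main

    import (
        "context"
        "time"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
        healthpb "google.golang.org/grpc/health/grpc_health_v1"
    )

    func keepWarm(conn *grpc.ClientConn, every time.Duration, stop <-chan struct{}) {
        client := healthpb.NewHealthClient(conn)
        ticker := time.NewTicker(every)
        defer ticker.Stop()
        for {
            select {
            case <-ticker.C:
                ctx, cancel := context.WithTimeout(context.Background(), time.Second)
                // Tiny request, tiny response; enough traffic to keep the
                // connection "hot". Errors are ignored: this is best-effort.
                _, _ = client.Check(ctx, &healthpb.HealthCheckRequest{})
                cancel()
            case <-stop:
                return
            }
        }
    }

    func main() {
        // Hypothetical target; in the real setup this would be the existing channel.
        conn, err := grpc.Dial("server.example.com:443",
            grpc.WithTransportCredentials(insecure.NewCredentials()))
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        stop := make(chan struct{})
        go keepWarm(conn, 200*time.Millisecond, stop) // guessed interval
        // ... normal RPC traffic here ...
        close(stop)
    }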



Sounds like classic TCP congestion window scaling delay; your payload probably exceeds 10x initcwnd.


Doesn't initcwnd only apply as the initial value? I don't care that the first request on the gRPC channel is slow, but subsequent requests on the same channel reuse the TCP connection and should have a larger window size. This works as long as the channel is actively being used, but after a short period of inactivity (a few hundred ms, unsure exactly) something appears to revert.


Yes, in the case of hot TCP connections, congestion control should not be the issue.


Yeah, that was my understanding too, hence I filed the bug (actually a duplicate of an older bug that was closed because the poster didn't provide a reproduction).

Still not sure if this is a Linux network configuration issue or a gRPC issue, but something is definitely broken if I can't send a ~1 MB request and get a response within roughly network RTT + server processing time.


Could you check the value of your kernel's net.ipv4.tcp_slow_start_after_idle sysctl, and if it's non-zero, set it to 0?
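That is, on a typical Linux host with root access:

    $ sysctl net.ipv4.tcp_slow_start_after_idle
    net.ipv4.tcp_slow_start_after_idle = 1
    $ sudo sysctl -w net.ipv4.tcp_slow_start_after_idle=0

(sysctl -w only lasts until reboot; a drop-in file under /etc/sysctl.d/ makes it persistent.)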


That seems to work, thank you!

Now latency is just RTT + server time + payload size / bandwidth, not a multiple of RTT: https://github.com/grpc/grpc-go/issues/8436#issuecomment-311...
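(For a rough sense of scale, with assumed numbers: at a 30 ms RTT on a 100 Mbit/s path, a ~1 MB request works out to about 30 ms + server time + 8 Mbit / 100 Mbit/s = 30 ms + server time + 80 ms, instead of paying several extra RTTs while the congestion window re-opens.)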

I was not aware of this setting. It's pretty unfortunate that it's a system-level setting that can't be overridden at the application layer, and that the idle timeout can't be changed either. I'll have to figure out how to safely make this change on the k8s service this is affecting...
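In case it's useful: Kubernetes can set namespaced sysctls per pod via the pod's securityContext, though net.ipv4.tcp_slow_start_after_idle isn't on the default safe list, so the kubelet has to be configured to allow it (--allowed-unsafe-sysctls). A sketch, assuming a kernel recent enough that this sysctl is per network namespace (pod and image names are made up):

    apiVersion: v1
    kind: Pod
    metadata:
      name: grpc-client            # hypothetical
    spec:
      securityContext:
        sysctls:
        - name: net.ipv4.tcp_slow_start_after_idle
          value: "0"
      containers:
      - name: app
        image: example/app:latest  # hypothetical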


As you can imagine, when a TCP connection is first established, it has no knowledge of the conditions on the network. Thus we have slow start. Likewise, when a TCP connection goes idle, its information about the conditions on the network becomes increasingly stale. Thus we have slow start after idle. In the Linux stack at least, being idle longer than the RTT (perhaps the computed RTO) is interpreted as meaning the connection's idea of network conditions is no longer valid.
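One way to observe this from userspace, assuming iproute2 is available: check the connection's congestion window with ss around an idle gap. With slow start after idle enabled, the cwnd should fall back toward initcwnd once traffic resumes after the idle period.

    $ ss -ti dst <server-ip>    # <server-ip> is a placeholder; look at the cwnd: field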

An application won't know anything about the specifics of the network its host system is attached to. A system administrator might. In that sense at least, it is reasonable that this is a system tunable rather than a per-connection setsockopt().


This sounds exactly like the culprit. I didn't know there was a slow start after idle, and it is set to 1 (enabled) by default.

I wonder if I should change this to 0 on my default desktop machines for all connections.


That's indeed interesting, thank you for sharing.



