A QUIC look at HTTP/3 (lwn.net)
106 points by lukastyrychtr on March 13, 2020 | 41 comments


> In practice, it will mean asking the DNS first to check if HTTP/3 can be used.

Looking at the (draft) DNS specs for HTTPSSVC records [0], it seems that there's also an equivalent of SRV records? (SRV records are something that browsers have never implemented.)

I recall a semi-recent discussion about the prevalence of CDNs and what options there are for smaller players and DIY-ers and how SRV could have had a role in that [1]. I'm really looking forward to the spec rolling out if this is the case!

(There's a little example given in the Appendix of the DNS specs [2])

[0] https://tools.ietf.org/html/draft-ietf-dnsop-svcb-httpssvc-0...

[1] https://news.ycombinator.com/item?id=22073098

[2] https://tools.ietf.org/html/draft-ietf-dnsop-svcb-httpssvc-0...


Yes, Alt-Svc just isn't faster in practice unless you're constantly coming back to that particular site, so a DNS solution had to be on the table; and (the current assumed path for) eSNI assumes a DNS solution too.

It will be interesting to see which is the bigger deployability headache: QUIC itself (UDP doesn't work for a noticeable fraction of users) or HTTPSSVC (not only will lots of people with a shiny new HTTP/3-capable server not have a DNS server that understands HTTPSSVC; lots of their clients have DNS servers which outright reject queries they don't understand).

Remember it was about a year between TLS 1.3 being finished and being really finished, because the standard that was cryptographically sound and made sense from an engineering perspective couldn't be deployed. What we've got now is that protocol seen through a funhouse mirror that lets it pass middleboxes.

For example, in the version field where you'd expect to write TLS 1.3 (in a sense actually SSL 4.2, or maybe SSL 5), you can't do that. To sneak past a middlebox you say instead that you're TLS 1.2, like it expects, and furthermore that you are re-connecting. The middlebox shouldn't worry that it can't understand the rest of what you say, because you're just re-connecting: it must have approved that previous connection, so logically this one is likewise fine. Actually remembering which such connections have happened would cost money and isn't required to make the test pass, so the middlebox doesn't do that, and many will just allow all re-connections past untouched.
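A sketch of what that looks like on the wire: the ClientHello's legacy version field still says TLS 1.2, and the real version rides in the supported_versions extension (type 43), which middleboxes ignore. This just builds the extension bytes per RFC 8446, not a full ClientHello:

```python
# In a TLS 1.3 ClientHello, legacy_version is pinned to 0x0303 ("TLS 1.2")
# for middlebox compatibility; the real version list goes in the
# supported_versions extension (extension type 43).
LEGACY_VERSION = bytes([0x03, 0x03])  # what middleboxes see
TLS13 = bytes([0x03, 0x04])           # the version actually negotiated

def supported_versions_ext(versions):
    # extension = type (2 bytes) | length (2 bytes) | 1-byte list length | versions
    body = bytes([len(versions) * 2]) + b"".join(versions)
    return (43).to_bytes(2, "big") + len(body).to_bytes(2, "big") + body

ext = supported_versions_ext([TLS13])
print(ext.hex())  # 002b000302 0304 without the space
```

The middlebox reads 0x0303 in the version field, shrugs at the unknown extension, and lets the handshake through.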


Am I the only one who thinks QUIC looks a bit crazy with all its variable length fields in the frame, separate short & long header format, and rather high limits for certain things? 160 bit connection ids and 64 bit stream ids sound like something you'd benefit from if you're Google scale and building big-ass load balancers on custom hardware but I feel like there's going to be a lot of overhead for everyone else. (Consider that a TCP header with no options is as large as a single maximum size connection id in QUIC).

At least the designers of IPv6 seemed to care more about efficiency, even if the protocol otherwise seems quite bloated and overengineered.
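For what it's worth, the variable-length fields keep small values small. A sketch of QUIC's varint scheme (per the QUIC transport draft: the top two bits of the first byte encode the total length, so stream ID 37 costs one byte, not eight):

```python
def encode_varint(n: int) -> bytes:
    # QUIC variable-length integer: the two high bits of the first byte
    # give the total length (00=1, 01=2, 10=4, 11=8 bytes).
    if n < 2**6:
        return n.to_bytes(1, "big")
    if n < 2**14:
        return (n | (1 << 14)).to_bytes(2, "big")
    if n < 2**30:
        return (n | (2 << 30)).to_bytes(4, "big")
    if n < 2**62:
        return (n | (3 << 62)).to_bytes(8, "big")
    raise ValueError("value too large for a QUIC varint")

def decode_varint(b: bytes) -> int:
    length = 1 << (b[0] >> 6)  # 1, 2, 4, or 8 bytes
    return int.from_bytes(b[:length], "big") & ((1 << (8 * length - 2)) - 1)

print(encode_varint(37).hex())     # one byte
print(encode_varint(15293).hex())  # two bytes
```

So the 62-bit stream ID ceiling only costs you eight bytes on the wire if you actually get anywhere near it.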


I guess connection IDs are that large to allow the use of stateless cryptographic identifiers, similar to TCP SYN cookies. And stream IDs are assigned sequentially and variable-length encoded, so they shouldn't take up much space most of the time.
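A hedged sketch of what such a stateless identifier could look like (the layout, field sizes, and key handling here are made up for illustration; QUIC only fixes the maximum connection ID length):

```python
import hashlib
import hmac
import os
import secrets

SECRET = secrets.token_bytes(32)  # hypothetical key shared by a load-balancer tier

def make_connection_id(server_index: int) -> bytes:
    """Encode which backend owns the connection, with an HMAC tag so a
    front-end can route without keeping per-connection state."""
    nonce = os.urandom(8)
    payload = server_index.to_bytes(2, "big") + nonce
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()[:8]
    return payload + tag  # 18 bytes, under QUIC's 20-byte (160-bit) cap

def route(cid: bytes) -> int:
    payload, tag = cid[:-8], cid[-8:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()[:8]
    if not hmac.compare_digest(expected, tag):
        raise ValueError("forged connection ID")
    return int.from_bytes(payload[:2], "big")

cid = make_connection_id(7)
print(route(cid))
```

The point being: the generous size limit is what leaves room for a nonce plus an authenticator, which a plain 32-bit ID couldn't fit.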


Google trying to build a competitive advantage for themselves is not surprising. A lack of at least some level of pushback is slightly more worrying.


Uber was an early adopter of QUIC and saw significant performance improvements for its apps, especially in markets where mobile connectivity is poor: https://eng.uber.com/employing-quic-protocol/


With recent Linux kernels, most of the cited QUIC advantages are also available for TCP. The exceptions are HoL blocking, which is inherent in TCP (though how bad it is depends on loss recovery, which is improving), and connection migration, which will come to TCP via MPTCP.

Their article doesn't even mention which kernel version they're comparing against.

> In the future, TLS1.3 will support 0-RTT, but the TCP three-way handshake will still be required.

This is outright wrong. With TCP you have fast open, which provides 0-RTT send just like the TLS 0-RTT handshake.
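A minimal sketch of the server side on Linux (assumes a kernel with TFO support; the option value is the max queue of pending Fast Open connections, and clients would then send data in the SYN via MSG_FASTOPEN or TCP_FASTOPEN_CONNECT):

```python
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("127.0.0.1", 0))

tfo_enabled = False
if hasattr(socket, "TCP_FASTOPEN"):  # Linux-specific constant
    try:
        # 16 = max number of pending TFO connections to queue
        srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_FASTOPEN, 16)
        tfo_enabled = True
    except OSError:
        pass  # kernel built without TFO support
srv.listen()
print("TFO requested on listening socket:", tfo_enabled)
srv.close()
```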


> This is outright wrong. With TCP you have fast open, which provides 0-RTT send just like the TLS 0-RTT handshake.

Yes in theory, but it falls flat on its face in practice. Many middleboxes are not happy with TCP Fast Open, and there's the tracking problem. Chrome dropped TFO.


Doesn't the tracking problem also apply to 0RTT TLS? So we would have to discount either.


TLS lives in the browser stack. So your browser can provide policy here. For example the browser can say "OK, this is a Porn tab, so we shouldn't re-use the google.com TLS connection from the tab that's logged into GMail and searching for Tom Hanks movies, we'll spin up a new one (without 0RTT)".

But TCP lives down in the OS stack, so maybe without knowing it the browser's new TLS connection has TFO cookies the OS learned from the Tom Hanks connection, which ties the two sessions together. Oops.


The application has to explicitly ask for fast open to happen, so just as it does with TLS it can choose not to do so with TFO. But you're right, it has no control over the cookies themselves. That could be added with some ioctl or socket option, but I guess things haven't developed that far.

But cookieless TFO is also an option, combined with 0RTT TLS the application would be in charge of either accepting or rejecting the connection based on the TLS resumption cookie.


All of this is covered in the "Boxes" section in the article


I was specifically responding to the bullet list in the "Adopting QUIC" section of the Uber article. I don't think the general "middleboxes ruin everything" argument addresses this.

BBR has very few if any middlebox problems. Neither do tail loss probe improvements or any other sender side optimizations such as TCP_NOTSENT_LOWAT. TFO may cause issues in corporate environments (works fine with my home equipment) but fallback is easy. So I don't think this really "addresses" the issue. Improved acking is not strictly needed if you use tcp timestamps for RTT estimation and still ack improvements have been added to the kernel over the years.


Fallback for TFO isn't easy because your request might not have been idempotent.

Start running some HTTP requests multiple times and you can quickly see hard to debug breakage.


Man, I’m just waiting for AWS load balancers to support HTTP2. Last time I checked they still convert HTTP2 into HTTP1 for downstream endpoints.


Why is that an issue for you? It seems like the downstream connection would be on AWS' internal network, so HTTP 2 doesn't have as many advantages.


HTTP/2 push comes to mind. Your services hosted behind an ALB won't be able to make use of it if the connection is downgraded to HTTP/1.1.


It's not just the performance – it'd be nice if backends could take advantage of the fun new features like push, stream prio, etc.

(Though with HTTP/3 around the corner, I can't imagine anyone will bother working on HTTP/2 to backends.)


> so HTTP 2 doesn't have as many advantages.

Request pipelining and multiplexing are good enough reasons.


Hard to support grpc without it


Maybe just run your own instance as a load balancer?


Running your own load balancer is a non-negligible amount of work for not that much benefit, and hard to nail if you're operating at scale.

"Just" seems like an understatement here, there's a reason why ELBs and ALBs are popular solutions.


Not an option.


Great article, thanks. Just today I happened to see an Alt-Svc header from a LiteSpeed server and was wondering what it was. I’ve (finally) signed up for an LWN subscription.
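For the curious, an Alt-Svc value is just a comma-separated list of protocol=authority pairs with parameters, e.g. `h3-27=":443"; ma=86400`. A rough parser sketch (simplified; it ignores edge cases like quoted commas):

```python
def parse_alt_svc(value: str):
    """Parse an Alt-Svc header value into a list of advertised services."""
    services = []
    for entry in value.split(","):
        parts = [p.strip() for p in entry.split(";")]
        proto, authority = parts[0].split("=", 1)
        params = dict(p.split("=", 1) for p in parts[1:] if "=" in p)
        services.append({
            "protocol": proto.strip(),
            "authority": authority.strip().strip('"'),
            # "ma" is max-age in seconds; RFC 7838's default is 86400
            "max_age": int(params.get("ma", 86400)),
        })
    return services

for svc in parse_alt_svc('h3-27=":443"; ma=86400, h2=":443"'):
    print(svc)
```

Here `h3-27` means "HTTP/3 as of draft 27", which is what LiteSpeed and the big CDNs were advertising at the time.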


So far no mention of SCTP, which would have been fine if people had switched to IPv6 (which more and more are every day). Instead we have this monster of a protocol emulating it over UDP.


SCTP support in routers, firewalls, etc. is just too poor; IPv6 has nothing to do with that.


Wouldn't SCTP suffer from the ossification of middle boxes?


Yes absolutely.

Some of the people involved in the early QUIC development in Google had been heavily involved in the SCTP work and blamed its lack of success on the difficulty in getting firewalls, middleboxes (and the teams that manage them) to accept a new IP protocol. So they wrapped QUIC in UDP.


SCTP can run over UDP. WebRTC does it.


I took an early look at the performance (using Python) in November which may also be of interest. https://pgjones.dev/blog/early-look-at-http3-2019/

(That site is also served over HTTP/3 if you want to try it out).


The article states:

> "In HTTP/2, the single connection carries hundreds of streams. In this case, when we lose one packet, one hundred streams are waiting for that one single packet."

Could someone elaborate on how this is worse if all of those streams are part of the same client request? Am I missing something really obvious?


The difference is that in HTTP/1.1 browsers tended to use up to six TCP connections per origin. On a site that loaded many resources (a.jpg, b.png, c.js, an API call, &c.) from one origin, they were multiplexed across six TCP connections, and so if a packet on one connection was lost, the other five would be unaffected and could continue at full speed.

In HTTP/2, all those requests are multiplexed across only one TCP connection per origin (each request becomes an HTTP/2 stream); so now when a packet is lost, all requests in progress are affected, rather than as few as ⅙ of them.

If you only ever make one request at a time, this difference does not affect you, because only one connection would be used under HTTP/1.1 as well.

But this case can make HTTP/2 significantly worse than HTTP/1.1 on low-quality mobile networks.
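The arithmetic above, as a toy model (it assumes requests are spread evenly across connections and that a loss stalls only the connection it happened on):

```python
def stalled_fraction(n_requests: int, n_connections: int) -> float:
    """Fraction of in-flight requests stalled by one lost packet,
    assuming requests are distributed evenly across connections."""
    per_connection = n_requests / n_connections
    return per_connection / n_requests

# HTTP/1.1: browsers open up to six TCP connections per origin.
print(stalled_fraction(60, 6))  # one sixth of requests stall
# HTTP/2: one connection per origin, so one loss stalls everything.
print(stalled_fraction(60, 1))
```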


They may all be affected, but with selective ACK the receiver will still ACK most of the data. The server will notice via SACK that one packet was missing, resend it along with even more new data, and all you get is a larger batch coming in one RTT later. Throughput isn't affected if your congestion controller can distinguish spurious losses from congestion (as BBR does). The problem only snowballs when future requests are made depending on previous data.


The other requests are still taking the latency hit.


Sure, but they were parallel requests in the first place, so in terms of wall time you're still only taking the latency hit once. And if they're larger in aggregate than the connection's BDP some of them would only have finished by the time that it has recovered from the loss anyway.

So... it's complicated. I'm not saying that HoL is a non-issue and that TCP is perfect. It's just that you're not eating the latency penalty N times and for anything that is dominated by throughput rather than latency it's even less relevant.


They’re all part of the same TCP connection, which means that if a packet gets lost, they all need to wait for this packet to be retransmitted, even if the data was unimportant for almost all of them.


As I understand it, in HTTP/2 there would be one logical stream per request. So in reality, all requests will be blocked on one (or a minority of) particular requests for which a packet was lost, which is clearly suboptimal.

This problem is known as head of line blocking.


They're all between the same client and the same server, but they're not necessarily the same request. If there's only a single request it's no different from HTTP/1.


Thanks all for the great explanations. I was failing to think about how a browser's opening of multiple TCP connections in 1.1 maps to the multiple streams model of HTTP/2. Cheers.


> HTTP/3 is expected to use two to three times more CPU time than the earlier versions for the same bandwidth consumption.

I wonder what the "for the same bandwidth consumption" comparison is based on. HTTP/2 with TLS, I guess?


> new implementation of TLS in the kernel

This sounds like insanity to me. Is there an old implementation of TLS?



