[Repost] How Browsers and Web Pages Work

Date: 2019-08-22 16:50:48


English original: /archive/what-really-happens-when-you-navigate-to-a-url/

Chinese original: http://cq-//02/%E6%B5%8F%E8%A7%88%E5%99%A8%E7%BD%91%E9%A1%B5%E5%B7%A5%E4%BD%9C%E5%8E%9F%E7%90%86/

My notes:

What really happens when you navigate to a URL

As a software developer, you certainly have a high-level picture of how web apps work and what kinds of technologies are involved: the browser, HTTP, HTML, web server, request handlers, and so on.

In this article, we will take a deeper look at the sequence of events that take place when you visit a URL.

1. You enter a URL into the browser

It all starts here:

2. The browser looks up the IP address for the domain name

The first step in the navigation is to figure out the IP address for the visited domain. The DNS lookup proceeds as follows:

Browser cache – The browser caches DNS records for some time. Interestingly, the OS does not tell the browser the time-to-live for each DNS record, so the browser caches them for a fixed duration (it varies between browsers, from 2 to 30 minutes).

OS cache – If the browser cache does not contain the desired record, the browser makes a system call (gethostbyname in Windows). The OS has its own cache.

Router cache – The request continues on to your router, which typically has its own DNS cache.

ISP DNS cache – The next place checked is your ISP's DNS server, which naturally has a cache of its own.

Recursive search – Your ISP's DNS server begins a recursive search, from the root nameserver, through the .com top-level nameserver, to Facebook's nameserver. Normally, the DNS server will have the names of the .com nameservers in cache, so a hit to the root nameserver will not be necessary.
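The first two layers of this lookup chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name FixedTtlDnsCache, the 120-second default, and the injectable resolver and clock parameters are hypothetical choices for this sketch, not how any particular browser is implemented. The only real API used for resolution is the standard library's socket.getaddrinfo.

```python
import socket
import time
from typing import Callable, Dict, List, Tuple


def system_resolve(hostname: str) -> List[str]:
    """Ask the OS resolver (which keeps its own cache) for IPv4 addresses."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


class FixedTtlDnsCache:
    """Hypothetical browser-style cache: every record lives for the same fixed time."""

    def __init__(
        self,
        resolver: Callable[[str], List[str]] = system_resolve,
        ttl_seconds: float = 120.0,  # assumed value inside the 2 to 30 minute range
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolver = resolver
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def lookup(self, hostname: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: no system call is made
        addresses = self._resolver(hostname)  # cache miss: fall through to the OS
        self._entries[hostname] = (now, addresses)
        return addresses
```

Calling FixedTtlDnsCache().lookup(...) twice in a row for the same hostname would reach the OS resolver only once; the second call is answered from the in-memory dictionary until the fixed lifetime runs out.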

Here is a diagram of what a recursive DNS search looks like:

One worrying thing about DNS is that an entire domain seems to map to a single IP address. Fortunately, there are ways of mitigating the bottleneck:

Round-robin DNS is a solution where the DNS lookup returns multiple IP addresses, rather than just one. For example, a single domain name can map to four IP addresses.

Load-balancer is the piece of hardware that listens on a particular IP address and forwards the requests to other servers. Major sites will typically use expensive high-performance load balancers.

Geographic DNS improves scalability by mapping a domain name to different IP addresses, depending on the client's geographic location. This is great for hosting static content so that different servers don't have to update shared state.

Anycast is a routing technique where a single IP address maps to multiple physical servers. Unfortunately, anycast does not fit well with TCP and is rarely used in that scenario.

Most of the DNS servers themselves use anycast to achieve high availability and low latency of the DNS lookups.
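To make the round-robin idea from the list above concrete, here is a minimal sketch of a record set that rotates its answer on every query, so that clients which simply take the first address end up spread across the servers. The class name RoundRobinRecordSet is hypothetical, and the 192.0.2.x addresses are reserved documentation addresses, not real servers.

```python
from collections import deque
from typing import Iterable, List


class RoundRobinRecordSet:
    """Hypothetical model of a DNS name that maps to several A records."""

    def __init__(self, addresses: Iterable[str]) -> None:
        self._addresses = deque(addresses)
        if not self._addresses:
            raise ValueError("at least one address is required")

    def answer(self) -> List[str]:
        """Return all addresses, then rotate so the next query starts one later."""
        response = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return response


records = RoundRobinRecordSet(
    ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
)
# A naive client that always connects to the first address in each answer.
first_choices = [records.answer()[0] for _ in range(5)]
```

After five queries, first_choices cycles through all four addresses and wraps back to the first one. Real resolvers and clients may reorder or cache answers, so the actual distribution is usually less even than in this sketch.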

3. The browser sends an HTTP request to the web server

You can be pretty sure that Facebook’s homepage will not be served from the browser cache because dynamic pages expire either very quickly or immediately (expiry date set to past).
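The "expiry date set to past" check can be sketched as follows. This is a simplified illustration that looks only at an Expires header value; real browser caches also take Cache-Control and other headers into account. The function name is_expired is hypothetical, and the sketch assumes that an unparseable value such as "0" should count as already expired.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def is_expired(expires_header: str, now: Optional[datetime] = None) -> bool:
    """Return True if a cached response with this Expires value is stale."""
    if now is None:
        now = datetime.now(timezone.utc)
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Assumption: an invalid date (for example "0") means "already expired".
        return True
    if expires_at.tzinfo is None:
        # Treat a date without zone information as UTC.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return expires_at <= now
```

A dynamic page that sends an Expires date in the past fails this check immediately, which is why the browser goes back to the server instead of using its cache.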

So, the browser will send this request to the Facebook server:

GET / HTTP/1.1
Accept: application/x-ms-application, image/jpeg, application/xaml+xml, [...]
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; [...]
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Host:
Cookie: datr=1265876274-[...]; locale=en_US; lsd=WW[...]; c_user=2101[...]

The GET request names the URL to fetch: "/". The browser identifies itself (User-Agent header), and states what types of responses it will accept (Accept and Accept-Encoding headers). The Connection header asks the server to keep the TCP connection open for further requests.
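To show how such a request is laid out on the wire, here is a minimal sketch that assembles a GET request as bytes. The function name build_get_request and the host example.test are placeholders chosen for this illustration; the header set is a small subset of what a real browser sends.

```python
from typing import Dict, Optional


def build_get_request(
    host: str, path: str = "/", extra_headers: Optional[Dict[str, str]] = None
) -> bytes:
    """Assemble a minimal HTTP/1.1 GET request (illustrative, not a full client)."""
    headers = {
        "Host": host,
        "Accept-Encoding": "gzip, deflate",
        "Connection": "Keep-Alive",
    }
    if extra_headers:
        headers.update(extra_headers)
    lines = [f"GET {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines are separated by CRLF, and an empty line ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request_bytes = build_get_request("example.test", extra_headers={"Accept": "text/html"})
```

The resulting bytes could be written to a TCP socket connected to port 80 of the server. In practice a library such as http.client handles this framing for you.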

The request also contains the cookies that the browser has for this domain. As you probably already know, cookies are key-value pairs that track the state of a web site between page requests. The cookies store the name of the logged-in user, a secret number that the server assigned to the user, some of the user's settings, and so on. The cookies are stored in a text file on the client and sent to the server with every request.
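On the server side, the Cookie header is just a string of key-value pairs that has to be split apart. The sketch below uses Python's standard http.cookies module to do that. The helper name parse_cookie_header and the session_id value are made up for illustration and do not correspond to the truncated cookie values shown in the request above.

```python
from http.cookies import SimpleCookie
from typing import Dict


def parse_cookie_header(header_value: str) -> Dict[str, str]:
    """Split a Cookie request header into a plain name-to-value dictionary."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


# Hypothetical header value, shaped like the one a browser would send.
cookies = parse_cookie_header("locale=en_US; session_id=abc123")
```

Here cookies holds the two pairs locale and session_id, which a request handler could use to look up the user's session and preferences.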

There is a variety of tools that let you view raw HTTP requests and the corresponding responses. My favorite tool for viewing raw HTTP traffic is Fiddler, but there are many others (e.g., Firebug). These tools are a great help when optimizing a site.

In addition to GET requests, another type of request that you may be familiar with is the POST request.
