Uncovering Google's Web Advertising Technology: An Internet Big Data Perspective
Almost everyone who uses the Internet is bothered by online ads of every kind, which also keep consuming our bandwidth. Look a little closer, though, and you may notice that the ads shown on different websites tend to match your own preferences, which suggests the technology behind them is far from simple. The Internet big data technologies involved include cookies, dynamic scripts, user profiling, user behavior analysis, and large-scale data access.
For example, if you click on a laptop on JD.com, then a few days later, when you visit a website you have never visited before, you will probably find a laptop ad on the page.
Figure 1
As a researcher of Internet big data technology, my instinctive reaction was of course to look at the page source. Sure enough, it contains a matching script, and the "-ad-" in its name probably indicates that this is where the advertisement is embedded.
Figure 2
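The same inspection of the page source can be done programmatically. The following is a minimal sketch, assuming the ad slot can be recognized by the substring "-ad-" in an id, class, or src attribute. The sample HTML, the class name AdSlotFinder, and the example.com URL are hypothetical and are not taken from the page in the figure.

```python
from html.parser import HTMLParser


class AdSlotFinder(HTMLParser):
    """Collect tags whose id, class, or src attribute contains a marker string."""

    def __init__(self, marker="-ad-"):
        super().__init__()
        self.marker = marker
        self.hits = []  # list of (tag, attribute name, attribute value)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("id", "class", "src") and value and self.marker in value:
                self.hits.append((tag, name, value))


# Hypothetical page fragment, written for illustration only.
SAMPLE_HTML = """
<div id="content">article text</div>
<div id="side-ad-slot">
  <script src="https://example.com/js/show-ad-unit.js"></script>
</div>
"""

finder = AdSlotFinder()
finder.feed(SAMPLE_HTML)
for tag, name, value in finder.hits:
    print(tag, name, value)
```

A substring match like this is only a heuristic: it can miss ad slots that use other naming schemes and can flag unrelated elements, so the hits still need to be checked by hand.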
But because it is a dynamic script, the page source alone does not tell you which site the ad comes from. To find out, open the developer tools (the Sources view) from the browser's settings menu and locate the script structure for the ad slot.
Figure 3
Then look at the URL that the dynamic script produces after it executes. As the figure below shows, the ad URL points to googleads.g.doubleclick.net, and the domain name shows that it belongs to Google's ad service.
Figure 4
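Identifying the ad network from a URL comes down to looking at the host name. The sketch below assumes a small hand-written table of known ad domains. The function name ad_network_of, the table KNOWN_NETWORKS, and the path and query string in the sample URL are hypothetical; only the host googleads.g.doubleclick.net comes from the article.

```python
from urllib.parse import urlsplit


def ad_network_of(url, known_networks):
    """Return the owner of the ad domain the URL points to, or None if unknown."""
    host = urlsplit(url).hostname or ""
    for domain, owner in known_networks.items():
        # Match the domain itself or any of its subdomains, but not look-alikes
        # such as "notdoubleclick.net".
        if host == domain or host.endswith("." + domain):
            return owner
    return None


# Hypothetical lookup table with a single entry.
KNOWN_NETWORKS = {"doubleclick.net": "Google (DoubleClick)"}

# The path and query below are placeholders, not a real ad request.
ad_url = "https://googleads.g.doubleclick.net/example/path?slot=1"
print(ad_network_of(ad_url, KNOWN_NETWORKS))
```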
Indeed, DoubleClick is an Internet advertising company that Google acquired in 2008. It provides a range of ad management and ad delivery solutions that help businesses buy, create, or sell online ads, and it lets its customers centrally plan, execute, monitor, and track online advertising campaigns. From this we can draw the architecture of Google's web advertising technology platform.
Figure 5
The whole process follows steps 1 to 5 as numbered in the diagram.
1 Customers who want to advertise register with DoubleClick;
2 Sites that join the ad network obtain from DoubleClick a dynamic script for embedding ads, similar to the one in Figure 2, and embed the code in their pages;
3 Internet users open the page in a browser, and the dynamic script runs in the user's browser and produces a URL pointing to DoubleClick;
4 When the browser connects to DoubleClick, DoubleClick generates a unique identifier for the user and writes it to a local cookie file;
5 Each time the user visits a page carrying the ad script, the browser automatically sends the DoubleClick cookie along, and suitable ads are retrieved from DoubleClick. In this way every user's unique identifier is recorded in DoubleClick's database. This step evidently relies on behavioral data about which ads we click and which pages we browse, and the volume of that data is large. Accurate ad targeting therefore requires big data mining and user profiling.
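Steps 4 and 5 can be illustrated with a toy simulation. This is only a sketch under simple assumptions, not DoubleClick's actual implementation: the class ToyAdServer, the cookie name uid, the topic labels, and the rule "show the ad for the most frequently visited topic" are all hypothetical stand-ins for the real identification and profiling logic.

```python
import uuid
from collections import Counter, defaultdict


class ToyAdServer:
    """Toy ad server: identifies browsers by cookie and targets ads by visit history."""

    def __init__(self, ads_by_topic):
        self.ads_by_topic = ads_by_topic       # ads registered by advertisers (step 1)
        self.history = defaultdict(Counter)    # user id -> visit count per page topic

    def serve(self, cookies, page_topic):
        # Step 4: on first contact, issue a unique id and store it in the cookie jar.
        user_id = cookies.get("uid")
        if user_id is None:
            user_id = uuid.uuid4().hex
            cookies["uid"] = user_id
        # Step 5: record the visit, then pick the ad for the user's dominant topic.
        self.history[user_id][page_topic] += 1
        top_topic, _count = self.history[user_id].most_common(1)[0]
        return self.ads_by_topic.get(top_topic, "generic ad")


server = ToyAdServer({"laptops": "laptop ad", "news": "newspaper ad"})
browser_cookies = {}  # stands in for the browser's local cookie file

server.serve(browser_cookies, "laptops")  # browsing laptops on a shopping site
server.serve(browser_cookies, "laptops")
# Later visit to an unrelated news site that embeds the same ad script:
print(server.serve(browser_cookies, "news"))
```

Because the same cookie accompanies every request, the server can link visits across unrelated sites, which is why the laptop ad follows the user to a site never visited before.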
Cookies play a major role in this process, and a DoubleClick cookie file can be found on almost every computer. For IE on Windows 7, it is usually under C:\Users\Administrator\AppData\Local\Microsoft\Windows\Temporary Internet Files; in Chrome, cookies can be viewed under Settings - Privacy Settings - Content Settings. Once found, they can be cleared.
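To see what such a cookie carries, the standard-library http.cookies module can parse a Set-Cookie style header. This is a minimal sketch: the header string, the cookie name uid, and its value are made up for illustration and are not an actual DoubleClick cookie.

```python
from http.cookies import SimpleCookie


def describe_cookie(header, name):
    """Parse a cookie header and return the named cookie's value and key attributes."""
    jar = SimpleCookie()
    jar.load(header)
    morsel = jar[name]
    return {
        "value": morsel.value,          # the identifier assigned to this browser
        "domain": morsel["domain"],     # which sites the browser sends it back to
        "max_age": morsel["max-age"],   # lifetime in seconds, as a string
    }


# Hypothetical header: a one-year identifier scoped to the ad network's domain.
header = "uid=abc123; Domain=.doubleclick.net; Path=/; Max-Age=31536000"
print(describe_cookie(header, "uid"))
```

The Domain attribute is what matters for tracking: the browser returns the cookie to that domain no matter which site embedded the ad script.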
The author has written the monograph Internet Big Data Processing Technology and Applications (Tsinghua University Press, 2017) and runs a WeChat official account of the same name, which focuses on sharing scientific and engineering knowledge about big data technology and also provides readers with extended reading material. You are welcome to adopt this book as a textbook for big data courses; related teaching resources are available to share.