Web crawlers are used to index the internet to help people search the web more efficiently.
Prerequisites
- Basic knowledge of JavaScript / HTML.
- Basic knowledge of Arm Treasure Data.
- Basic knowledge of the Arm Treasure Data JavaScript SDK.
User Agents for Google Crawlers
Because the Treasure Data JavaScript SDK tracks all page views, raw data usually contains many accesses from web crawlers. You can use the td_browser parameter to tell whether an access came from a browser or not.
td_browser is derived from the user-agent string on our SDK backend server. It shows the following value for each Google crawler (see Google Crawler).
Crawler | User-agent token | HTTP(S) request user-agent | td_browser |
---|---|---|---|
Googlebot (Google Web search) | Googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) | “Googlebot” |
Googlebot (Google Web search) | Googlebot | (rarely used): Googlebot/2.1 (+http://www.google.com/bot.html) | “Googlebot” |
Googlebot News | Googlebot-News (Googlebot) | Googlebot-News | “Other” |
Googlebot Images | Googlebot-Image (Googlebot) | Googlebot-Image/1.0 | “Other” |
Googlebot Video | Googlebot-Video (Googlebot) | Googlebot-Video/1.0 | “Other” |
Google Mobile (feature phone) | Googlebot-Mobile | SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) | “UP.Browser” |
Google Mobile (feature phone) | Googlebot-Mobile | DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html) | “Other” |
Google Smartphone | Googlebot | Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) | “Googlebot” |
Google Mobile AdSense | Mediapartners-Google | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) | “Other” |
Google Mobile AdSense | Mediapartners (Googlebot) | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) | “Other” |
Google AdSense | Mediapartners-Google | Mediapartners-Google | “Other” |
Google AdSense | Mediapartners (Googlebot) | Mediapartners-Google | “Other” |
Google AdsBot landing page quality check | AdsBot-Google | AdsBot-Google (+http://www.google.com/adsbot.html) | “Other” |
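As the table shows, only some Google crawlers are reported as "Googlebot" in td_browser; others appear as "Other" or "UP.Browser". The following is a minimal sketch of one way to exclude Google crawler accesses from exported page-view records by checking td_browser first and then falling back to the crawler tokens listed in the table. It assumes each record also carries the raw user-agent string in a td_user_agent field; the function name, the sample records, and the "Chrome" td_browser value are hypothetical and used for illustration only.

```javascript
// Crawler tokens taken from the user-agent strings in the table above.
// "Googlebot" also matches Googlebot-News, Googlebot-Image, Googlebot-Video,
// and Googlebot-Mobile.
const GOOGLE_CRAWLER_TOKENS = /Googlebot|Mediapartners-Google|AdsBot-Google/;

// Returns true when a page-view record looks like a Google crawler access.
// Assumption: record.td_user_agent holds the raw user-agent string.
function isGoogleCrawler(record) {
  if (record.td_browser === "Googlebot") {
    return true;
  }
  const userAgent = record.td_user_agent || "";
  return GOOGLE_CRAWLER_TOKENS.test(userAgent);
}

// Hypothetical sample records for illustration.
const sampleRecords = [
  {
    td_browser: "Googlebot",
    td_user_agent:
      "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  },
  {
    td_browser: "Other",
    td_user_agent: "AdsBot-Google (+http://www.google.com/adsbot.html)",
  },
  {
    td_browser: "Other",
    td_user_agent:
      "DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)",
  },
  {
    // Assumed value for a regular visitor; not taken from the table.
    td_browser: "Chrome",
    td_user_agent:
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  },
];

// Keep only the records that do not look like Google crawler accesses.
const humanViews = sampleRecords.filter((record) => !isGoogleCrawler(record));
console.log(humanViews.length); // 1
```

This sketch covers only the Google crawlers listed in the table; other crawlers would need their own tokens.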