Tags: OSINT, Google Dorking, Dork, Beginner
Description: Explaining how search engines work and leveraging them to find hidden content!
Difficulty: Easy
URL: https://tryhackme.com/room/googledorking
Let’s Learn About Crawlers
Name the key term of what a “Crawler” is used to do
- Index. Crawlers index sites by searching for keywords and following external links.
What is the name of the technique that “Search Engines” use to retrieve this information about websites?
- Crawling
What is an example of the type of contents that could be gathered from a website?
- Keywords
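To make the indexing idea concrete, here is a minimal sketch using Python's standard `html.parser` that pulls links and meta keywords out of a page, much like a crawler would during indexing. The page content and the `LinkAndKeywordIndexer` class name are made up for illustration.

```python
from html.parser import HTMLParser

class LinkAndKeywordIndexer(HTMLParser):
    """Collects href links and <meta name="keywords"> content from a page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        if tag == "meta" and attrs.get("name") == "keywords":
            self.keywords = [k.strip() for k in attrs.get("content", "").split(",")]

# Hypothetical page a crawler might fetch
page = """<html><head>
<meta name="keywords" content="security, dorking, crawlers">
</head><body>
<a href="https://ablog.com/">A blog</a>
<a href="/about">About</a>
</body></html>"""

indexer = LinkAndKeywordIndexer()
indexer.feed(page)
print(indexer.links)     # ['https://ablog.com/', '/about']
print(indexer.keywords)  # ['security', 'dorking', 'crawlers']
```

A real crawler would then queue each discovered link and repeat the process, which is how one site's external links lead a search engine to the next site.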
Beepboop - Robots.txt
Where would “robots.txt” be located on the domain “ablog.com”
- ablog.com/robots.txt
If a website was to have a sitemap, where would that be located?
- /sitemap.xml
How would we only allow “Bingbot” to index the website?
- User-agent: Bingbot
How would we prevent a “Crawler” from indexing the directory “/dont-index-me/”?
- Disallow: /dont-index-me/
What is the extension of a Unix/Linux system configuration file that we might want to hide from “Crawlers”?
- .conf
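The directives above can be checked with Python's standard `urllib.robotparser`, which applies the same matching rules a well-behaved crawler would. The robots.txt content here is a hypothetical file combining the answers above: Bingbot may crawl everything except `/dont-index-me/`, while every other crawler is blocked entirely.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only Bingbot may index the site,
# and even Bingbot must stay out of /dont-index-me/
robots_txt = """\
User-agent: Bingbot
Disallow: /dont-index-me/

User-agent: *
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(robots_txt)

print(rp.can_fetch("Bingbot", "/blog/"))                     # True
print(rp.can_fetch("Bingbot", "/dont-index-me/app.conf"))    # False
print(rp.can_fetch("Googlebot", "/blog/"))                   # False
```

Note that robots.txt is only advisory: a malicious crawler can ignore it, which is exactly why listing sensitive paths like `.conf` files in it can backfire.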
Task 5 Sitemaps
What is the typical file structure of a “Sitemap”?
- XML
What real life example can “Sitemaps” be compared to?
- A map
Name the keyword for the path taken for content on a website
- Route
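To see what those routes look like in practice, here is a sketch that parses a small hypothetical sitemap for `ablog.com` with Python's standard `xml.etree.ElementTree`; the URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap.xml: each <url>/<loc> entry is a route to content
sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://ablog.com/</loc></url>
  <url><loc>https://ablog.com/articles/hidden-post</loc></url>
</urlset>"""

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap)
routes = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]
print(routes)  # ['https://ablog.com/', 'https://ablog.com/articles/hidden-post']
```

This is why sitemaps matter for OSINT: a route like `/articles/hidden-post` may never be linked from the site's navigation, yet it is handed to crawlers (and to you) in one file.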
Task 6 What is Google Dorking?
What would be the format used to query the site bbc.co.uk about flood defences
- site:bbc.co.uk flood defences
What term would you use to search by file type?
- filetype:
What term can we use to look for login pages?
- intitle: login
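These operators (`site:`, `filetype:`, `intitle:`) can be combined in one query. As a sketch, here is a hypothetical `build_dork` helper that assembles a dork string from the operators covered in this task; the function name and parameters are made up for illustration.

```python
def build_dork(site=None, filetype=None, intitle=None, terms=""):
    """Assemble a Google dork query string from optional operators."""
    parts = []
    if site:
        parts.append(f"site:{site}")       # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict by file extension
    if intitle:
        parts.append(f"intitle:{intitle}")    # match words in the page title
    if terms:
        parts.append(terms)                   # free-text search terms
    return " ".join(parts)

print(build_dork(site="bbc.co.uk", terms="flood defences"))
# site:bbc.co.uk flood defences
print(build_dork(site="bbc.co.uk", filetype="pdf", intitle="login"))
# site:bbc.co.uk filetype:pdf intitle:login
```

Stacking operators like this is the core of dorking: each one narrows the result set until only the hidden content you are hunting for remains.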