Files and Documents

Files and media are among the juiciest targets to look for when planning a penetration test. Companies that regularly publish to the web constantly overlook information that should never have left the organization. I have found things like email distribution lists, internal-only email addresses perfect for phishing, personnel information, client communications, etc. Don't forget public-facing FTP servers. They always seem to have something juicy hidden in them.
Documents.html is a tool that allows you to take a search term related to your target, and search for various file types associated with the term. The term should be something as unique as possible, but still related to the target: company name, platform, application, client, etc. Perform multiple searches for various terms for the best coverage.
  • ​ - The most comprehensive online file storage search engine. They have individual search engines for badongo, Mediafire, Zshare, 4shared and taringa. They also provide a search all function that searches filefactory, depositfiles, easy-share, sharedzilla, sendspace, yousendit, letitbit, drop, sharebee, rapidspread, and many others.
  • PowerMeta - PowerMeta searches for publicly available files hosted on various websites for a particular domain by using specially crafted Google and Bing searches. It then lets you download those files from the target domain. Once the files are retrieved, PowerMeta can analyze their metadata. Interesting things commonly found in metadata include usernames, domains, software titles, and computer names.
  • ​goofile - Use this tool to search for a specific file type in a given domain.
  • ​FilePhish - A simple OSINT Google query builder for fast and easy document and file discovery.
  • MetaFinder - Search for documents in a domain through search engines (Google, Bing, and Baidu). The objective is to extract metadata.
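
The tools above largely automate search-engine dorking for documents on a target domain. As a rough illustration of the idea (a minimal sketch, not how any of the listed tools is actually implemented), the following builds `site:` plus `filetype:` queries that can be pasted into a search engine. The function name, the default file types, and `example.com` are assumptions made for the example.

```python
from itertools import product

# Hypothetical defaults; adjust the file types to the engagement.
DEFAULT_FILETYPES = ("pdf", "docx", "xlsx", "pptx")


def build_file_dorks(domain, terms=(), filetypes=DEFAULT_FILETYPES):
    """Return one search query string per (term, file type) combination."""
    # With no terms, still emit one query per file type for the whole domain.
    term_list = list(terms) or [""]
    queries = []
    for term, filetype in product(term_list, filetypes):
        parts = [f"site:{domain}", f"filetype:{filetype}"]
        if term:
            parts.append(f'"{term}"')  # quote the term for an exact-phrase match
        queries.append(" ".join(parts))
    return queries


if __name__ == "__main__":
    # example.com and the search term are placeholders, not real targets.
    for query in build_file_dorks("example.com", terms=["confidential"]):
        print(query)
```

Running several terms through the builder mirrors the advice above: multiple searches with different, target-specific terms give the best coverage.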

Public Directory, FTP, and Cloud

Article, Presentation, and Book search

  •  - The largest free library in human history, giving the world free access to over 84 million scholarly journal articles, over 6.6 million academic and general-interest books, over 2.2 million comics, and over 381 thousand magazines. Commonly referred to as "Libgen" for short. Libgen has zero regard for copyright.
  • ​ - A "shadow library" that provides free access to millions of research papers and books by bypassing paywalls. SciHub has zero regard for copyright.
  • ​ - An open directory data archive dedicated to the long-term preservation of any and all data including websites, books, games, software, video, audio, other digital-obscura and ideas. Currently hosts over 140TB of data for free.
    • ​ - A searchable index of Much faster than manually digging through subfolders or using Google dorks.
  • ​ - Search over 16,000 journals, over 6.5 million articles in 80 different languages from 129 different countries.
  •  - Allows users to upload content including presentations, infographics, documents, and videos. Users can upload files privately or publicly in PowerPoint, Word, PDF, or OpenDocument format.


Although they have other functionality, Michael Bazzell's Images.html and Videos.html tools help search for image terms across multiple platforms. Looking for faces of employees? Maybe a picture of their security badge you can copy? An image of a target you can extract metadata from later? Start with a good image search. Google is hard to beat for this, but there are other platforms that can lead to some interesting discoveries.
*Note - this tool only finds images associated with a search term. If you already have an image and would like to find out more about it, that is discussed under the Forensics section.
  • CameraTrace - Trace where a camera has been using the metadata it embeds in photos that end up on the internet.
  • ​FotoForensics - Free and public photo forensics tools.
  • ​ - Image Metadata Analysis tool
  • ​ - An open-source digital image forensic toolset
  • ​ - A video and audio search engine that searches over 70 different platforms.
  • Reverse Image Search
    •  - Chinese reverse image search engine
    • Image So Search - Qihoo 360 reverse image search
    •  - Lets you upload an image once and immediately search for it in Google, Yandex, and Bing.
    • Pixsy - Lets you upload pictures from a computer, social networks, or cloud storage, then search for duplicates and check whether they are copyrighted
    • Image Search Assistant - Searches for a picture, screenshot, or fragment of a screenshot in several search engines and stores at once
    •  - Reverse image search engine for scientific and medical images
    • DepositPhotos Reverse Image Search - Reverse image search tool (strictly limited to DepositPhotos' collection of 222 million files).
    • EveryPixel - Reverse image search engine. Searches across 50 leading stock image agencies. Results can be filtered to only free or only paid images.
  • Facial Recognition
    • ​pictriev - Search engine for faces. Upload the picture of choice and find links to other pictures with similar people.
    • ​PimEyes - Facial recognition and reverse image search
    • Portrait Matcher - Upload a picture of a face and get three paintings that show similar people.
    • FindFace - Russian face search engine.
    • Face Recognition - Facial recognition API for Python and the command line
    •  - Search for people in VK, Odnoklassniki, TikTok, and ClubHouse by photo or identikit
  • Clothing/Shopping

Breach/Leak/Paste data

Looking for easy creds? Linked data? Password hashes? Breaches can be a trove of low-hanging fruit when the target has not been diligent about cyber hygiene. Credentials found in large data breaches often turn into password lists, such as the infamous rockyou.txt list that came from a sizeable breach in 2009.
The tools and links below can be used to parse data from known breaches and leaks, or to detect and alert on credentials when new breach data is reported. Paste sites like Pastebin have recently changed how they can be parsed. Pastebin itself has removed the ability to search its pastes. However, with a bit of clever Google dorking, you can still search for breach data by submitting your search along with "".
  •  - A premium breach data site, but well worth it. Can search by multiple types of indicators like email, IP, address, domain, even password.
  • ​Have I Been Pwned - Check if your email has been compromised in a data breach
  • ​Scylla - One of the greatest breach parsing tools available.
  •  - Leak-Lookup lets you search across thousands of data breaches for credentials that may have been compromised, so you can stay on top of the latest data leaks. AKA Citadel
  • ​ - Search via email address, username or phone number to see censored passwords. They also provide the full password as a SHA-1 hash, which can easily be cracked.
  • ​ - A tool for monitoring leaked passwords for accounts linked to emails. Actually shows you the leaked passwords.
  • ​ - Another leaked database search. Requires a paid subscription.
  •  - Provides downloads of leaked and breached databases. Requires a paid subscription.
  • ​http://4wbwa6vcpvcr3vvf4qkhppgy56urmjcj2vagu2iqgp3z656xcmfdbiqd.onion/ - An .onion site that allows you to search through the full 2019 Facebook data breach.
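
Several entries above mention unsalted SHA-1 hashes and wordlists such as rockyou.txt. The sketch below illustrates why an unsalted SHA-1 of a common password offers little protection: every candidate in a wordlist is hashed and compared against the leaked value. The function names and the three-word list are placeholders made up for the example, not data from any real breach.

```python
import hashlib


def sha1_hex(text):
    """Return the lowercase hex SHA-1 digest of a UTF-8 string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_plaintext(target_hash, candidates):
    """Return the first candidate whose SHA-1 matches target_hash, else None."""
    target = target_hash.strip().lower()
    for candidate in candidates:
        word = candidate.rstrip("\r\n")  # tolerate lines read straight from a file
        if sha1_hex(word) == target:
            return word
    return None


if __name__ == "__main__":
    # Tiny stand-in wordlist (illustrative only).
    wordlist = ["letmein", "password", "qwerty"]
    leaked = sha1_hex("password")
    print(find_plaintext(leaked, wordlist))
```

Because `candidates` can be any iterable of strings, an open wordlist file can be passed in directly and is read line by line. Dedicated cracking tools do the same comparison far faster.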

Paste Tools

Misc Tools and Resources

  • ​Cryptome - Archive of publicly leaked documents. Usually government related.
  • ​easy-to-read breach list - Easy and helpful tracker for breach data.
  • ​Firefox Monitor - Great tool for searching if your accounts have been found in a breach and can alert you when new breaches are discovered and parsed.
  • ​pwd query - Check if your passwords have been compromised from a data leak...
  • ​Analysis Information Leak framework - AIL is a modular framework to analyze potential information leaks from unstructured data sources like pastes from Pastebin or similar services or unstructured data streams.
  • ​breach-parse - A tool for parsing breached passwords by The Cyber Mentor. Repo also contains large breach data collections.
  •  - A subreddit that aims to bring data hoarders together to share their passion with like-minded people.
  • ​ - Exchange and Sharing sub for /r/DataHoarder
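
For checking whether a password appears in known breach data, one publicly documented approach is the k-anonymity range query offered by Have I Been Pwned's Pwned Passwords service: only the first five hex characters of the SHA-1 hash are sent, and the returned suffixes are matched locally. The sketch below assumes that endpoint and its `SUFFIX:COUNT` response format. The helper names and the User-Agent string are made up for the example, and other tools in the list may work differently.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """Return (prefix, suffix): the first 5 and remaining 35 hex chars of the SHA-1."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix, range_body):
    """Look up a hash suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix.upper():
            return int(count)
    return 0


def pwned_count(password):
    """Query the range endpoint; only the 5-character prefix leaves the machine."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "osint-notes-example"},  # placeholder agent string
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return count_in_range(suffix, response.read().decode("utf-8"))
```

Keeping the parsing in `count_in_range` separate from the network call makes the matching logic easy to check offline against a saved response.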

Code Repositories

Ah, the gold mine of git repositories. At the time of writing, we are still in the golden age of security ignorance in coding. DevSecOps has not yet fully caught on, and software engineers everywhere post tidbits of insecure code for storage later, or post a bit of their config file on a forum asking for help. Little do they realize that in that bit of the config file, they accidentally posted their creds! These are a few examples of the fun things we can find when checking code repositories. Searching for these is usually limited to the context of a penetration test against an organization where you know they have software engineers busy creating the next great thing.
There are many great options out there for code repositories, but there are four that are the gold standard for checking.
You can manually parse these by user or subject but there are some handy tools that can help search and keep track.
  • ​Gitrob - Gitrob is a tool to help find potentially sensitive files pushed to public repositories on Github.
  • ​Git all secrets - Clone different gits and automatically scan them for secrets.
  • ​Truffle Hog - Searches through git repositories for secrets, digging deep into commit history and branches. This is effective at finding secrets accidentally committed.
  • ​gitleaks - This package contains a SAST tool for detecting hardcoded secrets like passwords, API keys, and tokens in git repos. Gitleaks aims to be the easy-to-use, all-in-one solution for finding secrets, past or present, in your code.
  • ​GitDorker - A Python program to scrape secrets from GitHub through usage of a large repository of dorks.
  • ​Repo Supervisor - Find secrets and passwords in your code
  • ​Watchman - Git change monitor
  • ​ - A search engine for contents of Git Repos
  • ​gitoops - GitOops is a tool to help attackers and defenders identify lateral movement and privilege escalation paths in GitHub organizations by abusing CI/CD pipelines and GitHub access controls.
  • ​ - Search 75 billion lines of code from 40 million projects
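
To make concrete what these scanners look for, here is a minimal sketch of pattern-based secret detection over text. The three rules are illustrative assumptions chosen for the example. Real tools such as those listed above ship much larger rule sets and also walk commit history, which this does not.

```python
import re

# Illustrative patterns only; real scanners use far more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "password_assignment": re.compile(
        r"(?:password|passwd|pwd)\s*[:=]\s*['\"]?[^\s'\"]+", re.IGNORECASE
    ),
}


def scan_text(text):
    """Return (line_number, rule_name, matched_text) for every pattern hit."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, rule_name, match.group(0)))
    return findings


if __name__ == "__main__":
    # The key below is a documentation-style example value, not a live credential.
    sample = 'aws_key = "AKIAIOSFODNN7EXAMPLE"\ndb_password: hunter2\n'
    for finding in scan_text(sample):
        print(finding)
```

Simple patterns like these produce false positives, which is one reason to prefer the dedicated tools once a repository of interest has been identified.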


GitHub dorking

" password" "access_key" "access_token" "amazonaws" "api.googlemaps AIza" "api_key" "api_secret" "apidocs" "apikey" "apiSecret" "app_key" "app_secret" "appkey" "appkeysecret" "application_key" "appsecret" "appspot" "auth" "auth_token" "authorizationToken" "aws_access" "aws_access_key_id" "aws_key" "aws_secret" "aws_token" "AWSSecretKey" "bashrc password" "bucket_password" "client_secret" "cloudfront" "codecov_token" "config" "conn.login" "connectionstring" "consumer_key" "credentials" "database_password" "db_password" "db_username" "dbpasswd" "dbpassword" "dbuser" "dot-files" "dotfiles" "encryption_key" "fabricApiSecret" "fb_secret" "firebase" "ftp" "gh_token" "github_key" "github_token" "gitlab" "gmail_password" "gmail_username" "herokuapp" "internal" "irc_pass" "JEKYLL_GITHUB_TOKEN" "key" "keyPassword" "ldap_password" "ldap_username" "login" "mailchimp" "mailgun" "master_key" "mydotfiles" "mysql" "node_env" "npmrc _auth" "oauth_token" "pass" "passwd" "password" "passwords" "pem private" "preprod" "private_key" "prod" "pwd" "pwds" " password" "redis_password" "root_password" "secret" "secret.password" "secret_access_key" "secret_key" "secret_token" "secrets" "secure" "security_credentials" "send.keys" "send_keys" "sendkeys" "SF_USERNAME salesforce" "sf_username" "" FIREBASE_API_JSON= "" vim_settings.xml "slack_api" "slack_token" "sql_password" "ssh" "ssh2_auth_password" "sshpass" "staging" "stg" "storePassword" "stripe" "swagger" "testuser" "token" "x-api-key" "xoxb " "xoxp" [WFClient] Password= extension:ica access_key bucket_password dbpassword dbuser extension:avastlic "" extension:bat extension:cfg extension:env extension:exs extension:ini extension:json extension:json googleusercontent client_secret extension:json extension:pem extension:pem private extension:ppk extension:ppk private extension:properties extension:sh extension:sls extension:sql extension:sql mysql dump extension:sql mysql dump password extension:yaml extension:zsh filename:.bash_history 
filename:.bash_history DOMAIN-NAME filename:.bash_profile aws filename:.bashrc mailchimp filename:.bashrc password filename:.cshrc filename:.dockercfg auth filename:.env DB_USERNAME NOT homestead filename:.env filename:.esmtprc password filename:.ftpconfig filename:.git-credentials filename:.history filename:.htpasswd filename:.netrc password filename:.npmrc _auth filename:.pgpass filename:.remote-sync.json filename:.s3cfg filename:.sh_history filename:.tugboat NOT _tugboat filename:_netrc password filename:apikey filename:bash filename:bash_history filename:bash_profile filename:bashrc filename:beanstalkd.yml filename:CCCam.cfg filename:composer.json filename:config filename:config irc_pass filename:config.json auths filename:config.php dbpasswd filename:configuration.php JConfig password filename:connections filename:connections.xml filename:constants filename:credentials filename:credentials aws_access_key_id filename:cshrc filename:database filename:dbeaver-data-sources.xml filename:deployment-config.json filename:dhcpd.conf filename:dockercfg filename:environment filename:express.conf filename:express.conf path:.openshift filename:filezilla.xml filename:filezilla.xml Pass filename:git-credentials filename:gitconfig filename:global filename:history filename:htpasswd filename:hub oauth_token filename:id_dsa filename:id_rsa filename:id_rsa or filename:id_dsa filename:idea14.key filename:known_hosts filename:logins.json filename:makefile filename:master.key path:config filename:netrc filename:npmrc filename:pass filename:passwd path:etc filename:pgpass filename:prod.exs filename:prod.exs NOT prod.secret.exs filename:prod.secret.exs filename:proftpdpasswd filename:recentservers.xml filename:recentservers.xml Pass filename:robomongo.json filename:s3cfg filename:secrets.yml password filename:server.cfg filename:server.cfg rcon password filename:settings SECRET_KEY filename:sftp-config.json filename:sftp-config.json password filename:sftp.json path:.vscode 
filename:shadow filename:shadow path:etc filename:spec filename:sshd_config filename:token filename:tugboat filename:ventrilo_srv.ini filename:WebServers.xml filename:wp-config filename:wp-config.php filename:zhrc HEROKU_API_KEY language:json HEROKU_API_KEY language:shell HOMEBREW_GITHUB_API_TOKEN language:shell jsforce extension:js conn.login language:yaml -filename:travis msg nickserv identify filename:config org:Target "AWS_ACCESS_KEY_ID" org:Target "list_aws_accounts" org:Target "aws_access_key" org:Target "aws_secret_key" org:Target "bucket_name" org:Target "S3_ACCESS_KEY_ID" org:Target "S3_BUCKET" org:Target "S3_ENDPOINT" org:Target "S3_SECRET_ACCESS_KEY" password path:sites databases password private -language:java PT_TOKEN language:bash redis_password root_password secret_access_key SECRET_KEY_BASE= shodan_api_key language:python WORDPRESS_DB_PASSWORD= xoxp OR xoxb OR xoxa s3.yml .exs beanstalkd.yml deploy.rake .sls
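
The dorks above can be scoped to a single organization and turned into search links in bulk. The following is a minimal sketch that assumes GitHub's web search URL form (`https://github.com/search` with `q` and `type=code` parameters). `Target` is the same placeholder organization name used in the list, and the function names are made up for the example. GitHub's search syntax has changed over time, so some qualifiers may need adjusting.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://github.com/search"


def build_github_queries(org, dorks):
    """Prefix each dork with an org: qualifier to scope it to one organization."""
    return [f"org:{org} {dork}" for dork in dorks]


def to_search_url(query):
    """Return a GitHub web search URL for a code search query."""
    return SEARCH_URL + "?" + urlencode({"q": query, "type": "code"})


if __name__ == "__main__":
    dorks = ['"aws_secret_key"', "filename:.env DB_USERNAME NOT homestead"]
    for query in build_github_queries("Target", dorks):
        print(to_search_url(query))
```

Generating the links up front makes it easier to work through a long dork list methodically and to record which queries returned results.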