OSINT Tools: Metagoofil
What is Metagoofil?
Metagoofil is a command-line tool that automates the process of extracting metadata from files found on websites, such as PDFs, Word documents, and images. Metadata can reveal information about the file’s author, creation date, location, and more. This data can be crucial for investigations, security assessments, and information gathering.
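To make the idea of document metadata concrete, here is a minimal Python 3 sketch that reads the author fields stored inside an Office Open XML file (docx, xlsx, pptx). It is not Metagoofil's own code, only an illustration of the kind of data such a tool surfaces. The sample file is built in memory, and the names in it (jdoe, asmith) are made up for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(data: bytes) -> dict:
    """Return author-related metadata from the bytes of a docx/xlsx/pptx file."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    fields = {
        "creator": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
    }
    result = {}
    for name, path in fields.items():
        node = root.find(path, NS)
        if node is not None and node.text:
            result[name] = node.text
    return result


def make_sample_docx() -> bytes:
    """Build a tiny stand-in archive (made-up data) holding only the metadata part."""
    core = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<cp:coreProperties "
        'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/" '
        'xmlns:dcterms="http://purl.org/dc/terms/">'
        "<dc:creator>jdoe</dc:creator>"
        "<cp:lastModifiedBy>asmith</cp:lastModifiedBy>"
        "<dcterms:created>2020-01-15T09:30:00Z</dcterms:created>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
    return buffer.getvalue()


sample_metadata = read_core_properties(make_sample_docx())
print(sample_metadata)
```

A user name such as the creator field is exactly the sort of detail that makes metadata interesting in an OSINT context, since it often matches an internal account name.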
Installation
Before we delve into using Metagoofil, let’s start with the installation process. Metagoofil is a Python-based tool, so setting it up is relatively straightforward.
Prerequisites
Ensure you have the following prerequisites:
- Python: Metagoofil requires Python 2.x. You can download Python from the official Python website.
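On many current systems the plain python command points to Python 3, so it can help to check which interpreter on your PATH is actually a 2.x one. The following is a rough Python 3 sketch of that check. The candidate names (python2.7, python2, python) are assumptions about how your system names its interpreters, and the helper functions are hypothetical, not part of Metagoofil.

```python
import shutil
import subprocess


def major_version(path):
    """Ask an interpreter for its major version; return None if it cannot be run."""
    try:
        out = subprocess.run(
            [path, "-c", "import sys; print(sys.version_info[0])"],
            capture_output=True, text=True, check=True, timeout=10,
        )
        return int(out.stdout.strip())
    except (OSError, subprocess.SubprocessError, ValueError):
        return None


def find_python2(candidates=("python2.7", "python2", "python"),
                 which=shutil.which, version_of=major_version):
    """Return the path of the first candidate that reports major version 2, else None."""
    for name in candidates:
        path = which(name)
        if path and version_of(path) == 2:
            return path
    return None


print(find_python2() or "No Python 2 interpreter found on PATH")
```

If nothing is found, you will need to install Python 2 separately before continuing.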
Installation Steps
Follow these steps to install Metagoofil:
- Clone the GitHub Repository:
Open your terminal or command prompt and run the following command to clone the Metagoofil GitHub repository to your local machine:
bash
git clone https://github.com/laramies/metagoofil.git
- Navigate to the Metagoofil Directory:
Change your working directory to the newly cloned repository:
bash
cd metagoofil
- Install Required Dependencies:
Metagoofil does not have many dependencies. You can use pip to install the lxml library, which is required for XML parsing:
bash
pip install lxml
This will install the necessary library.
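If you set up tools like this often, the steps above can be collected into a small script. This is only a sketch in Python 3: the repository URL and the lxml dependency come from the steps above, while the function names and the dry-run default are my own choices. By default it only prints the commands, so nothing is cloned or installed until you call install(dry_run=False).

```python
import subprocess
import sys

REPO_URL = "https://github.com/laramies/metagoofil.git"  # URL from the steps above


def installation_commands(python=sys.executable, target_dir="metagoofil"):
    """Return the installation steps as argument lists, in the order they should run."""
    return [
        ["git", "clone", REPO_URL, target_dir],
        # "python -m pip" installs into the environment of the chosen interpreter.
        [python, "-m", "pip", "install", "lxml"],
    ]


def install(dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for command in installation_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


install()  # dry run: prints the commands without executing them
```

Pass the path of your Python 2 interpreter as the python argument of installation_commands if the library should be installed for that interpreter rather than the one running the script.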
Using Metagoofil
Now that you have Metagoofil installed, let’s explore how to use it for metadata extraction.
Basic Usage
To run Metagoofil, use the following command format:
bash
python metagoofil.py -d <target_domain> -t <file_type> -l <search_limit> -n <download_limit> -o <output_directory> -f <output_file>
- <target_domain>: Specify the target domain from which you want to extract metadata.
- <file_type>: Define the file type to search for (e.g., pdf, doc, ppt, xls, jpg, etc.). Separate multiple types with commas.
- <search_limit>: Set the limit for the number of search results to examine.
- <download_limit>: Set the limit for the number of files to download.
- <output_directory>: Choose the directory where downloaded files will be saved.
- <output_file>: Specify the name of the output HTML file that will contain the metadata.
For example, to extract PDF and DOC files from example.com, examine up to 100 search results, and limit the download to 10 files:
bash
python metagoofil.py -d example.com -t pdf,doc -l 100 -n 10 -o output_directory -f output.html
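When you run the tool against several domains, it can be handy to assemble the command line from variables instead of typing it each time. Here is a small Python 3 sketch of that idea. The option letters are an assumption based on the usage text of the 2.x script (-l for the search limit, -n for the download limit, -o for the working directory), and they can differ between Metagoofil versions, so check them against the usage output of your copy. The function name and parameters are hypothetical.

```python
import shlex


def build_metagoofil_command(domain, file_types, search_limit, download_limit,
                             output_dir, report_file, python="python"):
    """Return the argument list for one Metagoofil run (flag letters are assumed)."""
    if not file_types:
        raise ValueError("at least one file type is required")
    return [
        python, "metagoofil.py",
        "-d", domain,
        "-t", ",".join(file_types),   # multiple types are comma-separated
        "-l", str(search_limit),      # assumed: number of search results to examine
        "-n", str(download_limit),    # assumed: number of files to download
        "-o", output_dir,             # assumed: directory for downloaded files
        "-f", report_file,
    ]


command = build_metagoofil_command(
    "example.com", ["pdf", "doc"], 100, 10, "output_directory", "output.html"
)
print(shlex.join(command))
```

Keeping the command as a list of arguments means it can be handed to subprocess.run directly, without worrying about shell quoting.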
Advanced Options
Metagoofil versions differ in the advanced options they offer, which may include specifying search engines, increasing the depth of the search, and customizing user agents. Use the --help flag to display the options available in your copy and their descriptions.
bash
python metagoofil.py --help
Responsible Use of Metagoofil
It’s important to use Metagoofil responsibly and within legal and ethical boundaries. Always ensure that you have the necessary permissions to extract metadata from websites and files, and respect the terms of service and legal restrictions of the websites you target. Metagoofil is a powerful tool for information gathering, but it should be used for legitimate purposes.
In conclusion, Metagoofil is a valuable asset for extracting metadata from online documents and files. By following the installation steps and understanding how to use it effectively, you can enhance your digital investigations, OSINT efforts, and security assessments. Happy metadata hunting!