This project is a web scraper designed to extract tweets from X (formerly Twitter) based on specific search queries and date ranges. It uses Puppeteer to automate browser interactions and can handle multiple search queries in a single run.
Features:
- Scrapes tweets based on keywords and date ranges
- Handles user authentication (manual login required if not already logged in)
- Saves results in CSV format
- Supports multiple queries through a CSV input file
- Implements scrolling to load more tweets
- Handles cases where no tweets are found
- Runs in headless mode when already logged in
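The keyword, date-range, and scrolling features above could be sketched roughly as follows. This is an illustration, not the project's actual code: `buildSearchUrl` and `scrollToLoad` are hypothetical names, the `since:`/`until:` operators and the `f=live` ("Latest" tab) parameter are assumptions about how the search is composed, and `page` is assumed to be a Puppeteer `Page` object.

```javascript
// Hypothetical helper: compose an X search URL for a keyword query
// restricted to a date range (dates as YYYY-MM-DD strings).
function buildSearchUrl(keywords, since, until) {
  // Assumption: the date range is expressed with X's search operators.
  const query = `${keywords} since:${since} until:${until}`;
  return `https://x.com/search?q=${encodeURIComponent(query)}&src=typed_query&f=live`;
}

// Hypothetical helper: scroll until the page height stops growing
// or maxRounds is reached. `page` is assumed to be a Puppeteer Page.
async function scrollToLoad(page, maxRounds = 10, pauseMs = 1500) {
  let previousHeight = 0;
  for (let round = 0; round < maxRounds; round++) {
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (height === previousHeight) break; // nothing new was loaded
    previousHeight = height;
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    // Give the timeline time to fetch and render more tweets.
    await new Promise((resolve) => setTimeout(resolve, pauseMs));
  }
}
```

A caller would typically navigate with `page.goto(buildSearchUrl(...))` and then call `scrollToLoad(page)` before reading tweets from the page. The round limit and pause length are placeholder values.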
Prerequisites:
- Node.js (v12 or higher recommended)
- npm (Node Package Manager)
Installation:
- Clone this repository:

      git clone https://github.com/zahidhussaina2l/x_ui_scraper
      cd x_ui_scraper

- Install the required dependencies:

      npm install

Usage:
- Run the script:

      npm start

- On first run, the script will create a sample input.csv file. Edit this file with your desired search queries and date ranges.
- Run the script again to start scraping based on your input file.
- If you're not logged in, the browser will open, and you'll need to log in manually. Once logged in, press Enter in the console to continue.
- The script will process each query in the input file and save results to individual CSV files.
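As a rough sketch of the last step, per-query CSV output might look like the following. The function names, the column names, and the file-naming scheme are all hypothetical; the script's actual output layout may differ. Fields are quoted following RFC 4180 so that tweet text containing commas, quotes, or newlines stays in one column.

```javascript
// Quote a field if it contains a comma, double quote, or line break.
function csvField(value) {
  const text = value == null ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize an array of row objects into CSV text with a header line.
function toCsv(rows, columns) {
  const header = columns.map(csvField).join(',');
  const lines = rows.map((row) => columns.map((c) => csvField(row[c])).join(','));
  return [header, ...lines].join('\n') + '\n';
}

// Hypothetical naming scheme: one file per query and date range.
function outputFileName(query, since, until) {
  const slug = query
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_') // collapse anything non-alphanumeric
    .replace(/^_+|_+$/g, '');    // trim leading/trailing underscores
  return `${slug}_${since}_${until}.csv`;
}
```

The resulting string could then be written with `fs.writeFileSync(outputFileName(...), toCsv(rows, columns))`.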
The input.csv file should have the following format: