Python script to download INE courses, including labs, exercises, quizzes, slides, and videos!
- Python (3.6.* - 3.8.*)
- Python pip3
- Python module requests_toolbelt
- Python module requests
- Python module loguru
- External downloader aria2
The script was written against the APIs of the iOS application to avoid Google's invisible captcha implementation. Hence you will see a header (X-Ine-Mobile) hard-coded in the script with a static API key, which is required for the iOS API calls to succeed (the key is hard-coded in the iOS application binary and can easily be grepped).
Initialization:
- The script starts by loading the credentials (username/email and password) from the config.json file
- It then logs in to the INE account and loads the JWT token into the Authorization header for use in subsequent API calls
- The script then makes an API call to download the metadata of all available INE courses and writes it to all_courses_metadata.json
- It then checks which subscriptions the user account has, filters the fetched courses accordingly, and writes the result to another file named all_courses_with_access.json
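The first two initialization steps can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the config.json key names (`username`, `password`), the function names, and the `Bearer` scheme in the Authorization header are all assumptions. Only the file name config.json and the X-Ine-Mobile header come from the description above.

```python
import json
from pathlib import Path


def load_credentials(config_path):
    """Read the username/email and password from a JSON config file.

    The key names "username" and "password" are assumptions; the real
    config.json layout may differ.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return config["username"], config["password"]


def build_api_headers(jwt_token, mobile_api_key):
    """Build the headers reused by every API call after login."""
    return {
        # The "Bearer" prefix is an assumption about the token scheme.
        "Authorization": f"Bearer {jwt_token}",
        # Static key expected by the iOS API, as described above.
        "X-Ine-Mobile": mobile_api_key,
    }
```

Keeping the header construction in one place means the token only has to be set once after login and can then be attached to every later request.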
Downloading of videos:
- Makes an API call to fetch the video name and URL, checks whether a subtitle URL is present, and downloads all of them!
- Also filters based on the resolution, trying 1080 first, then 720, then lower resolutions
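The resolution fallback described above could look something like the sketch below. The shape of each source entry (a dict with `height` and `url` keys) and the function name are hypothetical; the real API response may be structured differently.

```python
def pick_best_source(sources, max_height=1080):
    """Pick the highest resolution at or below max_height.

    Each source is assumed to be a dict with "height" and "url" keys
    (a hypothetical layout). If every source is above max_height, the
    smallest one is returned; an empty list gives None.
    """
    if not sources:
        return None
    candidates = [s for s in sources if s["height"] <= max_height]
    if candidates:
        return max(candidates, key=lambda s: s["height"])
    return min(sources, key=lambda s: s["height"])
```

With sources at 360, 720, and 1080, this returns the 1080 entry; if 1080 is missing it falls back to 720, and so on down.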
Downloading of slides (format looks like the following):
- An initial request returns the slide metadata containing a link to an index.html file (with cookies in the response headers)
- We store the cookies and then download the browsersupport.js and player.js files (these are static and can be reused with other slides)
- INE uses a mix of css, js, woff, and png files, but all have incrementable file names (the format differs for text and binary files)
- Wrote a while loop that runs until the incremented file returns 404 and then stops downloading that category: CSS/JS/WOFF/PNG
- It also downloads any attachments added with the slides.
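The "increment until 404" loop for the slide assets might be sketched as below. To keep the example free of network access, the HTTP call is passed in as a `fetch` callable returning a `(status_code, body)` tuple; that callable, the function name, the file-name template, and the safety `limit` are all assumptions made for illustration.

```python
def download_numbered_files(fetch, name_template, start=0, limit=10_000):
    """Fetch name_template.format(n) for n = start, start + 1, ...

    Stops at the first 404, as described above. `fetch` is a hypothetical
    callable returning (status_code, body); `limit` is a safety cap so a
    misbehaving server cannot keep the loop running forever.
    """
    saved = {}
    index = start
    while index < start + limit:
        name = name_template.format(index)
        status, body = fetch(name)
        if status == 404:
            break  # no more files in this category
        saved[name] = body
        index += 1
    return saved


# Usage with a fake server that only knows three PNG files.
known = {f"img{i}.png": b"data" for i in range(3)}
fake_fetch = lambda name: (200, known[name]) if name in known else (404, b"")
files = download_numbered_files(fake_fetch, "img{}.png")
```

In a real run, `fetch` would wrap an HTTP GET that carries the cookies stored from the initial slide request, and the loop would be repeated once per category with its own file-name template.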
Downloading of quizzes (with right answers):
- This was quite a pickle. Looking into the quiz-solving API calls, the first request returns a JSON containing the whole quiz content
- The second request is a PUT request with the user's right/wrong answers; in the response, the JSON body now contains a new key is_correct containing the right answer
- Wrote logic for posting the JSON body taken from the initial request, modified to the required standards; the server doesn't need options to be selected either
- The PUT request then returns the right answers; two files are made, one with no answers and one with the correct answers.
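Splitting the PUT response into the two output files could be sketched as follows. Only the is_correct key comes from the description above; the surrounding JSON layout (`questions`, `options`, `text`) and the function names are hypothetical stand-ins for whatever the API actually returns.

```python
import copy


def without_answers(quiz):
    """Return a copy of the quiz with every is_correct key removed."""
    cleaned = copy.deepcopy(quiz)
    for question in cleaned.get("questions", []):
        for option in question.get("options", []):
            option.pop("is_correct", None)
    return cleaned


def correct_answers(quiz):
    """Map each question text to the list of its correct option texts."""
    return {
        question["text"]: [
            option["text"]
            for option in question.get("options", [])
            if option.get("is_correct")
        ]
        for question in quiz.get("questions", [])
    }
```

The first result would be written to the file without answers, and the second to the file with the correct answers.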
Downloading of exercises:
- This was quite simple; an API call returns the contents in both Markdown and HTML format
- For ease of cross-platform usage, I've only stored the HTML content, with _exercise_ in the file name.
Downloading of labs:
- You lose an edge here; in most cases, the labs are stored on INE cloud.
- The script only stores the HTML content of the lab (if present)
- Resume capability for a course video
- Download subtitles for the videos (if present)
- Download all courses without any prompt (option: -a / --all)
- Downloading slides, labs, exercises, quizzes, and videos
pip3 install -r requirements.txt
sudo apt install aria2
docker build -t ine-dl .
docker run -d --name ine-dl -v {HOSTDIRECTORY}:/app/ine-dl/downloads ine-dl
Make sure you replace {HOSTDIRECTORY} in the command with the folder on your device where downloads should be saved.
- Windows 7/8/8.1 (it should work fine in WSL, but not in cmd/PS)
- Ubuntu 18.04 LTS
- Ubuntu 20.04 LTS
You can download the latest version of ine-dl by cloning the GitHub repository.
git clone https://github.com/Anon-Exploiter/ine-dl --depth 1
┌──(umar_0x01@b0x)-[~/scripts/ine-dl]
└─$ python3 ine.py
██╗███╗ ██╗███████╗ ██████╗ ██╗
██║████╗ ██║██╔════╝ ██╔══██╗██║
██║██╔██╗ ██║█████╗█████╗██║ ██║██║
██║██║╚██╗██║██╔══╝╚════╝██║ ██║██║
██║██║ ╚████║███████╗ ██████╔╝███████╗
╚═╝╚═╝ ╚═══╝╚══════╝ ╚═════╝ ╚══════╝
Usage: python3 ine.py --all
Help:
-h, --help show this help message and exit
Basic arguments (one or more):
-l LOG, --log-output LOG
Logs output of the script (if required later)
-lct, --list-categories
List all categories
-lcc, --list-courses List all courses
-lcct LCCT, --list-categories-courses LCCT
List all courses of a specific category UUID from -lct
Necessary arguments:
-p PROCESSES, --processes PROCESSES
Number of parallel processes to launch (2 if nothing specified)
-c COURSE, --course COURSE
Download course based on provided UUID from -lcc
-ct CATEGORY, --category CATEGORY
Download whole category based on provided UUID from -lct
-a, --all Download all courses of all categories
Running the Script (displays help menu with no args)
python ine-dl.py
Listing all the courses
python ine-dl.py -lcc
Listing course categories
python ine-dl.py -lct
Listing all the courses of a specific category
python ine-dl.py -lcct {category_id}
Logging the script's output into a log file
python ine-dl.py <general_args> -l logfile.log
Downloading all the INE courses your subscription has access to (with/without parallel processing)
python ine-dl.py --all
python ine-dl.py --all -p 2
Downloading a single course
python ine-dl.py -c {course_id}
Downloading all courses of specified category (with/without parallel processes)
python ine-dl.py -ct {category_id}
python ine-dl.py -ct {category_id} -p 2
- Fetch all the courses and write into a file
- Fetch & Match the subscriptions and then put stuff into the course file
- Implement downloading of video files (highest resolution and so-on)
- Implement quiz downloading
- Write the quizzes into a text file and then write their correct results (json:is_correct) into another file!
- Implement exercise downloading
- Implement iframe downloading
- Implement html, css, js, woff, img files downloading
- Implement lab downloading
- Check if description_html exists and if not, write the json object of the whole lab for user satisfaction
- Downloading the files_uuids zip/pdf files
- Write a json into the course directory containing whole course data
- Implement all argparse arguments
- At the moment, downloading learning paths and bootcamps is not supported
- You can, however, download the courses within those paths and bootcamps manually by specifying their UUID
- This can also be automated using a bash script to download a lot of UUIDs one by one or using parallel processing
- #8
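As a rough illustration of automating several course downloads one by one, the sketch below builds one `-c` invocation per UUID. It is written in Python rather than bash to match the rest of the project. The script name `ine-dl.py` follows the usage examples above, and the function names and the injectable `runner` are assumptions made for illustration.

```python
import subprocess
import sys


def build_commands(course_uuids, script="ine-dl.py"):
    """Build one `<python> ine-dl.py -c <uuid>` command per course UUID."""
    return [[sys.executable, script, "-c", uuid] for uuid in course_uuids]


def download_one_by_one(course_uuids, runner=subprocess.run):
    """Run the downloads sequentially; a failed course does not stop the rest."""
    for command in build_commands(course_uuids):
        runner(command, check=False)
```

Calling `download_one_by_one` with the UUIDs of the courses inside a learning path would download them in order.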
- Inspired as always by downloaders of r0oth3x49
- Some ideas were taken from Jayapraveen's downloader. Though his script is nice, it has a lot of bugs; I spent almost 2 holidays fixing those and then thought of writing my own.
Please use the script in accordance with the usage guidelines of INE. Do not exhaust their backend servers. Do not dump and share the courses publicly.
Please use this at your own risk. If your account is blocked because of the use of this script, I won't be responsible.