Error in Instagram scraping. #851

Closed
Thiago-Teofilo opened this issue Apr 22, 2023 · 1 comment
Labels
duplicate This issue or pull request already exists

Comments

@Thiago-Teofilo
Describe the bug

I tried to scrape 5 items from Instagram by hashtag, and the error shown in the backtrace below occurred.

How to reproduce

import snscrape.modules.instagram as sns

itens = sns.InstagramHashtagScraper('python').get_items()

for item in itens:
    print(item)
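The description mentions five items, while the snippet above iterates over the whole generator. A minimal sketch of capping iteration with `itertools.islice` follows. The `fake_items` generator is a hypothetical stand-in for `get_items()` so the example runs without network access; it is not part of snscrape.

```python
from itertools import islice


def fake_items():
    """Hypothetical stand-in for a scraper's get_items() generator."""
    n = 0
    while True:
        yield f'item-{n}'
        n += 1


# islice stops pulling from the generator after five items,
# so the underlying source is never asked for a sixth one.
first_five = list(islice(fake_items(), 5))

for item in first_five:
    print(item)
```

With the real scraper, the same `islice(..., 5)` wrapper would presumably go around the `get_items()` call. It would not avoid the error reported here, since the traceback shows the failure happening while the first page is fetched.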

Expected behaviour

The items yielded by the generator should be printed.

Screenshots and recordings

No response

Operating system

Windows 10

Python version: output of python3 --version

3.10.11

snscrape version: output of snscrape --version

0.6.2.20230320

Scraper

instagram

How are you using snscrape?

Module (import snscrape.modules.something in Python code)

Backtrace

(venv) PS H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\instagram_scraper> snscrape --version
snscrape 0.6.2.20230320
(venv) PS H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\instagram_scraper> python3 --version
Python 3.10.11
(venv) PS H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\instagram_scraper> python .\main.py
Traceback (most recent call last):
  File "H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\instagram_scraper\main.py", line 5, in <module>
    for item in itens:
  File "H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\venv\lib\site-packages\snscrape\modules\instagram.py", line 109, in get_items
    r = self._initial_page()
  File "H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\venv\lib\site-packages\snscrape\modules\instagram.py", line 77, in _initial_page
    r = self._get(self._initialUrl, headers = self._headers, responseOkCallback = self._check_initial_page_callback)
  File "H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\venv\lib\site-packages\snscrape\base.py", line 251, in _get
    return self._request('GET', *args, **kwargs)
  File "H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\venv\lib\site-packages\snscrape\base.py", line 222, in _request
    success, msg = responseOkCallback(r)
  File "H:\ESTUDOS\Projetos Dev\Python Projects\tests\TG_tests\venv\lib\site-packages\snscrape\modules\instagram.py", line 88, in _check_initial_page_callback
    jsonData = r.text.split('<script type="text/javascript">window._sharedData = ')[1].split(';</script>')[0] # May throw an IndexError if Instagram changes something again; we just let that bubble.
IndexError: list index out of range
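
The last frame of the traceback indexes `[1]` on the result of `str.split`. That raises `IndexError` whenever the marker string is absent from the response text, because `split` then returns a single-element list. The sketch below reproduces this on made-up HTML and shows one defensive alternative. `extract_shared_data` and both sample pages are hypothetical illustrations, not snscrape code and not real Instagram responses.

```python
import json

# Marker strings copied from the failing line in the traceback.
START_MARKER = '<script type="text/javascript">window._sharedData = '
END_MARKER = ';</script>'


def extract_shared_data(html):
    """Hypothetical helper: parse the _sharedData JSON, or return None if a marker is missing."""
    _, found_start, tail = html.partition(START_MARKER)
    if not found_start:
        return None
    payload, found_end, _ = tail.partition(END_MARKER)
    if not found_end:
        return None
    return json.loads(payload)


# Made-up pages for illustration only.
page_with_marker = '<html>' + START_MARKER + '{"entry_data": {}}' + END_MARKER + '</html>'
page_without_marker = '<html><body>Login</body></html>'

print(extract_shared_data(page_with_marker))     # {'entry_data': {}}
print(extract_shared_data(page_without_marker))  # None

# The expression from the traceback fails on a page without the marker.
try:
    page_without_marker.split(START_MARKER)[1]
except IndexError as exc:
    print('IndexError:', exc)  # IndexError: list index out of range
```

Returning `None` only makes the failure explicit; it does not recover any data when the marker is missing from the page.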

Log output

No response

Dump of locals

No response

Additional context

No response

Thiago-Teofilo added the bug label on Apr 22, 2023
JustAnotherArchivist added the duplicate label and removed the bug label on Apr 22, 2023
@JustAnotherArchivist
Owner

#520

JustAnotherArchivist closed this as not planned on Apr 22, 2023