Instagram Index Out of Range #914

Closed
arfathyahiya opened this issue May 24, 2023 · 1 comment
Labels
duplicate This issue or pull request already exists

Comments

@arfathyahiya

Describe the bug

Scraping with the instagram module fails with the following error:

Traceback (most recent call last):
  File "D:\Github Contributions\snscrape\test.py", line 4, in <module>
    for post in instagram.InstagramLocationScraper(locationId="110585945628334").get_items():
  File "D:\Github Contributions\snscrape\snscrape\modules\instagram.py", line 163, in get_items
    r = self._initial_page()
  File "D:\Github Contributions\snscrape\snscrape\modules\instagram.py", line 130, in _initial_page
    r = self._get(self._initialUrl, headers=self._headers, responseOkCallback=self._check_initial_page_callback)
  File "D:\Github Contributions\snscrape\snscrape\base.py", line 266, in _get
    return self._request('GET', *args, **kwargs)
  File "D:\Github Contributions\snscrape\snscrape\base.py", line 237, in _request
    success, msg = responseOkCallback(r)
  File "D:\Github Contributions\snscrape\snscrape\modules\instagram.py", line 141, in _check_initial_page_callback
    jsonData = r.text.split('<script type="text/javascript">window._sharedData = ')[1].split(';</script>')[
IndexError: list index out of range
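
For readers unfamiliar with this failure mode, here is a minimal sketch of why the last frame raises. It assumes, based only on the traceback above, that the page no longer contains the `window._sharedData` script marker. The `old_page` and `new_page` strings are made-up stand-ins, not real Instagram responses.

```python
# The separator string is copied from the traceback; the pages are hypothetical.
MARKER = '<script type="text/javascript">window._sharedData = '

old_page = MARKER + '{"entry_data": {}};</script>'      # marker present
new_page = '<html><body>no shared data here</body></html>'  # marker absent

parts_old = old_page.split(MARKER)
parts_new = new_page.split(MARKER)

print(len(parts_old))  # 2: text before the marker, text after it
print(len(parts_new))  # 1: split() returns the whole string when nothing matches

try:
    parts_new[1]  # same indexing pattern as in the traceback
except IndexError as exc:
    print(f"IndexError: {exc}")
```

When the separator is missing, `str.split` returns a one-element list, so indexing `[1]` fails regardless of which scraper triggered the request.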

How to reproduce

Any scraper from the instagram module produces the same error, for example:

from snscrape.modules import instagram


for post in instagram.InstagramLocationScraper(locationId="110585945628334").get_items():
    print(post)

Expected behaviour

The scraper returns the posts for the location.

Screenshots and recordings

No response

Operating system

Windows 10

Python version: output of python3 --version

3.9

snscrape version: output of snscrape --version

snscrape 0.6.2.20230320

Scraper

instagram-location

How are you using snscrape?

CLI (snscrape ... as a command, e.g. in a terminal)

Backtrace

No response

Log output

No response

Dump of locals

No response

Additional context

I checked the source code, and apparently this was expected to happen sooner or later. The relevant comment reads:

May throw an IndexError if Instagram changes something again; we just let that bubble.

I am currently unable to find its replacement in the page source.
I checked manually, and entry_data is an empty dict there.

Maybe _InstagramCommonScraper needs to be updated to use API calls instead of scraping the HTML source?
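
Until the scraper is updated, the failure could at least be reported more clearly. The following is a minimal sketch, not snscrape's actual code: `extract_shared_data` is a hypothetical helper, and the start and end markers are assumptions taken from the traceback above. It returns `None` when the markers or valid JSON are missing instead of letting an `IndexError` bubble up.

```python
import json
from typing import Optional

# Both markers are assumptions copied from the split() calls in the traceback.
MARKER = '<script type="text/javascript">window._sharedData = '
END = ';</script>'


def extract_shared_data(html: str) -> Optional[dict]:
    """Hypothetical helper: return the parsed _sharedData dict, or None."""
    _, found, rest = html.partition(MARKER)
    if not found:
        return None  # marker absent: the page layout has probably changed
    payload, end_found, _ = rest.partition(END)
    if not end_found:
        return None  # opening marker found, but the script tag never closes
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None  # payload is not valid JSON


# Usage with made-up pages (not real Instagram responses):
print(extract_shared_data(MARKER + '{"entry_data": {}}' + END))
print(extract_shared_data('<html></html>'))
```

A caller could then treat `None` as "page format not recognised" and log a readable message. This does not restore scraping; it only makes the failure explicit.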

@arfathyahiya arfathyahiya added the bug Something isn't working label May 24, 2023
@JustAnotherArchivist JustAnotherArchivist added duplicate This issue or pull request already exists and removed bug Something isn't working labels May 25, 2023
@JustAnotherArchivist
Owner

#520

@JustAnotherArchivist JustAnotherArchivist closed this as not planned May 25, 2023