Describe the bug
Running snscrape normally on a VK user results in an error:
$ snscrape vkontakte-user durov
2023-02-24 14:31:55.752 WARNING snscrape.modules.vkontakte Skipping post without link: '<div class="_post post page_block all own post--withPostBottomAction post--with-likes closed_comments deep_active Post--redesign" data-post-id="1_2442097" data-replies-limit="0" id="post1_2442097" onc'
2023-02-24 14:31:55.808 CRITICAL snscrape._cli Dumped stack and locals to /tmp/snscrape_locals__x7ru5_r
Traceback (most recent call last):
  File "[PATH]/venv2/bin/snscrape", line 8, in <module>
    sys.exit(main())
  File "[PATH]/venv2/lib/python3.10/site-packages/snscrape/_cli.py", line 318, in main
    for i, item in enumerate(scraper.get_items(), start = 1):
  File "[PATH]/venv2/lib/python3.10/site-packages/snscrape/modules/vkontakte.py", line 278, in get_items
    yield from _process_soup(soup)
  File "[PATH]/venv2/lib/python3.10/site-packages/snscrape/modules/vkontakte.py", line 273, in _process_soup
    postID = int(item.url.rsplit('_', 1)[1])
AttributeError: 'NoneType' object has no attribute 'url'
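The traceback suggests that the item reaching line 273 was None, presumably the post that the preceding warning reported as skipped. As a minimal sketch of a defensive guard, the snippet below uses hypothetical stand-ins (`Post`, `post_ids`) rather than snscrape's actual classes, and assumes that a skipped post appears as None in the item stream:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Post:
    """Hypothetical stand-in for a scraped post item; not snscrape's class."""
    url: str


def post_ids(items: Iterable[Optional[Post]]) -> Iterator[int]:
    """Yield numeric post IDs, skipping entries that failed to parse."""
    for item in items:
        if item is None:
            # Assumption: a post skipped by the parser shows up here as None.
            continue
        # Same expression as in the traceback: the ID follows the last '_'.
        yield int(item.url.rsplit('_', 1)[1])


items = [
    Post('https://vk.com/wall1_2442097'),
    None,  # simulates the "Skipping post without link" case
    Post('https://vk.com/wall1_2431591'),
]
print(list(post_ids(items)))
```

A guard like this would only avoid the crash. The skipped posts would still be missing from the output until the link lookup itself is fixed.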
In vkontakte.py, the scraper looks for the post_link class, but the page now uses PostHeaderSubtitle__link instead.
For dates, instead of this: post.find('div', class_ = 'post_date').find('span', class_ = 'rel_date')
we found this to work: postLink.find('time', class_ = 'PostHeaderSubtitle__item')
After making those replacements, the scraper (mostly) works again.
We're not sure how many other replacements are needed.
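One way to tolerate both the old and the new markup is to accept either class name when looking for the post link. The sketch below illustrates the idea with Python's standard-library `html.parser`, so it runs without dependencies; snscrape itself uses BeautifulSoup, and the names `PostLinkFinder` and `find_post_links` as well as the HTML fragments are made up for illustration:

```python
from html.parser import HTMLParser

# Old and new class names for the post link, as reported above.
LINK_CLASSES = {'post_link', 'PostHeaderSubtitle__link'}


class PostLinkFinder(HTMLParser):
    """Collect hrefs of <a> tags that carry either post-link class."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # An attribute without a value is reported as None, hence the 'or'.
        classes = set((attr.get('class') or '').split())
        if tag == 'a' and classes & LINK_CLASSES and attr.get('href'):
            self.links.append(attr['href'])


def find_post_links(html):
    """Return the hrefs of all post links found in an HTML fragment."""
    finder = PostLinkFinder()
    finder.feed(html)
    return finder.links


# Made-up fragments for illustration; the real VK markup is more complex.
old_html = ('<div class="post_date"><a class="post_link" href="/wall1_1">'
            '<span class="rel_date">1 Jan</span></a></div>')
new_html = ('<a class="PostHeaderSubtitle__link" href="/wall1_2">'
            '<time class="PostHeaderSubtitle__item">2 Jan</time></a>')

print(find_post_links(old_html))
print(find_post_links(new_html))
```

Matching on a set of class names keeps the old selector working in case some pages still serve the previous layout, at the cost of having to extend the set whenever the markup changes again.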
How to reproduce
Run the command: snscrape vkontakte-user durov
Expected behaviour
After making the replacements described above, we start getting results like so:
$ snscrape vkontakte-user durov
https://vk.com/wall1_2442097
https://vk.com/wall1_2431591
https://vk.com/wall1_2422169
https://vk.com/wall1_2418560
https://vk.com/wall1_2412029
https://vk.com/wall1_2407925
https://vk.com/wall1_2405336
https://vk.com/wall1_2401719
https://vk.com/wall1_2401089
...
Screenshots and recordings
No response
Operating system
Ubuntu 22.04
Python version: output of python3 --version
Python 3.10.6
snscrape version: output of snscrape --version
snscrape 0.5.0.20230113 and snscrape 0.5.0.20230114.dev31+gf329b69
Scraper
vkontakte-user
Backtrace
No response
Dump of locals
No response
How are you using snscrape?
CLI (snscrape ... as a command, e.g. in a terminal)
Additional context
No response