Content of tweet includes non written mentions #992
Comments
These mentions are technically part of the tweet text. This is exactly what Twitter returns:
There is however also a |
Thanks for pointing it out @JustAnotherArchivist 🙏🏻 I did not realize that all accounts mentioned in a tweet are internally included in its replies (since you get notified about replies, it makes sense 😄). This might be a good opportunity for me to ask as well about the differences of |
Forget that
|
By link replacement, you mean the https://t.co ones instead of the originals, right? I’m using Puppeteer to navigate those and get the actual URLs. So as far as I understood, I should be using |
Describe the bug
When scraping the following tweet, the returned content starts with
"@GitHubCopilot @tabnine @Replit @vercel Have you tried them ?"
instead of just "Have you tried them ?" as expected.
How to reproduce
Use the TwitterTweetScraper and pass the tweet id 1674020720458776576.
Expected behaviour
There should be no non-written mentions at the beginning of the content.
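As a stopgap on the caller's side, the leading mentions could be stripped from the returned content after scraping. The snippet below is only a rough sketch of that idea: `strip_leading_mentions` is a hypothetical helper, not part of snscrape, and it assumes handles are 1 to 15 word characters. It cannot tell auto-added mentions from ones the author actually typed at the start of the tweet, so it would remove both.

```python
import re

# Hypothetical helper, NOT part of snscrape.
# Assumption: a handle is "@" followed by 1-15 word characters.
# Matches one or more handles at the very start, each followed by whitespace.
_LEADING_MENTIONS = re.compile(r"^(?:@\w{1,15}\s+)+")


def strip_leading_mentions(text: str) -> str:
    """Return text with any run of leading @mentions removed."""
    return _LEADING_MENTIONS.sub("", text)


content = "@GitHubCopilot @tabnine @Replit @vercel Have you tried them ?"
print(strip_leading_mentions(content))
```

Mentions that appear later in the text are left untouched, since the pattern is anchored at the start of the string.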
Screenshots and recordings
No response
Operating system
macOS 13.4.1
Python version: output of
python3 --version
3.9
snscrape version: output of
snscrape --version
0.7.0.20230622
Scraper
TwitterTweetScraper
How are you using snscrape?
Module (import snscrape.modules.something in Python code)
Backtrace
No response
Log output
No response
Dump of locals
No response
Additional context
No response