No feed returned when running recommendation twice #6

Open
danthe1st opened this issue Apr 16, 2021 · 3 comments

@danthe1st
Contributor

danthe1st commented Apr 16, 2021

Bug Description

When running python3 answerable.py recommend the first time, questions are recommended as planned.

However, when running it multiple times (with the same user id), the following error occurs:

No feed returned
Full log in answerable.log

Steps to reproduce

# setup
git clone https://github.com/MiguelMJ/Answerable /tmp/poc
cd /tmp/poc
pip install -r requirements.txt
# run first time (successful)
python3 answerable.py recommend -u 1
# run second time (No feed returned)
python3 answerable.py recommend -u 1

Expected results

It should print actual questions, cached or not

Actual results

No feed returned
Full log in answerable.log

Details

This is likely a caching issue as Answerable runs successfully after deleting .cache/spider.rss.

Instead of printing cached answers, it shows no answers at all.

It might not happen the second time, but running it multiple times should do it.

Logs

Log when running it the first time (successfully):
[Answerable] Log of 2021-04-16 08:58:52.651543
[Answerable] No tags file provided.
[Fetcher] Fetching user information
[Cache]   Miss fetcher/1.json
[Spider] Checking cache before petition https://api.stackexchange.com/2.2/users/1/answers?page=1&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Cache]   Miss spider/https:--api.stackexchange.com-2.2-users-1-answers?page=1&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Spider] Waiting to ask for https://api.stackexchange.com/2.2/users/1/answers?page=1&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Spider]   in 0.50 seconds
[Spider] Requesting
[Cache]   Cache updated: spider/https:--api.stackexchange.com-2.2-users-1-answers?page=1&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Spider] Checking cache before petition https://api.stackexchange.com/2.2/users/1/answers?page=2&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Cache]   Miss spider/https:--api.stackexchange.com-2.2-users-1-answers?page=2&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Spider] Waiting to ask for https://api.stackexchange.com/2.2/users/1/answers?page=2&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Spider]   in 0.50 seconds
[Spider] Requesting
[Cache]   Cache updated: spider/https:--api.stackexchange.com-2.2-users-1-answers?page=2&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!.Fjr43gf6UvsWf.-.z(SMRV3sqodT
[Fetcher] 127 answers, 2 batches
[Fetcher] batch 1
[Spider] Waiting to ask for https://api.stackexchange.com/2.2/questions/11752544;16908313;3417371;3009380;990477;2246901;4291018;3588529;444344;337985;294622;2787177;2696480;2482174;2233695;297005;2049534;2055205;2047992;2032652;1342898;218680;1741193;275878;1379156;1545256;1536120;642954;477913;1058783;1149270;1102118;786638;1041623;397250;7707;146297;291102;755332;713247;704464;700583;696331;665122;660319;652665;646338;98606;629573;438923;606258;606018;604277;571978;551665;540311;485083;499817;481879;463258;414931;399770;391332;388595;386380;29814;360028;342528;341397;336605;330032;330053;295515;287141;283965;267369;255855;250874;234059;232732;230517;205277;206719;205923;197042;42937;191093;189493;177597;177506;169828;165887;136443;134235;108604;100519;62151;54475;50532;48475?page=1&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!5RCLVFC_3nVp6Kjoti6BKirZj
[Spider]   in 0.50 seconds
[Spider] Requesting
[Fetcher] batch 2
[Spider] Waiting to ask for https://api.stackexchange.com/2.2/questions/37809;32149;32010;30026;26021;25259;25277;16233;6173;13470;10999;10616;10604;10610;9673;9501;9303;9304;8681;8472;8348;8106;7089;1010;944;944;11?page=1&pagesize=100&order=desc&sort=creation&site=stackoverflow&filter=!5RCLVFC_3nVp6Kjoti6BKirZj
[Spider]   in 0.50 seconds
[Spider] Requesting
[Cache]   Cache updated: fetcher/1.json
[Fetcher] Fetching question feed
[Spider] Requesting feed https://stackoverflow.com/feeds/
[Cache]   Miss spider.rss/https:__stackoverflow.com_feeds_
[Spider] with etag: None
[Spider] with modified: None
[Cache]   Cache updated: spider.rss/https:__stackoverflow.com_feeds_
[Spider] Stored new etag: None
[Spider] Stored new modified: Fri, 16 Apr 2021 06:57:27 GMT
[Fetcher] Number of entries in feed: 30
[Answerable] Discarded: 0 ignored | 0 closed | 1 duplicate
Log when running again afterwards:
[Answerable] Log of 2021-04-16 08:59:52.197975
[Answerable] No tags file provided.
[Fetcher] Fetching user information
[Cache]   Hit fetcher/1.json
[Cache]   Time passed since last fetch: 0:00:54.541158
[Cache]   Recent enough
[Fetcher] Fetching question feed
[Spider] Requesting feed https://stackoverflow.com/feeds/
[Cache]   Hit spider.rss/https:__stackoverflow.com_feeds_
[Cache]   Time passed since last fetch: 0:00:13.732423
[Cache]   Recent enough
[Spider] with etag: None
[Spider] with modified: Fri, 16 Apr 2021 06:59:38 GMT
[Fetcher] Feed not modified since last retrieval (status 304)

System details

WSL 2 (Ubuntu) on Windows 10
dan@Daniellaptop-wsl
--------------------
OS: Ubuntu 20.10 on Windows 10 x86_64
Kernel: 4.19.72-microsoft-standard+
Uptime: 35 mins
Packages: 3164 (dpkg), 10 (snap)
Shell: bash 5.0.17
Terminal: /dev/pts/0
CPU: Intel i7-7500U (4) @ 2.904GHz
Memory: 363MiB / 6284MiB
Microsoft Windows [Version 10.0.19041.388]

Workaround

When python3 answerable.py recommend fails with the message No feed returned, run the following command and try again:

rm -rf .cache/spider.rss
@MiguelMJ
Owner

MiguelMJ commented Apr 16, 2021

Thanks for asking!
Answerable tries not to make more requests than necessary, so it won't repeat requests for a certain time after the last recommendation (the time passed since last fetch in the log is relatively low). To ignore this restriction, you can use the -F option with the recommend command in the second run:

python3 answerable.py recommend -F -u 1

You can find all the options here.
From the link above:

The timestamp of the last feed retrieved is stored in data/spider/feed, and it's used in order to avoid redundant requests. For this reason, the second from two consecutive calls of this command will display nothing. In other words, each recommendation is unique and unrepeatable.
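To illustrate the behavior described in that passage, here is a minimal sketch of a conditional fetch. It is not Answerable's actual code: `fake_feed_server`, `fetch_feed` and the in-memory `cache` dict are hypothetical names, the server is simulated so the example runs offline, and the timestamp is only an illustrative value taken from the log above.

```python
from typing import Optional

# Illustrative values only; a real feed would come from the network.
FEED_LAST_MODIFIED = "Fri, 16 Apr 2021 06:57:27 GMT"
FEED_ENTRIES = ["question A", "question B"]


def fake_feed_server(if_modified_since: Optional[str]):
    """Simulated server (hypothetical): returns (status, entries, last_modified)."""
    if if_modified_since == FEED_LAST_MODIFIED:
        # Nothing changed since the client's copy: a 304 response has no body.
        return 304, None, FEED_LAST_MODIFIED
    return 200, list(FEED_ENTRIES), FEED_LAST_MODIFIED


def fetch_feed(cache: dict):
    """Conditional fetch that returns None when the feed is unchanged."""
    status, entries, modified = fake_feed_server(cache.get("modified"))
    if status == 304:
        # No body to parse, which matches the "No feed returned" situation.
        return None
    cache["modified"] = modified  # remember the validator for the next call
    return entries


cache = {}
first = fetch_feed(cache)   # first call: full feed
second = fetch_feed(cache)  # second call: server answers 304
print(first)
print(second)
```

Under these assumptions the first call returns the entries and the second returns nothing, which is why two consecutive runs do not show the same recommendations.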

Edit: Note that even with the -F option, the second run will be faster, because it forces the reload of the feed but not the API calls.
Edit 2: Feel free to ask any other questions.

@danthe1st
Contributor Author

danthe1st commented Apr 17, 2021

Wouldn't it make more sense to just return the last retrieved questions if the same request is triggered multiple times? Also, what is the cache timeout for this?

@MiguelMJ
Owner

MiguelMJ commented Apr 17, 2021

My intention is to do it in a future version, because as you say, it actually makes more sense, but right now I don't have much time for that. Given how the RSS feed and API return their data and how it is cached, it could be somewhat messy and reduce the quality of the recommendations if not managed carefully.
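One possible shape for that future behavior, as a sketch only: keep the entries from the last successful retrieval next to the stored timestamp and return them when the server answers 304. Everything here (`make_server`, `fetch_feed_with_fallback`, the `cache` layout, the `"stamp-1"` value) is hypothetical and not part of Answerable, and it ignores the quality concern mentioned above, since cached entries may already be stale or answered.

```python
from typing import Callable, Optional, Tuple

Response = Tuple[int, Optional[list], str]


def make_server(entries: list, last_modified: str) -> Callable[[Optional[str]], Response]:
    """Build a simulated feed server (hypothetical) so the sketch runs offline."""
    def server(if_modified_since: Optional[str]) -> Response:
        if if_modified_since == last_modified:
            return 304, None, last_modified  # unchanged: no body
        return 200, list(entries), last_modified
    return server


def fetch_feed_with_fallback(server, cache: dict) -> list:
    """Return fresh entries, or the last cached ones when the feed is unchanged."""
    status, entries, modified = server(cache.get("modified"))
    if status == 304:
        # Fall back to whatever was stored by the last successful fetch.
        return cache.get("entries", [])
    cache["modified"] = modified
    cache["entries"] = entries
    return entries


server = make_server(["question A", "question B"], "stamp-1")
cache = {}
first = fetch_feed_with_fallback(server, cache)
second = fetch_feed_with_fallback(server, cache)
print(first == second)
```

With this layout a repeated run shows the same questions again instead of an empty result; deciding when cached entries are too old to show would still need a separate rule.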

For this version, an easy workaround could be to use the following bash script:

date >> recommendations.txt 
python3 answerable.py recommend -u 1 >> recommendations.txt
cat recommendations.txt

And just remove recommendations.txt when you're done with them.


Until I get my hands back on Answerable, feel free to suggest anything; I appreciate any feedback. I'm reopening the issue, since I think it may be worth taking into account for the next version.

Edit: Also, contributions are open ;)

@MiguelMJ MiguelMJ reopened this Apr 17, 2021