Fix the issue with multi-line comments #8
base: master
@@ -0,0 +1,128 @@
{
This file is redundant.
@@ -9,7 +9,9 @@ class CommentSpider(scrapy.Spider):
products_count = 0

start_urls = [
# 'https://www.digikala.com/main/apparel/',
This comment is redundant.
'http://digikala.com/',
# 'https://digikala.com/main/book-and-media/'
This comment is redundant.
@@ -64,8 +66,16 @@ def parse_comments(self, response):
pos = [x.strip() for x in pos_raw]
neg_raw = comment_selector.css('.c-comments__evaluation-negative li::text').extract()
neg = [x.strip() for x in neg_raw]
txt = comment_selector.css('p::text').extract_first()

##########MODIFIED BY ARMAN############
You should delete this comment. People can use git blame or their editor's annotation tools to see your contribution. Furthermore, your contributions will always be present in the repo's history.
txt = txt.strip() if txt is not None else None
########################################
It's redundant as well.
txt_list = comment_selector.css('p::text').extract()
txt = ""
for txt_itr in txt_list:
    txt += txt_itr.strip()+"\n"
I think it makes sense to concatenate those lines with a single space instead of "\n".
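A minimal sketch of that suggestion, assuming `txt_list` is the list of strings returned by `comment_selector.css('p::text').extract()`. The helper name `join_comment_lines` and the sample data are hypothetical and not part of this PR; dropping fragments that are empty after stripping is also an assumption about the desired behavior.

```python
def join_comment_lines(txt_list):
    """Join text fragments into one string separated by single spaces.

    Hypothetical helper for illustration; not part of the spider in this PR.
    """
    # Strip each fragment, then drop the ones that are empty after stripping
    # so that blank lines do not produce doubled spaces.
    parts = [part.strip() for part in txt_list]
    return " ".join(part for part in parts if part)


# Stand-in for comment_selector.css('p::text').extract()
sample = ["  first line ", "\n", "second line\n"]
print(join_comment_lines(sample))
```

Compared with appending `"\n"` inside a loop, `str.join` leaves no trailing separator and avoids repeated string concatenation. Note that an empty `txt_list` yields `""` rather than `None`.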
This pull request fixes the handling of multi-line comments. The original code extracted only the first line of each comment; with this change, multi-line comments are extracted in full.