This repository has been archived by the owner on Feb 7, 2024. It is now read-only.

Fix the issue with multi-line comments #8

Open
wants to merge 2 commits into
base: master
from

Conversation

ArmanKabiri

This pull request fixes how the code handles multi-line comments. The original code extracts only the first line of each comment; with this change, it extracts multi-line comments in full.

@@ -0,0 +1,128 @@
{
Owner

This file is redundant.

@@ -9,7 +9,9 @@ class CommentSpider(scrapy.Spider):
products_count = 0

start_urls = [
# 'https://www.digikala.com/main/apparel/',
Owner

This comment is redundant.

'http://digikala.com/',
# 'https://digikala.com/main/book-and-media/'
Owner

This comment is redundant.

@@ -64,8 +66,16 @@ def parse_comments(self, response):
pos = [x.strip() for x in pos_raw]
neg_raw = comment_selector.css('.c-comments__evaluation-negative li::text').extract()
neg = [x.strip() for x in neg_raw]
txt = comment_selector.css('p::text').extract_first()

##########MODIFIED BY ARMAN############
Owner

You should delete this comment. People can use git blame, or their editor's annotation tools, to see your contribution. Furthermore, your contributions will always be visible in the repo's history.

txt = txt.strip() if txt is not None else None
########################################
Owner

It's redundant as well.

txt_list = comment_selector.css('p::text').extract()
txt = ""
for txt_itr in txt_list:
txt += txt_itr.strip()+"\n"
Owner

I think it makes sense to join those lines with a single space instead of "\n".
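
One way this suggestion might look, as a rough sketch rather than the code the PR should adopt: the helper name join_comment_lines and the sample fragments are hypothetical, and a plain list of strings stands in for the result of comment_selector.css('p::text').extract(). Skipping empty fragments and returning an empty string when nothing was extracted are assumptions too; the original code set txt to None in that case.

```python
def join_comment_lines(fragments, separator=" "):
    """Join the text fragments of one comment into a single string.

    `fragments` stands in for the list returned by
    comment_selector.css('p::text').extract(); the helper itself is
    hypothetical and not part of the spider.
    """
    # Strip surrounding whitespace from every fragment.
    cleaned = [fragment.strip() for fragment in fragments]
    # Drop fragments that were whitespace only, then join with one separator.
    return separator.join(part for part in cleaned if part)


# Sample fragments, made up for illustration.
txt_list = ["  first line ", "\n", "second line  "]
txt = join_comment_lines(txt_list)
print(txt)
```

Compared with appending strip() + "\n" in a loop, str.join avoids the trailing separator and keeps each comment on a single line.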
