Add license filter to discover-repos.rb #284

Merged
merged 5 commits into from
Apr 21, 2024
Changes from 4 commits
38 changes: 30 additions & 8 deletions steps/discover-repos.rb
@@ -51,6 +51,11 @@
 raise 'Can only retrieve up to 1000 repos' if opts[:total] > max

 size = [opts[:page_size], opts[:total]].min
+licenses = [
+  'mit',
+  'apache-2.0',
+  '0bsd'
+]

 github = Octokit::Client.new
 unless opts[:token].empty?
@@ -69,17 +74,34 @@
   'NOT',
   'android'
 ].join(' ')
-loop do
-  if page * size > max
-    puts "Can't go to page ##{page}, since it will be over #{max}"
-    break
+
+def mock_array(size, licenses)
+  Array.new(size) do
+    {
+      full_name: "foo/#{Random.hex(5)}",
+      created_at: Time.now,
+      license: { key: licenses.sample(1)[0] }
+    }
   end
-  json = if opts[:dry]
-    { items: page > 100 ? [] : Array.new(size) { { full_name: "foo/#{Random.hex(5)}", created_at: Time.now } } }
+end
+
+def mock_reps(page, size, licenses)
+  {
+    items: if page > 100 then []
+    else
+      mock_array(size, licenses)
+    end
+  }
+end
+
+loop do
+  break if page * size > max
+  json = if opts[:dry] then mock_reps(page, size, licenses)
   else
     github.search_repositories(query, per_page: size, page: page)
   end
   json[:items].each do |i|
+    next if i[:license].nil? || !licenses.include?(i[:license][:key])
Owner

@Orillio I suggest printing a log line when this happens.
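
One way to act on this suggestion, as a minimal sketch: the helper names `allowed_license?` and `filter_by_license` and the sample `items` array are hypothetical, invented for illustration, and are not code from this PR. The sketch prints one line for each repository that the license filter skips.

```ruby
# Hypothetical helper (not in the PR): true when the item's license key is allowed.
def allowed_license?(item, licenses)
  !item[:license].nil? && licenses.include?(item[:license][:key])
end

# Returns the accepted items and prints one log line per skipped repository.
def filter_by_license(items, licenses)
  items.select do |i|
    ok = allowed_license?(i, licenses)
    key = i.dig(:license, :key) # nil when the repository has no license at all
    puts "Skipping #{i[:full_name].inspect}: license #{key.inspect} is not allowed" unless ok
    ok
  end
end

# Sample data standing in for json[:items]; the shape is an assumption.
licenses = ['mit', 'apache-2.0', '0bsd']
items = [
  { full_name: 'foo/a', license: { key: 'mit' } },
  { full_name: 'foo/b', license: { key: 'gpl-3.0' } },
  { full_name: 'foo/c', license: nil }
]
accepted = filter_by_license(items, licenses)
```

Moving the check into a small predicate may also help with block-length limits, since the loop body then needs only one extra line for the log message.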

Contributor Author

Orillio, Apr 21, 2024

@yegor256 I wanted to add this logic, but the loop block has already reached its maximum length of 30 lines.

I could have moved this chunk of code into a separate function, but that won't work either, because the CI linter allows functions no longer than 10 lines, which is highly unreasonable in my opinion.

found[i[:full_name]] = {
  full_name: i[:full_name],
  default_branch: i[:default_branch],
  stars: i[:stargazers_count],
  forks: i[:forks_count],
  created_at: i[:created_at].iso8601,
  size: i[:size],
  open_issues_count: i[:open_issues_count],
  description: i[:description],
  topics: i[:topics]
}

At this point, I don't know how to add new lines here without fully refactoring the code.
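
The thread does not show how the limit was eventually worked around. One hypothetical way to fit the hash into a helper of fewer than 10 lines is to copy the unchanged keys with `Hash#slice` and merge in only the renamed ones. The names `KEPT_KEYS` and `repo_record` are made up for this sketch, and it assumes `item` is a plain Hash (an Octokit result may need converting to a Hash first).

```ruby
require 'time' # provides Time#iso8601 on older Rubies

# Keys copied unchanged from the API item (same names as in the hash above).
KEPT_KEYS = %i[full_name default_branch size open_issues_count description topics].freeze

# Hypothetical helper (not in the PR): builds the record stored in `found`.
# Assumes `item` is a plain Hash.
def repo_record(item)
  item.slice(*KEPT_KEYS).merge(
    stars: item[:stargazers_count],
    forks: item[:forks_count],
    created_at: item[:created_at].iso8601
  )
end

# Sample item; all values are invented for illustration.
item = {
  full_name: 'foo/bar', default_branch: 'master', stargazers_count: 7,
  forks_count: 2, created_at: Time.utc(2024, 4, 21), size: 10,
  open_issues_count: 1, description: 'demo', topics: ['ruby'],
  license: { key: 'mit' }
}
record = repo_record(item)
```

With such a helper the loop body would shrink to a single assignment, `found[i[:full_name]] = repo_record(i)`, which leaves room for a log line. The key order of the resulting hash differs from the original literal.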

Contributor Author

@yegor256 I somehow managed to overcome the issue. Please check the commit.

     found[i[:full_name]] = {
       full_name: i[:full_name],
       default_branch: i[:default_branch],
@@ -92,9 +114,9 @@
       topics: i[:topics]
     }
     puts "Found #{i[:full_name].inspect} GitHub repo ##{found.count} \
-(#{i[:forks_count]} forks, #{i[:stargazers_count]} stars)"
+(#{i[:forks_count]} forks, #{i[:stargazers_count]} stars) with license: #{i[:license][:name]}"
Owner

@Orillio why do we need their "names"? I suggest relying only on the :key and ignoring the name.
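
A minimal sketch of what the message could look like with the key only. The method name `found_message` and the sample item are hypothetical and are not part of the PR.

```ruby
# Hypothetical helper (not in the PR): formats the log line using the license
# key, the same field the filter checks, instead of the human-readable name.
def found_message(item, count)
  "Found #{item[:full_name].inspect} GitHub repo ##{count} " \
    "(#{item[:forks_count]} forks, #{item[:stargazers_count]} stars) " \
    "with license: #{item[:license][:key]}"
end

# Sample item; values are invented for illustration.
item = { full_name: 'foo/bar', forks_count: 2, stargazers_count: 7, license: { key: 'mit' } }
message = found_message(item, 1)
puts message
```

Using the key keeps the log consistent with the `licenses` list, so a reader can match each printed value against the allowed set directly.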

   end
-  puts "Found #{json[:items].count} repositories in page ##{page}"
+  puts "Found #{found.count} repositories in page ##{page}"
Owner

@Orillio now, we will print:

10 repositories in page #1
20 repositories in page #2
30 repositories in page #3
40 repositories in page #4
...

which is obviously wrong.
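
The problem is that `found.count` is a running total across pages, not a per-page figure. One possible fix, sketched under assumptions, is to count how many repositories each page actually added. The helper name `add_page` and the sample `pages` data are hypothetical and do not come from the PR.

```ruby
# Hypothetical helper (not in the PR): stores the accepted items of one page
# in `found` and returns how many repositories this page added.
def add_page(found, items, licenses)
  before = found.count
  items.each do |i|
    next if i[:license].nil? || !licenses.include?(i[:license][:key])
    found[i[:full_name]] = i
  end
  found.count - before
end

# Two sample pages standing in for successive search results.
licenses = ['mit']
found = {}
pages = [
  [{ full_name: 'foo/a', license: { key: 'mit' } }, { full_name: 'foo/b', license: nil }],
  [{ full_name: 'foo/c', license: { key: 'mit' } }]
]
added = pages.each_with_index.map do |items, idx|
  n = add_page(found, items, licenses)
  puts "Found #{n} of #{items.count} repositories in page ##{idx + 1} (#{found.count} in total)"
  n
end
```

Printing both the per-page number and the running total keeps the two figures from being confused in the log.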

   break if found.count >= opts[:total]
   puts "Let's sleep for #{opts[:pause]} seconds to cool off GitHub API \
 (already found #{found.count} repos, need #{opts[:total]})..."