Stop using external parser services #609

Open
wants to merge 7 commits into
base: main
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -5,7 +5,7 @@ orbs:
executors:
bibliothecary:
docker:
- - image: cimg/ruby:3.0.7
+ - image: cimg/ruby:3.2.4
working_directory: ~/bibliothecary


2 changes: 1 addition & 1 deletion .ruby-version
@@ -1 +1 @@
- 3.0.7
+ 3.2.4
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Removed

## [12.0.0] - 2024-11-2?

### Removed

- This is a MAJOR release because it removes support for hackage, carthage, hex, clojars, and swiftpm
from Bibliothecary. Bibliothecary no longer makes any network calls, and reimplementing
parsing for those file types natively is non-trivial. Patches welcome :-)

### Changed

- Rewrote the conda and yarn parsers to run in-process instead of calling out over the network

## [11.0.0] - 2024-11-22

### Changed
1 change: 0 additions & 1 deletion bibliothecary.gemspec
@@ -31,7 +31,6 @@ Gem::Specification.new do |spec|
spec.add_development_dependency "rake", "~> 12.0"
spec.add_development_dependency "rspec", "~> 3.0"
spec.add_development_dependency "webmock"
spec.add_development_dependency "vcr"
spec.add_development_dependency "rubocop"
spec.add_development_dependency "rubocop-rails"
end
3 changes: 0 additions & 3 deletions lib/bibliothecary/configuration.rb
@@ -5,7 +5,6 @@ class Configuration
attr_accessor :carthage_parser_host
attr_accessor :clojars_parser_host
attr_accessor :mix_parser_host
attr_accessor :yarn_parser_host
attr_accessor :conda_parser_host
attr_accessor :swift_parser_host
attr_accessor :cabal_parser_host
@@ -16,8 +15,6 @@ def initialize
@carthage_parser_host = "https://carthage.libraries.io"
@clojars_parser_host = "https://clojars.libraries.io"
@mix_parser_host = "https://mix.libraries.io"
@yarn_parser_host = "https://yarn-parser.libraries.io"
@conda_parser_host = "https://conda-parser.libraries.io"
@swift_parser_host = "http://swift.libraries.io"
@cabal_parser_host = "http://cabal.libraries.io"
end
52 changes: 0 additions & 52 deletions lib/bibliothecary/parsers/carthage.rb

This file was deleted.

38 changes: 0 additions & 38 deletions lib/bibliothecary/parsers/clojars.rb

This file was deleted.

85 changes: 53 additions & 32 deletions lib/bibliothecary/parsers/conda.rb
@@ -1,4 +1,4 @@
require "json"
require "yaml"

module Bibliothecary
module Parsers
@@ -15,14 +15,6 @@ def self.mapping
parser: :parse_conda,
kind: "manifest",
},
match_filename("environment.yml.lock") => {
parser: :parse_conda_lockfile,
kind: "lockfile",
},
match_filename("environment.yaml.lock") => {
parser: :parse_conda_lockfile,
kind: "lockfile",
},
}
end

@@ -31,34 +23,63 @@ def self.mapping
add_multi_parser(Bibliothecary::MultiParsers::Spdx)

def self.parse_conda(file_contents, options: {}) # rubocop:disable Lint/UnusedMethodArgument
parse_conda_with_kind(file_contents, "manifest")
end
manifest = YAML.load(file_contents)
deps = manifest.is_a?(Hash) ? Array(manifest["dependencies"]) : [] # tolerate a missing "dependencies" key
deps.map do |dep|
next unless dep.is_a? String # only deal with strings to skip parsing pip stuff

def self.parse_conda_lockfile(file_contents, options: {}) # rubocop:disable Lint/UnusedMethodArgument
parse_conda_with_kind(file_contents, "lockfile")
parsed = parse_name_requirement_from_matchspec(dep)
Dependency.new(**parsed.merge(type: "runtime")) if parsed # nil when the matchspec was skipped
end.compact
end

def self.parse_conda_with_kind(info, kind)
dependencies = call_conda_parser_web(info, kind)[kind.to_sym]
dependencies.map { |dep_kv| Dependency.new(**dep_kv.merge(type: "runtime")) }
end
def self.parse_name_requirement_from_matchspec(ms)
Review comment (Contributor):
if a regex is two problems there are a lot of problems here 😂

# simplified version of the implementation in conda to handle what we care about
# https://github.com/conda/conda/blob/main/conda/models/match_spec.py#L598
# (channel(/subdir):(namespace):)name(version(build))[key1=value1,key2=value2]
return if ms.end_with?("@")

private_class_method def self.call_conda_parser_web(file_contents, kind)
host = Bibliothecary.configuration.conda_parser_host
response = Typhoeus.post(
"#{host}/parse",
headers: {
ContentType: "multipart/form-data",
},
body: {
file: file_contents,
# Unfortunately we do not get the filename in the mapping parsers, so hardcoding the file name depending on the kind
filename: kind == "manifest" ? "environment.yml" : "environment.yml.lock",
}
)
raise Bibliothecary::RemoteParsingError.new("Http Error #{response.response_code} when contacting: #{host}/parse", response.response_code) unless response.success?
# strip off comments and optional features
ms = ms.split(/#/, 2).first
ms = ms.split(/ if /, 2).first

# strip off brackets (lazy match, otherwise the greedy group swallows the brackets too)
ms = ms.match(/^(.*?)(?:\[(.*)\])?$/)[1]

JSON.parse(response.body, symbolize_names: true)
# strip off any parens (lazy match, otherwise the greedy group swallows the parens too)
ms = ms.match(/^(.*?)(?:(\(.*\)))?$/)[1]

# deal with channel and namespace: keep only the text after the last ":"
split = ms.rpartition(":")
ms = split.last

# split the name from the version/build combo
matches = ms.match(/([^ =<>!~]+)?([><!=~ ].+)?/)
name = matches[1]
version_build = matches[2]

version = nil
if matches && matches[2]
version_build = matches[2]
# and now deal with getting the version from version/build
matches = version_build.match(/((?:.+?)[^><!,|]?)(?:(?<![=!|,<>~])(?:[ =])([^-=,|<>~]+?))?$/)
version = if matches
matches[1].strip
else
version_build.strip
end
end
# if it's an exact requirement, lose the =
if version&.start_with?("==")
version = version[2..]
elsif version&.start_with?("=")
version = version[1..]
end

return {
name: name,
requirement: version || "", # NOTE: this ignores build info
}
end
end
end
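
The new `parse_conda` treats only string entries under `dependencies` as conda specs, so nested mappings such as a `pip:` block are skipped. A minimal sketch of just that filtering step, using only Ruby's standard `yaml` library. The sample environment file and the `conda_spec_strings` helper are illustrative assumptions and are not part of Bibliothecary:

```ruby
require "yaml"

# Hypothetical sample environment.yml, for illustration only.
ENVIRONMENT_YML = <<~ENV_FILE
  name: example
  dependencies:
    - numpy=1.21
    - python>=3.8
    - pip:
        - requests==2.31.0
ENV_FILE

# Hypothetical helper: returns only the plain string entries under "dependencies".
# Nested mappings (such as the pip block) are not conda match specs, so they are dropped.
def conda_spec_strings(file_contents)
  manifest = YAML.safe_load(file_contents)
  deps = manifest.is_a?(Hash) ? Array(manifest["dependencies"]) : []
  deps.select { |dep| dep.is_a?(String) }
end

puts conda_spec_strings(ENVIRONMENT_YML).inspect
```

Each string that survives the filter would then go through the matchspec parsing shown in the diff. A file with no `dependencies` key simply yields an empty list in this sketch.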
2 changes: 1 addition & 1 deletion lib/bibliothecary/parsers/go.rb
@@ -105,7 +105,7 @@ def self.parse_glide_yaml(file_contents, options: {}) # rubocop:disable Lint/Unu
end

def self.parse_glide_lockfile(file_contents, options: {}) # rubocop:disable Lint/UnusedMethodArgument
- manifest = YAML.load file_contents
+ manifest = YAML.load(file_contents, permitted_classes: [Time])
map_dependencies(manifest, "imports", "name", "version", "runtime")
end
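
On Ruby 3.1 and later (Psych 4), `YAML.load` is safe by default and refuses to instantiate classes such as `Time`. That is presumably why the Ruby bump in this PR comes with `permitted_classes: [Time]` for glide lockfiles. The sketch below shows the behaviour with `YAML.safe_load`, which applies the same allow-list rules. The lockfile fragment is a made-up example, not taken from the repository's fixtures:

```ruby
require "yaml"

# Hypothetical lockfile fragment; YAML resolves the `updated` value to a Time object.
LOCKFILE = <<~LOCK
  hash: abc123
  updated: 2016-01-01T10:00:00Z
  imports:
    - name: github.com/example/pkg
      version: 1.0.0
LOCK

# Without an allow-list, safe loading refuses to build a Time object.
begin
  YAML.safe_load(LOCKFILE)
  rejected = false
rescue Psych::DisallowedClass
  rejected = true
end

# Allowing Time lets the same document load.
manifest = YAML.safe_load(LOCKFILE, permitted_classes: [Time])

puts rejected
puts manifest["updated"].class
puts manifest["imports"].first["name"]
```

Permitting only `Time` keeps the allow-list as narrow as the lockfile format needs, rather than falling back to `YAML.unsafe_load`.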

53 changes: 0 additions & 53 deletions lib/bibliothecary/parsers/hackage.rb

This file was deleted.

54 changes: 0 additions & 54 deletions lib/bibliothecary/parsers/hex.rb

This file was deleted.
