This repository has been archived by the owner on Dec 22, 2022. It is now read-only.

Merge pull request #2067 from ualbertalib/circuit_breaker_memory
Running rake ingest[sfx] after the circuit breaker was added caused the ingest script to run out of memory:

rake aborted!
Errno::ENOMEM: Cannot allocate memory - java

    Kilmarnock in Prod has 4 GB memory / 2 GB swap and performs this feat monthly without complaint.
    Forest has the same virtual hardware, but was raising Netdata alarms about memory & swap usage while ingesting SFX specifically.

Basically, we read the file twice with two different GC'd languages, once in Ruby (Nokogiri, for the circuit breaker) and once in Java (for the ingest itself), so the file is in memory twice. Oops.
pgwillia authored Sep 28, 2020
2 parents 0644618 + c0128e9 commit 6af93f8
Showing 5 changed files with 20 additions and 9 deletions.
4 changes: 3 additions & 1 deletion .travis.yml
@@ -1,7 +1,9 @@
language: ruby
sudo: required
addons:
  chrome: stable
  apt:
    packages:
      - libxml2-utils
rvm:
  - 2.5
  - 2.6
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,11 @@ and releases in Discovery project adheres to [Semantic Versioning](http://semver

## [Unreleased]

## [3.5.3]

### Fixed
- memory use by circuit breaker [PR#2067](https://github.com/ualbertalib/discovery/pull/2067)

## [3.5.2]

### Added
1 change: 1 addition & 0 deletions README.md
@@ -6,6 +6,7 @@ discovery platform. Based on [Project Blacklight](projectblacklight.org).

* Depends on [Ruby](https://www.ruby-lang.org/en/) 2.5.x
* Depends on Java (for SolrMarc and Ingestion scripts)
* Depends on [xmllint](http://xmlsoft.org/xmllint.html) (for SFX Ingestion scripts) which is available in `sudo apt install libxml2-utils` on Ubuntu.
* Depends on an instance of [Solr](https://lucene.apache.org/solr/) with [this configuration](https://github.com/ualbertalib/blacklight_solr_conf)
* If you wish to use docker for the datastores install [docker](https://docs.docker.com/install/) and [docker-compose](https://docs.docker.com/compose/install/) first.
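
Because the SFX circuit breaker shells out to xmllint, a missing binary is worth detecting before an ingest starts. A minimal sketch of a PATH lookup in Ruby follows; the helper name `executable_on_path?` is hypothetical and not part of this repository, and the lookup assumes a POSIX-style PATH with no file extensions to try.

```ruby
# Hypothetical helper (not in the repository): true when an executable file
# called `name` exists in one of the directories listed in `path_env`.
def executable_on_path?(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end
```

A rake task could then stop early with a readable message, for example `abort 'xmllint not found; install libxml2-utils' unless executable_on_path?('xmllint')`.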

2 changes: 1 addition & 1 deletion config/application.rb
@@ -8,7 +8,7 @@
 Bundler.require(*Rails.groups)

 module Discovery
-  VERSION = '3.5.2'.freeze # used in application layout meta generator tag
+  VERSION = '3.5.3'.freeze # used in application layout meta generator tag

   class Application < Rails::Application
     # Settings in config/environments/* take precedence over those specified here.
17 changes: 10 additions & 7 deletions lib/tasks/ingest.rake
@@ -8,6 +8,7 @@ require "#{Rails.root}/lib/ingest/promoted_services_om.rb"
 require_relative './ingest_configuration.rb'

 require 'yaml'
+require 'open3'

 import 'lib/tasks/delete.rake'

@@ -51,13 +52,15 @@ task :ingest, [:collection] => [:update_solr_marc_maps] do |_t, args|

   # circuit breaker to prevent unparsable data from changing our index
   if File.extname(Rails.root.join(@c.path)) == '.xml'
-    doc = File.open(Rails.root.join(@c.path)) { |f| Nokogiri::XML(f) }
-    if doc.errors.count.positive?
-      unparsable = "#{@c.path} is unparsable #{doc.errors}"
-      Rollbar.error(unparsable)
-      @ingest_log.fatal(unparsable)
-      at_exit { puts unparsable }
-      exit
+    Open3.popen3("xmllint --encode utf-8 --noout #{Rails.root.join(@c.path)}") do |_stdout, _stderr, status, _thread|
+      response = status.read
+      unless response.empty?
+        unparsable = "#{@c.path} is unparsable #{response}"
+        Rollbar.error(unparsable)
+        @ingest_log.fatal(unparsable)
+        at_exit { puts unparsable }
+        exit
+      end
     end
   end

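
`Open3.popen3` yields stdin, stdout, stderr and a wait thread, in that order. The block variable named `status` in the new code is therefore the stderr stream, and the check treats any stderr output as a parse failure. A variant that also consults the exit status and avoids interpolating the path into a shell string might look like the sketch below. The helper name `xml_parse_errors` and its `command:` parameter are assumptions for illustration, not code from this commit, and the sketch assumes xmllint exits non-zero on a parse error.

```ruby
require 'open3'

# Hypothetical helper (not part of this commit): returns nil when the checker
# exits successfully, otherwise a string describing the failure.
def xml_parse_errors(path, command: ['xmllint', '--encode', 'utf-8', '--noout'])
  # Passing the command as separate arguments bypasses the shell, so spaces or
  # metacharacters in the path reach the checker as a single argument.
  _stdout, stderr, status = Open3.capture3(*command, path.to_s)
  return nil if status.success?

  # Fall back to the exit code if the checker failed without writing to stderr.
  stderr.empty? ? "exited with status #{status.exitstatus}" : stderr
end
```

In the rake task this could be called as `errors = xml_parse_errors(Rails.root.join(@c.path))`, logging and exiting when `errors` is non-nil. If xmllint is not installed, `Open3.capture3` raises `Errno::ENOENT` rather than returning.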