Merged with the original to update the gem, but keep our customizations #1

Open
wants to merge 45 commits into
base: master
45 commits
548df94
make library 1.8.7 compatible
liufengyun Oct 6, 2014
f9cbbeb
update version to 0.2.2
liufengyun Oct 6, 2014
687de5d
remove demo link
liufengyun Aug 26, 2015
75ed5b5
Use a_start and b_start variables in HashDiff.lcs
keram Nov 4, 2015
6844a34
Merge pull request #12 from keram/patch-1
liufengyun Nov 4, 2015
bfa0320
bumps version to 0.2.3
liufengyun Nov 5, 2015
45c572d
Add case insensitive option
ronco Feb 11, 2016
366d83b
add :case_insensitive option to README
ronco Feb 11, 2016
26f6a71
bumps version to 0.3.0
liufengyun Feb 11, 2016
fcb2b30
try fix travis test
liufengyun Feb 11, 2016
d07ae0a
Fix bug with array under hash key with non-word characters.
eirc May 28, 2016
7935759
don't test 1.8.7
liufengyun May 29, 2016
fff6fc2
Merge pull request #18 from eirc/master
liufengyun May 29, 2016
af067d6
add test to :delimiter in patch/unpatch
liufengyun Sep 1, 2016
68425c1
fix an error when a hash has mixed types
ZeroPointEnergy Nov 23, 2016
9df2cb5
Merge pull request #26 from ZeroPointEnergy/fix/mixed_keys
liufengyun Nov 24, 2016
a8f5873
bumps to 0.3.1
liufengyun Nov 24, 2016
938f747
New rubies support.
marshall-lee Dec 27, 2016
0b37a28
Merge pull request #28 from marshall-lee/new-rubies-support
liufengyun Dec 27, 2016
9cacb45
bumps to 0.3.2
liufengyun Dec 27, 2016
1a4bf75
Mention 2 compelling reasons to start using HashDiff
thbar Feb 8, 2017
addd333
Merge pull request #30 from thbar/patch-1
liufengyun Feb 21, 2017
1d1960a
Greatly improve performance of HashDiff#similar?
cloakedcode May 1, 2017
5751a2d
Merge pull request #31 from cloakedcode/early-return-similar
liufengyun May 1, 2017
7588d7d
bumps to 0.3.4
liufengyun May 1, 2017
7ada0b7
add codecov gem
stephengroat Jul 6, 2017
c553aa3
add codecov
stephengroat Jul 6, 2017
7271736
Merge pull request #33 from stephengroat/patch-1
liufengyun Jul 8, 2017
acb2d7e
Update patch documentation on README
kevindew Jul 29, 2017
9f44b1c
Introduce an array_path option
kevindew Jul 29, 2017
04e4e8b
Fix typo
kevindew Aug 3, 2017
5c47aff
Merge pull request #34 from kevindew/array_path
liufengyun Aug 6, 2017
83c6f4b
bumps to 0.3.5
liufengyun Aug 6, 2017
a885a77
Option to allow array comparisons in linear complexity
kevindew Aug 22, 2017
5dea0c7
Merge pull request #35 from kevindew/linear-arrays
liufengyun Aug 22, 2017
30a59a8
remove code coverage
liufengyun Aug 22, 2017
8c40f16
bumps to 0.3.6
liufengyun Aug 22, 2017
59a92b3
Documentation for the :use_lcs option
kevindew Aug 22, 2017
a5e22bb
Merge pull request #36 from kevindew/linear-array-docs
liufengyun Aug 23, 2017
34681b2
update minimum ruby to reflect actual support
lostapathy Oct 7, 2017
f153d24
set higher retry, bump ruby versions
lostapathy Oct 7, 2017
70fc43e
attempting to work around apparently bundler problem with travis and …
lostapathy Oct 7, 2017
0c00625
Merge pull request #39 from lostapathy/minimum_ruby
liufengyun Oct 7, 2017
0946ded
bumps to 0.3.7
liufengyun Oct 8, 2017
c0dc7b4
Merge remote-tracking branch 'original/master' into merged
Jul 21, 2018
9 changes: 8 additions & 1 deletion .travis.yml
@@ -1,6 +1,13 @@
sudo: false
language: ruby
rvm:
- 1.9.3
- 2.0.0
- 2.1.1
- 2.1.10
- 2.2.8
- 2.3.4
- 2.4.2
script: "bundle exec rake spec"

before_install:
- gem install bundler
2 changes: 1 addition & 1 deletion Gemfile
@@ -2,5 +2,5 @@ source "http://rubygems.org"
gemspec

group :test do
gem 'rake'
gem 'rake', '< 11'
end
102 changes: 79 additions & 23 deletions README.md
@@ -2,7 +2,9 @@

HashDiff is a ruby library to compute the smallest difference between two hashes.

**Demo**: [HashDiff](http://hashdiff.herokuapp.com/)
It also supports comparing two arrays.

HashDiff does not monkey-patch any existing class. All features are contained inside the `HashDiff` module.

**Docs**: [Documentation](http://rubydoc.info/gems/hashdiff)

@@ -70,8 +72,8 @@ diff.should == [['-', 'a[0].x', 2], ['-', 'a[0].z', 4], ['-', 'a[1].y', 22], ['-
patch example:

```ruby
a = {a: 3}
b = {a: {a1: 1, a2: 2}}
a = {'a' => 3}
b = {'a' => {'a1' => 1, 'a2' => 2}}

diff = HashDiff.diff(a, b)
HashDiff.patch!(a, diff).should == b
@@ -80,16 +82,18 @@ HashDiff.patch!(a, diff).should == b
unpatch example:

```ruby
a = [{a: 1, b: 2, c: 3, d: 4, e: 5}, {x: 5, y: 6, z: 3}, 1]
b = [1, {a: 1, b: 2, c: 3, e: 5}]
a = [{'a' => 1, 'b' => 2, 'c' => 3, 'd' => 4, 'e' => 5}, {'x' => 5, 'y' => 6, 'z' => 3}, 1]
b = [1, {'a' => 1, 'b' => 2, 'c' => 3, 'e' => 5}]

diff = HashDiff.diff(a, b) # diffing two arrays is OK
HashDiff.unpatch!(b, diff).should == a
```

### Options

There are five options available: `:delimiter`, `:similarity`, `:strict`, `:numeric_tolerance` and `:strip`.
There are eight options available: `:delimiter`, `:similarity`,
`:strict`, `:numeric_tolerance`, `:strip`, `:case_insensitive`, `:array_path`
and `:use_lcs`.

#### `:delimiter`

@@ -135,6 +139,73 @@ diff = HashDiff.diff(a, b, :comparison => { :numeric_tolerance => 0.1, :strip =>
diff.should == [["~", "x", 5, 6]]
```

#### `:case_insensitive`

The :case_insensitive option makes string comparisons ignore case.

```ruby
a = {x:5, s:'FooBar'}
b = {x:6, s:'foobar'}

diff = HashDiff.diff(a, b, :comparison => { :numeric_tolerance => 0.1, :case_insensitive => true })
diff.should == [["~", "x", 5, 6]]
```

#### `:array_path`

The :array_path option returns each path in the diff as an array rather than
a string. This makes differences between hash key types visible (for example
`:x` versus `'x'`) and is useful for `patch!` on hashes whose keys are not strings.

```ruby
a = {x:5}
b = {'x'=>6}

diff = HashDiff.diff(a, b, :array_path => true)
diff.should == [['-', [:x], 5], ['+', ['x'], 6]]
```

When a path passes through an array, the element's index is added to the path.
```ruby
a = {x:[0,1]}
b = {x:[0,2]}

diff = HashDiff.diff(a, b, :array_path => true)
diff.should == [["-", [:x, 1], 1], ["+", [:x, 1], 2]]
```

This shouldn't cause problems if you are comparing an array with a hash:

```ruby
a = {x:{0=>1}}
b = {x:[1]}

diff = HashDiff.diff(a, b, :array_path => true)
diff.should == [["~", [:x], {0=>1}, [1]]]
```

#### `:use_lcs`

The :use_lcs option is used to specify whether a
[Longest common subsequence](https://en.wikipedia.org/wiki/Longest_common_subsequence_problem)
(LCS) algorithm is used to determine differences in arrays. This defaults to
`true` but can be changed to `false` for significantly faster array comparisons
(O(n) complexity rather than O(n<sup>2</sup>) for LCS).

When :use_lcs is false, array comparisons tend to report changes at an index
(`~`) where the LCS algorithm (:use_lcs true) would report additions and
removals (`+` and `-`).

Note, currently the :similarity option has no effect when :use_lcs is false.

```ruby
a = {x: [0, 1, 2]}
b = {x: [0, 2, 2, 3]}

diff = HashDiff.diff(a, b, :use_lcs => false)
diff.should == [["~", "x[1]", 1, 2], ["+", "x[3]", 3]]
```
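
To see why an index-based comparison reports changes rather than additions and removals, here is a minimal sketch of a position-by-position array comparison. It is a simplified illustration and not HashDiff's actual implementation: `linear_array_diff` and its `prefix` parameter are hypothetical names, and it only handles flat arrays.

```ruby
# Hypothetical helper (not part of HashDiff): compares two flat arrays
# position by position in a single pass.
def linear_array_diff(old_items, new_items, prefix = 'x')
  changes = []
  shared = [old_items.length, new_items.length].min

  # Positions present in both arrays: report a change where the values differ.
  shared.times do |i|
    next if old_items[i] == new_items[i]
    changes << ['~', "#{prefix}[#{i}]", old_items[i], new_items[i]]
  end

  # Extra trailing elements in the old array are removals (highest index first).
  (old_items.length - 1).downto(shared) do |i|
    changes << ['-', "#{prefix}[#{i}]", old_items[i]]
  end

  # Extra trailing elements in the new array are additions.
  shared.upto(new_items.length - 1) do |i|
    changes << ['+', "#{prefix}[#{i}]", new_items[i]]
  end

  changes
end

linear_array_diff([0, 1, 2], [0, 2, 2, 3])
# => [["~", "x[1]", 1, 2], ["+", "x[3]", 3]]
```

Each position is visited once, which is where the linear cost comes from. The trade-off is that an insertion near the front of an array shifts every later element and shows up as a run of `~` entries.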

#### Specifying a custom comparison method

It's possible to specify how the values of a key should be compared.
@@ -171,6 +242,8 @@ diff.should == [["~", "a", "car", "bus"], ["~", "b[1]", "plane", " plan"], ["-",

When a comparison block is given, it takes priority over the other specified options. If the block returns a value other than `true` or `false`, the two values are compared using the other specified options.

When used in conjunction with the `array_path` option, the path passed in as an argument will be an array. When determining the ordering of an array a key of `"*"` will be used in place of the `key[*]` field. It is possible, if you have hashes with integer or `"*"` keys, to have problems distinguishing between arrays and hashes - although this shouldn't be an issue unless your data is very difficult to predict and/or your custom rules are very specific.
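
As a rough sketch of what a comparison rule might look like when paths arrive as arrays, the lambda below can be called directly without the gem. The `:price` key, the `0.5` threshold and the name `price_rule` are made up for illustration.

```ruby
# Hypothetical comparison rule; the :price key and 0.5 threshold are invented.
price_rule = lambda do |path, old_value, new_value|
  # Only handle array paths ending in :price with numeric values on both sides
  # (either value can be nil when a key was added or deleted).
  # Every other case yields nil, which defers to the remaining options.
  if path.is_a?(Array) && path.last == :price &&
     old_value.is_a?(Numeric) && new_value.is_a?(Numeric)
    (old_value - new_value).abs <= 0.5
  end
end

price_rule.call([:items, 0, :price], 10.0, 10.3) # => true
price_rule.call([:items, 0, :name], 'a', 'b')    # => nil
price_rule.call('items[0].price', 10.0, 10.3)    # => nil (string path)
```

With the gem loaded, a lambda like this could presumably be passed as the block, for example `HashDiff.diff(a, b, :array_path => true, &price_rule)`. Matching on the last path element avoids parsing delimiter-joined strings.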

#### Sorting arrays before comparison

An order difference alone between two arrays can create too many diffs to be useful. Consider sorting them prior to diffing.
@@ -186,23 +259,6 @@ b[:b].sort!
HashDiff.diff(a, b) # => []
```

### Special use cases

#### Using HashDiff on JSON API results

```ruby
require 'uri'
require 'net/http'
require 'json'

uri = URI('http://time.jsontest.com/')
json_resp = ->(uri) { JSON.parse(Net::HTTP.get_response(uri).body) }
a = json_resp.call(uri)
b = json_resp.call(uri)

HashDiff.diff(a,b) => [["~", "milliseconds_since_epoch", 1410542545874, 1410542545985]]
```

## License

HashDiff is distributed under the MIT-LICENSE.
36 changes: 36 additions & 0 deletions changelog.md
@@ -1,5 +1,41 @@
# Change Log

## v0.3.7 2017-10-08

* remove 1.8.7 support from gemspec #39

## v0.3.6 2017-08-22

* add option `use_lcs` #35

## v0.3.5 2017-08-06

* add option `array_path` #34

## v0.3.4 2017-05-01

* performance improvement of HashDiff#similar? #31

## v0.3.2 2016-12-27

* replace `Fixnum` by `Integer` #28

## v0.3.1 2016-11-24

* fix an error when a hash has mixed types #26

## v0.3.0 2016-2-11

* support `:case_insensitive` option

## v0.2.3 2015-11-5

* improve performance of LCS algorithm #12

## v0.2.2 2014-10-6

* make library 1.8.7 compatible

## v0.2.1 2014-7-13

* yield added/deleted keys for custom comparison
2 changes: 1 addition & 1 deletion hashdiff.gemspec
@@ -12,7 +12,7 @@ Gem::Specification.new do |s|
s.test_files = `git ls-files -- Appraisals {spec}/*`.split("\n")

s.require_paths = ['lib']
s.required_ruby_version = Gem::Requirement.new(">= 1.8.7")
s.required_ruby_version = Gem::Requirement.new(">= 1.9.3")

s.authors = ["Liu Fengyun"]
s.email = ["[email protected]"]
1 change: 1 addition & 0 deletions lib/hashdiff.rb
@@ -1,5 +1,6 @@
require 'hashdiff/util'
require 'hashdiff/lcs'
require 'hashdiff/linear_compare_array'
require 'hashdiff/diff'
require 'hashdiff/patch'
require 'hashdiff/version'
75 changes: 40 additions & 35 deletions lib/hashdiff/diff.rb
@@ -7,10 +7,12 @@ module HashDiff
# @param [Array, Hash] obj1
# @param [Array, Hash] obj2
# @param [Hash] options the options to use when comparing
# * :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Fixnum, Float, BigDecimal to each other
# * :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other
# * :delimiter (String) ['.'] the delimiter used when returning nested key references
# * :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, the difference between numeric values must be greater than the given value to be reported.
# * :strip (Boolean) [false] whether or not to call #strip on strings before comparing
# * :array_path (Boolean) [false] whether to return the path references for nested values as an array; can be used for patch compatibility with non-string keys.
# * :use_lcs (Boolean) [true] whether or not to use an implementation of the Longest common subsequence algorithm for comparing arrays; it produces better diffs but is slower.
#
# @yield [path, value1, value2] Optional block is used to compare each value, instead of default #==. If the block returns a value other than true or false, then other specified comparison options will be used to do the comparison.
#
@@ -27,15 +29,15 @@ module HashDiff
def self.best_diff(obj1, obj2, options = {}, &block)
options[:comparison] = block if block_given?

opts = {similarity: 0.3}.merge!(options)
opts = { :similarity => 0.3 }.merge!(options)
diffs_1 = diff(obj1, obj2, opts)
count_1 = count_diff diffs_1

opts = {similarity: 0.5}.merge!(options)
opts = { :similarity => 0.5 }.merge!(options)
diffs_2 = diff(obj1, obj2, opts)
count_2 = count_diff diffs_2

opts = {similarity: 0.8}.merge!(options)
opts = { :similarity => 0.8 }.merge!(options)
diffs_3 = diff(obj1, obj2, opts)
count_3 = count_diff diffs_3

@@ -48,11 +50,14 @@ def self.best_diff(obj1, obj2, options = {}, &block)
# @param [Array, Hash] obj1
# @param [Array, Hash] obj2
# @param [Hash] options the options to use when comparing
# * :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Fixnum, Float, BigDecimal to each other
# * :strict (Boolean) [true] whether numeric values will be compared on type as well as value. Set to false to allow comparing Integer, Float, BigDecimal to each other
# * :similarity (Numeric) [0.8] should be between (0, 1]. Meaningful if there are similar hashes in arrays. See {best_diff}.
# * :delimiter (String) ['.'] the delimiter used when returning nested key references
# * :numeric_tolerance (Numeric) [0] should be a positive numeric value. By default, numeric values are compared exactly; with the :numeric_tolerance option, the difference between numeric values must be greater than the given value to be reported.
# * :strip (Boolean) [false] whether or not to call #strip on strings before comparing
# * :array_path (Boolean) [false] whether to return the path references for nested values as an array; can be used for patch compatibility with non-string keys.
# * :use_lcs (Boolean) [true] whether or not to use an implementation of the Longest common subsequence algorithm for comparing arrays; it produces better diffs but is slower.
#
#
# @yield [path, value1, value2] Optional block is used to compare each value, instead of default #==. If the block returns a value other than true or false, then other specified comparison options will be used to do the comparison.
#
@@ -74,9 +79,13 @@ def self.diff(obj1, obj2, options = {}, &block)
:delimiter => '.',
:strict => true,
:strip => false,
:numeric_tolerance => 0
:numeric_tolerance => 0,
:array_path => false,
:use_lcs => true
}.merge!(options)

opts[:prefix] = [] if opts[:array_path] && opts[:prefix] == ''

opts[:comparison] = block if block_given?

# prefer to compare with provided block
@@ -103,62 +112,59 @@ def self.diff(obj1, obj2, options = {}, &block)
end

result = []
if obj1.is_a?(Array)
changeset = diff_array(obj1, obj2, opts) do |lcs|
if obj1.is_a?(Array) && opts[:use_lcs]
changeset = diff_array_lcs(obj1, obj2, opts) do |lcs|
# use a's index for similarity
lcs.each do |pair|
result.concat(diff(obj1[pair[0]], obj2[pair[1]], opts.merge(prefix: "#{opts[:prefix]}[#{pair[0]}]")))
prefix = prefix_append_array_index(opts[:prefix], pair[0], opts)
result.concat(diff(obj1[pair[0]], obj2[pair[1]], opts.merge(:prefix => prefix)))
end
end

changeset.each do |change|
change_key = prefix_append_array_index(opts[:prefix], change[1], opts)
if change[0] == '-'
result << ['-', "#{opts[:prefix]}[#{change[1]}]", change[2]]
result << ['-', change_key, change[2]]
elsif change[0] == '+'
result << ['+', "#{opts[:prefix]}[#{change[1]}]", change[2]]
result << ['+', change_key, change[2]]
end
end
elsif obj1.is_a?(Array) && !opts[:use_lcs]
result.concat(LinearCompareArray.call(obj1, obj2, opts))
elsif obj1.is_a?(Hash)
if opts[:prefix].empty?
prefix = ""
else
prefix = "#{opts[:prefix]}#{opts[:delimiter]}"
end

deleted_keys = []
common_keys = []

obj1.each do |k, v|
if obj2.key?(k)
common_keys << k
else
deleted_keys << k
end
end
deleted_keys = obj1.keys - obj2.keys
common_keys = obj1.keys & obj2.keys
added_keys = obj2.keys - obj1.keys

# add deleted properties
deleted_keys.each do |k|
custom_result = custom_compare(opts[:comparison], "#{prefix}#{k}", obj1[k], nil)
deleted_keys.sort_by{|k,v| k.to_s }.each do |k|
change_key = prefix_append_key(opts[:prefix], k, opts)
custom_result = custom_compare(opts[:comparison], change_key, obj1[k], nil)

if custom_result
result.concat(custom_result)
else
result << ['-', "#{prefix}#{k}", obj1[k]]
result << ['-', change_key, obj1[k]]
end
end

# recursive comparison for common keys
common_keys.each {|k| result.concat(diff(obj1[k], obj2[k], opts.merge(prefix: "#{prefix}#{k}"))) }
common_keys.sort_by{|k,v| k.to_s }.each do |k|
prefix = prefix_append_key(opts[:prefix], k, opts)
result.concat(diff(obj1[k], obj2[k], opts.merge(:prefix => prefix)))
end

# added properties
obj2.each do |k, v|
added_keys.sort_by{|k,v| k.to_s }.each do |k|
change_key = prefix_append_key(opts[:prefix], k, opts)
unless obj1.key?(k)
custom_result = custom_compare(opts[:comparison], "#{prefix}#{k}", nil, v)
custom_result = custom_compare(opts[:comparison], change_key, nil, obj2[k])

if custom_result
result.concat(custom_result)
else
result << ['+', "#{prefix}#{k}", obj2[k]]
result << ['+', change_key, obj2[k]]
end
end
end
@@ -173,7 +179,7 @@ def self.diff(obj1, obj2, options = {}, &block)
# @private
#
# diff array using LCS algorithm
def self.diff_array(a, b, options = {})
def self.diff_array_lcs(a, b, options = {})
opts = {
:prefix => '',
:similarity => 0.8,
@@ -226,5 +232,4 @@ def self.diff_array(a, b, options = {})

change_set
end

end
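
The diff above calls `prefix_append_key` and `prefix_append_array_index`, whose definitions are not part of the changes shown here. The following is a sketch of how such helpers could behave, inferred only from the string paths they replace and from the `:array_path` examples in the README; `append_key` and `append_index` are hypothetical stand-ins, not the library's code.

```ruby
# Hypothetical stand-ins for the path helpers called in the diff above.
def append_key(prefix, key, opts)
  if opts[:array_path]
    prefix + [key]                        # array path: the key keeps its type
  elsif prefix.empty?
    key.to_s                              # top level: no delimiter needed
  else
    "#{prefix}#{opts[:delimiter]}#{key}"  # nested: join with the delimiter
  end
end

def append_index(prefix, index, opts)
  opts[:array_path] ? prefix + [index] : "#{prefix}[#{index}]"
end

string_opts = { :array_path => false, :delimiter => '.' }
array_opts  = { :array_path => true }

append_index(append_key('', :x, string_opts), 1, string_opts) # => "x[1]"
append_index(append_key([], :x, array_opts), 1, array_opts)   # => [:x, 1]
```

Keeping the string and array cases behind one pair of helpers is what lets the rest of `diff` stay agnostic about the path representation.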