Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UpdateMatching does not support custom StorerIndexFunc #110

Open
ljfuyuan opened this issue Apr 21, 2024 · 7 comments
Open

UpdateMatching does not support custom StorerIndexFunc #110

ljfuyuan opened this issue Apr 21, 2024 · 7 comments
Assignees

Comments

@ljfuyuan
Copy link

ljfuyuan commented Apr 21, 2024

When using an index, UpdateMatching creates an Iterator to query all index keys and uses matchsAllCriteria to compare whether the index meets the query criteria. Then, decode the value key[len(prefix):] for matching

// indexed field, get keys from index
	prefix = indexKeyPrefix(typeName, query.index)
	i.iter.Seek(prefix)
	i.nextKeys = func(iter *badger.Iterator) ([][]byte, error) {
		var nKeys [][]byte

		for len(nKeys) < iteratorKeyMinCacheSize {
			if !iter.ValidForPrefix(prefix) {
				return nKeys, nil
			}

			item := iter.Item()
			key := item.KeyCopy(nil)
			// no currentRow on indexes as it refers to multiple rows
			// remove index prefix for matching
			ok, err := s.matchesAllCriteria(criteria, key[len(prefix):], true, "", nil)
			if err != nil {
				return nil, err
			}

But Criterion.test always uses Default Decode.

			// used with keys
			if keyType != "" {
				err := s.decodeKey(testValue.([]byte), recordValue, keyType)
				if err != nil {
					return false, err
				}
			} else {
				err := s.decode(testValue.([]byte), recordValue)
				if err != nil {
					return false, err
				}
			}

While indexUpdate uses a custom Storer IndexFunc.

indexKey, err := index.IndexFunc(indexName, value)
	if err != nil {
		return err
	}
	if indexKey == nil {
		return nil
	}

	indexValue := make(KeyList, 0)

	indexKey = append(indexKeyPrefix(typeName, indexName), indexKey...)
@timshannon timshannon self-assigned this Apr 21, 2024
@timshannon
Copy link
Owner

Do you have an actual issue you are running into with a replicatible code snippet?

The Storer interface doesn't actually define any decoding or encoding, it just deals with returning bytes from a type so you can define anything that describes the atomic instance of your type; i.e. what makes this item unique.

The assumption is that the same decoder and encoder you specify in your DB connection options would be used in your custom Storer implementation using the default AnonStorer as an example as to how to implement it:

storer.indexes[indexName] = Index{
				IndexFunc: func(name string, value interface{}) ([]byte, error) {
					tp := reflect.ValueOf(value)
					for tp.Kind() == reflect.Ptr {
						tp = tp.Elem()
					}

					return s.encode(tp.FieldByName(name).Interface())
				},
				Unique: unique,
			}

So I'm trying to understand what scenario you're trying to solve where you are using a different encoder / decoder for your indexes then for the rest of your data?

@ljfuyuan
Copy link
Author

ljfuyuan commented May 20, 2024

Hi , I try to customize the Storer to construct a specific Index, but it seems that UpdateMatching is not working correctly, the following is the pseudo code

type Sample struct {
	Id        uint32
}

func (t *Sample) Type() string {
	return "Sample"
}

func (t *Sample) Indexes() map[string]badgerhold.Index {
	return sampleIndex
}

sampleIndex := make(map[string]badgerhold.Index)
sampleIndex["Id"] = badgerhold.Index{
	IndexFunc: func(name string, value interface{}) ([]byte, error) {
		if t, ok := value.(*Sample); ok {
			return badgerhold.DefaultEncode(t.Id)             // <- works fine

			// id := make([]byte, 4)                          // <- did not work here
			// binary.LittleEndian.PutUint32(id, t.Id)
			// return id, nil
		}

		return nil, errors.New("error")
	},
}

@timshannon
Copy link
Owner

Yep because you are just encoding it as bytes which is very different from an encoder / decoder that handles generic types:

image

@ljfuyuan
Copy link
Author

Sorry, I don't get you, Since the Storer interface is provided, and indexUpdate use IndexFunc to build indexKey, why the method fetchIndexValues use DefaultEncode to build indexKey ?

If I understand Storer.Index.IndexFunc incorrectly, what is the purpose of Storer's existence?

@timshannon
Copy link
Owner

From https://pkg.go.dev/github.com/timshannon/badgerhold#Storer

image

... why the method fetchIndexValues use DefaultEncode to build indexKey ?

Because it needs to match the same encoding as the rest of the database so that values can be properly compared.

It's kind of like saying "why can't I compare the word four and the number 4. They are the same value."

They may both represent the same value, but they are encoding that value in different ways. In order for the database to store things in order for quick searching, all values need to be encoded the same.

binary.LittleEndian.PutUint32(id, t.Id) and Gob encode return two very different byte arrays.

@ljfuyuan
Copy link
Author

Yes, I understand what you said.

But, what I want to express is that when indexFunc is used by indexUpdate to adding/updating an index, the same indexFunc should also be used during fetchIndexValues , instead of using the DefaultEncode, it seems they cannot generate the same result, so they never match.

Did I miss something?

@timshannon
Copy link
Owner

I may be missing something too, or missunderstanding what you're trying to accomplish.

the default / example Storer interface (i.e. the example off of which to build your own interface) encodes the index values before returning them. The fetchIndexValues function assumes that the values will be encoded, and decodes them properly.

As long as you decode your values before returning them in your custom Storer type everything should be fine. However I don't believe have any tests confirming that to be true. If you have a code snippet that shows otherwise, please share it. The previous one you sent was not encoding the index values correctly, it was using an entirely different encoding, and thus will not work.

I do acknowledge, however, that the Storer interface is a bit of a leaky abstraction, and relies on too much required behavior to be implemented to work properly. If I were re-doing it now, I would completely change it.

However doing so now would be backwards incompatible, and I'm not sure such a huge breaking change is worth it to simply clarify the interface.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants