Add `json:""` marshaling annotations to response structs #21

austin-denoble · 2024-05-03T20:14:39Z

Problem

There's been some UX feedback regarding marshaling of some of our response structs.

Primarily, we're not using json:"key_name" annotation for our struct fields so they default to capitalized keys in JSON output when marshaled. IE: { "Usage": { "ReadUnits": 5 }}. We'd like these lower case and matched with our API spec.

There are also instances where we're returning pointers for certain *int32 fields which can be irritating to consumers needing to dereference and nil-check. In certain cases we can default any nil pointer values to 0 before returning things.

There's no testing for the marshaling on any of our custom structs.

Solution

I may have gone a bit overboard with the testing here. The bulk of this is tests and the tests are all very similar, definitely open to opinions on removing if people find them unnecessary, but changing up how the JSON is produced is pretty easy and it's an important point of integration in my mind, so maybe worth testing for.

I also wasn't clear if adding these annotations to our request structs made sense either, so I left those out for now. Overall I'm still working on getting a better understanding of Go syntax and expectations.

Add JSON marshaling annotations to the response structs in models.go and index_connection.go.
Update Collection and Usage structs to default integer values rather than using pointers.
Add new models_test.go file to run through some validation on struct marshaling, also add unit tests to index_connection_test.go.

Originally I thought omitempty for some of these pointers may be helpful from a consumer standpoint, but I'm actually not sure the more I've thought about it. Does it make sense to omit more or less things? Other feedback here would be much appreciated.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update
Infrastructure change (CI configs, etc)
Non-code change (docs, etc)
None of the above: (explain here)

Test Plan

Unit tests to explicitly stress marshaling the structs via json.Marshal().

To see the specific tasks where the Asana app for GitHub is being used, see below:
- https://app.asana.com/0/0/1207208968992760
- https://app.asana.com/0/0/1207161516992976

haruska

👍

aulorbe · 2024-05-06T17:22:30Z

pinecone/index_connection.go

-	VectorIds           []*string
-	Usage               *Usage
-	NextPaginationToken *string
+	VectorIds           []*string `json:"vector_ids,omitempty"`


Re: your question in the PR description, from what I've read at least it seems like adding omitempty is a good call. I think it avoids ambiguous values -- like is a vector ID 0, or does it not exist? I'm always a fan of being explicit

I think there are places where not using omitempty does make sense as it'll drop zero values that maybe you want to include.

For this PR I was mainly focused on making sure we're using omitempty for pointers and structs where it makes sense, primarily to avoid null values in the resulting JSON.

aulorbe · 2024-05-06T17:25:46Z

pinecone/index_connection_test.go

+		{
+			name:  "Fields omitted",
+			input: FetchVectorsResponse{},
+			want:  `{}`,


Should we account for a case when some fields are present, but others are not (or are empty)?

There are cases where we do test for that like in TestMarshalingDescribeIndexStatsResponse, but for something like this where all fields have omitempty we just test for an empty struct.

aulorbe · 2024-05-06T17:27:16Z

pinecone/index_connection_test.go

+				},
+				Usage: &Usage{ReadUnits: toUInt32(5)},
+			},
+			want: `{"matches":[{"vector":{"id":"vec-1","values":[0.01,0.01,0.01]},"score":0.1},{"vector":{"id":"vec-2","values":[0.02,0.02,0.02]},"score":0.2}],"usage":{"read_units":5}}`,


Might be worth at least having the want values as globals at the top? Or fixtures or whatever? Just to downsize the amount of duplicated code

It's definitely a lot of duplicative code. I'd prefer not to move the want blocks around though as they're specific to the struct that's being tested, and it'd probably be more confusing to have to look them up elsewhere than in-line.

I can try and think of ways to clean some of this up though.

aulorbe · 2024-05-06T17:28:56Z

pinecone/models.go

+	Size        *int64           `json:"size,omitempty"`
+	Status      CollectionStatus `json:"status"`
+	Dimension   *int32           `json:"dimension,omitempty"`
+	VectorCount *int32           `json:"vector_count,omitempty"`


This might be an exception to the omitempty thing I was saying above -- if vector_count is empty, that actually is meaningful, right? Like, that means there are no vectors in the collection, right? If so, I think we should actually not have omitempty there, since the value of 0 indicates the actual state of things (vs an ambiguous state)

The main reason for using omitempty here is VectorCount is a pointer, and I've tried to use omitempty with all our pointers. The reason is otherwise an empty pointer is marshaled to null in the JSON results.

For collections specifically, this is mainly to cover a specific issue with VectorCount and other fields possibly being empty when creating a collection from an empty index.

I reworked the Collection struct and the toCollection function to map the pointers into zero values for the ints to make things easier to work with for consumers.

aulorbe · 2024-05-06T17:30:01Z

pinecone/models.go

-	SparseValues *SparseValues
-	Metadata     *Metadata
+	Id           string        `json:"id"`
+	Values       []float32     `json:"values,omitempty"`


I thought we always had to have dense values, no? Like, you can technically "only" have sparse values in an index if you upload a dense vector that's just filled with 0s and has a dimension of 1 or 2 or something, but I think you have to have a dense vector in order to upload a sparse vector (so I think omitempty here is not good)

(b/c you should never have an empty dense vector value)

You do have to have dense values, yes. Similar to the pointer issue I mentioned here: #21 (comment) I was mainly doing this defensively to avoid instances where we'd have values: null in the JSON, but I don't think that's ever expected to happen.

Also the generated code that we map between uses omitempty for this, so I think it's fine to leave it as-is for now:

go-pinecone/internal/gen/data/vector_service.pb.go

Line 88 in 32391bc

Values []float32 `protobuf:"fixed32,2,rep,packed,name=values,proto3" json:"values,omitempty"`

aulorbe

Some comments

…e stripping out empty pointers on conversion, and properly converting the struct keys to lower and snake case, add a bunch of unit tests to valid marshaling

…ional structs to default to 0 integer values rather than passing a pointer, clean up tests

austin-denoble requested review from haruska, jhamon and aulorbe May 3, 2024 20:14

haruska approved these changes May 6, 2024

View reviewed changes

aulorbe reviewed May 6, 2024

View reviewed changes

austin-denoble added 2 commits May 6, 2024 16:59

add json marshaling annotations to our response structs to facillitat…

1c8204c

…e stripping out empty pointers on conversion, and properly converting the struct keys to lower and snake case, add a bunch of unit tests to valid marshaling

tweak index_connection_test tests

567551d

austin-denoble force-pushed the adenoble/json-marshaling-annotations branch from 7ac687d to 567551d Compare May 6, 2024 20:59

review feedback, fix newly failing delete by filter test, tweak addit…

4db920c

…ional structs to default to 0 integer values rather than passing a pointer, clean up tests

austin-denoble merged commit cbb7f75 into main May 7, 2024
3 checks passed

austin-denoble deleted the adenoble/json-marshaling-annotations branch May 7, 2024 22:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `json:""` marshaling annotations to response structs #21

Add `json:""` marshaling annotations to response structs #21

austin-denoble commented May 3, 2024 •

edited

Loading

haruska left a comment

aulorbe May 6, 2024

austin-denoble May 6, 2024

aulorbe May 6, 2024

austin-denoble May 6, 2024

aulorbe May 6, 2024

austin-denoble May 6, 2024

aulorbe May 6, 2024

austin-denoble May 6, 2024

austin-denoble May 7, 2024

aulorbe May 6, 2024 •

edited

Loading

aulorbe May 6, 2024

austin-denoble May 6, 2024

austin-denoble May 6, 2024

aulorbe left a comment

Add json:"" marshaling annotations to response structs #21

Add json:"" marshaling annotations to response structs #21

Conversation

austin-denoble commented May 3, 2024 • edited Loading

Problem

Solution

Type of Change

Test Plan

haruska left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aulorbe May 6, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aulorbe left a comment

Choose a reason for hiding this comment

Add `json:""` marshaling annotations to response structs #21

Add `json:""` marshaling annotations to response structs #21

austin-denoble commented May 3, 2024 •

edited

Loading

aulorbe May 6, 2024 •

edited

Loading