Behavior of $project for nested documents #4704

Open
bithazard opened this issue May 20, 2024 · 2 comments
Labels: has: breaking-change (An issue that is associated with a breaking change), type: bug (A general bug)

Comments

@bithazard

Hi. I'm trying to understand the overloaded project methods in an aggregation. My initial understanding was that project("x") is just the short form of project(Fields.from(field("x"))) or even project(Fields.from(field("x", "x"))). The latter variant is of course only really needed if you want to project a field to a model with a different structure. This assumption holds for top-level fields. However, when used for nested documents, the resulting queries look a bit different and don't work as expected. See the following example:

import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Fields;
import org.springframework.data.mongodb.core.mapping.Document;

import java.util.List;

import static org.springframework.data.mongodb.core.aggregation.Aggregation.newAggregation;
import static org.springframework.data.mongodb.core.aggregation.Aggregation.project;
import static org.springframework.data.mongodb.core.aggregation.Fields.field;

/* Example document in database "test", collection "test":
      {
          "_id" : ObjectId("51a4da9b292904caffcff6eb"),
          "levelOneDocument" : {
              "levelOneField" : "levelOneFieldValue"
          }
      }
 */
@SpringBootApplication
public class MongodbProjectSscce implements CommandLineRunner {
    private final MongoTemplate mongoTemplate;

    public MongodbProjectSscce(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    public static void main(String[] args) {
        SpringApplication.run(MongodbProjectSscce.class, args);
    }

    @Override
    public void run(String... args) {
        //results in: {"aggregate": "test", "pipeline": [{"$project": {"levelOneField": "$levelOneDocument.levelOneField"}}]}
        List<RootDocument> projection1 = mongoTemplate.aggregate(newAggregation(project(
                "levelOneDocument.levelOneField"
        )), "test", RootDocument.class).getMappedResults();
        System.out.println(projection1);
        //-> doesn't work as expected: levelOneField in LevelOneDocument is not filled, instead levelOneField in RootDocument is filled

        //also results in: {"aggregate": "test", "pipeline": [{"$project": {"levelOneField": "$levelOneDocument.levelOneField"}}]}
        List<RootDocument> projection2 = mongoTemplate.aggregate(newAggregation(project(
                Fields.from(
                    field("levelOneDocument.levelOneField")
                )
        )), "test", RootDocument.class).getMappedResults();
        System.out.println(projection2);
        //-> also doesn't work as expected: levelOneField in LevelOneDocument is not filled, instead levelOneField in RootDocument is filled

        //results in: {"aggregate": "test", "pipeline": [{"$project": {"levelOneDocument.levelOneField": 1}}]}
        List<RootDocument> projection3 = mongoTemplate.aggregate(newAggregation(project(
                Fields.from(
                        field("levelOneDocument.levelOneField", "levelOneDocument.levelOneField")
                )
        )), "test", RootDocument.class).getMappedResults();
        System.out.println(projection3);
        //-> works as expected: levelOneField in LevelOneDocument is filled, levelOneField in RootDocument is not
    }

    @Document
    public record RootDocument(
            @Id
            String id,
            String levelOneField,   //Field should not be here - for demonstration purposes only
            LevelOneDocument levelOneDocument) {}

    public record LevelOneDocument(String levelOneField) {}
}

There is a record RootDocument at the bottom which contains a nested document LevelOneDocument with only one field, levelOneField. The RootDocument also contains the id, which is irrelevant here, and another field levelOneField. That field should not be there; I only added it to demonstrate the problem. When you run the first two aggregations (project("...") and project(Fields.from(field("...")))), they both produce the same query, which leads to the wrong result: "$levelOneDocument.levelOneField" is projected to "levelOneField". Only when you explicitly state that you want to project "levelOneDocument.levelOneField" to "levelOneDocument.levelOneField" (the third aggregation, project(Fields.from(field("...", "...")))) do you get the expected query and result.

The reason for the resulting query in the first two aggregations is a check in org.springframework.data.mongodb.core.aggregation.Fields:238. It checks whether name contains a period and target is null. In that case only the substring after the first period is used as name. I'm not sure what the intention of this code is. Maybe a period has a special meaning in a projection that I'm not aware of. If so, this should be documented somewhere. Otherwise, looking only at the overloaded methods, you would assume that they all behave the same way, regardless of whether you project a top-level field or a nested document.
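To make the described check concrete, here is a minimal standalone sketch of the name derivation as I understand it from the paragraph above. It does not use Spring Data at all: the class FirstDotTrimSketch and the helpers deriveName and renderProject are hypothetical names made up for illustration, and the rendering logic is an assumption that only aims to reproduce the three $project stages quoted in the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FirstDotTrimSketch {

    // Hypothetical helper, NOT Spring Data code: mimics the check described above.
    // If no explicit target is given and the name contains a period,
    // only the part after the FIRST period becomes the output field name.
    static String deriveName(String name, String target) {
        if (target == null && name.contains(".")) {
            return name.substring(name.indexOf('.') + 1);
        }
        return name;
    }

    // Hypothetical helper: builds a single-entry $project body.
    // Assumption: when output name and source path are equal, a plain inclusion (1)
    // is rendered; otherwise a "$source" field reference is rendered.
    static Map<String, Object> renderProject(String name, String target) {
        String fieldName = deriveName(name, target);
        String source = target != null ? target : name;
        Map<String, Object> stage = new LinkedHashMap<>();
        stage.put(fieldName, fieldName.equals(source) ? (Object) 1 : "$" + source);
        return stage;
    }

    public static void main(String[] args) {
        // Variants 1 and 2 from the example: no explicit target
        System.out.println(renderProject("levelOneDocument.levelOneField", null));
        // Variant 3 from the example: explicit target equal to the name
        System.out.println(renderProject("levelOneDocument.levelOneField",
                "levelOneDocument.levelOneField"));
    }
}
```

Under these assumptions the first call maps levelOneField to "$levelOneDocument.levelOneField", while the second keeps the full path with a value of 1, which matches the queries shown in the comments of the example.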

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label May 20, 2024
@mp911de
Member

mp911de commented May 21, 2024

Another data point: andInclude("a.b.c") renders {"b.c" : "$a.b.c"}, see #76

From what the investigation shows, the goal was to derive the field name from a property. Therefore, paths use the segment after the dot. We never tested against paths containing multiple segments; had we done so, the flaw that we split at the first dot would have been revealed.

We should update this behavior with our next major release to correctly derive the field name and also verify functionality against placeholder paths (a.$.b). To correctly mimic MongoDB behavior, andInclude/andExclude should accept paths as-is and not trim them down.
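As a rough illustration of the difference between the reported rendering and the "accept paths as-is" idea, here is a small standalone sketch. It is not Spring Data code: IncludeRenderingSketch, trimmedInclude and asIsInclude are hypothetical names, and the as-is variant is only one possible reading of the proposal, not the actual planned implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IncludeRenderingSketch {

    // Hypothetical: reproduces the reported rendering, where a dotted path is
    // cut at the FIRST dot and the full path becomes a "$" field reference.
    static Map<String, Object> trimmedInclude(String... paths) {
        Map<String, Object> projection = new LinkedHashMap<>();
        for (String path : paths) {
            int dot = path.indexOf('.');
            if (dot >= 0) {
                projection.put(path.substring(dot + 1), "$" + path);
            } else {
                projection.put(path, 1);
            }
        }
        return projection;
    }

    // Hypothetical: one reading of "accept paths as-is", where every path is
    // used unchanged as the key of a plain inclusion.
    static Map<String, Object> asIsInclude(String... paths) {
        Map<String, Object> projection = new LinkedHashMap<>();
        for (String path : paths) {
            projection.put(path, 1);
        }
        return projection;
    }

    public static void main(String[] args) {
        System.out.println(trimmedInclude("a.b.c")); // key "b.c" referencing "$a.b.c"
        System.out.println(asIsInclude("a.b.c"));    // key "a.b.c" with value 1
    }
}
```

The as-is form corresponds to a plain MongoDB inclusion such as {"a.b.c": 1}, which keeps the nested structure in the result.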

@mp911de mp911de self-assigned this May 21, 2024
@mp911de mp911de added type: bug A general bug has: breaking-change An issue that is associated with a breaking change. and removed status: waiting-for-triage An issue we've not yet triaged labels May 21, 2024
@mp911de mp911de changed the title from "Behavior of project for nested documents" to "Behavior of $project for nested documents" May 21, 2024
@shollander

I have a similar issue, although in my case the Fields.from(field("name", "alias")) approach doesn't work.

I am using a $group operation before the $project. My $group results in a document with the following structure:

{
  _id: {
    key: 'ABC',
    key2: 'DEF'
  },
  sum: 10
}

The following does not work:

project()
  .andExclude("_id")
  .and("_id.key").as("key")
  .and("_id.key2").as("key2")
  .andInclude("sum")

The only (seemingly incorrect) thing that works is to use this:

project()
  .andExclude("_id")
  .andInclude("key", "key2")
  .andInclude("sum")
