Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instancer storage rework #523

Merged
merged 28 commits into from
Oct 24, 2024
Merged

Conversation

Xtarsia
Copy link
Contributor

@Xtarsia Xtarsia commented Oct 11, 2024

Fixes #479

Implements region localised storage of instancer transforms.

See discussion
#524

  • 1. Embed vertex_spacing in transforms, relative to local region.
    Add vertex_spacing to Terrain3DRegion.
    Only change vertex_spacing of transforms on load/set_vs().

  • 2. Split MM data down to 32x32 storage
    append_location -> append_region -> append_cell -> storage in 32x32

  • 3. Change MM/MMI generation off of new data structure

  • 4. Add region container in tree / instancer

  • 5. Support changing region size

  • 6. Upgrade old data from 0.9.2 and 0.9.3

Bug squash and misc:

  • Hand removing instances doesn't fully delete regions from tree.
  • Update transforms isn't accurate at 0.5 vertex spacing.
  • Changing region size twice (1024-512-1024) broken.
  • Remove_instances operates much faster than add_instances now - should be same speed, and not "blocky".
  • Naming convention as per Instancer storage rework #523 (comment)
  • Expose visibility distance for each mesh asset
  • Fix visibility ranges on MMIs
  • Dictionary issues, test std::unordered_map
  • Crash changing vertex spacing
  • Removal slope filter being smaller than the add slope filter is an annoyance
  • Swap_ids() and other functions haven't been touched yet (comment out region::get_multimeshes() to reveal all)
  • Remove_instances: Crash in w/ remove_all
  • Remove instances: remove not completely filtered by mesh_id
  • Remove instances: warning - not erasing _instances[mesh_id]
  • Crashes on close pretty frequently, after backup_region() runs. Potentially corrupts data
  • Transfer data buffers directly into Multimesh rather than per instance - unnecessary set_buffer has a memcpy
  • Remove_instances: Sparse removal of instances shows the cell grid

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 12, 2024

Good start. I'll work on 4, placement in the tree by region.
Edit:
This update now writes this to the tree with region location and MMI mesh id:

Terrain3D/MMI
Terrain3D/MMI/Region_00-01
Terrain3D/MMI/Region_00-01/MultiMeshInstance3D0
Terrain3D/MMI/Region_00-01/MultiMeshInstance3D1
Terrain3D/MMI/Region_00-01/MultiMeshInstance3D2
Terrain3D/MMI/Region_00-02
Terrain3D/MMI/Region_00-02/MultiMeshInstance3D0
Terrain3D/MMI/Region_00-02/MultiMeshInstance3D1
Terrain3D/MMI/Region_00-02/MultiMeshInstance3D2
Terrain3D/MMI/Region_00_00
Terrain3D/MMI/Region_00_00/MultiMeshInstance3D0
Terrain3D/MMI/Region_00_00/MultiMeshInstance3D1
Terrain3D/MMI/Region_00_00/MultiMeshInstance3D2

Next I'll work on 2/3.

@TokisanGames TokisanGames added the enhancement New feature or request label Oct 12, 2024
@TokisanGames TokisanGames added this to the 0.9 milestone Oct 12, 2024
@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 12, 2024

Great!
It occurs to me that each cell will need its own internal location relative to the region.

Doing updates would need to place each mmi cell by (region_size * location + 32 * cell_location) * vertex scaling.

The vertex scaling update will have to operate per cell rather than region too.

The cell dictionary should be keyed by cell location relative to the region i belive.

Probably need another pass to fix the access / updates etc to so its relative to cells rather than regions once cells are in.

MMI and transforms should be reletive to the region origin.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 12, 2024

I have this to return cell location local to the region.

#define CELL_SIZE 32
Vector2i Terrain3DInstancer::get_cell(const Vector3 &p_global_position) {
	int region_size = _terrain->get_region_size();
	real_t vertex_spacing = _terrain->get_vertex_spacing();
	Vector2i cell;
	cell.x = (UtilityFunctions::floori(p_global_position.x / vertex_spacing) % region_size) / CELL_SIZE;
	cell.y = (UtilityFunctions::floori(p_global_position.z / vertex_spacing) % region_size) / CELL_SIZE;
	return cell;
}

I'm still just poking around the edges. I'm going to build up _instances simultaneously w/ _multimeshes so it doesn't break the old until I know the new works.

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 12, 2024

Thinking about it more, I think the transforms in each cell should be relative to the region still.

@TokisanGames
Copy link
Owner

The last commit is the first intrusive change to the data structure. It works a little bit, with a lot broken. I'm done for the day in case you want to review it or take a crack and it. I'll start looking again in 10-12 hours.

It works enough to paint foliage on the ground at the 0, 0 cell of any region. Outside of that first cell, the height starts to increase linearly. I'm accumulating height somewhere.

Undo sort of works, but doesn't fully remove everything, which means destroy and/or destroy_by_location aren't 100%.

That's about it. Not much else works. Lots of bugs.

If you click the Demo node, there is a dump_tree bool you can click to print the Terrain3D tree and see the MMIs.

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 13, 2024

Added a 3rd modified value to the tuple (making it a triple now i guess?) allows much more targeted updates, especially when there are a lot of cells & transforms present.

update vertex spacing and update transforms are now both working.

removes instances ive started on, but it has issues, only seems to work in region 0, and causes the entire region to clear / suddenly have blank arrays. Maybe need a different approach to it.

i didnt touch undo, some issues persist.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 13, 2024

When I loaded up my previous data it crashed when there was no tuple, so I fixed it. Also it was very slow to add. I removed, which cleared them all, then it stopped giving me so many problems by starting with new data.

Issues:

  • Foliage used to stay in it's region and not be allowed to create transforms outside it. Now you can add transforms outside your region. They may also be going outside their grid.

  • I'm crashing a lot on closing the editor. MSVC doesn't point to any of our code. It's likely in the allocation/deallocation.

  • I put the mesh_id at the end of the child and thought it was cool, but now I see it just auto increments when you move past the first region, so mesh id could be dropped, or go elsewhere in the name.

  • Sometimes I'm not able to reach some areas

After this I clicked around, changed vertex scaling a few times, then was able to paint here.

{BF510881-DC95-451E-B3C6-91366FB09811}
{4F022C24-64E9-430A-802F-9C69F0BAE054}

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 13, 2024

Before my comits last night, cell mmis were being translated to their cell positions, which would have required offsetting the transforms per cell rather than per region. Keeping transforms per region makes things easier all around.

However loading the current iteration with old data would mean it makes assumptions about the location of the mmis when updating.

For now i suggest we test with fresh data each comit until things are stable and everything is working. Then look at converting old data.

Remove instances was the last thing I looked at for about 30mins so its pretty broken atm. I have some thoughts on how to make it work correctly.

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 13, 2024

ed06faf does fully clear when removing, but causes crashes if adding afterwards?

added a destroy call to end of the vertex spacing update, was getting some serious duplication problems with the mmi themselves without it.

I think we need a better way to remove the mmi for a cell once its empty.

the underlying data seems OK now, just the handling of the MMIs needs some attention.

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 14, 2024

Things are a bit of a mess atm with the nested for loops. Vertex spacing is broken still for remove/update.

Rather than looping every cell, a similar lookup method to get_region_loc(), would make it possible to calculate the potential cell ids needed, and then check if they exist, bringing the worst case of 16k cells for 2k regions checked per mesh to only cells in the size of the area checked.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 14, 2024

I spent all day banging my head on the crash on close and don't have much to show for it, except that it is an engine bug.

It crashes in Resource::GDClass because Godot frees the GDExtension first, then asks a Resource what class it is. It has a non-null _extension pointer which triggers the conditions to trigger it.

I can't tell what class it is, other than a Resource. I've done extensive testing trying to tell, printing out instance IDs from all Objects and Resources we're creating, but haven't found anything yet.

  • One note is that after painting instances and saving. Closing crashes. But loading it up again, closing again doesn't crash. It only does if more instances are added. This suggests the trigger is in the add_instances chain rather than the load_region/update_mmi chain. Or maybe something to do in the replacement of the MMI data in update_mmi.

  • The problem sometimes occurs on main. I was able to reproduce it, then seemingly fix it, then I couldn't reproduce the original anymore. Since the problem object is a Resource, I attempted to unref the MultiMesh before deleting the MMI, which seemed to fix the crash on main, but later when testing undoing the fix, I couldn't get it to crash on main anymore. In the PR, I tried unreffing the MM before deleting and to the old one anytime a new one was created and assigned to the MM, but it doesn't help the PR.

Other issues:

  • Can't add to cells that have been reloaded from disk.

  • Even if I close the scene and say don't save, Godot often saves foliage inside the region files. It doesn't seem to save the sculpting, unless it crashes, then it could be either way.

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 15, 2024

outside of upgrading old data, everything should be working now, and needs testing.

there is a strange issue when repeatedly changing region size without saving between, that causes instancer data to just vanish for some regions.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 15, 2024

Lots of good stuff here. Things are coming along well. Most things are working including undo and storing data within the regions.

I put this is the current comprehensive list of issues up top.

@@ -573,6 +573,7 @@ void Terrain3DInstancer::remove_instances(const Vector3 &p_global_position, cons
MultiMeshInstance3D *mmi = cast_to<MultiMeshInstance3D>(mesh_mmis[cell]);
remove_from_tree(mmi);
memdelete_safely(mmi);
mesh_mmis.erase(cell);
Copy link
Owner

@TokisanGames TokisanGames Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could this fix a crash?

mesh_mmis stores v2i (mesh_id, lod)
Dictionary mesh_mmis = region_mmis[Vector2i(m, 0)];

cells are v2i(0-31, 0-31)
Vector2i cell = cell_queue[c]; -> c_locs.keys();

This line should do nothing, not finding the cell key in the mesh dict, most of the time, but might occasionally delete a valid mesh/lod if the cell matches eg around (0,0,).

I'm working on consolidating mmi deletion from update_transforms, remove_instances, and destroy_by_location into one function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without that line, I belive that the mesh_mmis dict would be left with a not-really-null refrence to an mmi that is queue free'd?

when update mmis is called, it would try to access the mmi that was just freed, as it would still be keyed.

I may be wrong.

having some set/get/check/delete functions for the data is 100% needed.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see my confusion. We are using different naming conventions.

In your code, you're doing this:
Dictionary region_mmis = _mmi_nodes[region_loc];
Whereas I call it this
Dictionary mesh_dict = _mmi_nodes[region_loc];

_mmi_nodes contains a dictionary of mesh dictionaries keyed by region location. ie, location-> mesh dictionary. So I think mesh_dict is more accurate. Whereas region_mmis or region_dict is what we might call _mmi_nodes itself; a dictionary of region dictionaries keyed by location.

On the next level, you have this:
Dictionary mesh_mmis = region_mmis[Vector2i(m, 0)];
Whereas I've written:
Dictionary cell_dict = mesh_dict[mesh_key];

Here we have a dictionary of cells keyed by mesh_id,lod. So I think cell_dict is a more accurate name.


Further, we are also using multiple nested dictionaries with similar but different structures to track both MMIs and triple[arrays].

Region::_instances{mesh_id:int} -> cell{v2i} -> [ TypedArray<Transform3D>, PackedColorArray, modified:bool ]
_mmi_nodes{region_loc} -> mesh{v2i(mesh_id,lod)} -> cell{v2i} -> MultiMeshInstance3D

I have written both of these in different areas:

Dictionary mesh_dict = region->get_instances();
Dictionary cell_dict = mesh_dict[mesh_key];

Dictionary mesh_dict = _mmi_nodes[region_loc];
Dictionary cell_dict = mesh_dict[mesh_key];

Clearly this is unclear when looking at code if mesh_dict or cell_dict are referring to the data set or the MMI set.

This is confusing to us and impossible for newcomers. So I propose we standardize on this naming convention for the dictionaries to differentiate the two in our variables:

// MM Resources stored in Terrain3DRegion::_instances as
// _instances{mesh_id:int} -> cell{v2i} -> [ TypedArray<Transform3D>, PackedColorArray, modified:bool ]

Dictionary mesh_inst_dict = region->get_instances();
Dictionary cell_inst_dict = mesh_inst_dict[mesh_id];
Array triple = cell_inst_dict[cell];
// MMI Objects attached to tree, freed in destructor, stored as
// _mmi_nodes{region_loc} -> mesh{v2i(mesh_id,lod)} -> cell{v2i} -> MultiMeshInstance3D

Dictionary mesh_mmi_dict = _mmi_nodes[region_loc];
Dictionary cell_mmi_dict = mesh_mmi_dict[mesh_key];
mmi = cell_mmi_dict[cell];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, hard agree on the naming convention.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 17, 2024

  • Crash on close seems nullified if instancer::_backup_region() isn't called. It will crash if there is some instance data, and it's backed up via instancer or editor. It could be that Godot is freeing Terrain named resources (eg Terrain3DRegion) from the undo data after unloading the GDExtension. I was able to get the crash on the main branch. I think the current setup has aggravated the situation. Maybe we can manually free the editor data on scene close or something.
  • Fixed: I found another crash. Add some instances, change vertex_spacing to 2, add more, reset to 1. If it doesn't crash then change spacing a bit more and it will crash in _destroy_mmi_by_location() calling _destroy_mmi_by_cell(). access violation. **cell_pair** was 0x10. Likely update_mmis is inserting a null somewhere that is triggered by set_vertex_spacing. I'm not sure how to check if the cell_pair iterator is null.
  • Fixed: Labels and MMI visibility do not turn off with the Terrain3D node.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 17, 2024

I have exposed visibility but it's weird. Defaults to 1024m for now. 0 is disabled. Issues:

  • The visibility range is calculated between the camera and the 0,0 corner of the region.
  • When +x, +z of the region origin, the whole mmi visibility fades out. When -x, -z instance visibility turns off instantly per cell. The ideal is each cell individually fades out.

Edit: Fixed range calculation by disabling visibility fade mode, and enabled fade in the material.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 18, 2024

  • Slope improvement is great.

  • Vertex spacing is really solid now.

  • Sparse removal doesn't work. It's not possible to thin out a large area with an even distribution. Instead, it produces hot spots in a grid shape. Moderately cover a large area, then try to evenly, lightly remove at 10%.
    image

My next steps will to look at the untouched functions like swap_id, get_multimeshes. Then I will explore why backup_regions causes the crash more, or if there's a way to clear the undo/redo manager before Terrain3D is freed.

@Xtarsia
Copy link
Contributor Author

Xtarsia commented Oct 18, 2024

Solving that removal problem is going to need some work

Density per cell needs reducing by the ratio of (cell area inside the radius of the ring : full cell area)

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 18, 2024

The original went:

  • Grab the transforms data
  • For the quota, if there's a Transform in the ring, drop it

This goes:

  • for each cell
    • Get the xforms (for that cell)
    • Remove the quota (from that cell)

What about:

  • for the quota
    • while something not removed or haven't reviewed each cell in circle (in a random order)
      • remove an instance within the circle

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 18, 2024

Remove_instances issues:

  • Fixed: If one selects CTRL+SHIFT to remove all and the region does not have all N mesh types in the asset dock, it will crash on the bottom line with an invalid index.
int mesh_count = _terrain->get_assets()->get_mesh_count();
for (int r = 0; r < region_queue.size(); r++) {
	Ref<Terrain3DRegion> region = data->get_region(region_loc);
	_backup_region(region); // Note: region hasn't changed yet and might not. Move lower
	Dictionary mesh_inst_dict = region->get_instances();
	Array mesh_types = mesh_inst_dict.keys();
	if (mesh_types.size() == 0) {
		continue;
	}
	// For this mesh id, or all mesh ids
	for (int m = (modifier_shift ? 0 : mesh_id); m <= (modifier_shift ? mesh_count - 1 : mesh_id); m++) {
		int mesh_id = mesh_types[m]; // Crash
  • Fixed: The above causes a crash even if CTRL+SHIFT isn't pressed. Just having multiple mesh types and erasing one from a region can cause a crash. Specifically I had 5 mesh types, I painted type C in a cell. Then I removed those instances, and after the last one, it crashed on the line above.

  • Fixed: If you have several mesh types (A-E) and have placed them all on the ground, then start removing one mesh type (A), once you finish removing all of the instances while still holding down the mouse, you can then start removing the next mesh type (B). You can remove all but 1 or so of that type. Keep scrubbing and then you can start removing type (C) and so on, and it will likely crash as above.

@TokisanGames
Copy link
Owner

Your last fix did pretty well to fix the crash and being able to remove other non-selected meshes. Continuing to test things went well until I CTRL+SHIFT removed a selection of 5 mesh types. As I got to the end of the instances this warning popped up many times for all of the mesh ids.
WARNING: Terrain3DInstancer#5838:remove_instances: Region at: (0, 0) has instance dictionary for mesh id: 4 but has no cells.

It did remove the instances properly. However running the dump_data function I see the mesh_inst_dicts are still present, though they have no cells. They need to be erased from the parent instances dict. Though I do see a check for it in the code, but apparently it's not happening (or being done to a copy).

I updated swap_ids and removed some other functions we don't need. We're getting close.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 19, 2024

More testing on the crash on close:

  • The object it's crashing on is a Terrain3DRegion. It's an engine bug, attempting to call is_class() on it after freeing the library.
  • It has no issue with the first three loaded demo regions. Upon creating backups the questionable instance is either the first (original duplicate) or second (redo duplicate) created.
  • I also figured out how to clear the undo history on exit but that doesn't solve the issue.
  • I nullified region::duplicate() and even renamed it in case it was calling the base class, but no change.
  • The crashing code is only supposed to run if --verbose is specified, in the section printing out leaked instances. If you run from the command line it shouldn't crash. However I run with the msvc debugger, and though I don't specify --verbose, it does trigger the specific code.
  • I've narrowed down the circumstance that initiate the crash down to a single line:
void Terrain3DEditor::_store_undo() {
	IS_INIT_COND_MESG(_terrain->get_plugin() == nullptr, "_terrain isn't initialized, returning", VOID);
	if (_tool < 0 || _tool >= TOOL_MAX) {
		return;
	}
	LOG(DEBUG, "Finalize undo & redo snapshots");
	Dictionary redo_data;
	// Store current locations; Original backed up in start_operation()
	redo_data["region_locations"] = _terrain->get_data()->get_region_locations().duplicate();
	// Store original and current backups of edited regions
	_undo_data["edited_regions"] = _original_regions; // ** causes latent crash **
	redo_data["edited_regions"] = _edited_regions;

I can return right before that line or comment it out and it won't crash. If I return right after or leave it uncommented it will crash.

  • _original_regions and _edited_regions are both TypedArray.
  • _undo_data and redo_data are both dictionaries, but the first one is a class member.

Alternatively, clearing the dictionary rather than reassigning it also prevents the crash. However it breaks our undo/redo. Still this suggests maybe the issue is in GDExtension instead of Godot.

void Terrain3DEditor::start_operation(const Vector3 &p_global_position) {
	IS_DATA_INIT_MESG("Terrain isn't initialized", VOID);
	LOG(INFO, "Setting up undo snapshot");
	_undo_data.clear();
	//_undo_data = Dictionary(); // New pointer instead of clear

I'm bug reporting now, then will attempt to figure out a solution to make a new data pointer so undo works without triggering the bug.

@TokisanGames
Copy link
Owner

TokisanGames commented Oct 23, 2024

Using the default values of 20m/33% I made the following shapes. Then I found I could not remove any at 20m/33%, 100%, nor 1000%.
{9A1E2CBB-CB47-4B45-84CC-D513993CF611}

I bumped it up to 100m/100% and found I could start to remove some instances if I centered the cursor in the red area so the edge of the circle was brushing the instances. It's still difficult to remove those that remain. It takes a lot of scrubbing on the edge of the ring, then it gradually removes more and more.

{986B0E34-D836-4734-8A7E-7275B64285D7}

I made a ven diagram of three types with 100m/100%. At 20m/33% full slope, I can remove the blue from the bottom, but not from the hill. I can remove the yellow along the bottom edge on the slope, but not on the hill. Orange I can remove from the lower area and slope, but not on the hill. At 100m/100% I can start removing most of the instances, though some areas get stuck as above.

{7F08419F-FCF9-40EF-989C-951B2281047E}

@TokisanGames TokisanGames force-pushed the instancer_testing branch 2 times, most recently from ecede92 to 92ba372 Compare October 24, 2024 15:29
@TokisanGames TokisanGames merged commit 4088dbc into TokisanGames:main Oct 24, 2024
13 checks passed
@Xtarsia Xtarsia deleted the instancer_testing branch October 28, 2024 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Vertex spacing >= 2 issues
2 participants