-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
freeze when hdf5_type_id
on self-referential datatype
#1127
Comments
I'm trying to figure out if HDF5 compound types support self-referential types like this. |
When I had a short look, I couldn’t find it. If it doesn’t, one approach would be to wrap it in a variable length array type that stores all referenced copies together with indices that show their relation. |
I just tried this patch and this does not seem to work either: diff --git a/src/typeconversions.jl b/src/typeconversions.jl
index 1e82c659..a3f3a6b3 100644
--- a/src/typeconversions.jl
+++ b/src/typeconversions.jl
@@ -73,7 +73,8 @@ function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
for (idx, fn) in enumerate(fieldnames(T))
ftype = fieldtype(T, idx)
- API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), hdf5_type_id(ftype))
+ _hdf5_type_id = ftype == T ? dtype : hdf5_type_id(ftype)
+ API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
end
return dtype
end I get the following: julia> using HDF5
julia> struct A
x::A
end
julia> HDF5.hdf5_type_id(A)
ERROR: HDF5.API.H5Error: Error adding field x to compound datatype
libhdf5 Stacktrace:
[1] H5Tinsert: Invalid arguments to routine/Bad value
can't insert compound datatype within itself |
|
Is this even supported by Julia? How do you construct an instance of |
ah, you have to do it via inner constructors... the problem is that you end up with an |
For a mutable struct, this just turns into a bunch of pointers. To store this kind of structure in HDF5 I think we would need some kind of analog to pointers. Either we store the array index to what is being pointed at or we use references some how. That said I'm not sure if I can think of a way to do that generically. |
Actually, I think there are non-obvious ways to create multi-type self-referential cycles that still freeze the patched mutable struct B
x::Any
end
struct A
x::B
end
a = A(B(nothing))
a.x.x=a So to catch everything, the types encountered so far all need to be kept track of. Serializing this kind of stuff generically seems quite challenging. Maybe it is possible using the HDF5-native reference system? |
One solution would be to only support |
I think this would be fine if it errored instead of hung. I think we need a type id cache to break cycles. |
Self-referential structs are not |
My current solution is the following. import HDF5: hdf5_type_id, API
function hdf5_type_id(::Type{T}, isstruct::Val{true}) where {T}
cache = try
task_local_storage(:hdf5_type_id_cache)::Dict{DataType,Int}
catch
task_local_storage(:hdf5_type_id_cache, Dict{DataType, Int}())
end
if haskey(cache, T)
error("Cannot create a HDF5 datatype with fields containing that datatype.")
end
dtype = API.h5t_create(API.H5T_COMPOUND, sizeof(T))
cache[T] = dtype
try
for (idx, fn) in enumerate(fieldnames(T))
ftype = fieldtype(T, idx)
_hdf5_type_id = hdf5_type_id(ftype)
API.h5t_insert(dtype, Symbol(fn), fieldoffset(T, idx), _hdf5_type_id)
end
catch err
rethrow(err)
finally
delete!(cache, T)
end
return dtype
end |
julia> struct C{A}
x::A
end
julia> struct B{A}
x::C{A}
end
julia> struct A
x::B{A}
end
julia> HDF5.hdf5_type_id(A)
ERROR: Cannot create a HDF5 datatype with fields containing that datatype.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
@ Main ./REPL[3]:8
[3] hdf5_type_id
@ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
[4] hdf5_type_id(#unused#::Type{C{A}}, isstruct::Val{true})
@ Main ./REPL[3]:15
[5] hdf5_type_id
@ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
[6] hdf5_type_id(#unused#::Type{B{A}}, isstruct::Val{true})
@ Main ./REPL[3]:15
[7] hdf5_type_id
@ ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71 [inlined]
[8] hdf5_type_id(#unused#::Type{A}, isstruct::Val{true})
@ Main ./REPL[3]:15
[9] hdf5_type_id(#unused#::Type{A})
@ HDF5 ~/.julia/packages/HDF5/aiZLs/src/typeconversions.jl:71
[10] top-level scope
@ REPL[13]:1 |
Any feedback on the above? |
It seems pretty bullet proof to me. I’m not an expert on |
Can we even write non- |
A mutable struct is not a bitstype. I'm not sure if we can write them yet, but I do think it would be possible in the future for us to write some simple mutable structs. If a mutable struct contains all bitstypes, then I think we can map it to a corresponding julia> struct Foo
x::Int
end
julia> mutable struct Bar
x::Int
end
julia> HDF5.Datatype(HDF5.hdf5_type_id(Foo))
HDF5.Datatype: H5T_COMPOUND {
H5T_STD_I64LE "x" : 0;
}
julia> HDF5.Datatype(HDF5.hdf5_type_id(Bar))
HDF5.Datatype: H5T_COMPOUND {
H5T_STD_I64LE "x" : 0;
}
julia> isbitstype(Foo)
true
julia> isbitstype(Bar)
false
julia> to_namedtuple(x::T) where T = NamedTuple{fieldnames(T)}(ntuple(i->getfield(x,i), fieldcount(T)))
to_namedtuple (generic function with 1 method)
julia> to_namedtuple(Bar(5))
(x = 5,) |
For example
For a recursive type, the function will keep calling itself, leading to an infinite loop. I am not a specialist, but I think what Julia does internally is to store the field
x
as a reference even thoughA
is immutable.As far as I know, there can be no self-reference cycles across multiple structs, so it is easy to detect this within a single call of
hdf5_type_id
. Maybe for now it is okay to just throw an error and fail gracefully in that case. I can implement that if wanted.One practical example for this problem appearing is when trying to save
Measurement
fromMeasurements.jl
. It contains a field ofMeasurements.Derivatives
which is self-referential.The text was updated successfully, but these errors were encountered: