Some simulations seem to crash occasionally #15

Open
MarshallAsch opened this issue Aug 4, 2021 · 7 comments

@MarshallAsch
Owner

MarshallAsch commented Aug 4, 2021

There seems to be a memory error somewhere in the Ipv4Interface stack that causes simulations to crash occasionally. The source must be identified to make sure that whatever causes these crashes is not also making other results incorrect.

This was run on commit 4d5a7d8 and took over 15 hours of running before the error came up. Valgrind stack trace:

==410235== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==410235== Command: /home/marshallasch/Documents/ns3_stuff/ns-3-allinone/ns-3.32/build/optimized/contrib/rhpman/examples/ns3.32-rhpman-example-optimized --runTime=2400 --waitTime=600 --lookupTime=30 --updateTime=120 --dataSize=512 --profileUpdateDelay=6 --totalNodes=200 --storageSpace=200 --bufferSpace=200 --wcdc=0.5 --wcol=0.5 --carryingThreshold=0.6 --forwardingThreshold=0.3 --hops=2 --replicationHops=4 --percentDataOwners=10 --areaWidth=1000 --areaLength=1000 --gridRows=4 --gridCols=4 --wifiRadius=100 --partitionNodes=8 --travellerVelocity=20 --travellerWalkMode=time --travellerWalkTime=100 --pbnVelocityMin=1 --pbnVelocityMax=10 --pbnVelocityChangeAfter=100 --routing=dsdv --travellerWalkDist=0 --RngRun=951
==410235==
==410235== Invalid read of size 1
==410235==    at 0x53C2760: ns3::Ipv4Interface::IsUp() const (ipv4-interface.cc:174)
==410235==    by 0x53C9974: ns3::Ipv4L3Protocol::SendRealOut(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet>, ns3::Ipv4Header const&) (ipv4-l3-protocol.cc:989)
==410235==    by 0x53CACAD: ns3::Ipv4L3Protocol::IpForward(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&) (ipv4-l3-protocol.cc:1084)
==410235==    by 0x53D83BB: ns3::MemPtrCallbackImpl<ns3::Ipv4L3Protocol*, void (ns3::Ipv4L3Protocol::*)(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&), void, ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&) (callback.h:646)
==410235==    by 0x4922140: operator() (callback.h:1430)
==410235==    by 0x4922140: ns3::dsdv::RoutingProtocol::SendPacketFromQueue(ns3::Ipv4Address, ns3::Ptr<ns3::Ipv4Route>) (dsdv-routing-protocol.cc:1175)
==410235==    by 0x49228C6: ns3::dsdv::RoutingProtocol::LookForQueuedPackets() (dsdv-routing-protocol.cc:1148)
==410235==    by 0x49234A7: ns3::dsdv::RoutingProtocol::RouteOutput(ns3::Ptr<ns3::Packet>, ns3::Ipv4Header const&, ns3::Ptr<ns3::NetDevice>, ns3::Socket::SocketErrno&) (dsdv-routing-protocol.cc:307)
==410235==    by 0x540B668: ns3::UdpSocketImpl::DoSendTo(ns3::Ptr<ns3::Packet>, ns3::Ipv4Address, unsigned short, unsigned char) (udp-socket-impl.cc:621)
==410235==    by 0x540D38F: ns3::UdpSocketImpl::SendTo(ns3::Ptr<ns3::Packet>, unsigned int, ns3::Address const&) (udp-socket-impl.cc:810)
==410235==    by 0x4928BF5: ns3::dsdv::RoutingProtocol::SendTriggeredUpdate() (dsdv-routing-protocol.cc:861)
==410235==    by 0x6223329: ns3::DefaultSimulatorImpl::ProcessOneEvent() (default-simulator-impl.cc:151)
==410235==    by 0x622337D: ns3::DefaultSimulatorImpl::Run() (default-simulator-impl.cc:204)
==410235==  Address 0x1c is not stack'd, malloc'd or (recently) free'd
==410235==
==410235==
==410235== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==410235==  Access not within mapped region at address 0x1C
==410235==    at 0x53C2760: ns3::Ipv4Interface::IsUp() const (ipv4-interface.cc:174)
==410235==    by 0x53C9974: ns3::Ipv4L3Protocol::SendRealOut(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet>, ns3::Ipv4Header const&) (ipv4-l3-protocol.cc:989)
==410235==    by 0x53CACAD: ns3::Ipv4L3Protocol::IpForward(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&) (ipv4-l3-protocol.cc:1084)
==410235==    by 0x53D83BB: ns3::MemPtrCallbackImpl<ns3::Ipv4L3Protocol*, void (ns3::Ipv4L3Protocol::*)(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&), void, ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()(ns3::Ptr<ns3::Ipv4Route>, ns3::Ptr<ns3::Packet const>, ns3::Ipv4Header const&) (callback.h:646)
==410235==    by 0x4922140: operator() (callback.h:1430)
==410235==    by 0x4922140: ns3::dsdv::RoutingProtocol::SendPacketFromQueue(ns3::Ipv4Address, ns3::Ptr<ns3::Ipv4Route>) (dsdv-routing-protocol.cc:1175)
==410235==    by 0x49228C6: ns3::dsdv::RoutingProtocol::LookForQueuedPackets() (dsdv-routing-protocol.cc:1148)
==410235==    by 0x49234A7: ns3::dsdv::RoutingProtocol::RouteOutput(ns3::Ptr<ns3::Packet>, ns3::Ipv4Header const&, ns3::Ptr<ns3::NetDevice>, ns3::Socket::SocketErrno&) (dsdv-routing-protocol.cc:307)
==410235==    by 0x540B668: ns3::UdpSocketImpl::DoSendTo(ns3::Ptr<ns3::Packet>, ns3::Ipv4Address, unsigned short, unsigned char) (udp-socket-impl.cc:621)
==410235==    by 0x540D38F: ns3::UdpSocketImpl::SendTo(ns3::Ptr<ns3::Packet>, unsigned int, ns3::Address const&) (udp-socket-impl.cc:810)
==410235==    by 0x4928BF5: ns3::dsdv::RoutingProtocol::SendTriggeredUpdate() (dsdv-routing-protocol.cc:861)
==410235==    by 0x6223329: ns3::DefaultSimulatorImpl::ProcessOneEvent() (default-simulator-impl.cc:151)
==410235==    by 0x622337D: ns3::DefaultSimulatorImpl::Run() (default-simulator-impl.cc:204)
==410235==  If you believe this happened as a result of a stack
==410235==  overflow in your program's main thread (unlikely but
==410235==  possible), you can try to increase the size of the
==410235==  main thread stack using the --main-stacksize= flag.
==410235==  The main thread stack size used in this run was 8388608.
==410235==
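
One possible reading of the trace, offered as a guess rather than a diagnosis: the faulting address 0x1C is a very small offset from zero, which is the usual pattern when a member function reads a field through a null object pointer. In that case the faulting address equals the field's offset inside the object. The sketch below uses a hypothetical `FakeInterface` struct (it does not mirror the real `ns3::Ipv4Interface` layout, which is not shown in this issue) whose flag is placed at offset 0x1C on purpose, plus a hypothetical `SafeIsUp` guard that checks for null before reading.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an interface object. The padding is chosen only
// so that the flag lands at offset 0x1C; this is an assumption for
// illustration, not the layout of ns3::Ipv4Interface.
struct FakeInterface {
    unsigned char padding[0x1C];
    bool ifup;
};

static_assert(offsetof(FakeInterface, ifup) == 0x1C,
              "flag is expected at offset 0x1C in this sketch");

// Address that a read of `ifup` would touch if the object lived at `base`.
inline std::uintptr_t FaultAddress(std::uintptr_t base) {
    return base + offsetof(FakeInterface, ifup);
}

// Hypothetical guard: treat a missing interface as "down" instead of
// dereferencing a null pointer.
inline bool SafeIsUp(const FakeInterface* iface) {
    return iface != nullptr && iface->ifup;
}

// Small self-check; call it from main() when trying the sketch out.
inline void SelfCheck() {
    assert(FaultAddress(0) == 0x1C);  // null base gives the address in the trace
    assert(!SafeIsUp(nullptr));       // the guard never reads through null
}
```

If this reading is right, the interface pointer used in `SendRealOut` was null when the queued packet was forwarded, so the place to look would be how the route's output device maps to an interface at that moment.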
@MarshallAsch
Owner Author

This seems to happen only in the larger simulations. It has not happened in the relatively small number of simulations I have run since this came up, which were run after some unneeded memory usage was removed.

@MarshallAsch
Owner Author

Never mind, this seemed to happen again on commit 0e8927b

@compscidr
Collaborator

Might be helpful if you can print all of the parameters (including the particular seed for that run) at the start of that run so that you can cherry pick those specific parameters and reproduce it.
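
A minimal sketch of that suggestion, assuming the example program keeps its parameters as name/value string pairs: the helper below builds a `--name=value` string that can be logged at the start of a run and pasted back onto the command line later. `RunParameters` and `FormatRunParameters` are hypothetical names, not part of ns-3 or the rhpman code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parameter name/value pairs, kept in the order they should be printed.
using RunParameters = std::vector<std::pair<std::string, std::string>>;

// Build a "--name=value --name=value ..." string for logging and re-running.
inline std::string FormatRunParameters(const RunParameters& params) {
    std::ostringstream out;
    bool first = true;
    for (const auto& param : params) {
        assert(!param.first.empty());  // a parameter needs a name to be replayable
        if (!first) {
            out << ' ';
        }
        out << "--" << param.first << '=' << param.second;
        first = false;
    }
    return out.str();
}
```

Called with the pairs (totalNodes, 200), (routing, dsdv), and (RngRun, 951) taken from the command above, it returns `--totalNodes=200 --routing=dsdv --RngRun=951`. Including `RngRun` in the list is what makes a crashing run repeatable.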

@MarshallAsch
Owner Author

> Might be helpful if you can print all of the parameters (including the particular seed for that run) at the start of that run so that you can cherry pick those specific parameters and reproduce it.

The SEM tool captures all of the parameters used and the exit code from the simulation, so I can go back and rerun the failed ones with all the same parameters. So far that has been more effective than when I printed out the parameters manually.

@compscidr
Collaborator

Sure, makes sense. Maybe attach them to the issue then, in case they are relevant for reproducing.

Just to remind me: the SEM tool is the thing you use to organize the distribution of all of the simulations, right? Are there instructions for how to do this in the repo somewhere?

@MarshallAsch
Owner Author

I'll add some of the commands to the issue after lunch.

And yes, SEM is the tool. I can send the scripts that I have been using to run the simulations.

@MarshallAsch
Owner Author

Linked to this ns3 issue https://gitlab.com/nsnam/ns-3-dev/-/issues/503
