Trouble with env initialization, can't access Times Trials properly ? #98

Open
AdrianValente13 opened this issue Apr 8, 2023 · 6 comments

@AdrianValente13

Hello!
I am having trouble with the gym-mupen64plus environment and I don't know what to do.
When I start my project in a Docker container, the game does not enter Time Trials mode after the env initialization.
When I tested it with my agent script, it often left me in the menu, and my training code then ran there, which gives random behavior, as this video shows:
https://youtu.be/khSQw54nHMc

I also tried running the example script (in case the problem is in my code), but it always takes me into Grand Prix and the HUD is incorrect :/ as this video shows:
https://youtu.be/phmREvLZGM0

I have the right version of Gym (0.7.4) and the other libraries (I used the Dockerfile of this repo as a base), and my Dockerfile sets up Python 3.7.9.

What I wonder is... where is the problem? It seems the env initialization (the button presses used to navigate through the menu) doesn't work properly for me. Is this a frame-timing problem? Judging by the container logs in the videos, it looks like the script sends the button presses too fast for the game.
Thanks in advance for your responses, have a nice day!

@bzier
Owner

bzier commented Apr 23, 2023

Hi @AdrianValente13, apologies for the delayed response. I have not actively worked on this project in a few years. The behavior is odd, and your theory is basically what I was thinking too. It seems that the button presses are happening too quickly. The way it is supposed to work is that it waits a certain number of frames for the emulator to step through to the point where the menu begins to 'listen' for button presses, and then sends the appropriate controller buttons to do the navigation.

I wonder if the ROM you have is different. Can you confirm the MarioKart64 ROM file has an MD5 checksum of e19398a0fd1cc12df64fca7fbcaa82cc?
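For anyone without a dedicated checksum tool, the MD5 can be computed with Python's standard `hashlib` module. This is a minimal sketch: the function names are illustrative assumptions, and only the expected checksum comes from the comment above.

```python
import hashlib

# Expected checksum as quoted in this thread
EXPECTED_MD5 = "e19398a0fd1cc12df64fca7fbcaa82cc"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(actual, expected=EXPECTED_MD5):
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return actual.strip().lower() == expected.lower()
```

Calling `checksum_matches(md5_of_file(rom_path))` with the path to your own ROM file (a placeholder here) reports whether it matches. The comparison ignores case because some tools print the digest in uppercase.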

Also, if I remember correctly, I only ever had this running on Python2 and never got it upgraded to Python3 (see #81). Perhaps there is some issue there causing the frame synchronization issue.

Otherwise, I wonder if the mupen64plus emulator may have been upgraded and the number of frames has changed in some way. Is this the Dockerfile you are using? I notice that you have a different base Ubuntu image, which could install a different version of mupen64plus.

Off the top of my head, those are the three things that come to mind. I can't commit to spending any time on this right now, but if I happen to get the chance to try it, I will certainly update you.

@bzier
Owner

bzier commented Apr 23, 2023

For reference, here's the code that handles the menu navigation:

def _navigate_menu(self):
    self._wait(count=10, wait_for='Nintendo screen')
    self._press_button(ControllerState.A_BUTTON)
    self._wait(count=68, wait_for='Mario Kart splash screen')
    self._press_button(ControllerState.A_BUTTON)
    self._wait(count=68, wait_for='Game Select screen')
    self._navigate_game_select()
    self._wait(count=14, wait_for='Player Select screen')
    self._navigate_player_select()
    self._wait(count=31, wait_for='Map Select screen')
    self._navigate_map_select()
    self._wait(count=46, wait_for='race to load')
    # Change HUD View twice to get to the one we want:
    self._cycle_hud_view(times=2)
    # Now that we have the HUD as needed, reset the race so we have a consistent starting frame:
    self._reset_during_race()

and the game select navigation (grand prix vs time trials):

def _navigate_game_select(self):
    # Select number of players (1 player highlighted by default)
    self._press_button(ControllerState.A_BUTTON)
    self._wait(count=3, wait_for='animation')
    # Select GrandPrix or TimeTrials (GrandPrix highlighted by default - down to switch to TimeTrials)
    self._press_button(ControllerState.JOYSTICK_DOWN)
    self._wait(count=3, wait_for='animation')
    # Select TimeTrials
    self._press_button(ControllerState.A_BUTTON)
    # Select Begin
    self._press_button(ControllerState.A_BUTTON)
    # Press OK
    self._press_button(ControllerState.A_BUTTON)

There are a couple other methods like those that handle other aspects (player & map selection), all nearby in that file.
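To illustrate why the fixed frame counts matter, here is a toy model, not code from gym-mupen64plus: `ToyMenu` and `wait_then_press` are hypothetical stand-ins. It assumes a menu screen that silently drops any button press arriving before it starts listening.

```python
class ToyMenu:
    """Hypothetical menu screen that ignores input until a given frame."""

    def __init__(self, ready_at_frame):
        self.ready_at_frame = ready_at_frame
        self.frame = 0
        self.accepted = []

    def step(self, button=None):
        # A press only registers once the screen is listening for input
        if button is not None and self.frame >= self.ready_at_frame:
            self.accepted.append(button)
        self.frame += 1


def wait_then_press(menu, count, button):
    """Advance `count` idle frames, then send one button press."""
    for _ in range(count):
        menu.step()
    menu.step(button)
    return menu.accepted


# Waiting long enough: the press lands once the menu is listening
on_time = wait_then_press(ToyMenu(ready_at_frame=10), count=10, button="A")
# Waiting too few frames: the press is dropped and navigation falls out of step
too_early = wait_then_press(ToyMenu(ready_at_frame=10), count=5, button="A")
```

In this model a single dropped press shifts every later step of the scripted navigation, which would be consistent with ending up on the wrong menu entry.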

@bzier
Owner

bzier commented Apr 23, 2023

Rewatching your video using the "example script", it appears to me that you have the 4 components running in docker-compose, and that the emulator is up already (mario-kart-agent_emulator_1), where you are able to see it with the VNC viewer. However, when you open the agent container's logs, it looks like it is also attempting to start an embedded emulator. It should be respecting the env var - EXTERNAL_EMULATOR=True (see here) and not running its own emulator.

I suspect you actually have two emulators running (one in its own container, and one embedded in the agent), both connected to the same controller server, and both requesting the controls. The controller server (i.e. the agent) believes it has progressed through the frames because two emulators have both asked for the state of the controller. In effect, it progresses through the frames at twice the normal rate, and each emulator receives only half of the controller button presses it needs.
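The effect described above can be sketched with a toy simulation. This is an illustrative model only, assuming the server hands out the next scripted input on every poll regardless of who asks; `simulate` and the input names are hypothetical.

```python
from collections import defaultdict


def simulate(script, num_emulators):
    """Toy model: each poll from any emulator consumes the next scripted input."""
    seen = defaultdict(list)
    step = 0
    while step < len(script):
        for emu in range(num_emulators):
            if step >= len(script):
                break
            seen[emu].append(script[step])
            step += 1
    return dict(seen)


script = ["A", "wait", "A", "wait", "DOWN", "A"]
one = simulate(script, num_emulators=1)
two = simulate(script, num_emulators=2)
```

With one emulator, `one[0]` is the full script. With two, emulator 0 sees only `["A", "A", "DOWN"]` and emulator 1 sees `["wait", "wait", "A"]`, so neither receives the complete navigation sequence.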

Here's the code that checks the env var and chooses to start the embedded emulator or not:

# If the EXTERNAL_EMULATOR environment variable is True, we are running the
# emulator out-of-process (likely via docker/docker-compose). If not, we need
# to start the emulator in-process here
external_emulator = os.environ.has_key("EXTERNAL_EMULATOR") and os.environ["EXTERNAL_EMULATOR"] == 'True'
if not external_emulator:
    self.xvfb_process, self.emulator_process = \
        self._start_emulator(rom_name=self.config['ROM_NAME'],
                             gfx_plugin=self.config['GFX_PLUGIN'],
                             input_driver_path=self.config['INPUT_DRIVER_PATH'])

I would suggest troubleshooting why the agent is starting its own emulator. If you can ensure only one instance is started, it should work better 🤞
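One thing worth checking while troubleshooting: `has_key` exists only on Python 2 mappings, so the check quoted above raises `AttributeError` under Python 3 unless it has been ported. A minimal sketch of an equivalent check that runs on both versions follows; the helper name `use_external_emulator` is hypothetical and not part of gym-mupen64plus.

```python
import os


def use_external_emulator(environ=None):
    """Return True only if EXTERNAL_EMULATOR is set to the exact string 'True'."""
    environ = os.environ if environ is None else environ
    # .get() returns None when the variable is unset, so no has_key() is needed
    return environ.get("EXTERNAL_EMULATOR") == "True"
```

As in the original check, the comparison is exact: values such as `true` or `1` do not count, and the embedded emulator would still be started.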

@AdrianValente13
Author

AdrianValente13 commented Apr 23, 2023

Hello @bzier, thank you for your responses :D !

- Indeed, I checked the MD5 (with Mupen64++, I have no other tool to check it) and the hash is different: 3A67D9986F54EB282924FCA4CD5F6DFF
After posting my message here, I re-downloaded the base repo here (to have a clean copy) to retest example.py and... even with my ROM, it seems to work! I noticed that the problem I posted here appears when I add some imports and code (which maybe desyncs the menu navigation?). But yes, I should use a proper ROM in the first place.

- Yes, I use the Dockerfile you linked! It was based on one I found in this repo, but... if the base image is different, maybe it is adapted to a newer Python version?

- Thanks for the references about the navigation. I tried editing some of the values there, but it did not consistently solve the problem :')

So the "solution" I have found for now is to generate a savestate file with Mupen64plus once the game has loaded the race I want (done outside the project, since I have no code to generate a savestate and there is none in this repo), and to load that savestate in the agent container (in docker-compose), like:
'--savestate','/src/gym_mupen64plus/gym_mupen64plus/ROMs/Mario Kart 64 (U) [!].st0',
It worked, but sadly that is not the main problem I have...

About the multiple emulators: yes, maybe I have a problem there, I'll check!

My initial goal was to implement an A3C agent with this structure. My first attempt was to create multiple agents in code (a for loop, multiple processes with PyTorch), but... when there are multiple gym-mupen64plus environments, there are problems with ports and IP addresses. I didn't know how to adapt... Is there a way to make sure the port and IP are different for every server that gets initialized?

So what I tried next was to build a multi-agent structure with Docker, like here (one container holding the master agent, and other agent containers that I can start with --scale).
But... since the A3C algorithm works with shared memory (for the global model and optimizer), I communicate between containers through files written to a shared disk, reading them when necessary. However, it doesn't seem to work well: my results are not good, and I'm not sure whether the different ControllerServers are trying to communicate with the same emulator or not.

So yes, I think my main problem is with multiple instances of the env; maybe there is something I'm missing about that.
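On the port question, a general, project-independent technique is to bind to port 0 so the operating system picks an unused port. The sketch below uses only the standard `socket` module; `reserve_free_ports` is a hypothetical helper, and how such a port would be passed to the gym-mupen64plus controller server is not covered in this thread.

```python
import socket


def reserve_free_ports(count, host="127.0.0.1"):
    """Ask the OS for `count` distinct unused TCP ports on `host`."""
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind((host, 0))  # port 0 means "let the OS choose"
            socks.append(s)
        # All sockets are still open here, so the chosen ports are distinct
        return [s.getsockname()[1] for s in socks]
    finally:
        for s in socks:
            s.close()


ports = reserve_free_ports(3)
```

The ports are released before they are returned, so another process could in principle claim one in the meantime; this is a best-effort approach rather than a guarantee.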

@bzier
Copy link
Owner

bzier commented Apr 30, 2023

This was how I had done A3C training on Mario Kart years ago: mario-kart-agent. (I'm not sure what condition the repo is in. I had forked the openai/universe-starter-agent repo and that one was deprecated a while ago.)

It implemented the A3C algorithm, and used multiple worker processes to train. I dockerized them so each worker could run in its own container. Importantly, each one is not using EXTERNAL_EMULATOR, meaning that each one starts its own emulator inside its container - along with its own controller server, Xvfb virtual screen, etc. In that way, all the workers are independent and there is no conflict with IPs or ports. The docker-compose file defines the parameter server container, and the worker container (which can be scaled). Other than the interaction with the parameter server, and participation in the same Tensorflow cluster, the environments are independent.

If you want to just run a single agent, the docker-compose file in this repo is a clean way to do that and to separate the different processes. However, if you want to run multiple independent agents (i.e. not multiplayer, but multiple separate emulators), it is probably easiest to leave it embedded without using EXTERNAL_EMULATOR, and not running an emulator container.

Hope that is somewhat helpful. Let me know if I can clarify any of that further.

@AdrianValente13
Author

Oh I see! Thank you very much for your message! :D
I had seen your repo, but it's clearer with your explanation. I think I'm going to read your code further, and I'll let you know if I have questions about it.
