Guard against undefined partition worker pid
I've observed a data race when starting a consumer from within another
consumer (using co-partitioned topics), where the call to
`get_partition_worker` fails due to badarg in `is_process_alive`.

It seems that 1f2290b ("Verify partition worker process is alive.",
2022-05-19) already tried to resolve the data race, but did not consider
the possibility that the lookup returns `undefined`.

Example exit report from Elixir logger:
```
Last message: {:EXIT, #PID<0.2613.0>, {:badarg, [{:erlang, :is_process_alive, [:undefined], [error_info: %{module: :erl_erts_errors}]}, {:brod_client, :get_partition_worker, 2, [file: ~c"/app/deps/brod/src/brod_client.erl", line: 496]}, ...MyCallChain]}}
```
urmastalimaa committed Jun 14, 2024
1 parent aba0511 commit c7ec58d
Showing 1 changed file with 1 addition and 1 deletion: src/brod_client.erl
@@ -493,7 +493,7 @@ get_partition_worker(ClientId, Key) when is_atom(ClientId) ->
%% If the worker process is returned form ets,
%% but it is not alive then there must be
%% an in-flight worker deregistration request.
- case is_process_alive(Pid) of
+ case is_pid(Pid) and is_process_alive(Pid) of
true -> {ok, Pid};
false -> get_partition_worker_with_ets(ClientId, Key)
end;
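
The guarded liveness check can be tried in isolation with a minimal sketch like the one below. The module `worker_guard_sketch` and the functions `is_live_worker/1` and `demo/0` are hypothetical names chosen for illustration and are not part of brod. One detail worth keeping in mind: Erlang's `and` evaluates both operands, while `andalso` short-circuits. The sketch uses `andalso` so that `is_process_alive/1` is never called with a non-pid term such as `undefined`.

```erlang
-module(worker_guard_sketch).
-export([is_live_worker/1, demo/0]).

%% Hypothetical helper, not part of brod.
%% Returns true only when the argument is a pid of a live local process.
%% `andalso` short-circuits, so is_process_alive/1 is skipped for
%% non-pid terms such as `undefined`.
-spec is_live_worker(term()) -> boolean().
is_live_worker(Pid) ->
  is_pid(Pid) andalso is_process_alive(Pid).

%% Checks a live pid, a terminated pid, and `undefined`.
demo() ->
  Live = self(),
  %% Spawn a process that exits immediately and wait for its 'DOWN'
  %% message, so that it has terminated before we check it.
  {Dead, Ref} = spawn_monitor(fun() -> ok end),
  receive
    {'DOWN', Ref, process, Dead, _Reason} -> ok
  end,
  [is_live_worker(Live), is_live_worker(Dead), is_live_worker(undefined)].
```

Calling `worker_guard_sketch:demo()` returns `[true, false, false]`: only the calling process counts as a live worker, and the `undefined` case returns `false` instead of raising `badarg`.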