BrokenPipeError #987

Open
johan-sightic opened this issue Nov 22, 2022 · 1 comment
@johan-sightic

The problem:
I'm trying to run extract_features on my data but keep getting a BrokenPipeError. I tried it on two different computers (both with the same environment) and got the same error. The dataset is quite large (merged DataFrame shape: (880169, 522)), so the run is expected to take 20 hours. It runs for a few hours and then crashes.

Settings:

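from tsfresh import extract_features
from tsfresh.feature_extraction import ComprehensiveFCParameters
from tsfresh.utilities.dataframe_functions import impute

features = extract_features(
    merged_time_series,
    column_id="id",
    default_fc_parameters=ComprehensiveFCParameters(),
    n_jobs=15,
    impute_function=impute,
)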

Error (repeated many times):

Process ForkPoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 131, in worker
    put((job, i, result))
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 377, in put
    self._writer.send_bytes(obj)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

Anything else we need to know?:
I also tried running it with a smaller chunksize and fewer jobs, but with no change.

features = extract_features(
    merged_time_series,
    column_id="id",
    default_fc_parameters=ComprehensiveFCParameters(),
    n_jobs=8,
    impute_function=impute,
    chunksize=1,
)

Environment:

  • Python version: 3.10
  • Operating System: Ubuntu 22.04
  • tsfresh version: 0.19.0
  • Install method (conda, pip, source): pip
@nils-braun
Collaborator

Hi @johan-sightic!
Thanks for filing the issue and sorry for the long delay!
A "Broken pipe" error typically means that the worker processes have been killed by the OS for some reason. Most likely, this is due to memory issues.
If your data consists of multiple IDs, I would recommend producing the features in chunks of identifiers.
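A minimal sketch of what processing in chunks of identifiers could look like, reusing the settings from the original post. The helper names (iter_id_chunks, extract_in_id_chunks), the chunk size, and the output directory are hypothetical and not part of tsfresh. Each chunk's result is written to disk, so a crash does not lose the chunks that already finished.

```python
from pathlib import Path


def iter_id_chunks(ids, chunk_size):
    """Yield successive lists of at most chunk_size unique identifiers."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    unique_ids = list(dict.fromkeys(ids))  # de-duplicate, keep first-seen order
    for start in range(0, len(unique_ids), chunk_size):
        yield unique_ids[start:start + chunk_size]


def extract_in_id_chunks(df, column_id="id", chunk_size=100,
                         out_dir="feature_chunks", n_jobs=8):
    """Extract features one chunk of ids at a time, saving each result to disk."""
    # Imported lazily so the chunking helper above works without tsfresh installed.
    from tsfresh import extract_features
    from tsfresh.feature_extraction import ComprehensiveFCParameters
    from tsfresh.utilities.dataframe_functions import impute

    out_path = Path(out_dir)  # hypothetical output location
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for n, id_chunk in enumerate(iter_id_chunks(df[column_id], chunk_size)):
        target = out_path / f"features_{n:05d}.pkl"
        if not target.exists():  # skip chunks completed by an earlier run
            subset = df[df[column_id].isin(id_chunk)]
            features = extract_features(
                subset,
                column_id=column_id,
                default_fc_parameters=ComprehensiveFCParameters(),
                n_jobs=n_jobs,
                impute_function=impute,
            )
            # Write to a temporary name first so a crash mid-write
            # never leaves a truncated file under the final name.
            tmp = target.with_suffix(".tmp")
            features.to_pickle(tmp)
            tmp.replace(target)
        written.append(target)
    return written
```

The per-chunk files can afterwards be combined with pandas.concat over pandas.read_pickle. One caveat: impute fills missing and infinite values from column statistics, so applying it per chunk can give slightly different fill values than a single run over all ids. If that matters, one option is to drop impute_function here and impute once after concatenating.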
If your data consists of just a single ID, you might want to either use a bigger machine or produce features for windows of the data (this is different from the features for the full data, but maybe your use case allows for this).
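For the single-ID case, one way to read "windows of the data" is to cut the series into consecutive, non-overlapping windows and treat each window as its own id. The sketch below assumes that reading. The helper names (window_ids, extract_per_window), the window_id column, and the window size are hypothetical choices, not something tsfresh prescribes.

```python
def window_ids(n_rows, window_size):
    """Return one window label per row: 0 for the first window_size rows, 1 for the next, ..."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return [row // window_size for row in range(n_rows)]


def extract_per_window(df, window_size, original_id_column="id",
                       column_sort=None, n_jobs=8):
    """Compute features per consecutive window of a single-id time series."""
    # Imported lazily so window_ids can be used without tsfresh installed.
    from tsfresh import extract_features
    from tsfresh.feature_extraction import ComprehensiveFCParameters
    from tsfresh.utilities.dataframe_functions import impute

    if column_sort is not None:
        df = df.sort_values(column_sort)  # windows must follow time order
    # Drop the original id so it is not mistaken for a value column,
    # then label each row with the window it falls into.
    windowed = df.drop(columns=[original_id_column], errors="ignore").assign(
        window_id=window_ids(len(df), window_size)
    )
    return extract_features(
        windowed,
        column_id="window_id",
        column_sort=column_sort,
        default_fc_parameters=ComprehensiveFCParameters(),
        n_jobs=n_jobs,
        impute_function=impute,
    )
```

The result has one row of features per window rather than one row for the whole series, which is the difference from full-data features mentioned above.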
