pathway_annotation error "An error occurred. Retrying......" #115

Open
jw0531jung opened this issue Aug 20, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@jw0531jung

jw0531jung commented Aug 20, 2024

Hello, Chen!

I got significant functions from two sites using "pathway_daa" and I want to perform pathway annotation.

When I run the following code:
httr::set_config(httr::config(ssl_verifypeer = FALSE))
annot <- pathway_annotation(pathway = "KO", daa_results_df = ald_sig, ko_to_kegg = TRUE)
httr::set_config(httr::config(ssl_verifypeer = TRUE))

I encounter the error message "An error occurred. Retrying......" repeatedly, and I have to forcefully terminate R.
The ald_sig contains fewer than 50 features (ko#).

I am not using a loop, so I don't think "sleep" is necessary.
Could this issue be caused by running this code frequently?

I am comparing a total of 9 sites, so I have already run this script more than 15 times to get results,
but suddenly it is not executing anymore.

Here are the details about my computer and package versions:
Microsoft Windows 10 Enterprise (x64-based PC)
ggpicrust2 (v1.7.3)
R (v4.3.3)

Thank you!

@jw0531jung jw0531jung added the bug Something isn't working label Aug 20, 2024
@jw0531jung
Author

[image attached]

This is what's happening right now.
It worked fine yesterday, so it's really confusing.

@cafferychen777
Owner

Hi @jw0531jung,

Thank you for providing more details about the issue you're experiencing. I have a question that might help us troubleshoot: Are you currently located in mainland China?

If you are in mainland China, this error could potentially be related to network connectivity issues. In such cases, some possible solutions include:

  1. Changing your VPN server/node
  2. Switching to a different network (e.g., from WiFi to cellular data)
  3. Using a different internet connection altogether

Network restrictions can sometimes interfere with the API calls that the package makes, leading to the error you're seeing. If you're not in mainland China, please let me know, and we can explore other potential causes and solutions.

Best,

@jw0531jung
Author

jw0531jung commented Aug 21, 2024

Oh, I'm actually in South Korea, and since I work at a national research institute, it might be difficult to change the server or network. Also, the script didn't fail from the start. I was able to get results several times without any issues on the same day.

My workflow is as follows:

  1. Select significant pathways using ALDEx2 and MaAsLin2 methods in the "pathway_daa" function.
  2. Annotate each result using "pathway_annotation" function.
  3. Merge the results by features and select only the overlapping pathways.

In step 2, the MaAsLin2 results are successfully annotated, but the ALDEx2 results are not.
Since I only need the overlapping pathways anyway, I am merging the ALDEx2 results (ald_sig) with the annotated MaAsLin2 results (msl_sig_annot).
However, if I wanted to use ALDEx2 alone for selection, I would not be able to run "pathway_annotation" on its results.
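
Step 3 of the workflow above (keeping only the overlapping pathways) can be sketched in base R roughly as follows. The two data frames are toy stand-ins with made-up p-values, and the `pathway_name` column is an assumption about what the annotated MaAsLin2 table contains; only the `feature` column is taken from the code in this thread.

```r
# Toy stand-ins for the real results (hypothetical values, for illustration only)
ald_sig <- data.frame(
  feature  = c("ko00010", "ko00020", "ko00030"),
  p_adjust = c(0.01, 0.03, 0.04)
)
msl_sig_annot <- data.frame(
  feature      = c("ko00020", "ko00030", "ko00040"),
  p_adjust     = c(0.02, 0.01, 0.04),
  pathway_name = c("Citrate cycle (TCA cycle)",
                   "Pentose phosphate pathway",
                   "Pentose and glucuronate interconversions")
)

# merge() does an inner join by default, so only features that are
# significant in both methods survive; suffixes keep the two p-value
# columns apart.
overlap <- merge(ald_sig, msl_sig_annot, by = "feature",
                 suffixes = c("_aldex2", "_maaslin2"))
```

Because the annotation travels with the MaAsLin2 table, the overlapping rows end up annotated without calling pathway_annotation on the ALDEx2 results.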

With respect,

@cafferychen777
Owner

Hi @jw0531jung,

Thank you for reporting this issue. Based on the code in pathway_annotation.R, I can suggest a solution to handle the KEGG API connection issues:

  1. Immediate Workaround Solution
# 1. First save your ALDEx2 results
saveRDS(ald_sig, "aldex2_results.rds")

# 2. Use this modified approach for annotation
annotate_with_retry <- function(daa_results) {
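  # Set a longer timeout and restore the previous value when the function exits
  old_opts <- options(timeout = 300)
  on.exit(options(old_opts), add = TRUE)
  
  # Configure httr settings. `retries` is not a curl option and is rejected
  # by httr, so it is omitted here; failed batches are only reported by the
  # tryCatch() below, they are not retried.
  # Note: ssl_verifypeer = FALSE disables certificate verification.
  httr::set_config(httr::config(
    ssl_verifypeer = FALSE,
    timeout = 60
  ))
  on.exit(httr::reset_config(), add = TRUE)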
  
  # Try annotation in smaller batches
  features <- unique(daa_results$feature)
  batch_size <- 10
  batches <- split(features, ceiling(seq_along(features)/batch_size))
  
  results_list <- list()
  
  for(batch in batches) {
    batch_data <- daa_results[daa_results$feature %in% batch,]
    
    tryCatch({
      # Add delay between batches
      Sys.sleep(2)
      
      result <- pathway_annotation(
        pathway = "KO",
        daa_results_df = batch_data,
        ko_to_kegg = TRUE
      )
      
      results_list[[length(results_list) + 1]] <- result
    }, error = function(e) {
      message("Error in batch: ", paste(batch, collapse=", "))
      message("Error message: ", e$message)
    })
  }
  
  # Combine results
  do.call(rbind, results_list)
}

# 3. Run the modified function
annot <- annotate_with_retry(ald_sig)
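
To make the batching line in the function above easier to follow, here is a small standalone illustration of how `split()` with `ceiling(seq_along(...) / batch_size)` groups features. The 25 feature IDs are made up.

```r
# 25 made-up feature IDs: "ko00001" ... "ko00025"
features <- sprintf("ko%05d", 1:25)
batch_size <- 10

# ceiling(index / batch_size) gives 1 for items 1-10, 2 for 11-20, 3 for 21-25
batches <- split(features, ceiling(seq_along(features) / batch_size))

lengths(batches)
#  1  2  3
# 10 10  5
```

With fewer than 50 features, this gives at most five calls to pathway_annotation, each separated by the two-second pause.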
  2. Alternative Approach Using Local Cache
# Create a simple caching mechanism
create_kegg_cache <- function(ko_ids) {
  cache <- list()
  
  for(ko in ko_ids) {
    tryCatch({
      Sys.sleep(1) # Rate limiting
      result <- KEGGREST::keggGet(ko)
      cache[[ko]] <- result
    }, error = function(e) {
      message("Failed to cache ", ko)
    })
  }
  
  saveRDS(cache, "kegg_cache.rds")
  return(cache)
}

# Use cached data
use_cached_annotation <- function(daa_results, cache_file = "kegg_cache.rds") {
  # Load the cache if it exists; otherwise build it (one KEGG query per feature)
  if(file.exists(cache_file)) {
    cache <- readRDS(cache_file)
  } else {
    cache <- create_kegg_cache(unique(daa_results$feature))
  }
  # Process using cache
  # ... annotation logic here ...
}
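
As a rough idea of what a lookup against such a cache could look like, here is a sketch that runs offline on a mock cache. It assumes each cache entry has the shape stored by create_kegg_cache above (the list returned by keggGet, whose first element has a `NAME` field); `lookup_name` and `fake_cache` are hypothetical names, and this is not the annotation logic used by ggpicrust2.

```r
# Mock cache mimicking the shape written by create_kegg_cache()
fake_cache <- list(
  ko00010 = list(list(ENTRY = "ko00010", NAME = "Glycolysis / Gluconeogenesis")),
  ko00020 = list(list(ENTRY = "ko00020", NAME = "Citrate cycle (TCA cycle)"))
)

# Hypothetical helper: return the cached name, or NA if the ID was never cached
lookup_name <- function(id, cache) {
  entry <- cache[[id]]
  if (is.null(entry) || is.null(entry[[1]]$NAME)) {
    return(NA_character_)
  }
  entry[[1]]$NAME[1]
}

daa <- data.frame(feature = c("ko00010", "ko00020", "ko99999"))
daa$pathway_name <- vapply(daa$feature, lookup_name, character(1),
                           cache = fake_cache, USE.NAMES = FALSE)
```

Features missing from the cache come back as NA, so they can be fetched later without repeating the queries that already succeeded.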
  3. Important Notes:
  • The issue might be related to KEGG API rate limits
  • Network stability in research institutions can affect API calls
  • Consider implementing local caching for frequently used pathways
  • The batch approach helps manage API requests better
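
If rate limiting or an unstable connection is the cause, retrying with increasing pauses is one common way to handle it. The following is a minimal sketch of that idea, not something ggpicrust2 or KEGGREST provides: `with_backoff` is a hypothetical helper, and the retry count and delays are arbitrary choices.

```r
# Call `fn` until it succeeds or `max_tries` is reached, doubling the wait
# after each failure. `sleep` is a parameter so the pause can be stubbed out.
# Hypothetical helper, not part of ggpicrust2 or KEGGREST.
with_backoff <- function(fn, max_tries = 4, base_delay = 1, sleep = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt < max_tries) {
      delay <- base_delay * 2^(attempt - 1)  # 1, 2, 4, ... seconds
      message("Attempt ", attempt, " failed: ", conditionMessage(result),
              "; retrying in ", delay, " s")
      sleep(delay)
    }
  }
  stop("All ", max_tries, " attempts failed: ", conditionMessage(result))
}

# Example use against KEGG (needs network access, so left commented out):
# entry <- with_backoff(function() KEGGREST::keggGet("ko00010"))
```

Unlike an open-ended retry loop, this gives up after a fixed number of attempts and raises an error, so the R session does not have to be terminated by force.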
  4. For Future Reference:
# Check your connection to KEGG
test_kegg_connection <- function() {
  tryCatch({
    test_result <- KEGGREST::keggGet("ko:K00001")
    return(TRUE)
  }, error = function(e) {
    message("KEGG connection failed: ", e$message)
    return(FALSE)
  })
}

I'll consider implementing these improvements in the next version of ggpicrust2. Would you be interested in testing the beta version with these enhancements?

Best regards,
Chen Yang
