Merge pull request #287 from massimoaria/develop
Develop to CRAN
Showing 33 changed files with 3,996 additions and 3,564 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
#' Completeness of bibliographic metadata | ||
#' | ||
#' It calculates the percentage of missing data in the metadata of a bibliographic data frame. | ||
#' | ||
#' Each metadata tag is assigned a status c("Excellent", "Good", "Acceptable", "Poor", "Critical", "Completely missing") | ||
#' depending on the percentage of missing data. In particular, the column *status* classifies the percentage of missing | ||
#' values into 6 categories: "Excellent" (0%), "Good" (0.01% to 10.00%), "Acceptable" (10.01% to 20.00%), | ||
#' "Poor" (20.01% to 50.00%), "Critical" (50.01% to 99.99%), "Completely missing" (100%). | ||
#' | ||
#' The results of the function allow us to understand which analyses can be performed with bibliometrix | ||
#' and which cannot based on the completeness (or status) of different metadata. | ||
#' @param M is a bibliographic data frame obtained by the \code{\link{convert2df}} function. | ||
#' | ||
#' @return The function \code{missingData} returns a list containing two objects: | ||
#' \tabular{lll}{ | ||
#' \code{allTags} \tab \tab is a data frame including results for all original metadata tags from the collection\cr | ||
#' \code{mandatoryTags}\tab \tab is a data frame that includes only the tags needed for analysis with bibliometrix and biblioshiny.} | ||
#' | ||
#' @examples | ||
#' data(scientometrics, package = "bibliometrixData") | ||
#' res <- missingData(scientometrics) | ||
#' print(res$mandatoryTags) | ||
#' | ||
#' @export | ||
#' | ||
missingData <- function(M) { | ||
cols <- names(M) | ||
missing_counts <- sapply(cols, function(x){ | ||
sum(is.na(M[,x]) | M[,x] %in% c("NA,0000,NA","NA","")) | ||
}) | ||
missing_pct <- round(missing_counts/nrow(M) * 100, 2) | ||
df_all <- data.frame(cols, missing_counts, missing_pct) | ||
|
||
tag <- unlist( | ||
strsplit( | ||
"AB,AU,C1,CR,DE,DI,DT,ID,LA,NR,PY,RP,SO,TC,TI,WC","," | ||
) | ||
) | ||
description <- trimws(unlist( | ||
strsplit( | ||
"Abstract, Author,Affiliation,Cited References,Keywords,DOI,Document Type,Keywords Plus,Language,Number of Cited References, | ||
Publication Year,Corresponding Author, Journal, Total Citation, Title, Science Categories", "," | ||
) | ||
)) | ||
|
||
df_all <- df_all %>% | ||
mutate(status = status(missing_pct)) %>% | ||
replace_na(replace = list(missing_counts = nrow(M), missing_pct = 100)) | ||
|
||
df_tags <- data.frame(tag, description) %>% | ||
left_join(df_all, by = c("tag" = "cols")) %>% | ||
replace_na(replace = list(missing_counts = nrow(M), missing_pct = 100, status = "Completely missing")) %>% | ||
arrange(missing_pct,description) | ||
|
||
results <- list(allTags=df_all, mandatoryTags=df_tags) | ||
return(results) | ||
} | ||
|
||
status <- function(x){ | ||
y <- character(length(x)) | ||
y[x==0] <- "Excellent" | ||
y[x>0 & x<= 10] <- "Good" | ||
y[x>10 & x<= 20] <- "Acceptable" | ||
y[x>20 & x<=50] <- "Poor" | ||
y[x>50 & x<100] <- "Critical" | ||
y[is.na(x) | x==100] <- "Completely missing" | ||
return(y) | ||
} | ||
|
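The counting and classification steps of `missingData` can be tried on a toy data frame. The following is a minimal base-R sketch, not the package code: the data values are invented, and `classify_missing` is a hypothetical name that only mirrors the thresholds of the `status` helper in the hunk above.

```r
# Toy bibliographic data frame; all values are invented for illustration.
M <- data.frame(
  AU = c("ARIA M", "CUCCURULLO C", "SMITH J", "DOE J"),
  DI = c("10.1000/a", NA, "", "10.1000/d"),
  AB = c(NA, "NA", "", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper using the same cut points as status() in the diff.
classify_missing <- function(x) {
  y <- character(length(x))
  y[x == 0] <- "Excellent"
  y[x > 0 & x <= 10] <- "Good"
  y[x > 10 & x <= 20] <- "Acceptable"
  y[x > 20 & x <= 50] <- "Poor"
  y[x > 50 & x < 100] <- "Critical"
  y[is.na(x) | x == 100] <- "Completely missing"
  y
}

# A cell counts as missing if it is NA or one of the placeholder strings
# that the diff treats as missing.
missing_counts <- vapply(
  M,
  function(col) sum(is.na(col) | col %in% c("NA,0000,NA", "NA", "")),
  integer(1)
)
missing_pct <- round(missing_counts / nrow(M) * 100, 2)

res <- data.frame(
  tag = names(M),
  missing_counts = missing_counts,
  missing_pct = missing_pct,
  status = classify_missing(missing_pct),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(res)
```

In this toy set, `AU` has no missing cells, `DI` is missing in two of four rows, and every `AB` cell is either `NA` or a placeholder string, so the three tags land in different status categories.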
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
toUpper <- function(D){ | ||
- stringr::str_to_upper(D, locale = "en")
+ stringi::stri_trans_toupper(D, locale = "en")
} |
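Both calls in the `toUpper` hunk perform locale-aware upper-casing, and stringr's `str_to_upper` is itself a wrapper around stringi. A small sketch of the stringi form follows, assuming stringi may or may not be installed. `to_upper_sketch` is a hypothetical name, and the base-R fallback is an addition for illustration only; it is not locale-aware.

```r
# Hypothetical helper for illustration; not part of bibliometrix.
to_upper_sketch <- function(D, locale = "en") {
  if (requireNamespace("stringi", quietly = TRUE)) {
    # Same call as the stringi line in the hunk above.
    stringi::stri_trans_toupper(D, locale = locale)
  } else {
    # Assumed fallback: base R upper-casing, which ignores the locale argument.
    toupper(D)
  }
}

upper <- to_upper_sketch(c("bibliometrix", "scientometrics"))
print(upper)
```

For plain ASCII input such as the tags and keywords above, both branches give the same result. The locale matters only for characters whose upper-case form differs between languages.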