Qualitative and quantitative analysis of scientific publishing in Latin America based on DOAJ data (2025)

Objective

This site offers data and visualizations on scientific publishing in Latin America, mined from the Directory of Open Access Journals (DOAJ) for the year 2025. The data come from an R notebook. Because the notebook pulls the data in real time, it loads more slowly. The charts it provides must be run manually (Run all cells).

Libraries used

The libraries loaded by the notebook are used for statistical analysis and visualization of the database that DOAJ provides in CSV format.
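
The notebook's own library() calls are not reproduced on this page. As a rough sketch, the block below loads a package set inferred from the functions called further down (for example as_tibble, floor_date, ne_countries, girafe, ggplotly, geom_treemap, hchart and scale_fill_scico_d). The exact list is an assumption, not the authors' original setup.

```r
# Packages inferred from the functions used in this notebook (an assumption).
pkgs <- c("tidyverse", "lubridate", "sf", "rnaturalearth", "countrycode",
          "ggiraph", "plotly", "treemapify", "highcharter", "scico",
          "viridisLite", "scales")

# Report anything that is not installed instead of failing halfway through.
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (any(!is_installed)) {
  message("Install first: ", paste(pkgs[!is_installed], collapse = ", "))
}

# Attach the packages that are available.
invisible(lapply(pkgs[is_installed], library, character.only = TRUE))
```

Checking availability first keeps a missing package from stopping the notebook in the middle of a run.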

# Official DOAJ CSV URL
url_doaj <- "https://doaj.org/csv"
# Download the CSV to a temporary file
temp <- tempfile(fileext = ".csv")
download.file(url_doaj, destfile = temp, method = "libcurl")

# Read the database
journal <- read.csv(temp, stringsAsFactors = FALSE)

# Convert to a tibble
journal <- as_tibble(journal)
head(journal)
# A tibble: 6 × 51
  Journal.title Journal.URL URL.in.DOAJ When.did.the.journal…¹ Alternative.title
  <chr>         <chr>       <chr>                        <int> <chr>            
1 Nyimak: Jour… http://jur… https://do…                   2017 ""               
2 MCBS (Molecu… https://ce… https://do…                   2017 ""               
3 Acta Univers… https://ka… https://do…                   2014 "AUC Philologica"
4 Cuadernos pa… https://re… https://do…                   2016 "CILH"           
5 Enfances, Fa… https://jo… https://do…                   2014 ""               
6 RUDN Journal… http://jou… https://do…                   2008 "Vestnik Rossijs…
# ℹ abbreviated name:
#   ¹​When.did.the.journal.start.to.publish.all.content.using.an.open.license.
# ℹ 46 more variables: Journal.ISSN..print.version. <chr>,
#   Journal.EISSN..online.version. <chr>, Keywords <chr>,
#   Languages.in.which.the.journal.accepts.manuscripts <chr>, Publisher <chr>,
#   Country.of.publisher <chr>, Other.organisation <chr>,
#   Country.of.other.organisation <chr>, Journal.license <chr>, …

Selecting columns from the database

head(journal[, c("Journal.title", "Subjects", "Keywords")], 15)
# A tibble: 15 × 3
   Journal.title                                              Subjects  Keywords
   <chr>                                                      <chr>     <chr>   
 1 Nyimak: Journal of Communication                           Language… communi…
 2 MCBS (Molecular and Cellular Biomedical Sciences)          Medicine… biomedi…
 3 Acta Universitatis Carolinae: Philologica                  Language… philolo…
 4 Cuadernos para la Investigación de la Literatura Hispánica Language… spanish…
 5 Enfances, Familles, Générations                            Geograph… gender …
 6 RUDN Journal of Political Science                          Politica… politic…
 7 Éducation et Socialisation                                 Educatio… educati…
 8 RUDN Journal of Russian History                            History … russian…
 9 Turkish Journal of Bioscience and Collections              Science:… bioscie…
10 The Rehabilitation Journal                                 Social S… speech-…
11 Transitare                                                 Social S… tourism…
12 European Journal of Biology                                Science:… biology…
13 Revue d'ethnoécologie                                      Geograph… anthrop…
14 Cultura, Educación,  Sociedad                              Educatio… human s…
15 Tarih Dergisi                                              History … history 
print(paste("Total de publicaciones en la base de datos de DOAJ:", length(journal$Journal.title)))
[1] "Total de publicaciones en la base de datos de DOAJ: 22431"

Database manipulation and cleaning

Listing the column names before renaming them

colnames(journal)
 [1] "Journal.title"                                                              
 [2] "Journal.URL"                                                                
 [3] "URL.in.DOAJ"                                                                
 [4] "When.did.the.journal.start.to.publish.all.content.using.an.open.license."   
 [5] "Alternative.title"                                                          
 [6] "Journal.ISSN..print.version."                                               
 [7] "Journal.EISSN..online.version."                                             
 [8] "Keywords"                                                                   
 [9] "Languages.in.which.the.journal.accepts.manuscripts"                         
[10] "Publisher"                                                                  
[11] "Country.of.publisher"                                                       
[12] "Other.organisation"                                                         
[13] "Country.of.other.organisation"                                              
[14] "Journal.license"                                                            
[15] "License.attributes"                                                         
[16] "URL.for.license.terms"                                                      
[17] "Machine.readable.CC.licensing.information.embedded.or.displayed.in.articles"
[18] "Author.holds.copyright.without.restrictions"                                
[19] "Copyright.information.URL"                                                  
[20] "Review.process"                                                             
[21] "Review.process.information.URL"                                             
[22] "Journal.plagiarism.screening.policy"                                        
[23] "URL.for.journal.s.aims...scope"                                             
[24] "URL.for.the.Editorial.Board.page"                                           
[25] "URL.for.journal.s.instructions.for.authors"                                 
[26] "Average.number.of.weeks.between.article.submission.and.publication"         
[27] "APC"                                                                        
[28] "APC.information.URL"                                                        
[29] "APC.amount"                                                                 
[30] "Journal.waiver.policy..for.developing.country.authors.etc."                 
[31] "Waiver.policy.information.URL"                                              
[32] "Has.other.fees"                                                             
[33] "Other.fees.information.URL"                                                 
[34] "Preservation.Services"                                                      
[35] "Preservation.Service..national.library"                                     
[36] "Preservation.information.URL"                                               
[37] "Deposit.policy.directory"                                                   
[38] "URL.for.deposit.policy"                                                     
[39] "Persistent.article.identifiers"                                             
[40] "Does.the.journal.comply.to.DOAJ.s.definition.of.open.access."               
[41] "Continues"                                                                  
[42] "Continued.By"                                                               
[43] "LCC.Codes"                                                                  
[44] "Subscribe.to.Open"                                                          
[45] "Mirror.Journal"                                                             
[46] "Open.Journals.Collective"                                                   
[47] "Subjects"                                                                   
[48] "Added.on.Date"                                                              
[49] "Last.updated.Date"                                                          
[50] "Number.of.Article.Records"                                                  
[51] "Most.Recent.Article.Added"                                                  

Creating a new tibble for manipulating the data, according to the analysis criteria

journal.select <- journal %>%
  select(Journal.title, Country.of.publisher,
         Languages.in.which.the.journal.accepts.manuscripts,
         Journal.license, Publisher, Review.process, Subjects, APC,
         Persistent.article.identifiers, Keywords, Added.on.Date)

Renaming and simplifying the column names

journal.select <- journal.select %>%
  rename(
    title    = Journal.title,
    country  = Country.of.publisher,
    language = Languages.in.which.the.journal.accepts.manuscripts,
    License  = Journal.license,
    Review   = Review.process,
    Ids      = Persistent.article.identifiers,
    Added    = Added.on.Date
  )

Converting the Added column (originally Added.on.Date) to year-month-day format

journal.select$Added <- as.Date(journal.select$Added) 
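
A small sketch of what this conversion does, assuming that the DOAJ export stores the date a journal was added as an ISO 8601 timestamp such as 2017-06-23T10:00:00Z (the sample values below are made up). as.Date() reads the leading year-month-day part and ignores the rest of the string.

```r
# Hypothetical timestamps in the format the DOAJ CSV is assumed to use.
added_raw <- c("2017-06-23T10:00:00Z", "2004-03-01T08:15:42Z")

# as.Date() parses the leading "%Y-%m-%d" part; the time of day is dropped.
added <- as.Date(added_raw)
print(added)
# [1] "2017-06-23" "2004-03-01"

# The year alone, which the timelines in this notebook group by.
added_year <- format(added, "%Y")
print(added_year)
# [1] "2017" "2004"
```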

Cleaning the “Subjects” column

The subject strings are cleaned and reduced to their first two words to make manipulation, analysis and visualization easier.

journal.select$Subjects <- str_extract(journal.select$Subjects, "\\w+(?:[^\\w]+\\w+){0,1}")
# remove punctuation marks
journal.select$Subjects <- gsub("[[:punct:]]", "", journal.select$Subjects)
journal.select <- journal.select %>%
  mutate(country = trimws(country),
         country = case_when(
      str_detect(country, "Bolivia")   ~ "Bolivia",
      str_detect(country, "Venezuela") ~ "Venezuela",
      str_detect(country, "Russian")   ~ "Russia",
      str_detect(country, "Iran")      ~ "Iran",
      str_detect(country, "Korea")     ~ "Korea",
      str_detect(country, "Moldova")   ~ "Moldova",
      str_detect(country, "Congo")     ~ "Congo",
      str_detect(country, "Tanzania")  ~ "Tanzania",
      str_detect(country, "Palestine") ~ "Palestine",
      TRUE ~ country)) %>%
  mutate(Subjects = str_trim(Subjects)) %>%
  mutate(Subjects = sapply(Subjects, function(x) {
    words <- str_split(x, "\\s{2,}|,|\\s*\\band\\b\\s*|\\s+")[[1]]  # split on commas, the word "and", or whitespace
    words <- unique(words[words != ""])  # drop empty strings and duplicates
    paste(words, collapse = " ")
  }))
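
The cleaning steps above can be traced on a few strings. This is a minimal sketch that uses base R equivalents (regmatches, gsub, strsplit) instead of stringr so that it runs on its own. The sample subjects are illustrative values written to resemble DOAJ subject paths, and clean_subject is a hypothetical helper name.

```r
# Illustrative subject strings (made up to resemble DOAJ subject paths).
subjects <- c("Education: Theory and practice of education",
              "Language and Literature",
              "Education: Education (General)")

# 1. Keep the first word plus, if present, the second word.
first_two <- regmatches(
  subjects,
  regexpr("\\w+(?:[^\\w]+\\w+){0,1}", subjects, perl = TRUE)
)

# 2. Remove punctuation and surrounding whitespace.
no_punct <- trimws(gsub("[[:punct:]]", "", first_two))

# 3. Split into words (dropping "and"), remove duplicates, and rejoin.
clean_subject <- function(x) {
  words <- strsplit(x, "\\s{2,}|,|\\s*\\band\\b\\s*|\\s+", perl = TRUE)[[1]]
  paste(unique(words[words != ""]), collapse = " ")
}
cleaned <- vapply(no_punct, clean_subject, character(1), USE.NAMES = FALSE)
print(cleaned)
# "Education Theory", "Language", "Education"
```

A repeated word collapses to a single one, which is why a label such as "Education Education" ends up as "Education" in the cleaned column.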

Cleaning the “language” and “Ids” (persistent identifiers) columns

A function sorts the languages and identifiers listed in each cell alphabetically, so that the same combination always appears in the same order. This simplifies the visualizations and the manipulation of the tibble.

# Function that sorts the comma-separated items of a cell alphabetically
sort_columns <- function(column_list) {
  sorted_columns <- sort(unlist(strsplit(column_list, ", ")))
  return(paste(sorted_columns, collapse = ", "))
}
# Apply the function to every cell of the 'language' and 'Ids' columns
journal.select <- journal.select %>%
  mutate(across(c(language, Ids), ~ sapply(.x, sort_columns)))
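
A short usage sketch of the sorting helper, with made-up cells. The function is repeated so the example is self-contained. vapply with USE.NAMES = FALSE is used here only to keep the result unnamed, which is a choice of this sketch.

```r
# Same helper as above, repeated so the example runs on its own.
sort_columns <- function(column_list) {
  sorted_columns <- sort(unlist(strsplit(column_list, ", ")))
  paste(sorted_columns, collapse = ", ")
}

# Hypothetical cells listing the same languages in different orders.
cells <- c("Spanish, English", "English, Spanish", "Portuguese, English, Spanish")
sorted <- vapply(cells, sort_columns, character(1), USE.NAMES = FALSE)
print(sorted)
# "English, Spanish", "English, Spanish", "English, Portuguese, Spanish"

# After sorting, the first two cells count as a single category.
print(table(sorted))
```

Without this step, "Spanish, English" and "English, Spanish" would be counted as two different language combinations in the charts.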

Caption function

# Caption helper: builds the citation line shown under each chart
add_caption <- function(author = "Romina De León y Gimena del Rio", year = format(Sys.Date(), "%Y")) {
  paste0("Citar como: ", author, ", ", year, 
         ". Análisis de revistas latinoamericanas en DOAJ.")
}
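
A brief usage sketch of the caption helper. The function is repeated so the block runs on its own, and the second author string is a hypothetical value used only to show how the default can be overridden.

```r
add_caption <- function(author = "Romina De León y Gimena del Rio",
                        year = format(Sys.Date(), "%Y")) {
  paste0("Citar como: ", author, ", ", year,
         ". Análisis de revistas latinoamericanas en DOAJ.")
}

# Fixing the year makes the caption reproducible across runs.
caption_2025 <- add_caption(year = "2025")
print(caption_2025)

# Overriding the author (hypothetical value).
caption_other <- add_caption(author = "Equipo de ejemplo", year = "2025")
print(caption_other)
```

Because the default year comes from Sys.Date(), charts rendered in different years carry different captions unless the year is passed explicitly.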

Total journals in DOAJ

Creating a data frame with the number and percentage of journals per country

porcen_journal <- journal.select %>%
  group_by(country)%>%
  count()%>%
  ungroup()%>%
  mutate(percentage= round(n / sum(n) * 100, 2)) %>%
  bind_rows(data.frame(country = "Total", n = NA, percentage = sum(.$percentage)))
porcen_journal[order(porcen_journal$n, decreasing = TRUE),] 
# A tibble: 141 × 3
   country            n percentage
   <chr>          <int>      <dbl>
 1 Indonesia       2612      11.6 
 2 United Kingdom  2255      10.0 
 3 Brazil          1456       6.49
 4 United States   1304       5.81
 5 Iran            1043       4.65
 6 Spain           1004       4.48
 7 Poland           964       4.3 
 8 Switzerland      795       3.54
 9 Russia           649       2.89
10 Türkiye          645       2.88
# ℹ 131 more rows
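
The percentage column can be checked by hand. The sketch below recomputes it for the first three countries using the counts and the overall total (22431) printed in the outputs above.

```r
# Total number of journals in the DOAJ export, as printed earlier.
total <- 22431

# Counts taken from the table above.
n <- c(Indonesia = 2612, "United Kingdom" = 2255, Brazil = 1456)

# Same formula as in the pipeline: share of the total, rounded to 2 decimals.
percentage <- round(n / total * 100, 2)
print(percentage)
# Indonesia 11.64, United Kingdom 10.05, Brazil 6.49
```

The tibble printout shows 11.6 and 10.0 because tibbles display about three significant digits by default; the stored values keep two decimals.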

Data mining and visualization

This section presents charts that compare scientific publishing worldwide with that of Latin America. It looks at the number of journals, disciplines, publishers, languages of publication and APC charges.

Georeferencing journals worldwide

p1 <- suppressWarnings(
ne_countries(scale = "large", returnclass = "sf") %>% 
left_join(
    porcen_journal %>%
    filter(!country %in% c("Total")) %>%
      mutate(
             country = trimws(country),
             country_std = case_when(
                country %in% c("United States", "USA") ~ "United States of America",
                TRUE ~ countrycode(country, origin = "country.name", destination = "country.name")
                ),
             country_std = coalesce(country_std, country),
             color_point = case_when( n <= 20 ~ "#91A1AF",
                                      n <= 50 ~ "#21BCFF",
                                      n <= 100 ~ "#F5276C",
                                      n <= 200 ~ "#2C0995",
                                      n <= 300 ~ "#27F5B0",
                                      n <= 450 ~ "#009E3F",
                                      n <= 600 ~ "#2e86c1",
                                      n <= 800 ~ "#1b4f72",
                                      n <= 1000 ~ "#884ea0",
                                      n <= 1250 ~ "#a569bd",
                                      n <= 1500 ~ "#af7ac5",
                                      n <= 1750 ~ "#d98880",
                                      n <= 2500 ~ "#8965F6",
                                      n <= 3500 ~"#729509",
                                      TRUE ~ "darkblue"
    )
      ),
    by = c("name" = "country_std")
  ) %>% 
  filter(!is.na(geometry)) %>%
  mutate(point_geom = st_point_on_surface(geometry),
         tooltip_text = paste0("<strong>", name, "</strong><br/>Revistas: ", n)) %>%
  ggplot() +
  geom_sf(fill = "gray90", color = "white", size = 0.1) +
  geom_point_interactive(
    aes(
      geometry = point_geom, 
      size = n,
     color = color_point,
     tooltip = tooltip_text,
     data_id = name
    ),
    stat = "sf_coordinates",
    alpha = 0.7
  ) +
  scale_size_continuous(range = c(2, 10), guide = "none") +
  scale_color_identity(
  name = "Rangos de n",
  guide = "legend",
  breaks = c("#91A1AF", "#21BCFF", "#F5276C", "#2C0995",
             "#27F5B0", "#009E3F", "#2e86c1", "#1b4f72",
             "#884ea0", "#a569bd", "#af7ac5", "#d98880",
             "#8965F6", "#729509", "darkblue"),
  labels = c("≤ 20", "21–50", "51–100", "101–200",
             "201–300", "301–450", "451–600", "601–800",
             "801–1000", "1001–1250", "1251–1500", "1501–1750",
             "1751–2500", "2501–3500", "> 3500")
)
 +
  labs(
    title = "Distribución de revistas por país",
      x = NULL,
      y = NULL,
    caption = add_caption()
  ) +
  theme_minimal() +
  theme(plot.title = element_text(size = 18, face = "bold"),
  legend.title = element_text(size = 14, face = "bold"),
  legend.text = element_text(size = 12),
  legend.position = "bottom"))

girafe(
  ggobj = p1,
  options = list(
    opts_hover(css = "fill-opacity:1;stroke:black;stroke-width:1pt;"),
    opts_tooltip(css = "background-color:white;color:black;padding:5px;border-radius:5px;font-family:sans-serif;"),
    opts_toolbar(saveaspng = TRUE)
  ),
  width_svg = 10,
  height_svg = 7
)
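
The colour ranges in the map are assigned with a chain of case_when conditions. As an alternative sketch (not the notebook's own approach), the same ranges can be produced with base R's cut(), which may be easier to keep in sync with the legend labels. The counts below are hypothetical.

```r
# Upper bounds of the ranges used in the map above.
upper <- c(20, 50, 100, 200, 300, 450, 600, 800,
           1000, 1250, 1500, 1750, 2500, 3500)

# Legend labels, in the same order as in the map code.
range_labels <- c("≤ 20", "21–50", "51–100", "101–200",
                  "201–300", "301–450", "451–600", "601–800",
                  "801–1000", "1001–1250", "1251–1500", "1501–1750",
                  "1751–2500", "2501–3500", "> 3500")

# Hypothetical journal counts, one per country.
n <- c(12, 50, 51, 1456, 2612, 4000)

# right = TRUE (the default) closes each bin on the right, like n <= bound.
bins <- cut(n, breaks = c(-Inf, upper, Inf), labels = range_labels)
print(data.frame(n = n, range = bins))
```

With this layout a count of exactly 50 falls in "21–50" and 51 in "51–100", which matches the `<=` comparisons in the case_when version.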

Timeline by year of inclusion in DOAJ

p2 <- journal.select %>%
  mutate( year_month = floor_date(Added, "year")) %>%
  count(year_month) %>%
  ggplot(aes(x = year_month, y = n, group = 1)) +
  geom_line(color = "#2c0fb1", size = 0.6) +
  geom_point(
    aes(
      text = paste(
        "Año:", format(year_month, "%Y"),
        "<br>Incorporaciones:", n
      )
    ),
    color = "#f93b20", size = 1) +
  theme_bw(base_size = 12) +
  scale_x_date(date_labels = "%Y", date_breaks = "2 year") +
  labs(x = "Año de incorporación", y = "Cantidad de incorporaciones", 
       title = "Línea de tiempo sobre incorporación de publicaciones a DOAJ", caption = add_caption()) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        axis.text.y = element_text(hjust = 1))

ggplotly(p2, tooltip = "text")
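
The aggregation behind this timeline is a floor-and-count step: every date is moved to the start of its year and the rows are counted per year. A minimal base R sketch with made-up dates follows; as.Date(format(x, "%Y-01-01")) stands in for floor_date(x, "year").

```r
# Hypothetical inclusion dates.
added <- as.Date(c("2017-06-23", "2017-11-02", "2018-01-15",
                   "2018-03-30", "2018-09-09"))

# Floor each date to 1 January of its year.
year_start <- as.Date(format(added, "%Y-01-01"))

# Count journals per year.
per_year <- as.data.frame(table(year = format(year_start, "%Y")),
                          responseName = "n")
print(per_year)
# 2017 has 2 inclusions, 2018 has 3
```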

Breakdown by country, according to number of journals

p3 <- journal.select %>%
  mutate( year = floor_date(Added, "year")
  ) %>%
  count(country, year) %>% 
  group_by(country) %>%
  filter(sum(n) >= 100) %>%  
  ungroup() %>% 
  ggplot(aes(x = year, y = country, fill = n)) +
  geom_tile_interactive(
    aes(
      # Tooltip with detailed information
      tooltip = paste0(
        "<b>País:</b> ", country, "<br>",
        "<b>Año:</b> ", format(year, "%Y"), "<br>",
        "<b>Cantidad:</b> ", n
      ),
      # unique data_id for each cell (country-year combination)
      data_id = paste0(country, year)
    ),
    color = "white" # white borders between cells
  ) +
  
  scale_fill_viridis_c(option = "C") +
  labs(
    x = "Año de incorporación a DOAJ",
    y = "País",
    title = "Registro de incorporación de publicaciones por país y año",
    fill = "Cantidad",
    caption = tryCatch(add_caption(), error = function(e) "") 
  ) +
  theme_classic(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    legend.position = "right"
  )

# Interactive rendering
girafe(
  ggobj = p3,
  width_svg = 10,
  height_svg = 6,
  options = list(
    opts_tooltip(css = "background-color:black; color:white; padding:5px; border-radius:5px;"),
    opts_hover(css = "stroke:black; stroke-width:2px;")))

Publication languages of the journals indexed in DOAJ

  • First chart: languages (or language combinations) with between 25 and 800 journals
  • Second chart: languages (or language combinations) with 800 or more journals

p4 <- journal.select %>%
  group_by(language) %>%
  count() %>%
  ungroup() %>%  # ungroup so that fct_reorder orders across all languages
  filter(n >= 25 & n <= 800 ) %>%
  mutate(language = fct_reorder(language, n)) %>%  
  ggplot(aes(x = n, 
    y = language,
    fill = n)) +
    geom_col_interactive(
    aes(
      # tooltip
      tooltip = paste0("Idioma: ", language, "\nCantidad: ", n),
      data_id = language
    ),
    show.legend = FALSE
  ) +
  #  viridis
  scale_fill_viridis_c(option = "D") +
  # Titles and caption
  labs(
    title = "Idiomas de publicación de revistas en todo el mundo",
    x = "Cantidad de revistas",
    y = "Idiomas",
    caption = tryCatch(add_caption(), error = function(e) "") 
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold"),
    panel.grid.major.y = element_blank() # remove horizontal grid lines
  )


girafe(
  ggobj = p4,
  width_svg = 8,
  height_svg = 6,
  options = list(
    opts_hover(css = "fill-opacity: 1; stroke: black; stroke-width: 1.5px;"),
    opts_hover_inv(css = "opacity: 0.2;") # fade out the bars that are not hovered
  )
)
p5 <- journal.select %>%
  group_by(language) %>%
  count() %>%
  filter(n >= 800 ) %>%
  mutate(language = as.character(fct_reorder(language, n))) %>% 
  ggplot(aes(x = n, y = fct_reorder(language, n), fill = n)) +
    geom_col_interactive(aes(
      tooltip = paste0("Idioma: ", language, "\nCantidad: ", n),
      data_id = language
    ),
    show.legend = FALSE ) +
  scale_fill_gradient_interactive(
  low = "aquamarine", high = "purple",
  labels = scales::comma_format(big.mark = ".", decimal.mark = ",") 
) + 
  labs(
    title = "Idiomas de publicación de revistas en todo el mundo",
    x = "Cantidad de revistas",
    y = "Idiomas",
    caption = tryCatch(add_caption(), error = function(e) "") 
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, face = "bold"),
    panel.grid.major.y = element_blank())

girafe(
  ggobj = p5,
  width_svg = 8,
  height_svg = 5,
  options = list(
    opts_hover(css = "fill-opacity: 0.8; stroke: black; stroke-width: 1px;")))
#saveWidget(p5, "idiomas_pub_DOAJ_may800.html", selfcontained = FALSE)

Chart of journal publication languages by country

p6 <- journal.select %>%
  group_by(country, language) %>%
  count() %>%
  ungroup() %>%
  filter(n >= 75 & n <= 5000) %>%
  mutate(language = fct_reorder(language, n)) %>%
  plot_ly(x = ~country, y = ~language, z = ~n,
        type = "heatmap", colors = viridisLite::viridis(100, option = "C"), 
        hovertemplate = paste(
      "País: %{x}<br>",
      "Idioma: %{y}<br>",
      "Publicaciones: %{z}<extra></extra>")
    ) %>%
  layout(title = list(
      text = "Publicaciones con discriminación de idiomas según países",
      x = 0.5,  
      xanchor = "center"
    ),
    xaxis = list(tickangle = -45, tickfont = list(size = 8)),
    yaxis = list(tickangle = -25, tickfont = list(size = 8)),
    margin = list(l = 100, r = 20, t = 50, b = 50), 
    annotations = list(
      list(
        x = 0.35, y = -0.39, 
        xref = "paper", yref = "paper",
        showarrow = FALSE,
        xanchor = "left",
        yanchor = "top",
        font = list(size = 8),
        text = add_caption()
      )
    )
  )

p6

Journals that charge article processing charges (APC)

p14 <- journal.select %>%
  count(Publisher, APC, name = "n") %>%
  group_by(Publisher) %>%
  filter(sum(n) >= 10) %>%   # keep publishers with at least 10 journals
  ungroup() %>% 
  ggplot(aes(area = n, subgroup = Publisher, fill = APC)) +
  geom_treemap() +
  geom_treemap_subgroup_border(color = "white", size = 1) +
  geom_treemap_text(aes(label = Publisher), place = "centre", grow = TRUE, reflow = TRUE, min.size = 1) +
  scale_fill_manual(values = c("No" = "#A6CEE3", "Yes" = "#1F78B4")) +
  labs(
    title = "Registro de APC por Editoriales",
    fill = "APC",
    caption = add_caption()
  ) +
  theme( legend.position = "right" )

p14

Latin American journals

Latin America

Selecting the data for Latin American countries in order to analyse their journals

# Filter the Latin American countries
selected_countries <- c("Brazil", "Argentina", "Mexico", "Colombia", "Ecuador", "Costa Rica", "Cuba", "Bolivia", "Dominican Republic", "El Salvador", "Guatemala", "Honduras", "Nicaragua", "Panama", "Chile", "Paraguay", "Peru", "Uruguay", "Venezuela")
filtered_data <- porcen_journal %>% filter(country %in% selected_countries)
head(filtered_data)
# A tibble: 6 × 3
  country        n percentage
  <chr>      <int>      <dbl>
1 Argentina    404       1.8 
2 Bolivia       12       0.05
3 Brazil      1456       6.49
4 Chile        194       0.86
5 Colombia     448       2   
6 Costa Rica    75       0.33
journal.amlat <- journal.select[journal.select$country %in% selected_countries, ]

print(paste("Total de publicaciones en América Latina:", length(journal.amlat$title)))
[1] "Total de publicaciones en América Latina: 3346"
head(journal.amlat)
# A tibble: 6 × 11
  title  country language License Publisher Review Subjects APC   Ids   Keywords
  <chr>  <chr>   <chr>    <chr>   <chr>     <chr>  <chr>    <chr> <chr> <chr>   
1 Trans… Mexico  English… CC BY-… Universi… Doubl… Social … No    ""    tourism…
2 Cultu… Colomb… Spanish  CC BY   Universi… Doubl… Educati… No    "DOI" human s…
3 Retos… Cuba    Spanish  CC BY-… Centro d… Peer … Social … No    ""    territo…
4 Revis… Brazil  Portugu… CC BY-… Universi… Doubl… Educati… No    ""    adminis…
5 Coope… Colomb… English… CC BY-… Edicione… Doubl… Politic… No    "DOI" associa…
6 Memor… Brazil  Portugu… CC BY   Universi… Doubl… Philoso… No    "DOI" history…
# ℹ 1 more variable: Added <date>

Interactive chart of the number of scientific journals from Latin America in DOAJ

# Latin America
p8 <- hchart(
  filtered_data,
  type = "pie",
  hcaes(x = country, y = percentage), 
  dataLabels = list(enabled = TRUE),  
  showInLegend = FALSE
) %>%
  hc_title(useHTML = TRUE,
    text = paste0(
    "<b>Porcentaje de las </b>",
    sum(filtered_data$n),
    "<b> publicaciones en América Latina</b>"
  )) %>%
  hc_subtitle(useHTML = TRUE,
              text = paste0("<i>Respecto a las ", length(journal.select$title) ," publicaciones de todo el mundo</i>")) %>%
  hc_exporting(
    enabled = TRUE, 
    filename = "paises_total"
  ) %>%
  hc_tooltip(
    pointFormat = "{point.percentage:.1f}% revistas"
  ) %>%
  hc_legend(
    enabled = FALSE, 
    layout = "horizontal",
    align = "center",
    verticalAlign = "bottom",
    y = 8
  )%>%
  hc_credits(
    enabled = TRUE,
  text = add_caption(),
  href = "https://github.com/rominicky/analisis-doaj",
  itemStyle = list(fontSize = "8px", fontWeight = "normal"),
  position = list(align = "left", x = 10, y = -5)
)

p8

print(journal.amlat$Subjects[1:15])
         Social Sciences    Education  Philosophy          Social Sciences 
       "Social Sciences"   "Education Philosophy"        "Social Sciences" 
     Education Education        Political science    Philosophy Psychology 
             "Education"      "Political science"  "Philosophy Psychology" 
     Medicine Gynecology    Philosophy Psychology          Medicine Public 
   "Medicine Gynecology"  "Philosophy Psychology"        "Medicine Public" 
     Education Education          Social Sciences   Geography Anthropology 
             "Education"        "Social Sciences" "Geography Anthropology" 
               Fine Arts    Philosophy Psychology                Education 
             "Fine Arts"  "Philosophy Psychology"              "Education" 

Vector for filtering only the Humanities and Social Sciences

social_sc <- c("Philosophy Psychology", "Geography Anthropology", "Education Theory", "Language and", "Political science", "History", "Education History", "Education Social", "History General", "History America", "Language", "Philosophy", "Bibliography Library", "Auxiliary sciences", "Education Special", "Education", "Social Sciences")
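
A minimal sketch of how such a vector is used with %in%. The data frame and the shortened vector social_sc_subset are made up for illustration. The match is exact, so a cleaned subject is kept only if that precise string is listed.

```r
# Shortened, illustrative version of the filter vector.
social_sc_subset <- c("Social Sciences", "Education", "Philosophy Psychology")

# Hypothetical journals with already cleaned subjects.
demo <- data.frame(
  title    = c("Journal A", "Journal B", "Journal C", "Journal D"),
  Subjects = c("Social Sciences", "Medicine Public",
               "Philosophy Psychology", "Education Theory")
)

# %in% compares whole strings: "Education Theory" is not matched by "Education".
keep <- demo$Subjects %in% social_sc_subset
print(demo[keep, ])
```

This is why the full vector lists variants such as "Education Theory" and "Education History" separately.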

Timeline of the inclusion of Latin American journals in DOAJ

options(repr.plot.width = 14, repr.plot.height = 10)
p9 <- journal.amlat %>%
  mutate( year_month = floor_date(Added, "quarter")) %>%
  count(year_month) %>%
  ggplot(aes(x = year_month, y = n)) +
  geom_line(color = "#2c0fb1", size = 0.6) +
  geom_point(aes(
      text = paste(
        "Año:", format(year_month, "%Y"),
        "<br>Incorporaciones:", n
      )
    ),
    color = "#f93b20", size = 1) +
  theme_bw(base_size = 12) +
  scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
  labs(x = "Trimestres del año", y = "Cantidad de incorporaciones", 
       title = "Línea de tiempo sobre incorporación de publicaciones de América Latina a DOAJ", caption = add_caption()) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        axis.text.y = element_text(hjust = 1))

ggplotly(p9, tooltip = "text")

APC in the Latin American journals indexed in DOAJ

p15 <- journal.amlat %>%
  count(Publisher, APC, name = "n") %>%
  group_by(Publisher) %>%
  filter(sum(n) >= 7) %>% 
  slice_max(n = 7, order_by = n) %>%
    ungroup() %>% 
  ggplot(aes(area = n, subgroup = Publisher, fill = APC)) +
  geom_treemap() +
  geom_treemap_subgroup_border(color = "black", size = 1) +
  geom_treemap_text(aes(label = Publisher), place = "centre", grow = TRUE, reflow = TRUE, min.size = 5) +
  scale_fill_manual(values = c("No" = "#8892FC", "Yes" = "#1F78B4")) +
  labs(
    title = "Registro de APC por editoriales en América Latina",
    fill = "APC",
    caption = add_caption()
  ) +
  theme( legend.position = "right" )

p15

Latin American Social Sciences and Humanities journals in DOAJ

p10 <- journal.amlat %>%
  filter(Subjects == "Social Sciences") %>%  
  group_by(language) %>%
  count() %>%
  filter(n >= 5) %>%
  mutate(language = reorder(language, n)) %>%
  ggplot(aes(x = reorder(language, n), y = n, fill = language)) +
  geom_col_interactive(aes(
      tooltip = paste0("Idioma: ", language, "\nCantidad: ", n),
      data_id = language),
    show.legend = FALSE ) +
  theme_bw() +
  theme(legend.position = "none") + 
  scale_fill_viridis_d(option = "A") +
  labs(y = "Cantidad de publicaciones", x = "",
       title = "Revista de 'Social Science' según idiomas de publicación de América Latina",
       caption = add_caption()) +
       theme(plot.title = element_text(hjust = 1, size = 14, face = "bold"),
       axis.text.x = element_text(hjust = 1, size = 12),
       axis.text.y = element_text(hjust = 1, size = 12)
       ) + coord_flip()
girafe(
  ggobj = p10,
  width_svg = 8,
  height_svg = 5,
  options = list(
    opts_hover(css = "fill-opacity: 0.8; stroke: black; stroke-width: 1px;")))
p11 <- journal.amlat %>%
  filter(Subjects %in% social_sc) %>%
  group_by(language) %>%
  count() %>%
  filter(n >= 10) %>%
  mutate(language = reorder(language, n)) %>%
  ggplot(aes(x = reorder(language, n), y = n, fill = language)) +
  geom_col_interactive(aes(
      tooltip = paste0("Idioma: ", language, "\nCantidad: ", n),
      data_id = language),
    show.legend = FALSE ) +
  theme_bw() +
  theme(legend.position = "none") + 
  scale_fill_viridis_d(option = "A") +
  labs(y = "Cantidad de publicaciones", x= "", 
       title = "Revista de Cs. Sociales y Humanidades según idiomas de publicación",
       caption = add_caption()) +
       theme(plot.title = element_text(hjust = 1, size = 14, face = "bold"),
       axis.text.x = element_text(hjust = 1, size = 12),
       axis.text.y = element_text(hjust = 1, size = 12)
       ) + coord_flip()

girafe(
  ggobj = p11,
  width_svg = 8,
  height_svg = 5,
  options = list(
    opts_hover(css = "fill-opacity: 0.8; stroke: black; stroke-width: 1px;")))

Journals by country, broken down by language

p12 <- journal.amlat %>%
  group_by(country, language) %>%
  count() %>%
  ungroup() %>%
  filter(n >= 5 & n <= 410) %>%
  mutate(language = fct_reorder(language, n)) %>%
  plot_ly(x = ~country, y = ~language, z = ~n,
        type = "heatmap", colors = viridisLite::viridis(100, option = "C"), 
        hovertemplate = paste(
      "País: %{x}<br>",
      "Idioma: %{y}<br>",
      "Publicaciones: %{z}<extra></extra>")
    ) %>%
  layout(title = list(
      text = "Distribución según idiomas de publicaciones académicas de América Latina",
      x = 0.5,  
      xanchor = "center"
    ),
    xaxis = list(tickangle = -45, tickfont = list(size = 8)),
    yaxis = list(tickangle = -25, tickfont = list(size = 8)),
    margin = list(l = 100, r = 20, t = 50, b = 50), 
    annotations = list(
      list(
        x = 0.35, y = -0.58, 
        xref = "paper", yref = "paper",
        showarrow = FALSE,
        xanchor = "left",
        yanchor = "top",
        font = list(size = 8),
        text = add_caption()
      )
    )
  )

p12

Use and type of persistent identifiers in Latin America

p13 <- journal.amlat %>%
  filter(
    Subjects %in% social_sc,
    !is.na(Ids), Ids != ""
  ) %>%
  group_by(country, Ids) %>%
  summarise(count = n(), .groups = "drop") %>%
  filter(count >= 2) %>%
  group_by(country) %>%
  mutate(prop = count / sum(count)) %>%
  ggplot(aes(x = country, y = count, fill = Ids)) +
  geom_col_interactive(position = "stack",
    aes(tooltip = paste0("País: ", country,
                         "\nIdentificador: ", Ids,
                         "\nCantidad: ", count,
                         "\nProporción: ",
                         scales::percent(prop, accuracy = 0.1)), 
        data_id = Ids),
    show.legend = FALSE
  ) +
  scale_y_continuous(trans = "log10", labels = scales::comma) +
  labs(
    title = "Proporción de identificadores por país",
    x = "País",
    y = "Cantidad de revistas (escala log10)",
    fill = "Tipo de identificador", caption = add_caption()
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

girafe(
  ggobj = p13,
  width_svg = 9,
  height_svg = 5.5,
  options = list(
    opts_hover(css = "fill-opacity: 1; stroke: black; stroke-width: 1.5px;"),
    opts_hover_inv(css = "opacity: 0.2;") 
  )
)
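
The prop value shown in the tooltips is each identifier type's share within its country, while the bar height is the raw count. A small base R sketch of that within-group share, using made-up counts (ave() plays the role of group_by plus mutate):

```r
# Hypothetical counts of identifier types per country.
ids <- data.frame(
  country = c("Brazil", "Brazil", "Chile", "Chile"),
  Ids     = c("DOI", "ARK, DOI", "DOI", "Handles"),
  count   = c(90, 10, 30, 10)
)

# Share of each identifier type within its own country.
ids$prop <- ids$count / ave(ids$count, ids$country, FUN = sum)
print(ids)
# prop: 0.90, 0.10, 0.75, 0.25
```

The shares add up to 1 within each country, so they can be compared across countries even when the totals differ widely.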

Publishers of journals in Latin America

p18 <- journal.amlat %>%
  group_by(Publisher)%>%
  count()%>%
  filter(n >= 12)%>%
  mutate(Publisher = reorder(Publisher, n)) %>%
  ggplot(aes(x = reorder(Publisher, -n), y = n, fill = Publisher)) +  
  geom_col_interactive(
    position = "stack",
    aes(
      tooltip = paste0("Editorial: ", Publisher,
                       "\nCantidad: ", n),
      data_id = Publisher 
    ),
    show.legend = FALSE
  ) + 
  theme_minimal() +
  scale_fill_scico_d(palette = "berlin", direction = -1) +
  labs(
    title = "Editoriales de las publicaciones en América Latina",
    subtitle = "Restringido a editoriales con al menos 12 revistas",
    y = "Cantidad de revistas",
    x = NULL,
    caption = add_caption()
  ) +
  theme(legend.position = "none",
    plot.title = element_text(face = "bold", size = 14),
    axis.text.y = element_text(size = 9) 
  ) +
  coord_flip()

girafe(
  ggobj = p18,
  width_svg = 9,
  height_svg = 5.5,
  options = list(
    opts_hover(css = "stroke: white; stroke-width: 2px;"),
    opts_hover_inv(css = "opacity: 0.3;")
  )
)

Licences used by Latin American journals in DOAJ

p19 <- journal.amlat %>%
  mutate(License = str_replace_all(License, "'", "")) %>%
  group_by(License) %>%
  count() %>%
  filter(n >= 5) %>%
  mutate(License = reorder(License, n)) %>%
  ggplot(aes(x = reorder(License, n), y = n, fill = License)) +
  geom_col_interactive(
    aes(
      # tooltip
      tooltip = paste0("Licencia: ", License, "\nCantidad: ", n),
      data_id = License
    ),
    show.legend = FALSE
  ) +
  scale_fill_scico_d(palette = "berlin") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, face = "bold"),
    panel.grid.major.x = element_blank(),
    legend.position = "none") +
  coord_flip() +
  labs(
    title = "Licencias utilizadas por todo tipo de publicaciones en América Latina",
    y = "Frecuencias",
    x = "Tipo de licencias",
    caption = add_caption())

girafe(
  ggobj = p19,
  width_svg = 8,
  height_svg = 5,
  options = list(
    opts_hover(css = "fill-opacity: 1; stroke: black; stroke-width: 1.5px;"),
    opts_hover_inv(css = "opacity: 0.2;") 
  )
)

Review processes used by Latin American journals in DOAJ

p20 <- journal.amlat %>%
  group_by(Review)%>%
  count()%>%
  filter(n>=8)%>%
  mutate(Review = fct_reorder(Review, n)) %>%
  ggplot(aes(x = reorder(Review, n), y = n, fill = Review)) +
  geom_col_interactive(
    aes(
      # tooltip
      tooltip = paste0("Proceso de revisión: ", Review, "\nCantidad: ", n),
      data_id = Review),
    show.legend = FALSE
  ) +
  scale_fill_scico_d(palette = "devon") +
  theme_minimal() +
  theme(legend.position = "none") +
  ylab("Frecuencia") +
  xlab("Tipo de revisiones") +
  ggtitle("Proceso de revisión para publicaciones para las revistas de América Latina") +
  coord_flip() +
  labs(caption = add_caption())

girafe(
  ggobj = p20,
  width_svg = 8,
  height_svg = 5,
  options = list(
    opts_hover(css = "fill-opacity: 1; stroke: black; stroke-width: 1.5px;"),
    opts_hover_inv(css = "opacity: 0.2;")))

Breakdown of the general subject areas of Latin American journals in DOAJ

p21 <- journal.amlat %>%
  group_by(Subjects) %>%
  count() %>%
  filter(n > 15) %>%
  #mutate(Subjects = fct_reorder(Subjects, n)) %>%
  ggplot(aes(x = reorder(Subjects, n), y = n, fill = reorder(Subjects, n))) +
  geom_col_interactive(
    aes(
      # tooltip
      tooltip = paste0("Áreas: ", Subjects, "\nCantidad: ", n),
      data_id = Subjects),
    show.legend = FALSE
  ) +
  scale_fill_scico_d(palette = "devon") +
  coord_flip() + 
  theme_minimal() +
  theme(
    legend.position = "none",
    axis.text.y = element_text(size = 10)
  ) +
  ylab("Frecuencia") +
  xlab("Áreas") +
  ggtitle("Áreas generales de publicación en América Latina") +
  labs(caption = add_caption())

girafe(
  ggobj = p21,
  width_svg = 8,
  height_svg = 5,
  options = list(
    opts_hover(css = "fill-opacity: 1; stroke: black; stroke-width: 1.5px;"),
    opts_hover_inv(css = "opacity: 0.2;")
  )
)