Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dm_from_con() with MariaDB #1128

Closed
maelle opened this issue Jun 20, 2022 · 22 comments
Closed

dm_from_con() with MariaDB #1128

maelle opened this issue Jun 20, 2022 · 22 comments

Comments

@maelle
Copy link
Member

maelle commented Jun 20, 2022

I get different table names locally for https://cynkra.github.io/dm/dev/articles/dm.html#creating-a-dm-object

(from PR #1123)

#> ── Table source ───────────────────────────────────────────────────────────
#> src:  mysql  [[email protected]:NA/Financial_ijs]
#> ── Metadata ───────────────────────────────────────────────────────────────
#> Tables: `20news`, `2news`, `abbrev`, `acc`, `ACCOUNT_TRANSACT_TYPES`, … (2231 total)
#> Columns: 19490
#> Primary keys: 395
#> Foreign keys: 407
@maelle

This comment was marked as outdated.

@krlmlr

This comment was marked as outdated.

@krlmlr

This comment was marked as outdated.

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

We can now use the schema argument to limit that, and we should.

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

(Maybe?)

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

Oh: relational.fit was down over the weekend, there is a fallback by @pachadotdev that only has this one table. Could that be the reason?

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

Either way, schema = seems worth a shot.

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

Yes I think that over the weekend you were using the fallback.

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

Who should add the use of the schema argument in the vignette?

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

Can you please rephrase?

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

If the fix is this vignette tweak (using the schema argument), should you or I tweak the vignette? :-)

Related question: should that argument be different based on whether the example is the main one or the fallback one?

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 20, 2022

Please go ahead, I'm not touching this today or tomorrow.

Can we make the examples work the same for both main and fallback?

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

Noting that https://relational.fit.cvut.cz/dataset/Financial does not mention 2231 tables so something is wrong (investigating)

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

it's as if all table variables became one table

@maelle maelle changed the title Did the fin_db database change upstream? dm_from_con() with MariaDB Jun 20, 2022
@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

or actually am I getting all tables from all databases of the provider?

Example with the fallback database

library(RMariaDB)
library(dm)
#> 
#> Attachement du package : 'dm'
#> L'objet suivant est masqué depuis 'package:stats':
#> 
#>     filter

fin_db <- dm:::dbedu_con()

fin_dm <- dm_from_con(fin_db)
#> Keys queried successfully, use `learn_keys = TRUE` to mute this message.

names(fin_dm)
#>   [1] "accounts"                             
#>   [2] "ALL_PLUGINS"                          
#>   [3] "APPLICABLE_ROLES"                     
#>   [4] "cards"                                
#>   [5] "CHARACTER_SETS"                       
#>   [6] "CHECK_CONSTRAINTS"                    
#>   [7] "CLIENT_STATISTICS"                    
#>   [8] "clients"                              
#>   [9] "COLLATION_CHARACTER_SET_APPLICABILITY"
#>  [10] "COLLATIONS"                           
#>  [11] "COLUMN_PRIVILEGES"                    
#>  [12] "COLUMNS"                              
#>  [13] "disps"                                
#>  [14] "districts"                            
#>  [15] "ENABLED_ROLES"                        
#>  [16] "ENGINES"                              
#>  [17] "EVENTS"                               
#>  [18] "FILES"                                
#>  [19] "GEOMETRY_COLUMNS"                     
#>  [20] "GLOBAL_STATUS"                        
#>  [21] "GLOBAL_VARIABLES"                     
#>  [22] "INDEX_STATISTICS"                     
#>  [23] "INNODB_BUFFER_PAGE"                   
#>  [24] "INNODB_BUFFER_PAGE_LRU"               
#>  [25] "INNODB_BUFFER_POOL_STATS"             
#>  [26] "INNODB_CMP"                           
#>  [27] "INNODB_CMP_PER_INDEX"                 
#>  [28] "INNODB_CMP_PER_INDEX_RESET"           
#>  [29] "INNODB_CMP_RESET"                     
#>  [30] "INNODB_CMPMEM"                        
#>  [31] "INNODB_CMPMEM_RESET"                  
#>  [32] "INNODB_FT_BEING_DELETED"              
#>  [33] "INNODB_FT_CONFIG"                     
#>  [34] "INNODB_FT_DEFAULT_STOPWORD"           
#>  [35] "INNODB_FT_DELETED"                    
#>  [36] "INNODB_FT_INDEX_CACHE"                
#>  [37] "INNODB_FT_INDEX_TABLE"                
#>  [38] "INNODB_LOCK_WAITS"                    
#>  [39] "INNODB_LOCKS"                         
#>  [40] "INNODB_METRICS"                       
#>  [41] "INNODB_MUTEXES"                       
#>  [42] "INNODB_SYS_COLUMNS"                   
#>  [43] "INNODB_SYS_DATAFILES"                 
#>  [44] "INNODB_SYS_FIELDS"                    
#>  [45] "INNODB_SYS_FOREIGN"                   
#>  [46] "INNODB_SYS_FOREIGN_COLS"              
#>  [47] "INNODB_SYS_INDEXES"                   
#>  [48] "INNODB_SYS_SEMAPHORE_WAITS"           
#>  [49] "INNODB_SYS_TABLES"                    
#>  [50] "INNODB_SYS_TABLESPACES"               
#>  [51] "INNODB_SYS_TABLESTATS"                
#>  [52] "INNODB_SYS_VIRTUAL"                   
#>  [53] "INNODB_TABLESPACES_ENCRYPTION"        
#>  [54] "INNODB_TABLESPACES_SCRUBBING"         
#>  [55] "INNODB_TRX"                           
#>  [56] "KEY_CACHES"                           
#>  [57] "KEY_COLUMN_USAGE"                     
#>  [58] "KEYWORDS"                             
#>  [59] "loans"                                
#>  [60] "orders"                               
#>  [61] "PARAMETERS"                           
#>  [62] "PARTITIONS"                           
#>  [63] "PLUGINS"                              
#>  [64] "PROCESSLIST"                          
#>  [65] "PROFILING"                            
#>  [66] "REFERENTIAL_CONSTRAINTS"              
#>  [67] "ROUTINES"                             
#>  [68] "SCHEMA_PRIVILEGES"                    
#>  [69] "SCHEMATA"                             
#>  [70] "SESSION_STATUS"                       
#>  [71] "SESSION_VARIABLES"                    
#>  [72] "sj_all_revenue_large"                 
#>  [73] "sj_all_revenue_large_f"               
#>  [74] "sj_all_revenue_medium"                
#>  [75] "sj_all_revenue_medium_f"              
#>  [76] "sj_all_revenue_small"                 
#>  [77] "sj_all_revenue_small_f"               
#>  [78] "sj_all_revenue_xlarge"                
#>  [79] "sj_all_revenue_xlarge_f"              
#>  [80] "sj_all_sessions_large"                
#>  [81] "sj_all_sessions_large_f"              
#>  [82] "sj_all_sessions_medium"               
#>  [83] "sj_all_sessions_medium_f"             
#>  [84] "sj_all_sessions_small"                
#>  [85] "sj_all_sessions_small_f"              
#>  [86] "sj_all_sessions_xlarge"               
#>  [87] "sj_all_sessions_xlarge_f"             
#>  [88] "sj_user_summary_large"                
#>  [89] "sj_user_summary_large_f"              
#>  [90] "sj_user_summary_medium"               
#>  [91] "sj_user_summary_medium_f"             
#>  [92] "sj_user_summary_small"                
#>  [93] "sj_user_summary_small_f"              
#>  [94] "sj_user_summary_xlarge"               
#>  [95] "sj_user_summary_xlarge_f"             
#>  [96] "sj_users_daily_large"                 
#>  [97] "sj_users_daily_large_f"               
#>  [98] "sj_users_daily_medium"                
#>  [99] "sj_users_daily_medium_f"              
#> [100] "sj_users_daily_small"                 
#> [101] "sj_users_daily_small_f"               
#> [102] "sj_users_daily_xlarge"                
#> [103] "sj_users_daily_xlarge_f"              
#> [104] "SPATIAL_REF_SYS"                      
#> [105] "SQL_FUNCTIONS"                        
#> [106] "STATISTICS"                           
#> [107] "SYSTEM_VARIABLES"                     
#> [108] "TABLE_CONSTRAINTS"                    
#> [109] "TABLE_PRIVILEGES"                     
#> [110] "TABLE_STATISTICS"                     
#> [111] "TABLES"                               
#> [112] "TABLESPACES"                          
#> [113] "tkeys"                                
#> [114] "trans"                                
#> [115] "TRIGGERS"                             
#> [116] "USER_PRIVILEGES"                      
#> [117] "USER_STATISTICS"                      
#> [118] "user_variables"                       
#> [119] "VIEWS"

Created on 2022-06-20 by the reprex package (v2.0.1)

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

Noting this does not happen with the dm version that's on main. I'm going to look into the PR more closely.

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

@krlmlr do you agree there's a bug here? 🤔

@maelle
Copy link
Member Author

maelle commented Jun 20, 2022

Currently looking for a Postgres example to see if that's affected too.

@pachadotdev
Copy link
Contributor

Dear @maelle

I have an example of the exact same DB (Financial) also on PostgreSQL and SQL Server. Please have a look at https://databases.pacha.dev/#financial-aka-loan-application.

Best,
Mauricio Vargas S.

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 21, 2022

Thanks! I was referring to the schema argument of dm_from_con() . At some point we'll expose dm_meta() . It's worth making it consistent with ADBC (see apache/arrow-adbc#18), guessing the schema name isn't something that we can or want to cover in this documentation until we expose dm_meta(). That's a bit of a catch-22, unfortunately.

library(RMariaDB)
library(dm)

fin_db <- dm:::dbedu_con()

meta <-
  fin_db %>%
  dm:::dm_meta()

meta$schemata
#> # Source:   SQL [3 x 2]
#> # Database: mysql  [[email protected]:NA/financial]
#>   catalog_name schema_name       
#>   <chr>        <chr>             
#> 1 def          information_schema
#> 2 def          intendo           
#> 3 def          financial

fin_dm <- dm_from_con(fin_db, learn_keys = TRUE, schema = "financial")

names(fin_dm)
#> [1] "accounts"  "cards"     "clients"   "disps"     "districts" "loans"    
#> [7] "orders"    "tkeys"     "trans"

fin_db_relational <- dm:::relational_con()

fin_dm_relational <- dm_from_con(fin_db_relational, learn_keys = TRUE, schema = "financial")

names(fin_dm)
#> [1] "accounts"  "cards"     "clients"   "disps"     "districts" "loans"    
#> [7] "orders"    "tkeys"     "trans"

Created on 2022-06-21 by the reprex package (v2.0.1)

It is true that dm_from_con() now requires the schema argument for RMariaDB. There, "schema" and "database" are synonymous, and filtering by database in the connection could restrict the metadata returned. I'll take a look. Related: #1130.

@krlmlr
Copy link
Collaborator

krlmlr commented Jun 21, 2022

I fixed #1123 so that if dbname is passed to dbConnect() we're only getting tables from that schema/database. Closing, let's discuss in new issues if needed.

@krlmlr krlmlr closed this as completed Jun 21, 2022
@github-actions
Copy link
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

3 participants