textmodel_ca documentation #50

Open
vtmiller opened this issue Nov 2, 2021 · 2 comments
Comments

vtmiller commented Nov 2, 2021

This is probably a basic question, but I cannot figure out from the documentation what the "sv" element in the CA output is. Can someone help, please?

Contributor

kbenoit commented Nov 2, 2021

Can you show your output please?

Contributor

kbenoit commented Nov 2, 2021

You probably mean

> library("quanteda")
> library("quanteda.textmodels")
> dfmat <- dfm(tokens(data_corpus_irishbudget2010))
> tmod <- textmodel_ca(dfmat)
> summary(tmod)
           Length Class  Mode     
sv             7  -none- numeric  
nd             1  -none- numeric  
rownames      14  -none- character
rowmass       14  -none- numeric  
rowdist       14  -none- numeric  
rowinertia    14  -none- numeric  
rowcoord      98  -none- numeric  
rowsup         0  -none- logical  
colnames    5141  -none- character
colmass     5141  -none- numeric  
coldist     5141  -none- numeric  
colinertia  5141  -none- numeric  
colcoord   35987  -none- numeric  
colsup         0  -none- logical  
call           2  -none- call 

This is the default summary() method applied to the output, which is an object defined by the ca package.

It looks a bit better when that package is attached:

> library("ca")
> library("quanteda")
> library("quanteda.textmodels")
> dfmat <- dfm(tokens(data_corpus_irishbudget2010))
> tmod <- textmodel_ca(dfmat)
> summary(tmod)

Principal inertias (eigenvalues):

 dim    value      %   cum%   scree plot               
 1      0.185186  19.5  19.5  *****                    
 2      0.139485  14.7  34.1  ****                     
 3      0.134083  14.1  48.2  ****                     
 4      0.127264  13.4  61.6  ***                      
 5      0.125959  13.2  74.9  ***                      
 6      0.121439  12.8  87.6  ***                      
 7      0.117473  12.4 100.0  ***                      
        -------- -----                                 
 Total: 0.950890 100.0                                 


Rows:
     name   mass  qlt  inr    k=1 cor ctr    k=2 cor ctr    k=3 cor ctr     k=4 cor ctr    k=5 cor ctr     k=6 cor ctr     k=7
1  | LBFF |  159 1000   78 |  676 852 392 | -162  49  30 |  -15   0   0 |   -24   1   1 | -119  26  18 |    74  10   7 |  -180
2  | BRFG |   82 1000   76 | -335  84  50 |  403 122  95 |  546 224 181 |   816 501 428 | -133  13  11 |    44   1   1 |  -268
3  | BJLA |  117 1000   80 | -413 162 108 | -905 777 689 |  190  34  32 |    73   5   5 |  118  13  13 |   -92   8   8 |   -15

...

Columns:
               name   mass  qlt  inr     k=1 cor ctr     k=2 cor ctr     k=3 cor ctr     k=4 cor ctr     k=5 cor ctr     k=6 cor
1    |         when |    2 1000    0 |  -379 501   1 |    74  19   0 |  -100  35   0 |  -281 275   1 |  -107  40   0 |  -149  77
2    |            i |    5 1000    1 |   106  44   0 |  -163 105   1 |   175 121   1 |  -359 508   5 |  -109  47   0 |  -190 142
3    |       prsntd |    0 1000    0 |   132  11   0 |    52   2   0 |  -801 401   0 |   -67   3   0 |  -866 468   0 |   271  46

See the ca package documentation for things you can do with the output object. We don't have many post-fitting functions for CA in quanteda.textmodels at the moment.
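To address the original question directly: in objects of the kind the ca package defines, `sv` holds the singular values of the decomposition, and their squares are the principal inertias (the "value" column in the summary above). For the fitted model, `tmod$sv^2` should therefore correspond to that column. The following is a minimal base-R sketch of that relationship, not the package's actual code path. The toy table `N` and all variable names are made up for illustration.

```r
# Toy contingency table (hypothetical counts, for illustration only)
N <- matrix(c(10, 4,  2,
               3, 9,  5,
               1, 6, 12,
               7, 2,  8), nrow = 4, byrow = TRUE)

P        <- N / sum(N)               # correspondence matrix
row_mass <- rowSums(P)               # row masses
col_mass <- colSums(P)               # column masses
E        <- outer(row_mass, col_mass)  # expected proportions under independence

# Standardized residuals; their SVD is the core of correspondence analysis
S <- (P - E) / sqrt(E)

# Singular values; keep min(rows, cols) - 1 of them, since the last is ~0
sv <- svd(S)$d[seq_len(min(dim(N)) - 1)]

inertia       <- sv^2                # principal inertias (eigenvalues)
total_inertia <- sum(inertia)

# Sanity check: total inertia equals the Pearson chi-squared statistic / n
chisq_over_n <- sum((P - E)^2 / E)

print(sv)
print(inertia)
print(all.equal(total_inertia, chisq_over_n))
```

Taking the square root of the first principal inertia in the summary above (0.185186) gives roughly 0.43, which is what the first element of `sv` would be expected to hold.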
