The sqpr package gives easy access to the API of the Survey Quality Prediction website, a database that contains over 40,000 predictions of the quality of questions.
Following feedback received at the ESRA 2019 conference, the sqpr package has been split into two packages. The first (sqpr) is responsible only for obtaining data from the SQP API, and the second will implement all the measurement error corrections separately. This way, anyone who has other survey quality information can apply measurement error corrections independently of the SQP software. The second package is still under construction and will be uploaded as soon as it is in working order.
Currently, the sqpr package is not usable. It is being tested against the new SQP 3.0 API, not against the published SQP website.
If you have any questions, please contact us at cimentadaj@gmail.com.
sqpr is not currently on CRAN, but you can install the development version from GitHub with:
    # install.packages("devtools")
    devtools::install_github("sociometricresearch/sqpr")
Register on the SQP website and confirm your registration through your email.
First, load the package in R and provide your registered credentials.
For details on the login process, see the Accessing the SQP 3.0 API vignette from the package.
Once you’ve run sqp_login(), you’re all set to work with the SQP 3.0 API! There is no need to run it again unless you close the R session.
To explore the SQP 3.0 API quickly, get_sqp will be your main function. Assuming you know the study, question, country and language that you’re looking for, you can make a single call to the SQP 3.0 API. Let’s try to get the question tvtot in Round 1 for Spain in Spanish.
    sp <- get_sqp(
      study = "ESS Round 1",
      question_name = "tvtot",
      country = "es",
      lang = "spa"
    )

    sp
    #> # A tibble: 1 x 4
    #>   question reliability validity quality
    #>   <chr>          <dbl>    <dbl>   <dbl>
    #> 1 tvtot          0.731    0.939   0.686
The country and language must be given as two-letter and three-letter codes, respectively. A quick web search will turn up the two-letter code for a country and the three-letter code for a language.
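A quick way to catch a malformed code before calling the API is to check its shape. The following is a minimal base R sketch; `check_codes` is a hypothetical helper that is not part of sqpr, and it only checks the number of letters, not whether the SQP 3.0 database actually knows the code.

```r
# Hypothetical helper (not part of sqpr): checks only the shape of the codes,
# i.e. exactly two letters for the country and three for the language.
check_codes <- function(country, lang) {
  ok_country <- grepl("^[A-Za-z]{2}$", country)
  ok_lang <- grepl("^[A-Za-z]{3}$", lang)
  ok_country && ok_lang
}

check_codes("es", "spa")   # two-letter country, three-letter language
check_codes("esp", "spa")  # three-letter country code is rejected
```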
get_sqp also accepts several question names at once:
    sp <- get_sqp(
      study = "ESS Round 1",
      question_name = c("tvtot", "trstprl", "ppltrst"),
      country = "es",
      lang = "spa"
    )

    sp
    #> # A tibble: 3 x 4
    #>   question reliability validity quality
    #>   <chr>          <dbl>    <dbl>   <dbl>
    #> 1 tvtot          0.731    0.939   0.686
    #> 2 ppltrst        0.724    0.956   0.692
    #> 3 trstprl        0.828    0.902   0.747
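As a reading aid for the three columns, the rounded values printed above are consistent with quality being the product of reliability and validity. The sketch below rebuilds the output as a plain data frame (no API call, values copied by hand) and recomputes the product; `quality_check` is a column name introduced here for illustration only.

```r
# Values copied by hand from the printed output above; no API call is made.
sp <- data.frame(
  question = c("tvtot", "ppltrst", "trstprl"),
  reliability = c(0.731, 0.724, 0.828),
  validity = c(0.939, 0.956, 0.902),
  quality = c(0.686, 0.692, 0.747)
)

# Recompute quality as reliability * validity, rounded to three decimals.
sp$quality_check <- round(sp$reliability * sp$validity, 3)
sp[, c("question", "quality", "quality_check")]
```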
You can also use regular expressions:
    sp <- get_sqp(
      study = "ESS Round 1",
      question_name = "^tv",
      country = "es",
      lang = "spa"
    )

    sp
    #> # A tibble: 2 x 4
    #>   question reliability validity quality
    #>   <chr>          <dbl>    <dbl>   <dbl>
    #> 1 tvtot          0.731    0.939   0.686
    #> 2 tvpol         NA       NA      NA
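To see what a pattern such as `^tv` selects, it can help to try it on a plain character vector first. This is a base R sketch using `grepl()` on question names that appear in this document; it illustrates the regular expression only and is not a description of how get_sqp matches names internally.

```r
# Question names that appear elsewhere in this document.
question_names <- c("tvtot", "tvpol", "trstprl", "ppltrst", "prtvtxx")

# "^tv" anchors the match at the start of the name.
starts_with_tv <- question_names[grepl("^tv", question_names)]
starts_with_tv

# Without the anchor, "tv" matches anywhere in the name.
contains_tv <- question_names[grepl("tv", question_names)]
contains_tv
```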
The previous step assumes you already know which studies and which country/language combinations are available. Alternatively, you can query all the questions in a specific study to check whether a specific question has quality predictions. Use find_studies to check whether your study is in the SQP 3.0 database.
    find_studies("ESS Round 4")
    #> # A tibble: 1 x 2
    #>      id name
    #>   <int> <chr>
    #> 1     4 ESS Round 4
Great! Once we have that, we can use it to find all of its questions with find_questions. find_questions takes the study that you’re looking for and a string that identifies the questions you want. Let’s search for all questions that have tv in the name:
    q_ess <- find_questions("ESS Round 4", "tv")
That might take a while because it’s downloading all of the data to your computer. If you want to know all the questions in that study beforehand, use get_questions("ESS Round 4").
Let’s narrow the results down to the questions asked in a specific language, here Spanish:
    sp_tv <- q_ess[q_ess$language_iso == "spa", ]

    sp_tv
    #> # A tibble: 3 x 5
    #>      id study_id short_name country_iso language_iso
    #>   <int>    <int> <chr>      <chr>       <chr>
    #> 1  7999        4 TvTot      ES          spa
    #> 2 27699        4 TvPol      ES          spa
    #> 3 27638        4 PrtVtxx    ES          spa
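One thing to watch with this kind of subsetting: if a language code is missing, `==` returns `NA` and a base R data frame keeps an all-`NA` row for it. The sketch below uses a plain data frame as a stand-in for q_ess. The first three rows are copied from the output above, while the fourth row is invented purely to show the behaviour and does not come from the SQP database.

```r
# Stand-in for q_ess: rows 1-3 are copied from the output above,
# row 4 is invented to show how a missing language code behaves.
q_ess <- data.frame(
  id = c(7999L, 27699L, 27638L, 1L),
  study_id = c(4L, 4L, 4L, 4L),
  short_name = c("TvTot", "TvPol", "PrtVtxx", "TvTot"),
  country_iso = c("ES", "ES", "ES", NA),
  language_iso = c("spa", "spa", "spa", NA),
  stringsAsFactors = FALSE
)

# Plain `==` carries the NA through and adds an all-NA row to the result.
with_na_row <- q_ess[q_ess$language_iso == "spa", ]
nrow(with_na_row)

# which() keeps only the positions that are TRUE, so the NA row is dropped.
sp_tv <- q_ess[which(q_ess$language_iso == "spa"), ]
nrow(sp_tv)
```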
The hard part is done now. Once we have the id of each question of interest, we pass those ids to get_estimates, which returns the quality predictions for those questions.
    predictions <- get_estimates(sp_tv$id)

    predictions
    #> # A tibble: 3 x 4
    #>   question reliability validity quality
    #>   <chr>          <dbl>    <dbl>   <dbl>
    #> 1 tvtot          0.713    0.926    0.66
    #> 2 tvpol         NA       NA       NA
    #> 3 prtvtxx       NA       NA       NA
get_estimates returns all question names in lower case to increase the chances that they match the variable names in the study’s questionnaire.
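Why lower case helps can be sketched in base R. The `short_name` values are copied from the find_questions output above; `survey_columns` is a hypothetical vector of variable names from a survey data set, invented here for illustration.

```r
# Names as stored in the SQP database (copied from the output above).
short_name <- c("TvTot", "TvPol", "PrtVtxx")

# Hypothetical variable names in a survey data set (assumption for this sketch).
survey_columns <- c("idno", "tvtot", "tvpol", "prtvtxx")

# Matching on the original capitalisation finds nothing.
short_name %in% survey_columns

# Lower-cased names line up with the survey's variable names.
tolower(short_name) %in% survey_columns
```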