Extract the Quality of your Survey Questions with SQP 3.0 • sqpr

sqpr

The sqpr package gives easy access to the API of the Survey Quality Prediction (SQP) website, a database that contains over 40,000 predictions of the quality of survey questions.

Important changes

Following feedback received at the ESRA 2019 conference, the sqpr package has been split into two packages. The first (sqpr) is responsible only for obtaining data from the SQP API; the second will implement all the measurement error corrections separately. This way, anyone who has survey quality information from another source can apply measurement error corrections independently of the SQP software. The second package is still under construction and will be uploaded as soon as it is in working condition.

Currently, the sqpr package is not usable. It is being tested against the new SQP 3.0 API rather than the published SQP website.

If you have any questions, please contact us at cimentadaj@gmail.com.

Installation

sqpr is not currently on CRAN, but you can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("sociometricresearch/sqpr")

Example

Registration and logging in

First, load the package in R and provide your registered credentials.

library(sqpr)
sqp_login('your username', 'your password')

For details on the login process, see the package vignette "Accessing the SQP 3.0 API".

Once you’ve run sqp_login(), you’re all set to work with the SQP 3.0 API! There is no need to run it again unless you close the R session.
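To avoid typing credentials into a script, one option is to read them from environment variables before calling sqp_login(). The sketch below is only an illustration: the helper read_credentials and the variable names SQP_USER and SQP_PASSWORD are hypothetical and not part of sqpr.

```r
# Hypothetical helper (not part of sqpr): read credentials from
# environment variables, e.g. ones defined in ~/.Renviron.
read_credentials <- function(user_var = "SQP_USER", pass_var = "SQP_PASSWORD") {
  user <- Sys.getenv(user_var, unset = "")
  pass <- Sys.getenv(pass_var, unset = "")
  if (!nzchar(user) || !nzchar(pass)) {
    stop("Set ", user_var, " and ", pass_var, " before logging in.", call. = FALSE)
  }
  list(user = user, pass = pass)
}

# Usage, once the variables are set:
# creds <- read_credentials()
# sqp_login(creds$user, creds$pass)
```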

Exploring the SQP 3.0 API

Quick interaction

To explore the SQP 3.0 API quickly, get_sqp will be your main function. Assuming you know the study, question, country and language that you’re looking for, you can get the predictions with a single call to the SQP 3.0 API. Let’s try to get the question tvtot from ESS Round 1 for Spain in Spanish.

sp <-
  get_sqp(
    study = "ESS Round 1",
    question_name = "tvtot",
    country = "es",
    lang = "spa"
  )

sp
#> # A tibble: 1 x 4
#>   question reliability validity quality
#>   <chr>          <dbl>    <dbl>   <dbl>
#> 1 tvtot          0.731    0.939   0.686

The country and language must be given as two- and three-letter codes, respectively. A quick web search will turn up the two-letter code for any country and the three-letter code for any language.

get_sqp also accepts several questions at once:
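A quick sanity check on the shape of the codes can catch mistakes such as passing "esp" as the country before any request is made. This is a minimal sketch: check_codes is a hypothetical helper, not part of sqpr, and it only checks the number of letters, not whether the code actually exists.

```r
# Hypothetical helper (not part of sqpr): verify that the country is a
# two-letter code and the language a three-letter code.
check_codes <- function(country, lang) {
  if (!grepl("^[A-Za-z]{2}$", country)) {
    stop("`country` must be a two-letter code, e.g. \"es\".", call. = FALSE)
  }
  if (!grepl("^[A-Za-z]{3}$", lang)) {
    stop("`lang` must be a three-letter code, e.g. \"spa\".", call. = FALSE)
  }
  invisible(TRUE)
}

check_codes("es", "spa")    # passes silently
# check_codes("esp", "spa") # would stop with an error about `country`
```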

sp <-
  get_sqp(
    study = "ESS Round 1",
    question_name = c("tvtot", "trstprl", "ppltrst"),
    country = "es",
    lang = "spa"
  )

sp
#> # A tibble: 3 x 4
#>   question reliability validity quality
#>   <chr>          <dbl>    <dbl>   <dbl>
#> 1 tvtot          0.731    0.939   0.686
#> 2 ppltrst        0.724    0.956   0.692
#> 3 trstprl        0.828    0.902   0.747
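In the output printed above, each quality value matches the product of reliability and validity to three decimals. The sketch below rebuilds those printed values by hand in a plain data frame and checks this; it is an observation about these numbers, not a description of how the SQP software computes quality internally.

```r
# Values copied from the printed output above.
sp_printed <- data.frame(
  question    = c("tvtot", "ppltrst", "trstprl"),
  reliability = c(0.731, 0.724, 0.828),
  validity    = c(0.939, 0.956, 0.902),
  quality     = c(0.686, 0.692, 0.747),
  stringsAsFactors = FALSE
)

# Product of reliability and validity, rounded like the printed quality.
sp_printed$product <- round(sp_printed$reliability * sp_printed$validity, 3)

# TRUE if every product agrees with the printed quality to three decimals.
matches <- all(abs(sp_printed$product - sp_printed$quality) <= 0.001)
matches
```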

You can also use regular expressions:

sp <-
  get_sqp(
    study = "ESS Round 1",
    question_name = "^tv",
    country = "es",
    lang = "spa"
  )

sp
#> # A tibble: 2 x 4
#>   question reliability validity quality
#>   <chr>          <dbl>    <dbl>   <dbl>
#> 1 tvtot          0.731    0.939   0.686
#> 2 tvpol         NA       NA      NA
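To preview what a pattern will select before querying the API, you can try it on a vector of names with base R. This sketch assumes that get_sqp interprets patterns the way base R regular expressions do, which the source does not state; the vector of names is only an illustration.

```r
# Illustrative question names, all lower case.
question_names <- c("tvtot", "tvpol", "trstprl", "ppltrst", "prtvtxx")

# "^tv" is anchored at the start: only names beginning with "tv" match.
starts_with_tv <- question_names[grepl("^tv", question_names)]
starts_with_tv
#> [1] "tvtot" "tvpol"

# "tv" without the anchor matches anywhere in the name.
contains_tv <- question_names[grepl("tv", question_names)]
contains_tv
#> [1] "tvtot"   "tvpol"   "prtvtxx"
```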

Manual searching

The previous step assumes you already know which studies, countries and languages are available. Alternatively, you can query all the questions in a specific study to check whether a given question has quality predictions. Use find_studies to check whether your study is in the SQP 3.0 database.

find_studies("ESS Round 4")
#> # A tibble: 1 x 2
#>      id name       
#>   <int> <chr>      
#> 1     4 ESS Round 4

Great! Now that we know the study exists, we can find its questions with find_questions. find_questions accepts the name of the study and a string that specifies which questions you’re looking for. Let’s search for all questions that have tv in the name:

q_ess <- find_questions("ESS Round 4", "tv")

That might take a while because it’s downloading all of the data to your computer. If you want to know all the questions in that study beforehand, use get_questions("ESS Round 4").

Let’s narrow the results down to the questions asked in Spanish:

sp_tv <- q_ess[q_ess$language_iso == "spa", ]
sp_tv
#> # A tibble: 3 x 5
#>      id study_id short_name country_iso language_iso
#>   <int>    <int> <chr>      <chr>       <chr>       
#> 1  7999        4 TvTot      ES          spa         
#> 2 27699        4 TvPol      ES          spa         
#> 3 27638        4 PrtVtxx    ES          spa
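Note that PrtVtxx appears in the results even though it does not start with "tv", and that none of the three names contains a lower-case "tv". This suggests that the search string is matched anywhere in the name and without regard to case, although the source does not say so explicitly. The base R sketch below reproduces that behaviour on the three names shown above.

```r
# Short names copied from the printed output above.
short_name <- c("TvTot", "TvPol", "PrtVtxx")

# A case-sensitive search for "tv" finds nothing in these names...
case_sensitive <- grepl("tv", short_name)
case_sensitive
#> [1] FALSE FALSE FALSE

# ...while a case-insensitive search matches all three.
case_insensitive <- grepl("tv", short_name, ignore.case = TRUE)
case_insensitive
#> [1] TRUE TRUE TRUE
```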

The hard part is done now. Once we have the ids of the questions of interest, we pass them to get_estimates, which returns the quality predictions for those questions.

predictions <- get_estimates(sp_tv$id)
predictions
#> # A tibble: 3 x 4
#>   question reliability validity quality
#>   <chr>          <dbl>    <dbl>   <dbl>
#> 1 tvtot          0.713    0.926    0.66
#> 2 tvpol         NA       NA       NA   
#> 3 prtvtxx       NA       NA       NA

get_estimates returns all question names in lower case to increase the chances that they match the variable names in the study's questionnaire.
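As a follow-up, you may want to keep only the questions that actually have predictions and line them up with the variables in your own data. The sketch below rebuilds the printed predictions by hand in a plain data frame; survey_cols is a hypothetical vector of mixed-case column names standing in for your survey data.

```r
# Values copied from the printed output above.
predictions_printed <- data.frame(
  question    = c("tvtot", "tvpol", "prtvtxx"),
  reliability = c(0.713, NA, NA),
  validity    = c(0.926, NA, NA),
  quality     = c(0.66, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only the questions that have a quality prediction.
available <- predictions_printed[!is.na(predictions_printed$quality), ]

# Hypothetical column names from a survey data set, in mixed case.
survey_cols <- c("TvTot", "TvPol", "PrtVtxx")

# Lower-casing the survey names lets them be matched to the predictions.
matched_cols <- survey_cols[match(available$question, tolower(survey_cols))]
matched_cols
#> [1] "TvTot"
```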