TUS-CPS Frequently Asked Questions

The Tobacco Use Supplement to the Current Population Survey (TUS-CPS) is an NCI-sponsored survey of tobacco use that has been administered as part of the US Census Bureau's Current Population Survey approximately every 3-4 years since 1992-93.

For inquiries or additional information, please contact ncidccpsbrpadvances@mail.nih.gov.

  1. How can I get started analyzing TUS-CPS data? Do you have datasets available for download?

    1. Helpful materials are available from the 2021 webinar series, the 2013 TUS-CPS webinar, and the 2009 User Workshop. These provide an orientation to working with TUS-CPS data.

    2. Datasets are available as .dat files only; Excel data files are not available because of their size. Code to read in the .dat files, called “read-in files,” is available in SAS for all datasets and in Stata and R for select waves (2018-19 and the Harmonized Dataset). Please review each read-in file before use, because its file path must be set to the exact location of the .dat file on the user’s computer.

    3. Stata read-in files are available for download. For the 2018-2019 wave, in lieu of separate programs for each of the three survey fieldings (July 2018, January 2019, and May 2019), a new program was written to create a single Stata format data file.

    4. R read-in files are available for download. For the 2018-2019 wave, there are three separate programs, one for each of the three survey fieldings (July 2018, January 2019, and May 2019).
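
To illustrate what a read-in file does, here is a minimal Python sketch that slices fixed-position text records into named fields. It assumes the .dat file is a fixed-width text file; the variable names and column positions in `LAYOUT`, the sample records, and the file name `example.dat` are all hypothetical. Real positions must come from the record layout in each wave's Technical Documentation, and the official SAS, Stata, and R read-in files remain the reference.

```python
from pathlib import Path
import tempfile

# HYPOTHETICAL layout: variable name -> (start, end) character positions,
# 0-based and end-exclusive. Real positions come from the wave's record layout.
LAYOUT = {
    "record_id": (0, 5),
    "metro_status": (5, 6),
    "weight": (6, 14),
}

def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of stripped string fields."""
    return {name: line[start:end].strip() for name, (start, end) in layout.items()}

def read_dat(path, layout=LAYOUT):
    """Read every non-blank line of a fixed-width file into a list of dicts."""
    with Path(path).open(encoding="ascii") as handle:
        return [parse_record(line.rstrip("\n"), layout) for line in handle if line.strip()]

# Demo: write two made-up records to a temporary file, then read them back.
# In real use, set the path to the exact location of the .dat file on your computer.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "example.dat"
    demo_path.write_text("00001100012345\n00002200006789\n", encoding="ascii")
    records = read_dat(demo_path)

print(records[0])
```

Fields are kept as strings here so that leading zeros and any implied decimal places are not silently altered; converting them is a separate step that depends on the documentation for each variable.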

  2. How can you merge the three months of data collection in a single wave and for several waves?

    1. Materials from the 2021 webinar series, 2013 TUS-CPS webinar, or the 2009 User Workshop provide tips on merging TUS-CPS data.

  3. What data precision restrictions are applied to Census Bureau data, and what confidentiality responsibilities do data users have when using the TUS-CPS Public Use File (PUF) and associated files?

    1. The risk of disclosure of confidential and/or personally identifiable information with the TUS-CPS PUF is low. Prior to releasing the PUF, the Census Bureau modifies the file to ensure that certain variables cannot be used to identify participants. Modifications can include collapsing variable subgroups (e.g., sociodemographic variables) or suppressing values for certain weighted population sizes, labeled as “not identified” (e.g., populations of less than 100,000 people, per U.S. Census Bureau regulations and Title 13, U.S. Code – Protection of Confidential Information).

  4. How have other researchers used the TUS-CPS data?

    1. The Publications Database lists nearly 400 publications and reports using the TUS-CPS.

  5. What is an appropriate citation for the TUS-CPS?

    1. To cite the harmonized data, please use:

      1. National Cancer Institute. (2021). Tobacco Use Supplement to the Current Population Survey Harmonized Data, 1992-2019. cancercontrol.cancer.gov/tus-cps

    2. For each survey wave 2014-15 and later, please use:

      1. US Department of Commerce, Census Bureau (Year of Data Release). National Cancer Institute and Food and Drug Administration co-sponsored Tobacco Use Supplement to the Current Population Survey. Years of Survey. cancercontrol.cancer.gov/tus-cps

    3. For each survey wave 2010-11 and earlier, please use:

      1. US Department of Commerce, Census Bureau (Year of Data Release). National Cancer Institute sponsored Tobacco Use Supplement to the Current Population Survey. Years of Survey. cancercontrol.cancer.gov/tus-cps

        For example, the citation for the 2018-2019 data would be:

        US Department of Commerce, Census Bureau (2020). National Cancer Institute and Food and Drug Administration co-sponsored Tobacco Use Supplement to the Current Population Survey. 2018-2019. cancercontrol.cancer.gov/tus-cps

  1. Why are county-level identifiers missing for some respondents?

    1. Congressional law prohibits the Census Bureau from disclosing geographic information on anyone living in a geographic area with a population of less than 100,000. As a result, the majority of county codes are not identified.

      Geographic data are provided at the state level and at some sub-state levels for specific metropolitan identifiers. The accompanying Technical Documentation provides additional detail about sub-state levels.

  2. Does cigar usage include cigarillos?

    1. Since 2010-11, the TUS-CPS has explicitly stated that the survey is asking about all cigar types and has mentioned “cigarillo” in the question. Respondents who say they currently smoke cigars are asked about cigar type and brand in the 2010-11, 2014-15, 2018-19, and 2022-23 fieldings. While the TUS-CPS asked about all types of cigars prior to 2010-11 (since 1992-93), we can’t say for sure whether everyone who smoked cigarillos counted themselves as smoking cigars.

  3. Does the TUS-CPS contain a variable measuring urbanicity?

    1. The metropolitan status variable is in the geographic identifiers section of the basic CPS record layout. The variable is listed as H_METSTA in the 1992-1993 wave, but GTMETSTA in all other waves.

      The variable has the following codes:
      1 = Metropolitan
      2 = Non-Metropolitan
      3 = Not identified

    2. In the Harmonized data set, the metropolitan status variable is listed as METSTAT.
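
The variable names and codes above can be captured in a small lookup, sketched here in Python. The wave label format (`"1992-1993"`), the record-as-dict representation, and the function names are hypothetical; the sketch also assumes that the Harmonized variable METSTAT uses the same three codes, which should be confirmed against the Harmonized data documentation.

```python
# Codes as listed above for the metropolitan status variable.
METRO_LABELS = {1: "Metropolitan", 2: "Non-Metropolitan", 3: "Not identified"}

def metro_variable_name(wave, harmonized=False):
    """Return the metropolitan status variable name for a wave label.

    `wave` is a hypothetical label such as "1992-1993" or "2018-2019".
    """
    if harmonized:
        return "METSTAT"
    return "H_METSTA" if wave == "1992-1993" else "GTMETSTA"

def metro_label(record, wave, harmonized=False):
    """Look up the label for one record (a dict keyed by variable name)."""
    code = int(record[metro_variable_name(wave, harmonized)])
    # ASSUMPTION: the Harmonized file shares the 1/2/3 coding shown above.
    return METRO_LABELS.get(code, "Unknown code")

# Made-up example record from a post-1993 wave.
example = {"GTMETSTA": "2"}
print(metro_label(example, "2018-2019"))
```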

  4. Are data available showing comparisons between dentists, physicians, or other health professionals in the prevalence of tobacco use screening, counseling, and referral?

    1. The 2003 and 2010-11 questionnaires had items regarding the type of advice that was given (e.g., referral to a quit line or prescription medications).

  5. Are there instructions for merging the TUS-CPS data with the ASEC supplements?

    1. The 2021 webinar series included a detailed look at linking TUS-CPS data to other supplements, including the ASEC. Additionally, the Census Bureau website provides guidance on linking the CPS public use data files here.

      When merging the TUS-CPS and ASEC data, it is a good idea to check for duplicate records before and after the merge. There shouldn’t be more than a few duplicates, if any. If there are more than a few, double-check that there isn’t an error in the code used to merge the files. We exclude the few duplicate records from our merges.
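
The duplicate check described above can be sketched in Python as follows. The key fields (`household_id`, `person_line`), the record values, and the function names are hypothetical; the actual linking identifiers are described in the Census Bureau guidance and the webinar materials. Dropping every record that shares a duplicated key is one possible reading of "exclude the few duplicate records", not a documented rule.

```python
from collections import Counter

# HYPOTHETICAL linking key; use the identifiers given in the linking guidance.
KEY_FIELDS = ("household_id", "person_line")

def key_of(record, key_fields=KEY_FIELDS):
    """Build the linking key for one record."""
    return tuple(record[field] for field in key_fields)

def duplicate_keys(records, key_fields=KEY_FIELDS):
    """Return the set of keys that appear more than once."""
    counts = Counter(key_of(r, key_fields) for r in records)
    return {key for key, n in counts.items() if n > 1}

def merge_excluding_duplicates(left, right, key_fields=KEY_FIELDS):
    """Inner-join two record lists, dropping keys duplicated in either input."""
    bad = duplicate_keys(left, key_fields) | duplicate_keys(right, key_fields)
    right_by_key = {
        key_of(r, key_fields): r for r in right if key_of(r, key_fields) not in bad
    }
    merged = []
    for rec in left:
        key = key_of(rec, key_fields)
        if key in bad or key not in right_by_key:
            continue
        merged.append({**right_by_key[key], **rec})
    return merged, bad

# Made-up records: household "B" appears twice in the first file.
tus = [
    {"household_id": "A", "person_line": 1, "smoker": 1},
    {"household_id": "B", "person_line": 1, "smoker": 0},
    {"household_id": "B", "person_line": 1, "smoker": 1},
]
asec = [
    {"household_id": "A", "person_line": 1, "income": 50000},
    {"household_id": "B", "person_line": 1, "income": 42000},
]

merged, dropped = merge_excluding_duplicates(tus, asec)
print(len(dropped), "duplicated key(s) excluded;", len(merged), "record(s) merged")
# Check again after the merge: a non-empty result would suggest a coding error.
print(duplicate_keys(merged))
```

If the count of excluded keys is more than a handful, that is the signal to revisit the merge code rather than to accept the output.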

  6. Is there documentation regarding use of replicate weights?

    1. Please see the TUS-CPS Technical Documentation for each individual survey wave, also available on the Census FTP site; specifically, see Appendix 16 – Source and Accuracy Statement. The following webinars also describe how replicate weights are derived and how they may be used in analyses: the 2021 webinar series and the 2013 TUS-CPS webinar.
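
As a rough illustration of how replicate weights are typically used, the Python sketch below re-computes an estimate once per replicate weight and scales the sum of squared deviations from the full-sample estimate. The data, the four replicate weight sets, and the function names are made up, and the multiplier `4 / R` is an assumption used only so the demo runs. The correct multiplier and number of replicates for a given wave must be taken from the Source and Accuracy Statement.

```python
def weighted_proportion(values, weights):
    """Weighted mean of 0/1 indicator values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def replicate_variance(full_estimate, replicate_estimates, multiplier):
    """Scaled sum of squared deviations of replicate estimates from the
    full-sample estimate. `multiplier` must come from the documentation."""
    return multiplier * sum((r - full_estimate) ** 2 for r in replicate_estimates)

# Made-up data: a 0/1 outcome for four respondents.
indicator = [1, 0, 1, 0]
full_weights = [10, 10, 10, 10]
replicate_weight_sets = [
    [12, 8, 10, 10],
    [8, 12, 10, 10],
    [10, 10, 12, 8],
    [10, 10, 8, 12],
]

full = weighted_proportion(indicator, full_weights)
reps = [weighted_proportion(indicator, w) for w in replicate_weight_sets]

# ASSUMPTION for illustration only; confirm the factor in the documentation.
ASSUMED_MULTIPLIER = 4 / len(replicate_weight_sets)

variance = replicate_variance(full, reps, ASSUMED_MULTIPLIER)
standard_error = variance ** 0.5
print(full, standard_error)
```

Survey packages in SAS, Stata, and R implement this calculation directly, so a hand-rolled version like this is mainly useful for understanding what those procedures do.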

  7. Are there instructions for creating new weights when linking subsets of TUS data files with subsets of other TUS data files and/or subsets of other CPS Supplements?

    1. The 2021 webinar series included a detailed look at linking TUS-CPS data to other subsets of the CPS.

    2. When reweighting subsets of TUS data files linked with other TUS subsets and/or with subsets of other CPS Supplements, you may wish to refer to NCI’s Overlap Sample Report, which summarizes weighting methodology and overlap sample characteristics. NCI refers to those who responded to both the February 2002 TUS and the February 2003 Tobacco Use Special Cessation Supplement as the overlap sample. The overlap sample responses can be analyzed as a one-year longitudinal study with a representative sample of the U.S., and hence furnish a unique opportunity for data analysis.

  8. Does TUS-CPS adopt any specific data precision criteria?

    1. Neither the TUS-CPS Management Team nor the Census Bureau recommends a specific statistical precision methodology. Data precision refers to limits on statistical precision due to small sample sizes or large relative standard errors (RSE). Publications using certain data sources, such as Health United States or Healthy People 2020, may use varying data precision criteria, such as a denominator of <50 or <30, or an RSE >30%. The TUS-CPS Public Use File should not allow users to access effective sample sizes of <30 because of restrictions related to confidentiality under Title 13, U.S. Code. However, Census modifications for small subgroups apply only to demographic variables and not to tobacco use items. Additionally, due to decreasing response rates, sample sizes in both the CPS and TUS have declined over time. Therefore, there are some infrequent situations in which analyses may result in small effective sample sizes of <30.

  9. What data precision guidelines are recommended for publication of TUS-CPS data?

    1. In general, users may wish to be cautious when working with small sample sizes (n < 30) or interpreting estimates with RSE >30%, but should balance such concerns with research objectives, such as when analyzing certain historically disadvantaged groups (e.g., racial/ethnic minorities), novel tobacco products (e.g., nicotine pouches or heated tobacco products), or other subgroup combinations or data points. In these exceptional cases, users should interpret large confidence intervals and estimates with caution and mark estimates as potentially uncertain in all publications and presentations.

      Users may also consider forgoing a specific RSE cutoff for suppressing proportions derived from survey data and instead using an effective sample size of <30 (based on the Central Limit Theorem) as the precision cutoff.
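
The thresholds mentioned above can be checked with a few lines of Python, sketched below. The FAQ does not say how effective sample size should be computed; this sketch assumes one common definition for a proportion, p(1 - p) / SE², which is the sample size a simple random sample would need to produce the same standard error. The function names and the example numbers are hypothetical, and the cutoffs are parameters rather than fixed rules.

```python
def relative_standard_error(estimate, standard_error):
    """RSE as a percentage of the estimate (estimate must be non-zero)."""
    return 100.0 * standard_error / estimate

def effective_sample_size(proportion, standard_error):
    """ASSUMED definition: p(1 - p) / SE^2, for 0 < p < 1 and SE > 0."""
    return proportion * (1.0 - proportion) / standard_error ** 2

def precision_flags(proportion, standard_error, n, min_n=30, max_rse=30.0):
    """Flag an estimated proportion against the cutoffs discussed above."""
    rse = relative_standard_error(proportion, standard_error)
    n_eff = effective_sample_size(proportion, standard_error)
    return {
        "rse_percent": rse,
        "effective_n": n_eff,
        "small_n": n < min_n,
        "high_rse": rse > max_rse,
        "small_effective_n": n_eff < min_n,
    }

# Made-up estimate: 10% prevalence, design-based SE of 0.04, 120 respondents.
flags = precision_flags(0.10, 0.04, n=120)
for name, value in flags.items():
    print(name, value)
```

In this made-up case the estimate would be flagged on RSE but not on effective sample size, which shows why the choice of criterion matters for low-prevalence outcomes.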

  10. What if I still have questions about TUS-CPS?

    1. The National Cancer Institute (NCI) TUS-CPS team recently compiled responses to questions submitted by registrants for the 2020 Society for Research on Nicotine and Tobacco (SRNT) TUS-CPS Informational Session (despite that Session having been cancelled). Please find all questions and responses in the 2020 Informational Session Questions and Responses document. Please also feel free to email us with additional questions at ncidccpsbrpadvances@mail.nih.gov.

Last Updated
February 01, 2024