Data Management
CSS routinely handles large and complex data files. We have extensive experience in matching data across files. For example, we sometimes need to match physician names and other identifiers across several health plans, or to match plan members to employers when the employees appear in different data files and the employers are not consistently identified. We also have expertise in complex sampling procedures, such as selecting patients to survey based on a hierarchy of priorities that considers date of most recent visit, number of visits, number of visits by other family members, plan PCP assignment, physician specialty, and number of patients already sampled for the same physician.
CSS also routinely receives data from multiple data sources. An example of this type of activity is the work CSS does as a contractor to the U.S. Office of Personnel Management (OPM). All health plans participating in the Federal Employees Health Benefits Program submit their HEDIS/CAHPS raw response-data files to CSS, which processes and analyzes them for OPM to use in preparing plan-choice materials for Federal employees and retirees.
Sampling
In a typical survey, CSS handles data in the sampling process to ensure that the sample is correct for the study purposes and design. CSS randomly draws survey samples from the sampling frame data (lists of eligible patients, members, or providers, for example) provided by the entity that owns the data (government agency, plan, medical group, or hospital, for example). CSS has extensive experience dealing with sample frames from various sources, which themselves vary in data management expertise. CSS works with clients to design explicit instructions to guide those who will be submitting sampling frame data.
Prior to submission of sampling frame data, CSS often provides each submitter a Microsoft Access or Excel program that will import test files and check for errors in the patient records that would eliminate them from being used in the survey. These errors might include formatting errors, missing data, and errors related to base survey rules (age range, assigned PCP, etc.). The software allows each submitter to check its data prior to submission and can serve as a useful tool in identifying any errors.
Upon receiving the sampling frame files, CSS conducts a thorough quality assurance check of each submitter’s data and works with individual submitters, as needed, to correct errors in their data. CSS typically runs reports for each data submitter showing the total record count and the number of records that contain each type of error that would disqualify those records for use in the survey. CSS also provides other counts on the sampling frame file to assist the client in checking that the sampling frame is complete. For a patient survey project, for example, CSS might provide a summary report showing the distribution by month of patients’ last visit, gender, age range, first three digits of ZIP code, payer identifier, and unique doctor identifier number. In addition, CSS conducts visual checks on a dump of records.
If called for in the survey protocol, once the submitted data have been finalized, all individual sample frame members are assigned to a unique household, so that the sampling process can be limited to one survey per household. To increase the accuracy of identifying individual households and to maximize delivery in mail surveys, the name and address information on all submitted patient records is passed through the Postal Service's National Change of Address (NCOA) process prior to sampling. The NCOA process standardizes addresses to improve deliverability and provides new addresses for individuals who have recently moved.
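The per-submitter error report described above (total record count plus the number of records with each disqualifying error type) can be illustrated with a minimal Python sketch. The field names, the five-digit ZIP check, and the age limits are hypothetical stand-ins for whatever a given survey protocol specifies; this is not the Access or Excel program CSS distributes.

```python
from collections import Counter

# Hypothetical field names and rule thresholds; a real protocol defines these.
REQUIRED_FIELDS = ("patient_id", "last_name", "zip", "age", "pcp_id")
MIN_AGE, MAX_AGE = 18, 120


def record_errors(record):
    """Return the list of disqualifying error types found in one record."""
    errors = []
    # Missing data: any required field that is absent or blank.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            errors.append(f"missing_{field}")
    # Formatting error: ZIP code must be exactly five digits (assumed rule).
    zip_code = str(record.get("zip", "")).strip()
    if zip_code and not (len(zip_code) == 5 and zip_code.isdecimal()):
        errors.append("bad_zip_format")
    # Base survey rule: age must be numeric and inside the eligible range.
    age = str(record.get("age", "")).strip()
    if age:
        if not age.isdecimal():
            errors.append("bad_age_format")
        elif not MIN_AGE <= int(age) <= MAX_AGE:
            errors.append("age_out_of_range")
    return errors


def frame_report(records):
    """Summarize total records, usable records, and records per error type."""
    counts = Counter()
    usable = 0
    for record in records:
        errors = record_errors(record)
        if errors:
            counts.update(set(errors))  # count each error type once per record
        else:
            usable += 1
    return {"total": len(records), "usable": usable, "errors": dict(counts)}


frame = [
    {"patient_id": "1", "last_name": "Lee", "zip": "20005", "age": "44", "pcp_id": "D7"},
    {"patient_id": "2", "last_name": "Cruz", "zip": "2000", "age": "51", "pcp_id": "D7"},
    {"patient_id": "3", "last_name": "Shah", "zip": "20910", "age": "12", "pcp_id": ""},
]
report = frame_report(frame)
print(report)
```

Running the same checks before submission and again on receipt gives the submitter and the survey vendor a shared definition of a disqualifying record.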
CSS updates addresses for which the NCOA process provides a forwarding address and (if allowed by the agreed survey protocol) drops from the sampling frame NCOA “nixies” (patients for whom the NCOA process indicates a high probability of a move but does not provide a forwarding address).
From the clean sampling frame, CSS draws a random sample to survey. We follow any eligibility or exclusion requirements as determined by the agreed survey protocol. CSS then prepares demographic reports on the sample, similar to those compiled for the sampling frame. We review these reports as an additional check on sampling and will provide them to the client if requested.
All sample members are assigned unique, confidential tracking numbers that are used to identify the status of each sample member throughout the survey process. A master sample tracking database contains information on such matters as whether mail was returned as undeliverable, whether the surveyed individual indicated that he or she would not like to be surveyed, whether the questionnaire was returned, whether the individual requested an alternative-language survey, whether the individual had a usable phone number after phone lookup, whether the individual could not be interviewed because of a language problem, whether the individual broke off during an Internet survey, and other relevant facts.
Response Data
Returned questionnaires from mail surveys are key-entered and then 100-percent key-verified (double-keyed). All data entry operators are trained regarding the treatment of missing data, multiple marks, and marks that are not placed within a response box. Data-entered files are checked against hard-copy questionnaires early in the key-entry process. Each record shows the name of the initial data entry operator and the key-verification operator so that it is possible to review the work of any operator.
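As an illustration of key verification, the sketch below compares two independent keyings of one questionnaire and lists the items on which they disagree. The text above says only that records are double-keyed; the dictionary layout, the item names, and the idea of reporting mismatches for review against the hard copy are assumptions made for this example.

```python
def key_verify(first_pass, second_pass):
    """Compare two keyings of the same questionnaire.

    Returns a list of (item, first_value, second_value) tuples, one for
    every item on which the two operators' entries differ.
    """
    items = sorted(set(first_pass) | set(second_pass))
    return [
        (item, first_pass.get(item), second_pass.get(item))
        for item in items
        if first_pass.get(item) != second_pass.get(item)
    ]


# Hypothetical keyed values for one questionnaire (blank means no mark).
first = {"q1": "1", "q2": "3", "q3": ""}
second = {"q1": "1", "q2": "2", "q3": ""}
mismatches = key_verify(first, second)
print(mismatches)
```

Each mismatch would then be resolved by checking the hard-copy questionnaire.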
All keyed records include a data entry batch number and a sequence number within the batch so that the hard-copy questionnaire can be found efficiently for any entered record. Responses given to questions that should have been skipped (based on screening questions) are keyed as marked, rather than being forced to blanks. The raw response file is then run through an editing program that enforces skip patterns, creating an edited response file for use in data analysis and response rate calculation. All edit rules specified in the applicable protocol are followed during this "data cleaning" process.
Throughout the entire survey process, the unique, confidential identification number for each sample member record is used to keep track of processing and response status. Each record contains information on such matters as whether mail was returned as undeliverable and whether the questionnaire was returned.
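The skip-pattern editing step described above can be sketched as follows. The rule table, item names, and the choice to blank follow-up items are hypothetical; actual edit rules come from the applicable survey protocol and may treat missing or ambiguous screener answers differently.

```python
# Hypothetical skip rule: for each screener item, the answer that triggers a
# skip and the follow-up items that should then be blank.
SKIP_RULES = {
    "q1_saw_doctor": {"skip_if": "no", "then_blank": ["q2_wait_time", "q3_rating"]},
}


def apply_skip_patterns(raw, rules=SKIP_RULES):
    """Return an edited copy of a raw response with skipped items blanked.

    The raw record is left untouched so that the keyed values remain
    available for audit.
    """
    edited = dict(raw)
    for screener, rule in rules.items():
        if edited.get(screener) == rule["skip_if"]:
            for item in rule["then_blank"]:
                edited[item] = ""
    return edited


# The respondent answered follow-ups that the screener said to skip.
raw = {"q1_saw_doctor": "no", "q2_wait_time": "15", "q3_rating": "8"}
edited = apply_skip_patterns(raw)
print(edited)
```

Keeping the raw and edited files separate mirrors the approach in the text: responses are keyed as marked, and the skip logic is applied only when the edited file is produced.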
CSS usually prepares and delivers to the client a survey response dataset, cleaned of non-usable records and purged, if appropriate, of personal member or patient identifiers. This dataset will include a value for each survey item and for the non-personal identifier fields drawn from the initial sampling frame file. CSS can provide the data in any of various file formats (flat ASCII, dbf, etc.) according to the preferences of the client. CSS provides documentation explaining each data field and data element type. CSS has extensive experience submitting data, on client instructions, to government agencies, data repositories, and other entities in many different formats.
Copyright © 2010 Center for the Study of Services/Consumers' CHECKBOOK magazine. All Rights Reserved.