CCMDB Data cleaner.mdb general information: Difference between revisions

From CCMDB Wiki
The '''data cleaning tool''' is used by the [[Data Processor]] to flag and fix internal inconsistencies in data already submitted by [[data collectors]].  
 
It is an Access program which is also known as the '''CCMDB Cleaner.mdb''' program.
 
The goal is to move as many of these checks as possible upstream into [[CCMDB.mdb]], but some checks (e.g. those that require TISS data to be available) need to remain in this in-office data cleaner.mdb program.
 
== Progress ==
The Data Cleaning process is a work in progress.
* some checks Julie usually runs have not yet been implemented
* some checks need to be added to the [[CCMDB.mdb]] instead of cleaner (do it sooner)
=== Discussion ===
{{discussion}}
* Trish/Pagasa, which check would you like to work on next? [[User:Ttenbergen|Ttenbergen]] 12:28, 8 December 2009 (CST)
 
== File Location ==
File name is '''CCMDB Cleaner.mdb'''
 
A master copy of the file is on '''X:\Data_cleaning\CCMDB Cleaner.mdb'''
 
For use, Pagasa copies the file to her C:\ drive. The file can grow to >1GB in size during use so working with it on the network would be too slow.
 
== History ==
*The original data checking/cleaning queries were developed in [[SAS]] by Julie, who obtained input from Trish, data collectors or others where needed.
*Tina was developing these checks in Access so that others who are not familiar with SAS programming can run them or add queries when needed.
*data cleaner.mdb was outputting many false positives, and collectors were complaining about the number of items they were being required to check that were not errors.
*The data cleaner.mdb was reviewed by Trish with Pagasa. The problems found, along with the original items in Julie's data cleaning.mdb list, are posted below. The list of checks for this program that was on the X drive was also transcribed to this list.
*A number of the queries that Julie listed for Access had not yet been completed.
*Some of these checks can be included in the Access program that data collectors use before they send their data. These checks should be added there when they are identified and taken out of our Data Cleaner.
*Those checks that require TISS data to be available need to be included in this in-office data cleaner.mdb program.


== Related articles ==
* for a list of articles relating to data cleaner, see: [[:Category: Data Cleaner.mdb | Data Cleaner.mdb Category]]
* for a list of integrity checks, see: [[! Automated Data Integrity Checks]] (this list is being '''thinned out''' into smaller articles which will be placed in the Category Data Cleaner.mdb)






{{stub}}
[[Category: Data Cleaner.mdb | *]]

Revision as of 12:28, 8 December 2009

The data cleaning tool is used by the Data Processor to flag and fix internal inconsistencies in data already submitted by data collectors.

It is an Access program which is also known as the CCMDB Cleaner.mdb program.

The goal is to move as many of these checks as possible upstream into CCMDB.mdb, but some checks (e.g. those that require TISS data to be available) need to remain in this in-office data cleaner.mdb program.

Progress

The Data Cleaning process is a work in progress.

  • some checks Julie usually runs have not yet been implemented
  • some checks need to be added to the CCMDB.mdb instead of cleaner (do it sooner)

Discussion


  • Trish/Pagasa, which check would you like to work on next? Ttenbergen 12:28, 8 December 2009 (CST)

File Location

File name is CCMDB Cleaner.mdb

A master copy of the file is on X:\Data_cleaning\CCMDB Cleaner.mdb

For use, Pagasa copies the file to her C:\ drive. The file can grow to >1GB in size during use so working with it on the network would be too slow.

History

  • The original data checking/cleaning queries were developed in SAS by Julie, who obtained input from Trish, data collectors or others where needed.
  • Tina was developing these checks in Access so that others who are not familiar with SAS programming can run them or add queries when needed.
  • Pagasa was running the queries and distributing the problem IDs by email, fax or phone to data collectors.
  • data cleaner.mdb was outputting many false positives, and collectors were complaining about the number of items they were being required to check that were not errors.
  • The data cleaner.mdb was reviewed by Trish with Pagasa. The problems found, along with the original items in Julie's data cleaning.mdb list, are posted below. The list of checks for this program that was on the X drive was also transcribed to this list.

Related articles