PCCF+ Version 4G
User's Guide
Automated Geographic Coding Based on the
Statistics Canada Postal Code Conversion Files,
Including Postal Codes to October 2005
by
Russell Wilkins
Health Analysis and Measurement Group
Statistics Canada
Ottawa
January 2006
Catalogue no. 82F0086-XDB
h:\pccf4g\msword.pccf4g.doc 2006-01-31

Russell Wilkins. PCCF+ Version 4G User's Guide. Automated Geographic Coding
Based on the Statistics Canada Postal Code Conversion Files, Including
Postal Codes to October 2005. Catalogue 82F0086-XDB. Health Analysis and
Measurement Group, Statistics Canada, Ottawa, January 2006.
ABSTRACT

PCCF+ Version 4 consists of a SAS control program and a series of reference
files derived from the most recent Statistics Canada Postal Code Conversion
File (PCCF) and a 2001 postal code population weight file (WCF). It
automatically assigns a full range of geographic identifiers (down to
dissemination area, block, and latitude and longitude) based on postal codes.
It is consistent and logical in the way it does this. Any incorrect coding
due to errors in the underlying reference files can easily be corrected
once identified. To do such coding by manual methods would require highly
skilled coders with much time and access to the full mailing address or
property description. Even so, the results of manual coding would tend to
be less accurate (particularly in urban areas), and they could
inadvertently introduce systematic bias (especially in rural areas).

As long as the postal codes on the incoming file are valid for the
corresponding addresses, PCCF+ will usually generate highly accurate
geographic coding. Manual geographic coding is no longer required except in
very rare circumstances. Records for most postal codes which serve more
than one dissemination area--including most rural postal codes and several
classes of urban postal codes--are assigned geographic codes based on a
population-weighted random allocation among the possible dissemination
areas and blocks. This produces an unbiased allocation of events in
relation to the resident population. However, because of the nature of the
postal code conversion files, a few classes of valid postal codes cannot be
assigned full geographic identifiers corresponding to a place of residence
or business. In such cases, as well as for postal codes that do not match
exactly to the PCCF or WCF, the first two or three characters of the postal
code are used to try to assign partial geographic identifiers to the extent
possible. This takes care of many situations where the last one, two, or
three characters of the postal code are invalid, but the first two or three
characters are valid. Problem records include full diagnostic and reference
information. Business and institutional addresses are clearly identified,
which facilitates determining if the postal code corresponds to the
client's usual place of residence (or business), or was the result of a
keying or reporting error. An alternate version of the control program is
also provided for better coding of the location of health facilities and
professionals, as opposed to places of residence, where that is desired.

Note: For authorized university research and teaching purposes, PCCF+ is
available under the Data Liberation Initiative (DLI). For general
information on the DLI, including contact persons at each participating
university, see the Statistics Canada website: www.statcan.ca (Learning
resources / Postsecondary / Data Liberation Initiative) [Ressources
éducatives / Niveau postsecondaire / l'initiative de démocratisation des
données]. On the DLI FTP site, the PCCF+ filenames are shown in the
directory -/health/pccf4g-fccp4g. For Statistics Canada internal use, see
//geodepot/Geographie_2001_Geography/Geo_Data_Products-Produits_de_données_Géo/PCCFplus_version4G_oct05/

TABLE OF CONTENTS
Page

Abstract 2

Getting started 5
Introduction 5
Step 1: Getting set up 5
Step 2: Your input file 5
Step 3: The two output files produced 5
Step 4 (optional): Getting appropriate geographic coding for FSAs which
were moved (V1H & V9G) 6

Table 1 Files included in PCCF+ Version 4 7

How the package works 8
Origins and objectives of PCCF+ 8
Objectives 8
Bells and whistles 8
Operational requirements 8
What's new in Version 4G? 9
What was new in Version 4F? 9
What was new in Version 4D? 9
What was new in Version 4A? 9
What was new in Version 3E? 10
What was new in Version 3A? 11
What was new in Version 2? 12
How the reference files were produced 12
What the package does 13
Why it is important to have accurate postal codes 13
How the matching process works 13
How the programs deal with multiple matches 15
How the programs deal with reuse of postal codes 15
How to indicate unknown or partially unknown postal codes 15
How to run PCCF+ 15
Future versions of PCCF+ 16
Verification of geographic coding produced 16

Where to get help 16
Technical assistance 16
Suspected problems with the PCCF 16

Additional reference information 17
Acceptable characters and numbers in Canadian postal codes 17
Filename extensions 17
Abbreviations 17
References 18
Warning and disclaimer 20
Acknowledgements 20

Table 2 Distribution of postal codes and census population by DMT 21
Table 3 Coding errors using PCCF+ vs the PCCF single link indicator (SLI) 21

List of appendices 22
Appendix A. Record layout of the HLTHOUT file 23
Appendix B. Record layout of the GEOPROB file 24
Appendix C. Explanation of fields and codes appearing in the output files and printouts 25
Appendix D. Sample outputs from PCCF+ 37
Appendix E. Census metropolitan areas and census agglomerations 40
Appendix F. Geographic coding from partial postal codes 43
Appendix H. Health regions and health districts, Canada, 2003 47
Appendix J. Census divisions, 2001 58
Appendix K. Economic regions, 2001 61
Appendix L. Agricultural regions (crop districts), 2001 63
Appendix M. Supplementary Program DIST4x.SAS 64
Appendix N. Supplementary Program EXPLOD2.SAS 64

GETTING STARTED

Introduction

To do automated geographic coding based on postal codes using PCCF+, all
you need to do is follow Steps 1, 2 and 3 below. The rest of the
documentation provides supplementary detail and background information
which should be read eventually, but it is not essential to getting
started. A list of abbreviations begins on page 17, the references begin on
page 18, and a list of the appendices can be found on page 22.

If you want to find out what the program does and how it works before
getting started, skip Steps 1-3, and begin reading at the section entitled
Origins and objectives of PCCF+. Then come back to Step 1 when you are
ready to begin coding.

Step 1: Getting set up

The PCCF+ package consists of five SAS control files (the programs) plus
several reference files derived mainly from the Statistics Canada Postal
Code Conversion File (PCCF) and Weighted Conversion File (WCF). To use the
programs, you must first have installed SAS on your mainframe or personal
computer (PC) and copied all of the files shown in Table 1 (on page 7) into
your own directory. For residence coding, edit the program GEORES4x.SAS.
For coding of health facilities or office locations, edit the program
GEOINS4x.SAS.

Step 2: Identifying your input file (with postal codes to be assigned geography)

Your incoming data to be coded will be known to the programs as HLTHDAT.
You must indicate to the program where to find your input file, by
changing the quoted filename shown below to your own incoming filename.ext
at the following line:

    filename HLTHDAT 'c:\pccf4a\sampldat.can'; /* your input file */

Your incoming file can be sorted in any order or unsorted. Each logical
record of the incoming file must contain a unique identifier (ID), plus a
postal code (PCODE) if available. The postal code can have a space or
hyphen between the first 3 characters (FSA) and the last 3 characters
(LDU), or no space. Those fields can be anywhere in the file, but you must
tell SAS where to find them, as in the following example:

    DATA HLTHDAT0; INFILE HLTHDAT MISSOVER;
    INPUT
      @ 5 ID $CHAR8.     /* UNIQUE IDENTIFIER OR REGISTRATION NUMBER */
                         /* IT CAN BE UP TO 12 CHARACTERS IN LENGTH  */
      @ 88 FSA $CHAR3.   /* FSA (ANA)--FIRST 3 CHARACTERS OF PCODE   */
      @ 92 LDU $CHAR3.;  /* LDU (NAN)--LAST 3 CHARACTERS OF PCODE    */
    PCODE=FSA||LDU;      /* POSTAL CODE (ANANAN)                     */

The ID can be numerical, alphabetic or mixed. It can be up to 12 characters
in length, and can be found anywhere in your file, as specified in the
INPUT statement. If ID is more than 12 characters in length, the output
file formatting would have to be modified. Records with the same ID but
different postal codes will each be assigned geographic codes. However, if
the same combination of ID and postal code appears more than once, only one
record for each combination will be retained. The postal code can also be
found anywhere in the