Crown copyright ©
This work is licensed under the Creative Commons Attribution 3.0 New Zealand licence.
You are free to copy, distribute, and adapt the work, as long as you attribute the work to
Statistics NZ and abide by the other licence terms. Please note you may not use any
departmental or governmental emblem, logo, or coat of arms in any way that infringes any
provision of the Flags, Emblems, and Names Protection Act 1981. Use the wording
‘Statistics New Zealand’ in your attribution, not the Statistics NZ logo.
Liability
While all care and diligence has been used in processing, analysing, and extracting data
and information in this publication, Statistics New Zealand gives no warranty it is error free
and will not be liable for any loss or damage suffered by the use directly, or indirectly, of the
information in this publication.
Citation
Statistics New Zealand (2015). IDI Data Dictionary: IR tax data (September 2015 edition).
Available from www.stats.govt.nz.
ISSN 2463-3615 (online)
Published in September 2015 by
Statistics New Zealand
Tatauranga Aotearoa
Wellington, New Zealand
Contact
Statistics New Zealand Information Centre: [email protected]
Phone toll-free 0508 525 525
Phone international +64 4 931 4600
www.stats.govt.nz
Contents
1 Purpose of this data dictionary
2 About the tax data
Coverage
Methodology
Privacy, security, or confidentiality issues
List of datasets
3 Data dictionary for ird_ems
Dataset description
Summary table
Detailed information
4 Data dictionary for ird_addresses
Dataset description
Summary table
Detailed information
5 Data dictionary for ird_customers
Dataset description
Summary table
Detailed information
6 Data dictionary for ird_client_names
Dataset description
Summary table
Detailed information
7 Data dictionary for ird_tax_registrations
Dataset description
Summary table
Detailed information
8 Data dictionary for ird_cross_reference
Dataset description
Summary table
Detailed information
9 Data dictionary for ird_rtns_keypoints_ir3
Dataset description
Summary table
Detailed information
10 Data dictionary for ird_attachments_ir20
Dataset description
Summary table
Detailed information
11 Data dictionary for ird_attachments_ir4s
Dataset description
Summary table
Detailed information
12 Data dictionary for ird_old_systems_numbers
Dataset description
Summary table
Detailed information
13 Glossary
1 Purpose of this data dictionary
IDI Data Dictionary: IR tax data (September 2015 edition) documents the content of the datasets that Inland Revenue (IR) provides to Statistics New Zealand for use in the Integrated Data Infrastructure (IDI). It pulls together a number of existing documents about the IR tax data to create a ‘formalised’ central reference point for users.
This dictionary gives information on the variables contained in the IR tax datasets from April 1999 – including technical information and descriptions.
Use this data dictionary if you are interested in understanding and accessing the IR tax data in the IDI for your research.
2 About the tax data
Coverage
Reference period start: 1 April 1999
Reference period end: ongoing
Geographic coverage: all New Zealand
Methodology
Type of data: administrative data capture
Data collector: Inland Revenue
Frequency of data collection: supplied monthly to the IDI
Privacy, security, or confidentiality issues
In addition to the confidentiality clauses that apply to all data held by Statistics New Zealand, use of IR tax data is governed by the conditions specified in the Memorandum of Understanding between Statistics NZ and Inland Revenue, and by the conditions of the Tax Administration Act 1994.
The IR tax datasets that are accessible to researchers do not contain any name or address information that could identify an individual. All researchers with access to the tax data have had their research proposals assessed using Statistics NZ’s microdata access protocols. Only approved researchers who have been granted access by Statistics NZ and the Inland Revenue Department may view the tax data.
Read Statistics NZ’s microdata access protocols.
All outputs produced from tax data must be aggregated and counts suppressed if the underlying unrounded count is fewer than 6.
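The suppression rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary of counts, and the use of None to mark a suppressed cell are assumptions made for the example, and any rounding required by the microdata access protocols is not shown.

```python
SUPPRESSION_THRESHOLD = 6  # counts fewer than 6 must be suppressed


def suppress_small_counts(counts):
    """Return a copy of counts with values below the threshold replaced by None.

    counts is a hypothetical mapping of group label to unrounded count.
    """
    return {
        group: (n if n >= SUPPRESSION_THRESHOLD else None)
        for group, n in counts.items()
    }


# Hypothetical aggregated counts, for illustration only.
example_counts = {"group_a": 120, "group_b": 5, "group_c": 6}
suppressed = suppress_small_counts(example_counts)
```

A count of exactly 6 is kept, because the rule applies only to counts fewer than 6.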
List of datasets
ird_ems
ird_addresses
ird_customers
ird_client_names
ird_tax_registrations
ird_cross_reference
ird_rtns_keypoints_ir3
ird_attachments_ir20
ird_attachments_ir4s
ird_old_systems_numbers
3 Data dictionary for ird_ems
Dataset description
Contents of dataset: The employee-level data from the EMS return for period dates from 1 April 1999.
Conditions: Active records only, ie ir_ems_return_line_item_code = 'A'.
Exclude records with gross earnings equal to 0.
Note: Employers are able to file late returns and/or amend EMS returns relating to prior periods. This means that in a given period, data may be updated with:
(a) New active data for the latest period – however, do not include records that were created and made inactive within the same period.
(b) New data relating to prior periods – include new data that has been submitted to Inland Revenue, but relates to prior periods.
(c) Revisions relating to prior periods – include changes/revisions to data already supplied.
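The two dataset conditions (active records only, and no zero gross earnings) could be expressed as a row filter along the following lines. This is a sketch under assumptions: rows are represented as plain dictionaries keyed by the IDI variable names, the function name is hypothetical, and the sample rows are invented for illustration.

```python
def meets_ems_conditions(row):
    """Return True if a row satisfies the stated ird_ems dataset conditions."""
    is_active = row["ir_ems_return_line_item_code"] == "A"
    # Assumption: a missing (None) amount is not treated as zero and is kept.
    has_nonzero_earnings = row["ir_ems_gross_earnings_amt"] != 0
    return is_active and has_nonzero_earnings


# Invented rows, for illustration only.
rows = [
    {"ir_ems_return_line_item_code": "A", "ir_ems_gross_earnings_amt": 1500.00},
    {"ir_ems_return_line_item_code": "I", "ir_ems_gross_earnings_amt": 1500.00},
    {"ir_ems_return_line_item_code": "A", "ir_ems_gross_earnings_amt": 0},
]
kept = [row for row in rows if meets_ems_conditions(row)]
```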
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | | N | N | | employee_ird_number
snz_employer_ird_uid | Y | Y | N | | employer_ird_number
ir_ems_employer_location_nbr | Y | Y | 4N | | employer_location_number
ir_ems_return_period_date | Y | Y | Datetime | | return_period_date
ir_ems_line_nbr | Y | Y | 6N | | line_number
ir_ems_snz_unique_nbr | Y | Y | N | |
ir_ems_version_nbr | Y | Y | 6N | | version_number
ir_ems_doc_lodge_prefix_nbr | Y | Y | 1N | | doc_lodge_nbr_prefix
ir_ems_doc_lodge_nbr | Y | Y | 9N | | doc_lodge_nbr
ir_ems_doc_lodge_suffix_nbr | Y | Y | 2N | | doc_lodge_nbr_suffix
ir_ems_gross_earnings_amt | | N | 13.2N | | gross_earnings_amount
ir_ems_gross_earnings_imp_code | | Y | 1A | | gross_earnings_imp_code
ir_ems_paye_deductions_amt | | N | 13.2N | | paye_deductions_amount
ir_ems_paye_imp_ind | | Y | 1A | | paye_imp_ind
ir_ems_earnings_not_liable_amt | | N | 13.2N | | earnings_not_liable_amount
ir_ems_earnings_not_liab_imp_ind | | Y | 1A | | earnings_not_liab_imp_ind
ir_ems_fstc_amt | | N | 13.2N | | ftsc_amount
ir_ems_sl_amt | | N | 13.2N | | sl_amount
ir_ems_withholding_type_code | | Y | 1A | | withholding_type_code
ir_ems_income_source_code | | Y | 3A | | income_source_code
ir_ems_employee_start_date | | N | Datetime | | date_employee_started
ir_ems_employee_end_date | | N | Datetime | | date_employee_finished
ir_ems_lump_sum_ind | | N | 1A | | lump_sum_indicator
ir_ems_tax_code | | Y | 6A | tax_codes | tax_code
ir_ems_return_line_item_code | | Y | 1A | | return_line_item_status_code
ir_ems_processed_date | | Y | Datetime | | date_processed
ir_ems_ird_timestamp_date | | Y | Datetime | | timestamp
ir_ems_enterprise_nbr | | N | 10A | |
ir_ems_pbn_nbr | | N | 10A | |
Detailed information
______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ems_employer_location_nbr
Definition: A location number is a sequence number that distinguishes between the associated locations a customer may have that carry return filing obligations.
Format: Numeric, 9N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_return_period_date
Definition: Period covered by the return.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_line_nbr
Definition: A line item number is a sequence number used to identify the different line items on a return attachment eg it is incremented from 1 by 1 for each line item.
Format: Numeric, 6N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_version_nbr
Definition: A version number is a means of distinguishing one version of a return attachment line item from another. The version number is initialised at zero, then incremented by 1 each time the record is changed.
Format: Numeric, 6N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_doc_lodge_prefix_nbr
Definition: The prefix of the document lodgement number under which this schedule (or EMS) was filed. A prefix of 3 indicates a manual return, a prefix of 8 indicates an e-filed return.
Format: Numeric, 1N
Name of classification:
Notes:
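The two prefix values named in the definition can be decoded with a small lookup. This is a sketch only: the mapping name, the function name, and the "unknown" fallback for any other prefix are assumptions, since the definition lists only the values 3 and 8.

```python
# Prefix values taken from the definition above; other values are not documented here.
FILING_METHOD_BY_PREFIX = {
    3: "manual return",
    8: "e-filed return",
}


def filing_method(prefix_nbr):
    """Describe how a return was filed, given ir_ems_doc_lodge_prefix_nbr."""
    return FILING_METHOD_BY_PREFIX.get(prefix_nbr, "unknown")
```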
_______________________________________
Variable name: ir_ems_doc_lodge_nbr
Definition: The document lodgement number (DLN) is a unique number assigned to documents or returns lodged.
Format: Numeric, 9N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_doc_lodge_suffix_nbr
Definition: Suffix to document lodgement number.
Format: Numeric, 2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_gross_earnings_amt
Definition: Total earnings before tax deducted. The gross earnings paid to the employee. The EMS may include more than one line item entry.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_gross_earnings_imp_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_paye_deductions_amt
Definition: Total income tax deductions.
Format: Numeric, 13.2N
Name of classification:
Notes: This includes withholding payments
_______________________________________
Variable name: ir_ems_paye_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_earnings_not_liable_amt
Definition: Income not liable for ACC earner premium.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_earnings_not_liab_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_fstc_amt
Definition: Family Support Tax Credit – the amount of family support paid to each WINZ beneficiary for the line item. This column only applies to NZISS customers. FSTC is on DWI (WINZ) EMS schedules only as DWI are the only (beneficiary) ‘employer’ to fill in this column, so it does not appear on the standard EMS form.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_sl_amt
Definition: Student loan repayments – student loan deduction amount for the line item. The amount is always displayed as a negative number. The student loan amount is then subtracted from the total student loan.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_withholding_type_code
Definition: P for PAYE deductions, W for withholding tax deductions.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_income_source_code
Definition: Code representing the source of income.
Format: Character, 3A
Name of classification: W&S – wages and salary, WHP – withholding payment, BEN – benefits, STU – Student Allowance, PPL – Paid Parental Leave, PEN – Pensions (superannuation), CLM – Claimants Compensation.
Notes:
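For analysis, the income source codes listed above can be translated to their descriptions with a lookup table. The sketch below copies the codes and labels from this entry; the dictionary name, the function name, and the whitespace stripping are assumptions made for illustration.

```python
# Codes and descriptions as listed for ir_ems_income_source_code.
INCOME_SOURCE_LABELS = {
    "W&S": "Wages and salary",
    "WHP": "Withholding payment",
    "BEN": "Benefits",
    "STU": "Student Allowance",
    "PPL": "Paid Parental Leave",
    "PEN": "Pensions (superannuation)",
    "CLM": "Claimants Compensation",
}


def describe_income_source(code):
    """Return the description for an income source code, or None if unlisted."""
    # Assumption: stored values may carry surrounding whitespace.
    return INCOME_SOURCE_LABELS.get(code.strip())
```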
_______________________________________
Variable name: ir_ems_employee_start_date
Definition: Start date of the employee. Is entered by the employer on the EMS.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_employee_end_date
Definition: End date of the employee. Is entered by the employer on the EMS.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_lump_sum_ind
Definition: Flag to indicate a lump sum payment.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_tax_code
Definition: Tax code of employee. The tax code at which deductions have been made for the employee for this line item number eg 'M' main source of income. Only one job can have this code at any one time.
Format: Character, 6A
Name of classification: tax_codes
Notes:
_______________________________________
Variable name: ir_ems_return_line_item_code
Definition: Status code. A code is an abbreviation for a return line item status. Status values are 'A' active or 'I' inactive.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_processed_date
Definition: Process date.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
Variable name: ir_ems_enterprise_nbr
Definition: A unique identifier generated by Statistics NZ for an enterprise. An enterprise is an institutional unit and generally corresponds to legal entities operating in New Zealand. It can be a company, partnership, trust, estate, incorporated society, producer board, local or central government organisation, voluntary organisation, or self-employed individual.
Format: 10A
Name of classification:
Notes:
_______________________________________
Variable name: ir_ems_pbn_nbr
Definition: Permanent Business Number. 10-character code, consisting of 'PB' prefix, followed by a unique 8-digit number. This is a Statistics NZ generated construct for a geographically located business unit.
Format: 10A
Name of classification:
Notes:
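The stated layout of a Permanent Business Number ('PB' followed by 8 digits, 10 characters in total) can be checked with a regular expression. This is a minimal sketch; the function name is hypothetical, and the check covers the format only, not whether the number has actually been issued.

```python
import re

# 'PB' prefix followed by exactly eight ASCII digits.
PBN_PATTERN = re.compile(r"PB[0-9]{8}")


def is_valid_pbn_format(value):
    """Return True if value has the documented PBN layout."""
    return PBN_PATTERN.fullmatch(value) is not None
```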
4 Data dictionary for ird_addresses
Dataset description
Contents of dataset: This table contains geocoded address information for an individual.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_apc_location_nbr | Y | Y | 4N | | location_number
ir_apc_address_type_code | Y | Y | 1A | address_types | address_type
ir_apc_snz_unique_nbr | | N | N | |
ir_apc_applied_date | | Y | Datetime | | date_applied
ir_apc_tax_type_code | Y | Y | 3A | tax_types | tax_type
ir_apc_main_address_ind | Y | Y | 1A | | main_address_indicator
ir_apc_post_code | | N | 6A | | post_code
ir_apc_address_status_code | | N | 1A | address_status | address_status
ir_apc_ceased_date | | N | Datetime | | date_ceased
ir_apc_ird_timestamp_date | | Y | Datetime | | timestamp
ir_apc_region_code | | N | 2A | |
ir_apc_ta_code | | N | 3A | |
ir_apc_meshblock_code | | N | 7A | |
ir_apc_meshblock_imputed_ind | | N | 1A | |
snz_idi_address_register_uid | | N | N | |
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: 7N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_address_type_code
Definition: Type of address a client may have, eg 'L' - Physical Location Address, 'P' - Postal address, 'R' - Registered Office, 'S' - Specific address, etc.
Format: Character, 1A
Name of classification: address_types
Notes:
_______________________________________
Variable name: ir_apc_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_applied_date
Definition: Date from which record became valid
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_tax_type_code
Definition: Tax type code.
Format: Character, 3A
Name of classification: tax_types
Notes:
_______________________________________
Variable name: ir_apc_main_address_ind
Definition: Y/N indicator that denotes whether the address is the client's main address. A client may have more than one main address.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_post_code
Definition: A numeric code assigned by NZ Post to an area within New Zealand and used for the delivery of mail.
Format: Character, 6A
Name of classification:
Notes: A post code is available for approximately 90 percent of records.
_______________________________________
Variable name: ir_apc_address_status_code
Definition: Current address status of customer, eg 'D' return to district office, 'I' invalid address, 'O' overseas address, 'V' valid address etc
Format: Character, 1A
Name of classification: address_status
Notes:
_______________________________________
Variable name: ir_apc_ceased_date
Definition: Date from which record ceased to be valid
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
Variable name: ir_apc_region_code
Definition:
Format: 2A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_ta_code
Definition:
Format: 3A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_meshblock_code
Definition: A seven-digit meshblock number. The meshblock is the lowest level of a customer's geographic location.
Format: 7A
Name of classification:
Notes:
_______________________________________
Variable name: ir_apc_meshblock_imputed_ind
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: snz_idi_address_register_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
5 Data dictionary for ird_customers
Dataset description
Contents of dataset: This table holds birth_month, birth_year, and entity_type.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_cus_snz_unique_nbr | Y | Y | N | |
ir_cus_location_nbr | Y | Y | 4N | | location_number
ir_cus_entity_type_code | | Y | 1A | entity_types | entity_type
ir_cus_entity_class_code | | Y | 2A | entity_classes | entity_class
ir_cus_client_status_code | | Y | 1A | client_status | client_status
ir_cus_applied_date | | N | Datetime | | date_applied
ir_cus_ceased_date | | N | Datetime | | date_ceased
ir_cus_birth_year_nbr | | N | 4N | | date_of_birth
ir_cus_birth_month_nbr | | N | 2N | | date_of_birth
ir_cus_org_commencement_date | | N | Datetime | | org_commencement_date
ir_cus_loan_indicator_code | | N | 1A | | loan_indicator
ir_cus_resident_indicator_code | | N | 1A | | resident_indicator
ir_cus_sic_code | | N | 8A | sic_codes | sic_code
Detailed information
_______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_location_nbr
Definition: Location number of taxpayer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_entity_type_code
Definition: Type of entity eg C = company, M = Māori authority, P = partnership, I = individual etc.
Format: Character, 1A
Name of classification: entity_types
Notes:
_______________________________________
Variable name: ir_cus_entity_class_code
Definition: Class of entity eg BS = Building Society, UT = unit trust, SW = salary or wages etc.
Format: Character, 2A
Name of classification: entity_classes
Notes:
_______________________________________
Variable name: ir_cus_client_status_code
Definition: Status of the client eg C = ceased, B = bankrupt, A = active, L = liquidation, R = receivership, M = amalgamated company, S = struck off, U = undischarged bankrupt.
Format: Character, 1A
Name of classification: client_status
Notes: e.g. active/bankrupt/ceased active
_______________________________________
Variable name: ir_cus_applied_date
Definition: Date from which the record became active.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_ceased_date
Definition: Date from which the record became inactive.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_birth_year_nbr
Definition:
Format: 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_birth_month_nbr
Definition:
Format: 2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_org_commencement_date
Definition: Commencement date for any entity other than an individual, ie company, partnership, trust etc. Loan transfer date.
Format: Datetime, yyyymmdd
Name of classification:
Notes: May be set to 1/1/1970 if unknown.
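Because 1/1/1970 may stand in for an unknown commencement date, it can be useful to convert that value to a missing value before analysis. The sketch below shows one way to do so; the function and constant names are hypothetical, and note the trade-off that a genuine commencement date of 1 January 1970 would also be treated as unknown.

```python
from datetime import date, datetime

# Placeholder value documented for ir_cus_org_commencement_date.
UNKNOWN_COMMENCEMENT = date(1970, 1, 1)


def clean_commencement_date(value):
    """Return None for the 1/1/1970 placeholder, otherwise the value unchanged."""
    if value is None:
        return None
    # A datetime is compared on its calendar date only; the time part is ignored.
    day = value.date() if isinstance(value, datetime) else value
    return None if day == UNKNOWN_COMMENCEMENT else value
```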
_______________________________________
Variable name: ir_cus_loan_indicator_code
Definition: ‘Y’ indicates presence of student loan for individuals.
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_resident_indicator_code
Definition: NZ resident / non-resident for tax purposes (R/N)
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cus_sic_code
Definition: Industry Code, eg 511010 = supermarkets, 523100 = furniture retailing.
Format: Character, 8A
Name of classification: sic_codes
Notes:
_______________________________________
6 Data dictionary for ird_client_names
Dataset description
Contents of dataset: This table holds sex and client status information.
Summary table
IDI variable name | Primary key | Mandatory | Format | Classification name | Source variable name
snz_uid | Y | Y | N | |
snz_ird_uid | Y | Y | N | | ird_number
ir_cli_snz_unique_nbr | | Y | N | |
ir_cli_location_nbr | Y | N | 4N | | location_number
ir_cli_client_name_type_code | Y | N | 2A | client_name_type | client_name_type
ir_cli_sequence_nbr | Y | N | 3N | | sequence_number
ir_cli_applied_date | Y | N | Datetime | | date_applied
ir_cli_sex_snz_code | | N | 1A | |
ir_cli_sex_imp_code | | Y | 1A | |
ir_cli_ceased_date | | N | Datetime | | date_ceased
ir_cli_ird_timestamp_date | | N | Datetime | | timestamp
Detailed information
_________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_cli_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_client_name_type_code
Definition: A code denoting the client name type eg P = preferred name, S = secondary name etc.
Format: Character, 2A
Name of classification: client_name_type
Notes:
_______________________________________
Variable name: ir_cli_sequence_nbr
Definition: The sequence number is the numeric code given to each of a client's names within the combination of IRD number, location number, and client name type. It is not a serial number; it duplicates the code in the client name type entity. There is a 1:1 relationship between client name number and client name type code: 10 = P, 20 = S, 30 = A, 40 = C, 50 = T.
Format: Numeric, 3N
Name of classification:
Notes:
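The 1:1 relationship stated in the ir_cli_sequence_nbr definition can be checked with a small lookup. The following is a minimal Python sketch only: the dictionary and function names are hypothetical, the trimming of padded codes is an assumption, and just the number-to-code pairs come from the definition.

```python
# Number-to-code pairs copied from the ir_cli_sequence_nbr definition.
# The names below are hypothetical and not part of the IDI.
SEQUENCE_TO_NAME_TYPE = {10: "P", 20: "S", 30: "A", 40: "C", 50: "T"}


def is_consistent(sequence_nbr, name_type_code):
    """Return True when the sequence number matches its expected name type code.

    Assumption: the 2A code may carry padding or lower case, so it is
    trimmed and upper-cased before comparison.
    """
    expected = SEQUENCE_TO_NAME_TYPE.get(sequence_nbr)
    if expected is None:
        return False
    return expected == name_type_code.strip().upper()
```

A check like this could flag rows where ir_cli_sequence_nbr and ir_cli_client_name_type_code disagree.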
_______________________________________
Variable name: ir_cli_applied_date
Definition: Date from which the record became valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_sex_snz_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_sex_imp_code
Definition:
Format: 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_cli_ceased_date
Definition: Date from which the record became invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes: A record may cease because of, eg, a new name or death.
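Together, ir_cli_applied_date and ir_cli_ceased_date bound the period in which a record is valid. The Python sketch below shows one way to test a date against that period. It assumes the dates arrive as yyyymmdd strings, that a missing ceased date means the record is still valid, and that the ceased date is the first invalid day; the function names are hypothetical.

```python
from datetime import date, datetime


def parse_yyyymmdd(text):
    """Parse a yyyymmdd string; return None for an empty or missing value."""
    if not text:
        return None
    return datetime.strptime(text, "%Y%m%d").date()


def is_valid_on(applied, ceased, on_day):
    """Check whether a record is valid on on_day.

    Assumption: a record is valid from the applied date up to, but not
    including, the ceased date; a missing ceased date means still valid.
    """
    start = parse_yyyymmdd(applied)
    end = parse_yyyymmdd(ceased)
    if start is None or on_day < start:
        return False
    return end is None or on_day < end


# Usage with made-up dates: check a record on 1 January 2011.
in_force = is_valid_on("20100401", "20120331", date(2011, 1, 1))
```

If the ceased date should instead count as the last valid day, the final comparison would become `on_day <= end`.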
_______________________________________
Variable name: ir_cli_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_________________________________________
7 Data dictionary for ird_tax_registrations
Dataset description
Contents of dataset: This table holds information about tax types.
Summary table
IDI variable name   Primary key   Mandatory   Format   Classification name   Source variable name
snz_uid Y Y N
snz_ird_uid Y Y N ird_number
ir_treg_location_nbr Y Y 4N location_number
ir_treg_tax_type_code Y Y 3A tax_types tax_type
ir_treg_applied_date Y Y Datetime date_applied
ir_treg_snz_unique_nbr Y Y N ir_treg_snz_unique_nbr
ir_treg_treg_start_date Y Y Datetime treg_date_start
ir_treg_treg_end_date Y Datetime treg_date_end
ir_treg_filing_frequency_code N 2A tax_filing_frequency filing_frequency
ir_treg_treg_status_code N 1A tax_reg_status treg_status
ir_treg_ceased_date N Datetime date_ceased
ir_treg_posting_ind_code N 1A posting_indicators posting_ind
ir_treg_electronic_filing_ind N 1A electronic_filing_ind
ir_treg_corporate_filing_ind N 1A corporate_filing_ind
ir_treg_has_agent_ind Y 1A has_agent_ind
ir_treg_ird_timestamp_date Y Datetime timestamp
Detailed information _________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_location_nbr
Definition: Location number of the EMS filer (payroll system)
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_tax_type_code
Definition: Tax type
Format: Character, 3A
Name of classification: tax_types
_______________________________________
Variable name: ir_treg_applied_date
Definition: Date from which the record became active.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_treg_start_date
Definition: Date the client first registered for a particular tax type
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_treg_end_date
Definition: Date the client deregistered for a particular tax type.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_filing_frequency_code
Definition: Filing frequency code, eg D = twice monthly, Q = quarterly, I = irregularly.
Format: Character, 2A
Name of classification: tax_filing_frequency
Notes:
_______________________________________
Variable name: ir_treg_treg_status_code
Definition: Registration status: active or ceased; 'X' = unknown.
Format: Character, 1A
Name of classification: tax_reg_status
Notes:
_______________________________________
Variable name: ir_treg_ceased_date
Definition: Date from which the record became invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes: A record may cease because of, eg, a new name or death.
_______________________________________
Variable name: ir_treg_posting_ind_code
Definition: Distinguishes the type of address eg P = postal, Q = liquidator, A = agent etc.
Format: Character, 1A
Name of classification: posting_indicators
Notes:
_______________________________________
Variable name: ir_treg_electronic_filing_ind
Definition: 'Y' if an electronic filer, 'N' if paper filer
Format: Character, 1A
Name of classification:
Notes:
_______________________________________
Variable name: ir_treg_corporate_filing_ind
Definition: Indicates whether the customer is part of a corporate filing group.
Format: Character, 1A
Name of classification:
Notes: 'N' = not part of a group, 'P' = parent, 'S' = subsidiary
_______________________________________
Variable name: ir_treg_has_agent_ind
Definition: Indicates whether a tax agent acts on behalf of the customer.
Format: Character, 1A
Name of classification:
Notes: 'Y' = yes, 'N' = no.
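The three indicator variables above carry single-character codes, and a small lookup makes them easier to read in output. The Python sketch below uses only the code meanings given in these entries. The function and dictionary names are hypothetical, and treating any unlisted or missing code as "unknown" is an assumption.

```python
# Code meanings copied from the dictionary entries for the three indicators.
# All names here are hypothetical and not part of the IDI.
ELECTRONIC_FILING = {"Y": "electronic filer", "N": "paper filer"}
CORPORATE_FILING = {"N": "not part of a group", "P": "parent", "S": "subsidiary"}
HAS_AGENT = {"Y": "yes", "N": "no"}

LOOKUPS = {
    "ir_treg_electronic_filing_ind": ELECTRONIC_FILING,
    "ir_treg_corporate_filing_ind": CORPORATE_FILING,
    "ir_treg_has_agent_ind": HAS_AGENT,
}


def describe_registration(row):
    """Translate the indicator codes in a row (a dict) into readable labels.

    Assumption: codes that are missing or not listed map to "unknown".
    """
    return {
        column: labels.get(row.get(column), "unknown")
        for column, labels in LOOKUPS.items()
    }
```

For example, a row with 'Y', 'P', and 'N' in the three columns would be described as an electronic filer, a parent, and having no agent.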
_______________________________________
Variable name: ir_treg_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_________________________________________
8 Data dictionary for ird_cross_reference
Dataset description
Contents of dataset: This table is maintained by IR and holds information about the set of relationships between two IRD numbers. Most of the information on this table is found when the annual returns are processed.
Because most of the other information also comes from annual returns, it can be validated only when those returns are processed, and validation does not always occur.
Summary table
IDI variable name   Primary key   Mandatory   Format   Classification name   Source variable name
snz_uid Y Y N
ir_xrf_from_snz_ird_uid Y Y N ird_number_from
ir_xrf_to_snz_ird_uid Y Y N ird_number_to
ir_xrf_applied_date Y Y Datetime date_applied
ir_xrf_ceased_date N Datetime date_ceased
ir_xrf_reference_type_code Y Y 3A cross_reference_types reference_type
ir_xrf_first_year_nbr N 4N first_year
ir_xrf_latest_year_nbr Y Y 4N latest_year
ir_xrf_ird_timestamp_date Y Datetime timestamp
Detailed information _________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_xrf_from_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_xrf_to_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_applied_date
Definition: Date from which record is valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_ceased_date
Definition: Date from which record is invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_xrf_reference_type_code
Definition:
AAC Amalgd/Amalging Co
ASS Associated Person
BAN Bankrupt
BEN Beneficiary
DEC Deceased
DEP Dependent
DIR Director
DUP Duplicate IRD No
EOH Exec Office Holder
GPR GENERAL PARTNER
IGN NOMINATED ICA CO
JVT Joint Venture
LPR LIMITED PARTNER
LQR LIQUIDATOR
LTI LOOK-THROUGH INT
LTO LOOK THROUGH OWNER
NOM Nominated Company
NOP NOMINEE
NOR NOMINATOR
NRC NON RES CHLD SUPPT
PTR Partner
SHR Shareholder
SPO Spouse/Defacto
SUB Subsidiary Company
TEE Trustee
TRA TRANSITIONAL CLIEN
VAD VOLUNTARY ADMINIST
Format: Character, 3A
Name of classification: cross_reference_types
Notes: eg shareholder/partner/bankrupt
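The reference type codes listed in this entry can be held in a lookup and used to select relationships of one kind. The Python sketch below is illustrative only. The labels are copied as they appear in the list, including those that are abbreviated or truncated. The function name is hypothetical, rows are assumed to be dictionaries keyed by IDI variable name, and no meaning is assumed for the direction of the "from" and "to" identifiers.

```python
# Codes and labels copied as listed for ir_xrf_reference_type_code.
CROSS_REFERENCE_TYPES = {
    "AAC": "Amalgd/Amalging Co",
    "ASS": "Associated Person",
    "BAN": "Bankrupt",
    "BEN": "Beneficiary",
    "DEC": "Deceased",
    "DEP": "Dependent",
    "DIR": "Director",
    "DUP": "Duplicate IRD No",
    "EOH": "Exec Office Holder",
    "GPR": "GENERAL PARTNER",
    "IGN": "NOMINATED ICA CO",
    "JVT": "Joint Venture",
    "LPR": "LIMITED PARTNER",
    "LQR": "LIQUIDATOR",
    "LTI": "LOOK-THROUGH INT",
    "LTO": "LOOK THROUGH OWNER",
    "NOM": "Nominated Company",
    "NOP": "NOMINEE",
    "NOR": "NOMINATOR",
    "NRC": "NON RES CHLD SUPPT",
    "PTR": "Partner",
    "SHR": "Shareholder",
    "SPO": "Spouse/Defacto",
    "SUB": "Subsidiary Company",
    "TEE": "Trustee",
    "TRA": "TRANSITIONAL CLIEN",
    "VAD": "VOLUNTARY ADMINIST",
}


def related_ids(rows, from_uid, reference_type):
    """Return the 'to' identifiers linked to from_uid by one reference type.

    Hypothetical helper: rows is an iterable of dicts keyed by IDI
    variable name. Unknown codes raise ValueError rather than silently
    returning nothing.
    """
    if reference_type not in CROSS_REFERENCE_TYPES:
        raise ValueError(f"unknown reference type: {reference_type!r}")
    return [
        row["ir_xrf_to_snz_ird_uid"]
        for row in rows
        if row["ir_xrf_from_snz_ird_uid"] == from_uid
        and row["ir_xrf_reference_type_code"] == reference_type
    ]
```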
_______________________________________
Variable name: ir_xrf_first_year_nbr
Definition: First year of the cross reference relationship.
Format: Numeric, 4N
Name of classification:
Notes: eg shareholder/partner/bankrupt
_______________________________________
Variable name: ir_xrf_latest_year_nbr
Definition: Latest year of the cross reference relationship.
Format: Numeric, 4N
Name of classification:
Notes: eg shareholder/partner/bankrupt
_______________________________________
Variable name: ir_xrf_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
9 Data dictionary for ird_rtns_keypoints_ir3
Dataset description
Contents of dataset: This table contains information for the active items which have non-zero partnership, self-employment, or shareholder salary income.
Summary table
IDI variable name   Primary key   Mandatory   Format   Classification name   Variable name
snz_uid Y Y N
snz_ird_uid Y Y N ird_number
ir_ir3_location_nbr Y Y 4N location_number
ir_ir3_return_period_date Y Datetime return_period_date
ir_ir3_snz_unique_nbr Y Y N
ir_ir3_tot_pship_income_amt N 13.2N total_partnership_income_808
ir_ir3_tot_sholder_salary_amt N 13.2N total_shareholder_salary_809
ir_ir3_net_profit_amt N 13.2N net_profit_702
ir_ir3_income_imp_ind Y 1A
ir_ir3_net_rents_826_amt N 13.2N net_rents_826
ir_ir3_tot_wholding_paymnts_amt N 13.2N tot_w_holding_payments_100514
ir_ir3_tot_expenses_claimed_amt N 13.2N total_expenses_claimed_1512
ir_ir3_gross_earnings_407_amt N 13.2N gross_earnings_407
ir_ir3_ird_timestamp_date N Datetime timestamp
Detailed information _________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_location_nbr
Definition: Location number of the EMS filer (payroll system).
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_return_period_date
Definition: Period covered by return.
Format: Datetime, dd/mm/yy
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_pship_income_amt
Definition: Partnership income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_tot_sholder_salary_amt
Definition: Shareholder salary income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_net_profit_amt
Definition: Self-employment income.
Format: Numeric, 13.2N
Name of classification:
Notes:
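The partnership, shareholder salary, and self-employment amounts defined above are the three income types named in the dataset description. The Python sketch below adds them for one row using exact decimal arithmetic. It assumes that 13.2N denotes a numeric value with two decimal places and that a missing amount counts as zero. The function name is hypothetical, and whether the three amounts should be summed at all depends on the analysis.

```python
from decimal import Decimal

# The three income variables described in this table.
INCOME_COLUMNS = (
    "ir_ir3_tot_pship_income_amt",
    "ir_ir3_tot_sholder_salary_amt",
    "ir_ir3_net_profit_amt",
)


def total_ir3_income(row):
    """Sum the three income amounts in a row (a dict keyed by variable name).

    Assumptions: amounts have two decimal places (13.2N) and a missing
    amount is treated as zero. Values are converted through str() so that
    floats do not introduce binary rounding noise.
    """
    total = Decimal("0.00")
    for column in INCOME_COLUMNS:
        value = row.get(column)
        if value is not None:
            total += Decimal(str(value))
    return total.quantize(Decimal("0.01"))
```

Amounts can be negative, for example a self-employment loss, so the total may be lower than any single component.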
________________________________________
Variable name: ir_ir3_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_net_rents_826_amt
Definition: Net rental income
Format: Numeric, 13.2N
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_tot_wholding_paymnts_amt
Definition: Total gross earnings (with withholding tax deducted at source).
Format: Numeric, 13.2N
Name of classification:
Notes:
________________________________________
Variable name: ir_ir3_tot_expenses_claimed_amt
Definition: Total expenses claimed.
Format: Numeric, 13.2N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir3_gross_earnings_407_amt
Definition: Gross earnings with PAYE deducted at source.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir3_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
10 Data dictionary for ird_attachments_ir20
Dataset description
Contents of dataset: This table contains information for active items which have non-zero partnership income.
Summary table
IDI variable name   Primary key   Mandatory   Format   Classification name   Variable name
snz_uid Y Y N
snz_ird_uid Y Y N ird_number
snz_employer_ird_uid Y N employer_ird_number
ir_ir20_location_nbr Y Y 4N location_number
ir_ir20_return_period_date Y Y Datetime return_period_date
ir_ir20_snz_unique_nbr Y N
ir_ir20_tot_share_of_inc_865_amt N 13.2N tot_share_of_inc_865_amt
ir_ir20_income_imp_ind Y 1A income_imp_ind
ir_ir20_ird_timestamp_date N Datetime timestamp
Detailed information _______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_location_nbr
Definition: Location number of the payer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_return_period_date
Definition: The return period.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir20_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir20_tot_share_of_inc_865_amt
Definition: Value of partnership income.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir20_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir20_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
11 Data dictionary for ird_attachments_ir4s
Dataset description
Contents of dataset: This table holds information about the active items which have non-zero shareholder income.
Summary table
IDI variable name   Primary key   Mandatory   Format   Classification name   Source variable name
snz_uid Y Y N
snz_ird_uid Y N ird_number
snz_employer_ird_uid Y Y N employer_ird_number
ir_ir4_location_nbr Y Y 4N location_number
ir_ir4_return_period_date Y Y Datetime return_period_date
ir_ir4_snz_unique_nbr Y Y N
ir_ir4_tot_sholder_sal_809_amt N 13.2N total_shareholder_salary_809
ir_ir4_income_imp_ind Y 1A income_imp_ind
ir_ir4_ird_timestamp_date Y Datetime timestamp
Detailed information _______________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: snz_employer_ird_uid
Definition: A local unique identifier (for an employer) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_location_nbr
Definition: Location number of the payer.
Format: Numeric, 4N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_return_period_date
Definition: The return period.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir4_snz_unique_nbr
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir4_tot_sholder_sal_809_amt
Definition: Value of shareholder salary.
Format: Numeric, 13.2N
Name of classification:
Notes:
_______________________________________
Variable name: ir_ir4_income_imp_ind
Definition:
Format: 1A
Name of classification:
Notes:
_________________________________________
Variable name: ir_ir4_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_______________________________________
12 Data dictionary for ird_old_systems_numbers
Dataset description
Contents of dataset: This table contains the mapping of IRD numbers from the old system to the new system.
Summary table
IDI variable name   Primary key   Mandatory   Format   Classification name   Source variable name
snz_uid Y Y N
ir_osn_old_snz_ird_uid Y Y N old_system_number
snz_ird_uid Y N ird_number
ir_osn_location_nbr Y N location_number
ir_osn_applied_date Y Y Datetime date_applied
ir_osn_ceased_date N Datetime date_ceased
ir_osn_ird_timestamp_date Y Datetime timestamp
Detailed information _________________________________________
Variable name: snz_uid
Definition: A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: ir_osn_old_snz_ird_uid
Definition:
Format: N
Name of classification:
Notes:
_________________________________________
Variable name: snz_ird_uid
Definition: A local unique identifier (for an employee) derived by Statistics NZ from an IR unique identifier (IRD number). This identifier will remain the same for an identity across refreshes. Where we receive more information during a subsequent refresh that indicates that two or more identities represent the same identity, the identifier may change.
Format: N
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_location_nbr
Definition: Location number of the payer.
Format: Numeric,
Name of classification:
Notes:
_________________________________________
Variable name: ir_osn_applied_date
Definition: Date from which record is valid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_______________________________________
Variable name: ir_osn_ceased_date
Definition: Date from which record is invalid.
Format: Datetime, yyyymmdd
Name of classification:
Notes:
_________________________________________________
Variable name: ir_osn_ird_timestamp_date
Definition: Indicates when data was extracted into Inland Revenue’s data warehouse.
Format: Datetime, yyyymmdd
Name of classification:
_________________________________________________
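The ird_old_systems_numbers table maps identifiers from the old system to the new one, so a common use is to translate an old identifier into its current equivalent. The Python sketch below builds such a lookup. It assumes rows are dictionaries keyed by IDI variable name, that snz_ird_uid holds the new-system identifier, and that a row with a ceased date no longer applies. The function name is hypothetical.

```python
def build_old_to_new(rows):
    """Map each old-system identifier to its new-system identifier.

    Assumptions: snz_ird_uid is the new-system identifier, and rows that
    carry a ceased date are no longer valid and are skipped. If an old
    identifier appears more than once, the last valid row wins.
    """
    mapping = {}
    for row in rows:
        if row.get("ir_osn_ceased_date"):
            continue
        mapping[row["ir_osn_old_snz_ird_uid"]] = row["snz_ird_uid"]
    return mapping
```

An old identifier that is absent from the resulting dictionary has no currently valid mapping under these assumptions.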