Knowledgebase

NCCS Core Supplement (aka: Super Core)


NCCS Core Supplement (aka: Super Core) file contains Form 990 fields that are not found on the NCCS Core PC or Core CO files. The "Super Core" file is an archive of Form 990 return records that have been keypunched for a variety of NCCS projects beginning in 2007. This file is cumulative and will be updated as new keypunched records become available. Old tax year returns are not replaced when new ones are added, so the number of records increases with each update. Unlike many NCCS files which contain one record for each unique organization EIN, Super Core may contain more than one tax year return for each unique organization EIN, and may even contain more than one return within the same tax year for legitimate reasons (e.g. amended returns, short fiscal period, by mistake, etc.).

Two notes of caution when working with Super Core Data:

  1. Not all fields listed in the Super Core data dictionary are keypunched for every record, so selection criteria must include the variable "SCvars" to avoid counting records that do not contain data for analysis variables (discussed below).
  2. Duplicates must be addressed in statistical analysis to avoid over counting, so Super Core contains various types of flag fields that are used to select unique records (discussed below).

Data collected for various NCCS projects have specific selection criteria and variable requirements, so the composition of records and variables that eventually end up in Super Core will vary with each update, but a minimum standard set of variables is usually keypunched for most projects. Most returns in Super Core generally represent larger organizations (expenses typically >= $250,000). Most returns in Super Core are from public charities, but Super Core does contain a small number of organizations other than 501(c)(3). All Ntee Major Categories are represented in Super Core, but the distribution of Ntee’s is not uniform and does not represent the general population of 990 filers.

Super Core records are keypunched from Form 990 images (downloaded from nccs.IrsFormImages). There is a unique one-to-one relationship between Super Core and IrsFormImages. The only unique variable in Super Core is PDFname (one keypunched record per image), but Super Core can contain more than one tax year return per EIN and may even contain more than one return for the same tax year. Therefore, the Super Core Financial (_fin) file contains several flags that are essential when running summaries or merging with Core or other files:

SCvars = a comma-separated list of variables that were keypunched for that record (e.g. 'EIN,p0phone,website,p1contFund,p1dirsup,...'). Including SCvars in your selection criteria will ensure you are only including records that contain the data you need (e.g. – where SCvars like ‘%a3lobby%’ selects records that contain keypunched data for a3lobby). A variable listed in SCvars can contain NULL or zero values if that field was blank, missing, or illegible on the Form 990 image.

Flag_Core04, Flag_Core05, Flag_Core06: Core flags contain the value 1 or 0. Core flags are used to eliminate duplicates when merging Super Core data to a specific Core PC file (i.e.: only one record per EIN will have Flag_Core06=1). A value of 1 means the record is the best* match on EIN and FisYr to a record in that Core PC file.

  • There is no direct link between records in the list of IRS Form 990/990EZ/990PF Images and specific Core file records. If there were, then the amount of keypunching could be reduced for projects that only need to supplement existing core file records. Since EIN's can have more than one return image for the same reporting period and there is no sure way to identify which image matches a core record, then all of the images for that Core file sample usually have to be keypunched, including duplicate images. Merging Super Core to a Core file based on EIN & FisYr alone will create a many-to-one relationship, which is usually undesirable for most analysis. Core flags are therefore added to identify unique records and thereby make it possible to merge Super Core to Core PC on EIN and FisYr. These flags are based on a variety of points of comparison between SC and Core files. Matching on EIN + FisYr alone is the baseline score, and matching additional variables creates a higher score. The EIN record with the highest score is the 'best' match to a specific Core PC file.
  • CoreScore04, CoreScore05, CoreScore06 is related to Flag_CoreYY: the higher the score the better the match.

    FisYr_Best = 1 is the best unique record per EIN & FisYr. Example: WHERE FisYr = '2006' and FisYr_Best = 1 is the single best return per EIN for fiscal year 2006. NOTE - In most cases, the record flagged as the best match to a core file is also the best record for the fiscal year (e.g. most records with Flag_Core06 = 1 also have FisYr_Best = 1), HOWEVER this is not always the case: there are rare exceptions where an EIN can have >1 return for the same fiscal year and one of those may be the best match for a Core 2005 record and the other is the best match for a Core 2006 record. One will be marked as Flag_Core05=1 and the other marked as Flag_Core06=1, but only one return per EIN per FisYr can be marked as FisYr_Best=1 (typically the most recent return in this case).

    LatestOne = 1 is the most recent single best return record per EIN. The most recent return may not match to any record in Core 2004, 2005 or 2006 PC. The most recent return may be fairly current, but it may also be relatively old if the organization has not been active.

    EXAMPLES OF USING FLAGS:

    If you want to compute average Management and General Expenses (line 14) from the most recent returns filed by organizations (unique EIN regardless of fiscal year), then use LatestOne=1 and SCvars like '%p1mExp%' (LatestOne = 1 selects the most recent return keypunched per EIN and SCvars like '%p1mExp%' selects projects that keypunched p1mExp).

    If you want to select a unique fiscal year 2006 return per unique EIN , then use FisYr_Best=1 and FisYr = '2006' (does not guarantee that a matching Core PC record exists)

    If you want to find unique Super Core records that are the best match for records in Core 2005 PC (matching on EIN & FisYr), then use Flag_Core05=1. Adding FisYr_Best=1 with Flag_Core05=1 is unnecessary and may even slightly reduce the number of matches, because EIN's may have >1 record per FisYr, so a FY 2005 record may be flagged as the best match to Core 2005 PC but not be flagged as the best record for FY 2005 (see explanation of FisYr_Best above).

    JOINING Super Core TO MORE THAN ONE TABLE:

    The two examples below are based on the same criteria:

  • Use a master list of EINs to select one Form 990 return per EIN's (no duplicate EIN's);
  • Limit to records that contain keypunched data for p1govGt (p1govGt may still contain null or zero);
  • Give priority to SC records that match Core 2005 PC. Otherwise, use the most recent return for any the remaining EINs in the master list;
  • Include Core PC fields by joining on EIN and FisYr, giving priority to Core 2005 PC data, if available. Otherwise use Core 2006 PC data, followed by Core 2004 PC data; and
  • Set values of Core PC fields to NULL if no matching core record is found.
  • /* Example 1: Select one unique record per EIN: */

    select scall.* FROM

    (select sc.Flag_Core05, sc.LatestOne, sc.EIN, bm.name, bm.ntee_baird, sc.FisYr, sc.FormYr,

    /* identify source of core file data used: use 2005 then fill missing records with 2006 followed by 2004 */

    CASE

    WHEN core05.ein is not null then 'core2005pc'

    WHEN core06.ein is not null then 'core2006pc'

    WHEN core04.ein is not null then 'core2004pc'

    ELSE 'no match'

    END AS CoreDataSource,

    /* this case statement selects the value in core 2005 pc and fills missing values with 2006 followed by 2004 if available */

    CASE

    WHEN core05.ein is not null then core05.TOTREV2

    WHEN core06.ein is not null then core06.TOTREV2

    WHEN core04.ein is not null then core04.TOTREV2

    ELSE null

    END AS TOTREV2_core,

    /* this case statement selects the value in core 2005 pc and fills missing values with 2006 followed by 2004 if available */

    CASE

    WHEN core05.ein is not null then core05.EXPS

    WHEN core06.ein is not null then core06.EXPS

    WHEN core04.ein is not null then core04.EXPS

    ELSE null

    END AS EXPS_core,

    /* this case statement selects the value in core 2005 pc and fills missing values with 2006 followed by 2004 if available */

    CASE

    WHEN core05.ein is not null then core05.INFLEXP

    WHEN core06.ein is not null then core06.INFLEXP

    WHEN core04.ein is not null then core04.INFLEXP

    ELSE null

    END AS INFLEXP_core,

    sc.p1contFund, sc.p1dirsup, sc.p1indSup, sc.p1govGt,

    sc.p1tCont, sc.p1cash, sc.p1noncsh, sc.p1psRev, sc.p1totRev, sc.p1pExp, sc.p1mExp,

    sc.p1frExp, sc.p1affil, sc.p1totexp, sc.p1NetInc, sc.p1nAsBoy, sc.p1chgAs, sc.p1nAsEOY,

    sc.p4b_iLnd, sc.p4d_iLnd, sc.p4e_iLnd, sc.p4b_Land, sc.p4d_Land, sc.p4e_Land,

    sc.p4e_Asst, sc.p4e_unre, sc.p4e_temp, sc.p4e_perm, sc.p4e_net, sc.p6numEmp,

    sc.p7medAmt, sc.p7medXat, sc.p7medRin, sc.p7govAmt, sc.p7govXat, sc.p7govRin,

    sc.a3lobby, sc.a3LbyEx, sc.SCvars

    FROM nccs.SuperCore20080709_fin sc

    left join baird.bairdmaster bm on sc.ein=bm.ein

    left join nccs.core2004pc core04 on sc.ein=core04.ein and sc.FisYr=core04.FisYr and sc.Flag_Core04 = 1

    left join nccs.core2005pc core05 on sc.ein=core05.ein and sc.FisYr=core05.FisYr and sc.Flag_Core05 = 1

    left join nccs.core2006pc core06 on sc.ein=core06.ein and sc.FisYr=core06.FisYr and sc.Flag_Core06 = 1

    /* only pick records with p1govGt keypunched and match the master file */

    WHERE sc.SCvars like '%p1govGt%' and bm.ein is not null and

    /* only use Form 990 records */

    sc.I_IrsForm = 'EO') scall

    left join (select x.ein from nccs.SuperCore20080709_fin x where x.Flag_Core05 = 1 group by x.EIN) sc05

    on scall.EIN=sc05.EIN where

    /* use SC records matching core2005pc, but if no match then use the most recent keypunched return */

    (sc05.EIN is not null and scall.Flag_Core05 = 1) or (sc05.EIN is null and scall.LatestOne = 1) ;

    /* Example 2: Run counts and totals using one unique record per EIN: */

    select scall.FisYr, count(*) as Total_RecordCount,

    sum(ifnull(scall.Core04_RecordCount,0)) as Core04_RecordCount,

    sum(ifnull(scall.Core05_RecordCount,0)) as Core05_RecordCount,

    sum(ifnull(scall.Core06_RecordCount,0)) as Core06_RecordCount,

    sum(ifnull(scall.NoCore_RecordCount,0)) as NoCore_RecordCount,

    sum(ifnull(scall.Total_CONT_core,0)) as Total_CONT_core,

    sum(ifnull(scall.Total_TOTREV2_core,0)) as Total_TOTREV2_core,

    sum(ifnull(scall.Total_EXPS_core,0)) as Total_EXPS_core,

    sum(ifnull(scall.p1contFund,0)) as Total_p1contFund,

    sum(ifnull(scall.p1dirsup,0)) as Total_p1dirsup,

    sum(ifnull(scall.p1indSup,0)) as Total_p1indSup,

    sum(ifnull(scall.p1govGt,0)) as Total_p1govGt,

    sum(ifnull(scall.p1tCont,0)) as Total_p1tCont,

    sum(ifnull(scall.p1totRev,0)) as Total_p1totRev,

    sum(ifnull(scall.p1totExp,0)) as Total_p1totExp

    FROM

    (select sc.Flag_Core05, sc.LatestOne, sc.EIN, sc.FisYr,

    CASE

    WHEN core05.ein is not null then 0

    WHEN core06.ein is not null then 0

    WHEN core04.ein is not null then 1

    ELSE 0

    END AS Core04_RecordCount,

    CASE

    WHEN core05.ein is not null then 1

    WHEN core06.ein is not null then 0

    WHEN core04.ein is not null then 0

    ELSE 0

    END AS Core05_RecordCount,

    CASE

    WHEN core05.ein is not null then 0

    WHEN core06.ein is not null then 1

    WHEN core04.ein is not null then 0

    ELSE 0

    END AS Core06_RecordCount,

    CASE

    WHEN core05.ein is not null then 0

    WHEN core06.ein is not null then 0

    WHEN core04.ein is not null then 0

    ELSE 1

    END AS NoCore_RecordCount,

    CASE

    WHEN core05.ein is not null then ifnull(core05.CONT,0)

    WHEN core06.ein is not null then ifnull(core06.CONT,0)

    WHEN core04.ein is not null then ifnull(core04.CONT,0)

    ELSE null

    END AS Total_CONT_core,

    CASE

    WHEN core05.ein is not null then ifnull(core05.TOTREV2,0)

    WHEN core06.ein is not null then ifnull(core06.TOTREV2,0)

    WHEN core04.ein is not null then ifnull(core04.TOTREV2,0)

    ELSE null

    END AS Total_TOTREV2_core,

    CASE

    WHEN core05.ein is not null then ifnull(core05.EXPS,0)

    WHEN core06.ein is not null then ifnull(core06.EXPS,0)

    WHEN core04.ein is not null then ifnull(core04.EXPS,0)

    ELSE null

    END AS Total_EXPS_core,

    sc.p1contFund, sc.p1dirsup, sc.p1indSup, sc.p1govGt,

    sc.p1tCont, sc.p1totRev, sc.p1totExp

    FROM nccs.SuperCore20080709_fin sc

    left join baird.bairdmaster bm on sc.ein=bm.ein

    left join nccs.core2004pc core04 on sc.ein=core04.ein and sc.FisYr=core04.FisYr and sc.Flag_Core04 = 1

    left join nccs.core2005pc core05 on sc.ein=core05.ein and sc.FisYr=core05.FisYr and sc.Flag_Core05 = 1

    left join nccs.core2006pc core06 on sc.ein=core06.ein and sc.FisYr=core06.FisYr and sc.Flag_Core06 = 1

    /* only pick records with p1govGt keypunched and match the master file */

    WHERE sc.SCvars like '%p1govGt%' and bm.ein is not null and

    /* only use Form 990 records */

    sc.I_IrsForm = 'EO') scall

    left join (select x.ein from nccs.SuperCore20080709_fin x where x.Flag_Core05 = 1 group by x.EIN) sc05

    on scall.EIN=sc05.EIN where

    /* use SC records matching core2005pc, but if no match then use the most recent keypunched return */

    (sc05.EIN is not null and scall.Flag_Core05 = 1) or (sc05.EIN is null and scall.LatestOne = 1)

    group by scall.FisYr;


    Added 03/18/2008 by tpollak, Modified 12/18/2009 by jdurnford

    Comments

    No comments.

    Please login to add your own comments.