IRS SOI Sampling - Background and Changes in Strata (Technical Note)


SOI Sample Methodology

(Provided by Arnsberger Paul, IRS SOI Division, October 17, 2007)

We (SOI) run two samples per study year: one for 501(c)(3) organizations, and one drawn from a pool of 501(c)(4)-(9) organizations. In addition to subsection code, the samples are stratified by end-of-year book value of assets as reported on Forms 990 and 990-EZ. These are the only two sampling criteria we use. Here is a breakdown of the stratified random sample designs for Tax Year 2004.

Tax Year 2004 501(c)(3) Sample
End-of-year Assets              Projected Population   Projected Sample   Sample Rate
Under $500,000                               176,945              2,194        0.0124
$500,000 under $1,000,000                     26,129                324        0.0124
$1,000,000 under $2,500,000                   28,034              1,393        0.0497
$2,500,000 under $5,000,000                   15,683                779        0.0497
$5,000,000 under $20,000,000                  16,934              3,156        0.1864
$20,000,000 under $50,000,000                  4,811              1,794        0.3729
$50,000,000 or more                            5,068              5,068        1.0000

Tax Year 2004 501(c)(4)-(9) Sample
End-of-year Assets              Projected Population   Projected Sample   Sample Rate
Under $150,000                                62,520                694        0.0111
$150,000 under $300,000                       16,151                179        0.0111
$300,000 under $1,000,000                     20,485                531        0.0259
$1,000,000 under $4,000,000                   11,090              1,231        0.1110
$4,000,000 under $10,000,000                   3,657                812        0.2220
$10,000,000 or more                            3,356              3,356        1.0000
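
The Sample Rate column in the two tables above appears to be the projected sample divided by the projected population, rounded to four decimal places. A minimal Python sketch that recomputes it for the 501(c)(3) strata follows. The names STRATA_2004_C3 and sample_rate are illustrative only and are not part of any SOI file or program.

```python
# Projected (population, sample) counts per stratum, copied from the
# Tax Year 2004 501(c)(3) table. Names here are illustrative, not SOI's.
STRATA_2004_C3 = [
    ("Under $500,000", 176_945, 2_194),
    ("$500,000 under $1,000,000", 26_129, 324),
    ("$1,000,000 under $2,500,000", 28_034, 1_393),
    ("$2,500,000 under $5,000,000", 15_683, 779),
    ("$5,000,000 under $20,000,000", 16_934, 3_156),
    ("$20,000,000 under $50,000,000", 4_811, 1_794),
    ("$50,000,000 or more", 5_068, 5_068),
]


def sample_rate(population: int, sample: int) -> float:
    """Return sample / population, rounded to four decimal places."""
    return round(sample / population, 4)


# Print each stratum with its recomputed rate for comparison with the table.
for label, population, sample in STRATA_2004_C3:
    print(f"{label:<32}{sample_rate(population, sample):.4f}")
```

The same function can be applied to the 501(c)(4)-(9) counts to check that table as well.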

We have used the $50,000,000 100% sampling threshold for 501(c)(3) organizations since Tax Year 2003. (Prior to that it was $30,000,000.) We change the sample rates annually, and re-evaluate the sample design every five years or so.

For more information, I am attaching a copy of the "Data Sources and Limitations" section from our most recent SOI Bulletin article on exempt organizations:

The statistics in this article are based on a sample of the 2004 Forms 990, Return of Organization Exempt From Income Tax, and Forms 990-EZ, Short Form Return of Organization Exempt From Income Tax. Organizations were required to file the 2004 form when their accounting periods ended any time between December 31, 2004, and November 30, 2005. The sample did not include private foundations, which were required to file Form 990-PF. Most churches and certain other types of religious organizations were also excluded from the sample because they were not required to file Form 990 or Form 990-EZ. The sample included only those returns with average receipts of more than $25,000, the filing threshold.

The sample design was split into two parts: the first sampling frame contained all returns filed by organizations exempt under section 501(c)(3); the second sampling frame comprised a pool of all returns filed by organizations exempt under sections 501(c)(4) through (9). Organizations tax-exempt under other Code sections were excluded from the sample frames. The data presented were obtained from returns as originally filed with the Internal Revenue Service. They were subjected to comprehensive testing and correction procedures in order to improve statistical reliability and validity. However, in most cases, changes made to the original return as a result of either administrative processing or taxpayer amendment were not incorporated into the database.

The two samples were classified into strata based on the size of end-of-year total assets, with each stratum sampled at a different rate. For section 501(c)(3) organizations, a sample of 15,070 returns was selected from a population of 279,415. Sampling rates ranged from 1.24 percent for organizations reporting total assets less than $500,000 to 100 percent for organizations with total assets of $50,000,000 or more. The second sample contained 6,669 records selected from the population of 111,010 returns filed by organizations exempt under sections 501(c)(4) through (9). Sampling rates ranged from 1.11 percent for organizations reporting total assets less than $150,000 to 100 percent for organizations with assets of $10,000,000 or more. The filing populations for these organizations included some returns of terminated organizations, returns of inactive organizations, duplicate returns, and returns of organizations filed with tax periods prior to 2004. However, these returns were excluded from the final sample and the estimated population counts.

CHANGES TO THE IRS SOI SAMPLING STRATA OVER TIME

(Jon Durnford, NCCS, Oct 2007)

The largest sample categories (marked * below) have a population weight of 1, because one hundred percent of the organizations in these categories are included in the sample. Organizations in the smaller categories have population weights greater than 1.
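
As an illustration of how such weights arise: under plain stratified random sampling, the design weight for a stratum is its population count divided by its sample count, which is the reciprocal of the sample rate. The sketch below uses the projected Tax Year 2004 501(c)(3) counts. The function names and the example revenue figures are hypothetical, and the WEIGHT values on actual SOI files may differ from these because they reflect realized rather than projected counts.

```python
def design_weight(population: int, sample: int) -> float:
    """Number of filers that each sampled return in a stratum represents."""
    return population / sample


def weighted_total(records) -> float:
    """Estimate a population total from (weight, value) pairs."""
    return sum(weight * value for weight, value in records)


# Smallest 501(c)(3) stratum (under $500,000): each return stands for many filers.
w_small = design_weight(176_945, 2_194)  # roughly 80.6
# Largest stratum ($50,000,000 or more): sampled at 100 percent, so weight is 1.
w_large = design_weight(5_068, 5_068)

# Hypothetical sampled returns as (weight, total revenue in dollars).
records = [
    (w_small, 120_000.0),
    (w_small, 45_000.0),
    (w_large, 75_000_000.0),
]

print(f"small-stratum weight: {w_small:.2f}")
print(f"large-stratum weight: {w_large:.2f}")
print(f"estimated total revenue: {weighted_total(records):,.0f}")
```

Ignoring the weights and summing the raw sample values would overstate the share of large organizations, since they are sampled at much higher rates than small ones.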

SOI changed the sample code categories used for Forms 990/990-EZ in 1998 and in 2003, as follows:

1997: Forms 990/990-EZ, All 501(c) orgs
Sample Code   End-of-year BV Assets
31            Assets under $250,000
32            Assets $250,000 under $500,000
33            Assets $500,000 under $1,000,000
34            Assets $1,000,000 under $2,500,000
35            Assets $2,500,000 under $5,000,000
36            Assets $5,000,000 under $10,000,000
37*           Assets $10,000,000 or more

1998 to 2002: Forms 990/990-EZ, 501(c)(3) orgs
Sample Code   End-of-year BV Assets
31            Assets under $500,000
32            Assets $500,000 under $1,000,000
33            Assets $1,000,000 under $2,500,000
34            Assets $2,500,000 under $5,000,000
35            Assets $5,000,000 under $10,000,000
36            Assets $10,000,000 under $30,000,000
37*           Assets $30,000,000 or more

1998 to 2002: Forms 990/990-EZ, 501(c)(4)-(9) orgs
Sample Code   End-of-year BV Assets
41            Assets under $125,000
42            Assets $125,000 under $400,000
43            Assets $400,000 under $1,000,000
44            Assets $1,000,000 under $2,500,000
45            Assets $2,500,000 under $10,000,000
46*           Assets $10,000,000 or more

Beginning 2003: Forms 990/990-EZ, 501(c)(3) orgs
Sample Code   End-of-year BV Assets
31            Assets under $500,000
32            Assets $500,000 under $1,000,000
33            Assets $1,000,000 under $2,500,000
34            Assets $2,500,000 under $5,000,000
35            Assets $5,000,000 under $20,000,000
36            Assets $20,000,000 under $50,000,000
37*           Assets $50,000,000 or more

Beginning 2003: Forms 990/990-EZ, 501(c)(4)-(9) orgs
Sample Code   End-of-year BV Assets
41            Assets under $150,000
42            Assets $150,000 under $300,000
43            Assets $300,000 under $1,000,000
44            Assets $1,000,000 under $4,000,000
45            Assets $4,000,000 under $10,000,000
46*           Assets $10,000,000 or more

NOTE: * = large 990/990-EZ organization sample threshold (weight = 1)
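
When comparing SOI files across years, it can help to encode the Forms 990/990-EZ breakpoints above as a lookup. The following is a sketch only: the function name sample_code and the group labels "c3" and "c4-9" are hypothetical, each "X under Y" range is read as X <= assets < Y, and applying the "Beginning 2003" design to years after 2004 is an assumption that this note does not confirm.

```python
from bisect import bisect_right

# Lower asset bounds (dollars) of the second and later strata, transcribed
# from the sample code tables. 1997 used one design for all 501(c) orgs.
_BOUNDS_1997 = [250_000, 500_000, 1_000_000, 2_500_000, 5_000_000, 10_000_000]
_BOUNDS = {
    ("1998-2002", "c3"): [500_000, 1_000_000, 2_500_000, 5_000_000,
                          10_000_000, 30_000_000],
    ("1998-2002", "c4-9"): [125_000, 400_000, 1_000_000, 2_500_000,
                            10_000_000],
    ("2003+", "c3"): [500_000, 1_000_000, 2_500_000, 5_000_000,
                      20_000_000, 50_000_000],
    ("2003+", "c4-9"): [150_000, 300_000, 1_000_000, 4_000_000, 10_000_000],
}
_FIRST_CODE = {"c3": 31, "c4-9": 41}  # hypothetical group labels


def sample_code(tax_year: int, group: str, assets: float) -> int:
    """Return the 990/990-EZ sample code for a return (illustrative only)."""
    if group not in _FIRST_CODE:
        raise ValueError(f"unknown group: {group!r}")
    if tax_year < 1997:
        raise ValueError("strata before 1997 are not documented in this note")
    if tax_year == 1997:
        # All 501(c) organizations shared codes 31-37 in 1997.
        return 31 + bisect_right(_BOUNDS_1997, assets)
    period = "1998-2002" if tax_year <= 2002 else "2003+"
    # bisect_right counts how many lower bounds the asset value has reached.
    return _FIRST_CODE[group] + bisect_right(_BOUNDS[(period, group)], assets)


print(sample_code(2002, "c3", 30_000_000))   # top stratum under the 1998-2002 design
print(sample_code(2004, "c3", 30_000_000))   # no longer the top stratum from 2003
```

The two calls at the end show why the design change matters: the same asset value falls in the weight = 1 category under one design but not the other.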

SOI did not change the sample code categories used for Form 990-PF (shown below) from at least 1997, and most likely earlier, through 2004. Throughout that period, private foundations in categories 16 and 17 were assigned the large organization threshold (weight = 1). For charitable trusts, the large organization threshold was assigned only to category 23 until 2002; in 2003, all charitable trust categories (21, 22, and 23) were assigned weight = 1.

Charitable Trusts:
Sample Code   End-of-year BV Assets
21**          Assets under $100,000
22**          Assets $100,000 under $1,000,000
23*           Assets $1,000,000 or more

Private Foundations:
Sample Code   End-of-year BV Assets
11            Assets under $125,000
12            Assets $125,000 under $400,000
13            Assets $400,000 under $1,000,000
14            Assets $1,000,000 under $2,500,000
15            Assets $2,500,000 under $10,000,000
16*           Assets $10,000,000 under $25,000,000
17*           Assets $25,000,000 or more

NOTE: * = large 990-PF organization sample threshold (weight = 1)
      ** = large 990-PF organization sample thresholds added in 2003 (weight = 1)

WEIGHT STATS: See attached file for minimum and maximum WEIGHT values in SOI file years 1982-2004.
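
The Form 990-PF weight = 1 categories described above can be written as a small year-dependent lookup. This is a sketch under stated assumptions: the function name pf_weight_one_codes is hypothetical, "until 2002" is read as including Tax Year 2002, and years outside 1997-2004 are rejected because this note does not document them.

```python
def pf_weight_one_codes(tax_year: int) -> set:
    """Return the 990-PF sample codes assigned weight = 1 in a given year."""
    if not 1997 <= tax_year <= 2004:
        raise ValueError("only 1997-2004 are documented in this note")
    # Private foundation codes 16 and 17, plus charitable trust code 23,
    # carry weight = 1 throughout the documented period.
    codes = {16, 17, 23}
    if tax_year >= 2003:
        # Assumption: the change takes effect with Tax Year 2003, after
        # which every charitable trust category is sampled at 100 percent.
        codes |= {21, 22}
    return codes


print(sorted(pf_weight_one_codes(2002)))
print(sorted(pf_weight_one_codes(2003)))
```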


    Added 06/03/2002 by tpollak, Modified 11/04/2008 by tpollak