Knowledgebase

Drawing a random sample from the Data Web (Helpful Hint)


All databases include a variable, randnum, that holds a random number between 0 and 1. If you wish to draw a sample of, say, 400 records from a Core file containing 200,000 records, the easiest way to do this is to Extract approximately that number of records from the DataWeb by filtering on randnum.

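The first step is to calculate the approximate percentage of records that you will need. In this case, 400 out of 200,000 is one-fifth of one percent (0.2%), which you might round up to one-fourth of one percent (0.25%). Thus, you can select any range of randnum values that is 1/4% wide (0.0025) to get a random sample of approximately the correct size.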
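The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name randnum_window, the starting point of .67, and the 25% padding are assumptions chosen to match the example in this article, not part of the DataWeb itself.

```python
def randnum_window(target, total, start=0.67, padding=1.25):
    """Return (low, high) bounds on randnum expected to yield
    roughly target * padding records out of total.

    start and padding are illustrative assumptions: any starting
    point works as long as the window stays within 0 and 1.
    """
    fraction = target / total * padding  # share of the file to extract
    high = start + fraction
    if high > 1.0:
        raise ValueError("window runs past 1; choose a smaller start")
    return start, high


# 400 records wanted from a 200,000-record file, padded by 25%
low, high = randnum_window(400, 200_000)
print(f"randnum >= {low:.4f} and randnum <= {high:.4f}")
```

Because randnum is spread evenly between 0 and 1, the width of the window matters but its position does not.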

It is best to round up to slightly exceed the number of records you want in your final sample in case some of the records are incomplete or otherwise out of the scope of your research. If you have too many, simply sort the extracted records in the program of your choice (Excel, SAS, dBase) on the randnum variable and keep the specific number of rows or records that you need.

For example, your filter could be:

randnum >= .67 and randnum <= .6725

To sort the extracted records on randnum: in Excel, choose Data, Sort from the menu; in SAS, use PROC SORT; in dBase or FoxPro, type "INDEX ON randnum TO randnum".
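If you prefer to trim the extract in a script rather than in one of the programs above, the same sort-and-keep step might look like the following Python sketch. The records here are simulated stand-ins with a hypothetical "id" field, and the helper name trim_sample is made up for illustration. A real extract would be loaded from the file you downloaded.

```python
import random


def trim_sample(records, n, key="randnum"):
    """Sort records on randnum and keep the first n of them."""
    return sorted(records, key=lambda r: r[key])[:n]


# Simulated 200,000-record file (assumption: stands in for a Core file)
rng = random.Random(0)
records = [{"id": i, "randnum": rng.random()} for i in range(200_000)]

# Apply the example filter, then trim to the exact sample size wanted
extracted = [r for r in records if 0.67 <= r["randnum"] <= 0.6725]
sample = trim_sample(extracted, 400)
print(len(extracted), "records extracted;", len(sample), "kept")
```

Sorting on randnum before trimming keeps the final selection random, since the order of randnum values is unrelated to anything else in the record.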


Added 08/24/2002 by tpollak, Modified 06/05/2006 by jauer
