Random Selection Using an Archival Dataset

If a researcher has a large archival dataset to gather data from, they may need to select a random sample from that dataset. The following information presents one method to perform random selection using an archival dataset.

When sampling randomly, keep in mind that it is always better to have a sample size larger than the required minimum. Additionally, when the analysis compares groups, such as ANOVA or MANOVA, it is helpful, though not required, to have relatively similar group sizes (a ratio of less than 2:1). To randomly select data from an archival dataset, Excel is a functional program that allows for ease of use.

Excel has a function that generates random whole numbers between two values. The function is =RANDBETWEEN(X1, X2), where X1 is the minimum value and X2 is the maximum value in the range. The minimum and maximum may be any values you select, such as min = 1 and max = 2, although a wide range (for example, 1 to 1,000,000) keeps duplicate random numbers rare, which matters when the numbers are used for sorting.

Let’s say we have a dataset that contains the names of grocery stores. If you copy all of the names from the database that fit your criteria into an Excel file, random selection will be very easy. You will need one Excel file for all grocery stores. Enter the data in a single column, with the name of each grocery store in its own cell. If you had a list of 350 grocery stores, your data may be entered into cells A1 through A350.

In the cell to the right of each grocery store name (B1 through B350 in this example), assign a random number using the above formula. Once that has been completed, highlight both columns and sort the sheet by the random-number column in either ascending or descending order. You can then select the first 100 stores (or any predetermined number) and be confident that they were randomly chosen from your archival dataset.
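For readers who prefer a script to a spreadsheet, the same procedure can be sketched in Python. This is a minimal illustration, not part of the Excel method itself: the function name select_random_sample, the placeholder store names, and the seed value are all assumptions made for the example. It also draws random decimals with random.random() instead of whole numbers as RANDBETWEEN does, so ties between keys are extremely unlikely.

```python
import random


def select_random_sample(items, sample_size, seed=None):
    """Attach a random key to every item, sort by that key,
    and keep the first `sample_size` items (hypothetical helper)."""
    if sample_size > len(items):
        raise ValueError("sample_size cannot exceed the number of items")

    rng = random.Random(seed)  # a seed makes the selection reproducible

    # Equivalent of filling the column to the right with random numbers.
    keyed = [(rng.random(), item) for item in items]

    # Equivalent of sorting both columns by the random-number column.
    keyed.sort(key=lambda pair: pair[0])

    # Equivalent of taking the first N rows after sorting.
    return [item for _, item in keyed[:sample_size]]


# Placeholder data standing in for 350 grocery store names.
stores = [f"Store {i}" for i in range(1, 351)]
sample = select_random_sample(stores, 100, seed=42)
print(len(sample))  # prints 100
```

Recording the seed alongside the results allows the same sample to be regenerated later, which a spreadsheet that recalculates its random numbers does not offer by default.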
