Market Trends: Data Limitations

Topical Limitations

The availability of consumer and demographic data, whether published in print or on the Internet, depends on the collection of appropriate information. This is usually done through periodic surveys, but sometimes, as with retail sales, the data are consolidated automatically by computerized systems. In any case, what you want to know may not be in any published source, simply because certain questions were not asked and the corresponding data were not collected.

Some information, such as religious affiliation, is not collected because of prevailing laws and regulations. Other demographic characteristics may be collected only under strict rules; this applies to the determination of ethnic background, ancestry, or national origin. Still other facts are not known because of logistical problems in data collection, e.g., counting the homeless or the number of illegal immigrants. [http://www.census.gov/dmd/www/pdf/07f_or.pdf]

Survey Methods

Even when information is collected, the results may be inaccurate because of the sampling methodology used for the survey. For example, you may have the impression that the household income of a certain county is $50,000 but, unbeknownst to you, only 2% of the households were included in the sample, so the published figure is an estimate rather than an exact count. Look into the statistical methods used in conducting the survey and in calculating the published results.
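To see why the sample size matters, a rough margin of error can be worked out for a survey estimate. The sketch below is only an illustration: the function name `margin_of_error`, the county of 20,000 households, and the $20,000 standard deviation are all hypothetical, and it assumes a simple random sample, which many real surveys are not.

```python
import math

def margin_of_error(sample_std, sample_size, z=1.96):
    """Approximate 95% margin of error for a sample mean.

    Assumes a simple random sample; z=1.96 is the usual
    normal-approximation value for 95% confidence.
    """
    return z * sample_std / math.sqrt(sample_size)

# Hypothetical county: 20,000 households, 2% of them sampled.
households = 20_000
sample_size = households * 2 // 100

# Hypothetical spread of household incomes within the sample.
moe = margin_of_error(sample_std=20_000, sample_size=sample_size)
print(f"$50,000 plus or minus ${moe:,.0f}")
```

Under these made-up numbers the $50,000 figure carries a margin of roughly $1,960 either way. A smaller sample or a wider spread of incomes would widen that range, which is why the survey's documentation is worth checking.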

Knowing who published the information will also go a long way toward assuring you of its reliability. Government agencies and university research institutes tend to be less biased than commercial firms. Wherever possible, verify your information by comparing results from more than one source. More help on evaluating information may be found in the notes for Week 5: Information Management.

File Formats

The data that you retrieve may be in one or more file formats. Plain text or HTML is easy to print out and read, and you can highlight and copy portions of the displayed information and paste them into a word processor document or a spreadsheet. However, some files are in portable document format (.pdf) and will require Adobe Acrobat Reader software (freely available from Adobe Systems) for viewing. The information in these files often cannot be copied and pasted as readily, especially when it is laid out in tables. [http://www.adobe.com/acrobat/readstep.html]

You may also encounter data that has to be downloaded before you can view it. Some government sites offer their data in compressed or zipped formats which have to be uncompressed; some of these files are self-extracting while others will require the use of a separate utility program such as WinZip. [http://mssg-ftp.rutgers.edu/utilities.html]
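If a separate utility is not at hand, a zipped download can also be unpacked with a few lines of script. The following is a minimal sketch using Python's standard `zipfile` module; the function name `extract_archive` and the file names in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path

def extract_archive(zip_path, dest_dir):
    """Unpack a .zip archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

For example, `extract_archive("county_data.zip", "county_data")` would place the archive's contents in a `county_data` folder and return the list of file names found inside. Only archives from sources you trust should be unpacked this way.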

Two common file formats (.xls, .wks) are for direct use with spreadsheet programs such as Microsoft Excel or Lotus 1-2-3. Other data may be formatted as "comma-separated values" (.csv) or "tab-delimited" and will have to be imported into and converted by the spreadsheet programs. You can also obtain a free Excel viewer from Microsoft if you do not have the Excel program. [http://www.microsoft.com/office/000/viewers.htm]
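Comma-separated and tab-delimited files differ only in the character that separates the columns, which is why a spreadsheet program asks for the delimiter on import. As a rough sketch, the same idea in Python's standard `csv` module looks like this; the helper name `read_delimited`, the column names, and the sample figures are all hypothetical.

```python
import csv
import io

def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

# Hypothetical sample data, not taken from any real source.
csv_text = "county,median_income\nAdams,50000\nBaker,42000\n"
tab_text = "county\tmedian_income\nAdams\t50000\nBaker\t42000\n"

csv_rows = read_delimited(csv_text)
tab_rows = read_delimited(tab_text, delimiter="\t")

# Values arrive as text; convert them before doing arithmetic.
incomes = [int(row["median_income"]) for row in csv_rows]
print(incomes)
```

Both calls produce the same rows. Note that every value is read as text, so numbers must be converted before they can be summed or averaged, much as a spreadsheet converts them during import.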

Ka-Neng Au, 11 Feb 2002