Category Zip File Format

Reason for New Format

Summary of Changes

Example 1: Compressed file containing 1000 or fewer data series files.

Example 2: Compressed file containing more than 1000 data series files, and not more than 1000 series IDs start with the same first letter.

Example 3: Compressed file containing more than 1000 data series files, and more than 1000 series IDs start with letter 'A'.

Reason for New Format

The old category zip file format did not create any directories when a file was uncompressed, so all files were extracted to the current directory. This did not scale well as the number of series in a category increased. In addition, some file systems have fixed limits on the number of files that can be stored in a single directory.

To address this problem, a new directory structure has been created for all FRED category zip files (*.zip). This also includes zip files containing all FRED series data files.

Note that the old category zip file format is no longer available as of January 3, 2006.

Summary of Changes

In brief, the changes are:

  1. Compressed files are extracted to a directory that has the name of the compressed file without the extension. For instance, CPI_txt_2.zip extracts to directory 'CPI_txt_2', and CPI_xls_2.zip extracts to directory 'CPI_xls_2'.
  2. For compressed files containing 1000 files or fewer, a 'data' subdirectory contains all series data files.
  3. For compressed files containing more than 1000 files, subdirectories within the 'data' directory are created recursively for each character in a series ID until 1000 or fewer series IDs match a given substring. This is explained in more detail in Example 2 and Example 3.
  4. The new format has 2 README files, README_SERIES_ID_SORT.txt and README_TITLE_SORT.txt. Both README files list all series data file properties and their locations. The 2 README files differ only in how the series data file information is sorted. README_SERIES_ID_SORT.txt sorts series data file information by series ID, and README_TITLE_SORT.txt sorts series data file information by series title.
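
The placement rule in items 2 and 3 can be sketched in Python as follows. This is only an illustration of the rule as described above, not the code used to build the zip files. The function name assign_paths, its parameters, the use of '/' as the separator, and the handling of an ID that runs out of characters are all assumptions made for the example.

```python
from collections import defaultdict


def assign_paths(series_ids, limit=1000, ext=".txt"):
    """Map each series ID to a path relative to 'data' (hypothetical helper)."""
    paths = {}

    def place(ids, prefix_len, parts):
        # A group that fits within the limit stays together in one directory.
        if len(ids) <= limit:
            for sid in ids:
                paths[sid] = "/".join(parts + [sid + ext])
            return
        # Otherwise split the group by the next character of each series ID.
        groups = defaultdict(list)
        for sid in ids:
            if len(sid) > prefix_len:
                groups[sid[prefix_len]].append(sid)
            else:
                # Assumption: an ID with no characters left stays where it is.
                paths[sid] = "/".join(parts + [sid + ext])
        for ch, members in sorted(groups.items()):
            place(members, prefix_len + 1, parts + [ch])

    place(sorted(series_ids), 0, [])
    return paths


# A limit of 2 stands in for 1000 so that a tiny list shows both levels.
for sid, path in assign_paths(["AAA", "AAB", "ADJ", "BA3M", "BA6M"], limit=2).items():
    print(sid, "->", path)
```

With the limit lowered to 2, the three IDs starting with 'A' are split by their second character, while the two IDs starting with 'B' stay together in 'B'.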

Example 1: Compressed file containing 1000 or fewer data series files.

Consider the category 'FRED > Categories > Employment & Population > Population' and the category files available for download. With the new directory structure, unzipping EPPOP_txt_2.zip will output:

EPPOP_txt_2
|-- README_SERIES_ID_SORT.txt
|-- README_TITLE_SORT.txt
`-- data
    |-- CNP16OV.txt
    |-- POP.txt
    `-- POPTHM.txt

'EPPOP_txt_2' and 'data' are directories. 'README_SERIES_ID_SORT.txt' and 'README_TITLE_SORT.txt' describe the location and properties of series data files. 'CNP16OV.txt', 'POP.txt', and 'POPTHM.txt' are the actual series data files in text format for the series with IDs 'CNP16OV', 'POP', and 'POPTHM', respectively.

Below are the contents of the 'README_SERIES_ID_SORT.txt' file:

Category: FRED > Categories > Employment & Population > Population
Link: http://research.stlouisfed.org/fred2/categories/104
README File Created: 2004-09-09
FRED (Federal Reserve Economic Data)
Economic Research Division
Federal Reserve Bank of St. Louis

Files in the data directory sorted by series id:

File                 Title, Units, Frequency, Seasonal Adjustment, Last Updated
CNP16OV.txt          Civilian Noninstitutional Population, Thous., M, NSA, 2004-06-10
POP.txt              Total Population: All Ages including Armed Forces Overseas, Thous., M, NA, 2004-06-10
POPTHM.txt           Population: Mid-Month, Thous., M, NA, 2004-06-10

Abbreviations:

Frequency
   A = Annual
   Q = Quarterly
   M = Monthly
   BW = Bi-Weekly
   W = Weekly
   D = Daily
Seasonal Adjustment
   SA = Seasonally Adjusted
   NSA = Not Seasonally Adjusted
   SAAR = Seasonally Adjusted Annual Rate
   NA = Not Applicable

Directories are limited to 1000 files or less.
Subdirectories are created recursively for each character in a series ID until 1000 or fewer series IDs match a substring.
Most zip files contain less than 1000 files and do not contain additional subdirectories.   

Note that for each series in this category, there is a row in the README file that lists the series file and its series properties: title, units, frequency, seasonal adjustment, and last updated. In the file 'README_SERIES_ID_SORT.txt', the series rows are sorted by series ID, or in other words by the name of the file. The other README file, 'README_TITLE_SORT.txt', has an identical structure to 'README_SERIES_ID_SORT.txt' except that the series rows are sorted by series title instead of series ID. These 2 README files can be thought of as indexes to the locations of each series file. Sometimes you may want to look up a series file by ID, and other times you may want to look up a series file by title. This can be useful for categories that contain hundreds or even thousands of files.
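
To use a README file as an index from a program, each series row can be split into its file name and properties. The Python sketch below is one possible way to do this, based only on the rows shown above. The function name parse_readme_row is hypothetical, and the code assumes that file names contain no spaces and that only the title (not the units or later fields) may contain a comma followed by a space.

```python
def parse_readme_row(line):
    """Split one README series row into a dictionary (hypothetical helper)."""
    # The file column ends at the first run of whitespace.
    file_name, rest = line.strip().split(None, 1)
    # Split from the right so that commas inside a title are left alone.
    title, units, frequency, adjustment, updated = rest.rsplit(", ", 4)
    return {
        "file": file_name,
        "title": title,
        "units": units,
        "frequency": frequency,
        "seasonal_adjustment": adjustment,
        "last_updated": updated,
    }


# Rows copied from the README excerpt above.
rows = [
    "CNP16OV.txt          Civilian Noninstitutional Population, Thous., M, NSA, 2004-06-10",
    "POP.txt              Total Population: All Ages including Armed Forces Overseas, Thous., M, NA, 2004-06-10",
    "POPTHM.txt           Population: Mid-Month, Thous., M, NA, 2004-06-10",
]

# Build a lookup from series title to file name.
by_title = {row["title"]: row["file"] for row in map(parse_readme_row, rows)}
print(by_title["Population: Mid-Month"])
```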

Example 2: Compressed file containing more than 1000 data series files, and not more than 1000 series IDs start with the same first letter.

Another major feature of the new directory structure is that directories are limited to 1000 files or fewer. To accommodate categories with more than 1000 files, subdirectories are created recursively for each character in a series ID until 1000 or fewer series IDs match a substring. Most zip files contain fewer than 1000 files and do not contain these additional subdirectories.

Consider the zip files that contain all FRED data series. As of December 2004, these zip files contain approximately 3000 data files. Unzipping FRED2_txt_2.zip creates the following directory structure:

FRED2_txt_2
|-- README_SERIES_ID_SORT.txt
|-- README_TITLE_SORT.txt
`-- data
    |-- A
    |   |-- AAA.txt
    |   |-- ADJBORNS.txt
    |   `-- ... 
    |-- B
    |   |-- BA3M.txt
    |   |-- BA6M.txt
    |   `-- ... 
    |-- C
    |   `-- ... 
    |-- D
    |   `-- ... 
    |-- E
    |   `-- ... 
    |-- F
    |   `-- ... 
    |-- G
    |   `-- ... 
    |-- H
    |   `-- ... 
    |-- I
    |   `-- ... 
    |-- J
    |   `-- ... 
    |-- K
    |   `-- ... 
    |-- L
    |   `-- ... 
    |-- M
    |   `-- ... 
    |-- N
    |   `-- ... 
    |-- O
    |   `-- ... 
    |-- P
    |   `-- ... 
    |-- R
    |   `-- ... 
    |-- S
    |   `-- ... 
    |-- T
    |   `-- ... 
    |-- U
    |   `-- ... 
    |-- V
    |   `-- ... 
    |-- W
    |   `-- ... 
    `-- X
        `-- ... 

Directory 'FRED2_txt_2\data\A' will contain all series files with series IDs starting with letter 'A', and directory 'FRED2_txt_2\data\B' will contain all series files with series IDs starting with letter 'B'. Note that in this example no series IDs start with letter 'Y' (or with 'Q' or 'Z'), and as a result no corresponding directories were created.

Note the README files show the path for the data series files. Below is an excerpt from README_SERIES_ID_SORT.txt.

Files in the data directory sorted by series id:

File                 Title, Units, Frequency, Seasonal Adjustment, Last Updated
A\AAA.txt            Moody's Seasoned Aaa Corporate Bond Yield, %, M, NA, 2004-06-10
A\ADJBORNS.txt       Adjustment Plus Seasonal Borrowings of Depository Institutions from the Federal Reserve, Bil. of $, M, NSA, 2003-02-07

Notice 'A\AAA.txt' and 'A\ADJBORNS.txt' show relative paths within directory 'data'.
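
Because the README entries use backslashes, a program running on a non-Windows system may need to convert them before opening a file. A minimal Python sketch, assuming the zip file was extracted to a directory named after it as described in the Summary of Changes (the function name local_path is hypothetical):

```python
from pathlib import Path, PureWindowsPath


def local_path(extract_dir, readme_entry):
    """Join a README 'File' entry onto <extract_dir>/data (hypothetical helper)."""
    # PureWindowsPath treats both '\' and '/' as separators on any platform,
    # so its parts are the individual directory and file names.
    parts = PureWindowsPath(readme_entry).parts
    return Path(extract_dir, "data", *parts)


print(local_path("FRED2_txt_2", r"A\AAA.txt").as_posix())
```

The same function also works for the entries in Example 1, which have no directory component.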

Example 3: Compressed file containing more than 1000 data series files, and more than 1000 series IDs start with letter 'A'.

Over time as new series are added to FRED, it may be the case that more than 1000 series IDs start with letter 'A'. In this case, the directory structure will be modified in the following way:

FRED2_txt_2
|-- README_SERIES_ID_SORT.txt
|-- README_TITLE_SORT.txt
`-- data
    |-- A
    |   |-- A
    |   |   |-- AAA.txt
    |   |   |-- AAB.txt
    |   |   `-- ...
    |   `-- D
    |       |-- ADJBORNS.txt
    |       |-- ADJRAM.txt    
    |       `-- ...
    |-- B
    |   |-- BA3M.txt
    |   |-- BA6M.txt
    |   `-- ... 
    |-- C
    |   `-- ... 
    |-- D
    |   `-- ... 
    |-- E
    |   `-- ... 
    |-- F
    |   `-- ... 
    |-- G
    |   `-- ... 
    |-- H
    |   `-- ... 
    |-- I
    |   `-- ... 
    |-- J
    |   `-- ... 
    |-- K
    |   `-- ... 
    |-- L
    |   `-- ... 
    |-- M
    |   `-- ... 
    |-- N
    |   `-- ... 
    |-- O
    |   `-- ... 
    |-- P
    |   `-- ... 
    |-- R
    |   `-- ... 
    |-- S
    |   `-- ... 
    |-- T
    |   `-- ... 
    |-- U
    |   `-- ... 
    |-- V
    |   `-- ... 
    |-- W
    |   `-- ... 
    `-- X
        `-- ... 

Note that all series IDs starting with 'A' have been pushed down a directory. Directory 'FRED2_txt_2\data\A\A' will contain files for all series IDs starting with 'AA', and directory 'FRED2_txt_2\data\A\D' will contain files for all series IDs starting with 'AD'. The parent directory 'FRED2_txt_2\data\A' will not contain any series files. In this example, all series IDs starting with letter 'A' have second letters of either 'A' or 'D'. In a more realistic example, there would be other second characters, and as a result other subdirectories would be created. Note that in this example 1000 or fewer files start with letter 'B', and, as a result, files in directory 'FRED2_txt_2\data\B' have not been pushed down further into subdirectories.
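
Since the depth of a series file can change over time, a program that does not read the README files can instead follow the series ID one character at a time until it finds the file. The Python sketch below illustrates this under the layout described in this example. The function name find_series_file and its parameters are hypothetical.

```python
from pathlib import Path


def find_series_file(data_dir, series_id, ext=".txt"):
    """Return the path of a series file under data_dir, or None if not found."""
    current = Path(data_dir)
    for ch in series_id:
        candidate = current / (series_id + ext)
        if candidate.is_file():
            return candidate
        # Not in this directory: descend into the subdirectory
        # named after the next character of the series ID.
        current = current / ch
        if not current.is_dir():
            return None
    candidate = current / (series_id + ext)
    return candidate if candidate.is_file() else None
```

For the directory tree shown above, find_series_file("FRED2_txt_2/data", "AAA") would check 'data' and 'data/A' before finding the file in 'data/A/A', while a lookup for 'BA3M' would stop one level higher, in 'data/B'.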

