Please select an option below
- Linux Primers
- Launching Batch Jobs
- Datastream Basic Usage
- Finding WRDS SAS Data Set Layouts & Variable Information
It is assumed that you've already installed PuTTY and WinSCP and know how to access your Flux cluster account.
Submitting batch jobs is the most efficient way to work on the Flux cluster. The cluster needs some information wrapped around your code so that it can submit your job for execution. Wrapping jobs this way allows the Flux scheduler to fit jobs to the processing resources available on the system. This means you must create a PBS batch file to submit your code.
Detailed information on setting up Flux cluster and submitting batch jobs on Flux is available at the following weblink. Just follow the section 'Flux in 10 easy steps'.
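As a rough illustration of what such a PBS batch file contains, here is a minimal Python sketch that renders one as text. The `#PBS` directives shown are standard PBS/Torque options, but the job name, account, queue, resource values and the `sas my_program.sas` command are all placeholder assumptions; take the real values from the Flux documentation and your own allocation.

```python
def render_pbs_script(job_name, command, account, queue,
                      nodes=1, ppn=1, walltime="01:00:00"):
    """Return the text of a PBS batch file that wraps `command`."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",   # job name shown by the scheduler
        f"#PBS -A {account}",    # account to charge (placeholder value below)
        f"#PBS -q {queue}",      # queue to submit to (placeholder value below)
        # resources requested: nodes, processors per node, wall-clock limit
        f"#PBS -l nodes={nodes}:ppn={ppn},walltime={walltime}",
        "#PBS -j oe",            # merge stdout and stderr into one file
        "#PBS -V",               # export the submitting shell's environment
        "",
        "cd $PBS_O_WORKDIR",     # start in the directory qsub was run from
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
script = render_pbs_script(
    job_name="my_sas_job",
    command="sas my_program.sas",
    account="example_flux",
    queue="flux",
)
print(script)
```

Saving the printed text to a file (say `my_job.pbs`, a hypothetical name) and running `qsub my_job.pbs` on the cluster is the usual way to hand such a file to a PBS scheduler.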
Using Datastream's AFO (Advance For Office) Excel 2010 plug-in:
- Launch Excel
- Select the Datastream tab
There are essentially two types of searches:
1.) Static
Select "Static Request"
This is used to find data which either do not change over time or change only infrequently (for example, name, ISIN, location, etc.). For items which change *infrequently* over time, there will be multiple entries within a single static variable entry. To my knowledge, no list exists of the variables which exhibit the "change infrequently" characteristic.
2.) Time Series
Select "Time Series"
This is used to find data which changes regularly over time (typically all income statement, balance sheet items, etc.).
For each type of search you must identify the sample universe
This can be done in two ways.
- Select a previously defined list. Select the icon just beneath the "Find Series" button (it shows a magnifying glass). This displays the names of all previously user-defined lists. Double-click on one to select it.
- Identify a sample using the Navigator: Select "Find Series," then select Criteria Search. This brings up the Navigator proper.
First, select a Data Category (selectable list is to the immediate right)
Then there are a variety of ways to identify firms. Most are self-explanatory. Unfortunately there doesn't exist any documentation which describes all of the available search options (this according to the Datastream support folks).
Determining how best to get at your firms of interest can be extremely tricky. Once you have chosen your search criteria, click on "Search Now" to display the results. Here is where things get really interesting. No matter how many results are displayed, you can only select 100 at a time to include in your actual search. If your sample universe consists of more than 100 firms, you can (a) manually select 100, process them, then manually select the next 100, and so on; (b) shrink the list by judicious use of the Name search criterion (this too is obnoxious: no wildcard scheme exists, so you cannot specify ranges but must explicitly identify starting letters/numbers); or (c) create a list (which also has its problems).
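If you go with option (a), it can help to plan the batches of 100 ahead of time, outside of Excel. The following is a minimal Python sketch of that bookkeeping only; it does not talk to Datastream, and the firm codes are made-up placeholders.

```python
def batches(items, size=100):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical sample universe of 250 firm codes.
firm_codes = [f"FIRM{i:04d}" for i in range(250)]
groups = list(batches(firm_codes))
print([len(g) for g in groups])  # [100, 100, 50]
```

Each group then corresponds to one round of manual selection in the Navigator.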
To create a list:
- Bring up the Navigator
- Select appropriate search criterion
- Select Search Now
- Select the icon on the far right (next to the "Displayed Results" box) which contains a tiny Excel symbol. A dialog box will appear. Note that you are restricted to exporting a maximum of 8000 firms. If your sample universe exceeds that limit, you will have to do some of the pruning suggested in (b) above.
- Choose which variables you want to export to Excel (probably doesn't hurt to export all of them, which is the default)
- Select Transfer to Excel
- Select the down-arrow next to the Save button
- Select Save-As
- Close dialog-box
- Open the newly created Excel file
- Select the Datastream tab
- Select "Create List (From Range)"
- Choose the column which corresponds to the DS Code (which is the Datastream proprietary ID)
- Enter a meaningful name under "List Description" as this is what will appear when you go to select a previously defined list (see above).
- Enter a file name under "List File Name"; do not change the "LLT" suffix!
- Select OK
- Close the Excel file
Now if you go back to the first method given above (selecting a previously defined list), the list you just created will be displayed using the information you provided in the "List Description" field.
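When the sample universe is larger than the export limit noted in the steps above, pruning by starting letter/number works best if you know roughly how many firms fall under each starting character. Here is a small Python sketch of that tally, assuming you already have the firm names in some form; the names and the tiny `limit` used below are made up for illustration.

```python
from collections import Counter


def counts_by_first_char(names):
    """Count names by their upper-cased first character."""
    return Counter(name[0].upper() for name in names if name)


def oversized_groups(names, limit):
    """Return the starting characters whose group exceeds `limit`."""
    counts = counts_by_first_char(names)
    return sorted(ch for ch, n in counts.items() if n > limit)


# Hypothetical firm names; a real run would use the export limit as `limit`.
names = ["Acme", "Apex", "alpha corp", "Borealis", "3X Holdings"]
print(counts_by_first_char(names))
print(oversized_groups(names, limit=2))  # ['A']
```

Any starting character flagged as oversized would need a longer, more specific prefix in the Name search criterion.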
Identify the data items of interest
- Select Datatypes
- Select the appropriate Data Category (should match the Data Category you used to identify your sample firms)
Within a given data category, there are many, many data types from which to choose. You can drill down by opening and closing and selecting sub-categories (works like the Windows Explorer tool).
Use the Datatype bar options to restrict the displayed variables to the type of search you are doing, either Static or Time Series. If you are conducting a Time Series search and select some Static variables, you will obtain no output for those Static variables. The same is true for selecting Time Series variables in a Static search.
Also useful is the Find bar (it appears just above the Datatype bar and just below the Data Category selection line). This will allow you to filter the displayed variables based upon various elements of their names.
Once you've identified an appropriate subset of potential variables, select the ones you want by clicking on the empty box to the right of the variable name. A check mark will appear (this is a toggle). Once you have checked all variables of interest, click on "Use Selected" (which appears in the Variable box header to the right of Name).
The procedure from this point depends upon the type of search you are conducting.
Static search: select Submit. The firms will be listed vertically starting in the column in which the cursor was located when you started the Datastream search process. Each data item will occupy a separate column.
Time Series search: Enter a Start Date, enter an End Date, enter a data frequency, select Transpose Data, then select Submit. The firms will be listed vertically starting in the column in which the cursor was located when you started the Datastream search process. There will be one line per firm per data variable, and each date will occupy a separate column. This layout would be reversed if you did not select "Transpose Data" above. However, most folks seem to find it easier to post-process with firm/variables as rows and dates as columns.
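For post-processing, the transposed layout (one row per firm per variable, one column per date) is often reshaped into one record per firm, variable and date. The Python sketch below shows one way to do that. It assumes a header of the form firm, variable, then one column per date, which may not match the exact columns Datastream writes, and the firm names and values are invented for illustration.

```python
def wide_to_long(header, rows):
    """Turn transposed rows into (firm, variable, date, value) records.

    Assumes header = ["firm", "variable", date1, date2, ...] and that
    every row follows the same column order. Empty cells are skipped.
    """
    dates = header[2:]
    records = []
    for row in rows:
        firm, variable, values = row[0], row[1], row[2:]
        for date, value in zip(dates, values):
            if value in ("", None):
                continue  # no observation for this date
            records.append((firm, variable, date, value))
    return records


# Made-up data in the assumed layout.
header = ["firm", "variable", "2010-12-31", "2011-12-31"]
rows = [
    ["FIRM_A", "SALES", 100.0, 110.0],
    ["FIRM_A", "ASSETS", 500.0, ""],
    ["FIRM_B", "SALES", 42.0, 40.0],
]
long_records = wide_to_long(header, rows)
for record in long_records:
    print(record)
```

The long form has one observation per record, which tends to be easier to sort, filter and merge with other data sets.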
How to find WRDS SAS data set layouts
- Log on to the WRDS web site
- Select Support
- Select Dataset List (under the Data Contents section)
- Scroll down until you find the name of the Database of interest and select it
- Scroll through the list of available tables and select the one of interest. A table will be displayed which describes the contents of the SAS data set plus a typically not very insightful description of the variables.
How to find additional variable description information
- Log on to the WRDS web site
- Select Support
- Select Data Vendor Manuals (under the Data Contents section)
- Select the appropriate data set. The value of the manual and the ease of use varies across vendors.
An example of using Compustat:
- Select Compustat
- Select Compustat Online Manual
- Select the Search tab, type in the variable name found in the table layout, press ENTER. This will give you a list of all references within the Compustat manual for that variable name. Since you are entering a specific variable name this usually resolves to a single item, the selection of which takes you to the page which describes the variable. Do this for each variable of interest.