CS345 - Topics in Data Warehousing
Autumn 2004

Assignment #4 Details

This page contains detailed information about Assignment #4.

Provided Scripts and Sample Code
The following SQL scripts and sample code fragments are provided in the directory /usr/class/cs345/HW4 on Stanford's UNIX systems.
  • createTables.sql: Creates the metadata tables used to describe the schema.
  • insertRows1.sql: Populates the metadata tables with one configuration that can be used to test your code.
  • insertRows2.sql: Populates the metadata tables with another configuration that can be used to test your code.
  • deleteRows.sql: Deletes the configuration information inserted by insertRows1.sql or insertRows2.sql.
  • dropTables.sql: Drops the metadata tables.

Results on Test Data

When you run your decision tree algorithm using the metadata rows created by the insertRows1.sql script, it should learn a decision tree with the following structure:

Split on: DHOUR89
  DHOUR89=5: Split on: DWEEK89
    DWEEK89=2: Split on: IMILITARY
      IMILITARY=4: LEAF, class=3, correct = 86/253
      IMILITARY=3: LEAF, class=3, correct = 4/14
      IMILITARY=2: LEAF, class=3, correct = 26/64
      IMILITARY=1: LEAF, class=2, correct = 5/18
    DWEEK89=1: LEAF, class=1, correct = 52/141
  DHOUR89=4: LEAF, class=2, correct = 118/287
  DHOUR89=3: Split on: DWEEK89
    DWEEK89=2: LEAF, class=2, correct = 393/852
    DWEEK89=1: LEAF, class=1, correct = 333/543
  DHOUR89=2: LEAF, class=1, correct = 210/353
  DHOUR89=1: LEAF, class=1, correct = 394/470
  DHOUR89=0: LEAF, class=0, correct = 2599/2600
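
To compare your own output against the listing above, it can help to print your tree in the same layout. The following is a minimal Python sketch of one way to do that. The Node class, its field names, and the format_tree function are hypothetical and are not part of the provided scripts. The sketch also assumes, based only on the listing above, that children are printed in descending order of attribute value and indented two spaces per level.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical in-memory tree node (an assumption, not from the assignment)."""
    split_attr: Optional[str] = None   # attribute tested here; None for a leaf
    children: Dict[int, "Node"] = field(default_factory=dict)  # value -> subtree
    label: Optional[int] = None        # predicted class at a leaf
    correct: int = 0                   # training rows at the leaf matching label
    total: int = 0                     # all training rows reaching the leaf


def describe(node: Node) -> str:
    """Return the text for one node, without indentation or branch prefix."""
    if node.split_attr is None:
        return f"LEAF, class={node.label}, correct = {node.correct}/{node.total}"
    return f"Split on: {node.split_attr}"


def format_tree(node: Node, prefix: str = "", depth: int = 0) -> List[str]:
    """Return the tree as a list of lines in the layout used on this page."""
    lines = ["  " * depth + prefix + describe(node)]
    if node.split_attr is not None:
        # Assumption: children appear in descending order of attribute value.
        for value in sorted(node.children, reverse=True):
            branch = f"{node.split_attr}={value}: "
            lines.extend(format_tree(node.children[value], branch, depth + 1))
    return lines


if __name__ == "__main__":
    # The DWEEK89 subtree found under DHOUR89=3 in the listing above.
    subtree = Node(
        split_attr="DWEEK89",
        children={
            1: Node(label=1, correct=333, total=543),
            2: Node(label=2, correct=393, total=852),
        },
    )
    print("\n".join(format_tree(subtree)))
```

Writing the lines to a file and running diff against the expected listing is then a quick way to spot structural differences.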

When you run your decision tree algorithm using the metadata rows created by the insertRows2.sql script, you should get the following results:

Split on: IMILITARY
  IMILITARY=4: Split on: DHOUR89
    DHOUR89=5: LEAF, class=0, correct = 250/366
    DHOUR89=4: Split on: DINCOME1
      DINCOME1=4: LEAF, class=0, correct = 9/9
      DINCOME1=3: LEAF, class=0, correct = 58/72
      DINCOME1=2: LEAF, class=0, correct = 60/100
      DINCOME1=1: LEAF, class=1, correct = 25/45
      DINCOME1=0: LEAF, class=0, correct = 3/7
    DHOUR89=3: Split on: DINCOME1
      DINCOME1=4: LEAF, class=0, correct = 13/17
      DINCOME1=3: LEAF, class=0, correct = 118/180
      DINCOME1=2: LEAF, class=1, correct = 239/427
      DINCOME1=1: LEAF, class=1, correct = 288/465
      DINCOME1=0: LEAF, class=0, correct = 18/34
    DHOUR89=2: LEAF, class=1, correct = 242/328
    DHOUR89=1: LEAF, class=1, correct = 297/428
    DHOUR89=0: Split on: DINCOME1
      DINCOME1=2: LEAF, class=0, correct = 0/0
      DINCOME1=1: LEAF, class=0, correct = 1/1
      DINCOME1=0: LEAF, class=1, correct = 879/1155
  IMILITARY=3: LEAF, class=0, correct = 61/63
  IMILITARY=2: LEAF, class=0, correct = 554/578
  IMILITARY=1: LEAF, class=0, correct = 39/44
  IMILITARY=0: LEAF, class=0, correct = 663/1276
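
One quick sanity check on a printed tree is to add up the "correct = x/y" counts over all leaves: the y values should sum to the number of training rows, and the x values give the number of training rows classified correctly. The following Python sketch does this for text in the format shown above. The function name summarize_leaves and the regular expression are illustrative assumptions, not part of the provided code.

```python
import re
from typing import Tuple

# Matches leaf lines such as "IMILITARY=3: LEAF, class=0, correct = 61/63".
LEAF_RE = re.compile(r"LEAF, class=(\d+), correct = (\d+)/(\d+)")


def summarize_leaves(text: str) -> Tuple[int, int, int]:
    """Return (number of leaves, total correct, total rows) for a printed tree."""
    leaves = correct = total = 0
    for match in LEAF_RE.finditer(text):
        leaves += 1
        correct += int(match.group(2))
        total += int(match.group(3))
    return leaves, correct, total


if __name__ == "__main__":
    # The last four leaves of the listing above.
    sample = """\
  IMILITARY=3: LEAF, class=0, correct = 61/63
  IMILITARY=2: LEAF, class=0, correct = 554/578
  IMILITARY=1: LEAF, class=0, correct = 39/44
  IMILITARY=0: LEAF, class=0, correct = 663/1276
"""
    leaves, correct, total = summarize_leaves(sample)
    print(f"{leaves} leaves, {correct}/{total} correct")
```

In the two listings on this page, the leaf denominators sum to 5595 in each tree, so a different sum in your own output suggests that some rows were dropped or counted twice.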