Courses/CS 460/Fall 2007/Test Harness

Discussions and findings of the Test Harness project group.

TH Group Meeting #7

Friday, Nov 30 2007.
ECST - E&T A331, ca 3:00pm

  • New Test
  • SP Fitness Test and What to Modify

SP Death Penalty and Home Penalty Search Results

Image:Penalty.jpg

The SP fitness has three variables: a death penalty (applied if the path leads to death), a home penalty (does the path point to home?), and a path length penalty (longer paths are better). We searched for better values for these factors (the defaults are 10,000,000).

TH Group Meeting #6

Friday, Nov 16 2007.
ECST - E&T A331, ca 12:00pm

Topics Discussed

  • ST/SP/BEST/BFS test.


ST, SP, BEST, & BFS Search Results

Image:ST&SP&BEST&BFS.jpg
In the above figure, Fitness doesn't apply to ST & SP search.

ST, SP, BEST, & BFS Search Data

STSearch/SP Search Results (Extended)

Image:04_Angle_11_20_Torpedeos.jpg

STSearch/SP Search Data (Extended)

A long run

I did a long multi-run with the version that was uploaded to the wiki page on November 8, 2007, with the following results after 10,000 trials. Russ Abbott 13:30, 12 November 2007 (PST)

ST: true 8994/10000 = 90%;  Sp: true 7693/10000 = 77%;

For what it's worth, the percentages have been 90% and 77% from trial 353 onward.
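
As a rough check on these figures, the unrounded rates and their sampling error can be computed directly. This is only a sketch: the class SuccessRate is made up for illustration and is not project code, and the standard error formula assumes the trials are independent.

```java
// Hypothetical helper for checking the reported figures; not project code.
final class SuccessRate {
    /** Success rate as a percentage, e.g. 8994 of 10000 gives 89.94. */
    static double percent(int successes, int trials) {
        return 100.0 * successes / trials;
    }

    /** Binomial standard error of the rate, in percentage points. */
    static double standardErrorPercent(int successes, int trials) {
        double p = (double) successes / trials;
        return 100.0 * Math.sqrt(p * (1.0 - p) / trials);
    }

    public static void main(String[] args) {
        // Counts taken from the long run reported above.
        System.out.printf("ST: %.2f%% +/- %.2f%n",
                percent(8994, 10000), standardErrorPercent(8994, 10000));
        System.out.printf("Sp: %.2f%% +/- %.2f%n",
                percent(7693, 10000), standardErrorPercent(7693, 10000));
    }
}
```

Under that independence assumption, the standard errors come out at roughly 0.3 and 0.4 percentage points, so the gap between ST and Sp is far larger than the sampling noise.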

TH Group Meeting #5

Friday, Nov 9 2007.
ECST - E&T A331/Lab, ca 1:00pm

Topics Discussed

  • ST/SP vs #Torpedoes & Max Angle.
  • Boat & Torpedoes code.
    • Possible memory leak.
    • OutOfMemoryError.


STSearch vs #Torpedoes & Max Angle Results

Image:ST3D_2.jpg

SPSearch vs #Torpedoes & Max Angle Results

Image:SP3D_2.jpg

STSearch/SP Search Data

The data for the test runs and 2D graphs for each test run.

BoatMaxAngle: 0.05, BoatMaxAngle: 0.06, BoatMaxAngle: 0.07, BoatMaxAngle: 0.08, BoatMaxAngle: 0.09,
BoatMaxAngle: 0.10, BoatMaxAngle: 0.20, BoatMaxAngle: 0.30, BoatMaxAngle: 0.40, BoatMaxAngle: 0.50.


Multirun XML

Example:

0.06.xml

<?xml version="1.0" encoding="UTF-8"?>
<Repast:Params xmlns:Repast="http://www.src.uchicago.edu">
runs: 1
TorpedoCount {
set_list: 1 2 3 4 5 6 7 8 9 10
}
Multirun {
set_string: 100
}
BoatMaxAngle {
set: 0.06
}
</Repast:Params>
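
Since the ten BoatMaxAngle files differ only in one value, they could be generated rather than edited by hand. The sketch below copies the layout of the 0.06.xml example; the class ParamFileWriter and its method names are made up for illustration and are not part of the project.

```java
import java.util.Locale;

// Hypothetical generator for the per-angle parameter files. The file layout
// is copied from the 0.06.xml example; the class itself is illustrative.
final class ParamFileWriter {
    /** Builds the parameter file text for one BoatMaxAngle value. */
    static String paramsFor(double boatMaxAngle) {
        // Two decimals, so 0.1 is written as 0.10, matching the test-run list.
        String angle = String.format(Locale.ROOT, "%.2f", boatMaxAngle);
        return String.join("\n",
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
                "<Repast:Params xmlns:Repast=\"http://www.src.uchicago.edu\">",
                "runs: 1",
                "TorpedoCount {",
                "set_list: 1 2 3 4 5 6 7 8 9 10",
                "}",
                "Multirun {",
                "set_string: 100",
                "}",
                "BoatMaxAngle {",
                "set: " + angle,
                "}",
                "</Repast:Params>",
                "");
    }

    public static void main(String[] args) {
        double[] angles = {0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.20, 0.30, 0.40, 0.50};
        for (double a : angles) {
            // Print the intended file name followed by its contents.
            System.out.println("# " + String.format(Locale.ROOT, "%.2f", a) + ".xml");
            System.out.print(paramsFor(a));
        }
    }
}
```

The main method only prints the intended file names and contents; saving each block under its name (for example 0.06.xml) is left to the tester.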


OutOfMemoryError

A common problem encountered during tests with small BoatMaxAngle values.

Exception in thread "Thread-6" java.lang.OutOfMemoryError: Java heap space
 at java.awt.image.DataBufferInt.<init>(Unknown Source)
 at java.awt.image.Raster.createPackedRaster(Unknown Source)
 at java.awt.image.DirectColorModel.createCompatibleWritableRaster(Unknown Source)
 at sun.awt.Win32GraphicsConfig.createCompatibleImage(Unknown Source)
 at uchicago.src.sim.gui.Painter.createBufferedImage(Unknown Source)
 at uchicago.src.sim.gui.LocalPainter.paint(Unknown Source)
 at uchicago.src.sim.gui.DisplaySurface.updateDisplayDirect(Unknown Source)
 at edu.csula.cs.boatTorpedo.BoatTorpedoModel.begin(BoatTorpedoModel.java:156)
 at uchicago.src.sim.engine.BaseController.beginModel(Unknown Source)
 at uchicago.src.sim.engine.BaseController.startSim(Unknown Source)
 at uchicago.src.sim.engine.BatchController.start(Unknown Source)
 at uchicago.src.sim.engine.BatchController$1.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)

Workaround: increase the Java heap space with the following JVM options.

-Xms<initial size>m -Xmx<max size>m
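
To confirm that the heap options actually took effect, a small check like the one below can be run with the same flags. It is a minimal sketch: the class HeapReport is made up for illustration and is not part of the BoatTorpedo code. It relies only on the standard Runtime methods maxMemory(), totalMemory(), and freeMemory().

```java
// Hypothetical helper (not part of the BoatTorpedo code): prints the heap
// limits the JVM is actually running with.
final class HeapReport {
    /** Converts a byte count to whole megabytes (rounding down). */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects -Xmx; totalMemory() is the heap currently reserved.
        System.out.println("Max heap (MB):   " + toMegabytes(rt.maxMemory()));
        System.out.println("Total heap (MB): " + toMegabytes(rt.totalMemory()));
        System.out.println("Free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

For example, running it as java -Xms256m -Xmx512m HeapReport should report a maximum close to 512 MB; the exact figure can vary by JVM.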

TH Group Meeting #4

Friday, Nov 2, 2007.
ECST - E&T Lab, ca 1:00pm

Topics Discussed

  • ST/SP vs #Torpedoes & Speed.
  • XML & Multirun testing.
  • Test code modifications.
  • Duration of test runs.
  • Data graphing (2D/3D).
  • Future ST/SP test proposal.


STSearch vs #Torpedoes & Speed Results

Image:ST_3D.jpg
Plot of (#Torpedoes, Speed Multiplier) Versus %Success for ST Search

SPSearch vs #Torpedoes & Speed Results

Image:SP_3D.jpg
Plot of (#Torpedoes, Speed Multiplier) Versus %Success for SP Search

STSearch/SPSearch vs #Torpedoes & Speed Results Overlay Plot

Image:ST_SP_3D_overlay.jpg
The ST search success rate decays less rapidly than the SP search success rate. The ST plot is the top mesh; the SP plot is the bottom mesh.

STSearch/SP Search Data

The data for the test runs and 2D graphs for each test run.

BoatSpeed: 0.006, BoatSpeed: 0.007, BoatSpeed: 0.008, BoatSpeed: 0.009, BoatSpeed: 0.010.

Multirun XML

Example:

0.007.xml

<?xml version="1.0" encoding="UTF-8"?>
<Repast:Params xmlns:Repast="http://www.src.uchicago.edu">
runs: 1
TorpedoCount {
set_list: 1 2 3 4 5 6 7 8 9 10
}
Multirun {
set_string: 100
}
BoatSpeed {
set: 0.007
}
</Repast:Params>

TH Group Meeting #3

Friday, Oct 26, 2007.
ECST - E&T A331/Lab, ca 1:00pm

Topics Discussed

  • ST/SP vs #Torpedoes revisited
    • Dr. Abbott's Multirun code
  • TortoiseSVN review
  • Test-group repository
  • Test Harness framework
  • Data farming


Data Farming

Although you will not be able to create a data farming framework, you should read some of the information about data farming. Wikipedia has a brief introduction, including a number of additional links. -- Russ Abbott 07:06, 26 October 2007 (PDT)


STSearch vs # Torpedoes Revisited Results

Image:ST_revised.jpg
Plot of Number of Torpedoes Versus Number of Successes for ST Search


SPSearch vs # Torpedoes Revisited Results

Image:SP_revised.jpg
Plot of Number of Torpedoes Versus Number of Successes for SP Search


ST & SP Search vs # Torpedoes Results

Image:ST&SP_1-12_10.26.2007.gif

STSearch/SPSearch Data

Torpedo 1, Torpedo 2, Torpedo 3, Torpedo 4, Torpedo 5, Torpedo 6, Torpedo 7, Torpedo 8, Torpedo 9, Torpedo 10, Torpedo 11, Torpedo 12.


Multirun XML

<?xml version="1.0" encoding="UTF-8"?>
<Repast:Params xmlns:Repast="http://www.src.uchicago.edu">
runs: 1
TorpedoCount {
set_list: listOfTorpedoes
}
Multirun {
set_string: numberOfRuns for the search strategies
}
</Repast:Params>


Possible Display bug

Image:what_the.jpg

TH Group Meeting #2

Friday, Oct 19, 2007.
ECST - E&T A331, ca 1:00pm

Topics Discussed

  • ST/SP vs #Torpedoes
    • Test proposal.
    • Test demonstration.
    • ST XML test file usage.
    • Test run procedures.
    • SP XML test file generation.
    • Test run distributions.
  • Abbott's Multi-run Code.
  • Finding Minimal Parameters.


Multirun XML format

getseed.xml

<?xml version="1.0" encoding="UTF-8"?>
<Repast:Params xmlns:Repast="http://www.src.uchicago.edu">
runs: numberOfRuns
TorpedoCount {
set: numberOfTorpedos
}
</Repast:Params>


sp.xml

<?xml version="1.0" encoding="UTF-8"?>
<Repast:Params xmlns:Repast="http://www.src.uchicago.edu">
runs: numberOfRuns
TorpedoCount {
set: numberOfTorpedos
}
Rerun {
set_boolean: true
}
Seed {
set_list: listOfSeeds
}
Strategy_Sp_ST {
set_string: searchMethod
}
</Repast:Params>


sp8.xml

<?xml version="1.0" encoding="UTF-8"?>
<Repast:Params xmlns:Repast="http://www.src.uchicago.edu">
runs: 1
TorpedoCount {
set: 8
}
Rerun {
set_boolean: true
}
Seed {
set_list: 
1192840808942 1192840866239 1192840876833 1192840919551 1192841902567
1192840930801 1192840965504 1192840973098 1192840986020 1192841008629
1192841019911 1192841029583 1192841040723 1192841105348 1192841114536
1192841176458 1192841183958 1192841253067 1192841264864 1192841272442
1192841283208 1192841291911 1192841356270 1192841410333 1192841419411
1192841447676 1192841456426 1192841464879 1192841474801 1192841484161
1192841493286 1192841535786 1192841546442 1192841568895 1192841578754
1192841588083 1192841596692 1192841644739 1192841709004 1192841721426
1192841748614 1192841757473 1192841808911 1192841824879 1192841890958
1192841930536 1192841960786 1192841973083 1192841983301 1192841995504
1192842045036 1192842056833 1192842067004 1192842144239 1192842166551
1192842191208 1192842202004 1192842234192 1192842267879 1192842289926
1192842333645 1192842345020 1192842373458 1192842381161 1192842390629
1192842442708 1192842541708 1192842550520 1192842561708 1192842571817
1192842599129 1192842610051 1192842619395 1192842628364 1192842637551
1192842664145 1192842699864 1192842723817 1192842730458 1192842739364
1192842748770 1192842756176 1192842776770 1192842785848 1192842824661
1192842835192 1192842869817 1192842880270 1192842939879 1192842949458
1192842986536 1192843032348 1192843056739 1192843065442 1192843073723
1192843083598 1192843150551 1192843204708 1192843258973 1192843278754
}
Strategy_Sp_ST {
set_string: Sp
}
</Repast:Params>



STSearch vs # Torpedoes Preliminary Results

Image:ST_100.jpg
Plot of Number of Torpedoes Versus Number of Successes for ST Search


SPSearch vs # Torpedoes Preliminary Results

Image:SP_100.jpg
Plot of Number of Torpedoes Versus Number of Successes for SP Search

TH Group Meeting #1

Friday, Oct 12, 2007.
ECST - E&T A331, ca 1:00pm

Topics Discussed

  • Code modification for testing.
  • Data extraction from multiple runs.
  • Possible parameter modifications.
  • Possible GUI modifications.
  • Organizing/Parsing data for graphing.
  • Distributing test runs.
  • Infinite looping problem (see image below).

Image:I_loop.jpg
Common problem encountered during testing; the boat & torpedoes would spin forever.


STSearch Preliminary Results

Image:Sts_graph1.jpg
Plot of Branching Factor Versus Number of Successes

Discussion

  • Hardware relevance. Testers on different computers obtained significantly different results.
  • Improving testing speed. 100 tests can take 30 to 55 minutes.
  • Stable code version. The source code changes at least twice a day.


Some tests I'd like to conduct
The current system (just uploaded) works nicely. I'd like to try varying some of the parameters to see what effect that has.

  • STSearch (in the package edu.csula.cs.boatTorpedo.treeSearch.frontierSearch.stSearch) has a private int variable branchingFactor. The default is 7. I'd like to see how well the boat does with branching factor values ranging from 2 to 20. That means something like this: make 100 runs for each value of branchingFactor from 2 to 20 and record how many times (or what percentage of the time) the boat reaches the home base. Then plot the results.
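
The sweep described above could be organized roughly as follows. This is a sketch only: the class BranchingFactorSweep and the runTrial callback are hypothetical stand-ins for a full simulation run with a given branching factor, and the coin-flip trial in main is a placeholder, not a real result.

```java
import java.util.Random;
import java.util.function.IntPredicate;

// Hypothetical sweep harness; runTrial stands in for one full simulation run
// with the given branching factor and is not part of the STSearch code.
final class BranchingFactorSweep {
    /** Returns the success percentage for each branching factor in [from, to]. */
    static double[] sweep(int from, int to, int runsPerValue, IntPredicate runTrial) {
        double[] percent = new double[to - from + 1];
        for (int bf = from; bf <= to; bf++) {
            int successes = 0;
            for (int run = 0; run < runsPerValue; run++) {
                if (runTrial.test(bf)) {
                    successes++;
                }
            }
            percent[bf - from] = 100.0 * successes / runsPerValue;
        }
        return percent;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Placeholder trial: a coin flip, NOT a real simulation result.
        IntPredicate fakeTrial = bf -> rng.nextBoolean();
        double[] percent = sweep(2, 20, 100, fakeTrial);
        for (int i = 0; i < percent.length; i++) {
            // One "branchingFactor,percent" line per value, ready for plotting.
            System.out.println((i + 2) + "," + percent[i]);
        }
    }
}
```

Printing one comma-separated line per branching factor keeps the output easy to paste into a spreadsheet for the plot.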

Multirun

I'm just uploading a version that allows multiple runs to compare different search strategies. (Sorry, I still haven't installed Subclipse.) This is implemented independently of the Repast multiple-run capability. (That isn't because that capability isn't useful; I just did it this way.)

If the parameter Multirun is 0, the system operates normally. If it is greater than 0, it does that many runs of each of the search strategies hard-coded into the BoatTorpedoModel.strategies[] array. (So the code must be modified to compare a different set of search strategies. Currently it compares ST and Sp.)

It works as follows. At the start it generates a sequence of random numbers, which it uses as seeds for the multiple runs. So if Multirun is 5, it will generate 5 random numbers. It then uses each of those random numbers for a separate run of each search strategy. So as currently set up, it uses each random number once for ST and once for Sp. That way the search strategies are compared on the same set of initial conditions.
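
The seeding scheme just described can be sketched as follows. This is not the BoatTorpedoModel code: the Strategy interface, the class MultirunSketch, and the two stand-in strategies (with arbitrary success thresholds) are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical sketch of the seeding scheme; the Strategy interface and the
// stand-in strategies below are illustrative, not the BoatTorpedoModel code.
final class MultirunSketch {
    /** A search strategy reports whether the boat reached home for a given seed. */
    interface Strategy {
        boolean run(long seed);
    }

    /** Generates one seed per run, all up front. */
    static long[] generateSeeds(int multirun, Random source) {
        long[] seeds = new long[multirun];
        for (int i = 0; i < multirun; i++) {
            seeds[i] = source.nextLong();
        }
        return seeds;
    }

    /** Runs every strategy once per seed and returns success counts per strategy. */
    static int[] compare(Strategy[] strategies, long[] seeds) {
        int[] successes = new int[strategies.length];
        for (long seed : seeds) {
            for (int s = 0; s < strategies.length; s++) {
                if (strategies[s].run(seed)) {
                    successes[s]++;
                }
            }
        }
        return successes;
    }

    public static void main(String[] args) {
        // Stand-in strategies with arbitrary thresholds; each derives its
        // outcome from the seed only, so both see the same initial conditions.
        Strategy st = seed -> new Random(seed).nextDouble() < 0.5;
        Strategy sp = seed -> new Random(seed).nextDouble() < 0.25;
        long[] seeds = generateSeeds(5, new Random());
        int[] wins = compare(new Strategy[] {st, sp}, seeds);
        System.out.println("ST: " + wins[0] + "/" + seeds.length
                + "  Sp: " + wins[1] + "/" + seeds.length);
    }
}
```

Generating the seeds before any run starts is what makes the comparison fair: no strategy can influence the initial conditions another strategy sees.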

If Display is left on, each run will be displayed. There is a bug that causes the results of previous runs to be left on the display; I don't know why that happens. Look at initializeRun() for my attempt to remove the previous avatars. Usually you won't want to see the display, since you will be making many runs and the display slows the system.

-- Russ Abbott 08:38, 19 October 2007 (PDT)

P.S. The actual code is a bit hacked. Basically, it treats the Multirun as a single Repast run and generates new initial conditions when it needs them. See preStep() and postStep(). It uses the boolean newRun as a flag to indicate whether it should generate a new initial configuration.

Multirun Data

Here is some data from a number of multi-runs comparing ST and Sp with different values for the adjustment in AgentState.leftOrRightToIntersect(): "Adjustment parameter tests" (txt).


Image:AdjustST.jpg
Changing the adjustment parameter on ST Search

Image:AdjustSP.jpg
Changing the adjustment parameter on SP Search