Using the first 10000 examples of each dataset (6192 for Kin, which is smaller), we trained models with various values of q, ranging from 2 to 100. We used a fixed cache size of 100 MB and enabled shrinking, but did not use the sparse mode. The subproblems of size q > 2 were solved with a conjugate gradient method with projection [11]. Table 2 reports the results of these experiments. It is clear that q = 2 is always faster than any other value of q; we therefore used q = 2 in all subsequent experiments.
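To illustrate the kind of subproblem the inner optimizer faces, the following sketch solves a box-constrained quadratic program over a working set of q variables. Note that this is a simplified projected-gradient variant, not the conjugate gradient method with projection used in the experiments; the function name, step count, and learning rate are illustrative choices, not part of the original setup.

```python
# Sketch of a solver for the box-constrained QP subproblem
#   min 1/2 a^T Q a - p^T a   s.t. 0 <= a_i <= C,
# which arises when optimizing a working set of size q.
# Simplified projected gradient, for illustration only.

def solve_subproblem(Q, p, C, steps=200, lr=0.1):
    n = len(p)
    a = [0.0] * n
    for _ in range(steps):
        # gradient of the quadratic objective: Q a - p
        g = [sum(Q[i][j] * a[j] for j in range(n)) - p[i]
             for i in range(n)]
        # gradient step followed by projection onto the box [0, C]^n
        a = [min(C, max(0.0, a[i] - lr * g[i])) for i in range(n)]
    return a

# Tiny 2-variable example: the unconstrained optimum (0.5, 2.5)
# is clipped by the box constraint to (0.5, 1.0).
Q = [[2.0, 0.0], [0.0, 2.0]]
p = [1.0, 5.0]
print(solve_subproblem(Q, p, C=1.0))
```

For q = 2, this subproblem admits a closed-form solution, which is one reason a small working set can be faster in practice: each subproblem is solved exactly and cheaply.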