Got Data?
In anticipation of the algorithms presented in a course I’m taking this semester, I toyed around yesterday and created the diskvector. It is a simple piece of code that only took two hours to write. It is meant for when you need a vector of, say, a gibibyte of data. If you only have 128 MiB of memory, this will clearly not suffice. Your operating system will then page the data out to disk, but you might find it hard to control this behaviour, and random access will nevertheless cripple your computer. The diskvector is initialized with the total size of the vector and the number of elements (B) you want to keep in memory. It overloads the [ ] operator, always keeps B consecutive elements in memory, and only reloads from disk when you ask for an element that is not currently in memory. This can be used, for instance, to construct a fast searching algorithm for large vectors. You could also modify the standard merge sort to work in blocks of size B.
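To make the idea concrete, here is a minimal sketch of how such a disk vector might look in C++. It is not the original diskvector code: the class name DiskVector, the scratch file path, the loads() counter, and the choice to align the in-memory window to multiples of B are all assumptions made for illustration. It also assumes the elements can be written to disk as raw bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical sketch of a disk-backed vector (not the original diskvector).
template <typename T>
class DiskVector {
    static_assert(std::is_trivially_copyable<T>::value,
                  "elements are written to disk as raw bytes");

public:
    // n: total number of elements, b: number of elements kept in memory (B).
    DiskVector(const std::string& path, std::size_t n, std::size_t b)
        : path_(path), size_(n), block_(b), buffer_(b), start_(0), loads_(0) {
        assert(n > 0 && b > 0);
        {
            // Create the backing file, filled with value-initialized elements.
            std::ofstream out(path_, std::ios::binary | std::ios::trunc);
            for (std::size_t done = 0; done < size_; done += block_) {
                out.write(bytes(), byte_count(done));
            }
        }
        file_.open(path_, std::ios::in | std::ios::out | std::ios::binary);
        load(0);
    }

    // Assumption: the file is scratch space, so it is deleted on destruction.
    ~DiskVector() {
        flush();
        file_.close();
        std::remove(path_.c_str());
    }

    DiskVector(const DiskVector&) = delete;
    DiskVector& operator=(const DiskVector&) = delete;

    std::size_t size() const { return size_; }
    std::size_t loads() const { return loads_; }  // number of block reads so far

    // Returns a reference into the in-memory block, reloading on a miss.
    T& operator[](std::size_t i) {
        assert(i < size_);
        if (i < start_ || i >= start_ + block_) {
            flush();               // write the current block back first
            load(i - i % block_);  // then read the aligned block containing i
        }
        return buffer_[i - start_];
    }

private:
    char* bytes() { return reinterpret_cast<char*>(buffer_.data()); }

    // Size in bytes of the block starting at `start` (the last one may be short).
    std::streamsize byte_count(std::size_t start) const {
        return static_cast<std::streamsize>(std::min(block_, size_ - start) * sizeof(T));
    }

    void load(std::size_t start) {
        file_.clear();
        file_.seekg(static_cast<std::streamoff>(start * sizeof(T)), std::ios::beg);
        file_.read(bytes(), byte_count(start));
        start_ = start;
        ++loads_;
    }

    // The block is always written back, since operator[] cannot tell
    // whether the caller modified the element it handed out.
    void flush() {
        file_.clear();
        file_.seekp(static_cast<std::streamoff>(start_ * sizeof(T)), std::ios::beg);
        file_.write(bytes(), byte_count(start_));
        file_.flush();
    }

    std::string path_;
    std::size_t size_;
    std::size_t block_;
    std::vector<T> buffer_;
    std::size_t start_;  // index of the first element currently in memory
    std::size_t loads_;
    std::fstream file_;
};
```

With this sketch, sequential access over n elements costs roughly n/B block reads, while jumping back and forth between distant indices triggers a reload on almost every access. That is why an algorithm such as a blocked merge sort, which touches one block at a time, suits this structure. Note that a reference returned by operator[] is only valid until the next access that causes a reload.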