Performance measurements of a clustered, high-end storage system.
Previous research has produced a working prototype of a clustered
storage system that is (or will be) able to scale to large amounts of storage (on
the order of hundreds of disks and TBytes of capacity). The purpose of this
project is to perform the initial evaluation of a subset of such a system with
micro-benchmarks and realistic applications. The work will mostly involve
understanding performance issues in systems software (the storage I/O stack in
the Linux kernel) on a real system.
Using the resources of a graphics adapter for high-performance computing.
Graphics cards are very good at certain types of computation. What types of
more "generic" operations can be performed efficiently using a graphics
card? This project will examine certain types of applications and/or operations
(e.g. encryption/decryption operations) and how they can be performed on graphics
cards more efficiently than on general-purpose CPUs.
Performance evaluation of Linux I/O schedulers.
Linux kernel 2.6 comes with four different flavors of disk I/O scheduler:
no-op (traditional 'elevator' from the 2.4 kernel), anticipatory, deadline-based,
and completely fair queueing (CFQ). Which one is the best, and for which workloads?
Also, how does the choice of filesystem interact with the scheduler?
(Remember that a filesystem changes the disk access pattern generated by the
application, because the filesystem itself generates metadata accesses.)
How do ext3, ReiserFS, and XFS interact with the schedulers? How about parallel
filesystems?
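
On 2.6 kernels the schedulers available for a block device, and the one
currently active, can be read from /sys/block/<dev>/queue/scheduler; the
active one is shown in square brackets. A minimal sketch of how a benchmark
harness might record this before each run follows. The function names and
the default device name "sda" are assumptions made for illustration.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a sysfs scheduler line into (available, active).

    The kernel lists every available scheduler and marks the active one
    with square brackets, e.g. "noop anticipatory deadline [cfq]".
    """
    available = []
    active = None
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            active = token
        available.append(token)
    return available, active


def read_scheduler(device="sda"):
    """Read the scheduler line for one block device (device name is an assumption)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

Switching schedulers is done by writing one of the listed names to the same
file as root, so an evaluation could loop over the available names and repeat
each workload once per scheduler and filesystem.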
Tune and examine the overheads of an existing shared virtual memory system
on top of a 10 Gbit/s interconnect.
Shared virtual memory is a technique that allows a cluster of systems that do
not share memory to behave as a single multiprocessor system that can
transparently execute multi-threaded applications. Such systems usually rely on
high-speed interconnects to achieve good performance. The goal of this work is to port
an existing, heavily optimized shared virtual memory system that runs on top
of 1 GBit/s Ethernet interfaces to 10 GBit/s network interfaces (Ethernet or
otherwise).
Porting and tuning of a kernel-level remote storage access protocol over
10 GBit/s network interfaces.
Previous work has resulted in a kernel-level protocol that performs
remote storage I/O on top of high-speed interconnects. This protocol
is currently implemented on top of custom-designed network cards. The
purpose of this work is to port this protocol on top of commercially
available, 10 GBit/s network cards that have a programmable
processor. The work involves porting of user- and kernel-level code,
as well as the potential for executing part of the code in the network
interface itself to further improve system performance.
Examine the implications of removing interrupts from the receive path of
the disk and network I/O stacks in the kernel.
In high-speed I/O (network or storage) a major performance problem
(especially for speeds greater than 10 GBit/s) is the cost of
interrupts. The goal of this work is to examine how systems software
may be restructured on systems that have multiple CPUs to replace
interrupts with polling (or hybrid) techniques to eliminate the cost
associated with interrupt processing.
Create a repository of existing applications for multi-core processors.
A main trend in processor architecture is the design and
implementation of multi-core CPUs that share as few hardware structures
as possible (for achieving good scalability in future
designs). Evaluating a design in this area requires running actual
applications. The purpose of this project is to collect a set of
applications that may be used for performance analysis, port them to a
multi-core CPU, and provide a means for running them off-line in an
automated manner.
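
As a rough illustration of what "running them off-line in an automated
manner" could look like, the sketch below runs a table of commands several
times each and records wall-clock times. It is only a starting point under
stated assumptions: the function name, the command table, and the placeholder
workload are hypothetical, and real entries would name the ported applications.

```python
import subprocess
import sys
import time

# Placeholder workload (an assumption): a Python process that exits at once.
# Real entries would map an application name to its command line.
EXAMPLE_COMMANDS = {"noop": [sys.executable, "-c", "pass"]}


def run_batch(commands, repeats=3):
    """Run each command `repeats` times and return wall-clock times in seconds.

    `commands` maps a label to an argument list; a failing command raises
    subprocess.CalledProcessError and stops the batch.
    """
    results = {}
    for label, argv in commands.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
            timings.append(time.perf_counter() - start)
        results[label] = timings
    return results
```

Calling run_batch(EXAMPLE_COMMANDS) returns a dictionary with one list of
timings per label, which can then be written to a file for later analysis.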
Improving the robustness of the communication protocol in a real
sensor network.
Recently, it has become possible to build sensors that, besides sensing,
also have processing and communication capabilities. Such
systems are usually equipped with a small CPU, little memory, and a
short-range, low-speed wireless interface. Previous work in this area
has resulted in an operating system that allows programmers to
write and execute programs on a network of such devices. In this
system, the communication protocol is one of the most significant
components, as it is responsible for all interactions of each sensor
with the rest of the world. The goal of this work is to examine in
more detail the characteristics of the existing communication protocol
and the factors that affect its robustness, and to propose mechanisms
that will improve communication efficiency in a real (noisy)
environment.
Development of a block-device access tracer for the Linux kernel.
An interesting problem in analyzing the performance of storage
systems is the ability to collect traces of block access patterns
during the execution of certain I/O-intensive workloads (e.g. TPC-C
using MySQL, on top of ReiserFS). This project involves developing a
tracer kernel module that "appears" to be a block device, but
actually relays read/write requests to another block device (e.g.
bindings like /dev/tracer0 --> /dev/sda). The tracer module uses a
dedicated disk, or partition, to store (in 'raw' binary format)
trace records. The tracer module should be controlled and monitored
via ioctl() and/or via /proc entries (e.g. a signal to start/stop
tracing, cumulative counts of read/write accesses). To retrieve the
records for later processing, we could use a user-space tool.
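
The user-space tool would mainly need to decode the raw binary records. The
sketch below shows one way this might be done. The record layout is purely
hypothetical (the actual format is left to the project), and the names
RECORD, TraceRecord, decode_records and count_ops are made up for
illustration.

```python
import struct
from collections import namedtuple

# Hypothetical record layout (an assumption, not a defined tracer format):
#   u64 timestamp in nanoseconds, u64 start sector,
#   u32 length in sectors, u32 operation (0 = read, 1 = write).
RECORD = struct.Struct("<QQII")
TraceRecord = namedtuple("TraceRecord", "timestamp_ns sector nr_sectors op")


def decode_records(raw):
    """Yield TraceRecord tuples from a buffer of back-to-back raw records."""
    usable = len(raw) - len(raw) % RECORD.size  # ignore a trailing partial record
    for offset in range(0, usable, RECORD.size):
        yield TraceRecord(*RECORD.unpack_from(raw, offset))


def count_ops(records):
    """Return (reads, writes) for an iterable of decoded records."""
    reads = writes = 0
    for rec in records:
        if rec.op == 0:
            reads += 1
        else:
            writes += 1
    return reads, writes
```

Fixed-size records keep the kernel side simple, since the module can append
them to the trace partition without any framing, and they let the tool seek
directly to the n-th record.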
Prototype of a content-addressable storage system.
Design and implement a "virtual", content-addressable block device. Block
devices traditionally allow read/write accesses based on block addresses.
In a content-addressable device, however, the write operation returns a (content)
"key" or "tag" to the user, and the user retrieves blocks by passing this
key to the read operation. Such devices therefore do not store duplicate blocks
and provide strong support for archival purposes, but they may be hard to
use when updates to existing information are required. This project will build
such a device and will also provide a simple file system that allows users
to store and retrieve regular files.
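
The write-returns-key, read-by-key interface can be sketched in a few lines
of user-space code. This is only an in-memory illustration under stated
assumptions: the class name, the use of SHA-256 as the content key, and the
4096-byte default block size are choices made here, not part of the project
description.

```python
import hashlib


class ContentAddressableStore:
    """In-memory stand-in for a content-addressable block device."""

    def __init__(self, block_size=4096):
        self.block_size = block_size  # block size is an assumption
        self._blocks = {}

    def write(self, data):
        """Store one block and return its content key."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        key = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so duplicates are stored once.
        self._blocks.setdefault(key, bytes(data))
        return key

    def read(self, key):
        """Return the block previously stored under key."""
        return self._blocks[key]

    def __len__(self):
        return len(self._blocks)
```

Because the key is derived from the content, changing a block yields a new
key rather than an in-place update. A simple file system on top would
therefore need to keep, for each file, the list of keys of its blocks.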
Install, evaluate, and tune an existing distributed file system (GFS) on
top of a mid-size clustered storage system.
Building large-scale storage systems today usually requires using a distributed
file system. Although this approach introduces very high overheads and scalability
limitations, it is the only realistic approach at this point. Current research
aims at addressing these limitations. The goal of this work is to examine the
overheads associated with using distributed file systems. This will be done by
examining the performance of an actual distributed file system (most probably
GFS) on a real system with ~100 disks using micro-benchmarks and real applications.
Design and evaluation of storage compression and duplicate elimination techniques
at the block-level.
With increasing needs for storage, saving space becomes an increasingly important
problem for many applications. Doing so transparently and without application
or file system modifications may result in reducing the cost significantly in
many application domains. This project will examine techniques for eliminating
duplicate blocks as well as compressing them on top of a block-level storage
system that supports only fixed block sizes. The work will occur mostly in the
Linux kernel and will use a custom framework (developed locally) that facilitates
kernel-level development of storage modules.
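
To make the two techniques concrete, here is a user-space sketch that
combines duplicate elimination and compression for fixed-size blocks. It is
not the kernel framework mentioned above. The class name, the 4096-byte
block size, and the use of SHA-256 and zlib are all assumptions made for
illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # fixed block size; the value is an assumption


class DedupCompressLayer:
    """Maps logical block numbers to deduplicated, compressed payloads."""

    def __init__(self):
        self._lba_to_digest = {}   # logical block number -> content digest
        self._payloads = {}        # content digest -> compressed bytes

    def write(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("only whole fixed-size blocks are supported")
        digest = hashlib.sha256(data).digest()
        if digest not in self._payloads:
            # First time this content is seen: compress and keep one copy.
            self._payloads[digest] = zlib.compress(data)
        self._lba_to_digest[lba] = digest

    def read(self, lba):
        return zlib.decompress(self._payloads[self._lba_to_digest[lba]])

    def stored_bytes(self):
        """Bytes actually kept after deduplication and compression."""
        return sum(len(p) for p in self._payloads.values())
```

The sketch leaves out what a real block-level module would have to solve:
compressed payloads have variable sizes and must be packed into fixed-size
blocks, and payloads that are no longer referenced after an overwrite need
to be reclaimed.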