High Performance Computing

The High-Performance Computing (HPC) Cluster is available to researchers, faculty, students, and staff in need of computing power. Large-scale computing in the sciences, engineering, and bioinformatics is supported. The Cluster has ~7752 CPU compute cores, and additional capacity is added yearly. This central service is offered at no cost to clients.

 

The HPC Cluster delivers:

  • 35,845,900 CPU hours
  • 59,427,800 GPU hours

of free compute time per year to the user community!

Getting Started for New Users

Tufts High Performance Computing (HPC) comprises the Tufts Linux Research Cluster and the host of advanced mathematical and scientific research applications installed on it.

Access to the Tufts Research Cluster resource begins with the account registration form. Click on the 'Request a Cluster Account or Cluster Storage' button (above the Tutorials section of this page).

  1. Log in with your Tufts Username and Tufts Password. Note: guests and students require faculty or researcher sponsorship.
  2. Fill out the form, correcting any pre-filled information that is incorrect.
  3. Remove the example text in the Usage Information box and briefly describe your planned use of the Cluster.
  4. In the Type of Account field, select the Cluster account.
  5. When finished, click Submit Request.

The Research and Geospatial Technology Services group supports scientific, mathematical, and biological research at Tufts by providing access to the high-performance computing power of the Tufts Linux Research Cluster. These servers and their extensive list of installed software, editors, and compilers combine to create a high-performance computing environment worthy of Tufts' research, science, and medical communities.

Features include:

  • Online forms available for Cluster accounts
  • Network-accessible Research storage is available to support data collection
  • Tufts-sponsored Cluster guest accounts support collaborative research

Please log into the cluster using your Tufts UTLN and password, which are the same credentials you use to log into email, Canvas, SIS, and many other services. If you do not have your password, then you can set it by going to http://tuftstools.tufts.edu and selecting “I forgot or don’t have my Tufts password.”


If you have questions regarding the Tufts HPC cluster usage, please let us know.

Please find more information for new Tufts HPC users at: https://tufts.box.com/v/HPC-New-User

Note: If you submitted an additional request to access a research group folder or QOS on the cluster, we have generated a child ticket for that request and assigned it to the TTS Research Technology group.

Access

Ways to access Tufts HPC cluster:

Outside the Tufts network? If so, the Tufts VPN is required before you can use the Tufts High Performance Compute (HPC) Cluster and the OnDemand web interface. Need the Tufts VPN? Download and install Cisco AnyConnect. Once it is installed, enter the server address (vpn.tufts.edu/duo) and press 'Connect.' Enter your Tufts credentials and choose a Duo authentication method for the second password field.

The OnDemand web interface is an operating-system-independent, web-based entryway to the Tufts HPC cluster. It offers easy access to popular resources on the cluster, works well with Chrome and Firefox, and is a good fit for new Linux users.

Please check out the Tufts HPC cluster “OnDemand” web interface.

NOTE: You must be connected to the Tufts VPN in order to access the web interface login page.

  • “Files” Menu: You can access your home directory (/cluster/home/Tufts_Account) and cluster research project storage space (/cluster/tufts). You can download/upload/view/edit files from the built-in file browser with code highlighting.
  • “Clusters” Menu: You can start a web-based command line terminal on the cluster using “Tufts HPC Shell Access”, or a command line terminal with FastX11 for graphical user interface applications (RStudio, Matlab, STATA, COMSOL, etc.) using “Tufts HPC FastX11 Shell Access”.
  • “Interactive Apps” Menu: Simply point, select, and click to launch available applications (Jupyter, Jupyter Lab, RStudio, ABAQUS, COMSOL, MAPLE, MATHEMATICA, MATLAB, X11 Terminal) on the Tufts HPC cluster directly from your browser!
  • “Misc” Menu: You can find out the Tufts HPC cluster “Inventory” and “Scheduler Info”. You can also check your home directory and group cluster research project storage space disk usage with “Quota Report”.

PuTTY is a free and open-source terminal emulator, serial console, and network file transfer application. It supports several network protocols, including SCP, SSH, Telnet, rlogin, and raw socket connection. Configure PuTTY to use the SSH protocol and point the connection to login.cluster.tufts.edu.

 

For more information, check out the PuTTY software site.

SecureCRT for Windows, Mac, and Linux provides rock-solid terminal emulation for computing professionals, raising productivity with advanced session management and a host of ways to save time and streamline repetitive tasks. SecureCRT provides secure remote access, file transfer, and data tunneling. Configure SecureCRT to use SSH protocol and point the connection to login.cluster.tufts.edu.

 

For more information, check out the SecureCRT software site.

File Management

Transfer your data to/from Tufts HPC cluster:

File Management Options

To transfer files between the Cluster and your local computer, be sure to use the SCP or SFTP protocol with the hostname “xfer.cluster.tufts.edu”. Options for file management:
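One such option, for users comfortable with a command line, is the standard OpenSSH scp and sftp clients pointed at the transfer hostname given above. The sketch below only assembles and prints the commands so they can be checked before being run. The username your_utln, the file results.csv, and the project directory under /cluster/tufts are placeholders chosen for illustration, not real accounts or paths.

```shell
#!/bin/bash
# Placeholders (assumptions): replace with your own UTLN, file, and project folder.
utln="your_utln"
xfer_host="xfer.cluster.tufts.edu"          # transfer hostname from the text above
local_file="results.csv"
remote_dir="/cluster/tufts/your_project"    # hypothetical project directory

# Upload a local file to the cluster with scp.
upload_cmd="scp ${local_file} ${utln}@${xfer_host}:${remote_dir}/"

# Download the same file back into the current directory.
download_cmd="scp ${utln}@${xfer_host}:${remote_dir}/${local_file} ."

# Open an interactive SFTP session for browsing and transferring files.
sftp_cmd="sftp ${utln}@${xfer_host}"

# Print the commands; copy one into your terminal to actually run it.
printf '%s\n' "$upload_cmd" "$download_cmd" "$sftp_cmd"
```

Printing first is a deliberate choice here: it makes it easy to confirm the hostname and destination path before any data moves.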

Compute

Get your work done on Tufts HPC cluster:

SLURM

What is SLURM?

Slurm is an open-source workload manager designed for Linux clusters of all sizes. It provides three key functions:

1) It allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.

2) It provides a framework for starting, executing, and monitoring work on a set of allocated nodes.

3) It arbitrates contention for resources by managing an internal queue of pending work. This is done with Slurm's fair-share algorithm framework.

Tufts HPC Cluster GPU Partition Updates

In keeping with other institutions and best practices, this update improves how GPUs are managed on the Cluster. Additionally, these changes set up the framework for future growth on both the system-wide GPU capacity as well as contributed capacity. Further information will be provided as those plans move forward.

If you need to use GPUs on the HPC Cluster, please follow the instructions below. Please submit only jobs that specifically use GPUs to these nodes. They are not for general-purpose, CPU-bound computations.

If you are interested in using an NVIDIA Tesla K20Xm on the cluster, please add “--gres=gpu:k20xm:1” to your srun or sbatch command or script.

 

For srun command:

$ srun -p gpu --gres=gpu:k20xm:1 -n 1 --mem=2g --time=1:30:00 --pty bash 

The above command requests “1 NVIDIA Tesla K20Xm with 1 CPU core, 2GB of RAM, and 1 hr 30 mins of time.”

 

For sbatch submission script, please add:

#SBATCH --gres=gpu:k20xm:1

If you are interested in using an NVIDIA Tesla P100 on the cluster, please add “--gres=gpu:p100:1” to your srun or sbatch command or script. Change the final number to request more than one GPU, as in the example below.

 

For srun command:

$ srun -p gpu --gres=gpu:p100:2 -n 1 --mem=2g --time=1:30:00 --pty bash 

The above command requests “2 NVIDIA Tesla P100 with 1 CPU core, 2GB of RAM, and 1 hr 30 mins of time.”

 

For sbatch submission script, please add:

#SBATCH --gres=gpu:p100:2
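Put together, a complete submission script might look like the sketch below. It reuses the partition, memory, and time options from the srun examples above but requests a single P100. The body of the script is an illustrative assumption: it relies on Slurm's usual behavior of exporting CUDA_VISIBLE_DEVICES to jobs that are allocated GPUs through --gres, and it calls nvidia-smi only if that tool is present on the node.

```shell
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres=gpu:p100:1
#SBATCH -n 1
#SBATCH --mem=2g
#SBATCH --time=1:30:00

# Slurm normally sets CUDA_VISIBLE_DEVICES for jobs that request GPUs via --gres.
# Outside a GPU allocation the variable is unset, so fall back to "none".
gpu_ids="${CUDA_VISIBLE_DEVICES:-none}"
echo "GPU devices visible to this job: ${gpu_ids}"

# Show the GPU hardware if the NVIDIA tool is installed on this node.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi
else
  echo "nvidia-smi not found; this does not look like a GPU node"
fi
```

Saved under a name such as gpu_job.sh (a hypothetical filename), the script would be submitted with sbatch gpu_job.sh. A report of "none" is a quick sign that the job did not receive a GPU.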

NOTE:

1. All commands are in lower case.

2. There is one NVIDIA Tesla K20Xm available on each of the alpha025 and omega025 nodes.

3. More than one NVIDIA Tesla P100 is available per node on the pgpu nodes.

 

Please contact us if you have any questions!

Concepts Overview

Partitions vs. queues

Slurm does not have queues that differentiate service and job types. Instead it has the concept of a partition.

The new cluster, login.cluster.tufts.edu, has five partitions: batch, interactive, gpu, largemem and mpi.  When a compute job is submitted with Slurm, it must be placed on a partition. The batch partition is the default partition. Each partition has default settings of various kinds, such as cores, memory,  duration, priority, threads, etc. 

Explicit Slurm requests are required to alter defaults. For example, the batch partition has a default allocation of one CPU core and a maximum of 2 GB of RAM. This could be used for a serial job or shell script. Partition settings may be viewed using the sinfo command.
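As a sketch of such an explicit request, the job script below asks the batch partition for more than its defaults. The specific values (4 tasks, 8 GB of memory, 2 hours) are illustrative assumptions, not recommended settings, and whether the partition grants them depends on its configured limits. SLURM_NTASKS is the environment variable Slurm sets inside a job when a task count is requested; outside Slurm the script falls back to 1.

```shell
#!/bin/bash
#SBATCH -p batch            # partition (batch is also the default)
#SBATCH -n 4                # override the one-core default
#SBATCH --mem=8g            # override the default memory allocation
#SBATCH --time=0-02:00:00   # 2 hours, in days-hours:minutes:seconds form

# Report what the job actually received.
task_count="${SLURM_NTASKS:-1}"
echo "Running on $(uname -n) with ${task_count} task(s)"
```

Because the #SBATCH lines are ordinary shell comments, the same file can be run directly with bash for a quick syntax check before it is submitted with sbatch.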

 

salloc vs. bsub options

Another concept is that of resource allocation. The salloc command allows one to request resources and a partition. salloc is available to fine tune resources for one or more submitted jobs. Cavalier use of salloc will likely waste resources unless it is well thought-out. Try to assess your needs carefully. It is likely that the larger the requested allocation, the longer the wait for the resources to be granted.

Associated with an allocation made via the salloc command is the notion of job steps. These steps are nested within a granted allocation. An allocation of resources is relinquished when you are finished with your job steps (by typing exit when used interactively), when the allocation times out, when its tasks are exhausted, or when exit is executed in a script. The sbatch and srun commands can carry all the typical resource requests for most jobs.

NOTE: For most simple and straightforward Slurm needs, users will not need to use salloc.

 

Slurm QoS vs. queues

To provide quality-of-service distinctions through mechanisms such as relative job priority, hardware separation, fairshare policy rules, and preemption, and to support the needs of faculty who have contributed hardware, Slurm has the concepts of a bank account and a QoS. These are optionally applied to a partition and to a job submission. Most cluster users are not affected and can ignore them; their jobs run under the normal QoS, and nothing special needs to be done. The normal QoS is specified as an option to the srun and sbatch commands as follows:

--account=normal   --qos=normal

Those users that are affected and require access as defined by previous LSF queues may use the following as appropriate:

--account=yushan   --qos=yushan

--account=abriola   --qos=abriola

--account=napier   --qos=napier

--account=miller   --qos=miller

--account=cowen   --qos=cowen

--account=khardon   --qos=khardon

--account=cbi  --qos=cbi

--account=atlas  --qos=atlas

--account=qu  --qos=qu

NOTE: These account and QoS pairs have elevated Slurm placement privileges.

 

Dregs

An additional special purpose QoS, dregs, is provided to address low-priority, long-running jobs that can withstand pre-emption and cancellation as a possible outcome. This is only suitable for a special class of jobs.

--account=dregs  --qos=dregs

Dregs is configured to have the lowest job priority among all Slurm qos/account options. Jobs submitted with a dregs specification may be preempted by other jobs during periods of heavy cluster use. If Slurm decides to preempt dregs job(s), the job(s) are re-queued for submission and subject to available cluster resources. In effect, the job starts over!

Not all workloads are appropriate for this qos treatment. Single core/low memory jobs are more likely to be suitable for dregs access and successful use.
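Because a preempted dregs job starts over, one common way to limit lost work is for the job itself to record its progress and skip completed steps when it runs again. The sketch below illustrates that idea only; it is not a site-provided mechanism. The function name run_steps, the step count, and the checkpoint file location are all hypothetical, and the "work" is a placeholder comment.

```shell
#!/bin/bash
#SBATCH --account=dregs
#SBATCH --qos=dregs
#SBATCH -p batch
#SBATCH --time=1-00:00:00

# run_steps CHECKPOINT_FILE TOTAL_STEPS
# Resumes from the step count stored in CHECKPOINT_FILE, if that file exists.
run_steps() {
  ckpt_file=$1
  total_steps=$2
  step=0
  if [ -f "$ckpt_file" ]; then
    step=$(cat "$ckpt_file")
  fi
  echo "starting at step ${step}"
  while [ "$step" -lt "$total_steps" ]; do
    # ... one unit of real work would go here ...
    step=$((step + 1))
    echo "$step" > "$ckpt_file"   # record progress after every completed step
  done
  echo "finished at step ${step}"
}

# Demo path only, removed afterwards so the sketch is harmless to try.
# A real job should keep its checkpoint on shared storage (for example under
# your home directory) so that a requeued job can find it on any node.
demo_ckpt="${TMPDIR:-/tmp}/dregs_demo_progress.$$"
run_steps "$demo_ckpt" 5
rm -f "$demo_ckpt"
```

If the job is preempted after step 3 and requeued, the next run reads 3 from the checkpoint file and performs only the remaining steps.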

 

Partition Runtime and Job Durations

Partition runtime duration limit:

Partition      Duration
gpu            3 days
largemem       5 days
interactive    4 hours
batch          3 days
mpi            7 days
m4             7 days

Job duration:

Use of Slurm requires a job duration estimate. A default job time limit of 15 minutes has been in place since May 26, 2016. Do not confuse this setting with the partition maximum duration limit, which already exists. Every job submitted to any partition (except interactive) needs a time estimate of its duration. For example, if your job is expected to run for 1 day and 2 hours, specify that in the sbatch or srun options:

> sbatch .... -p batch --time=1-2:00:00 .....

NOTE: Failure to include the --time specification will result in the job's termination after the default time limit of 15 minutes (plus some overhead time). If you think your job may be a long-duration job, set the --time option to something close to the partition maximum time and note the resulting duration. Subsequent job submissions may then be specified with better accuracy. The exact time is not needed! Unless there is a good reason to use these, we suggest using the gpu partition.
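The --time value in the example above uses Slurm's days-hours:minutes:seconds form. As a small worked example, the helper below converts an estimate in minutes into that form. The name to_slurm_time is invented here for illustration and is not a Slurm command.

```shell
#!/bin/bash
# to_slurm_time MINUTES
# Prints MINUTES as D-HH:MM:SS, one of the time formats accepted by --time.
to_slurm_time() {
  total_minutes=$1
  days=$(( total_minutes / 1440 ))            # 1440 minutes per day
  hours=$(( (total_minutes % 1440) / 60 ))    # whole hours left after full days
  minutes=$(( total_minutes % 60 ))           # minutes left after full hours
  printf '%d-%02d:%02d:00\n' "$days" "$hours" "$minutes"
}

# 1 day and 2 hours = 26 hours = 1560 minutes
to_slurm_time 1560    # prints 1-02:00:00
```

The result can be passed straight to a submission, for example sbatch -p batch --time="$(to_slurm_time 1560)" job.sh, where job.sh stands for your own script.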

Command Equivalents

The following table is for users who have a prior background in LSF and need appropriate "translations" in Slurm: 

Task                                  Slurm                    LSF
controlling jobs                      scontrol                 bstop
copy files to allocated local disk    sbcast                   lsrcp
obtain resources                      salloc                   bsub + specific queue + bsub options
interactive session or application    srun                     bsub -Ip -q int_public6 xxxxxx
other submissions                     srun                     lsrun, lsplace, bsub
kill job                              scancel                  bkill
node listing                          sinfo                    bhosts, lshosts
submit job                            sbatch                   bsub
usage accounting                      sreport, sacct, sstat    bacct
view current running job(s)           squeue                   bjobs, qstat
what queues/partitions                squeue                   bqueues

[Comparison matrix of Slurm, LSF, and other popular schedulers]

Research Cluster FAQs

Cluster computing is the result of connecting many local computers (nodes) together via a high speed connection to provide a single shared resource. Its distributed processing system allows complex computations to run in parallel as the tasks are shared among the individual processors and memory. Applications that are capable of utilizing cluster systems break down the large computational tasks into smaller components that can run in serial or parallel across the cluster systems, enabling a dramatic improvement in the time required to process large problems and complex tasks. Faculty, research staff, and students use this resource in support of a variety of research projects.

  • No user root access
  • No user ability to reboot node(s)
  • No user machine room access to Cluster hardware
  • No alternative Linux kernels other than the current Red Hat version
  • No user cron or at access
  • No user servers/daemons such as HTTP (Apache), FTP, etc.
  • Only Tufts Technology Services parallel file system storage is supported
  • No user-contributed direct-connect storage such as USB memory or external disks. Contact us if you'd like to discuss expansion of our supported storage.
  • Only limited outgoing Internet access from the headnode will be allowed; exceptions must be reviewed
  • Allow approximately two weeks' turnaround for software requests
  • Whenever possible, commercial software is limited to the two most recent versions
  • Home directory and parallel file system have snapshots but are not backed up
  • Temporary public storage file systems have no quota and are subject to automated file deletions
  • Cluster does not export file systems to user desktops
  • Cluster does not support Virtual Machine instances

The Tufts University Linux Research Cluster is comprised of Cisco, IBM and Penguin hardware running the Red Hat Enterprise Linux 6.9 operating system.

Fast Facts:

  • Nodes are interconnected via 10Gig and 100Gig networks, with projects in the works to support 100Gig InfiniBand.
  • Memory configurations include 32GB, 128GB, 256GB, 384GB, 512GB, and 1TB, and each node has 16, 20, 40, or 72 cores.
  • There are 5 GPU nodes with 12 Nvidia cards including Tesla K20Xm, P100 and soon V100. 
  • The system is managed via the SLURM scheduler with a total of 7636 cores and 32TB of memory with 41216 GPU cores.

In addition to the research cluster, this resource is supported by Tufts research networked storage infrastructure. 

Fast Facts:

  • Storage consists of a dedicated 600TB parallel file system (GPFS) along with 600TB of object storage (DDN WOS) for archival purposes.
  • Dedicated login, management, file transfer, compute, storage, and virtualization nodes are available on the cluster, all connected via a dedicated network infrastructure. Users access the system locally and remotely through SSH clients as well as a number of scientific gateways and portals, which serve not only experienced users but also emerging interest across all domains.
  • The system was also one of the first to support Singularity, the emerging container standard for high performance computing, which has proven popular among users of machine learning and deep learning software stacks.
  • Web-based access is provided via the OnDemand (OOD) web portal software, which Tufts has helped port, test, and deploy along with other HPC centers.

YES.

Tufts University maintains a distributed information technology environment, with central as well as local aspects of overall planning and control. Tufts' information security program is structured in a similar manner. Operationally, Tufts central IT organization (TTS) and each local IT group maintain standards of quality and professionalism regarding operational processes and procedures that enable effective operational security. For TTS managed systems, the emphasis is on centralized resources such as administration and finance, telecommunications, research computing and networking, systems and operations as well as directory, email, LDAP, calendaring, storage and Windows domain services. TTS also provides data center services and backups for all of these systems. Additionally, a large number of management systems (for patching), anti-virus and firewall services are centrally provided and/or managed by TTS. Within TTS, processes and procedures exist for managed infrastructure changes, as change control is required for all critical central systems. Tufts University provides anti-virus software for computers owned by the University, and makes anti-virus software available at no charge for users who employ personally owned computers in the course of their duties at the University.

Tufts Research Storage services are based on a Network Appliance (NetApp) storage infrastructure located in the Tufts Administration Building (TAB) machine room. Provisioned storage is NFS (Network File System) mounted on the Research Computing Cluster for project access. NFS exports are not exported outside of TTS-managed systems. The Tufts Research Computing Cluster is also located in TAB's machine room. Network-based storage is connected to the cluster via a private (non-public) network connection.

Access to the Tufts IP network itself is controlled via MAC address authentication which is performed via the Tufts login credentials and tracked in the TUNIS Cardinal system; this system uses an 8 character password scheme. A switched versus broadcast hub network architecture is in place limiting traffic to just the specific ports in use to transport data from source to destination. Access to Tufts LAN network resources is controlled via Active Directory where applicable or LDAP, which requires the user to authenticate each time a system joins the domain. All of these controls are identically implemented on the wired as well as wireless Tufts networks.

Both Research Storage and Linux-based Cluster Compute server operating systems are kept current via sound patch management procedures. For example, PCs owned and managed by Tufts are automatically patched via the Windows Server Update Service. All other computing platforms are required to be on a similar automated patching schedule. From an operational standpoint, most central and local systems are maintained and managed using encrypted communications channels. For UNIX/Linux servers, SSH is utilized; on Windows, Microsoft Terminal Services is utilized. User access to cluster services is via SSH and LDAP. No direct user login access to central Research Storage services is possible.

Please reference this resource as: Tufts High-Performance Computing Research Cluster

Cluster Research Use Cases

Faculty, Research Staff and students use this resource in support of a variety of research projects. To understand how the cluster supports research at Tufts, see the following user comments, which show a wide range of applications. If you wish to contribute a short description of your cluster usage, please contact lionel.zupan@tufts.edu.

As a part of Kyle Monahan's MS research in the Civil and Environmental Engineering (CEE) department, he reproduced an agent-based model of water sanitation and hygiene in South Africa. He used the Cluster in order to model diarrheal disease in children within over 400 households, over the course of 730 days. The model was iterated 100 times for a total of seven experiments using the program Netlogo, and would have been unlikely to complete without the computational resources of the cluster, as 240 CPU cores and 64GB of RAM were used to create over 6 TB of model data, which was further processed in Stata and R for a total of 720,000 CPU hours. This research was enabled by a great research team in CEE and the cluster resources in Research Technology.

Hui Yang is a postdoctoral fellow in the Department of Mechanical Engineering at Tufts University, working with Dean Qu of the School of Engineering. Their research interest lies in developing and applying theoretical and computational models to study the mechanical behaviors of nanostructured materials across different length and time scales, in combination with in-situ experimental characterizations. Particular emphasis is placed on translating atomistic insights into coarse-grained continuum models to understand material behaviors and processes at the macroscopic level. The ultimate goal of their research is to guide experiments and rational designs of new materials with improved performance and reliability through predictive numerical modeling. The HPC cluster at Tufts provides a reliable computational platform that enables them to run multiple jobs across different length scales simultaneously: molecular dynamics simulation at the atomic level, phase-field simulation at the microscale, and finite element modeling at the macroscale. With the HPC cluster, they can run computationally expensive simulations on multiple CPUs, which would be impossible on their office desktop, and the efficiency of their simulation work is greatly improved. In addition, with the various software and compilers installed on the Cluster, they can also run in-house codes and user subroutines (e.g., in C or Fortran) to meet research requirements. Finally, the service and support from the HPC center have been very helpful for software installation and debugging.

Eliyar Asgarieh's Civil Engineering Ph.D. research identified robust models for real-world civil structures and relied heavily upon optimization and deterministic/stochastic simulations. In the initial stages of their Ph.D. studies, they needed to apply various optimization methods using MATLAB toolboxes, which also required running the structural analysis software OpenSees for structural simulation. Each optimization could take more than three days, and hundreds of models needed to be designed and optimized (calibrated). The Tufts HPC Cluster helped them to run many optimizations/simulations simultaneously, which would never be possible on regular desktops. The final part of their research was on identifying probabilistic models of structures in a Bayesian framework using an advanced version of the Markov chain Monte Carlo (MCMC) method called TMCMC (transitional MCMC). Each probabilistic model identification case required submission of 1000 jobs in several consecutive steps, which would add up to approximately 20000 to 30000 jobs in each case! To be computationally feasible, the jobs needed to be run in parallel in each step. The Cluster made the impossible possible for them, and they could run millions of jobs to finish their research on probabilistic model identification.

At Tufts Veterinary School of Medicine, Giovanni Widmer and colleagues are using Illumina technology to sequence PCR amplicons obtained from the bacterial 16S rRNA gene. The analysis of millions of short sequences obtained with this method enables them to assess the taxonomic composition of bacterial populations and the impact of experimental interventions. Some of these analyses are computer-intensive, and running them on the Cluster saves time. Typically, they use Clustal Omega to align sequences. On the Cluster, a sample of a few thousand sequence reads can be aligned in a few minutes. They have also installed mothur on the Cluster (mothur.org) and are running sequence analysis programs from this collection. These programs are used to de-noise sequence data and to compute pairwise genetic distance matrices. They visualize the genetic diversity of microbial populations using Principal Coordinate Analysis, which is also computer-intensive. They have adapted this approach to analyze populations of the eukaryotic pathogen Cryptosporidium. Several Cryptosporidium species infect the gastro-intestinal tract of humans and animals. Using an approach similar to the one applied to bacterial populations, they assess the diversity of Cryptosporidium parasites infecting a host and monitor the impact of various interventions on the genetic diversity of these parasites.

Daniel Lobo is a postdoc in the Biology department and works together with Professor Michael Levin to create novel artificial intelligence methods for the automated discovery of models of development and regeneration. A major challenge in developmental and regenerative biology is the identification of models that specify the steps sufficient for creating specific complex patterns and shapes. Despite the great number of manipulative and molecular experiments described in the literature, no comprehensive, constructive model exists that explains the remarkable ability of many organisms to restore anatomical polarity and organ morphology after amputation. It is now clear that computational tools must be developed to mine this ever-increasing set of functional data to help derive predictive, mechanistic models that can explain regulation of shape and pattern. They use the Cluster for running their heuristic searches for the discovery of comprehensive models that can explain the great number of poorly-understood regenerative experiments. Their method requires the simulation of millions of tissue-level experiments, comprising the behavior of thousands of cells and their secreted signaling molecules diffusing according to intensive differential equations. Using the Cluster, they can massively parallelize the simulation of these experiments and the search for models of regeneration. Indeed, the Cluster is an indispensable tool for them to apply cutting-edge artificial intelligence to biological science.

Eric Kernfeld's work with Professor Shuchin Aeron (ECE) and Professor Misha Kilmer (Math) centers around algebraic analysis of image and video data. Even a single video or a small collection of images can require an uncomfortable amount of time to process on a laptop, especially when it comes time to compare algorithms under different regimes. The high-performance computing tools (such as Matlab Distributed Computing Toolbox) at Tufts allowed them to run tests in a manageable time frame and proceed with their projects.

Christopher Burke is a member of Professor Tim Atherton’s group in the physics department which makes heavy use of the Cluster. Their focus is soft condensed matter, i.e. complicated solids and fluids such as emulsions, colloids, and liquid crystals. Simulations allow them to understand the behavior of these complex systems, which are often difficult to study analytically. Graduate student Chris Burke is studying how particles can be packed onto curved surfaces. This is in order to understand, for example, how micron-sized polymer beads would arrange themselves on the surface of an oil droplet. He uses the Cluster to run large numbers of packing simulations and to analyze the large data sets that result. Post-doc Badel Mbanga and undergraduate Kate Voorhes study the behavior of coalescing droplets coated with liquid crystals. In particular, they are interested in the behavior of defects in the liquid crystal layer as coalescence occurs. They run computationally expensive simulations which would be impractical without the computing power available on the Cluster.

Albert Tai is the manager and primary bioinformatician of the TUCF Genomics Core, overseeing the operation of three deep sequencing instruments (Illumina HiSeq 2500, MiSeq, and Roche 454 Titanium FLX) and their associated services. As part of these services, he provides primary and secondary data analysis services and/or training associated with these analyses. Deep sequencing generates a large amount of data per run, and data analysis requires a significant amount of computing resources, both processing and analytical storage. The Cluster and its associated storage are an essential tool for him and the users of the core facility. The parallel computing capability allows them to analyze large data sets in a timely manner. It also expedites troubleshooting processes, which sometimes require them to test multiple analytical parameters on a single data set. As the amount of data generated in biological research increases, high performance computing resources have become essential. He would certainly hope to see the expansion of this crucial computing resource.

Marco Sammon recently finished their undergraduate degree in Quantitative Economics, and is continuing their work on their Senior Honors Thesis with Professor Marcelo Bianconi. Two parts of their research in mathematical finance require intense computing power: solving systems of Black-Scholes equations for implied volatility/implied risk-free rates, and fitting a SUR regression to explain factors that influence the difference between market prices and Black-Scholes prices. Before using the Cluster, it took them weeks to process just a few days worth of options data. Now, they are able to work on many days of options data simultaneously, greatly expediting the process. This is important, as it allows them to aggregate a larger time series of data, which allows for much richer analysis.

Hongtao Yu is a postdoc in the chemistry department working in Prof. Yu Shan Lin's group. Their research involves extensive Molecular Dynamics (MD) simulation of peptides and proteins. They use the MD method to study the folding thermodynamics and kinetics of glycoproteins, stapled peptides, and cyclic peptides. The free energy landscape of protein and peptide folding is believed to be rugged. It contains many free energy barriers that are much larger than thermal energies, and the protein might get trapped in many local free energy minima at room temperature. This trapping limits the capacity to sample protein configuration space effectively. In their research, they use various techniques to overcome the free energy barriers and improve the sampling, for example the Replica-Exchange Molecular Dynamics (REMD) method and the Umbrella Sampling (US) method. In a typical US simulation, the reaction coordinate(s) is broken into small windows, and independent runs have to be done for each window. For example, 36 independent runs have to be performed if they choose a dihedral as the reaction coordinate and use 10-degree windows. In a 2D US simulation, the number of independent runs increases to 36x36. Their system usually contains 1 protein molecule and thousands of water molecules; an independent run usually takes about 2.5 hours with 8 CPUs. This means that they would have to run for 135 days to finish one 2D US simulation on a single 8-core machine! With the large number of CPUs provided by the Cluster, they can finish one 2D US simulation within 2 days! The benefit of this speedup is that they have the chance to explore more systems and methods.

As part of Rebecca Batorsky's PhD research in the physics department, she studied various aspects of intra-host virus evolution. She used the Cluster to run large simulations of evolving virus populations. Her simulations typically ran in Matlab, and she was able to run more than 30 simulations in parallel using multiple compute nodes. This enabled faster collection of simulation data and allowed her to study large population sizes that would otherwise have been impossible. Furthermore, the ability to access her files on the Cluster and programs like Matlab and Mathematica from any computer was extremely useful.

Scott MacLachlan's group's research focuses on the development of mathematical and computational tools to enable large-scale computational simulations. They work on a diverse group of problems, including geophysical fluid dynamics, heterogeneous solid mechanics, and particle transport. The Tufts High-Performance Computing Research Cluster supports these activities in many ways. First, by providing significant parallel computing resources, it enables their development of mathematical algorithms and computational codes for challenging problems. For example, they have used the Cluster to study parallel scalability of simulation algorithms for the deformation of heterogeneous concretes under load, with higher-resolution models than would otherwise have been possible. Furthermore, by providing access to cutting-edge computing resources, such as the new GPU nodes, they are able to participate in the computing revolution that is currently underway, re-examining the high-performance algorithms that have become the workhorses of the MPI-based parallel paradigm, and developing new scalable techniques that are tuned for these architectures.

In collaboration with Dr. Ron Lechan, the Chief of Endocrinology at the Tufts Medical Center, Lakshmanan Iyer is applying cutting-edge, next-generation sequencing technology to determine the gene expression profile of tanycytes, a special cell of glial origin in the brain. While much is known about the anatomy of these cells, their physiologic functions remain speculative and enigmatic. The results of these studies would provide clues towards their function. This analysis requires a considerable amount of disk storage and CPU time. Without the Cluster and the associated storage it would be impossible to make sense of this data.

Krzysztof Sliwa, Austin Napier, Anthony Mann and others from the Department of Physics use the Cluster for analysis of ATLAS data. Compute jobs run continuously in support of this effort. For additional information see ATLAS and Higgs News.

Joshua Ainsley and his colleagues' work at the Laboratory of Leon Reijmers, PhD, Tufts University Neuroscience Department focuses on changes in gene expression that occur in neurons during learning and memory formation. To examine these events on a genome-wide scale, they use a technique called next-generation sequencing, which generates millions of "reads" of short nucleotide sequences. By sequencing the RNA that is present before and after a behavioral paradigm designed to induce learning in mice and then comparing the results, they can begin to understand some of the basic steps that occur in a live animal forming a memory. The Cluster is essential for their research since figuring out where millions of short DNA sequences map on the mouse genome is a very computationally intensive process. Not only would the results take much longer to obtain on a single desktop, but they would be very limited in their ability to modify parameters of their analysis to see how that affects the results. What would take weeks or months takes hours or days, thanks to the resources provided by the Tufts Cluster.

Chao-Qiang Lai and their colleagues' Tufts/HNRC research focuses on nutrigenomics, studying gene-diet interactions in the area of cardiovascular diseases using both genetic epidemiology approaches and controlled dietary intervention studies. This research involves the investigation of nutrient-gene interactions in large and diverse populations around the world, with long-standing collaborations with investigators in Europe, Asia, Australia and the United States. For the current project, they are using the Cluster to handle a large amount of genome data, such as genetic variants in human genomes, which cannot be handled on their laptop computer. The Cluster is over 50X faster than their laptop. It would not be possible to complete their research project without it!

Anoop Kumar, Professor Lenore Cowen, Matt Menke, and Noah Daniels used the Cluster to hierarchically organize the protein structural domains into clusters based on geometric dissimilarity using the program Matt (http://bcb.cs.tufts.edu/mattweb/). The first step in the experiment was to align all the known protein domains using Matt. Comparing all 10,418 representative domains against each other required running Matt approximately 54 million times. While a single run takes only about 0.1 CPU seconds, running it 54 million times would take approximately 74 days on a single processor. By making use of the ability to run multiple jobs on separate nodes on the Cluster, they split the job into smaller batches of 0.5 million alignment operations per batch, thus creating 109 jobs that they submitted to the Cluster. Each job took approximately 15 hours, which is a significant reduction from 74 days. By running the jobs simultaneously on separate nodes of the Cluster they were able to reduce the time taken to perform their analysis from 2.5 months to less than a day. This speed-up proved to be an additional benefit when they realized they needed to run an additional experiment using an alternative to Matt, as they were able to run that second experiment without significantly delaying their time to publication. This research has resulted in a paper, "Touring Protein Space with Matt", that has been accepted to the International Symposium on Bioinformatics Research and Applications (ISBRA 2010) and will be presented in May.
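
The alignment and job counts above can be reproduced with a short sketch. It assumes that each unordered pair of distinct domains is aligned exactly once, which is an assumption made here for illustration and not a detail stated by the group; the variable names are also illustrative.

```python
import math

# Illustrative names; the figures come from the paragraph above.
NUM_DOMAINS = 10_418
ALIGNMENTS_PER_BATCH = 500_000

# Assumption: one Matt run per unordered pair of distinct domains.
num_alignments = math.comb(NUM_DOMAINS, 2)

# Each batch becomes one cluster job; the last batch is only partly full.
num_jobs = math.ceil(num_alignments / ALIGNMENTS_PER_BATCH)

print(f"{num_alignments:,} alignments split into {num_jobs} jobs")
```

Under that assumption the sketch yields about 54 million alignments and 109 jobs, in line with the numbers reported. The serial and per-job run times depend on the approximate per-run cost, so the sketch does not recompute them.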

Recognizing the value of running large tasks on the Cluster and the future CPU-intensive programming requirements of the group, Professor Cowen has contributed additional nodes to the Cluster. While members of the BCB research group (http://bcb.cs.tufts.edu/) get priority to run programs on those nodes, anyone with an account on the Cluster can run programs on them.

Keith Noto is a postdoc in the Computer Science department, working on anomaly detection in human fetal gene expression data. That is, how does one distinguish "normal" development (meaning: like what we've seen before) from "abnormal" (different from what we've seen before, in the right way) over hundreds of samples with tens of thousands of molecular measurements each, when they don't even really know what they're looking for? They use the Tufts Cluster to test their approaches to this problem on dozens of separate data sets. These computational experiments take thousands of CPU hours, so their work cannot be done on just a handful of machines.

Ken Olum, Jose Blanco-Pillado and Ben Shlaer are using the Cluster to attempt to answer an important question in cosmology, namely "How big are cosmic string loops?" Cosmic strings are ultra-thin, fast-moving filaments hypothesized to be winding throughout the universe, most of it in the form of long loops. There has been much theoretical interest and work in cosmic strings, but before they can connect the theory to future observations, they need to know the typical sizes of the loops the network produces.

It turns out this is an ideal question to solve numerically, since the evolution of each individual string segment is easy to compute, and the tremendous scales over which the network evolves make analytic work extremely difficult.

What makes this exciting now is that the previous generation of numerical cosmic string simulations disagreed on what the right answer is. This group believes that current hardware is sufficient to enable them to answer the question definitively.

The research that Alireza Aghasi is doing is highly computational and requires a lot of processing power and memory. They work on Electrical Resistance Tomography (ERT) for the detection of contaminants under the surface of the earth. The problem ends up being a very high-dimensional inverse problem that is severely ill-posed. Dealing with such a problem without appropriate processing power is impossible. Once they became aware of the Cluster they started exploring it and realized that some of its features really help them with processing speed. The feature that interested them most was the good performance in sparse matrix calculations. Star-P does an excellent job dealing with very large sparse systems compared with other platforms. Personally they experienced some very good results using Star-P.

Umma Rebbapragada is a Ph.D. student in computer science, studying machine learning. Their research requires them to run experiments in which they test their methods on different data sets. For each data set, they may need to search for or test a particular set of input parameters. For each particular configuration of the experiment, they will need to perform multiple runs in order to ensure their results are statistically significant, or create different samplings of their data. In order to test a wide variety of configurations across multiple data sets, they exploit the Cluster's ability to run "embarrassingly parallel" jobs. They have submitted up to 2000 jobs at a time, and had them finish within hours. This has allowed them to test new ideas quickly, and accelerated their overall pace of research. They have different software demands depending on the project they are working on. These include Java, shell, Perl, Matlab, R, C and C++. Fortunately, these are all well-supported on the Cluster. They also plan to explore MPI one day and take advantage of products like Star-P, which are available on the Cluster.
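
As a rough illustration of why such experiments are "embarrassingly parallel", the sketch below enumerates every combination of data set, parameter value, and repeated run. The data set names, parameter values, and repeat count are hypothetical placeholders and are not taken from the research described above.

```python
from itertools import product

# Hypothetical data sets, parameter values, and repeat count (illustration only).
datasets = ["dataset_a", "dataset_b"]
learning_rates = [0.01, 0.1]
repeats = range(5)

# Every combination is independent of the others, so each one could be
# submitted to the cluster as its own job.
configs = [
    {"dataset": d, "learning_rate": lr, "seed": seed}
    for d, lr, seed in product(datasets, learning_rates, repeats)
]

print(len(configs))
```

Since no configuration depends on the result of another, the number of jobs grows as the product of the option counts, and all of them can run at the same time given enough nodes.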

David Bamman and Greg Crane use the Cluster for two main purposes: parallel text alignment (aligning all of the words in a Latin or Greek text like the Aeneid or the Odyssey with all of the words in its English translation) and training probabilistic syntactic parsers on their treebank data. Both of these are computationally expensive processes: even aligning 1M words of Greek and English takes about 8 hours on a single-core desktop, and for their end result, they need to do this 4 separate times. Using a multi-threaded version of the algorithm (to take advantage of each cluster computer's 8 cores) has let them scale up to quantities of data (5M words) that they simply could not have handled on their existing desktop computers. Most importantly, though, the Cluster environment lets them run multiple instances of these algorithms in parallel, which has greatly helped in testing optimization parameters for both tasks, and for the alignment task in particular lets them run those 4 alignments simultaneously, allowing for both speed and accuracy.

Luis Dorfmann and colleagues developed a physiologic wall stress analysis procedure by incorporating experimentally measured, non-uniform pressure loading in a patient-based finite element simulation. First, the distribution of wall pressure is measured in a patient-based lumen cast at a series of physiologically relevant steady flow rates. Then, using published equi-biaxial stress-deformation data from aneurysmal tissue samples, a nonlinear hyperelastic constitutive equation is used to describe the mechanical behavior of the aneurysm wall. The model accounts for the characteristic exponential stiffening due to the rapid engagement of nearly inextensible collagen fibers and assumes, as a first approximation, an isotropic behavior of the arterial wall. The results show a complex wall stress distribution with a localized maximum principal stress value of 660 kPa on the inner surface of the posterior surface of the aneurysm bulge, a considerably larger value than has generally been reported in calculations of wall stress under the assumption of uniform loading. This is potentially significant since the posterior wall has been suggested as a common site of rupture, and the aneurysmal tensile strength reported by other authors is of the same order of magnitude as the maximum stress value found here. The numerical simulations performed in this research required substantial computational resources and data storage facilities, which were very generously made available by Tufts University. This support is gratefully acknowledged.

Rachel Lomasky and Carla Brodley address problems in the two areas of Machine Learning and Classification. A new class of supervised learning processes called Active Class Selection (ACS) addresses the question: if one can collect n additional training instances, how should they be distributed with respect to class? Working with Chemistry's Walt Laboratory at Tufts University, they train an artificial nose to discriminate vapors. They use Active Class Selection to choose which training data to generate. In the area of Active Learning, they are interested in the development of tools to determine which Active Learning methods will work best for the problem at hand. They introduced an entropy-based measure, Average Pool Uncertainty, for assessing the online progress of active learning. The motivating problem of this research is the labeling of the Earth's surface to create a land cover classifier. They would like to determine when labeling more of the map will not contribute to an increase in accuracy. Both Active Class Selection and Active Learning are CPU-intensive. They require working with large datasets. Additionally, experiments are conducted with several methods, each with a large range of parameters. Without the Cluster, their research would be so time-consuming as to be impractical.

The Cluster allows Eugene Morgan to work with large amounts of data within a reasonable time frame. They first used the cluster to interpolate sparse data points over a fairly large 3-dimensional space. The cluster has also dramatically sped up the calculation of semivariance for dozens of sections of seafloor containing vast numbers of data points, quickly performed thousands of Monte Carlo simulations, and computed statistics on one of the largest global wind speed datasets containing ~3.6 billion data points. They have most recently used the cluster to find optimal parameters for rock physics equations using a genetic algorithm. Most of these activities have been or will be incorporated in technical publications.

Eric Thompson and colleagues have used the Cluster to further their understanding of the seismic response of near-surface soils. This behavior, often termed "site response," can often explain why locations heavily damaged by an earthquake are frequently observed adjacent to undamaged locations. Standard modeling procedures often fail to accurately model this behavior. The failure of these models is often attributed to the uncertainty of the soil properties. However, using the Cluster they have shown that the underlying theoretical assumptions of the standard model (vertically incident plane SH-wave propagation through a laterally constant medium) are responsible for the failure to match the observed site response behavior.

The research that Andrew Margules is currently conducting is in the area of Passively Actuated Deformable Airfoils. The largest presence of airfoils today is within the aerospace and transportation industries. On airfoils like those on commercial and military aircraft, the basic teardrop shape is augmented with a series of external structures which aid in take-off, landing, and cruising flight. While they perform specific and important functions, they add weight to a system in which weight management is critical. His research seeks a way to develop an internal structure for an airfoil that would provide similar shape change without the added external mechanisms. To do this, he is using two different computational software packages. COMSOL Multiphysics allows for the examination of the fluid-structure interaction of the airfoil and moving air. By trying different internal rib structures, he hopes to find an appropriate one. In addition, he is using the computational fluid dynamics package Fluent to help visualize velocity and pressure fields over deformed and undeformed airfoil shapes. If this software were not available through the academic research cluster, this research would be an extremely slow process. The governing physics behind these simulations is complex enough that, without the computing power of the Cluster, he does not believe he would be able to perform them. In the last twenty or so years, the focus has shifted from passive actuation to active actuation. Hopefully, this research will help to launch a renewed interest in this field.

Ke Betty Li is a researcher in the Department of Civil and Environmental Engineering. Their research focuses on the investigation of how various contaminants affect groundwater quality and how remediation systems could be designed. An important approach they are using for this type of investigation is modeling contaminant fate and transport in the subsurface on computers. The resources provided by the Tufts Cluster are very important to them. Their simulations usually take days or even weeks on a single CPU. The Cluster can either expedite each simulation if they use simulators that enable parallel computing, or allow them to simulate multiple serial processes simultaneously. The significant improvement in computing efficiency is critical for them to deliver quality work to funding sponsors. They expect that their work will improve the current understanding of contamination in the subsurface, provide cutting-edge assessment tools, and stimulate innovative treatment technologies.

Eric Miller and his colleagues' work concerns the development of tomographic processing methods for environmental remediation problems. Specifically, they are interested in using electrical resistance tomography (ERT) to estimate the geometry of regions of the subsurface contaminated by chemicals such as TCE or PCE. Though the concept of ERT is not unlike the more familiar computed axial tomography (CAT) used for medical imaging, the physics of ERT are a bit more complicated, thereby leading to computationally intensive methods for turning data into pictures. Luckily these computational issues are, at a high level, easily parallelizable. Thus, they have turned to Star-P as the tool of choice for the rapid synthesis of their algorithms.

The Trimmer Lab is interested in the control of locomotion and other movements in soft bodied animals. Michael A. Simon has been working there analyzing the activity of a specific mechanosensor trying to understand how it influences abdominal movement, a critical question for animals with no rigid components. One particularly powerful analytical tool for analyzing such sensors is nonlinear analysis using Gaussian white noise as a stimulus. One challenge of this technique, however, is that it is computationally complex. Even storing the matrices involved in these computations is beyond the capabilities of the typical personal computer. The Tufts Research Cluster offers him the resources necessary to run these computations and analyze the results without needing to invest in new, complicated, or expensive analytical hardware or software. It also allows him to use software that would have been difficult to acquire for their lab alone. Without this resource, following this line of inquiry would have proved a costly endeavor, possibly prohibitively so. They hope to apply their results to the development of computer and robotic models, with the eventual goal of designing a soft robot, a groundbreaking engineering application with substantial implications for design in the biomedical engineering arena, as well as in other areas of engineering.

Use of the Cluster has been invaluable to Katherine L. Tucker and her colleagues' research. They use a genetic analysis software package named SOLAR, which is Linux/Unix based. This software and the methods used in it are cutting edge. They are able to perform various genetic computations with ease. In the past, students have had to do these calculations by hand because of a lack of access to such software. However, hand calculations are only possible for small sample sizes and simple genetic analysis. Their current work with SOLAR includes over 5,000 individuals and they are using some of the most advanced methods available. The Cluster allows them to do large computational runs that would not be otherwise possible. Thus, their current work would not have been possible without access to SOLAR on the Bioinformatics Cluster. In addition, this type of analysis is becoming more common and will be a greater part of their efforts in future years. Use of the Cluster helps their research to remain competitive and is important in their grant application process. Their lab is the first to use SOLAR on the Bioinformatics Cluster; since they have been using it, however, many labs have inquired about how to gain access.

Jeffery S. Jackson is a graduate student in Mechanical Engineering and is conducting research on microfluidic mixers. He uses Cluster01 to create and run fluid flow models in COMSOL Multiphysics. The COMSOL program solves the Navier-Stokes equations for transient fluid flow and the convection-diffusion equation. For the models that he creates to be accurate, though, they require more elements and time steps than his computer, or the computers in the EPDC, can handle. This is where the cluster comes in very handy. He usually has the Cluster run any model that is more complicated than a 2D model with 30,000 elements. The most complicated model he has had the cluster solve consisted of 90,000 elements. This model took 30 hours for the Cluster to solve, which is something that no other computer resource he has access to could do. Another nice benefit of the Cluster is being able to use it from home. He lives in Providence, RI and it takes him two hours to get to Tufts by train. So, he only comes in when he has to. Having remote access to the Cluster makes this possible. Without the Cluster, or the very helpful people who provide excellent technical support, he would never have been able to do the research he needed to finish his Master's Thesis.

Erin Munro is studying Computational Neuroscience in the Math department. Her research consists of doing MANY simulations. That being said, she would not be able to do this research without the cluster! She simulates networks of thousands of neurons interacting. While there are some simulations that take a few minutes, the majority of them take 45 minutes to 1.5 hours on one node. By her last calculation, she would like to run over a month's worth of these simulations. On top of this, she has run several very important simulations that take 1.5 days on 16 nodes. She had to run these simulations in order to try to reproduce results from Roger Traub's research. Her current project is to try to explain these results. She and her colleagues tried to find a simpler way to explain them without reproducing the full model, but they found that they couldn't do it. With the Cluster, she has been able to reproduce the results to the best of her ability. Furthermore, she has been able to dissect the model, and run many more simulations to get a much better understanding of what is going on in his results. She feels like she is coming close to fully explaining the results, and has just presented a talk at BU explaining her ideas. None of this would have been possible without the Cluster.

Casey Foote's research for his MS in Mechanical Engineering is based on using the software available on the Cluster to model a cold forging process. This model, paired with experimental data, will then be used to develop a tool to predict forging work piece cracking. The tool will provide a manufacturer of airfoils for use in the aircraft engine industry a method to rapidly develop new processing while avoiding costly physical trials.

Aurelie Edwards' graduate student Christopher Mooney performs simulations of unsteady, turbulent fluid flow in a bioreactor with a stir-bar, using Femlab engineering software. Prior to having access to the Tufts cluster, he was experiencing extensive memory usage problems. On a PC with 2GB of RAM using Windows XP, he was only able to access about 40% of the memory, due to fragmentation issues, and his simulations did not converge. They were both relieved to learn that they could have access to the Tufts Cluster and its Linux platform that offers 4GB+ of memory space. The latter has thankfully allowed them to solve increasingly complex models. For example, using his PC, Chris could solve finite element Navier-Stokes fluid flow problems with an element mesh density that limited the problem to about 100,000 degrees of freedom, beyond which he ran out of memory. He often received "low mesh quality" error messages that hindered the mathematical convergence of the solution. On the Cluster, he now has enough memory to refine the mesh and run models with 300,000 degrees of freedom. Chris still runs into "out of memory" problems on the Cluster, but much less frequently.

Gabriel Wachman uses the Cluster to conduct experiments relating to their work in machine learning. They are in the computer science department. The experiments they have been running have generally been to aid in the comparison of different learning algorithms. By running many experiments over a range of parameters, they can collect data that helps them to draw conclusions on the behavior of the algorithms. Without the Cluster, much of the work they have done would have been impossible or at best severely limited.

Alexandre B. Sousa is a graduate student with the High Energy Physics Group and, as part of the MINOS experiment collaboration, has been one of the main people responsible for mass event reconstruction using the Fermilab fixed-target farm. Earlier this year, a Mock Data Challenge was issued to the experiment in order to shake down reconstruction and analysis shortcomings before real data collection starts in January. This effort requested the generation of a rather large Monte Carlo (MC) sample, which was subsequently reconstructed at Fermilab. However, the generation of the MC sample was quite hard to set up at Fermilab, where space constraints, e-bureaucracy and competition with other experiments meant they would not be able to do it in a timely manner. That was when they decided to test the Tufts Linux Cluster to perform this task. They were set up with an area on the /cluster/shared space within a day of their original request, and after a few tests, they were able to generate 80% of the total necessary MC sample in less than a week. They were, of course, lucky to be almost the exclusive user of the cluster for that period, but they really had no problems setting things up and using it, in what is seen as a nice success of the Tufts High Energy Physics Group. Given this success, they have volunteered to become one of the spearheading institutions taking part in the upcoming MC generation effort, which should start later this month, and the experience gained was turned into a document and relayed to other institutions that are starting to run their own clusters and hope to join this effort. They have used the cluster a second time to do a customized reprocessing of data for the CC nue analysis group, to which they belong, which required compiling the MINOS Offline Software on the cluster, installing a MySQL database, and assembling some shell scripts to handle the job output. That went quite well, and the full data sample was processed in 2 hours, with about 1 day of setup. Having worked for 2 years with the Fermilab batch farm, they were mainly impressed by the speed of the network connection of the CPU nodes to the I/O node, almost 20 times the Fermilab data transfer speeds, and also by the great flexibility of use given to the users, which meant minimal back-and-forth contact with the admins and dramatically improved work efficiency.
