Changes to TTS Research Computing Services
Email from TTS Communications | July 25, 2024
Dear Tufts Research Computing Community:
As part of our ongoing efforts to provide robust, sustainable, and cost-effective research infrastructure, TTS is making service changes and enhancements to our high-performance computing (HPC) and research storage environments. These changes include migration to a tiered storage architecture to lower the cost of storing less active research data, introduction of a charge-back model for research storage, a shift in who funds installation of researcher-owned equipment at MGHPCC, and the institution of an HPC node lifecycle. The changes will impact fewer than 10% of HPC and research storage users at Tufts and will be phased in on timelines that allow researchers to address them within new grants or renewals.
TTS knows how critically important the University HPC and storage environments are to our research program and invests significant resources to keep them up to date and responsive to researcher needs. In the 2023-2024 academic year, TTS retired more than 2,000 public CPU cores and 2 GPUs that had offered faithful service to the Tufts HPC community over the past decade. TTS has replaced these and increased HPC capacity with 6,000 public CPU cores and 40 GPUs.
See detailed information about these initiatives and the expected timelines for implementation below.
- Implement Tiered Research Storage and Researcher Charge-Back Model
Research data storage usage is growing significantly as data-intensive research expands across the University. Storage costs are approaching $1 million per year, and this trend is not fiscally sustainable. As Tufts matures as a research enterprise, the University needs to put storage costs on a sustainable footing and strengthen our operational model.
The changes outlined below are the result of extensive consultation with the community. In the summer of 2023, TTS led the formation of the Research Storage Committee, composed of faculty researchers, deans for research, executive administrative deans, and OVPR leadership. The committee's work included detailed interviews with Tufts researchers, communication with the Faculty Senate Committee on Research and Scholarship, communication with research deans and school deans, extensive cost modeling, and extensive benchmarking of peers through detailed interviews and an environmental scan of sixty universities’ research storage models.
The Research Storage Committee put forth the following changes that will impact HPC Cluster Storage and Networked Desktop Research Storage (RStore Drive):
- Implement Tiered Research Storage Environment
This fall, TTS will deploy a multi-tiered storage solution that gives researchers more economical and appropriate options for storing their data than the current single-tier storage environment supported by TTS. When the new tiers are available, TTS will work with researchers to move portions of their data to the appropriate tier:
- Tier 1 (hot) for active, high-performance compute use.
- Tier 2 (cool) for inactive, infrequently accessed data, no compute on data.
- Tier 3 (deep cold) for static, rarely accessed data, “attic” storage.
- Implement Researcher Charge-Back Model for Research Storage
The University will implement a charge-back model that includes a charge for storage usage over a specified threshold. Each Tufts faculty member or faculty member’s research group will be provided with a base allocation at no cost. Storage usage beyond the base allocation will be billed monthly to the researcher or department. Compared to our peer institutions, Tufts will have a generous base allocation that will meet the needs of more than 90% of our researchers without any charge-back.
The new model will include the following base allocation and costs:
- Tier 1: 10 TB quota, $85 per TB per year beyond 10 TB quota.
- Tier 2: 10 TB quota, $35 per TB per year beyond 10 TB quota.
- Tier 3: 0 TB quota, $6 per TB per year.
Rates will be reviewed and adjusted annually.
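For illustration only, the sketch below shows how these rates translate into an annual charge, assuming charges apply solely to usage above each tier’s base allocation as described above. The tier labels, function, and example usage figures are illustrative assumptions, not part of any official TTS billing tool:

```python
# Illustration only: estimates annual storage charges from the base allocations
# and rates listed above. Actual billing is handled by TTS and may differ.

TIERS = {
    # tier: (base allocation in TB at no cost, $ per TB per year beyond it)
    "Tier 1": (10, 85),
    "Tier 2": (10, 35),
    "Tier 3": (0, 6),
}

def estimated_annual_charge(usage_tb: dict) -> float:
    """Estimate the yearly charge for a group's storage usage, given in TB per tier."""
    total = 0.0
    for tier, used in usage_tb.items():
        base, rate = TIERS[tier]
        total += max(0.0, used - base) * rate
    return total

# Hypothetical example: 25 TB in Tier 1, 8 TB in Tier 2, 40 TB in Tier 3
# -> (25 - 10) * $85 + $0 + 40 * $6 = $1,275 + $240 = $1,515 per year
print(estimated_annual_charge({"Tier 1": 25, "Tier 2": 8, "Tier 3": 40}))
```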
The effective dates and deadlines for the tier and charge-back implementation are as follows:
- July 1, 2026: all Tufts faculty/research groups will be using the new tiers at the approved rates.
- Interim between July 1, 2024 and July 1, 2026:
- Existing Tufts faculty/research groups can be allocated up to 5 TB per year for Tier 1 storage at no cost.
- On 8/1/2024, Tufts faculty/research groups’ storage quotas will be reduced to a maximum of 5 TB above their storage usage as of 7/1/2024.
- Tufts faculty/research groups that require a growth rate greater than 5 TB per year will need to engage individually with TTS to address their needs.
- Beginning fall 2024, TTS will work closely with Tufts faculty/research groups to move data to the appropriate tiers to free up Tier 1 storage allocations. Newly generated data will be stored in Tier 1.
- Tools and processes to assist researchers with data management planning will be forthcoming.
- Sponsored proposal budgeting: a process to determine the storage needed so that storage costs can be budgeted in sponsored proposals. Generally, if storage usage over a specified threshold is related to a sponsored project, the cost can be charged to sponsored awards as a direct cost.
- Data Storage Finder: to assist researchers in comparing and selecting appropriate storage for their research needs; includes quotas, costs, security requirements, etc.
- Research Storage Usage Dashboard: to allow researchers to view their data storage usage in TB, and eventually in dollars, across all storage environments by directory and by modification date. Available to PIs and their approvers only.
- Storage Cost Calculator: to facilitate creating estimates of future storage costs.
- Outsource Installation for Research Computing Equipment at the MGHPCC | Effective January 1, 2025
Effective January 1, 2025, installation costs for researcher-purchased compute nodes at the Massachusetts Green High Performance Computing Center (MGHPCC) will be funded directly by the school/department or faculty purchasing the equipment. Grant proposals submitted and startup packages committed before September 1, 2024 that include node purchases will be exempted from this change.
While costs may vary based on the number of nodes installed, TTS recommends a planning estimate of $1,000 total for a typical installation of 10-20 nodes. We estimate this change will impact approximately five researchers annually.
This adjustment was made because of staffing reductions taken in TTS’s FY2025 budget. TTS will continue to cover the costs of periodic maintenance and decommissioning of equipment.
- Establish HPC Researcher Contribution Node Lifecycle
Currently, the Tufts “Contrib Model” allows researchers to acquire additional, standardized HPC hardware to support their individual research projects. The additional compute resources are integrated into the shared Tufts HPC cluster and managed centrally by TTS Research Technology. The owners of the equipment are given priority access, while any excess capacity is returned to the pool (the “preempt” partition) for general, shared use by the Tufts community. The owners of the equipment determine the access and queuing policies for their portion of the HPC cluster. Except for hardware installations at MGHPCC, TTS provides all other standard services without charge.
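As a toy illustration of the access model just described, the sketch below assumes a single contributed node on which owner jobs take priority and community jobs running on idle capacity are preempted and requeued. This is not the Tufts HPC scheduler or its configuration; every name and rule here is an illustrative assumption:

```python
# Toy model of the Contrib Model access policy described above.
# Illustration only: this is not the Tufts HPC scheduler or its configuration.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ContribNode:
    owner: str                         # research group that purchased the node
    running_job: Optional[str] = None  # job currently occupying the node
    running_group: Optional[str] = None

    def submit(self, job: str, group: str) -> str:
        """Owner jobs get priority; community jobs use idle capacity but may be preempted."""
        if self.running_job is None:
            self.running_job, self.running_group = job, group
            return f"{job} started"
        if group == self.owner and self.running_group != self.owner:
            preempted = self.running_job  # community job is preempted and requeued
            self.running_job, self.running_group = job, group
            return f"{job} started; {preempted} preempted and requeued"
        return f"{job} queued"

node = ContribNode(owner="lab_a")
print(node.submit("community_job", "lab_b"))  # idle contributed capacity is shared
print(node.submit("owner_job", "lab_a"))      # owner reclaims the node; community job is preempted
```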
As with all technology, HPC nodes have a service life. As compute nodes age, they lose vendor and operating system support, creating maintenance difficulties and security risks. To give Tufts researchers sufficient advance notice to plan for compute node end-of-life, TTS will keep researchers informed of the lifecycle state of their contributed nodes, allowing a total service life of the warranty period plus a best-effort support period lasting a maximum of 7 years. At purchase, every HPC compute node must be procured with an associated warranty; nodes that fail without warranty support will be decommissioned, and to ensure the best value from contributed nodes, TTS recommends a 5-year warranty. This policy is intended to help researchers avoid significant unplanned interruptions to their research caused by system failures. TTS Research Technology will keep researchers informed about the state of their nodes and will work with them to determine node retirement dates and replacement estimates.
The HPC Node Lifecycle goes into effect as follows:
- September 1, 2024: all new compute node purchases.
- July 1, 2026: all compute nodes purchased prior to September 1, 2024.
For additional information regarding Tufts HPC, please see Tufts HPC.
For More Information on these Service Changes
TTS Research Technology will hold open forums to offer additional information and answer any of your questions:
- Wednesday 7/31, 11am-12pm
- Thursday 8/8, 12pm-1pm
- Additional fall dates: stay tuned for dates and times.
If you have any questions, please contact Patrick Florance, Director of Research Technology.
Sincerely,
TTS Communications