Researchers to Benefit from Canadian World-Class National Data Cyberinfrastructure enabling Big Data Discovery
Thursday, September 1, 2016
Data Storage Solution will enhance national research and discovery capabilities and promote collaboration
Burnaby, BC, September 1, 2016--(T-Net)--Compute Canada, in partnership with its member institutions, announced the results of a multimillion-dollar investment in a national big data storage solution for Canada's growing community of advanced computing researchers.
“We designed, together with our partners, a Canadian solution that is unique in the world,” says Mark Dietrich, Compute Canada President and CEO. “Globally, this gives Canada and Canadian researchers an advantage, because of Compute Canada's national platform and federated model for service delivery. We are able to deliver accessible, national solutions on a common platform across the country. These new storage systems, initially capable of storing 40 petabytes of research data, will be allocated on a peer-reviewed basis to support some of Canada's most accomplished and innovative researchers.”
The national approach stretches funding dollars and leverages Compute Canada's expertise across the country to ensure the procurement of the best technology solutions. There will be a storage component for each of the four new national advanced computing sites at the University of Victoria, Simon Fraser University, the University of Waterloo and the University of Toronto, being deployed over the next 18 months.
These systems will be accessible from across the country and represent the initial investment in a national infrastructure that will support the exponential growth of research data.
“When desired, individual institutions or organizations will have opportunities to deploy storage locally and can federate their local repository into the national system,” says Dr. Greg Newby, Compute Canada's Chief Technology Officer. “This provides enhanced privacy and sharing capabilities on a robust, country-wide solution with improved data security and backup. This is a great solution to address the data explosion we are currently experiencing in Canada and globally.”
Planned investments, to be made through late 2017, will grow the system to approximately 62 petabytes (PB) of persistent, available online storage across the four national sites.
This will be backed up by a comparable quantity of tape storage.
This is the first installment of the renewal of Compute Canada's advanced research computing platform in Canada, with capital funding provided by the Canada Foundation for Innovation (CFI), the Ontario Research Fund (ORFRI) and the BC Knowledge Development Fund (BCKDF). The new storage systems will be operated by Compute Canada and its partner institutions, with operational funding from the CFI's Major Science Initiatives (MSI) program, matched by provincial and institutional funds.
Storage requests from last year's Compute Canada resource allocation process totalled nearly double the available resources. The community of advanced research computing (ARC) users continues to grow, more than doubling in the past 5 years. Compute Canada's national platform now supports over 3,000 faculty in Canada, with approximately 10,000 users overall.
Compute Canada will be working closely with Scalar, IBM and DDN to build this integrated national data infrastructure. These three companies were selected through an open, competitive process to deliver the hardware and software components of the new storage cyberinfrastructure.
Questions and Answers for Storage RFP Announcement
1. What motivated development of a new national data infrastructure?
Storage requests from the 2016 Compute Canada resource allocation process were nearly double the available resources. The community of advanced research computing (ARC) users continues to grow, more than doubling in the past 5 years. Compute Canada's national platform now supports over 3,000 faculty in Canada, with approximately 10,000 users overall.
Compute Canada, in partnership with its regions and institutions, is building four new national advanced research computing sites. There will be a storage component for each of the four new national sites in Canada to support a federated national data infrastructure for all Canadian researchers. This approach is possible because of Compute Canada's national platform and federated model for service delivery. When desired, individual institutions or organizations will have opportunities to deploy storage locally and can federate their local repository into the national system.
2. How much new storage is being deployed?
New investments, to be made through late 2017, will deploy approximately 62 petabytes (PB) of persistent, available online storage across the four national sites. This will be backed up by a comparable quantity of tape storage.
Planned investments in the national data infrastructure will grow total storage capacity to more than 100PB in 2018, and 250PB by 2020.
3. When will users be able to take advantage of this new updated storage service?
The first storage will become available in summer 2016, and migration from legacy storage equipment will begin at that time. Recipients of the 2016 Resource Allocations for data storage will be relocated as appropriate, once new storage is available.
4. What new technologies and advantages will these new resources provide Canadian researchers?
The national data infrastructure will greatly expand existing capacities on large-scale filesystems, used to store file-based data. The new national computational systems will deliver a consistent experience, with single usernames and passwords, common access points on systems, and centralized backups. The national helpdesk will provide centralized support for all storage and computational systems, in coordination with local and regional support.
The national data infrastructure will also include object storage systems. Object storage provides a common interface to data objects, regardless of their online locations. This will allow transparent data replication, to increase availability and resiliency. Object storage also provides advanced features for access control, enabling everything from highly controlled access, limited to a researcher and his/her team, to full public access, enabling a scalable response to the demands for open access to research data.
Data that are not under active use will receive special consideration in the national data infrastructure. When a researcher needs to retain a dataset as part of an active project, but is not using that dataset, it will be automatically moved to tape. This greatly extends the capacity of the more expensive online storage. Furthermore, tape storage uses no power when the data are not being used, so this is a 'green' approach to storage.
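As a rough illustration of the tape migration described above, the following Python sketch selects datasets that have not been accessed for a given period. It is a minimal sketch only: the `Dataset` record, the `select_for_tape` function, the 180-day idle threshold, and the sample datasets are all hypothetical and are not taken from Compute Canada's actual migration policy or software.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Dataset:
    """Hypothetical record describing one retained dataset."""
    name: str
    size_tb: float
    last_accessed: date


def select_for_tape(datasets, today, idle_days=180):
    """Return datasets whose last access is older than the idle threshold.

    The 180-day default is an assumption made for illustration only.
    """
    cutoff = today - timedelta(days=idle_days)
    return [d for d in datasets if d.last_accessed < cutoff]


# Invented sample data, for illustration only.
datasets = [
    Dataset("genome-run-a", 120.0, date(2016, 1, 10)),
    Dataset("climate-sim-b", 40.0, date(2016, 8, 20)),
    Dataset("survey-c", 5.5, date(2015, 11, 2)),
]

to_tape = select_for_tape(datasets, today=date(2016, 9, 1))
freed_tb = sum(d.size_tb for d in to_tape)  # online capacity reclaimed
print([d.name for d in to_tape], freed_tb)
```

In this sketch, the two datasets that have been idle for more than 180 days would be moved to tape, and their combined size is the online capacity that becomes available to other projects.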
5. What is Compute Canada's main goal with a federated storage model?
The main goal is excellence in service delivery. This model will allow users to have a consistent and well-supported experience with data storage and access, with far fewer capacity constraints.
6. Why were these vendors selected and what is unique about their offerings?
The four new national sites formed a purchasing consortium, led by Simon Fraser University (SFU). SFU ran an open RFP process, with published criteria. Bids were evaluated by Compute Canada's team of experts, including representation from the four sites, the national storage team, and Compute Canada's national leadership.
The selected vendors were found to provide best-in-class products, services and support. Pricing offered by the vendors was outstanding, allowing Compute Canada to get maximum value for the dollar.
7. How much funding was necessary for this storage renewal and update?
$18M in national data cyberinfrastructure investment is planned through late 2017. This is the first installment of the renewal of Compute Canada's advanced research computing platform in Canada, with capital funding provided by the Canada Foundation for Innovation (CFI), the Ontario Research Fund (ORFRI) and the BC Knowledge Development Fund (BCKDF). The new storage systems will be operated by Compute Canada and its partner institutions, with operational funding from the CFI's Major Science Initiatives (MSI) program, matched by provincial and institutional funds.
8. What are the top five things we should know about this new procurement?
It's cost-effective: Commodity-based storage building blocks are being used to provide capacity.
It's resilient: Off-site backups, data replication, and other mechanisms will protect against data loss.
It's high performance: The storage building blocks, combined with software-defined storage for all the access modalities, assure rapid access even over the Internet.
It's well supported: Compute Canada's experts are here to help you! This includes email to support@computecanada.ca, and on-campus support at all member institutions.
It's scalable: A national data infrastructure needs to expand and scale. Online storage, backups, and all the ways of accessing data will continue to grow over time, to enable Canada's digital leadership.
9. How will privacy and data integrity be improved with these new services?
The new national data cyberinfrastructure will allow research groups to implement finer-grained access control for specific data items, whether files, objects, databases, or any of the content they include. This makes it easier for researchers to store research data securely and reliably, collaborate, and share data with colleagues and the public. Specific technologies for this include cloud-based services (including Ceph), object storage software (WOS), and traditional filesystems. Researchers and their projects will be able to limit data access to authorized users, groups, or individuals. They will also be able to make selected items public.
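The access levels described above (authorized individuals, groups, or the public) can be pictured with a small Python sketch. This is an illustration under stated assumptions, not the interface of Ceph, WOS, or any Compute Canada service: the `DataItem` record, the `can_read` function, and the user and group names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """Hypothetical data item with a simple access policy."""
    name: str
    owner: str
    public: bool = False
    allowed_users: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def can_read(item, user, user_groups=frozenset()):
    """Decide whether a user may read an item.

    Public items are readable by anyone, including anonymous users
    (user=None). Otherwise the user must be the owner, be listed
    individually, or belong to an allowed group.
    """
    if item.public:
        return True
    if user is None:
        return False
    if user == item.owner or user in item.allowed_users:
        return True
    return bool(item.allowed_groups & set(user_groups))


# Invented examples: one item restricted to a lab group, one public item.
raw = DataItem("raw-scans", owner="alice", allowed_groups={"lab-42"})
summary = DataItem("summary.csv", owner="alice", public=True)

print(can_read(raw, "bob", {"lab-42"}))  # group member
print(can_read(raw, "carol"))            # not authorized
print(can_read(summary, None))           # anonymous, public item
```

Keeping the policy on each item, rather than on a whole filesystem, is what allows a single project to hold both tightly restricted and fully public data side by side.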
Backgrounder
New investments, spanning through late 2017, will deploy approximately 62 petabytes (PB) of persistent, available online storage across the four national sites. This will be backed up by a comparable quantity of tape storage. This represents an $18 million investment.
The new national data cyberinfrastructure offers increased privacy and enhanced access control for specific data items, whether files, objects, databases, or any of the content they include. Specific technologies for this include cloud-based services, object storage software, and traditional filesystems.
Researchers and their projects will be able to strictly limit data access to authorized users, groups, or individuals. They will also be able to make selected items public.
There will be a storage component for each of the four new national sites in Canada to support a federated national data infrastructure for all Canadian researchers. This approach is possible because of Compute Canada's national platform and federated model for service delivery.
When desired, individual institutions or organizations will have opportunities to deploy storage locally and can federate their local repository into the national system.
The national data cyberinfrastructure will greatly expand existing capacities on large-scale filesystems, used to store file-based data. The new national computational systems will deliver a consistent experience, with single usernames and passwords, common access points on systems, and centralized backups. The national helpdesk will provide centralized support for all storage and computational systems, in coordination with local and regional support.
The national data cyberinfrastructure will also include object storage systems. Object storage provides a common interface to data objects, regardless of their online locations. This will allow transparent data replication, to increase availability and resiliency. Object storage also provides advanced features for access control, enabling everything from highly limited access to full public access.
Data that are not under active use will receive special consideration in the national data cyberinfrastructure. When a researcher needs to retain a dataset as part of an active project, but is not using that dataset, it will be efficiently moved to backup tape. This greatly extends the capacity of the online storage. Furthermore, tape storage uses no electricity at rest, so it is a 'green' approach to storage.
Canada's Four New National Advanced Research Computing Sites to be deployed 2016/2017
GP1, hosted by the University of Victoria: An OpenStack cloud system with approximately 6,000 CPU cores and 5 petabytes (PB) of disk for persistent storage. Both system and storage are being purchased, as of late June 2016, and will be deployed in summer 2016. An expansion to compute and disk resources of 1/3 of the total expenditure is planned for mid-2017. Total CFI funding: $3M.
Persistent storage, for all Stage 1 and 2 systems, is not inclusive of temporary/scratch disk, which will be purchased with the system. All core counts in this proposal are based on Haswell; actual core count will vary proportionally.
GP2, to be hosted by Simon Fraser University: A large heterogeneous cluster with up to 25,000 CPU cores, nearly 1,000 GPU devices, and 15PB of disk for temporary and persistent storage. “GP” for GP2 and GP3 refers to General Purpose clusters, with various node types intended for diverse workloads. SFU is a destination for backups and deep storage. Installation and commissioning are expected in fall 2016. Total CFI funding: $8.35M.
GP3, to be hosted by the University of Waterloo: A large heterogeneous cluster with up to 19,000 CPU cores, hundreds of GPU devices, and 15PB of disk. Waterloo is a destination for backups and hierarchical storage management. Storage is being purchased, and the system is scheduled for purchase in October 2016. An expansion to compute and disk resources of 1/3 of the total expenditure is planned for mid-2017. Total CFI funding: $7.8M.
LP, to be hosted by the University of Toronto: A large parallel supercomputer with a balanced high performance interconnect, anticipating large homogeneous partitions with one or two node types, possibly including accelerator/manycore. Approximately 66,000 CPU cores will be purchased in mid-to-late 2017. 5PB of persistent storage is being purchased, and a further 10PB will be purchased with the system itself. Total CFI funding: $9.85M.
Service Infrastructure Development: Research Data Management (RDM) and other software infrastructure investments in support of all systems and services, including the needs of CFI's Challenge 1. Personnel and software expenses to be shared across the four Stage 1 sites. Total CFI funding: $1M.
Infrastructure systems: High-availability servers for critical infrastructure services have been deployed at SFU and uWaterloo. All four sites are receiving new network routers to support a multisite Science DMZ (see fasterdata.es.net), software-defined networking, and 100Gb connectivity. Funding included in site totals above.
DDN
Proud to have been selected by Compute Canada as the software-defined storage provider for this innovative, world-class project, DDN is excited to work with Scalar as the system integrator and with IBM as the tape supplier to provide a complete active archiving and collaboration scale-out storage cloud. DDN has an 18-year history of providing high performance storage solutions to the top supercomputing organizations in the world. DDN is providing its Web Object Scaler (WOS) to meet Compute Canada's need for scalable, persistent, online storage. The flexible deployment methods of DDN's WOS will allow Compute Canada to deploy it as software on the chosen standard Dell storage hardware provisioned by Scalar. The DDN software-defined storage solution offers a complete object storage platform capable of:
DDN is excited about the opportunity to support Compute Canada and its growing list of members, now and well into the future. Eager to implement the rollout and accessibility across Canada, DDN is looking forward to supporting the exponential growth of the nation's research data.
Scalar
As Canada's leading IT solutions integrator, Scalar is pleased to be supporting Compute Canada with the hardware platforms and system integration for this national-scale storage solution. By working closely with a number of hardware technology vendors such as Dell Canada and Seagate, Scalar is delivering highly scalable storage with unparalleled levels of density and extremely compelling economics. With a strong history of supporting most of the members of Compute Canada with their technology needs, Scalar is thrilled to be working with partners such as Seagate, Dell, DDN and IBM to change the national storage landscape. The solutions provided are all based on extremely dense hardware products, including products capable of supporting 700+ TB in a single 4U enclosure, and products used to build some of the fastest, most scalable storage solutions in the world. Combined with DDN's WOS technology, and IBM's capabilities for backup and archiving, the overall solution constitutes an impressive step forward for researchers across Canada.
IBM
IBM has been selected to provide a combination of hardware and software to enable multisite data backups for Compute Canada sites. Backup is particularly critical as insurance that important data, even if infrequently accessed, is available if and when it is needed. Achieving reliable data backup at a low cost supports other priorities within the IT infrastructure. The IBM storage solution includes:
IBM's industry-leading robotic TS4500 tape libraries
IBM TS1150 high density tape drive technology
IBM Spectrum Protect software-defined storage software
IBM Power 8 servers
ABOUT SIMON FRASER UNIVERSITY:
As Canada's engaged university, SFU is defined by its dynamic integration of innovative education, cutting-edge research and far-reaching community engagement. SFU was founded 50 years ago with a mission to be a different kind of university—to bring an interdisciplinary approach to learning, embrace bold initiatives, and engage with communities near and far. Today, SFU is Canada's leading comprehensive research university and is ranked one of the top universities in the world. With campuses in British Columbia's three largest cities - Vancouver, Burnaby and Surrey - SFU has eight faculties, delivers almost 150 programs to over 35,000 students, and boasts more than 135,000 alumni in 130 countries around the world.
Contact:
Wan Yee Lok
University Communications
778.782.5987
Photo Credit: Compute Canada