Berkeley's 3 Pillar Computing Infrastructure Strategy

Berkeley's 3 Pillar Infrastructure Model

Bill Allison, UC Berkeley CTO

May 7, 2024

UC Berkeley developed our 3 Pillar Infrastructure computing strategy in 2021 to guide investments in the University's computing infrastructure across three areas: on-campus data center spaces (the "local pillar"), new data center capacity in partnership with NASA at Moffett Field (the "offsite colocation pillar"), and a public cloud pillar. This represents a strategic evolution designed to address the University's diverse and growing computational needs, with the 3 pillars transitioning Berkeley away from the older, "one-size-fits-all" approach oriented around a single campus data center. This 3-part strategy also fulfills the vision originally laid out in Berkeley's 2016 cloud strategy, which we labeled "University First" (a bit of a friendly jab at the marketing-driven "cloud-first" originally developed by Amazon). University First is about shifting the framing away from technology and onto the nature of the use-case, to give Berkeley academics the best, most cost-effective tools for their jobs.

The planning process that led to the 3 pillars borrowed a comparison tool from product management called a "feature comparison chart." In Fall 2020, CTO Bill Allison partnered with IT leads, including the campus Research CTO, Ken Lutz, and Eric Fraser, Assistant Dean of IT in the College of Engineering, to collaboratively fill in a feature comparison chart that identified use-cases and matched their requirements to possible computing infrastructure options. Once the initial draft was filled in, it was presented to many groups of IT experts and researchers to ensure it reflected the right categories of campus computing needs.

[Figure: feature comparison chart for different use-cases that need capabilities of the different pillars. The article contains a hyperlink to the sheet with full details.]

This analytical tool enabled us to systematically assess various computing resources against specific criteria such as performance, cost-efficiency, and energy consumption. Through this evaluation, it became clear that a single approach would not meet campus needs. The top requirements clustered around a few types of infrastructure, an insight that yielded the "3 Pillar Infrastructure Model." The comparison chart helped identify the solutions that work best across many different users and that would optimally support distinct aspects of university operations, from intensive research computing to flexible cloud-based services.
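As a rough illustration of how a feature comparison chart can be turned into a matching exercise, the sketch below scores each pillar against the requirements of a few use-cases. It is not Berkeley's actual chart or method: the feature names, the 0-5 scores, the use-case names, and the helper functions (`score`, `best_pillar`, `match_all`) are all hypothetical, chosen only to echo the kinds of needs described in this article.

```python
# Hypothetical capability scores (0-5) for each pillar.
# Illustrative assumptions only; not values from Berkeley's comparison chart.
PILLARS = {
    "local": {"low_latency": 5, "power_density": 2, "elasticity": 1, "resilience": 2},
    "colocation": {"low_latency": 2, "power_density": 5, "elasticity": 2, "resilience": 4},
    "public_cloud": {"low_latency": 1, "power_density": 3, "elasticity": 5, "resilience": 5},
}

# Hypothetical requirement weights (0-5) for a few example use-cases.
USE_CASES = {
    "robotics_lab": {"low_latency": 5, "power_density": 1, "elasticity": 0, "resilience": 1},
    "ai_training": {"low_latency": 0, "power_density": 5, "elasticity": 1, "resilience": 2},
    "admin_systems": {"low_latency": 0, "power_density": 0, "elasticity": 3, "resilience": 5},
}


def score(requirements, capabilities):
    """Weighted sum: how well a pillar's capabilities cover a use-case's needs."""
    return sum(weight * capabilities.get(feature, 0)
               for feature, weight in requirements.items())


def best_pillar(requirements):
    """Return the name of the pillar with the highest weighted score."""
    return max(PILLARS, key=lambda name: score(requirements, PILLARS[name]))


def match_all():
    """Map every example use-case to its best-scoring pillar."""
    return {name: best_pillar(req) for name, req in USE_CASES.items()}


if __name__ == "__main__":
    for use_case, pillar in match_all().items():
        print(f"{use_case}: {pillar}")
```

With these made-up numbers, no single pillar wins every row, which mirrors the article's point that the requirements cluster around a few distinct types of infrastructure. A real chart would likely carry many more criteria, including qualitative ones that do not reduce neatly to a score.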

Overview of Berkeley's Three Pillars

In our Local Computing pillar, the focus is on supporting computing that must be on-campus. Local pillar candidates may require low latency (for example, projects that need responsive, advanced data visualizations) or proximity to specialized research equipment (such as machine learning training applications where physical robots on campus must connect to servers in a lab).

[Figure: slide of the 3 pillars (local, colo, cloud), citing benefits (cost, resilience) and noting plans in place for Moffett Field (colo) and expansion of public cloud for admin systems.]

[Slide from a 2024 presentation. The 2021 strategy is currently being implemented, with options available in all pillars by summer 2024.]

The Offsite Colocation pillar is especially important for power-dense research applications such as AI research and data science. Their cooling requirements and heavy-duty power needs exceed what Berkeley can cost-effectively provide in the campus data center. Moving these workloads to an offsite partner location gives access to more power at better prices, along with more green energy options. This not only helps us manage costs more effectively but also enhances resilience by diversifying our data storage locations and mitigating earthquake risk, a crucial consideration given the campus's proximity to the Hayward Fault. By partnering with NASA for new colocation facilities, we are able to expand Berkeley's computing capacity at reduced one-time and ongoing costs.

The third pillar, Public Cloud, leverages the scalability and flexibility of global cloud platforms like AWS, Google Cloud, and Microsoft Azure. The cloud is ideal for reliably accommodating fluctuating demand, facilitating everything from administrative tasks to complex computational research. Our cloud usage has tripled over the past three years, which underscores the increasing reliance on flexible, scalable solutions that can adapt quickly to the needs of our diverse academic community. Public cloud is also the main infrastructure that should support Berkeley's administrative computing needs, to ensure resilience for our business systems.

Together, these three pillars form a dynamic framework to meet today's needs and position UC Berkeley for future growth and innovation. By adopting this model, we ensure that the University's computational infrastructure is robust, flexible, and able to sustainably meet Berkeley's strategic goals, from power- and data-intensive research computing to resilient, cost-effective administrative operations.