    Seventh-generation server hardware at Dropbox: our most efficient and capable architecture yet

    By Samuel Alejandro · December 21, 2025 (updated December 22, 2025) · 12 min read

    Fourteen years ago, Dropbox began developing its own hardware infrastructure. As the product and user base expanded, so did this infrastructure, growing from a few dozen machines to tens of thousands of servers managing millions of drives, becoming one of the world’s largest custom-built storage systems.

    This growth was a result of years of refinement, close collaboration with suppliers, and a product-centric approach that viewed infrastructure as a strategic advantage. The company is now launching its seventh-generation hardware platform, which includes Crush for traditional compute, Dexter for databases, and Sonic for storage workloads. Additionally, new GPU tiers, Gumby and Godzilla, have been introduced. This advancement involved a significant increase in storage bandwidth, a doubling of available rack power, and a new storage chassis designed to further reduce vibration and heat.

    This generation represents the most efficient, capable, and scalable architecture developed to date, supporting the development and scaling of AI products like Dropbox Dash. The following details the design process for this server hardware and key insights gained for future generations.

    Developing our strategy

    Understanding the foundation of Dropbox’s infrastructure helps explain its current state. In 2015, all US customer data was migrated from off-premises hosts to on-site facilities. The Magic Pocket team completed a massive migration, moving over 90% of the approximately 600PB of data stored at the time into custom-built infrastructure managed internally. This was a pivotal moment, leading to improved performance, cost control, and scalability.

    In subsequent years, significant growth continued, with storage expanding from 40PB in 2012 to over 600PB by 2016. Currently, the infrastructure operates in the exabyte era, utilizing semi-custom hardware designed to meet the platform’s unique requirements. New technologies, such as SMR drives for higher storage density and GPU accelerators for AI and compute-intensive tasks, have been introduced. Both hardware and software have been co-designed to support evolving workloads.

    Before finalizing designs for the seventh-generation server hardware, high-level goals were established, drawing from lessons learned from previous generations and opportunities presented by the latest industry hardware advancements. Three main themes guided this approach:

    Embracing emerging tech

    Infrastructure hardware is rapidly advancing. CPUs are becoming more powerful with increased core counts and improved performance per core. AI workloads are expanding quickly. Networks are transitioning to faster speeds like 200G and 400G. A new era of storage areal density has also arrived. This moment was leveraged to guide the strategy, aiming to harness these trends for improved performance, efficiency, and scalability.

    Partnering with suppliers

    For this generation, close collaboration with suppliers was undertaken to co-develop storage platforms tailored to specific needs. This provided early access to technologies such as higher-density drives and high-performance controllers, along with the chance to help tune firmware for specific workloads. These partnerships enabled the necessary optimizations to address the acoustic, vibration, and thermal challenges inherent in dense system designs.

    Strengthening these relationships continues to offer a strategic advantage: earlier access to emerging technology, deeper hardware customization, and a more stable platform.

    Designing with software in mind

    Software teams were involved early to identify what would significantly impact their services. This led to goals such as increasing compute power per rack, enabling GPU support for AI and video processing, and enhancing speed and responsiveness for database systems. Co-design was a central theme, focusing not just on servers but on building platforms that elevate services.

    All these goals converged on a central question: What magnitude of leap is desired, and which technologies will facilitate it? Instead of maximizing every component simply because it was possible, the focus was on what mattered most: better performance per watt, higher rack-level efficiency, and unlocking the right features to support the next phase of development.

    A look under the hood of our next-gen hardware

    Performance

    The first major step in developing the next-gen Dropbox server hardware was refreshing the CPUs. The processor shapes the entire system: its platform, power consumption, cooling requirements, and more, which makes the selection critical. Starting from a field of more than 100 candidate processors, the options were narrowed using criteria that included:

    • Maximizing system throughput at both server and rack levels
    • Reducing latency of individual processes
    • Improving price-performance for Dropbox workloads
    • Ensuring balanced I/O and memory bandwidth

    Tests were conducted using SPECint_rate, a multi-threaded throughput benchmark, comparing each chip’s performance per watt and per core. The chosen CPU delivered both high throughput and strong per-core performance, outperforming the previous generation by 40%.
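
    As a rough illustration of how such a shortlist can be framed, the sketch below ranks candidate CPUs by benchmark score per watt and per core. The scores, TDPs, and core counts are placeholders, not Dropbox’s actual evaluation data or tooling.

        # Hypothetical CPU shortlisting helper: rank candidates by throughput per
        # watt and per core. Scores, TDPs, and core counts are placeholders, not
        # Dropbox's measured results.
        from dataclasses import dataclass

        @dataclass
        class Candidate:
            name: str
            spec_rate: float  # multi-threaded benchmark score
            tdp_watts: float  # thermal design power
            cores: int

            @property
            def perf_per_watt(self) -> float:
                return self.spec_rate / self.tdp_watts

            @property
            def perf_per_core(self) -> float:
                return self.spec_rate / self.cores

        candidates = [
            Candidate("cpu-a", spec_rate=800, tdp_watts=280, cores=84),
            Candidate("cpu-b", spec_rate=620, tdp_watts=225, cores=64),
            Candidate("cpu-c", spec_rate=400, tdp_watts=225, cores=48),
        ]

        # Rank on performance per watt, breaking ties on per-core performance.
        ranked = sorted(candidates, key=lambda c: (c.perf_per_watt, c.perf_per_core), reverse=True)
        for c in ranked:
            print(f"{c.name}: {c.perf_per_watt:.2f} score/W, {c.perf_per_core:.2f} score/core")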

    The transition from the sixth-generation Cartman platform to the new Crush platform represented a significant increase in compute power. This upgrade was primarily driven by a shift from the 48-core AMD EPYC 7642 “Rome” processor to the 84-core AMD EPYC 9634 “Genoa,” yielding substantial improvements:

    • 75% more cores per socket (48 cores → 84 cores), enhancing bin packing for containerized services
    • 2x the memory capacity (256GB → 512GB), boosting memory-intensive workloads
    • DDR4 → DDR5, providing higher bandwidth
    • 25Gb → 100Gb networking, aligning with increasing internal traffic
    • NVMe gen5, accelerating local disk access and system boot times

    These advancements were achieved while maintaining the compact 1U “pizza box” server design, allowing for 46 servers per rack without requiring additional space.
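
    A back-of-the-envelope calculation makes the rack-level impact concrete. It uses only the per-server figures quoted above plus the 46-server rack count, and it assumes a single socket per server, which is a simplification for illustration.

        # Back-of-the-envelope rack math for the Cartman -> Crush transition,
        # using the per-server figures quoted above and 46 servers per rack.
        # Assumes one socket per 1U server (a simplification for illustration).
        SERVERS_PER_RACK = 46

        platforms = {
            "Cartman (gen 6)": {"cores": 48, "memory_gb": 256},
            "Crush (gen 7)": {"cores": 84, "memory_gb": 512},
        }

        for name, spec in platforms.items():
            cores_per_rack = spec["cores"] * SERVERS_PER_RACK
            ram_tb_per_rack = spec["memory_gb"] * SERVERS_PER_RACK / 1024
            print(f"{name}: {cores_per_rack} cores/rack, {ram_tb_per_rack:.1f} TB RAM/rack")

        # Cartman (gen 6): 2208 cores/rack, 11.5 TB RAM/rack
        # Crush (gen 7): 3864 cores/rack, 23.0 TB RAM/rack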

    Databases

    For databases, the focus was on enhancing CPU performance. While the new Dexter platform retains the same number of cores as its predecessor, it delivers a 30% increase in instructions per cycle (IPC) and a higher base frequency (from 2.1GHz to 3.25GHz). The move from a dual-socket to a single-socket design also reduced delays from inter-socket communication. These upgrades resulted in up to 3.57x less replication lag, which is the delay between data being written to a primary system and its replication to a secondary one. This significantly benefited high-demand workloads like Dynovault and Edgestore.
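
    As a rough sanity check on where the single-threaded gains come from, the per-core improvement can be approximated from the figures above. This simple product ignores memory, I/O, and the socket-topology change, so it is only an estimate, not a measured result.

        # Rough per-core performance estimate for Dexter from the quoted figures:
        # IPC gain multiplied by base-frequency gain. Ignores memory, I/O, and
        # socket-topology effects, so this is an approximation only.
        ipc_gain = 1.30          # 30% higher instructions per cycle
        freq_gain = 3.25 / 2.1   # base frequency 2.1 GHz -> 3.25 GHz

        print(f"Estimated per-core speedup: ~{ipc_gain * freq_gain:.2f}x")  # ~2.01x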

    Early on, it was recognized that database and compute platforms shared many requirements. This allowed for the use of the same platform from the system vendor for both. This consolidation simplified the support stack, making it easier to manage components, firmware, drivers, and OS updates. The outcome was a versatile, dual-use platform that addressed key scaling bottlenecks without sacrificing density or efficiency.

    Storage

    On the storage front, design goals evolved to keep pace with increasing drive capacities. As drives grow larger, with some now exceeding 30TB, forward planning was essential. The internal performance standard is 30Gbps of throughput per PB of stored data. With future systems expected to need more than 100Gbps under that standard, an even higher target of 200Gbps of total throughput was set.

    The SAS topology was reconfigured to evenly distribute bandwidth to each drive and provide total system bandwidth beyond 200Gbps. Given the demands of this scale, a new 400G-ready data center architecture was designed in collaboration with the network engineering team.
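
    A minimal sketch of the sizing logic behind that target is shown below, assuming a hypothetical drive count per system. The 30Gbps-per-PB standard and the 32TB drive size come from this article; the drive counts are assumptions for illustration only.

        # Sizing sketch for per-system storage bandwidth using the internal
        # 30 Gbps-per-PB standard. Drive counts below are assumed values for
        # illustration, not the actual Dropbox chassis configuration.
        GBPS_PER_PB = 30

        def required_bandwidth_gbps(drive_count: int, drive_tb: float) -> float:
            capacity_pb = drive_count * drive_tb / 1000  # TB -> PB
            return capacity_pb * GBPS_PER_PB

        # Hypothetical systems populated with 32 TB drives:
        print(f"{required_bandwidth_gbps(100, 32):.0f} Gbps")  # ~96 Gbps
        print(f"{required_bandwidth_gbps(200, 32):.0f} Gbps")  # ~192 Gbps, near the 200 Gbps target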

    Thermal and power architecture

    The upgrade to higher-core CPUs necessitated a reevaluation of thermal and power management. Power demands increased across all platforms, requiring a more intelligent approach to stay within the limits of existing data center infrastructure. First, a cap was set on processor thermal design power (TDP) to ensure that as many cores as possible could be packed per rack without exceeding cooling or power budgets.

    Instead of relying on worst-case “nameplate” power measurements, which are manufacturer-listed maximums that often overestimate actual usage, real-world system usage was modeled. These models indicated that servers could draw over 16kW per cabinet, which would have been problematic under the previous 15kW per-rack power budget. To accommodate this without overhauling the entire power infrastructure, a significant change was made in collaboration with the data center engineering team: switching from two PDUs to four PDUs per rack, utilizing existing busways and adding more receptacles.
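
    The sketch below illustrates the difference between a nameplate-based and a measurement-based rack power estimate. The per-server wattages are assumptions for illustration, not Dropbox measurements; only the 46-server rack count and the 15kW budget come from this article.

        # Illustrative comparison of nameplate vs. measured rack power estimates.
        # The per-server wattages are assumptions, not Dropbox measurements.
        SERVERS_PER_RACK = 46
        OLD_RACK_BUDGET_KW = 15

        nameplate_w_per_server = 500     # manufacturer maximum (assumed)
        measured_w_per_server = 360      # modeled real-world draw (assumed)

        nameplate_kw = SERVERS_PER_RACK * nameplate_w_per_server / 1000
        measured_kw = SERVERS_PER_RACK * measured_w_per_server / 1000

        print(f"Nameplate estimate: {nameplate_kw:.1f} kW/rack")  # 23.0 kW
        print(f"Measured estimate:  {measured_kw:.1f} kW/rack")   # 16.6 kW
        print(f"Over old budget: {measured_kw > OLD_RACK_BUDGET_KW}")  # True -> more rack power needed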

    This change effectively doubled the available rack power, providing ample capacity to support current loads and even future accelerator cards. Additionally, close collaboration with suppliers led to improvements in airflow, upgraded heatsink designs, optimized fan curves, and thermal testing under full-load conditions. This comprehensive effort ensured efficient cooling.

    Focus on storage

    The storage strategy prioritizes maximizing output from the same physical space. The capacity of traditional 3.5” hard drives has steadily increased from approximately 14TB to over 30TB in just a few years. This density improvement offers benefits such as lower cost per terabyte and reduced power usage per terabyte.

    However, higher densities also increase sensitivity to acoustic and vibrational interference. Given that over 99% of the storage fleet utilizes shingled magnetic recording (SMR) technology, which packs data even more tightly, the margin for error is extremely small.

    The challenge lies in the nanometer-scale precision of the read/write head inside these drives; even minor vibrations can knock it off track. When fans spin at over 10,000 RPM to cool a dense server, vibration and noise build up quickly. The result is an elevated position error signal (PES) and, in severe cases, a write fault that forces the drive to retry, increasing latency and reducing IOPS.

    Simultaneously, adequate airflow is crucial for keeping drives cool, as they perform optimally around 40°C. Excessive heat accelerates drive aging and increases error rates. This creates a constant balance between maintaining sufficient cooling for performance and ensuring quiet operation for precision. To address this, the next-gen storage chassis was co-developed with system and drive suppliers. Key features focused on included:

    • Vibration control: Acoustical isolation and damping
    • Thermals: Improved fan control and airflow redirection
    • Future-proofing: Compatibility with the next generation of large-capacity drives

    This effort proved successful, leading to the early adoption of Western Digital’s Ultrastar HC690, a 32TB SMR drive that fits 11 platters into a standard 3.5” casing. This represents more than a 10% increase in capacity compared to the previous generation.
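
    On the operational side, the roughly 40°C target mentioned above is the kind of threshold fleet tooling watches. A minimal sketch of such a check, using smartmontools’ JSON output, is shown below; this is not Dropbox’s tooling, and the exact fields reported can vary by drive model and smartmontools version.

        # Minimal drive-temperature check against the ~40 C operating target.
        # Not Dropbox's fleet tooling; field availability varies by drive and
        # smartmontools version, so treat this as a sketch.
        import json
        import subprocess

        TARGET_C = 40

        def drive_temperature_c(device: str):
            out = subprocess.run(
                ["smartctl", "-A", "--json", device],
                capture_output=True, text=True, check=False,
            )
            data = json.loads(out.stdout)
            return data.get("temperature", {}).get("current")

        for dev in ("/dev/sda", "/dev/sdb"):  # example device list
            temp = drive_temperature_c(dev)
            if temp is not None and temp > TARGET_C:
                print(f"{dev}: {temp} C exceeds the {TARGET_C} C target")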

    Leveling up with GPUs

    To support Dash, the universal search and knowledge management product, integrating GPUs was essential. Features such as intelligent previews, document understanding, fast search, and video processing, along with recent work with large language models, all demand significant computing power.

    These workloads require high parallelism, massive memory bandwidth, and low-latency interconnects, which traditional CPU-based servers cannot economically support. Consequently, as part of the seventh-generation hardware rollout, two new GPU-enabled server tiers were introduced: Gumby and Godzilla.

    GPU generations

    • Gumby builds upon the Crush compute platform but incorporates support for a wide array of GPU accelerators, hence its flexible name. It is designed for versatility, supporting TDPs ranging from 75W to 600W, as well as both half-height half-length (HHHL) and full-height full-length (FHFL) PCIe form factors. Gumby is optimized for lightweight inference tasks such as video transcoding, embedding generation, and other service-side machine learning enhancements.
    • Godzilla is engineered for demanding tasks, supporting up to 8 interconnected GPUs. It delivers the performance necessary for LLM testing, fine-tuning, and other high-throughput machine learning workflows.

    Together, Gumby and Godzilla enable the scaling of AI across products while maintaining control over performance, cost, and energy efficiency.
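
    As a purely illustrative sketch of how workloads might be routed between two such tiers, the snippet below keys the decision on GPU count and the need for fast GPU-to-GPU interconnects. The rule and the field names are assumptions, not Dropbox’s actual scheduling logic.

        # Purely illustrative tier-routing sketch; the rule and field names are
        # assumptions, not Dropbox's actual scheduler logic.
        from dataclasses import dataclass

        @dataclass
        class GpuJob:
            name: str
            gpus_needed: int
            needs_interconnect: bool  # e.g., multi-GPU fine-tuning

        def pick_tier(job: GpuJob) -> str:
            # Multi-GPU jobs needing fast GPU-to-GPU links go to Godzilla;
            # single-GPU inference-style jobs fit the flexible Gumby tier.
            if job.gpus_needed > 1 or job.needs_interconnect:
                return "godzilla"
            return "gumby"

        print(pick_tier(GpuJob("video-transcode", 1, False)))  # gumby
        print(pick_tier(GpuJob("llm-fine-tune", 8, True)))     # godzilla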

    Across compute, storage, and accelerators, performance was enhanced while adhering to core principles: building right-sized, cost-efficient systems optimized for specific needs. This represents not just next-gen technology, but hardware shaped by how Dropbox actually operates.

    What we learned along the way

    Thermals and power are the new bottlenecks

    Across all platforms—compute, storage, and GPU—power demands are clearly increasing. To maintain performance scaling, a complete reevaluation of everything from airflow patterns to power delivery at the rack level was necessary. This led to the transition from two to four PDUs per rack. As a result, more power-hungry, high-density systems can now be supported without compromising stability or performance. While overall rack power increased, power consumption per petabyte and per core decreased, contributing to sustainability goals.

    Supplier collaboration accelerates innovation

    Some of the most significant hardware achievements resulted from early collaboration with suppliers. Whether it involved redesigning hard drive acoustics, fine-tuning chassis layouts, or gaining early access to cutting-edge components, partnership proved crucial. Instead of relying on off-the-shelf solutions, hardware was co-developed to precisely fit specific needs, such as vibration-optimized storage enclosures and systems prepared for the next generation of storage technology.

    Taking a product-first approach pays off

    The GPU tier exemplifies how hardware strategy followed software requirements. As AI-powered tools like Dash emerged, early engagement with machine learning and video teams helped understand their needs. This initial input facilitated the creation of platforms specifically designed for inference, video processing, and hosting large language models. This product-first mindset ensured the infrastructure was ready precisely when the software required it.

    Conclusion

    With the rollout of the seventh-generation hardware, the in-house server infrastructure has become more strategically valuable. The latest Crush and Dexter platforms are powered by advanced CPUs, providing a significant boost in performance, particularly in IPC and transaction speed. On the storage side, close collaboration with vendors led to the development of Sonic, a new system supporting higher capacities. As demand for AI products continues to grow, a dedicated hardware tier specifically built for these workloads was introduced.

    Looking ahead, preparations are already underway for the next wave of infrastructure changes. Technologies like heat-assisted magnetic recording (HAMR) promise substantial capacity gains, which will necessitate even greater precision in acoustic and thermal management. Similarly, liquid cooling is transitioning from a niche solution to a necessity as compute densities continue to rise.

    This generation of infrastructure is not merely about addressing current challenges; it serves as a foundation for future advancements. Tightly integrated platforms optimized for Dropbox have been built, supplier partnerships strengthened, and the flexibility to continue evolving has been established. The next generation is already in progress.
