In October 2022, Uber’s Presto team described in a blog post how they use the Alluxio SDK cache to boost Presto query performance and cost efficiency. This achievement is a major milestone in the collaboration between Alluxio and Uber. Thus far, the Uber Presto team has deployed the Alluxio SDK cache in three production clusters spanning more than 1,500 nodes. With the Alluxio SDK cache, Uber has observed a 10% decrease in data read traffic to their HDFS cluster and a 50% reduction in input read latency, leading to faster insights for Uber’s business.

Recently, we’ve taken another exciting step forward. Uber’s HDFS team has published another blog post detailing our joint project aimed at optimizing the performance of HDFS DataNodes. The project used the Alluxio SDK cache to manage SSD storage on each DataNode, resulting in improved performance and a better return on investment. Although the SSD cache occupies only 0.6% of the total disk space, it handles 60% of the overall client traffic. You can read the full story on Uber’s Engineering blog: Optimizing HDFS with DataNode Local Cache.

The blog post provides a deep dive into Uber’s efforts to optimize their Hadoop Distributed File System (HDFS) deployment, one of the largest in the world, housing exabytes of data across tens of clusters. The primary objective was to strike a balance between efficiency, service reliability, and high performance as they scale their data infrastructure. Uber’s strategy involved adopting higher-density HDD (16+TB) SKUs to replace the existing 4TB HDDs still in use by the majority of their HDFS clusters. This move was projected to save tens of millions of dollars annually. However, the adoption of high-density disk SKUs presented challenges, particularly with disk IO bandwidth. To address this, Uber implemented a read-only SSD cache within each DataNode to store frequently accessed data and serve read requests.

Jing’s research lies at the intersection of Computer Science (especially Theory of Computation) and Economics (especially Microeconomic Theory). She is particularly interested in the design of incentive mechanisms, which leverage information from self-interested parties so as to produce desirable outcomes for the decision maker. Her work focuses on designing resilient mechanisms that work properly even in “dirtier” or less foreseeable environments. This includes designing mechanisms that leverage the parties’ imprecise or less structured information, even when they are not perfectly rational. This also includes designing mechanisms where the computation and communication costs of the involved parties are very low, mechanisms that protect the parties’ privacy to the maximum extent possible, and mechanisms that work properly even when the parties can form coalitions and coordinate their strategies. Jing also works on the epistemic foundation of rationality, and the application of the theory of mechanism design to real-life scenarios, such as the design of healthcare policies. Finally, Jing is interested in all settings in Computer Science where interaction occurs among strategic parties, and in the study and explanation of economic phenomena in terms of the parties’ computation power.