By TensorOpera Team — Aug 8, 2022

FedML AI platform releases the world’s federated learning open platform on the public cloud with an in-depth introduction of products and technologies!

Federated learning (FL) is a machine learning paradigm where many clients (e.g., edge servers or mobile/IoT devices) collaboratively train a model while keeping the training data decentralized. It has shown huge potential in mitigating many of the systemic privacy risks, regulatory restrictions, and communication costs resulting from the traditional, over-the-cloud machine learning and data science approaches in healthcare, finance, smart cities, autonomous driving, and the Internet of things. It is undoubtedly a dark horse in the current artificial intelligence field. As it is the key technology for artificial intelligence modeling without centralizing scattered private data, it also has significant potential in the private data marketplace. Over the past two years, Internet companies such as Google, Facebook, and Nvidia have started to explore business opportunities for FL. In academia, there were as many as 10,000 papers published on FL in 2021, which is significantly more than many other AI directions. Its recent popularity has surpassed that of training massive models such as GPT-3.

Following this increasingly popular AI trend, one of the earliest institutions to study federated learning founded a startup, FedML, Inc. (https://fedml.ai), which began as an open source research project led by Professor Salman Avestimehr and his doctoral student Chaoyang He from University of Southern California (USC). Recently, FedML has transitioned from “behind the scenes” in academia to “on the stage” of industry and completed its first round of financing in March 2022, which totaled around $2M. Investors include top-tier venture capitals, such as Plug and Play, GGV Capital, MiraclePlus (Dr. Lu Qi, former SVP at Microsoft), AceCap, and individual investors from UC Berkeley and Stanford, specifically the “Shannon Award” winning professor David Tse., as well as from alumni of the University of Southern California, and others. Since the company’s establishment, FedML has won multiple commercial contracts in scenarios such as smart cities, medical care, and industrial IoT.

After just a few months of research and development, FedML has completed many industrial product upgrades. In addition to strengthening open source community maintenance and API upgrades, it also completed the building of FedML Open Platform — the world’s first open platform for federated and distributed machine learning under the public cloud and FedML App Ecosystem, a collaborative application ecosystem.

On the edge side, Open Platform (https://open.fedml.ai) can complete the training and deployment of edge models with one-line command and supports access to mobile phones and IoT devices. On the cloud side, Open Platform supports free global collaborative machine learning, including multinational, cross-city, and multi-tenant public cloud aggregation servers, as well as private cloud deployment with Docker mode. In terms of experimental management capabilities, the platform is specially tailored for distributed training, including capabilities of experiment tracking, management, visualization, and result analysis.

FedML’s newly released collaborative App Ecosystem is also highly integrated with the Open Platform. At its current stage, it supports the open collaboration of more than 20 applications, fully covering mainstream AI application scenarios such as computer vision, natural language processing, graph data mining, and the Internet of Things. If the open platform reduces the difficulty of actual building deployment of a federated learning system to the lowest level, then the App Ecosystem is used to lower the AI application R&D threshold for practitioners. A company need not hire high-cost machine learning teams; rather, they only need one engineer who can do “one-click import” based on community results and use the application directly without intensive development circles.

FedML is also making rapid progress in community operations. At present, the open source version has accumulated 1800+ Stars, 500+ Forks, 1100+ Slack users from different countries around the world, and its open platform has attracted over 500 professional users in a short period of time.

Next, we will introduce in detail the products and technologies behind the company as well as the founding team of FedML.

Outline

1. Background Introduction

2. From Open Source Research to Industrialized Platform

3. FedML Open Platform — the world’s first open platform for federated machine learning under the public cloud
3.1 Seamless migration between simulated experiment and real deployment, zero code modification
3.2 One-line command to complete the edge deployment
3.3 Support simplified collaboration anywhere: multinational, cross-city, multi-tenant
3.4 Provide free public cloud aggregation server and private cloud deployment with Docker
3.5 Experiment monitoring and analysis capabilities tailored for distributed training
3.6 Unified cross-platform design, supporting smartphones and IoT devices

4. Three-in-one Strategy for Open Collaboration: Open Source, Open Platform, Collaborative Application Ecosystem

5. Simple and flexible APIs, boosting innovation in algorithm and system optimization

6. Release Cross-silo FL dataset with Okwin, allowing research to face real scenarios

7. Published 50+ top scientific papers, covering key challenges such as security, efficiency, weak supervision, and fairness

8. Academia Sponsorship

9. FedML Team

1. Background

Federated Learning is a decentralized machine learning paradigm that can complete modeling while protecting the data privacy of edge nodes. Professor Prof. Salman Avestimehr, the co-founder of FedML and an early researcher of FL technology, uses the definition of “train locally and aggregate globally” to summarize federated learning (as shown in the figure). It refers to the orchestration of the edge node, where the local training happens, and the central server, where the global aggregation is performed, so that the performance of the trained model exceeds that of all edge individuals. For the best introductory materials for federated learning, please refer to the review “Open Problems and Advances in Federated Learning” and the upgraded version of “A field guide to federated optimization” in 2022, jointly published by Google, founders of FedML, and many top scholars around the world.

In this era, federated learning technology is critical because it is at the historical intersection of the three major technological hotspots: privacy computing, edge AI, and Web3/Blockchain.

With the application of AI in thousands of industries, people, in turn, pay significant attention to data privacy and security, and the government also places great emphasis on safe and compliant data circulation rules; although deep learning has been the key driver of the AI wave over the past 10 years, its problem of heavy reliance on massive centralized data on the input side is also increasingly prominent. Countless scattered small data exist in the real world, and each piece of data has an owner. Each owner has a right to control and use the data, as well as its special interests and privacy. Therefore, it is difficult for individual users, large enterprises, or institutions to integrate countless data from different owners, and this is what designates FL as a turning point in the history of AI development. In terms of algorithms, the past algorithms were designed based on a single-node computing center. However, with the complete maturity of the mobile Internet and personal computers, the idea of aggregating massive device data has become increasingly unrealistic due to many factors such as data ownership, privacy, security, communication, and storage overhead. What never before have we considered is distributed AI algorithms on a large number of nodes with scattered small data while simultaneously taking into account security, system efficiency, and model accuracy. In addition, many concepts in blockchain, including data decentralization computing, ownership verification, traceability, incentive mechanism, and trusted security, all coincide with the idea of federated learning to make the data value flow safely.

All these historical opportunities empower federated learning with vast market potential. Summarizing the cases that have been deployed around the world, federated learning has served healthcare, finance, advertising recommendation, smart cities, industrial manufacturing/IoT, transportation/autonomous vehicles, and the Metaverse.

2. From open source research to industrialized platform

In academia, the most popular open-source framework is undoubtedly FedML (https://github.com/FedML-AI), which is widely used around the world (see https://fedml.ai/use-cases/). Users include researchers and engineers from the United States, Canada, China, Germany, Denmark, South Korea, and Singapore, and big companies such as Google, Amazon, Adobe, Cisco, Huawei, and well-known research universities such as Stanford, Princeton, USC, HKUST, Tsinghua, etc. Researchers have used FedML to publish papers at top AI conferences, including ICML, NeurIPS, ICLR, and AAAI. Nearly 200 citations have been accumulated so far. FedML has also won the best paper award at the NeurIPS 2020 federated learning workshop (there are also some other frameworks, but FedML is the one cited most: FedML-196; Flower-96; PySyft-32; FedScale-20; FATE-14).

Today, the FedML team has further upgraded these academic achievements into an industrialized platform. Its mission is to build open and collaborative AI anywhere at any scale. In other words, FedML supports both federated learning for data silos and distributed training for acceleration with MLOps and Open Source support, covering cutting-edge academia research and industrial grade use cases. At its current stages, FedML provides the following services:

FedML Parrot — Simulating federated learning in the real world.
FedML Octopus — Cross-silo Federated Learning for cross-organization/account training, including Python-based edge SDK.
FedML Beehive — Cross-device Federated Learning for Smartphones and IoTs, including edge SDK for Android/iOS and embedded Linux.
FedML MLOps: FedML’s machine learning operation pipeline for AI running anywhere at any scale.
Model Serving: providing a better user experience for edge AI.

The FedML logo reflects the mission of FedML Inc.

As of today, FedML GitHub open source has accumulated 1800+ Stars, 500+ Forks, 1100+ Slack users from different countries around the world. Its open industrialized platform has attracted nearly 500 professional users in the short term (as shown in the figure).

Next, we will focus on the productization progress of FedML since its establishment.

3. FedML Open Platform — the world’s first open platform for federated machine learning under the public cloud

FedML Open Platform: https://open.fedml.ai

Different from the closed-source commercial platforms of most companies or the restricted mode of “applying for a trial,” FedML focuses on building a public MLOps (ML Operations) platform, which is open to global users for free. Anyone can directly register an account and put it into use immediately. Three registration/login methods are currently supported: Gmail Authorization, Github Authorization, and email registration. The open platform not only further lowers the learning curve and product deployment difficulty of federated learning but also opens a new window for data collaboration among any users and organizations and opens a door of possibilities in more application scenarios.

FedML’s MLOps platform has the following key features:

3.1 Seamless migration between simulated experiment and real deployment, zero code modification

Figure 1: the workflow of federated learning at FedML platform

FedML can help users seamlessly migrate the code of experimental simulation (POC) to the actual system (Production) to carry out experiments under real private data and distributed training systems of edge devices. A simulation experiment refers to the experimental verification stage carried out by several local development servers or local laptops. After it is verified that federated learning can produce modeling benefits on specific applications, users can use FedML MLOps to upgrade the simulation into production without modifying the code. The simulated source code can be deployed directly to edge devices with real data. As shown in the upper left of Figure 1, the user first completes local development and debugging in “Local Development” and then generates installation packages. Finally, through simple commands and UI interactions, these installation packages and scripts can be distributed to any private device (shown in red on the right of Figure 1).

3.2 One-line command to complete the edge deployment

After users log in to the platform, they only need to type the one-line command “fedml login” to complete the binding between the edge device and the platform. Users can view the bound edge devices on the “Edge Devices” page for the subsequent training process. In addition to “fedml login”, the following commands are also supported to improve the development efficiency:# build packages for the MLOps Platform
fedml build# Logout from the MLOps platform
fedml logout# Display logs during training
fedml logs# Display FedML environment
fedml env# Display FedML version
fedml version

3.3 Support simplified collaboration anywhere: multinational, cross-city, multi-tenant

FedML collaboration has become extremely simple. Just as you can collaborate on documents with friends, you can easily create a federated learning group by sending an invitation link. There are no geographical, national, or city restrictions anywhere in the world.

In a recent experiment, we demonstrated the entire process of federated learning under the public cloud from different cities in the United States, Japan, and China.

3.4 Provide free public cloud aggregation server and private cloud deployment with Docker

To reduce the difficulty of federated training, FedML’s open platform provides a public cloud aggregation server for everyone. Users can arbitrarily select free service nodes offered on the public cloud when initiating training.

FedML also considers more secure and strict deployment requirements for platform users. For this reason, FedML platform has also developed a private aggregation server library that can be deployed freely. It still only needs a one-line command, that is, the “fedml login -s” command to run a secure docker environment at any self-hosted server. An example is as follows:fedml login userid(or API Key) -s — docker — docker-rank rank_index

3.5 Experiment monitoring and analysis capabilities tailored for distributed training

In addition to the features mentioned above of lowering the user access threshold, the FedML platform also helps AI application modeling by providing experimental tracking, management, visualization, and analysis capabilities. Key capabilities currently supported include:

1. Edge device training status tracking.

2. Custom metrics reporting, such as the accuracy in common classification tasks in deep learning, the error rate of regression tasks, and even the running time of the system state, memory usage, and GPU utilization.

3. Profiling flow and edge device system performance. It can help users to view the execution performance of different subtasks on each edge device, which is convenient for analyzing the bottleneck of the system.

4. Distributed logging. This is a capability that the current general-purpose machine learning platform does not have, and it is convenient to track and analyze the anomalies that occur on each device in a real-time manner.

5. The experimental report allows users to compare multiple experimental results.

For more details, please visit: https://doc.fedml.ai/mlops/user_guide.html#invite-collaborators-create-a-group-and-a-project

3.6 Unified cross-platform design, supporting smartphones and IoT devices

FedML also recently released the Android platform for mobile devices; details can be found at the following links:

FedML Android Platform:

https://github.com/FedML-AI/FedML/tree/master/android

FedML IoT Platform:

https://github.com/FedML-AI/FedML/tree/master/iot

4. Three-in-one Strategy for Open Collaboration: Open Source, Open Platform, Collaborative Application Ecosystem

In addition to operating open source communities, FedML promotes open collaboration and open source research and development from multiple product perspectives. Besides the open source library (https://github.com/FedML-AI), and open platform (https://open.fedml.ai), FedML has also developed the collaborative App Ecosystem (users can visit https://open.fedml.ai and find “App Ecosystem” on the left top). The App Ecosystem and the platform cooperate with each other to continuously enrich the application ecosystem. The first version has completed the open collaboration of over 20 applications. Users can contribute and share the application. Each application includes all the FedML-based source code of an AI application, including model definitions, training scripts, and configuration files. At present, the App Ecosystem covers mainstream AI application scenarios such as computer vision, natural language processing, graph data mining, and the Internet of Things.

If the open platform reduces the difficulty of actual building deployment of the federated learning system to the lowest level, then the App Ecosystem is used to lower the AI application R&D threshold for practitioners: A company needs not to hire high-cost machine learning teams but rather needs only one engineer who can do “one-click import” on the basis of community results and use the application directly without intensive development circles.

5. Simple and flexible APIs, boosting innovation in algorithm and system optimization

At this point, you may be curious about what kind of API design can withstand the open platforms, the app ecosystem, and support so many AI applications. FedML has done a lot of explorations in this direction, and here is an introduction to several key designs for the open source community.

First, from the application point of view, FedML does its best to shield all code details and complex configurations of distributed training. Data scientists and engineers at the application level, such as computer vision, natural language processing, and data mining, only need to write the model, data, and trainer in the same way as a stand-alone program and then pass it to the FedMLRunner object to complete all the processes. This greatly reduces the bar for application developers to perform federated learning.import fedml
from my_model_trainer import MyModelTrainer
from my_server_aggregator import MyServerAggregator
from fedml import FedMLRunner

if __name__ == "__main__":
# init FedML framework
args = fedml.init()

# init device
device = fedml.device.get_device(args)

# load data
dataset, output_dim = fedml.data.load(args)

# load model
model = fedml.model.create(args, output_dim)

# my customized trainer and aggregator
trainer = MyModelTrainer(model, args)
aggregator = MyServerAggregator(model, args)

# start training
fedml_runner = FedMLRunner(args, device, dataset, model, trainer, aggregator)
fedml_runner.run()

Secondly, the FedML team believes that the design of the API should conform to the current technology development trend and should not assume that today’s technology is the final solution; rather, it should be iterated as it progresses. We can see that the algorithm innovation of the open source community is still very active, and many more user-valued algorithms continue to be innovated every month. It is based on this background that FedML considers making custom APIs flexible enough to empower algorithm innovation. To this end, FedML abstracts the core trainer and aggregator and provides users with two abstract objects, FedML.core.ClientTrainer and FedML.core.ServerAggregator, which only need to inherit the interfaces of these two abstract objects and pass them to FedMLRunner. Such customization provides machine learning developers with maximum flexibility. Users can define arbitrary model structures, optimizers, loss functions, etc. These customizations can also be seamlessly connected with the open source community, open platform, and application ecology mentioned above with the help of FedMLRunner, which completely solves the long lag problem from innovative algorithms to commercialization.

Finally, FedML believes that although FL is a comprehensive technology that combines security, system efficiency, and model accuracy, the first priority is still “ML-oriented Research and Development”. For example, security and system optimization are definitely important, but it is not a good product design for ML users if they have a huge learning burden on security and system design — this would eventually cause the core users to abandon the product. Therefore, in terms of architecture, FedML considers that security, privacy, and system optimization should all serve ML. The details of these auxiliary modules are hidden throughout layered design, and it is ultimately through this that the best ML experience is achieved. This responsibility is accomplished through FedML Flow.

Specifically, as shown in the figure, FedML regards distributed computing processes such as complex security protocols and distributed training as a directed acyclic graph (DAG) flow computing process, making the writing of complex protocols similar to stand-alone programs. Based on this idea, the security protocol Flow Layer 1 and the machine learning algorithm process Flow Layer 2 can be easily separated so that security engineers and machine learning engineers can perform their duties without having to master multiple technologies with the same mindset.

For a more intuitive understanding of FedML Flow, the following example demonstrates the process of implementing the FedAvg algorithm and adding multiple distributed tasks through FedML Flow. First, each distributed node can be regarded as an abstract FedMLExecutor, which is carried in an independent process and is responsible for executing a specific task. This task can be training or some protocol messages, thus maintaining a high degree of flexibility and abstraction. Flow is a framework that helps to transfer the behavior of these distributed Executors, and it can arrange the order of task execution and message passing between tasks. Specific to a FedAvg algorithm, we can define a Client Executor and a Server Executor object and use their custom functions as tasks in the flow. Through the Flow API, users can freely combine the execution processes of these Executors. The following code shows the entire process of model initialization, multiple rounds of training, and finally, distributed evaluation of the model. The most important thing is that is that this programming example only happens on the personal computer of FedML users and does not require any distributed system development skills.if args.rank == 0:
executor = Server(args)
executor.init(device, dataset, model)
else:
executor = Client(args)
executor.init(device, dataset, model)

fedml_alg_flow = FedMLAlgorithmFlow(args, executor)
fedml_alg_flow.add_flow("init_global_model", Server.init_global_model)
fedml_alg_flow.add_flow("handle_init", Client.handle_init_global_model)
for round_idx in range(args.comm_round):
fedml_alg_flow.add_flow("local_training", Client.local_training)
fedml_alg_flow.add_flow("server_aggregate", Server.server_aggregate)
fedml_alg_flow.add_flow("final_eval", Server.final_eval)
fedml_alg_flow.build()

fedml_runner = FedMLRunner(args, device, dataset, model, algorithm_flow=fedml_alg_flow)
fedml_runner.run()class Client(FedMLExecutor):
def local_training(self):
logging.info("local_training")
...
return params

def handle_init_global_model(self):
logging.info("handle_init_global_model")
...
return paramsclass Server(FedMLExecutor):

def init_global_model(self):
logging.info("init_global_model")
...
return params

def server_aggregate(self):
logging.info("server_aggregate")
...
return params

def final_eval(self):
logging.info("final_eval")
...
return params

6. Release Cross-silo FL dataset with Okwin, allowing research face real scenarios

Different from the synthetic or hypothetical data combed by papers such as LEAF and FedScale (note: only natural ID segmentation does not represent the authenticity of products and business scenarios), the datasets released by FedML and Okwin are taken from real federated learning scenarios.

As shown in the figure below, the current version mainly focuses on medical scenarios, containing 7 naturally partitioned medical datasets covering multiple tasks, models, and data modalities, each with baseline training codes for everyone to conduct research and development. All these results are published on the FedML open platform; see https://open.fedml.ai for details (click “App Ecosystem” after logging in).

7. Published 50+ top scientific papers, covering key challenges such as security, efficiency, weak supervision, and fairness

Finally, let’s take a look at FedML’s progress in research. In addition to the above platform and product experience upgrades, the FedML team has also maintained considerable investment in scientific research. In the past two years, over 50 papers have been published in the field of distributed computing and machine learning, covering the following aspects:

(1) Vision Paper for High Scientific Impacts

(2) System for Large-scale Distributed/Federated Training

(3) Training Algorithms for FL

(4) Security/privacy for FL

(5) AI Applications

All papers are summarized at https://doc.fedml.ai/resources/papers.html

Here are a few articles that are highly related to product landing, covering key problems such as security/privacy, efficiency, weak supervision, and fairness.

1. LightSecAgg: a Lightweight and Versatile Design for Secure Aggregation in Federated Learning. MLSys 2022

Abstract: Secure model aggregation is a key component of federated learning (FL) that aims at protecting the privacy of each user’s individual model while allowing for their global aggregation. It can be applied to any aggregation based FL approach for training a global or personalized model. Model aggregation needs to also be resilient against likely user dropouts in FL systems, making its design substantially more complex. State-of-the-art secure aggregation protocols rely on secret sharing of the random-seeds used for mask generations at the users to enable the reconstruction and cancellation of those belonging to the dropped users. The complexity of such approaches, however, grows substantially with the number of dropped users. We propose a new approach, named LightSecAgg, to overcome this bottleneck by changing the design from “random-seed reconstruction of the dropped users” to “one-shot aggregate-mask reconstruction of the active users via mask encoding/decoding”. We show that LightSecAgg achieves the same privacy and dropout-resiliency guarantees as the state-of-the-art protocols while significantly reducing the overhead for resiliency against dropped users. We also demonstrate that, unlike existing schemes, LightSecAgg can be applied to secure aggregation in the asynchronous FL setting. Furthermore, we provide a modular system design and optimized on-device parallelization for scalable implementation, by enabling computational overlapping between model training and on-device encoding, as well as improving the speed of concurrent receiving and sending of chunked masks.

Arxiv Link: https://arxiv.org/pdf/2109.14236.pdf

2. SSFL: Tackling Label Deficiency in Federated Learning via Personalized Self-Supervision. Best Paper Awards at AAAI 2021 FL workshop.

Abstract: Federated Learning (FL) is transforming the ML training ecosystem from a centralized over-the-cloud setting to distributed training over edge devices in order to strengthen data privacy. An essential but rarely studied challenge in FL is label deficiency at the edge. This problem is even more pronounced in FL compared to centralized training due to the fact that FL users are often reluctant to label their private data. Furthermore, due to the heterogeneous nature of the data at edge devices, it is crucial to develop personalized models. In this paper we propose self-supervised federated learning (SSFL), a unified self-supervised and personalized federated learning framework, and a series of algorithms under this framework which work towards addressing these challenges. First, under the SSFL framework, we demonstrate that the standard FedAvg algorithm is compatible with recent breakthroughs in centralized self-supervised learning such as SimSiam networks. Moreover, to deal with data heterogeneity at the edge devices in this framework, we have innovated a series of algorithms that broaden existing supervised personalization algorithms into the setting of self-supervised learning. We further propose a novel personalized federated self-supervised learning algorithm, Per-SSFL, which balances personalization and consensus by carefully regulating the distance between the local and global representations of data. To provide a comprehensive comparative analysis of all proposed algorithms, we also develop a distributed training system and related evaluation protocol for SSFL. Our findings show that the gap of evaluation accuracy between supervised learning and unsupervised learning in FL is both small and reasonable. The performance comparison indicates the representation regularization-based personalization method is able to outperform other variants.

Arxiv Link: https://arxiv.org/pdf/2110.02470.pdf

3. 3LegRace: Privacy-Preserving DNN Training over TEEs and GPUs. Privacy Enhancing Technologies Symposium (PETS) 2022

Abstract: Leveraging parallel hardware (e.g. GPUs) for deep neural network (DNN) training brings high computing performance. However, it raises data privacy concerns as GPUs lack a trusted environment to protect the data. Trusted execution environments (TEEs) have emerged as a promising solution to achieve privacy-preserving learning. Unfortunately, TEEs’ limited computing power renders them not comparable to GPUs in performance. To improve the trade-off among privacy, computing performance, and model accuracy, we propose an asymmetric model decomposition framework, AsymML, to (1) accelerate training using parallel hardware; and (2) achieve a strong privacy guarantee using TEEs and differential privacy (DP) with much less accuracy compromised compared to DP-only methods. By exploiting the low-rank characteristics in training data and intermediate features, AsymML asymmetrically decomposes inputs and intermediate activations into low-rank and residual parts. With the decomposed data, the target DNN model is accordingly split into a \emph{trusted} and an \emph{untrusted} part. The trusted part performs computations on low-rank data, with low compute and memory costs. The untrusted part is fed with residuals perturbed by very small noise. Privacy, computing performance, and model accuracy are well managed by respectively delegating the trusted and the untrusted part to TEEs and GPUs. We provide a formal DP guarantee that demonstrates that, for the same privacy guarantee, combining asymmetric data decomposition and DP requires much smaller noise compared to solely using DP without decomposition. This improves the privacy-utility trade-off significantly compared to using only DP methods without decomposition. Furthermore, we present a rank bound analysis showing that the low-rank structure is preserved after each layer across the entire model.

Arxiv Link: https://arxiv.org/abs/2110.01229

In addition, the team has also explored from the perspective of algorithmic fairness, such as FairFed: Enabling Group Fairness in Federated Learning. This article reveals that basic federated learning training methods may introduce inequities in gender and ethnic groups, and provides a an effective way to solve this problem.

8. Academia Sponsorship

FedML is also actively involved in the construction of an academic ecosystem through sponsorships of the following AI-related academic conferences in 2022:

MLSys 2022 — Workshop on Cross-Community Federated Learning: Algorithms, Systems and Co-designs
IJCAI 2022 — International Workshop on Trustworthy Federated Learning
CIKM 2022 — Workshop on Federated Learning with Graph Data
ACL 2022 — Workshop on Federated Learning for Natural Language Processing (FL4NLP)

For more details, please visit: https://fedml.ai/academia-sponsorship/

9. FedML Team

FedML was co-founded by USC professor Salman Avestimehr, a well-known expert in federated learning and distributed computing, and Dr. Chaoyang He, a principal software engineer and engineering manager from a former Internet company. The rest of the FedML team consists of full-time engineers, researchers, product managers from former Internet companies, and top-school PhD/MS interns.

Co-founder and CEO — Salman Avestimehr

Salman Avestimehr is a world-renowned expert in federated learning with over 20 years of R&D leadership in both academia and industry. He is a Dean’s Professor and the inaugural director of the USC-Amazon Center on Trustworthy Machine Learning at the University of Southern California. He has also been an Amazon Scholar in Amazon. He is a United States Presidential award winner for his profound contributions in information technology, and a Fellow of IEEE. Homepage: https://www.avestimehr.com/

Co-founder and CTO — Chaoyang He

Chaoyang He received his PhD from the CS department at the University of Southern California, Los Angeles, USA. He has research experience on distributed/federated machine learning algorithms, systems, and applications, and has published papers at top-tier conferences such as ICML, NeurIPS, CVPR, ICLR, AAAI, and MLSys. He also has significant industry experience in the areas of distributed/cloud computing and mobile/IoT systems. He was an R&D Team Manager and Principal Software Engineer at Tencent and also worked as researcher/engineer at Google, Facebook, Amazon, Baidu, and Huawei. He has received a number of awards in academia and industry. Homepage: https://chaoyanghe.com

Outline

1. Background

2. From open source research to industrialized platform

3. FedML Open Platform — the world’s first open platform for federated machine learning under the public cloud

3.1 Seamless migration between simulated experiment and real deployment, zero code modification

3.2 One-line command to complete the edge deployment

3.3 Support simplified collaboration anywhere: multinational, cross-city, multi-tenant

3.4 Provide free public cloud aggregation server and private cloud deployment with Docker

3.5 Experiment monitoring and analysis capabilities tailored for distributed training

3.6 Unified cross-platform design, supporting smartphones and IoT devices

4. Three-in-one Strategy for Open Collaboration: Open Source, Open Platform, Collaborative Application Ecosystem

5. Simple and flexible APIs, boosting innovation in algorithm and system optimization

6. Release Cross-silo FL dataset with Okwin, allowing research face real scenarios

7. Published 50+ top scientific papers, covering key challenges such as security, efficiency, weak supervision, and fairness

8. Academia Sponsorship

9. FedML Team

USC Viterbi researchers’ new AI platform is for everyone