IEEE ICASSP 2020 Tutorial on Distributed and Efficient Deep Learning
Wojciech Samek (Fraunhofer HHI)
Felix Sattler (Fraunhofer HHI)

Outline of tutorial
1. Introduction (WS)
2. Model Compression & Efficient Deep Learning (WS)
3. (Virtual) Coffee Break
4. Distributed & Federated Learning (FS)
Part III: Distributed & Federated Learning (Wojciech Samek & Felix Sattler)
Distributed Learning
(Figure: multiple clients, each holding local data, connected to a central server)
Traditional Centralized (“Cloud”) ML
(Figure: all clients send their data to a central server, which trains the model)

→ Data is gathered centrally

Problems:
– Privacy
– Ownership (→ who owns the data?)
– Security (→ single point of failure)
– Efficiency (→ need to move data around)
Distributed (/ “Embedded”) ML
(Figure: every client trains the model locally on its own data; a server coordinates)

→ Data never leaves the local devices
→ Instead, model updates are communicated

Problems (cf. the centralized setting):
– Privacy
– Ownership (→ who owns the data?)
– Security (→ single point of failure)
– Efficiency (→ need to move data around)
Distributed (/ “Embedded”) ML: Settings
Detailed Comparison: Sattler, Wiegand, Samek. "Trends and Advancements in Deep Neural Network Communication."
Federated Learning
“Federated Learning is a machine learning setting where multiple entities collaborate in solving a learning problem, without directly exchanging data. The federated training process is coordinated by a central server.”
Kairouz, Peter, et al. "Advances and open problems in federated learning."
Federated Learning

(Figure: the federated training loop)
1. The server distributes the current model to the clients
2. Each client runs SGD on its local data
3. The clients upload their model updates to the server
4. The server aggregates the updates by averaging, and the loop repeats
Federated Learning - Settings
Cross-Device
– Large number of clients
– Only a fraction of clients available at any given time
– Few data points per client
– Limited computational resources

Cross-Silo
– Small number of clients
– Clients are always available
– Large local data sets
– Strong computational resources
Federated Learning - Challenges
Challenges in Federated Learning:
– Convergence
– Heterogeneity
– Privacy
– Robustness
– Personalization
– Communication
Federated Learning - Communication
(Figure: each communication round consists of a download phase, server to clients, and an upload phase, clients to server)
Federated Learning - Communication
Total Communication = [#Communication Rounds] x [#Parameters] x [Avg. Codeword length]
Case Study: VGG16 on ImageNet
– Number of Iterations until Convergence: 900,000
– Number of Parameters: 138,000,000
– Bits per Parameter: 32
→ Total Communication = 496.8 Terabyte (Upload+Download)
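The arithmetic of the case study can be checked with a short script evaluating the formula above (the helper name is illustrative, not from the tutorial):

```python
# Illustrative helper evaluating
# Total Communication = [#Rounds] x [#Parameters] x [Avg. Codeword length].

def total_communication_bytes(rounds, n_params, bits_per_param):
    """Traffic in one direction: rounds x parameters x bits, in bytes."""
    return rounds * n_params * bits_per_param / 8

# Case study from the slide: VGG16 on ImageNet.
vgg16 = total_communication_bytes(
    rounds=900_000,        # iterations until convergence
    n_params=138_000_000,  # VGG16 parameters
    bits_per_param=32,     # float32 codewords
)
print(vgg16 / 1e12)  # → 496.8 (terabytes)
```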
Federated Learning – Compression Methods
Total Communication = [#Communication Rounds] x [#Parameters] x [Avg. Codeword length]
Compression Methods
– Communication Delay
– Lossy Compression: Unbiased
– Lossy Compression: Biased
– Efficient Encoding
Communication Delay
Distributed SGD:
For t=1,..,[Communication Rounds]:
  For i=1,..,[Participating Clients]:
    Client does: compute a stochastic gradient g_i = ∇f_i(w_t) on a local mini-batch and upload it
  Server does: w_{t+1} = w_t − η · (1/m) Σ_i g_i

Federated Averaging:
For t=1,..,[Communication Rounds]:
  For i=1,..,[Participating Clients]:
    Client does: run n local SGD steps starting from w_t to obtain w_t^i and upload it
  Server does: w_{t+1} = (1/m) Σ_i w_t^i
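The Federated Averaging loop above can be sketched in a few lines of NumPy. The toy least-squares clients, step sizes and round counts below are illustrative choices, not from the tutorial:

```python
# Toy NumPy sketch of Federated Averaging on synthetic least-squares clients.
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)

# Each client holds its own data set (X_i, y_i) generated around w_true.
clients = []
for _ in range(4):
    X = rng.normal(size=(20, d))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=20)))

def local_sgd(w, X, y, steps=10, lr=0.05):
    """Client does: several local gradient steps on the local objective."""
    for _ in range(steps):
        w = w - lr * (2 * X.T @ (X @ w - y) / len(y))
    return w

w = np.zeros(d)
for t in range(50):  # communication rounds
    local_models = [local_sgd(w, X, y) for X, y in clients]
    w = np.mean(local_models, axis=0)  # server does: plain averaging

print(round(float(np.linalg.norm(w - w_true)), 3))  # small residual error
```

With communication delay, the clients talk to the server only once every 10 local steps instead of after every gradient computation.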
Communication Delay
Advantages:
● Simple
● Reduces Communication Frequency (advantageous in on-device FL)
● Reduces both Upstream and Downstream communication
● Easy to integrate with Privacy mechanisms
Communication Delay
Convergence Analysis for Convex Objectives (under the IID assumption), in terms of:
– the number of clients participating per round
– the total number of communication rounds
– the number of local iterations per round
– the Lipschitz parameter of the loss function
– the bound on the variance of the stochastic gradients
Lan, Guanghui. "An optimal method for stochastic composite optimization."
Statistical Heterogeneity
Convergence speed drastically decreases with increasing heterogeneity in the data
→ This effect is aggravated when the number of participating clients (“reporting fraction”) is low
Hsu, Tzu-Ming Harry, Hang Qi, and Matthew Brown. "Measuring the effects of non-identical data distribution for federated visual classification."
(Figure: final accuracy as a function of the Dirichlet parameter controlling the degree of non-identicalness)
Communication Delay
Kairouz, Peter, et al. "Advances and open problems in federated learning."
Communication Delay
Advantages:
● Simple
● Reduces Communication Frequency (more practical in on-device FL)
● Reduces Upstream + Downstream communication
● Easy to integrate with Privacy mechanisms
Disadvantages:
● Poor performance on non-IID data
● Low sample efficiency
Federated Learning – Compression Methods
Total Communication = [#Communication Rounds] x [#Parameters] x [Avg. Codeword length]
Compression Methods
– Communication Delay
– Lossy Compression: Unbiased
– Lossy Compression: Biased
– Efficient Encoding
Update Compression
Distributed SGD:
For t=1,..,[Communication Rounds]:
  For i=1,..,[Participating Clients]:
    Client does: compute g_i = ∇f_i(w_t) and upload it
  Server does: w_{t+1} = w_t − η · (1/m) Σ_i g_i

Distributed SGD with Compression:
For t=1,..,[Communication Rounds]:
  For i=1,..,[Participating Clients]:
    Client does: compute g_i = ∇f_i(w_t) and upload the compressed update C(g_i)
  Server does: w_{t+1} = w_t − η · (1/m) Σ_i C(g_i)
Update Compression
(Figure: examples of compression operators, grouped into unbiased and biased ones)
Unbiased Compression
Definition: A compression operator C is called unbiased iff E[C(x)] = x.

Pros:
➔ “Straightforward” convergence analysis (stochastic gradients with an increased variance of the gradient estimator)
➔ Variance reduction across clients (uncorrelated noise)
➔ Strongly convex bounds follow directly from the inflated gradient variance
IEEE ICASSP 2020 Tutorial on Distributed and Efficient Deep Learning
Unbiased Compression
● Pros: “Straightforward” convergence analysis (stochastic gradients with increased variance)
● Cons: Variance blow-up leads to poor empirical performance
Biased Compression
Definition: A compression operator C is called biased iff E[C(x)] ≠ x.
→ Can be turned into convergent methods via error accumulation.
Karimireddy, et al. "Error feedback fixes signsgd and other gradient compression schemes."
Stich, Cordonnier, Jaggi. "Sparsified SGD with memory."
Biased compression methods do not necessarily converge!
Error Accumulation

Distributed SGD:
For t=1,..,[Communication Rounds]:
  For i=1,..,[Participating Clients]:
    Client does: compute g_i = ∇f_i(w_t) and upload C(g_i)
  Server does: w_{t+1} = w_t − η · (1/m) Σ_i C(g_i)

Distributed SGD with Error Accumulation:
For t=1,..,[Communication Rounds]:
  For i=1,..,[Participating Clients]:
    Client does: compute g_i = ∇f_i(w_t), upload Δ_i = C(g_i + e_i), keep the residual e_i ← g_i + e_i − Δ_i
  Server does: w_{t+1} = w_t − η · (1/m) Σ_i Δ_i
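The error-accumulation loop can be sketched with top-k sparsification as the biased compressor. The objective, step size and dimensions below are illustrative:

```python
# Top-k sparsification (a biased compressor) with error accumulation on a
# simple quadratic f(w) = ||w||^2 / 2, whose gradient is w.
import numpy as np

def top_k(x, k):
    """Keep only the k largest-magnitude entries of x."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

rng = np.random.default_rng(2)
d, k, lr = 100, 10, 0.1
w = rng.normal(size=d)
residual = np.zeros(d)  # the accumulated compression error e

for t in range(500):
    grad = w                             # client does: compute the gradient
    delta = top_k(grad + residual, k)    # compress gradient + residual
    residual = grad + residual - delta   # keep what the compressor dropped
    w = w - lr * delta                   # server does: apply the update

print(round(float(np.linalg.norm(w)), 4))  # small: converges despite the bias
```

Every dropped coordinate eventually accumulates enough error mass to be transmitted, which is exactly why error feedback repairs the bias.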
Error Accumulation
For a parameter α ∈ (0,1], an α-contraction operator is a (possibly randomized) operator C that satisfies the contraction property E‖x − C(x)‖² ≤ (1 − α)‖x‖².

Theorem (Stich et al.): For any contraction operator, compressed SGD with Error Accumulation achieves, for large T, convergence on μ-strongly convex objective functions with asymptotic rate O(1/(μT)).

Independent of alpha!
Federated Learning – Recap Compression
                    | Unbiased                    | Biased
Methods             | TernGrad, QSGD, Atomo       | Gradient Dropping, Deep Gradient Compression, signSGD, PowerSGD
Convergence proofs  | Bounded variance assumption | k-contraction framework (Stich et al. 2018)
Combining Methods: Sparse Binary Compression
Sattler, et al. "Sparse binary compression: Towards distributed deep learning with minimal communication." 2019 International Joint Conference on Neural Networks (IJCNN).
Sparse Binary Compression
Sparse Binary Compression for non-IID Data
Federated Learning – Combining Methods
Sattler, Wiedemann, Müller, Samek. "Robust and communication-efficient federated learning from non-iid data." IEEE TNNLS (2019).
Efficient Encoding – DeepCABAC
● CABAC is the best-performing encoder for quantized parameter tensors
● Plug & Play
● Can be used as a final lossless compression stage for all compression methods that we have presented
Wiedemann et al. "DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks."
Federated Learning - Challenges
Challenges in Federated Learning:
– Convergence ✔
– Heterogeneity (✔)
– Privacy
– Robustness
– Personalization
– Communication ✔
Federated Learning – Privacy
Hitaj, Briland, Giuseppe Ateniese, and Fernando Perez-Cruz. "Deep models under the GAN: information leakage from collaborative deep learning."
Federated Learning – Privacy
Privacy Protection Mechanisms:
- Secure Multi-Party Computation
- Homomorphic Encryption
- Trusted Execution Environments
- Differential Privacy
Dwork, Cynthia, and Aaron Roth. "The algorithmic foundations of differential privacy."
Differential Privacy
Definition: A randomized mechanism A is called differentially private with parameter ε iff

P[A(D) = t] ≤ e^ε · P[A(D’) = t]

for all outcomes t and any two data sets D and D’ which differ in only one element.
Differential Privacy – Mechanisms
Global sensitivity of a query f: GS(f) = max over neighboring data sets D, D’ of ‖f(D) − f(D’)‖₁

The Laplace mechanism A(D) = f(D) + Lap(GS(f)/ε) is ε-differentially private.

(Figure: noise is added to the query result before it leaves the data holder)
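A minimal sketch of the Laplace mechanism for a counting query: adding or removing one record changes the count by at most 1, so the global sensitivity is 1 and noise of scale 1/ε suffices. The numbers are illustrative:

```python
# Laplace mechanism for a counting query (global sensitivity 1).
import numpy as np

rng = np.random.default_rng(3)

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value + Laplace noise of scale sensitivity/epsilon."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

count = 42   # f(D): e.g. "how many records satisfy the predicate?"
eps = 0.5    # privacy parameter: smaller eps means more noise
print(laplace_mechanism(count, sensitivity=1.0, epsilon=eps))
```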
Differential Privacy – Post Processing
Data released via an ε-differentially private mechanism remains ε-differentially private under arbitrary post-processing!
(Figure: data → differentially private algorithm → privacy barrier → non-private post-processing)
Differential Privacy – Basic Composition
After applying R algorithms with privacy parameters ε₁, …, ε_R to the data, the total privacy loss is ε = ε₁ + … + ε_R

→ privacy loss is additive!

This is a worst-case analysis: better bounds can be found using more elaborate accounting mechanisms (e.g. the moments accountant)
Privacy and Communication
We need methods which are both communication-efficient and privacy-preserving!
Differential Privacy:
➔ adds artificial noise to the parameter updates to obfuscate them
Compression Methods:
➔ add quantization noise to the parameter updates to reduce the bitwidth
→ combine the two approaches!
Li, Tian, et al. "Privacy for Free: Communication-Efficient Learning with Differential Privacy Using Sketches."
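One way such a combination can look is sketched below. This is an illustrative toy, not the sketching method of Li et al.: clip the update to bound its sensitivity, add noise for obfuscation, then quantize the noisy update so it is also cheap to communicate:

```python
# Illustrative combination of privacy noise and quantization.
import numpy as np

rng = np.random.default_rng(4)

def privatize_and_compress(update, clip=1.0, noise_std=0.1, levels=16):
    u = update * min(1.0, clip / np.linalg.norm(update))  # bound sensitivity
    u = u + rng.normal(scale=noise_std, size=u.shape)     # obfuscation noise
    # Uniform quantization to `levels` codewords (log2(levels) bits/param).
    lo, hi = -(clip + 3 * noise_std), clip + 3 * noise_std
    step = (hi - lo) / (levels - 1)
    return lo + step * np.round((np.clip(u, lo, hi) - lo) / step)

print(privatize_and_compress(rng.normal(size=8)))
```

Note that the quantization itself is post-processing of the noised update, so by the post-processing property it does not weaken the privacy guarantee of the added noise.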
Federated Learning - Challenges
Challenges in Federated Learning:
– Convergence ✔
– Heterogeneity (✔)
– Privacy ✔
– Robustness
– Personalization
– Communication ✔
Federated Learning – Meta- and Multi-Task Learning
Federated Learning Environments are characterized by a high degree of statistical heterogeneity of the client data
→ In many situations, learning one single central model is suboptimal or even undesirable
Federated Learning – Meta- and Multi-Task Learning
Client data:
– IID data → one model can be learned
– non-IID data → no single model can fit the data of all clients
Federated Learning – Meta- and Multi-Task Learning
I like ...
I like ice cream.
I like cats.
I like Beyoncé.
A linear classifier can correctly separate the data of every single client, but not simultaneously for all clients
Clustered Federated Learning
(Figure: client population partitioned into groups of clients with similar data distributions, each with its own model)

→ Clustered Federated Learning groups the client population into clusters with jointly trainable data distributions and trains a separate model for every cluster

How to identify the clusters?
Clustered Federated Learning
(Figure: the same clustered client population)

How to identify the clusters?
→ via the model updates!
Clustered Federated Learning
(Figure: pairwise cosine similarity of the client updates, ranging from -1 to 1, over communication rounds)

At every stationary solution of the Federated Learning objective, the angle between the parameter updates of the different clients is highly indicative of their distribution similarity!
Clustered Federated Learning - Algorithm
1.) Run Federated Learning until convergence to a stationary solution
2.) Compute the pairwise cosine similarity between the latest parameter updates from all clients
3.) If there exists a client whose local empirical risk is not sufficiently minimized by the federated learning solution ...
4.) … then bi-partition the client population into two groups of minimal pairwise similarity
5.) Repeat everything for the two groups, starting from 1.)
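Steps 2 and 4 can be sketched as follows. The greedy seeding heuristic is a simplification of the optimal bi-partitioning used in the paper, and all names are hypothetical:

```python
# Greedy bi-partitioning of clients by the cosine similarity of their updates.
import numpy as np

def cosine_similarity_matrix(updates):
    """Pairwise cosine similarity of the clients' update vectors."""
    U = np.stack(updates)
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return U @ U.T

def bipartition(updates):
    S = cosine_similarity_matrix(updates)
    n = len(updates)
    # Seed the two groups with the least-similar pair of clients
    # (add 2*I so the diagonal is never selected).
    i, j = divmod(int(np.argmin(S + 2 * np.eye(n))), n)
    g1, g2 = [i], [j]
    for c in range(n):
        if c not in (i, j):
            (g1 if S[c, i] >= S[c, j] else g2).append(c)
    return sorted(g1), sorted(g2)

# Toy example: two groups of clients whose (noisy) updates point in
# roughly opposite directions, as with incongruent data distributions.
rng = np.random.default_rng(5)
base = rng.normal(size=20)
updates = ([base + 0.1 * rng.normal(size=20) for _ in range(3)]
           + [-base + 0.1 * rng.normal(size=20) for _ in range(3)])
print(bipartition(updates))  # → ([0, 1, 2], [3, 4, 5])
```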
Clustered Federated Learning – Clustering Guarantees
Theorem (informal): the proposed mechanism will correctly separate the clients into the underlying clusters whenever the cross-cluster similarity of the updates is sufficiently small relative to the intra-cluster similarity (the precise condition, in terms of the number of clusters, is given in Sattler, Müller, Samek).
Federated Learning – Clustered Federated Learning
Clustered Federated Learning
Measure cluster quality via two norms: the norm of the averaged client updates, which is small at a stationary point of the federated objective, and the maximum norm of the individual client updates, which stays large if some client is still far from its local optimum.
Then: split a cluster whenever the averaged-update norm has fallen below a threshold while the maximum individual update norm remains above a second threshold.
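The slide's formulas are not transcribed; as a sketch, the norm-based splitting criteria can be implemented with two checks. The threshold values `eps1` and `eps2` below are illustrative placeholders, not values from the slides.

```python
import numpy as np

def should_split(updates, eps1=0.4, eps2=1.6):
    """updates: (n_clients, n_params) array of flattened weight updates.

    Returns True if the cluster is at a stationary point of the
    federated objective (averaged update is small) while at least one
    client is still far from its own optimum (some individual update
    is large) -- the signature of incongruent client distributions.
    """
    mean_norm = np.linalg.norm(updates.mean(axis=0))
    max_norm = np.max(np.linalg.norm(updates, axis=1))
    return bool(mean_norm < eps1 and max_norm > eps2)
```

If the clients are congruent, their updates all shrink together as FL converges, so the second condition fails and no split is triggered.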
Clustered Federated Learning
Few data points are sufficient to obtain a correct clustering.
Few communication rounds are sufficient to obtain a correct clustering.
Works also with weight updates.
Clustered Federated Learning
(Figure: a client population, each client holding local data, is recursively bi-partitioned: Federated Learning → 1st Split → 2nd Split → 3rd Split)
Clustered Federated Learning
1) FL has converged to a stationary solution.
2) After the 1st split, accuracy drastically increases for the group of clients that was separated out.
3) After the 3rd split, g(a) has dropped below zero for all remaining clusters.
Sattler, Müller, Samek. "Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints."
Federated Learning – Clustered Federated Learning
Sattler, Müller, Samek. “On the Byzantine Robustness of Clustered Federated Learning” (ICASSP 2020)
Federated Learning – Challenges
Challenges in Federated Learning: Convergence, Heterogeneity, Privacy, Robustness, Personalization, Communication (the ✔ marks on the slide indicate which of these the presented methods address).
Federated Learning
References
F Sattler, T Wiegand, W Samek. Trends and Advancements in Deep Neural Network Communication. ITU Journal: ICT Discoveries, 2020. https://arxiv.org/abs/2003.03320
F Sattler, KR Müller, W Samek. Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints. arXiv:1910.01991, 2019. https://arxiv.org/abs/1910.01991
F Sattler, S Wiedemann, KR Müller, W Samek. Robust and Communication-Efficient Federated Learning from Non-IID Data. IEEE Transactions on Neural Networks and Learning Systems, 2019. http://dx.doi.org/10.1109/TNNLS.2019.2944481
F Sattler, KR Müller, W Samek. Clustered Federated Learning. Proceedings of the NeurIPS'19 Workshop on Federated Learning for Data Privacy and Confidentiality, 1-5, 2019.
F Sattler, KR Müller, T Wiegand, W Samek. On the Byzantine Robustness of Clustered Federated Learning. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8861-8865, 2020. http://dx.doi.org/10.1109/ICASSP40776.2020.9054676
F Sattler, S Wiedemann, KR Müller, W Samek. Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication. Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), 1-8, 2019. http://dx.doi.org/10.1109/IJCNN.2019.8852172
Slides and Papers available at